Frankly, I wonder if an even simpler hack wouldn't give us at least a much better Prototype: pre-hooking the perform (in C++) so that a custom, even if fixed, pre-hook runs before any method is called. Basically, instead of only allowing a post hook via doesNotUnderstand, there could be a pre-hook that allows hijacking of method calls. Somewhat risky to use, of course, but it would allow a much better Proto. There's the issue of how to exit such a pre-hook back into the normal method lookup, but if I understand the primitive-call fallback mechanism correctly, one could simply have a "standard" _perform primitive which, when called from this custom pre-hook, would also exit it. So the default code in Object for this pre-hook would simply call into that:
Proto {
	var ownDictOfMethods;

	performPrehook { |selector ... args|
		var m = this.lookupInOwnDict(selector);
		if (m.notNil) {
			^m.valueArray(args)  // m assumed to be a Function stored in the dict
		} {
			_perform  // primitive: fall through to the normal method lookup
		}
	}
}
Essentially, this would mean inserting a call to an sclang method into the method dispatch logic itself – and it would have to do this for literally every method call. It’s probably reasonable to expect at least an order of magnitude’s slowdown from that.
Hmm, what about checking (in the C++ dispatch code) whether the preHook method is overridden from Object's, and skipping the call into it if it's not? That can't be all that expensive, and the answer could even be cached per class.
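As a quick sanity check, the "is it overridden?" test is cheap to express even in sclang with reflection (using doesNotUnderstand as a stand-in selector, since performPrehook doesn't exist); a C++ implementation would compute the same boolean once and cache it per class:

```supercollider
// Sketch: does a class override a given selector, or inherit Object's version?
(
var overridesFromObject = { |cls, selector|
	var m = cls.findRespondingMethodFor(selector);
	m.notNil and: { m.ownerClass != Object }
};
// Array inherits doesNotUnderstand straight from Object:
overridesFromObject.(Array, \doesNotUnderstand).postln; // false
)
```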
The issue (I have) with Proto right now is that there are hundreds of methods of interest, but Proto only overrides a few, so it's not a very scalable approach. E.g. I wanted to make a Proto with a custom do, but it turned out that (even) do is not overridden in Proto, so it's not customizable (via Proto's dict) without first subclassing or extending Proto to override do as a method… and that needed a classlib recompile. It's true that you only need one recompile per method that you figure out you want to be able to override (in Protos) at runtime. Perhaps some kind of Proto-generator that inspects the classlib and adds every single method might work, but I see issues with conflicting arg lists with that approach…
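For concreteness, the "one recompile per method" workaround looks roughly like this; I'm assuming here that a Proto instance exposes its per-instance functions via at (the actual accessor in the quark may differ):

```supercollider
// Hypothetical class extension; requires a class library recompile
// before `do` becomes overridable per instance.
+ Proto {
	do { |function|
		var f = this.at(\do);  // per-instance override, if any
		if (f.notNil) {
			^f.valueArray([function])
		};
		^super.do(function)    // otherwise fall back to Object's do
	}
}
```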
Honestly, even a better Proto will probably not be very satisfactory, because there are over 100 checks of isKindOfSlot in the C++ code which will fail for Proto'd objects as opposed to actual subclasses. Perhaps a way around this would be if, for every class in the classlib, there were an auto-generated Proto-class with the methods auto-made overridable. E.g. for Symbol you'd get a Symbol_Proto, for Dictionary a Dictionary_Proto, etc. But doing this for all classes could easily double the classlib compile time.
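Such a generator could be sketched by walking Class.methods and emitting forwarder source per selector; protoDispatch here is a hypothetical helper, and copying arg lists verbatim is exactly where the conflicts I mentioned would surface:

```supercollider
// Sketch: emit source for a hypothetical Dictionary_Proto whose
// instance methods all forward through a per-instance dispatch helper.
(
var cls = Dictionary;
var src = String.streamContents { |s|
	s << cls.name << "_Proto : " << cls.name << " {\n";
	cls.methods.do { |m|
		s << "\t" << m.name << " { |... args|\n"
		  << "\t\t^this.protoDispatch('" << m.name << "', args)\n"
		  << "\t}\n";
	};
	s << "}\n";
};
src.keep(200).postln;  // peek at the generated source
)
```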
The other possible issue (and here I'm not completely certain) is that the interpreter may not be reentrant. That is, there are backend functions (MIDI and OSC response functions) that prepare the stack and call runInterpreter, but when I tried to do the same once from within a primitive: kaboom. I'm not sure it's even possible for the interpreter, while figuring out where to dispatch a message, to call back into the language to do some of the work. Maybe it's possible; maybe it could be possible with surgery; maybe it's just not possible.
I’d also suggest that it may not be highly important in the end to have a perfectly transparent pseudo-class. I have more extended reasoning about that but I’ll leave it here for now.
James is right: adding a "pre-intercept" for all method calls would be inordinately expensive. Simple things like accessing public properties of an object (which currently compile to / execute as something equivalent to an O(1) method lookup plus a stack push) would suddenly need to do a second method lookup, allocate a frame, and execute the intercept method before continuing. For property lookups this could be orders of magnitude slower, even for a one-line intercept function.
isKindOfSlot should really only be used in C++ when reasoning about the slot structure of an Object (in cases where C++ wants to look up a property of an object without perform-ing it via the property getter method). These cases do not and will never work with Proto, because its "slot structure" can vary at runtime, so there's no way for C++ to look up its slots anyway. There are better ways to encode these kinds of memory-structure details than linking them to an SC type, but even much "smarter" implementations of this don't really get you a working, fully dynamic Proto. There are probably many cases where isKindOfSlot is used more freely than strictly necessary, so it's not impossible to remove some of these. This would mainly require that no downstream code makes assumptions about e.g. the slot layout, size, etc. of an object - that's not always easy to determine analytically, but it may be something to consider if you have specific cases you want to fix.
A much better fit for sclang would be to use the existing doesNotUnderstand hook as-is (since it entails no extra cost on other non-Prototype-y parts of the language). You’d ideally need to:
1. Define a very small subset of Object methods whose presence/implementation is a hard requirement.
2. Define another subset of Object methods whose implementation is effectively "optional" for Object.
3. Define a class that re-points all methods in [2] to doesNotUnderstand. This class becomes your proto, or a base class of your Proto implementation.
It is not possible to do [3] without boilerplate right now, but it’s at least possible to do in C++ - and the boilerplate implementation is pretty low-cost. There will be things in list [1] that are annoying and may block the prototyping that you want to do - this is a guess, but I think fixing a handful of individual cases here could be pretty trivial.
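The boilerplate version of [3] can be written today in plain sclang, one line per "optional" selector; SoftObject and the chosen selectors here are just illustrative:

```supercollider
// Sketch: manually re-point a subset of Object's methods to
// doesNotUnderstand, so they become "soft" / overridable downstream.
SoftObject {
	do { |... args| ^this.doesNotUnderstand(\do, *args) }
	collect { |... args| ^this.doesNotUnderstand(\collect, *args) }
	asString { |... args| ^this.doesNotUnderstand(\asString, *args) }
	// ... one forwarder per selector in list [2] ...
}
```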
You'll run afoul of isKindOfSlot limitations in the backend - but these, as I said, represent either hard relationships to slot structure / storage, or simply overzealous type-checking that can be removed. You'll also run afoul of isKindOf limitations in SC - in cases where these aren't necessary, removing them can be pretty easy. With the changes in [3], it may be that you can just override isKindOf itself and embed your own little type system inside of SC.
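Overriding isKindOf itself could then look something like this (protoKinds and the Symbol-based check are a made-up mini type system, just to show the shape):

```supercollider
// Sketch: a per-instance "type tag" system layered over isKindOf.
ProtoBase {
	var <>protoKinds;  // e.g. IdentitySet[\Dictionary, \Collection]
	isKindOf { |aClass|
		^protoKinds.notNil and: { protoKinds.includes(aClass.name) }
	}
}
```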
One other proposal (I think from Julian, back in the day) was to have a nearly-empty superclass of Object (which the library compiler currently disallows), e.g.:
NullObject {
	doesNotUnderstand { ... }
}

Object : NullObject {
	... pretty much everything that is in Object now ...
}
Then a prototype class could inherit a basically empty set of methods from NullObject, leaving everything to be “soft-definable.”
Edit: The “empty superclass” would at least have to define *new calling into the primitive that is currently called by Object *new.
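Under that scheme, a prototype class could look something like this (names hypothetical; *new relies on NullObject forwarding to the instance-creation primitive, as just noted):

```supercollider
// Sketch: a fully "soft" prototype inheriting almost nothing.
ProtoThing : NullObject {
	var <methods;
	*new { |methods|
		^super.new.initProtoThing(methods ?? { IdentityDictionary.new })
	}
	initProtoThing { |argMethods| methods = argMethods }
	doesNotUnderstand { |selector ... args|
		var f = methods[selector];
		^f !? { f.valueArray([this] ++ args) }
	}
}
```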
Isn’t that what scztt is doing with his Neutral class (&quark). @scztt how do/did you test that anyway? Did you modify the compiler to use a different base class?
I can't recall the exact reasons that it's not possible to actually have the Neutral class as it's written. But, behaviorally, a proper Neutral class would behave exactly as if I had simply manually forwarded all of Object's non-essential methods to doesNotUnderstand. To be honest, doing this manually would be a fantastic investigation - in part, it would allow us to split Object.sc into pieces that are a priori requirements for the language and pieces that are "semantic sugar" methods, which would be a great refactoring task whether or not it leads to any new functionality in the core library.
And yes, Neutral is julian’s, not mine. IIRC it was a proposed solution for the rather hairy problems of Rest, wasn’t it @jamshark70?
I can say, from painful experience just now, that you can’t do something like:
Neutral {
	// I believe this will be necessary:
	// *all* instance creation eventually ends up
	// in this primitive (by chains of `super.new`);
	// without this, no subclass of Neutral
	// would be able to create any instances
	// (possibly need newCopyArgs too)
	*new { arg maxSize = 0;
		_BasicNew
		^this.primitiveFailed
	}

	// without this, of course, a Neutral
	// would fail to understand doesNotUnderstand
	// --> infinite recursion
	doesNotUnderstand { arg selector ... args;
		DoesNotUnderstandError(this, selector, args).throw;
	}
}

Object : Neutral {
	...
}
… because currently, the only way to indicate that Neutral should have no superclass is to omit the specification, and the class library compiler assumes that every class that doesn't specify a superclass inherits from Object - so Neutral would derive from Object while Object derives from Neutral. The compiler doesn't handle this cycle; it goes into infinite recursion, and the runaway memory allocation takes down your machine.
It would be possible to do this with new class library rules:
Neutral inherits from nothing.
Any other class’s default superclass should be Object.
(Object doesn’t need to be handled specially because it should say Object : Neutral.)
(Neutral subclasses don’t need special handling because they can also indicate : Neutral.)
Reasonable enough.
I don't remember, this many years later (and I'm not sure it matters now).
Agreed that this is annoying and a strong motivation to avoid compiling the class library, but I’m afraid I couldn’t reproduce it on a Win 10 machine (with SC 3.11). After a bootup just now, the first library compilation took some 25 seconds; the second, 1.7 seconds. File system cache failures are likely to be specific to your machine, unfortunately.
Another way to speed up recompile cycles is to close the help browser when you're not using it. If the help browser is closed, recompiling will skip help-file indexing until you open the help again. (I do understand why we want help to be visible by default, especially to new users, but index-upon-recompile is expensive, and I don't understand why indexing can't wait until the user actually requests something in the help browser.)
The thing is that on Win 7 the initial SC classlib compile is about 4 times faster! (2.5 secs vs 10 secs. And yes, recompiles are down to 1 sec on Win 7. All this on a Samsung Evo 860 SSD + 12 GB RAM.) There's something really weird going on with caching and/or classlib compilation in general on Win 10, making it 4-5 times slower than Win 7 on the same machine. (Audacity is another program with slower startup on Win 10, but e.g. FL Studio 20 starts blazing fast on both Win 7 and Win 10 - faster, even, than Audacity on Win 10!) Also, I'm still on Win 10 1903, not 1909 (the latest), for Explorer search-bar nuisance reasons. (I have more than enough SSD space to clone my Win 10 installation and upgrade only the clone to 1909 to see what happens, but it's still work I haven't found time for.)
Oh, but I'm doing that already. The help browser indexing business is even more painful on Win 10: 5 secs on Win 7, 18 secs on Win 10.
One way to handle this, which I actually thought was already implemented (possibly just in a private fork - because how else could someone have even tested Neutral?), is to make the compiler not auto-inherit from Object for classes put in a special dir (in each Quark etc.). I actually thought BaseClasses was such a specially handled dir, much like e.g. SystemOverwrites is.
It seems to be using bfs::ifstream’s open method (PyrLexer.cpp line 183). I don’t think we’re doing anything special with file opening.
Or, look at it this way: if our code to traverse the class tree hasn’t changed between your two environments, then it must be either something in boost filesystem, or in Windows, or in the interaction between bfs and the different versions of Windows.
Really, any kind of OS difference - between Win versions, or Win <> Mac - points toward parts of the compilation process being disk-bound. This matches observations I've made when profiling compilation in the past. I would venture a guess that it's caused by signing issues in Win 10, but it might just as well be caused by running SC off a USB stick, a network drive, or slow Raspberry Pi memory - for me, this is less an issue of OS filesystem esoterica, and more a straightforward problem of making the pipeline asynchronous, so we can do compilation tasks while we wait for more files to be loaded.
We’ve already shed a ton of weight in our compilation time, and the algorithm is still pretty straightforward and naive (naive not meaning bad, just meaning… written the normal way you’d write something like that). I feel pretty confident if someone took up the work of splitting reading, parsing, and compilation into queue-able tasks - maybe do some simple caching? - we could get the time down to <1s or even <100ms on most normal platforms.
On my Linux machine, where filesystem caching works properly, I usually get 1.1 - 1.2 seconds, which is acceptable IMO.
On the Windows machine I tried yesterday, filesystem caching also seems to be working properly. It seems that something is interfering with filesystem caching on RFluff’s machine: maybe an errant antivirus definition for .sc files?
But I guess I'm not quite following that point. Let's assume for the moment that half of my 1.1-second compile time is compilation and half is filesystem access (sped up by caching) - actually, let's be really generous and say 0.8 seconds is compilation. If the filesystem cache doesn't contain our *.sc files and it takes 10 seconds to compile the library, async compilation queues would let that 0.8 seconds of compilation run concurrently with the disk reads - so the most improvement you could hope for is bringing it down to 9.2 seconds. It probably hasn't been done because saving 0.8 seconds isn't worth the engineering effort (especially if the real pain point is the disk access).
I think it would be better to put the effort into tweaking the compiler to allow classes that don’t derive from Object (or, as you suggested, a class that explicitly forwards almost every Object method selector to doesNotUnderstand). Or, if we know there are some environments where filesystem caching isn’t reliable, it might be reasonable for sclang to cache the contents of class library files (reading from disk only if the timestamp on disk is newer than the last cache time).
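The timestamp-based cache idea, sketched in sclang terms (the real change would live in the C++ file reader; File.mtime and File.readAllString do exist, while the cache itself is the hypothetical part):

```supercollider
// Sketch: reread a class library file only if it changed on disk.
(
var cache = Dictionary.new;  // path -> [mtime, contents]
var readCached = { |path|
	var mtime = File.mtime(path);
	var entry = cache[path];
	if (entry.isNil or: { entry[0] < mtime }) {
		entry = [mtime, File.readAllString(path)];
		cache[path] = entry;
	};
	entry[1]
};
)
```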
I can’t really compare those because I have more extensions in Linux (though Windows generally performs worse over a smaller number of files). Also, if we’re looking at the impact of filesystem caching, the relevant measure is not OS vs OS, but without cache vs with cache.
OS      | Without cache (first run)  | With cache (subsequent)
--------|----------------------------|------------------------
Windows | between 10 and 25 seconds  | 0.6 to 1.8 seconds
Linux   | usually 10 to 12 seconds   | 1.0 to 1.2 seconds
In both environments, the .sc files being present in cache makes a massive difference to startup performance.
AFAICS there is something in your system configuration that is causing the FS cache to perform poorly, and whatever that is, it isn’t happening on either of my machines. My best guess is an overactive antivirus.
There are apparently some fairly real filesystem slowdowns on Win 10, and they're pretty hard to diagnose:
And at the very end of that page, a MSFT guy says that Win 10 1903 has the fixy fix for the slow FS perf… but apparently it's not fixed enough compared to Win 7…