Downsizing the Class Library

Maybe something without the need to install ruby.

I just came across @lnihlen’s proposal from last year:

It’s worth reading in its entirety, but I thought I would pull some quotes… She suggests three levels: Core, Middle (modules broken out from Core but still shipped and supported), and finally Exterior: Quarks moved out of the project. She prefers a truly minimal Core:

The Core consists only of classes that define fundamental types, like Integer or Symbol , and services essential to the interpreter like Interpreter , Class , or Method . […] Developers can hold the test and documentation coverage to a higher standard with a smaller Core. […] The Core should be minimally small, including only enough code to support UnitTest and script validation.

She envisions automated “validation processes” for each of the modules broken out from Core:

The process can include UnitTest and integration test scripts maintained in the individual modules. Any developer making changes to the language or the Core library will need to validate the proposed change against every middle layer […] Developers can make better decisions because the validation signals are a credible source of confidence for developers (and reviewers) to gauge the impact […] of any change

She’s not worried about dependencies between the minimal core and the middle level since both will continue to be shipped. Re: the Exterior Quarks:

With improvements to library validation developers can gain the confidence needed to remove unowned modules from distribution.

She also points directly at Object being bloated (something @jordan has also posted about) suggesting that we should rely on class extensions:

Object is a clear candidate for the Core but has around 270 methods. […] Anything in Object not needed for validation should be moved to modules and added as class extensions. For example, things like isUGen and numChannels could likely go to an audio or synth module, whereas awake , beats , and clock could all move to a timing module.


I find this Object situation a bit strange.

I’ve written a piece of code that ‘scans’ Class dependencies by reading the associated .sc file and recording every mention of a class name (like Jordan did before, I think). The result is biased because it doesn’t remove comments before scanning, and it also reads the same .sc file several times if several classes are declared inside it.

It can scan either the class file only, or the class file plus its parents.

It can also follow every class mentioned inside the starting class file, and keep scanning every class it encounters while doing so.

When I did this with Object, I got these results.

It would seem that Object.sc needs access to 882 classes (including itself) to function properly. Due to inheritance, this means that every SC class needs at least those classes to function properly.

I suppose those classes could be considered the current ‘core’, and all other classes should be easier to remove?

Anyway, I think that 881 dependencies is a lot for a class that is supposed to be a ‘root’ node, right? The trimming will be difficult if the trunk is too big, because the branches inherit a lot of dependencies. But if we can somehow push those dependencies ‘away’ from the bottom of the tree, we might remove a lot of dependencies in areas where they are not really needed (sorry, I didn’t find a better way to express this).
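As an illustration only, the transitive scan described above can be sketched in a few lines. This is not the code that produced the numbers quoted here; it is written in Python so it can run without the interpreter, and the `sources` mapping and both function names are hypothetical. Like the original tool, it does not strip comments, so it over-counts in the same way.

```python
import re

# In sclang, identifiers starting with a capital letter are class names.
CLASS_NAME = re.compile(r"\b[A-Z][A-Za-z0-9_]*\b")


def direct_mentions(source, known_classes):
    """Return the known class names that appear anywhere in `source`."""
    return {name for name in CLASS_NAME.findall(source) if name in known_classes}


def transitive_dependencies(start, sources):
    """Follow class-name mentions from `start` until no new class turns up.

    `sources` is a hypothetical mapping: class name -> text of its .sc file.
    The result includes `start` itself.
    """
    seen = set()
    stack = [start]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        stack.extend(direct_mentions(sources[name], sources.keys()) - seen)
    return seen


# Toy stand-ins for real class files (not actual library source).
sources = {
    "Object": "Object { dup { ^Array.fill(2, { this.copy }) } }",
    "Array": "Array : Object { }",
    "Window": "Window : Object { }",
}
print(sorted(transitive_dependencies("Object", sources)))  # prints ['Array', 'Object']
```

In this toy input Window mentions Object, but Object never mentions Window, so Window is not part of what Object needs. That asymmetry is what separates a dependency from a dependant.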

So, to continue, the other way around:

A simple piece of code to find a class’s dependants inside the Library. It only searches for the presence of the class name as a string inside .sc class files.

Due to inheritance, this isn’t enough. Finding dependencies also means looking directly for method calls inside a class. In the same manner as the one above, this code searches for method names (selectors) inside the class library files. It returns a list of every .sc file mentioning the method name as a string. There’s also a function to do this for every method a class provides.

This needs more work to be really usable.

  • I didn’t remove comments before searching for selectors. A comment containing the word “as” is sufficient for the algorithm to ‘think’ that the “as” function is used in the current class file.
  • Due to polymorphism, some results are collisions. It could be improved to detect method overriding, then it would need the ability to figure out if the method call is actually related to the analysed class or not.
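The first limitation can be reduced by stripping comments before searching. The sketch below is a rough illustration in Python, not the code from this post: `strip_sc_comments` and `files_mentioning` are hypothetical names, and the scanner assumes sclang’s `//` line comments, nestable `/* */` block comments, double-quoted strings, single-quoted symbols and `$x` character literals. It does nothing about the second limitation (polymorphism).

```python
import re


def strip_sc_comments(src):
    """Remove // and (nested) /* */ comments, leaving strings and symbols intact."""
    out = []
    i, n = 0, len(src)
    depth = 0      # nesting level of /* */ comments
    quote = None   # currently open quote character, if any
    while i < n:
        ch = src[i]
        nxt = src[i + 1] if i + 1 < n else ""
        if depth > 0:
            if ch == "/" and nxt == "*":
                depth += 1
                i += 2
            elif ch == "*" and nxt == "/":
                depth -= 1
                i += 2
            else:
                if ch == "\n":
                    out.append(ch)  # keep line breaks so line numbers survive
                i += 1
        elif quote:
            out.append(ch)
            if ch == "\\" and nxt:
                out.append(nxt)  # escaped character inside a string or symbol
                i += 2
            else:
                if ch == quote:
                    quote = None
                i += 1
        elif ch == "$":
            # Character literal such as $a, $" or $\n: copy it verbatim.
            width = 3 if nxt == "\\" else 2
            out.append(src[i:i + width])
            i += width
        elif ch in "\"'":
            quote = ch
            out.append(ch)
            i += 1
        elif ch == "/" and nxt == "/":
            while i < n and src[i] != "\n":
                i += 1
        elif ch == "/" and nxt == "*":
            depth = 1
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def files_mentioning(selector, sources):
    """Return the paths whose comment-free text contains `selector` as a whole word.

    `sources` is a hypothetical mapping: file path -> file text.
    """
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])" + re.escape(selector) + r"(?![A-Za-z0-9_])"
    )
    return sorted(
        path for path, text in sources.items()
        if pattern.search(strip_sc_comments(text))
    )
```

With this, a file whose only occurrence of “as” sits inside a comment is no longer reported, while `this.as(Array)` in real code still is.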

I don’t know why you deleted your post. You proposed a particular example inside Object.sc where we could remove an (maybe) unnecessary dependency. I really think this is the right approach to figure out the difficulties we’re facing.

So I’ll be quoting you from the mail I received, tell me if you really want those words to be removed from the forum, but I think they’re useful.

We would only need to be sure that no other calls to these methods exist in Core.

Hence the code above. This still is a real thing to do, but I don’t think it’s scary. I doubt we can automate everything, but even if the numbers look ‘big’, for this particular problem, I think we’re facing a semantic issue, not a design issue. But maybe I’m wrong.

Where things could get tricky eventually is if there are calls to this method in modules other than the “synth” module…

I still don’t understand how Object can have so many ‘hard coded’ dependencies. Considering this is the root node every other class inherits from, it should have zero dependencies. Then, as you go up (or down if you live in Australia) the inheritance tree, there should only be dependencies related to superclasses.

Now I agree that deviating from this rule is a powerful way to implement things quickly and to have a modular syntax, but I think that’s how we end up in this ‘stagnation state’.

So to get back to your question, that is why I wanted to implement a tool that scans for dependencies.

  • When someone submits a new module, the module is scanned for its dependencies. It gets a position in the tree. This position is calculated so that every dependency is placed before the module. This ensures the module can be removed without affecting the rest of the tree.

  • When the module is updated (code change), its dependencies are scanned again. If one of the dependencies is situated after the module inside the tree, the change is not accepted as is: if there is no circular dependency between the affected classes, we can rebuild the tree recursively. If there is a circular dependency, the change is rejected. This is a breaking change.

  • This implies that removing a Class also removes all of its dependants. Cutting a branch from the tree means cutting the whole branch at once.
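The three rules above amount to keeping the modules in topological order. Here is a minimal sketch of that idea, assuming the dependencies have already been extracted into a mapping from module name to the set of modules it needs. It uses Python’s standard `graphlib`; the function names and the toy module names are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def load_order(deps):
    """Return the modules ordered so that every dependency comes before its user."""
    return list(TopologicalSorter(deps).static_order())


def accept_change(deps, module, new_deps):
    """Rebuild the order after `module` changes; return None on a circular dependency."""
    candidate = {**deps, module: set(new_deps)}
    try:
        return load_order(candidate)
    except CycleError:
        return None  # the change would have to be rejected


def removal_set(deps, module):
    """Return `module` plus everything that depends on it, directly or indirectly."""
    removed = {module}
    changed = True
    while changed:
        changed = False
        for name, needs in deps.items():
            if name not in removed and removed & set(needs):
                removed.add(name)
                changed = True
    return removed


# Toy dependency table (module -> modules it needs), for illustration only.
deps = {
    "Object": set(),
    "Collection": {"Object"},
    "Array": {"Collection"},
    "Synth": {"Object"},
}
print(load_order(deps))
print(accept_change(deps, "Object", {"Array"}))     # prints None (cycle)
print(sorted(removal_set(deps, "Collection")))      # prints ['Array', 'Collection']
```

The second call shows why the `dup` case matters: as soon as the root module needs Array, the tree cannot be ordered at all.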

Before that, I’d like to point out that some .sc files declare only one class, whereas others declare several classes. It would seem there is no consensus about this, but I’d say that declaring one class per file and using folders to manage their position in the tree could reduce dependency issues and increase readability. Grouping several classes inside one .sc file could be useful too, but seems less manageable. Mixing both doesn’t seem a good idea.

This code extracts a class’s source code (as a string). This could be used to do the split mentioned above.

But since it reads the content of the .sc file, it doesn’t see modifications made to the class after it has been declared, like debug.sc seems to do. I don’t really know how this file works, but this could be a solution to adapt Classes to versioning, or to dynamically add functionalities without introducing dependencies.

So anyway I figured out that debug.sc is a ‘Class Library File’, but isn’t reachable using the Class class, because it is not directly related to a particular class. I don’t know if this situation can be considered a problem. I iterated through my class library folder, then removed every file that could be reached through Class, and here are the remaining files. I don’t know what the status of those files is.

Yes, sclang (like Smalltalk) allows methods to be added to classes in separate files (“class extensions”). So debug.sc adds a debug method to Object, RawArray and Collection. If (strictly hypothetical example!) a refactoring wanted to move Collection out of Core, the debug method extension for Collection in debug.sc would have to move with it.

So for now we have Object that depends on a lot of other classes, like Array.

There is this in Object.sc:

dup { arg n = 2;
		var array;
		if(n.isSequenceableCollection) { ^Array.fillND(n, { this.copy }) };
		array = Array(n);
		n.do {|i| array.add(this.copy) };
		^array
}

We could remove this dup method from Object.sc and put it in a different file. If every class needed for this method is present inside the current project class tree, the method is added to the Object class as a class extension. Object is then not dependent on its subclasses, but retains its functionality if the subclasses are available?

If we wanted to factor out Array into a “collections” library (again just a hypothetical!) , then yes we would have to move Object’s dup method into that new library as a class extension. This is what Lucile is suggesting when she writes

There may also be opportunities to delegate responsibility for conversions, for example to Target classes. See this interesting discussion on GitHub: “Remove Object.as* method to make refactoring easier” (supercollider/supercollider, Discussion #6065).


Is there a dedicated thread on this forum or github with the list of the future core class library?
I would like to know which ones will stay and which ones will be added.

Also, I would like to suggest some classes to include in the core library, but I cannot find the best thread to discuss such things.

For example, I think @julian’s Strang is good enough to merge with String. Where is the best place to do this?

Post it there. If you feel this is related to the topic, post it there. If you post it somewhere else, paste it there or paste a link to the post there.

I wrote three big comments above, trying to say ‘hey, having a tool to help us inspect classes dependencies might be a good idea, what d’you think?’, and only got semiquaver to answer. But when I started a thread about guarantees involved by SC’s development, everybody went crazy and posted useful links and brilliant ideas… The discussion revolving around this is spread across several threads on this forum, and a lot of GitHub pages. As much as I love anarchy and freedom, this makes it difficult to catch the whole picture at once.

‘Downsizing the Class Library’ is the first step to this whole Quark / Package Manager / SC4 thing. Every time we discuss those, we seem to discuss them all at once, because they’re interdependent. So anything related to this is likely to be impacting the ‘downsizing’ itself. So we might as well centralize the discussion here as much as possible, even if we focus more on the ‘downsizing’ here than later stages.


Regarding the ‘core library’, I have the feeling that it is a bit early to provide a list of core classes ?

Easy examples are : ‘Object’ should be in core, ‘LIDGui’ should be outside of core.

But what about Environment ?

ClassBrowser uses Environment. So maybe Environment should be in core to provide an easy way to write Quarks. But what I can do with Environment, I can also do with an object prototype using an IdentityDictionary. This is a bit more difficult, but allows us to remove Environment from the core, reducing dependencies. But everything I can do with IdentityDictionary, I can also do with a standard architecture of variables and functions. This is much more work, but has far fewer dependencies. So we could say that Environment and IdentityDictionary used as object prototypes are forbidden inside the core library to reduce its dependencies. Good then. But ‘core’ isn’t about dependencies only. Maybe we don’t need IdentityDictionary for the core to function properly, but does it mean IdentityDictionary is outside of the core?

I have an opinion on those particular questions, but I’m unable to know if this opinion is good or bad. I still need to get a better understanding of the situation. But that’s not important at all. What is important is: does the community have strict answers to these particular questions? I didn’t see them written down anywhere. But I assume a consensus could emerge now that the question has been asked. People can advance their arguments, and we’ll collectively decide after that. Then we’ll need to adapt the current code to fit what we decided.

There’s a myriad of tiny questions like that, and I think they will emerge as we try to understand which classes are currently problematic, and why, not as we decide what should be in core or not.


Still, that’s just a remark about the order in which I see these steps happening. Thinking about what could be the core library is important and needed.


Perhaps it would be a good idea, in the interregnum until this type of change in the language is possible, to make the testing system as sophisticated as possible: create robust tests for the old code, so that we can proceed safely when the time for a major refactor arrives.


The following thread could be an example of why I want to not only reduce the class library, but also put in more needed classes and methods:

There is a Quark with this functionality, but it is not easy to find. Will the new Quark or the new package manager make it easy for users to find such functionality, minimising the time spent googling?

Yes, the interface would definitely need to be reworked so it’s nice and easy to use for beginners. During dev meetings, Dyfer put a lot of emphasis on the notion of ‘searchability’. I would say that, even if the Quark interface isn’t strictly a technical part of the software, having a bad one could cause more trouble than a classic ‘code bug’. I think the new Quark system should implement a robust tag system. More than that, I think we need to produce collective documentation (“A list of commonly used Quarks”, “How to use Quark Foo”, “Introduction to the Quark system”). But I know how time-consuming writing documentation is…


One option would be to

  • Leave the current Quark system as it is.
  • Introduce a new system on which each individual quark can only be published if the following conditions are met
    • OS compatibility
    • A complete Help document per class in that quark, following the Quark Help document template.
      • The document should be written so that users who don’t know that quark can follow and understand it.
        ← one problem: no machine or algorithm can do it.
      • The document should be understandable with basic knowledge of sclang and scserver.
        ← one problem: no machine or algorithm can do it.
    • dependency
      • complete lines of code to install this quark with the required quarks
    • conflict report by the publisher of that quark
    • conflict warning from the quark system
      • e.g.: this quark may cause problems with the following quarks: a, b, c…
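Of the conditions listed, the conflict warning is one a machine can check. As a rough sketch under stated assumptions (not part of any existing Quarks tooling), the following Python function reports class names declared by more than one quark. It assumes class declarations start in column 0 as `Name {` or `Name : Super {`, and that class extensions (`+ Name {`) do not count as declarations; `declared_classes`, `conflicts` and the toy quarks are hypothetical.

```python
import re
from collections import defaultdict

# A class declaration at the start of a line: `Name {` or `Name : Super {`.
# Extensions (`+ Name {`) start with `+`, so they are not matched.
CLASS_DECL = re.compile(
    r"^([A-Z][A-Za-z0-9_]*)[ \t]*(?::[ \t]*[A-Z][A-Za-z0-9_]*)?[ \t]*\{",
    re.MULTILINE,
)


def declared_classes(source):
    """Return the class names declared in one .sc file's text."""
    return CLASS_DECL.findall(source)


def conflicts(quarks):
    """Map each class declared by more than one quark to the quarks declaring it.

    `quarks` is a hypothetical mapping: quark name -> {file path: file text}.
    """
    owners = defaultdict(set)
    for quark, files in quarks.items():
        for text in files.values():
            for name in declared_classes(text):
                owners[name].add(quark)
    return {name: sorted(qs) for name, qs in owners.items() if len(qs) > 1}


# Toy quarks for illustration only.
quarks = {
    "a": {"Foo.sc": "Foo {\n}\nBar : Foo {\n}\n"},
    "b": {"Bar.sc": "Bar {\n}\n+ Object {\n\tbaz { ^1 }\n}\n"},
}
print(conflicts(quarks))  # prints {'Bar': ['a', 'b']}
```

A check like this could produce the “this quark may cause problems with the following quarks” message before installation rather than at compile time.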

Then the old Quark system would be used privately and the new Quark system would be used publicly, I suppose…

Scott proposed several steps to implement the new Quark system. I’ve seen several people agree with this proposal, and even if I’m not competent enough to tell if it’s good or bad, I trust him and will start working, slowly, on this.

One of the points is to:

  1. Define a new Quark 2.0 specification
  2. Adapt current Quarks (1.0) to the 2.0 specification

I suppose there will be a period when Quarks are available both as 1.0 and as 2.0, so we can check that everything works, give users some time to switch from 1.0 to 2.0 (though I suppose a job well done shouldn’t impact users, but who knows yet), etc.

I’d say yes… and no.

Yes, because what we intend to do is to move some of the current core functionalities into Quarks, so that a project which does not need them does not depend on them. For example, QT. If your project is not using its GUI system, you should be able to run or share the project without QT being compiled or installed. Still, I think that QT is an important feature of the SuperCollider project. It lets us run the IDE and create graphical interfaces, and I see a lot of people using it. A QT Quark should definitely be well documented, OS compatible, tested, etc. Because even if it’s removed from the ‘technical core’, I think it will remain a ‘core feature’ of the SuperCollider project.

No, because Quark is not only a developer tool, but also a user tool. Two examples:

A ‘beginner’ has a funny idea. It’s an artistic thing. Let’s say it’s a class that polls a UGen and draws shapes on a window based on the result of the poll. With this, you can easily have visual effects when you are livecoding. But the ‘beginner’ is not a ‘good’ developer, so his Quark doesn’t work on Linux, and he’s uncomfortable with writing documentation, so he didn’t do it. I don’t think those two reasons should prevent this user from distributing his work. I’d personally rather have people sharing bad code than not sharing at all.

Second example is that you can currently ‘hack’ the Quark system to use it as a project manager. I never did this, but I think some people on this forum could explain a bit more how they do it. Having to run tests or to write documentation would prevent this usage. But maybe this hack should be prohibited. Don’t know.

This adds complexity to the problem, because it ‘splits’ Quarks into two categories that are difficult to distinguish (as you said, a machine can’t do that). But I think preserving users’ ability to create Quarks easily is fundamental.


First step is to take a look at how other projects do this, to get a better understanding of the problems, and the solutions :

Then we’re likely to develop a ‘Versioning Quark’, that should handle this automagically.
Then start implementing it into existing Quarks.
Then see how we use it to automatically compile the right versions of Classes.


In this case, I think the old Quark can be used. Otherwise, the new Quark system should have a “Visibility” option:

  • Private
  • Public
  • Unlisted (sharing is possible via a link)

We are used to this on YouTube:

I think writing classes and methods is not easy for beginners. I think that beginners usually prefer to write functions instead of classes and methods…

Beginners who want to try many things learn from a lot of code and references, and install quarks along the way. There are many cases where installing a quark causes errors when compiling because duplicate classes are found:

ERROR: duplicate Class found: 'ClassName'

list of the path to an sc file

ERROR: There is a discrepancy.

In my opinion, this is an obstacle to the development of learners and discourages their enthusiasm.

I found the following thread:

I would like to ask the developers and the authors of these quarks to consider including them in the core library.

But I think nobody will agree.
Anyway, the quark mentioned there should be considered as important!

The opinions of the authors also seem to be important

One more thing:
It would be nice if the Quarks help documents appeared in the Quarks category.