Quark Versioning / Dependency Management

Over on GitHub, @scztt wrote:

versioned Quark dependencies don’t really work - if your accumulated dependencies include Foo -> 1.0 and Foo -> 1.1, you will just get one of the two in an unspecified way, and probably something will be implicitly broken. In order to ever fix this, we would at least need versions to be meaningful and parse-able (e.g. semver versions like 1.2.3) - which they currently are not. Maybe this is a chance to rev the entire concept of versions in the Quarks system, so they actually work the way they look like they should?

As background, in the issue “Quarks version can be specified in the description and the tag separately/inconsistently” (Issue #6030, supercollider/supercollider on GitHub), @MarcinP points out that there are currently two ways of specifying a Quark’s version: in the .quark file under the \version key, and (hopefully!) in a git tag, and that these can conflict. The \version in the .quark file is displayed but otherwise not used…

What’s the way forward here? Can we deprecate the \version key and rely on/require semver-labeled git tags? What to do about the many Quarks that don’t comply now? (I guess it is possible to tell whether a given commit is ahead of or behind another in the same branch using something like git rev-list --left-right --count, but…)
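To make the git rev-list idea concrete, here is a rough sketch of how a tool might classify one commit relative to another. It is written in Python only so it can be run outside sclang; the function names are made up for illustration, and the only git behaviour relied on is that `git rev-list --left-right --count A...B` prints two counts (commits only in A, then commits only in B).

```python
import subprocess
from typing import Tuple

def parse_left_right_count(output: str) -> Tuple[int, int]:
    """Parse the 'LEFT<TAB>RIGHT' line printed by
    `git rev-list --left-right --count A...B`."""
    left, right = output.split()
    return int(left), int(right)

def ahead_behind(repo_dir: str, a: str, b: str) -> Tuple[int, int]:
    """Return (commits reachable only from a, commits reachable only from b).
    Needs git on the PATH and a local clone at repo_dir."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "rev-list", "--left-right", "--count", f"{a}...{b}"],
        capture_output=True, text=True, check=True,
    )
    return parse_left_right_count(result.stdout)

def relation(left_only: int, right_only: int) -> str:
    """Classify commit A relative to commit B from the two counts."""
    if left_only == 0 and right_only == 0:
        return "same"
    if right_only == 0:
        return "ahead"      # A contains everything in B, plus more
    if left_only == 0:
        return "behind"     # B contains everything in A, plus more
    return "diverged"       # each side has commits the other lacks
```

This only gives an ordering within one history; it says nothing about whether the newer commit is compatible, which is the part semver tags are supposed to express.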

maybe first check for a git tag, and only if absent, fall back to \version?

Though there is no guarantee that git tags follow a useful versioning scheme!

I mean looking for a tag in semver format - if someone insists on creating tags with e.g. nonsensical version ordering (like tagging 0.0.5 after 1.0.6) it’s their choice obviously but I’m not sure if we should try to be everything to everyone in such case.
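As a sketch of what "looking for a tag in semver format" could mean in practice: accept only tags that parse as MAJOR.MINOR.PATCH (with an optional leading "v", which is my assumption), ignore everything else, and order by the parsed numbers rather than by when the tag was created. Python is used here just for illustration; the names are hypothetical.

```python
import re
from typing import List, Optional, Tuple

# Strict MAJOR.MINOR.PATCH with an optional leading "v"; pre-release and
# build suffixes are deliberately left out of this sketch.
SEMVER_TAG = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")

def parse_tag(tag: str) -> Optional[Tuple[int, ...]]:
    """Return (major, minor, patch) or None if the tag is not semver-shaped."""
    match = SEMVER_TAG.match(tag)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())

def latest_semver_tag(tags: List[str]) -> Optional[str]:
    """Pick the highest semver tag, ignoring tags that do not parse.
    Returns None when no tag qualifies, so a caller could fall back
    to the \\version key in that case."""
    versioned = [(parse_tag(tag), tag) for tag in tags]
    versioned = [(version, tag) for version, tag in versioned if version is not None]
    if not versioned:
        return None
    return max(versioned)[1]
```

With tags 0.0.5, v1.0.6, release-2 and 1.0.10, this picks 1.0.10 regardless of the order in which they were tagged, and numeric comparison avoids the string-sort trap where "1.0.10" sorts before "1.0.6".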

(Actually, at this moment it’s not yet clear to me how to specify a (minimum) version for a dependency in the .quark file. I’ll have to do some searching…)

Per the help file “Using Quarks” (SuperCollider 3.13.0 Help):

dependencies is a list of Quarks or git urls with an optional @refspec
Bjorklund
cruciallib@tags/4.1.4

I don’t believe the tags are interpreted as “at least” a certain version number (this is the problem!)
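For illustration, this is roughly what the current format gives you to work with: a name (or URL) plus an optional refspec that is an exact pointer, not a range. The sketch below is Python rather than the actual Quarks code, and the rule for telling a refspec apart from an ssh-style URL (no ":" after the last "@") is my own assumption.

```python
from typing import NamedTuple, Optional

class Dependency(NamedTuple):
    name_or_url: str
    refspec: Optional[str]  # e.g. "tags/4.1.4"; None means "whatever is current"

def parse_dependency(spec: str) -> Dependency:
    """Split 'name@refspec' into its two parts.

    Assumption for this sketch: a refspec never contains ':', so an
    ssh-style URL such as git@host:user/repo is not mistaken for one.
    """
    spec = spec.strip()
    head, sep, tail = spec.rpartition("@")
    if sep and head and ":" not in tail:
        return Dependency(head, tail)
    return Dependency(spec, None)
```

Note that nothing here can express "4.1.4 or newer": the refspec either names one exact tag or is absent.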

Here’s some suggestions for a way forward, a bit of an informal roadmap. ALL of these things would not need to be done to properly “rev” quarks, but I would consider this to be a rough order of operations - I would expect this all to be “beta” until everything from this list is covered.

I think doing these things would be a HUGE step forward for SuperCollider - among other things, it would open the door to moving core library functionality like JITLib to properly versioned quarks, managed separately from the “core” classes where there’s a MUCH higher bar for backwards compatibility. Note that moving core functionality to a quark does NOT mean removing it from the default install (e.g. the “minimal PD” edition that is equally loved and loathed) - it would be straightforward to have a default list of quarks already installed as part of the base SuperCollider app package. It would simply mean that parts of what is now the core class library could be versioned separately from the rest of the core library, and in a predictable way - this would mean e.g. @julian could make major upgrades / breaking changes to JITLib without requiring that users with “old” JITLib code be stuck on an old version of SuperCollider forever - obviously a big benefit.

  1. Create a new spec for Quarks, e.g. Quarks 2.0. Most existing quarks are unversioned, or don’t follow a formal versioning convention, and trying to keep backwards compatibility here would force a huge level of complexity, and probably ruin the predictability you want from a package system. This is a chance to clean house. The existing quarks system will remain available for those who want to use it.
  2. Include specific requirements for versioning in Q2 - this should probably be git tags with a specific semver syntax. I’d opt for git tags, because anything filesystem-based would require iteratively checking out every commit for a quarks file to discover all the versions - this is untenable for an entire archive of hundreds of quarks. Additionally, git tags are discoverable via HTTP requests to most git servers (github, bitbucket, etc), so it’s possible to do package downloads without having a local copy of git at all.
  3. Specify YAML as the format for the Q2 specification. Right now, quark files are executable sclang code, which means they can’t be consumed by external tools, and can’t be used if you e.g. bork your SuperCollider install by installing a bad package. sclang can already ingest yaml.
  4. A new quark file spec should include a few things that are missing now:
    1. sclang version requirements specification, in semver format ofc
    2. A pointer to a unit test or set of tests that validate that a quark is installed and runnable in a basic way. This can be minimal (e.g. just create a class) but IMO is a hard requirement, since a LARGE number of quarks in the archive now are not installable, cannot be instantiated in even a basic way, or will immediately render sclang unbootable. These can be used to check compatibility with new sclang versions, and discover unmaintained quarks in the archive.
    3. A mandatory description and author contact info field (these both exist, but they are often unused)
    4. A dependencies section that includes fuzzy semver specifications (e.g. >1.2)
    5. A “special” version designation for “1.0” quarks. This would install via the old quarks system, and could probably be accompanied by a warning, e.g. “you’re installing a Q1 quark, this can break your sclang install and may not play well with other quarks”.
    6. A specification for a help file that has an overview description of the quark, or e.g. a pointer to its core class - something that will enable a user to figure out how to use it without guessing where to look in the help system.
  5. Create a Q2 “front-end” class that is able to execute a new sclang instance WITHOUT any quarks installed (this should be possible by ignoring the default sclang_conf.yaml). This would need to be able to (a) install a list of quarks (this could be passed via a temp yaml file), (b) boot a temporary sclang instance with the “new” quarks list to ensure that the configuration is valid, e.g. quarks are not breaking compilation, and (c) run unit tests for those quarks to ensure there are no obvious failures, e.g. errors creating a core class. If this fails, the quarks install should fail and the user’s config yaml shouldn’t be affected - this should prevent quarks from trashing your SuperCollider install in most cases. The best process would be to install and test individual quarks in a sort of depth-first traversal of dependencies, recompiling in between - that way, you can give a specific error message e.g. “installation of quark X failed”.
  6. In order to facilitate different quarks configurations installed that are usable on the same system, installed quarks should be copied to a version-based path RATHER than sharing a single git repository. In other words, if a quark was cloned to downloaded_quarks/my_quark, then installing v1.2.3 should check out that tag, copy that to versioned_quarks/my_quark/v1.2.3, and point the user config file to that location. If this isn’t done, it will only ever be possible to have a single configuration for a given system, since installing quarks could change the state of quark git repos.
  7. Proper “fuzzy” semver requirements need both dependency resolution and a lock file representing the “resolved” quark versions - these features are common to all package managers…
  8. Ruby has a package system similar to SuperCollider’s - namely, there must be one valid version of every package for a given set of requirements (unlike e.g. node, where you can have 7 versions of a package installed for a given project, each depended on by a different other package). I would suggest copying this algorithm directly from the ruby package manager - I also implemented a version of this for quarks in this tool: https://github.com/scztt/qpm/blob/master/qpmlib/dependency_graph.py - this would need to be translated from python to sclang, but I think the algorithm is pretty clear and doing it line-by-line should be easy.
  9. I’m not sure what to do about the lock file - probably the “lock” could simply be the list of “final” quark paths in the config yaml file, especially since they should include version numbers in the path. This still makes it difficult to “recover” a specific quark configuration on another system - one option here might be to add a new area in the config yaml, where quarks could be specified by name-and-version rather than by include path. This way, you could specify your “quark repository path” separately and the yaml files become less system specific.
  10. Existing quarks should be incrementally moved to the new format, probably starting with well-known / highly used quarks. This would probably entail adding the minimal unit tests I mentioned, but since many quarks right now don’t work anyway, this would be a good way to purge broken things and point new users to well-maintained and high-functioning quarks. All legacy quarks that are moved forward should be marked with a sclang version requirement of ~3.X where X is the current minor sclang version (so, they would not be installable on versions before X). Note that because dependencies can specify legacy quarks, this means we don’t need to move ALL the dependencies of a given quark forward in order to move it forward, though obviously having legacy dependencies should be avoided as much as possible. This should give us guidance as to what quarks to move forward next.
  11. Unit tests should be added that install and test all quarks in the catalog via the install class I mentioned earlier. sclang changes can then easily be tested against the entire quarks library, to detect new regressions and quarks that don’t play well with a change. Obviously, maintained quarks should be fixed; unmaintained quarks, or ones where a fix is untenable, could have their version pinned to an older sclang version - this should be exceedingly rare though.
  12. Once a critical mass of quarks is in the new system, the old quarks could be explicitly marked as “legacy” in the UI or explicitly hidden e.g. behind an “show legacy quarks” checkbox. This should eliminate a HUGE amount of new user frustration and instability.
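To give a feel for items 4.4 and 6 to 9 together, here is a small sketch of "fuzzy" constraints, single-version resolution, and a lock expressed as versioned paths. This is NOT the algorithm from qpm or from the ruby package manager, just a plain backtracking search written in Python for illustration; the constraint syntax (comma-separated clauses like ">=1.2,<2.0"), the catalog layout, and the versioned_quarks path scheme are all assumptions.

```python
import re
from typing import Dict, List, Optional, Tuple

Version = Tuple[int, int, int]
# Hypothetical catalog shape: quark name -> version string -> {dependency: constraint}
Catalog = Dict[str, Dict[str, Dict[str, str]]]

OPS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    "==": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}

def parse_version(text: str) -> Version:
    """'1.2' -> (1, 2, 0); missing parts are treated as zero."""
    major, minor, patch = (text.lstrip("v").split(".") + ["0", "0"])[:3]
    return int(major), int(minor), int(patch)

def satisfies(version: Version, constraint: str) -> bool:
    """True if version meets every comma-separated clause, e.g. '>=1.2,<2.0'."""
    for clause in constraint.split(","):
        match = re.match(r"\s*(>=|<=|==|>|<)\s*(\S+)\s*$", clause)
        if match is None:
            raise ValueError(f"bad constraint clause: {clause!r}")
        if not OPS[match.group(1)](version, parse_version(match.group(2))):
            return False
    return True

def resolve(requirements: List[Tuple[str, str]], catalog: Catalog,
            chosen: Optional[Dict[str, str]] = None) -> Optional[Dict[str, str]]:
    """Pick exactly one version per quark so that all constraints hold.
    Returns {name: version} (the "lock"), or None if no consistent set exists."""
    chosen = dict(chosen or {})
    if not requirements:
        return chosen
    (name, constraint), rest = requirements[0], requirements[1:]
    if name in chosen:
        # Already pinned: the pin must also satisfy this new constraint.
        if satisfies(parse_version(chosen[name]), constraint):
            return resolve(rest, catalog, chosen)
        return None
    # Try newest versions first; backtrack when a choice leads to a conflict.
    for candidate in sorted(catalog.get(name, {}), key=parse_version, reverse=True):
        if not satisfies(parse_version(candidate), constraint):
            continue
        deps = list(catalog[name][candidate].items())
        result = resolve(rest + deps, catalog, {**chosen, name: candidate})
        if result is not None:
            return result
    return None

def lock_paths(lock: Dict[str, str], root: str = "versioned_quarks") -> Dict[str, str]:
    """Map each resolved quark to a version-specific install path."""
    return {name: f"{root}/{name}/v{version}" for name, version in lock.items()}
```

With a catalog where Bar needs Foo ">=1.0,<2.0" and Baz 1.0.0 needs Foo "<=1.0", the resolver settles on Foo 1.0.0 for both; asking for Foo "==1.0.0" and Foo "==1.1.0" at once returns None instead of silently picking one, which is the behaviour the current system lacks.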

This one. Please. If you could give a folder with an scd file and all needed classes and have it just work, that would be awesome. Right now…nightmare of possible conflicts.

Sam

hmmm I believe there is code in this PR from @miguel-negrao that implemented relative paths in sclang_conf.yaml

If anyone familiar with C++ wanted to peek and see if there’s anything here we could use…

This is the first time I realise that this is a consideration. Then the pattern system or the GUI classes may be candidates? Would it be okay not to presume anything for now?

With regard to the above, there is one issue with the quark system where I wonder if there could be an improvement: moving classes from a Quark to the core library and back is really a hassle, because it is very easy to get duplicate class errors. Having class extensions in often-used Quarks will silently build up a dependency on that extension, and when it is gone, it will break code. It may be hard in the more distant future to reconstruct the dependencies. Organising this in larger communities is even harder. Can we learn from other communities (like Haskell maybe) how to best achieve this? Or is something already planned that would solve it?

I think this is where having a concrete sclang version dependency (4.1 above) helps. Suppose that a quark specifies a dependency on sclang v3.20 - this means that it has been fully tested and supports this version. Suppose also that we allow breaking changes in minor sclang versions, which is SORT OF allowed now (e.g. 3.21 can remove things, rather than moving to 4.0). What does an update like this look like?

  • sclang 3.21 removes ClassX.
  • MyQuark 1.1 expects ClassX to exist, and it also depends on sclang <=3.20
  • With no work, MyQuark will not be installable in sclang 3.21, because it does not list support for this version.
  • When sclang 3.21 is being released, there are several things that could happen (assuming here that we DON’T KNOW that this quark necessarily depends on ClassX):
    • The sclang classlib team installs and runs basic unit tests for MyQuark as part of the release process for 3.21. This should immediately reveal 99% of missing-class-related problems, since these are checked at compile time (it’s also possible to detect potential missing classes at runtime, I’m doing this in QuarkEditor.quark). They could then choose to propose a fix to the quark themselves, or notify the maintainer.
    • The maintainer of MyQuark removes references to ClassX from the code, and releases a new version with sclang_version: "<=3.21", or possibly strictly limiting to 3.21 if it won’t run with older versions. This should resolve the problem more or less transparently for users.
    • The sclang classlib team moves the removed code to e.g. ClassXDeprecated.quark. The maintainer of MyQuark releases a new version with this additional dependency, allowing it to operate exactly as it did before.

I think the class extension case is the only one that’s a little weird - generally, I’d imagine that a class extension to a class that doesn’t exist is usually NOT a fatal error. IIRC this is currently only a warning anyway, but probably it would be good to have a more formal way to specify in code that a class extension is optional, and if the class doesn’t exist, no warning/error is necessary.

I think the key thing here is that the dependency, with a revised quark system, is NOT silent - it’s expressed as a dependency on a version of sclang. Obviously this is a monolithic dependency, which can create problems - this is one reason why it would be beneficial to start slicing sclang up into modules: it allows dependencies to be specified in a more specific and focused way. The above change would instead be a change to something like Core_ClassX.quark rather than sclang as a whole, which would mean a much easier upgrade path for quarks that might depend on this piece of functionality.

I imagine we could endlessly slice and dice the classlib into modules - it all takes effort, so it should definitely be incremental and done according to real value. But, it’s VERY easy to map dependencies in the classlib, and dependencies between quarks and classlib functionality - most of this can be done by searching for references to classes at runtime. It’s a little harder to discover dependencies on e.g. extension methods, so modularizing code that contains class extensions takes a little more manual care.

Ultimately, I think there are probably some very obvious targets of opportunity, e.g. clusters of highly inter-dependent classes that are otherwise depended on by few external things. Jitlib and patterns are obvious ones - in fact, the directory structure of the classlib already sort of exposes what our modules would be, and I wouldn’t be surprised if these were already pretty isolated in terms of dependencies.

But yes, I think this is a purely incremental process - we SHOULD assume that this kind of refactoring would be possible and easy with a “v2” quarks system, but not assume that we would need to take any particular strategy with it…

(splitting conversation about sclang.conf.yaml to this thread Distributing code - relative path for sclang.conf.yaml)

@scztt Should Quark2 be implemented in supercollider? Are we using Quark2 to update the core library, and with it, Quark2 itself? Wouldn’t a tool that doesn’t depend on supercollider running be better?
Never really understood why Quark wasn’t a separate program to begin with as you have to recompile the class library anyway.

Some people have mentioned that installing git on Windows is a little more involved than elsewhere. Perhaps something standalone could be made, perhaps in Go using go-git (go-git/go-git on GitHub, “A highly extensible Git implementation in pure Go”)?

Thought it might be fun to try and map the dependencies within the current SCClassLibrary by looking for class names directly referenced in the source code.

I think splitting apart the existing class library without breaking anything will be harder than expected, and may require many class extensions, which might make it harder to reason about the code.

The regex I’ve used will match things in comments (would be nice to have a sclang parser in sclang!), but seems pretty good. It also doesn’t match things like thisProcess, which implies a dependency on Main.
Obviously everything also depends on Object as well.

Here are classes that reference other classes, listed with the files they live in.

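```sc
// Shorten a source path for display. The [6..] offset drops the leading
// directories of this particular install; adjust it for your own paths.
~getNicePath = {|str| str.asString.split($/)[6..].reduce('+/+') };

// For one class: [className, file] -> Set of [className, file] pairs
// whose names appear in that class's source file.
~getMentionedClasses = {|class|
	[class.asSymbol, ~getNicePath.(class.filenameSymbol.asString)] ->
	File.readAllString(class.filenameSymbol.asString)
	// a capitalised word preceded by a space, "(" or "{"
	.findRegexp("[ ({][A-Z][a-zA-Z0-9_]+")
	.collect({|n|
		// n is [offset, matchedString]; strip the leading delimiter
		n[1]
		.reject([$ , $(, ${].includes(_)  )
		.asSymbol
	})
	// keep only names that resolve to an actual class
	.reject({|n|
		n.asClass.isNil
	})
	.collect({|c|
		[c, ~getNicePath.(c.asClass.filenameSymbol)]
	})
	.asSet
};

~r = Class.allClasses
.reject(_.isMetaClass)
.collect({|c| ~getMentionedClasses.(c) })
.asEvent;

// One top-level row per class, with the classes it mentions as children.
t = TreeView().front;
t.columns_(["Class", "File"]);
~r.keysValuesDo({|k, v|
	var i;
	t.addItem([k[0].asString, k[1].asString]);
	i = t.itemAt(t.numItems - 1);
	v.do({|ar|
		i.addChild([ar[0].asString, ar[1].asString])
	})
});
t.canSort = true;
```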

Here are files referencing other files.

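```sc
(

// Shorten a source path for display. The [6..] offset is specific to this
// install location; adjust it for your own paths.
~getNicePath = {|str| str.asString.split($/)[6..].reduce('+/+').asSymbol };

// For one class: its file -> Set of files defining the classes it mentions.
~getFileConnections = {|class|
	~getNicePath.(class.filenameSymbol.asString) ->
	File.readAllString(class.filenameSymbol.asString)
	.findRegexp("[ ({][A-Z][a-zA-Z0-9_]+")
	.collect({|n|
		n[1]
		.reject([$ , $(, ${].includes(_)  )
		.asSymbol
	})
	.reject({|n|
		n.asClass.isNil
	})
	.collect({|c|
		~getNicePath.(c.asClass.filenameSymbol)
	})
	.asSet
};


// Several classes can share a file, so merge their sets per file.
~r = Class.allClasses
.reject(_.isMetaClass)
.collect({|c| ~getFileConnections.(c) })
.asEvent({|a, b| (a ++ b).asSet });


t = TreeView();

t.columns = ["File", "Count" ];

~r.keysValuesDo({|k, v|
	var i = t.addItem([k.asString, nil]);
	// c is a Symbol; convert it before handing it to the view
	v.do({|c| i.addChild([c.asString]) });
	i.setString(1, v.size.asString);
});

t.canSort = true;

// Assign the action; calling the getter with a function would not set it.
t.itemPressedAction = {|a|
	a.postln
};

t.front;

)
```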

As graphs…

Directories referencing directories: an arrow means ‘references’, so an incoming arrow means ‘is a dependant’.

And then, just for fun, classes…


Here’s the same thing but using the actual compiled result of the class library. This doesn’t catch dependencies on extension methods, and doesn’t catch cases where a class name is e.g. constructed dynamically from a string, but I would argue that these may not constitute proper dependencies anyway (“missing” methods are still valid calls in many cases, and constructing a class from a string should always have error handling anyway, which would make it not a hard dependency). I mocked this up quickly, so it might not be totally accurate.

There are some very obvious slices that could be made. As I suspected, JITlib is pretty isolated - there are a few places in Common that depend on JITlib details, but these would be easy to eliminate. The GUI module could also easily be extracted, the only dependencies are things that should probably be class extensions anyway. We know UnitTesting is a proper module, because it USED to be a quark. SCDoc could be separated as well, it seems like the main things that depend on it are help doc GUI things, which could just be moved to SCDoc rather than the GUI folder. Quarks is highly separable, apart from some silly extensions. Predictably, most folders in Common have complex inter-dependencies - it might look different if we broke things down in some other way than by folder, but in the end I think it would probably be best to keep Common relatively monolithic anyway.

At a high level, this might look like splitting the class library up as: [Common/Audio/Bela, Common/GUI, Common/Quarks, Common/UnitTesting, Common/Unix, Common/{everything else...}, JitLib, SCDoc, Platform]. Probably pattern things could be split off as well (this would be beneficial), but at a glance it’s a bit more challenging to do.

Long ago I implemented a version of the Quarks system in Python, because I thought the same thing (qpm, scztt/qpm on GitHub). But our biggest problem overall is participation, maintenance, and keeping a pace of quality-of-life improvements. Ultimately, whatever we gain by implementing a Quarks system with a “more appropriate” language like Python, or using mechanisms that are more theoretically stable/reliable (nice command line tooling), comes back to bite us as the number of possible contributors / maintainers gets trimmed down even further.

We can increase the stability quite a lot by simply doing quark installs with a “pure” out-of-process sclang instance that doesn’t load any external files, doing some basic sanity checking and rollback in case of failure, and maybe adding a “safe mode” to the IDE. All of this stuff can be added to the current Quarks system now, I think quite easily.
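The "sanity check and rollback" part is mostly a matter of never touching the user's config until the candidate has been validated. A minimal sketch of that shape, in Python for illustration only: `install_with_rollback` and its `validate` callback are hypothetical names, and the validator is left abstract on purpose. A real one would presumably launch a clean sclang with the candidate config (sclang has an `-l` option for the library configuration file) and run the quark's tests, but how to detect failure from that process is not something this sketch assumes.

```python
import os
import tempfile
from typing import Callable

def install_with_rollback(config_path: str, new_config_text: str,
                          validate: Callable[[str], bool]) -> bool:
    """Write a candidate config next to the real one, validate it, and only
    then swap it into place. On failure the existing file is left untouched."""
    config_dir = os.path.dirname(os.path.abspath(config_path))
    fd, candidate = tempfile.mkstemp(suffix=".yaml", dir=config_dir)
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(new_config_text)
        if not validate(candidate):
            return False
        # Same directory, so the rename does not cross filesystems.
        os.replace(candidate, config_path)
        return True
    finally:
        # Clean up the candidate if it was not moved into place.
        if os.path.exists(candidate):
            os.remove(candidate)
```

Because the swap is the last step, a quark that breaks compilation never reaches the config the IDE boots from.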

Removing the git dependency is also not SO hard - essentially all quarks are either on GitHub or a comparable git hosting backend. AFAIK all of these backends have public API’s to query repo tags and download files - the quark python tool I built just queries github directly to discover quarks. We can NEARLY do all of this now in sclang, but our only mechanism for doing this is the Download class, which isn’t a complete enough HTTP query API - we’d need to add some basic functionality to this for things like headers and slightly better response handling. I think this would be much easier and better than relying on an external tool, which just puts users in the familiar rat race of “oh, to install X, I need to install Y, but first that requires I install Z…”
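As an illustration of how little is needed on the HTTP side, here is a sketch of listing a repo's tags through GitHub's REST API (`GET /repos/{owner}/{repo}/tags`, which returns a JSON array of objects with a `name` field). It is Python rather than sclang, the function names are made up, and it ignores pagination and rate limits; other hosts would need their own endpoints.

```python
import json
import urllib.request
from typing import List

def tags_url(owner: str, repo: str) -> str:
    """URL of the GitHub REST endpoint that lists a repository's tags."""
    return f"https://api.github.com/repos/{owner}/{repo}/tags"

def tag_names(payload: str) -> List[str]:
    """Extract tag names from the JSON array returned by the tags endpoint."""
    return [entry["name"] for entry in json.loads(payload)]

def fetch_tag_names(owner: str, repo: str, timeout: float = 10.0) -> List[str]:
    """Fetch the first page of tags over plain HTTPS; no local git needed."""
    request = urllib.request.Request(
        tags_url(owner, repo),
        headers={
            "Accept": "application/vnd.github+json",
            "User-Agent": "quark-tags-sketch",  # GitHub rejects requests without one
        },
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return tag_names(response.read().decode("utf-8"))
```

The request headers are exactly the kind of thing the Download class would need to grow support for before this could be done from sclang.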


With “silently” I meant that, assuming certain quarks are installed for most people (like the sc3plugins are now), the use of methods from these quarks will just be standard and their dependency will go unnoticed. The fact that these methods depend on the quarks does not appear in the code, but in the language configuration. Later on, one will have no way to figure out what extensions are necessary to run a specific piece of code.

The underlying problem of sclang is one of its features: that the typical entity that is distributed in a community is a small piece of text, rather than a file or even a folder. It is an aesthetic as well, it is a unit of a small composition.

Now when we want to modularize, we have to find a way to support this common paradigm, otherwise we will lose this feature (which is central in my opinion).

When I wrote the String.include message, I had in mind that one can stick that in front of any bit of code a little easier. Still, this becomes clumsy, if you imagine that you need a handful of dependencies and their versions. Perhaps one could bundle dependencies into a single name for this.

But then, once installed, the dependencies stick around, they are not uninstalled after you run the piece of code. This means running pieces of code practically amounts to an accretion of dependencies over time.

In turn, for one’s own code, especially as a beginner, this means that you build up “silent” dependencies in the above sense.

When thinking this through it seems to me that the best thing would be to first implement the dynamic extension of the class library. As far as I know, the current method table could be replaced by a hash table, which can be extended.

Then, every small piece of code could carry with it a backpack with a little library, which gets pushed onto that table and removed afterwards.

But without this, the cleanliness of modularisation might well just end in dependency hell.


Gotcha - this is a kind of low-level friction thing that is likely annoying to every SC user, and those of us who’ve been dealing with it for a million years just have tiny calcified pockets of our brains devoted JUST to managing this :)

Honestly, a decent solution here is pretty reachable:

  • Specify a way to call out dependencies in a floating .scd file - I’m imagining a blob of yaml in a comment at the top, that mirrors the format of the Quarks dependencies field. This would act like a standalone quark in a single file.
  • Add a method to inject your current deps list into the current scd file - or e.g. resave it as a “frozen” scd file with explicit dependencies: +File { *saveQuarklet { |path, string| /* ... */ } }
  • Add a method to thaw a frozen scd, which would require (a) grabbing dependencies, and (b) creating a corresponding sclang-conf, +File { *loadQuarklet { |scdPath, targetFolder| /* ... */ } }
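To make the "blob of yaml in a comment" idea a bit more tangible, here is one possible shape for it, sketched in Python. Everything about the format is invented for illustration: the `/* quarklet` marker, the `dependencies:` key, and the hand-rolled parsing of a tiny YAML-like subset (a real version would use a proper YAML parser, and would live in sclang as the saveQuarklet / loadQuarklet methods above).

```python
import re
from typing import List

# A leading block comment that starts with the (hypothetical) "quarklet" marker.
HEADER = re.compile(r"\A\s*/\*\s*quarklet\s*\n(.*?)\*/", re.DOTALL)

def freeze(scd_text: str, deps: List[str]) -> str:
    """Prepend a dependency header to the text of an .scd file."""
    lines = ["/* quarklet", "dependencies:"]
    lines += [f"  - {dep}" for dep in deps]
    lines.append("*/")
    return "\n".join(lines) + "\n" + scd_text

def quarklet_dependencies(scd_text: str) -> List[str]:
    """Read the dependency list back out of a frozen .scd file.
    Returns [] when the file has no quarklet header."""
    match = HEADER.match(scd_text)
    if match is None:
        return []
    deps = []
    in_deps = False
    for line in match.group(1).splitlines():
        stripped = line.strip()
        if stripped == "dependencies:":
            in_deps = True
        elif in_deps and stripped.startswith("- "):
            deps.append(stripped[2:].strip())
        elif stripped:
            in_deps = False  # some other key; stop collecting
    return deps
```

Since the header is an ordinary block comment, a frozen file still runs unchanged in an sclang that knows nothing about quarklets.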

The only slightly clunky thing here is that you would need to run the above in its own sclang instance, which requires a recompile and some fiddling in the ScIDE. This is maybe solvable via deep changes to class loading as you mentioned, but honestly I think addressing it in a basic way would make the workflow 100x smoother than now, and it could be done immediately.

Incidentally, this kind of multi-engine, multi-configuration setup is planned for the VSCode client - once I can finish that work, having several co-existing configs running at once, and automatically picking up config from e.g. a sidecar file or workspace folder will be automatic.


Re “silent dependencies” - implicit class dependencies can be detected with pretty high accuracy straight from the code, as long as your scd is compile-able. It’s also POSSIBLE with method extension dependencies but - I tried implementing this in QuarkEditor and it gave me too many false positives / misses to really feel reliable… And anyway, it’s a priori not really possible to detect method extension dependencies because of things like doesNotUnderstand - the best that can be done is to make a good guess.

Yes, agreed, this would be immediately useful, even if it may not solve all of the problem.

P.S. Let’s not move towards the big silo of VSCode if we can though. Don’t you think this may be feasible in the scide?