Tree-sitter support for SuperCollider

madskjeldgaard · February 11, 2021, 10:52am

Hello everyone
I am happy to announce a little pet project: I have been mapping the sclang grammar to be used with the awesome tree-sitter code parser. Tree-sitter allows for advanced syntax highlighting and very precise, scope-aware code analysis. For now it is mostly used to create very nice code highlighting, but because it builds a very precise node tree of the document it could potentially have more use cases (such as very precise syntax errors that are generated as you type). One of the things that I like the most about tree-sitter is that it reparses only the affected nodes as you type (and not the whole document), which gives you immediate visual information about the code you are typing.

The SuperCollider grammar for tree-sitter is here:

And more info on tree-sitter:
tree-sitter.github.io/

madskjeldgaard · February 11, 2021, 10:54am

The grammar has been added to neovim via the nvim-treesitter plugin. To use it, just open neovim and type :TSInstall supercollider.

If anyone feels like adding support for this grammar to other code editors, then please do so! I only use neovim so that’s the one I have added support for. Would be awesome to see this in VSCode, Atom etc.

madskjeldgaard · February 11, 2021, 10:55am

Here is an example of what the node tree looks like under the hood

(
	{|freq=110|
		var sig = SinOsc.ar(freq: freq);
	
		sig * 0.5
	}.play
)

Is parsed as:

(source_file [0, 0] - [7, 0]
  (code_block [0, 0] - [6, 1]
    (function_call [1, 1] - [5, 7]
      (receiver [1, 1] - [5, 2]
        (function_block [1, 1] - [5, 2]
          (parameter_list [1, 2] - [1, 12]
            (argument [1, 3] - [1, 11]
              name: (identifier [1, 3] - [1, 7])
              value: (literal [1, 8] - [1, 11]
                (number [1, 8] - [1, 11]
                  (integer [1, 8] - [1, 11])))))
          (variable_definition [2, 2] - [2, 33]
            name: (variable [2, 2] - [2, 9]
              (local_var [2, 2] - [2, 9]
                name: (identifier [2, 6] - [2, 9])))
            value: (function_call [2, 12] - [2, 33]
              (class [2, 12] - [2, 18])
              (class_method_call [2, 18] - [2, 21]
                name: (class_method_name [2, 19] - [2, 21]))
              (class_method_call [2, 21] - [2, 33]
                (parameter_call_list [2, 22] - [2, 32]
                  (argument_calls [2, 22] - [2, 32]
                    (named_argument [2, 22] - [2, 32]
                      name: (identifier [2, 22] - [2, 26])
                      name: (variable [2, 28] - [2, 32]
                        (local_var [2, 28] - [2, 32]
                          name: (identifier [2, 28] - [2, 32])))))))))
          (binary_expression [4, 2] - [4, 11]
            left: (variable [4, 2] - [4, 5]
              (local_var [4, 2] - [4, 5]
                name: (identifier [4, 2] - [4, 5])))
            right: (literal [4, 8] - [4, 11]
              (number [4, 8] - [4, 11]
                (float [4, 8] - [4, 11]))))))
      (instance_method_call [5, 2] - [5, 7]
        name: (method_name [5, 3] - [5, 7])))))
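
For anyone who wants to poke at a dump like the one above without installing anything, here is a minimal Python sketch that reads this text format back into a nested structure. It is only an illustration: the names Node, parse_tree and walk are made up for this example and are not part of tree-sitter or of the grammar, and the sample is just the binary_expression part of the tree above.

```python
import re
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Node:
    """One node of the dump: kind, zero-based (row, col) range, optional field name."""
    kind: str
    start: Tuple[int, int]
    end: Tuple[int, int]
    field_name: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


# Matches either a node header such as "name: (identifier [1, 3] - [1, 7]"
# or a single closing parenthesis.
TOKEN = re.compile(
    r"(?:(?P<field>\w+):\s*)?\((?P<kind>\w+)\s*"
    r"\[(?P<sr>\d+),\s*(?P<sc>\d+)\]\s*-\s*\[(?P<er>\d+),\s*(?P<ec>\d+)\]"
    r"|(?P<close>\))"
)


def parse_tree(text: str) -> Node:
    """Rebuild the node tree from the textual dump using a stack of open nodes."""
    root: Optional[Node] = None
    stack: List[Node] = []
    for m in TOKEN.finditer(text):
        if m.group("close"):
            if not stack:
                raise ValueError("unbalanced closing parenthesis")
            node = stack.pop()
            if not stack:
                root = node
            continue
        node = Node(
            kind=m.group("kind"),
            start=(int(m.group("sr")), int(m.group("sc"))),
            end=(int(m.group("er")), int(m.group("ec"))),
            field_name=m.group("field"),
        )
        if stack:
            stack[-1].children.append(node)
        stack.append(node)
    if root is None or stack:
        raise ValueError("unbalanced tree dump")
    return root


def walk(node: Node, depth: int = 0) -> Iterator[Tuple[int, Node]]:
    """Yield (depth, node) pairs in document order."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


# The `sig * 0.5` expression, copied from the dump above.
SAMPLE = """
(binary_expression [4, 2] - [4, 11]
  left: (variable [4, 2] - [4, 5]
    (local_var [4, 2] - [4, 5]
      name: (identifier [4, 2] - [4, 5])))
  right: (literal [4, 8] - [4, 11]
    (number [4, 8] - [4, 11]
      (float [4, 8] - [4, 11]))))
"""

tree = parse_tree(SAMPLE)
for depth, node in walk(tree):
    label = f"{node.field_name}: " if node.field_name else ""
    print("  " * depth + f"{label}{node.kind} {node.start}-{node.end}")
```

Once the tree is in this shape, questions like "which node covers row 4, column 8?" become simple range checks, which is roughly what an editor does for scope-aware highlighting.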

scztt · February 11, 2021, 11:22am

This is phenomenal mads !!!

One immediate use-case for this would also be to fix the SuperCollider IDE’s long-broken auto-indent logic. I’ve tried several times to improve this, but it’s a pretty tough problem if you don’t have a proper AST and I can never get all of the cases right.
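
To make the auto-indent idea concrete, here is a rough Python sketch of how an indent level could be derived from node ranges once a proper tree is available. It is not how the SuperCollider IDE works today. The row numbers come from the tree dump posted earlier in this thread; the choice of which node kinds add an indent level, and the name indent_level, are assumptions made for the example.

```python
from typing import List, Set, Tuple

# (node kind, start row, end row), taken from the tree dump earlier in the thread.
BLOCKS: List[Tuple[str, int, int]] = [
    ("code_block", 0, 6),
    ("function_block", 1, 5),
]

# Assumption: these are the node kinds that should add one indent level.
INDENTING_KINDS: Set[str] = {"code_block", "function_block"}


def indent_level(row: int, blocks: List[Tuple[str, int, int]] = BLOCKS) -> int:
    """Count the indenting nodes that strictly enclose `row`.

    The opening and closing rows of a block are not counted, so a line that
    starts with the closing bracket lines up with the line that opened it.
    """
    return sum(
        1
        for kind, start, end in blocks
        if kind in INDENTING_KINDS and start < row < end
    )


levels = [indent_level(row) for row in range(7)]
print(levels)  # [0, 1, 2, 2, 2, 1, 0]
```

The result matches the indentation of the example code block in the thread. A real implementation would also have to handle closing brackets that do not start their line, which this sketch ignores.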

There’s another interesting use case for this as well… sclang would gain a lot of power if it could introspect into its own AST. Currently, our ASTs are represented as C++ objects, which - for various reasons - are difficult to “keep around” after compilation, and to expose to sclang. But, a much simpler solution would be to have our C++ AST simply assemble a tree-sitter AST. We’d be skipping the most interesting part of TS - the parsing - and just assembling the nodes directly (we’d need to use some private interfaces for this, but it looks do-able). The advantage here is that sclang would gain the ability to both inspect its own AST and do the cool “live” parsing that TS provides, which would be quite interesting for e.g. creating livecoding languages.

VSCode has good tree-sitter support, I may try to add this at some point…

madskjeldgaard · February 11, 2021, 11:48am

That’s great Scott!
I should say that the grammar I have written is still work-in-progress (see the readme for a todo list) but I would say it has 95% of the sclang grammar implemented now (and I am constantly discovering new esoteric language features, mainly by stress testing the tree-sitter parser with Fredrik Olofsson’s tweets).

I think that all sounds like a great idea. I think it would be awesome to implement it in the scide (and one of the dreams here is to allow beginners to see syntax errors precisely and immediately without having to wait until executing the code to rummage through the post window for details)*.

If anyone is new to tree-sitter then this talk is a pretty good intro

*The grammar needs to be fully implemented for this to work properly of course
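
As a rough illustration of the "precise and immediate syntax errors" idea: tree-sitter marks regions it cannot parse with ERROR nodes, so a tool can turn those node ranges into messages. The Python sketch below scans a textual tree dump for such nodes. The dump in it is hypothetical and is not real output from the SuperCollider grammar; find_errors and format_diagnostic are names invented for this example.

```python
import re
from typing import List, Tuple

Span = Tuple[int, int, int, int]  # start row, start col, end row, end col

# Header of an ERROR node as it appears in a textual tree dump.
ERROR_NODE = re.compile(r"\(ERROR \[(\d+), (\d+)\] - \[(\d+), (\d+)\]")


def find_errors(dump: str) -> List[Span]:
    """Return the range of every ERROR node found in the dump."""
    return [
        (int(m.group(1)), int(m.group(2)), int(m.group(3)), int(m.group(4)))
        for m in ERROR_NODE.finditer(dump)
    ]


def format_diagnostic(span: Span) -> str:
    """Turn a zero-based node range into a one-based line:column message."""
    sr, sc, er, ec = span
    return f"syntax error from {sr + 1}:{sc + 1} to {er + 1}:{ec + 1}"


# Hypothetical dump for illustration only.
HYPOTHETICAL_DUMP = """
(source_file [0, 0] - [2, 0]
  (ERROR [0, 0] - [0, 14]
    (identifier [0, 4] - [0, 7]))
  (function_call [1, 0] - [1, 9]))
"""

for span in find_errors(HYPOTHETICAL_DUMP):
    print(format_diagnostic(span))  # syntax error from 1:1 to 1:15
```

An editor integration would read the ranges from the live tree instead of a text dump, but the mapping from node range to message would look much the same.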

jamshark70 · February 14, 2021, 2:14am

Nice

This seems relevant to one of my long-standing wishlist items: Support in the IDE for custom syntax.

We allow users to create custom syntax using the interpreter preprocessor.

The IDE rigorously enforces one and only one syntax. Particularly restrictive to me is that the IDE permits running code interactively only for .sc and .scd files (execution is disabled for other file extensions), and these file types handle syntax highlighting in one and only one way. Implicitly, then, the IDE discourages live coding dialects. I’m quite sure this is not the message we want to send, but it is the reality now. My live coding dialect looks hideous in the IDE, and there is currently nothing I can do about it except switch to emacs.

If tree-sitter parsing were integrated into the IDE, it would open the door to custom tree-sitter profiles for other file extensions.

It would also be an opportunity to address some of the nagging inconsistencies in cursor navigation in the IDE. Some of them are quite silly indeed.

hjh

madskjeldgaard · May 16, 2021, 5:37pm

I’ve updated the README quite a bit to make it easier to contribute (an implicit cry for help, haha!):
GitHub - madskjeldgaard/tree-sitter-supercollider: SuperCollider grammar for the tree-sitter code parser

thresholdpeople · May 16, 2021, 8:29pm

do you have a recommended workflow for installing and using this?

i’ve mostly been using scide, but have also just recently installed atom - which seems to have tree-sitter support - and supercollider for atom

is there a best practice for where to install your repository? i suppose i’m specifically asking about where to put it so atom will recognize it.

tree-sitter only mentions that atom uses something similar, but different, but i take it this is outdated info?

madskjeldgaard · May 16, 2021, 9:01pm

I’m sorry but I don’t use Atom, but I’m pretty sure there must be a recommended procedure for using it in Atom.

For comparison: in NeoVim, which is my IDE, you need to install a tree-sitter plugin which has a built-in link to my repo, allowing you to run the command :TSInstall supercollider to automatically install it. Maybe there is a similar workflow somehow in Atom?

thresholdpeople · May 16, 2021, 9:24pm

thanks mads, will dig deeper.

i tried installing neovim, but too much head scratching for me at the moment - installing vim-plug was pretty challenging for me. i don’t have enough experience using terminal for things to make the process efficient.

thinking that atom is going to be more newbie friendly.

madskjeldgaard · May 16, 2021, 9:40pm

Yeah I think you’re right ! Let me know how it goes

madskjeldgaard · May 18, 2021, 9:03am

The grammar has gotten a significant update and is getting close to finished. I’ve added some showcase examples where you can see LSP-type usages of this in action:

Sam_Pluta · May 18, 2021, 9:42am

Poking around a bit (and these tools are new for me, so don’t quote me on this), it looks like Mads would need to publish the parser to npm, as described here:

https://flight-manual.atom.io/hacking-atom/sections/creating-a-grammar/

and then we would install it in Atom. I don’t know how you then use the tree-sitter parser with sc in Atom, but it seems like you can just swap out the parser and keep the rest?

Sam

madskjeldgaard · May 18, 2021, 9:59am

ah that’s great Sam. I think most of the work is done in that regard: tree-sitter-supercollider/package.json at main · madskjeldgaard/tree-sitter-supercollider · GitHub

I guess I just need to figure out how to publish it on npm

madskjeldgaard · May 18, 2021, 10:29am

I’ve published the parser on npm now, if anyone wants to try it out with Atom. And if so, could someone add instructions on how to use it/set it up in Atom, either here or to the readme on GitHub? I don’t use Atom so I can’t be arsed to do it honestly, haha.

thresholdpeople · May 18, 2021, 11:47am

thank you Sam and Mads! i’ll give this a try this evening - i was hoping to try installing it before work just now, but i think i’ve stumbled upon what’s been causing me issues getting neovim and vim plug and etc working. seems like node-gyp and xcode CLT disagreements on macos catalina.

thresholdpeople · May 18, 2021, 5:28pm

hmm, well it’s not going to be so straightforward in Atom, I don’t think.

Atom wants its grammars bundled differently than tree-sitter - I don’t know if this makes a real difference or not in practice - the tree-sitter way vs Atom’s tree-sitter implementation:

from the Atom manual:

The Package

Once you have a Tree-sitter parser that is available on npm, you can use it in your Atom package. Packages with grammars are, by convention, always named starting with language. You’ll need a folder with a package.json, a grammars subdirectory, and a single json or cson file in the grammars directory, which can be named anything.
language-mylanguage
├── LICENSE
├── README.md
├── grammars
│   └── mylanguage.cson
└── package.json
The Grammar File

The mylanguage.cson file specifies how Atom should use the parser you created.

Basic Fields

It starts with some required fields:
name: 'My Language'
scopeName: 'mylanguage'
type: 'tree-sitter'
parser: 'tree-sitter-mylanguage'
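
For what it's worth, the layout quoted above is small enough to generate with a script. The Python sketch below writes such a package skeleton into a temporary folder. The four grammar fields are the ones quoted from the manual; everything else is an assumption for illustration: the package name language-supercollider, the scopeName value, the placeholder version, and listing the npm parser under "dependencies" with a wildcard version. A working Atom grammar will most likely need more fields than these.

```python
import json
import tempfile
from pathlib import Path


def write_atom_package(root: Path, slug: str, display_name: str, parser_package: str) -> Path:
    """Create language-<slug>/ with package.json and grammars/<slug>.cson."""
    pkg = root / f"language-{slug}"
    (pkg / "grammars").mkdir(parents=True)

    manifest = {
        "name": f"language-{slug}",
        "version": "0.0.1",  # placeholder version, not a real release
        # Assumption: the parser published on npm is pulled in as a dependency.
        "dependencies": {parser_package: "*"},
    }
    (pkg / "package.json").write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")

    # The four required fields quoted from the Atom manual.
    cson_lines = [
        f"name: '{display_name}'",
        f"scopeName: '{slug}'",  # assumption: mirrors the manual's 'mylanguage' pattern
        "type: 'tree-sitter'",
        f"parser: '{parser_package}'",
    ]
    (pkg / "grammars" / f"{slug}.cson").write_text("\n".join(cson_lines) + "\n", encoding="utf-8")
    return pkg


with tempfile.TemporaryDirectory() as tmp:
    pkg = write_atom_package(
        Path(tmp), "supercollider", "SuperCollider", "tree-sitter-supercollider"
    )
    for path in sorted(pkg.rglob("*")):
        print(path.relative_to(pkg.parent))
```

The LICENSE and README.md from the manual's layout are left out here since they do not affect how the grammar is loaded.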

It seems very strange to me that Atom would allow tree-sitter to work, but in order for it to do so, a large reorganization of the documents, file names, etc is necessary!

At this point it seems like it’ll be simpler to just get Neovim working!

madskjeldgaard · May 18, 2021, 5:48pm

Hmm, that’s very strange since tree-sitter was originally made for Atom AFAIK. There must be a nice command or something we can run to make this work.

thresholdpeople · May 18, 2021, 6:00pm

in looking around at some other grammar files in Atom - for instance crucialfelix’s SuperCollider for Atom implementation, it seems that these headers and naming conventions are the way their other system works… so maybe what they’re calling integration or compatibility is actually closer to conversion.

thresholdpeople · May 18, 2021, 6:38pm

hmm, I just found this: Guide to writing your first Tree-sitter grammar · GitHub - it has a section on setting up a tree-sitter language in Atom. I will explore it.

According to that guide, while it will be necessary to have a grammars folder and a tree-sitter-supercollider.cson file, that file and the package.json file will pretty much just link to the tree-sitter language that is installed elsewhere. So updating the tree-sitter files shouldn’t require doing all of these steps again.

These instructions are for how to graft an existing language into an existing package in Atom, while the document that Sam linked to seems like it’s meant more for how to build a package around the tree-sitter stuff.

Will report back!