RTNeural real-time neural inference UGen

Dear all,

I have made a new plugin which implements the RTNeural real-time neural inference engine:

A mac universal release build is here:

RTNeural is a SuperCollider UGen that uses the RTNeural inference engine to load and run RTNeural-format neural network models at audio and control rates. See the RTNeural GitHub page (github.com/jatinchowdhury18/RTNeural) for the list of supported neural network layers.

Models can be trained in PyTorch or TensorFlow. RTNeural was designed around TensorFlow/Keras-style models, but I give examples of how to save Linear, LSTM, and GRU based networks in the correct format.
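To make the "correct format" idea concrete, here is a minimal sketch of packing a trained layer's weights into a Keras-style JSON file. Note this is illustrative only: the canonical schema is defined by RTNeural's JSON parser and by the export scripts shipped with the plugin, so the field names below (`in_shape`, `layers`, `type`, `weights`, etc.) are assumptions to be checked against those.

```python
# Illustrative sketch only: the real schema is defined by RTNeural's JSON
# parser and the plugin's export scripts; field names here are assumptions.
import json

def dense_layer(weight_rows, bias, activation="tanh"):
    """Pack one Dense layer as a Keras-style JSON dict (assumed field names)."""
    return {
        "type": "dense",
        "activation": activation,
        "shape": [None, None, len(bias)],
        "weights": [weight_rows, bias],  # [W, b], W as nested lists
    }

# A toy 1-in / 2-out Dense layer: y = activation(W.T @ x + b)
model = {
    "in_shape": [None, None, 1],
    "layers": [dense_layer([[0.5, -0.25]], [0.0, 0.1])],
}

with open("model.json", "w") as f:
    json.dump(model, f, indent=2)

# Round-trip check: the file reloads into the same structure
with open("model.json") as f:
    loaded = json.load(f)
assert loaded["layers"][0]["type"] == "dense"
```

The point is just that the on-disk format is plain JSON, so any training framework that can dump nested lists of floats can produce it.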

-Linux users should be able to build with the build instructions.
-HELP WANTED - The README has instructions for building on Windows. I am able to build it under Parallels, but the resulting scx file is corrupted; I think this is a Windows ARM thing. If anyone can build it on a Windows PC and gets it working, please DM me.

Any feedback on this is welcome.

FYI - I have also made a UGen that does real-time inference with ONNX Runtime. I still need to polish it up. It runs essentially the same way as RTNeural, BUT it loads standard ONNX files (which PyTorch can save directly) and uses ONNX Runtime for inference. The disadvantage is that it is much less efficient (2-10x slower). I will polish it up and release it soon. My hunch is that there will be better and better inference engines (ONNX Runtime is already real-time safe) that can load the ONNX format, so going in that direction could have advantages.

Sam


I realize the question was erased, but two things:

  1. There is a huge help file for this plugin, plus a bunch of examples in Python as well as SuperCollider. Training for this plugin needs to happen in Python, and I have shown how to do some things.

  2. The plugin is capable of far more than I achieve in the examples. If you can train a network and save it in the right format, the UGen should be able to run inference on it. I honestly don’t know how this will be useful; probably someone will find an unexpected use for it. I enjoyed the process of making a general-purpose real-time audio plugin for neural networks, but as with many things AI, the research on how to train models correctly for audio tasks is at a nascent stage. There are some nice papers out there with some good ideas!

Sam


Thanks, @Sam_Pluta !!
Quick question, what is the difference between RTNeural and Proteus? Is RTNeural a more general implementation that integrates Proteus?

Best,

José

Yes, exactly. Proteus is an LSTM neural net with a hidden size of 40. All of the Proteus models were trained on 44.1 kHz, 16-bit audio. The inference is handled by the RTNeural library, and since it is a fixed graph, it is actually more efficient than the RTNeuralUGen.
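For a feel of the per-sample work an LSTM-based model like Proteus does at audio rate, here is a from-scratch sketch of a single LSTM time step in plain Python. The weights are toy values, not from any real model, and hidden size 2 stands in for Proteus's 40.

```python
# One LSTM time step from scratch, to illustrate the per-sample computation
# an LSTM-based model performs at audio rate. Toy weights, hidden size 2
# (Proteus uses 40). Gate order assumed: input, forget, cell, output.
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """Advance hidden state h and cell state c by one input sample x."""
    n = len(h)
    # Pre-activations for all 4*n gates: b + W*x + U@h
    gates = []
    for j in range(4 * n):
        s = b[j] + W[j] * x
        for k in range(n):
            s += U[j][k] * h[k]
        gates.append(s)
    i = [sigmoid(v) for v in gates[0:n]]            # input gate
    f = [sigmoid(v) for v in gates[n:2*n]]          # forget gate
    g = [math.tanh(v) for v in gates[2*n:3*n]]      # candidate cell
    o = [sigmoid(v) for v in gates[3*n:4*n]]        # output gate
    c_new = [f[k] * c[k] + i[k] * g[k] for k in range(n)]
    h_new = [o[k] * math.tanh(c_new[k]) for k in range(n)]
    return h_new, c_new

n = 2
W = [0.1] * (4 * n)
U = [[0.05] * n for _ in range(4 * n)]
b = [0.0] * (4 * n)
h, c = [0.0] * n, [0.0] * n
h, c = lstm_step(0.5, h, c, W, U, b)  # one audio sample in, new state out
```

Running this once per sample, with hidden size 40, is roughly the inner loop that the fixed-graph Proteus implementation gets to optimize.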

RTNeural, the UGen, can load Proteus networks that have been saved in RTNeural format (I provide a script to convert a PyTorch JSON to an RTNeural/Keras-style JSON).

But it can also load any network saved in RTNeural format - it can use Dense, GRU, LSTM, Conv1D, Conv2D, BatchNorm1D, and BatchNorm2D layers and supports most activation functions. So if you train a GRU model with hidden_size 80 at 96 kHz on 24-bit files, RTNeural can load and run that model. Proteus cannot.

Another way to put it: tequila is a kind of mezcal made from the blue agave plant, but there are around 30 other kinds of mezcal made from different agaves, none of which are tequila. Proteus is tequila. RTNeural is mezcal.

BTW - I just published a bug fix to both RTNeural and Proteus having to do with SynthDef graph ordering problems. So if you want to use them, you should probably update.

Sam

Thanks for the explanation!
I really liked the comparison with tequila 🙂

I’m sure I’ll use it!
I’d like to know if it’s possible to get more timbre variation, for example by changing a model’s parameters. Or do you know if it’s possible to interpolate between models, for example moving from one to another to hear the intermediate states, or to create a dynamic change in the processing?
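To make the interpolation idea concrete, I imagine something like this done offline: linearly blending the weights of two models that share the same architecture, then saving a new model file per blend amount. This is just my sketch of the question, not something the UGen is known to support, and whether intermediate points sound meaningful presumably depends on the models.

```python
# Offline sketch of the interpolation idea: blend the weights of two models
# with identical architecture. Whether intermediate blends sound meaningful
# is an open question; the structures below are toy stand-ins for real models.
def lerp(a, b, t):
    """Elementwise linear interpolation over arbitrarily nested lists."""
    if isinstance(a, list):
        return [lerp(x, y, t) for x, y in zip(a, b)]
    return (1.0 - t) * a + t * b

model_a = {"weights": [[0.0, 1.0], [2.0, 3.0]]}
model_b = {"weights": [[4.0, 5.0], [6.0, 7.0]]}

# t = 0.5: halfway between the two models
halfway = {"weights": lerp(model_a["weights"], model_b["weights"], 0.5)}
# halfway["weights"] == [[2.0, 3.0], [4.0, 5.0]]
```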

Thanks again!

Best,

José