Hi everyone,
I haven’t found any mention of this in the forum yet, but I’d like to collect experiences (if any!) with the RAVE SuperCollider UGen by Victor Shepardson.
Thanks for the link and to Victor Shepardson for porting Rave to SuperCollider!
I just compiled it and it works! I have only tested it with a model that comes with IRCAM’s Max version. Apparently there are several ready-made models, for example at https://neutone.space, but I don’t know if they are compatible with this version. Maybe the documentation could be expanded to match the Max version (for example, the “advanced” tab in Max with a GUI)?
In general this kind of processing has a lot of latency, but it can be used to generate interesting sounds, or when latency is not important.
I will continue to test it!
For anyone who saw rave-supercollider before last week, it’s now been completely overhauled with separate Encoder and Decoder UGens and a more straightforward interface.
@Jose I just pushed support for Neutone models! RAVEEncoder and RAVEDecoder only, since the Neutone export process replaces the forward method and discards the prior (?). After downloading models in the Neutone app, they can apparently be found at /Users/<user>/Library/Application Support/Qosmo/Neutone/models/ (on a Mac). If you can get an original RAVE export, that’s still better.
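For reference, a minimal sketch of feeding a Neutone export through the encoder/decoder pair, following the argument order used elsewhere in this thread – the model filename “violin.ts” is hypothetical (check your own Neutone models folder), and the latent size of 8 is an assumption that has to match the model:

~path = "~/Library/Application Support/Qosmo/Neutone/models/violin.ts".standardizePath;
(
{
    // encode live input, then decode it back to audio; 8 = assumed latent size
    var z = RAVEEncoder.new(~path, 8, SoundIn.ar(0));
    Limiter.ar(RAVEDecoder.new(~path, z)) ! 2;
}.play;
)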
I haven’t tried rave-supercollider with a low-latency RAVE model yet – if anyone does please share!
Wow! I’d seen it before last week, so I’m definitely looking forward to trying the new gear.
I’ve been looking into RAVE and training some models… do you know if there is any resource to help tuning the training? Like choosing parameters, making nice datasets, and understanding the relationships between the two?
And also, I usually train at 44100, does it mean that I can’t use those models in SC, or would they work if SC is also running at 44100?
I just tested the Neutone models and they work great – here’s an audio example of a violin model, with the SC code below.
In RAVEEncoder, if the latentSize is different from the model’s, the server crashes. Is there any way to retrieve the latentSize of a model before sending it to RAVEEncoder?
Otherwise, here’s an SC version of the Max nn~ “advanced” example, “to get control over the generation” – of course you can modulate any of the latent dimensions with different techniques…
b = Buffer.read(s, Platform.resourceDir +/+ "sounds/a11wlk01.wav");
(
~synth = {
    var z = RAVEEncoder.new(
        "/Users/jose/nn_tilde/help/wheel.ts", 8,
        PlayBuf.ar(1, b, BufRateScale.kr(b), loop: 1.0) // input for latent embedding
    );
    z[1] = MouseX.kr(-3, 3);
    z[4] = MouseY.kr(-3, 3);
    Limiter.ar(
        RAVEDecoder.new(
            "/Users/jose/nn_tilde/help/wheel.ts",
            z // latent input
        )
    ) ! 2;
}.play;
)
It’s tricky – you have to supply the latent size to RAVEPrior and RAVEEncoder because sclang needs to create the OutputProxy for each latent before scsynth actually loads the torchscript file. I’m not sure how to get around this – reading the torchscript file from sclang, for example, looks hard. I will fix it so it doesn’t crash the server, though.
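One way to avoid a mismatch in the meantime, sketched from the example above: keep the model path and its latent size together in one place so they can’t drift apart. The size still has to be known in advance (e.g. from the model’s training config) – this doesn’t query the file:

// sketch: bundle the model path with its known latent size
~model = (path: "/Users/jose/nn_tilde/help/wheel.ts", latentSize: 8);
(
{
    var z = RAVEEncoder.new(~model.path, ~model.latentSize, SoundIn.ar(0));
    Limiter.ar(RAVEDecoder.new(~model.path, z)) ! 2;
}.play;
)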
I have some experience training RAVE v1 models on Google Cloud Platform. Do you have any specific questions? I could send over some notes on the setup if you like.
Hi everybody!
These days I’ve been coding an adaptation of nn_tilde for SuperCollider!
It uses the same backend as nn_tilde, so it features processing arbitrary model methods, setting and getting attributes, even processing on a separate thread, and it loads torchscripts, so it can potentially load ML models other than RAVE.
So far I’ve used it only on Linux, only on CPU, on RAVE (v1 and v2) models and msprior.
How is this working for you so far? I’ve tried RAVE, both the VST version and the SuperCollider UGen, and generally observed that it’s waaaaay too inefficient to do anything remotely realtime. But it seems like SOMEONE is doing realtime things with it, so I can’t help but feel like the problem is on my side? It’s definitely not a horsepower problem – the computer I’m running it on is plenty fast. It is a Mac, though, so it could be a case of insufficient compute support on Mac.
I found it sort of a hassle to build (not your fault – just having to download Homebrew and install libtorch) – if you want, I can send you my M1 build to add to the GitHub?
…and if anyone has a mac intel build I’d love if you could share it… I tried hard but gave up
I found the code in the help file didn’t work for me, but I got the basic max nn~ example working with this:
s.boot
NNModel.load(\wheel, "/.../wheel.ts");
b = Buffer.read(s,"/.../soundfile.wav");
(
{
    var in = PlayBuf.ar(1, b, BufRateScale.kr(b), loop: 0);
    var out = NN.ar(\wheel, \forward, 2048*4, in);
    [out, in];
}.play
)
s.record
The first thing I’ve tried is this recording of a flute and out came this creature:
It would be really great to have the M1 build available, so that users with less fluency in programming or compiling could get the chance to try this – it sounds so promising!
Hey Scott,
same here, lots of dropouts in realtime… while I guess I have an “old” computer, all the usable results I’ve gotten from RAVE so far were when I generated audio directly from Python, non-realtime…
And this is also why I’m writing this plugin… to figure out if nn~ implementation makes it better (with the circular buffering and separate thread). But as I said, so far I still have loads of dropouts, unless I use bigger buffers (>= 4096 samples).
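As a concrete way to probe this, here’s a sketch based on the NN.ar example earlier in the thread (assuming the \wheel model has already been loaded with NNModel.load) – swap in different buffer sizes and listen for dropouts; larger buffers mean more latency but fewer dropouts:

// sketch: try successive buffer sizes with the same model
~bufSize = 4096; // also try 2048, 8192, ...
{ NN.ar(\wheel, \forward, ~bufSize, SoundIn.ar(0)) ! 2 }.play;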
I’m currently working on an NRT interface to see if the results are comparable with what I used to get from Python. But I’m getting quite confused with the interface I’m designing, and I don’t have the quiet time to go through it at the moment… I’ll probably come up with something in about 2 weeks.
@Eric_Sluyter so great to hear from you!!! Sounds a lot like the stuff I’m doing as well!! And I see, no dropouts on an M1 with 4096*2 buffer size: that’s similar to what I get on my linux machine. Thanks for sharing! Is this the smallest usable buffer size you’ve tried?