SuperCollider on Linux

Thank you very much for running this test on your machine. This is with Ubuntu Studio, right? Just out of curiosity, are you using an Intel CPU?

Gerhard

It is Intel, but a/ I was testing on the built-in soundcard (and for some unknown reason, performance is noticeably worse than a USB soundcard in Ubuntu 20.04) and b/ I hadn’t set the CPU governor to “performance.” So it could quite possibly do a bit better. But to be honest I don’t have the bandwidth to run a lot more benchmarks.

64 is an extremely small buffer size. To get that to work, I’d recommend joining a Linux audio forum, where you can find people who really know how to configure a system (instead of people like me, who got close enough and left it at that – that is, I am not an expert at Linux audio configuration, and for a 64-sample buffer, quite likely you need expert help).
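For what it’s worth, on Linux the hardware buffer size (the JACK period) is chosen when jackd is started, not from within SC; from sclang you can only set scsynth’s internal control block size, which has to divide the JACK period evenly. A minimal sketch (the values are just examples):

```supercollider
// The JACK period is chosen when starting jackd, e.g.:
//   jackd -d alsa -p 128 -n 2
// Inside SC, only the control block size is configurable; it must
// evenly divide the JACK period (64 samples is the default):
Server.default.options.blockSize = 64;
Server.default.reboot;
```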

hjh

Yes, I am aware of this. I don’t necessarily need it to be that small, 128 or 256 may also be sufficient for what I am doing at the moment. I chose 64 in the tests to create a more demanding situation. The xrun problem also appears with considerably larger block sizes (512 and 1024), only with a bit more load.

My impression is that SC uses Jack substantially differently than, for instance, Pure Data, where I can run a process on one core at 85% without any xruns. The Jack test client I wrote also performs in a predictable way, i.e. if the CPU load in ps or top approaches 100%, xruns start to appear and become more frequent with more load. With SC this limit seems to be much lower. I would like to understand why that is, as my impression is that with SC I get much less out of the machine than I could.

1 Like

I just want to add that I am seeing the same results as @eckel on both Ubuntu and Ubuntu Studio, i.e. scsynth produces xruns at much lower CPU loads than other jack clients. If anyone can shed any light on this it would be very much appreciated.

I just made another test with loading an SC file directly into sclang. This is what the file test.scd contains:

Server.default.waitForBoot({
	500.do({ { SinOsc.ar([200, 202], 0, 0.001) }.play() });
});

If I run this in the IDE I get lots of xruns. If I run it from the terminal with sclang test.scd, I don’t get any xruns.
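As a side note, in case the burst of 500 simultaneous .play calls itself contributes to the spikes: spreading synth creation over time with a Routine would separate startup load from steady-state DSP load. A hedged variant of test.scd (the 0.01-second wait is an arbitrary choice):

```supercollider
Server.default.waitForBoot({
	fork {
		500.do({
			{ SinOsc.ar([200, 202], 0, 0.001) }.play;
			0.01.wait;  // ramp up over ~5 seconds instead of all at once
		});
	};
});
```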

Yes, I can reproduce this as well. I had to go to 1000.do to create the xruns in the first place, but when I ran the file from the terminal it worked fine.

Jack reported ~25% load when I ran the code from the terminal and ~30% when I ran it from the IDE. Shouldn’t those numbers be the same?

I want to add that using the configuration script, installing a realtime kernel and possibly applying some of the suggestions from the Ardour manual gives me a really well-configured audio system on Linux. Of course, installing Ubuntu Studio or AVLinux might give you good results more easily, but it’s certainly not the only way.

That is very bizarre. I can’t even speculate a way to explain that. In theory, the IDE does nothing to touch any audio threads or audio subsystem at all… but obviously the theory doesn’t cover everything.

I’m totally stumped by that.

It makes me wonder about other editors, then (emacs or vim)… same issue?

I’d call it a bug actually, if it can be consistently reproduced. There’s no reason to accept performance degradation based on the editor.

This at least could be substantiated in the source code, by comparing Pd sources against SC’s.

hjh

What is also curious in the case of Pd is that the load displayed in Jack is minimal as compared to SC. In Pd it is only a few percent (and 85% in ps), whereas in SC it is of the same order of magnitude as the CPU load reported by ps. It seems that Pd runs an independent audio thread and then only copies the frames into Jack. But this is just a wild guess, as I am not familiar with the source code of either.

This is correct! By default, Pd uses a “polling scheduler”, which means that the scheduler runs in a dedicated thread and communicates with the actual audio callback via a lockfree ringbuffer. This means the audio callback does almost no work. The size of the ringbuffer (= “delay” in Pd’s audio settings) introduces additional latency to compensate for CPU spikes and non-RT-safe operations. This is necessary because Pd runs DSP and control code deterministically in the same thread.

I compiled SC without Qt in order to run sclang headless. In this case I can pump up the number of synths to 4000, resulting in a ps CPU load of 85% in sustained operation. At that load I start to get a few xruns every now and then (I added a line to the xrun callback in SC so they are also reported in the console). This is the kind of behaviour I was expecting, and what I get with Pd and my test client.

So it all seems to have to do with GUI code interrupting the audio processing.

1 Like

Something funny about DSP load indicators: on a laptop that is neither powerful nor specially configured, with a 256 buffer size, I see 5%+ DSP load in Jack with nothing running. If I run the yes command alone in a terminal, it goes down to 2%+ … why?

LOL, I get xruns purely from starting jackd with 64 sample frames and 2 periods, without even a single client connected; the script reports all green except inotify and the RT kernel. Any pointers for a hassle-free installation of an RT kernel on Debian?

Regarding the IDE vs. terminal performance difference: without particular knowledge of the internals, I would be surprised if the GUI threads interfered with the scsynth audio thread, since it’s an entirely independent process. I would guess that perhaps the system just runs a number of additional threads when the IDE is used. Perhaps this can be observed with top or something similar?

Great that you have narrowed it down! I wonder how hard it would be to fix the underlying issue. I suppose everyone would prefer the GUI lagging over having an xrun, but I still don’t understand why the two would even be connected (unless the two share a CPU core and that core is at 100%).

What is also curious in the case of Pd is that the load displayed in Jack is minimal as compared to SC.

This would also have implications for running multiple servers to make use of several cores right? If the load is kept out of Jack adding parallel processes should be fine, but if the load is somehow shared more with Jack then maybe expanding to multiple servers wouldn’t yield the desired outcome if Jack itself acts as a bottleneck.
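For reference, booting a second server from the same sclang is straightforward, and each scsynth shows up in Jack as its own client, so this would be one way to test whether Jack itself becomes the bottleneck. A sketch (the name \second and port 57111 are arbitrary choices):

```supercollider
(
~s2 = Server(\second, NetAddr("127.0.0.1", 57111));
~s2.waitForBoot({
	// a second scsynth process, i.e. a separate Jack client,
	// which the OS is free to schedule on another core
	{ SinOsc.ar([300, 303], 0, 0.001) }.play(~s2);
});
)
```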

@eckel

Can you post the top output of all relevant threads (scsynth, sclang and scide)? Then we can have a look at the thread priorities.
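(If it helps, the server’s process id can be printed from sclang once it has booted, for use with top -H -p; I believe Server has a pid method in recent versions, but treat that as an assumption:)

```supercollider
Server.default.waitForBoot({
	Server.default.pid.postln;  // then inspect threads with: top -H -p <pid>
});
```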

@ludo

Jack clients run independently from the Jack server.

Jack clients run independently from the Jack server.

So the difference between how Jack reports its load for Pure Data and SuperCollider has no real practical implication, it is just reported differently?

I have explained the load difference here: SuperCollider on Linux - #30 by Spacechild1

Turbo boost, sorry for the noise.

Now I can see (with everything green except preemption) DSP peaks and xruns starting at 40~50% (onboard sound card) and 50~60% (external sound card)[1] constant CPU load on one core. With a 256 or 512 buffer size the behaviour is the same: 40~50% CPU gives xruns. But I can’t confirm that the IDE is the issue, as I get the same behaviour running a script from the terminal.

[1] Everything is configured the same and the test is the same; the USB devices on my laptop seem to require more CPU.

I have seen this with some plugins running under Wine: low CPU load but still some xruns. It would be nicer if that weren’t the case, and I don’t like my laptop.