Benchmarking scsynth


#1

I’m wondering if anyone has strategies to reliably benchmark scsynth performance.

My present case is testing different builds of UGens, but I’ve often tried benchmarking SynthDefs, relying only on the average and peak CPU reported by the server. Something like (pseudo code):

(
var n = 500;                       // number of synths to run
~synths !? { ~synths.do(_.free) }; // free any synths left over from a previous run
~synths = n.collect { Synth(\mysynth) }; // \mysynth is a placeholder for the SynthDef under test

~avg = 0;
~peak = 0;
r = Routine.run({
	var hz = 5;   // poll frequency
	var sum = 0;
	2.wait;       // wait for initial cpu spike to pass
	inf.do{ |i|
		i.postln;                      // tick
		~peak = max(s.peakCPU, ~peak); // save the highest peak
		sum = sum + s.avgCPU;
		~avg = sum / (i+1);            // average the average load
		hz.reciprocal.wait;
	}
});
)
// run for a while
r.stop; r = nil; [~avg, ~peak].round(0.1).postln

It’s ok for an approximation but hard to tell how accurate it is.

Relatedly, any insights on how to predict if CPU spikes will actually cause dropouts? (I’ve heard dropouts with spikes under 100%, and clean audio with spikes over 100%)

Anything more reliable/accurate? It’s of course really dependent on system load… keeping in mind that at any given time I may have 5 apps running in the background and more browser tabs open than I can count :flushed:

Thanks!


#2

these are problems that i’ve also been struggling with. my approach to ugen profiling is to benchmark without SC — write the raw DSP code in a way separable from the SC plugin, write a separate C++ program that runs a whole bunch of 64-sample chunks and times each one, and look at the entire statistical distribution of the runtimes.
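a minimal sketch of that kind of harness, assuming the DSP lives in an object with a `process(float*, size_t)` method. `OnePole` and `timeBlocks` are hypothetical names made up for illustration, not anything from SC; swap in the real kernel.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the DSP kernel under test: a one-pole lowpass
// that processes one block in place. Replace with the real per-block routine.
struct OnePole {
    float state = 0.0f;
    float coeff = 0.1f;
    void process(float* buf, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) {
            state += coeff * (buf[i] - state);
            buf[i] = state;
        }
    }
};

// Time every block separately; returns one duration in nanoseconds per block.
template <typename Kernel>
std::vector<double> timeBlocks(Kernel& kernel, std::size_t numBlocks,
                               std::size_t blockSize = 64) {
    using Clock = std::chrono::steady_clock;
    std::vector<float> buf(blockSize);
    std::vector<double> nanos;
    nanos.reserve(numBlocks);
    for (std::size_t b = 0; b < numBlocks; ++b) {
        // Refill with a deterministic ramp so every block sees the same input.
        for (std::size_t i = 0; i < blockSize; ++i)
            buf[i] = static_cast<float>(i) / static_cast<float>(blockSize);
        const auto t0 = Clock::now();
        kernel.process(buf.data(), blockSize);
        const auto t1 = Clock::now();
        nanos.push_back(
            std::chrono::duration<double, std::nano>(t1 - t0).count());
    }
    return nanos;
}
```

calling `timeBlocks(kernel, 100000)` gives the raw per-block runtimes. clock resolution varies by platform, so for very cheap kernels it may be necessary to time several blocks per measurement.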

to evaluate RT performance, you should at least look at the central tendency, the variance, and (suggested to me by scott c) the worst measured data point. central tendency is basically the same as NRT performance, and variance/maximum is the “unreliability” — unusually high variance means that the ugen’s real-time safety is more of a crapshoot. it’s not a perfect performance indicator but it seems reasonably scientific to me.
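those three numbers could be computed from a list of per-block runtimes roughly like this. it's only a sketch: `TimingSummary` and `summarize` are made-up names, and using the median as the central tendency is one choice among several.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct TimingSummary {
    double median;    // central tendency, robust to a few slow outliers
    double mean;
    double variance;  // population variance around the mean
    double worst;     // slowest block observed
};

// Takes the samples by value because it sorts its own copy.
TimingSummary summarize(std::vector<double> samples) {
    assert(!samples.empty());
    std::sort(samples.begin(), samples.end());
    const std::size_t n = samples.size();
    const double median = (n % 2 == 1)
        ? samples[n / 2]
        : 0.5 * (samples[n / 2 - 1] + samples[n / 2]);
    double sum = 0.0;
    for (double x : samples) sum += x;
    const double mean = sum / static_cast<double>(n);
    double squares = 0.0;
    for (double x : samples) squares += (x - mean) * (x - mean);
    return {median, mean, squares / static_cast<double>(n), samples.back()};
}
```

comparing two builds of a ugen then means comparing their summaries: a similar median with a much larger `worst` or `variance` would be the "crapshoot" case.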

it would be really cool if we could build such a profiler right into SC so we can measure performance in a context that more closely matches the production environment.

as for metrics such as peak/average CPU usage, probability/frequency of xruns, and audible dropouts, my understanding is that these have complex relationships with each other and dependencies on environmental factors like the operating system and sound card. the CPU meter, for instance, is the CPU usage for just the SC app on macOS, but for the entire JACK server on linux. investigation of xruns is also hindered by the fact that SC has no xrun detection functionality (and only some audio servers support programmatic xrun detection).

that’s what i’ve picked up about real-time audio DSP profiling mostly from informal discussions and personal experiences. i’d welcome any tips from people who are actual experts in this stuff.


#3

I cannot give a really profound answer to that, but in my experience (on OSX) SC is quite sensitive to whatever else is happening on the machine. For serious work (e.g. recording) I

.) close as many other apps as possible
.) close as many windows and tabs as possible
.) don’t do any polling, tracing or other posting of intermediate results
.) avoid or reduce MIDI to a minimum
.) avoid scoping
.) go offline
.) don’t move any windows

Some of this might sound a bit funny, but believe me, it is the result of practical experience.


#4

Also, turn off CPU frequency scaling. If the CPU governor detects a higher level of activity, it may switch to a higher clock speed, causing the visible CPU% in SC to go down.

“Spikes well below 100%” – SC’s CPU measurement is not about overall CPU activity. It’s only the amount of time it took to calculate the last hardware buffer, divided by the buffer duration. In practice, this exhibits a wide variance, even with a constant calculation load – and we’re talking about 10, 20, maybe 40 ms per buffer, so it doesn’t take much to delay a calculation just long enough to have a dropout. If your hardware buffer is 11.6 ms and the average load reports as 50%, it means on average each calculation cycle is taking 5-6 ms. It’s not that much margin for error. I don’t think we can reasonably expect to go up to 80-90% without dropouts.
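To make the arithmetic concrete, here is a small sketch. It assumes the 11.6 ms figure corresponds to a 512-frame buffer at 44.1 kHz (my assumption for the example), and the function names are hypothetical.

```cpp
// Duration of one hardware buffer in milliseconds.
constexpr double bufferMs(double frames, double sampleRate) {
    return 1000.0 * frames / sampleRate;
}

// Average compute time per buffer implied by a reported load percentage.
constexpr double computeMs(double loadPercent, double frames, double sampleRate) {
    return bufferMs(frames, sampleRate) * loadPercent / 100.0;
}

// Time left before the deadline for that buffer.
constexpr double headroomMs(double loadPercent, double frames, double sampleRate) {
    return bufferMs(frames, sampleRate) - computeMs(loadPercent, frames, sampleRate);
}
```

With 512 frames at 44100 Hz, `bufferMs` is about 11.6 ms, so a 50% load means roughly 5.8 ms of computation and 5.8 ms of slack; at 90% the slack shrinks to a little over 1 ms.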

hjh


#5

This would be ideal, but a bit too far out on my horizon, still early days for me writing UGens. Your points on central tendency, the variance, and worst measured data point sound like great things to investigate.

+1!

I suspect this is going on, unfortunately. I was hoping that by running many instances of the test at once, I’d be “demanding” consistent attention from the governor. I’m running macOS though and I don’t know of a dependable way to manage thread priority. I recall disabling hyperthreading in the past on an older machine when there was a preference pane for it, but I’m not sure this is recommended for newer machines (2016 MBP, i7, Mojave), or possible without having to use a questionable 3rd-party utility. Thoughts on this?

Thanks for this practical interpretation, I hadn’t thought of it this way before.

A catch 22 when doing debugging/benchmarking! I was surprised to see that in this way the server/lang aren’t as “separate” as I hoped.

don’t move any windows

I do a good bit of work with UI/visualization and I am a bit bummed to see UI affect CPU performance on the audio thread. I don’t seem to remember this happening as much in the past (or maybe I’ve just gone crazier with Pen).

Thanks all for your advice!


#6

I haven’t found SC to be that sensitive (although, if I were using SC to record a concert where there’s no room for error, I would certainly avoid other activities – but that’s not only for SC – I have a recording of a premiere of mine where the DAW glitched and wrote some parts of the audio out of order – that might even have been Pro Tools).

The audio thread is supposed to be very high priority, so if the OS sees that there are GUI update threads and audio processing pending, it’s supposed to do the audio first.

Also, on multicore systems, the OS “should” be smart enough to put GUI updates onto a different core. (I guess hyperthreading could mess that up, though.)

When I started with SC, on a G4 iBook (!), GUI updates would routinely be delayed for over half a second if there was a lot of synthesis happening. Since basically every computer has at least two cores now, that problem is a thing of the past.

hjh


#7

On Mac at least, the GUI drawing thread is almost as high priority as the audio thread - this is not so weird, as GUI deadlines are almost as tight as audio deadlines these days (60 fps is roughly the same as a 512 sample buffer - long for real-time audio, but not that far off). Of course there’s plenty to argue about here, but it’s perhaps not worth the breath…

The canonical audio performance test I’ve used is something like:

  1. Pick a small-ish unit of work, e.g. one ugen / series of ugens / small synth.
  2. For a given sample rate + hardware buffer size + ugen settings state, run M instances of #1 for N minutes.
  3. Increment M until you can no longer run for N minutes without an audio dropout. How this is defined is a bit open-ended (is one dropout in 5 minutes enough? <3 dropouts in 10?), depending on how accurate you want your test.

In the end, M-1 represents a “safe” upper limit of what you can run for a given configuration. This should give you a good indication of the overall performance impact of the ugen / synth / operation. Obviously, you can be as clean or as un-clean as you want with this test. I’m inclined to be relatively rough - e.g. run the test while doing obvious, normal UI things like showing level meters and updating widgets - because it reflects how most users would run in most cases, and thus represents the real world performance much better.
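The search over M could be sketched like this. `runsClean` is a hypothetical callback that would run M instances for N minutes and report whether the run met your dropout criterion; since there is currently no automatic dropout detection, today a human would have to supply that answer.

```cpp
#include <functional>

// Increase the instance count until a run fails, then return the last count
// that passed. Returns 0 if even a single instance fails.
// `maxInstances` is a safety cap so the search always terminates.
int findSafeInstanceCount(const std::function<bool(int)>& runsClean,
                          int maxInstances) {
    int m = 1;
    while (m <= maxInstances && runsClean(m)) {
        ++m;
    }
    return m - 1;
}
```

Each probe costs N minutes, so in practice it may be worth starting from a rough guess instead of 1, or stepping in larger increments first.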

Unfortunately for SC, there is no sclang-accessible way to detect audio dropouts. This is a bummer, because otherwise this test could be automated… This is a good first feature for anyone with some low-to-mid-level C++ chops and a willingness to research a little. :slight_smile:


#9

Thanks, Scott. This seems like a reasonable approach for instruments built in SC, and I’ll keep that in mind.

The present task is on the UGen side, so I am hoping to find a technique that can look a bit deeper and be more nuanced, one that can tell the gain made by, say, factoring out a sin(2x) calculation if I have already computed sin(x) and cos(x) (because sin(2x) = 2*sin(x)*cos(x)). Fully aware that I’m probably splitting hairs, but hey if I can learn something along the way and get that extra 5% efficiency… For this I’m probably looking at something more like Nathan mentioned, which is to say I would need to double down on profiling the cpp code directly.
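For reference, the two variants I would want to compare look roughly like this (function names are just placeholders). Each loop returns a checksum so the compiler cannot discard the work, and the two loops could be timed separately.

```cpp
#include <cmath>

// Variant A: three transcendental calls per sample.
double checksumDirect(int n, double step) {
    double acc = 0.0;
    for (int i = 0; i < n; ++i) {
        const double x = i * step;
        acc += std::sin(x) + std::cos(x) + std::sin(2.0 * x);
    }
    return acc;
}

// Variant B: reuse sin(x) and cos(x) via sin(2x) = 2*sin(x)*cos(x).
double checksumFactored(int n, double step) {
    double acc = 0.0;
    for (int i = 0; i < n; ++i) {
        const double x = i * step;
        const double s = std::sin(x);
        const double c = std::cos(x);
        acc += s + c + 2.0 * s * c;
    }
    return acc;
}
```

The results agree up to floating-point rounding, so any runtime difference comes from the saved sin call. Whether that difference is measurable is exactly what the benchmark has to show.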


#10

I’ve made a start using Xcode’s Time Profiler, one of the “Instruments”. It seems promising in that you can monitor individual processes (scsynth), over specified windows of time, and observe the call stack and see each call’s relative cpu consumption. You can even look into individual instructions within function calls and their relative weights. Still feeling my way through it, but I’ll post back if I get any helpful tips.


#11

Hi there. I think I’m that person. I’ve been yearning for a chance to contribute here for several years, and I think I’m finally getting there. Could anybody spare a few minutes sanity-checking me? I’d like to get feedback on the general approach I think would work, as well as answers to a couple outstanding questions.

So: I thought I’d write a Ugen. It would simply scan the samples for zeros, and emit some sort of control-rate signal with the index and maybe the calculated timestamp of the dropped samples. I don’t need a ton of DSP knowledge to find audio dropouts, they’re just gonna be zeros. Right?

My questions:

  1. Is a Ugen the best tool for the job here? Using one makes analyzing the samples super easy, but I may be missing a better option architecturally.
  2. Is there any way of emitting a signal other than a .kr? Could I, for example send an OSC message to sclang directly?
  3. Generally, what research did you have in mind?

Thanks so much for your time, and this beautiful software!

Chad


#12

In the case of a dropout, scsynth may not run at all. A common case in Linux is that a hardware interrupt with a higher priority than the audio driver causes the audio driver to wake up late. In that case, scsynth will not calculate anything for the hardware buffer. As far as it’s concerned, all signals are uninterrupted, but you will hear the dropout.

In that case, there may not be any way to analyze signals to detect a dropout. If JACK gets called late, none of its clients will run.

I’m not sure what happens if JACK calls scsynth but scsynth takes too long. I think JACK just uses a buffer full of 0 for scsynth’s output in that cycle. But scsynth thinks it’s doing work. I don’t know how it “catches up” when it’s called in the next cycle. Maybe a separate JACK client could detect this, but I’m not sure.

The idea about looking for zeroes within scsynth assumes that UGens are completing their work but outputting 0. That’s not the case here. We’re looking for UGens taking too long – not completing the work in time.

hjh


#13

Right on, this definitely puts me on the right track. I think I understand – scsynth does all its work in a cycle, and you couldn’t use a ugen from within that process to figure out if the work was completed. Audio drops when JACK attempts to read the audio before it’s ready.

JACK already detects XRUNS but those aren’t always audible in my experience, and I’m pretty sure you can already view them from sclang. (If not, there’ll be an easy way to get them.) Based on what you say, another JACK client could probably only be as accurate as the built-in XRUN detection, and if I wanted to make something super awesomely robust I might look at using a separate device and analyzing the signal. Since it got DAC’d and ADC’d between devices, it’ll be less pristine and I’ll have to actually know how to listen for artifacts. So it’s not as low-hanging of fruit as I thought, but could still be within my reach with some research.

Such a tool could totally emit OSC over the network for sclang to consume, though. That part is easy to imagine.

Thanks again, you guys rule.

Chad


#14

(It could still be a Ugen if one used another SuperCollider on another device)


#15

thanks for volunteering to do this!

detecting xruns in JACK is as simple as patching a jack_set_xrun_callback into the JACK driver at server/scsynth/SC_Jack.cpp.
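a rough sketch of what the callback side could look like. `XrunCounter` and `onXrun` are hypothetical names, and how this would actually hook into the existing driver class is not shown. the block itself has no JACK dependency; the registration call is given in a comment.

```cpp
#include <atomic>

// Shared counter that other threads can read without locking.
struct XrunCounter {
    std::atomic<unsigned> count{0};
};

// Matches JACK's JackXRunCallback signature: int (*)(void* arg).
int onXrun(void* arg) {
    static_cast<XrunCounter*>(arg)->count.fetch_add(1, std::memory_order_relaxed);
    return 0;  // zero reports success back to JACK
}

// Registration, given an already opened jack_client_t* client
// (requires <jack/jack.h>):
//   jack_set_xrun_callback(client, onXrun, &counter);
```

the callback only bumps an atomic counter; anything heavier, like sending OSC, would be done later from a non-audio thread that polls the count.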

as for how to communicate the xrun back to sclang, i would recommend two channels:

  • send an /xrun OSC message back to all connected clients. these messages should be throttled in some way so that the client isn’t absolutely flooded.
  • incorporate xrun statistics into /server.status. as some ideas, maybe total number of xruns since server boot, and average frequency of xruns over the last 10 seconds?
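one possible shape for the throttling and the statistics, purely as a sketch. `XrunStats` is a hypothetical class, times are plain seconds passed in by the caller, and since it uses `std::deque` it is meant to run off the audio thread.

```cpp
#include <cstddef>
#include <deque>

class XrunStats {
public:
    XrunStats(double windowSec, double minReportGapSec)
        : window_(windowSec), minGap_(minReportGapSec) {}

    // Record an xrun at time `now`. Returns true if an /xrun notification
    // should be sent, false if it is suppressed by the throttle.
    bool record(double now) {
        ++total_;
        recent_.push_back(now);
        prune(now);
        if (!reported_ || now - lastReport_ >= minGap_) {
            reported_ = true;
            lastReport_ = now;
            return true;
        }
        return false;
    }

    // Total xruns since construction (i.e. since server boot).
    unsigned long total() const { return total_; }

    // Average xruns per second over the trailing window.
    double ratePerSecond(double now) {
        prune(now);
        return static_cast<double>(recent_.size()) / window_;
    }

private:
    // Drop timestamps that have fallen out of the window.
    void prune(double now) {
        while (!recent_.empty() && now - recent_.front() > window_)
            recent_.pop_front();
    }

    double window_;
    double minGap_;
    double lastReport_ = 0.0;
    bool reported_ = false;
    unsigned long total_ = 0;
    std::deque<double> recent_;
};
```

with `XrunStats stats(10.0, 1.0)`, at most one notification per second goes out, while `total()` and `ratePerSecond()` supply the two numbers suggested for the status reply.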

the tricky part here is cross-platform compatibility with servers that aren’t JACK. @scztt and i talked about this same idea some time ago and i recall that macOS’s coreaudio doesn’t have an xrun callback, and some kind of hack is required to work around it (don’t remember exactly what).

once you have a cross-platform and robust solution working i’d be more than happy to support its inclusion in core SC, and let me know of any further questions.