Client-Server Architecture Limitation Cases

I would like to gather here some of the discussions that have been scattered over the list about examples where the Client-Server Architecture gets in the way of users, and what the possible workarounds would be.

From Andrea Valle’s book:

The main disadvantages of such an architecture are two:

  1. the circulation of messages introduces a small delay (which may be significant,
    however, if considering time sensitivity of the audio domain);
  2. a great time density of messages on the network can overload the
    latter, and message handling can cause a delay.

It should also be noted that it is very rare to incur similar problems.

What are the cases where these problems happen, and how can they be managed?

I was exploring Alberto de Campo’s examples of Wavesets operations (SC Book and Quarks), where I found the following problem. In his Wavesets Quark implementation, you basically analyse a soundfile and annotate the zero-crossing points above a certain threshold to get the Wavesets (slices of the waveform between zero-crossing points). To play and manipulate the Wavesets, he basically creates a sampler SynthDef (based on BufRd) which runs through the soundfile – loaded into a buffer – via patterns (a Pbindef defines how the buffer will be read). Here is a simple example from the book:

w = Wavesets.from(Platform.resourceDir +/+ "sounds/a11wlk01.wav");

(
		b = w.buffer;
	// Wavesets.prepareSynthDefs loads this synthdef:
		SynthDef(\wvst0, { arg out = 0, buf = 0, start = 0, length = 441, playRate = 1, sustain = 1, amp=0.2, pan;
			var phasor = Phasor.ar(0, BufRateScale.ir(buf) * playRate, 0, length) + start;
			var env = EnvGen.ar(Env([amp, amp, 0], [sustain, 0]), doneAction: 2);
			var snd = BufRd.ar(1, buf, phasor) * env;

			OffsetOut.ar(out, Pan2.ar(snd, pan));
		}, \ir.dup(8)).store;
)

(
Pbindef(\ws1).clear;
Pbindef(\ws1,
	\instrument, \wvst0,
	\startWs, Pn(Pseries(0, 1, 3435), 1),
	\numWs, 1,
	\playRate, 1,
	\bufnum, b.bufnum,
	\repeats, 1,
	\amp, 0.4,
	[\start, \length, \sustain], Pfunc({ |ev|
		var start, length, wsDur;

		#start, length, wsDur = w.frameFor(ev[\startWs], ev[\numWs]);
		[start, length, wsDur * ev[\repeats] / ev[\playRate].abs]
	}),
	\dur, Pkey(\sustain)
).play;
)

The problem begins when you try some operations, like the following:

// waveset delete: time-contract the soundfile by deleting wavecycles (more simply, just not reading some of them)
Pbindef(\ws1, \playRate, 1.0, \startWs, Pn(Pseries(0, 4, 3435), 1)).play;

This causes lots of late messages, showing that the pattern’s events could not be handled by the server at the time they were expected to play.

Is it possible to manage this problem from the client side?

When I try to change the latency, telling the server that delivery may happen really late, like this:

Pbindef(\ws1, \playRate, 1.0, \startWs, Pn(Pseries(0, 4, 3435), 1), \latency, 1.0).play;

I no longer receive late messages, but I get tons of FAILURE IN SERVER /s_new too many nodes messages.

When I go further and try really heavy stuff, I get audio glitches (big silent gaps in the audio stream):

(
~a = Pbind(
	\instrument, \wvst0,
	\startWs, Pn(Pseries(0, 1, 3435), 1),
	\numWs, 1,
	\playRate, 1,
	\bufnum, b.bufnum,
	\repeats, 1,
	\amp, 0.4,
	[\start, \length, \sustain], Pfunc({ |ev|
		var start, length, wsDur;

		#start, length, wsDur = w.frameFor(ev[\startWs], ev[\numWs]);
		[start, length, wsDur * ev[\repeats] / ev[\playRate].abs]
	}),
	\dur, Pkey(\sustain)
	);
)

(
Pdef(\ws1, ~a);
Pdef(\ws2, ~a);
Pdef(\ws3, ~a);
Pdef(\ws4, ~a);
Pdef(\ws5, ~a);
)

(
Pbindef(\ws1,  \startWs, Pn(Pseries(0, 28, 3435), 1), \repeats, 28).play;
Pbindef(\ws2,  \startWs, Pn(Pseries(0, 29, 3435), 1), \repeats, 29).play;
Pbindef(\ws3,  \startWs, Pn(Pseries(0, 30, 3435), 1), \repeats, 30).play;
Pbindef(\ws4,  \startWs, Pn(Pseries(0, 31, 3435), 1), \repeats, 31).play;
Pbindef(\ws5,  \startWs, Pn(Pseries(0, 32, 3435), 1), \repeats, 32).play;
)

Is this a limitation of the implementation, or can it be managed?


I’m not super familiar with the wavesets stuff you’re talking about, but I believe this might not be a client-server problem at all. When playing event patterns, usually what happens is something like:

  1. Next event is scheduled at time=100.0
  2. Event player wakes up at time=100.0 - latency (so, let’s say time = 90.0), to process the next event and send it (see the note on latency after this list).
  3. An event is produced by the Pbind / Pbindef / etc being played. It is scheduled at time=100.0, the logical time it is supposed to occur.
  4. The server receives the message(s) sent when the event is played, with plenty of time to get things ready and play them.
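
Side note, in case it’s useful: the latency above is the ordinary event-playback latency. Event patterns timestamp their OSC bundles at logical time + latency, and the value can be set on the server, or per event as in the \latency key used earlier in this thread.

s.latency;        // defaults to 0.2 seconds; used to timestamp event bundles
s.latency = 0.4;  // more scheduling headroom, at the cost of responsiveness

// or per pattern, as in the earlier example:
Pbindef(\ws1, \latency, 0.4);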

In the case you’re posting about, I suspect w.frameFor is a VERY slow operation, since it probably concerns searching sample-by-sample through an audio buffer. So, what may be happening is, your player thread wakes up at #2 above - during #3 it goes off on a long quest to find a frameFor. While it’s doing this, the target time = 100 passes… when it finds what it needs, it sends an event to the server, scheduled for time=100. But, that time has long since passed, because frameFor took so long - so, when the server receives it, it (reasonably) complains that it’s being asked to play something at a time that has already passed.

Here’s something to try: call w.frameFor ahead of time, and collect the start, length, wsDur values in a list. Then, stick that list into a Pseq to supply your [\start, \length, \sustain] keys. I would guess that your late message problems go away…
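
Something like this, perhaps (a rough, untested sketch; it reuses the w analysis and the \ws1 Pbindef from the earlier examples, and assumes repeats and playRate stay at 1 so that sustain is simply wsDur):

(
~frames = 3435.collect { |i|
	var start, length, wsDur;
	#start, length, wsDur = w.frameFor(i, 1);
	[start, length, wsDur]  // becomes [\start, \length, \sustain] for one event
};

Pbindef(\ws1,
	[\start, \length, \sustain], Pseq(~frames, 1),
	\dur, Pkey(\sustain)
).play;
)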

Also, be aware that - depending on your audio file - you MIGHT be playing a LOT of synths in this example - thousands per second - which could easily bring things screeching to a halt.
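
If you want to keep an eye on how heavy it gets while a pattern plays, the server status fields are handy:

s.avgCPU;     // average DSP CPU usage reported by the server
s.numSynths;  // how many synths are currently running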


General comments / myth-busting on “server vs client architecture”, for anyone who’s wondered why SC is set up the way it is:

  • All well-designed audio engines have a server-client architecture. This is a basic requirement for building audio software: strict separation of real-time audio threads/processes from non-real-time calculations and business logic is a fundamental design pattern when building audio software. I’d be super happy to explain more about WHY this is the case, if people are interested.
  • Circulation of messages doesn’t introduce a meaningful delay when one is sending them to a local server. They are not traveling over a network unless they are being sent… to a server over a network (e.g. on a laptop in the other room).
  • SuperCollider is not always fantastic at building abstractions over the client-server boundary… this can make certain kinds of things (e.g. loading samples into a buffer) feel a little clumsy in code. If you’re working with SC and noticing this clunkiness … well, you’re probably right :). But, the places where SC separates client and server are fundamentally the correct places to do so. I’ve seen audio engines that do a less rigorous job than SC does (usually and predictably because it makes certain kinds of code easier to write) - and they often end up with a bunch of subtle, weird, and disruptive problems.
  • This means a couple of things:
    1. The frustration one feels debugging client-server issues is probably much less than the frustration one would feel debugging the kinds of issues you get with a poorly designed client-server boundary (I say this very much from experience…). For example: in some audio engines, the late messages mentioned in the OP wouldn’t produce an informative message and otherwise-totally-fine playback - they’d produce an audio glitch, or a random result in 1 out of 30 cases, or a weird timing problem that you can never really track down, or…
    2. There is a LOT of room for improving the abstractions over client-server details. The good thing is: SuperCollider is great at building abstractions, and the architecture is very sound, so even if you screw up your abstraction, you probably won’t screw up the audio engine.

The problem that can always occur is (too many) very short durations. A pragmatic solution would be to introduce a lower threshold in the pattern, e.g.

#start, length, wsDur = max(w.frameFor(ev[\startWs], ev[\numWs]), 0.0005); // element-wise max: in effect, keeps wsDur (the sustain) from dropping below 0.5 ms

On average 2000 events per second would be too many, but most audio wouldn’t constantly have super-short wavesets. If you have a particular question about the Wavesets class, Alberto de Campo is on the SC users mailing list.
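
That figure corresponds directly to the floor above:

1 / 0.0005  // 2000, the most events per second that such a floor allows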

By coincidence, yesterday I stumbled upon the RTWaveSets class by Fabian Seidl.

It’s hosted by Till Bovermann.

I haven’t tried it out yet, but it aims to avoid the bandwidth limitation.

Other than that - right now - I’m working on a related project, which I hope to finish soon.

Concerning your more general first post: after having dealt with a lot of trouble and confusion arising from the server-client distinction, I think that with practice one eventually gets a feeling for the possibilities and limitations of the two domains. It’s even enriching that parallel and mixed solutions often exist; that’s better than focusing only on the facts that cannot be overcome (like limited bandwidth).


Minor clarification: The clock does no latency compensation. The thread wakes up at the scheduled time and the sound occurs later than the scheduled time.

If SC is the only source of sound, and all SC sound is late by the same amount, then users never notice because everything sounds together.

Latency compensation is a feature of LinkClock because then SC is not the only relevant source of sound. But here, it works by shifting the time base: LinkClock’s beat 100 is latency seconds earlier than other peers’ barlines (and the sound is later than that by the same amount, so the sound occurs in sync with other peers).

hjh


There is one specific thing that is straightforward to do in Max/MSP and Pure Data (and was straightforward to do in SC2, but less straightforward now): Starting a synth based on a trigger signal (such as from audio analysis).

  • SC2: You could feed the trigger into a Spawner (I think… SC2 was, like, 17-18 years ago). Because the language and audio engine were integrated in SC2, it would evaluate the synthesis function on demand from the trigger and add the new units into the graph. This was good in that the trigger would fire, and the new synth could start in the next control cycle (maybe… either that or hardware block), but bad in that every note had to reevaluate the synthesis function (rebuild UGens, optimize, etc.).

  • SC Server: The only way to add new units is for the language to send /s_new. To do this in response to an audio trigger requires SendReply in the server --> OSCFunc/OSCdef in the language, then language sends /s_new. There’s inevitably a small amount of latency – and, jittery latency (because the /s_new can’t be time stamped, so it goes at the beginning of the next hardware buffer).

  • Max and Pd: Trigger detectors tend to output bang (a control message), which you can use to trigger envelopes immediately. But it’s not quite analogous to SC Server, because all of the units have to be allocated in advance (Max: [poly~], Pd: [clone]) so it’s rather a matter of routing the trigger to an already-existing unit.

The Max/Pd approach suggests that, if you need instant trigger signal responses in SC Server, you can pre-allocate a pool of synths, and transfer the trigger signal over a bus. When one synth is active, just switch the trigger-producer to use a different synth’s bus.

(
var n = 20;

fork {
	SynthDef(\trigboop, { |out, trigbus, freq = 440, amp = 0.1|
		var trig = In.kr(trigbus, 1),
		sig = SinOsc.ar(freq),
		eg = EnvGen.ar(Env.perc(0.001, 0.05), trig);
		Out.ar(out, (sig * eg * amp)/*.dup*/);
	}).add;
	s.sync;
	
	// synth pool
	b = Bus.control(s, n);
	c = Array.fill(n, { |i|
		Synth(\trigboop, [out: 1, trigbus: b.index + i]);
	});

	// trigger producer
	i = 0;  // current index
	a = { |out, amp|
		var trig = Impulse.kr(10) * amp;
		SendReply.kr(trig, '/trig', 1);
		Out.kr(out, trig);
		Out.ar(0, K2A.ar(trig));
	}.play(args: [out: b.index + i]);

	// need to update 'i' on trigger
	OSCdef(\x, {
		i = (i + 1) % n;
		a.set(\out, b.index + i);
	}, '/trig');
};
)

So there’s even a solution for that – just not the normal SC-idiomatic way. If you record that and zoom in on the audio in Audacity, the triggers and notes are completely synchronous (though the OSCdef would break down if triggers come in faster than the network round trip – or, put the increment logic into the trigger-producer synth).

The only other case I can think of that would be hard to handle is the way that, in Pd, you can [tabwrite~] a signal into a buffer and the data should be immediately accessible to control-layer [tabread] objects (whereas in SC server, you have to fetch the buffer data by OSC). But with regard to scztt’s comments, it would be more accurate to say that Pd is only hiding the interface between audio processing and control access, where SC forces you to deal with it explicitly.
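
For reference, fetching buffer data from the language side looks like this (using the b buffer from earlier; both replies arrive asynchronously):

b.getn(0, 16, { |values| values.postln });                 // a small chunk, via /b_getn
b.loadToFloatArray(action: { |data| data.size.postln });   // the whole buffer, via a temporary soundfile (local server only)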

hjh


Please do, that would be really helpful!

Indeed, .frameFor is doing lots of operations:

frameFor { arg startWs, numWs = 1, useFrac = true;
		var whichXings = if (useFrac) { fracXings } { xings };
		var startFrame = whichXings.clipAt(startWs);
		var endFrame = whichXings.clipAt(startWs + numWs);
		var length = endFrame - startFrame;
		var sustain = length / sampleRate;

		^[startFrame, length, sustain]
	}

Thanks, I will try to figure out how to improve this.

@dkmayer thanks, lots of useful info! I will try this plugin ASAP.

Please share it, if possible, when it is finished!

Doing some research, I’ve realized that there is an improvement by Till Bovermann. From a quick look it seems to be the solution proposed by @scztt. Unfortunately it is not listed in the Quarks list…


That’s really interesting. I could not hear any sound from the example, and the recording was empty as well… Do you know what’s wrong?

I don’t know much about Pd, but I often hear people saying that block~ and switch~ could not be transposed to SC, especially the fact that you can run different block sizes in different patches… Do you also know how to manage this?

Hm, I have no idea. I had thought maybe order of execution, but I think that can’t be it because at the very least you should get the impulses in the left channel (Out.ar(0, K2A.ar(trig))). If you aren’t even getting that, then there is something in your configuration that is different from mine.

The code example worked on my machine.

Scsynth simply doesn’t implement an equivalent for [block~] at all.

I hadn’t thought of that for this thread because the thread is about the client-server split, but [block~] has nothing to do with that. (If somebody wanted to implement it in the server, the user interface could be to set a block size for a group – that’s totally compatible with client-server, it’s just that nobody has done it.)

[switch~], we a/ already have (node.run(false)) and b/ don’t need. You could think of that as a workaround for the fact that Pd must have all the objects existing all the time. In SC, it’s cheap to create and destroy synths at any time. So instead of “switching off” subpatches that you’re not using, you simply free them.
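
For example (a minimal sketch; \default is the stock SynthDef, so this runs as-is on a booted server):

x = Synth(\default);  // create a synth whenever you need it
x.run(false);         // pause it (roughly the [switch~]-off case)
x.run(true);          // resume
x.free;               // or simply free it and make a new one later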

hjh

I wrote up a long-ish article about some of this client-server stuff here: Real-time Audio Processing (or, why it's actually easier to have a client-server separation in Supercollider)

I hope it clarifies things a little!
