Tempo drift when recording

I was recording a track made with SuperCollider and found that the timing is slightly off.

For example, I record a track into Reaper that just plays a kick at 60 BPM (example below). The tempo in Reaper is set to 60, of course. After aligning the first kick with the grid, the track slowly drifts off the grid over time; after two minutes the drift is roughly half the duration of the kick.

This also happens when recording directly from SuperCollider and importing the WAV into Reaper.

What could be the reason for this behaviour? I’m using Ubuntu 24.04 with PipeWire, but when recording from SC that shouldn’t matter, I guess.

TempoClock.default.tempo = 60/60;

SynthDef(\kick, {
	arg mainOut = 0, stemOut = 2, amp = 0.5, pan = 0, atk = 0.001, dec = 0.22, upfq = 110;

	var pitchEnv, bodyEnv, body, click, clickEnv, sig;
	pitchEnv = EnvGen.kr(Env([upfq, 55], [0.08], -2));

	bodyEnv = EnvGen.kr(
		Env.perc(atk, dec, 1, -4),
		doneAction: 2
	);

	body = SinOsc.ar(pitchEnv) * bodyEnv;

	clickEnv = EnvGen.kr(
		Env.perc(0.0005, 0.01, 1, -8)
	);

	click = BPF.ar(WhiteNoise.ar(1), 2500, 0.8) * clickEnv * 0.25;

	sig = body + click;
	sig = (sig * 4).tanh * 0.4;
	sig = Pan2.ar(sig * amp, pan);

	Out.ar(mainOut, sig);
	Out.ar(stemOut, sig);
}).add;

Pdef(\gridtest,
	Pbind(
		\instrument, \kick,
		\dur, 1,
		\dec, 0.05,
		\amp, 0.1,
		\upfq, 120
	)
).play;
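
For reference, recording directly from SC can be done with the built-in disk recorder; a minimal sketch, assuming the default recording path and settings:

s.record;         // start writing the server's output to disk
// ...let \gridtest run for a couple of minutes...
s.stopRecording;  // then import the resulting file into Reaper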

Edit: No issue when the click is generated on the server itself (Impulse.ar counts samples, so its period is exact in sample time):

(
SynthDef(\serverClick, { |out=0, amp=0.15|
	var trig, sig;
	trig = Impulse.ar(1);   // 1 Hz = 60 BPM quarter notes
	sig = Decay2.ar(trig, 0.0005, 0.02) * WhiteNoise.ar(amp);
	sig = HPF.ar(sig, 3000);
	Out.ar(out, sig ! 2);
}).add;
)

x = Synth(\serverClick, [\out, 0]);

Is this drift caused by language scheduling, as suggested by ChatGPT?

The client calculates OSC bundle timetags based on the system clock. The audio callback, however, is driven by the clock of the audio hardware. The two clocks drift over time at different rates and possibly in different directions.
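
For example, every timestamped bundle the language sends is stamped with system-clock time plus latency; a minimal illustration:

// The timetag comes from the language's system clock, not the audio clock:
s.bind { Synth(\kick) };   // shorthand for s.makeBundle(s.latency, { ... })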

The Server tries to convert the incoming OSC time tags to sample time. (On Linux and Windows this is done with a DLL filter. On macOS the OSC time offset is periodically reestimated by a dedicated timer thread. Naturally, both strategies are only approximations.) This guarantees that the Client will not move ahead or fall behind the Server, which is important in a real-time performance context.

The problem is that the logical time of a TempoClock on the Client does not correspond to the sample time on the Server. That’s why you’re seeing this drift in your recording.

AFAICT the only possible way to get sample-accurate timing in the language (and still avoid clock drift) is to drive the language scheduler by the audio callback. If anyone is interested in how this could work, see the following discussion: Keeping sclang and scsynth in hard sync

FWIW Supernova has a useSystemClock option. If you set this to false, bundles are scheduled based on the sample time. This means that a duration of 1 second in the language corresponds exactly to 48000 samples on the Server (assuming a samplerate of 48 kHz). The downside is that the notion of “now” will drift apart between Client and Server, possibly causing “late” bundles. It’s probably not an issue if you don’t keep the Server running for an extended period of time. Otherwise you can mitigate it by setting a larger Server latency.
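
For example, a minimal sketch of switching over (assuming a SC version whose ServerOptions exposes useSystemClock):

Server.supernova;                  // use supernova instead of scsynth
s.options.useSystemClock = false;  // schedule incoming bundles in sample time
s.reboot;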


Thanks, I think it’s clear to me now: the client uses the system clock and the server uses the clock of my audio interface or internal sound card. I will see how it drifts with Supernova.

This is another solution:

This won’t work if you need to create synths for each note, as you can’t do that on the server. But if you can create your synths in advance, this will give you sample-accurate timing.
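
Something along these lines (a sketch of the idea: the synth is created once in advance, and a Demand UGen steps through values in exact sample time):

(
SynthDef(\serverSeq, { |out = 0, amp = 0.2|
	var trig, freq, env, sig;
	trig = Impulse.ar(1);   // 60 BPM, counted in samples on the server
	freq = Demand.ar(trig, 0, Dseq([110, 130, 110, 160], inf));
	env = Decay2.ar(trig, 0.001, 0.2);
	sig = SinOsc.ar(freq) * env;
	Out.ar(out, Pan2.ar(sig * amp, 0));
}).add;
)

y = Synth(\serverSeq);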


Thanks, but in this case I used Pdefs for the sequencing. See here: Tides - First Track made with Supercollider

I checked Supernova, and with it the timing looks much better. :ok_hand: Thanks!


Is this something that would have to change on the server as well, or is it just a client change?
I tried to read the discussion that you linked to, but found it hard to follow, unfortunately.

If you mean driving the language by the audio clock, then yes, this would require changing both the client and server implementation.
