Server clock vs. Language clock

If the Patterns run independently from your live input, the latency can be just as high as needed

The challenge is to start a pattern when I play a note on guitar (depending on the analyzed MIDI input). When this happens, the clock's beats are set to 0 and the pattern is played on the clock with quant: 1, ergo immediate execution. Any latency other than nil will add to the latency of the first downbeat of the pattern. With latency = nil, the overall latency, judging from looking at the recorded audio in a sample editor, is around 30 ms, of which the audio-to-MIDI conversion is responsible for 20-25 ms. With a latency of e.g. 0.03, the latency would double, which is undesirable.

I am messing around with a hacky way of dealing with it: s.latency = nil for the first downbeat, then 0.1 beats later set the latency to 0.03 and leave it there. The timing will of course be a little strange in the very beginning of the pattern, but my initial tests seem to indicate that this is less of a perceptual problem than one would think. Once the pattern is going, the latency is much less of an issue.

I noticed that regardless of the latency setting (if other than nil), the reported times from the server are less than the latency setting. E.g. with a latency of 0.2, the server time stamps consistently show a value of approx. 0.175 added to each beat. I wonder why it is not closer to 0.2? I know the difference is small, just wondering…
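A minimal sketch of the latency-switching hack described above (the Pbind, the tempo, and ~startPattern are just placeholders; the MIDI analysis that decides when to call it is assumed to exist elsewhere):

```
(
~clock = TempoClock(2);            // placeholder tempo (120 bpm)
~pattern = Pbind(
    \instrument, \default,
    \degree, Pseq([0, 2, 4, 7], inf),
    \dur, 0.25
);

~startPattern = {
    ~clock.beats = 0;              // reset so that quant: 1 fires right away
    s.latency = nil;               // first downbeat goes out "as fast as possible"
    ~player = ~pattern.play(~clock, quant: 1);
    ~clock.sched(0.1, { s.latency = 0.03; nil });  // settle on a fixed latency shortly after
};
)

// e.g. call ~startPattern.value from the MIDIdef that detects the guitar note
```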

if you want to minimize overall latency, you first need to figure out the lowest possible hardware buffer size that gives a stable audio signal without dropouts.

Yes, I had it backwards, thinking a higher buffer size would allow less latency. With a buffer size of 64 I get occasional crackles in the audio, so I might have to up the size to 128. A buffer size of 64 and a latency of 0.02 almost works… I will have to do more testing.
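For reference, roughly the settings being tested here (values are just the ones mentioned above):

```
(
s.options.hardwareBufferSize = 128;   // or 64, if the audio stays clean
s.latency = 0.02;
s.reboot;
)
```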

One thing I’ve also mentioned elsewhere: if the Pattern sequencing is causing language jitter that ruins the live input, you may consider running both processes in dedicated SuperCollider instances. (It’s also possible to have multiple sclang instances control a single Server.)
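A rough sketch of the "multiple sclang instances, one Server" idea, assuming the default scsynth port (57110) and two clients:

```
// In the first sclang instance: boot a server that allows more than one client.
s.options.maxLogins = 2;
s.boot;

// In a second sclang instance (e.g. started from a terminal), attach to it:
r = Server.remote(\shared, NetAddr("127.0.0.1", 57110));
// once connected, this instance can create its own Synths, Buffers, etc. on the same server
```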

I don’t really know if the patterns cause additional language jitter, and I am not sure how to test this. How would you go about running multiple sclang instances?

Ok, that’s what I thought. So you are really controlling the Patterns with your live input.

I noticed that regardless of the latency setting (if other than nil), the reported times from the server are less than the latency setting

In your test you start a Synth with a continuous ramp and send its values periodically back to the Client. It takes time before the Client receives these replies, so naturally the Server will report a (sample) time of N while the Client’s network thread already sees a time of N+k.

Also, please keep in mind that these values will slowly drift over time. The Client measures system time, but Sweep measures sample time.
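A minimal sketch of this kind of test (the 4 Hz reply rate is arbitrary): the printed offset includes the dispatch delay described above and will slowly drift, since it compares sample time against system time.

```
(
var langStart;

OSCdef(\sweepTime, { |msg|
    var sampleTime = msg[3];                        // seconds of audio computed (Sweep)
    var langElapsed = Main.elapsedTime - langStart; // system time since the synth was started
    [\sampleTime, sampleTime, \langElapsed, langElapsed,
     \offset, langElapsed - sampleTime].postln;
}, '/sweepTime');

langStart = Main.elapsedTime;
{
    SendReply.kr(Impulse.kr(4), '/sweepTime', Sweep.kr(0, 1));
    Silent.ar(1)
}.play;
)
```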

of which the audio-to-MIDI conversion is responsible for 20-25 ms.

I guess you need to find a better audio to MIDI converter :slight_smile:

How would you go about running multiple sclang instances?

I think in your particular setup it won’t help anyway, because you also want to control the Pattern with the live input.


I guess you need to find a better audio to MIDI converter :slight_smile:

To be fair, the minimum latency cannot be lower than the period of the lowest pitch, which in the case of the e-guitar would be 12 ms for the low E string (82.4 Hz). However, this assumes that we need the same latency for all strings, which is not necessarily true! With a guitar we can track the strings separately, so we may decide to track each string as fast as possible. After all, the period of the open high E string is only about 3 ms – quite a difference! It’s a bit of a trade-off: should all strings have the same latency, or the shortest respective latency, or something in between? IMO there cannot be a single solution that fits all playing styles. Actually, it would even be useful to switch tracking modes depending on the mode of playing (e.g. riff playing on low strings vs. shredding on high strings).
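A quick back-of-the-envelope check of those per-string bounds (standard tuning; fretted notes only shorten the periods):

```
(
[
    \E2 -> 82.4,  \A2 -> 110.0, \D3 -> 146.8,
    \G3 -> 196.0, \B3 -> 246.9, \E4 -> 329.6
].do { |assoc|
    "% string: one period = % ms".format(assoc.key, (1000 / assoc.value).round(0.1)).postln;
};
)
```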


Now, I just realized that MIDI GUITAR 2 is a plugin! This means it can’t track the strings independently, so the latency really cannot be shorter than 12 ms (in reality, probably twice as long). Are you using the plugin in the Server with VSTPlugin? That would explain why you are interested in Server->Client timing (which otherwise would be completely irrelevant). In that case you will indeed have problems achieving the lowest possible latency because you need a full Server->Client->Server roundtrip. For this kind of live-tracking, Pure Data or Max/MSP might be more appropriate because you can immediately handle the MIDI notes on the audio thread with no additional latency involved.

I think you should really look into proper MIDI guitar pickups! Not only is the tracking faster, you can directly receive the MIDI notes in sclang.

To be fair, the minimum latency cannot be lower than the period of the lowest pitch, which in the case of the e-guitar would be 12 ms for the low E string (82.4 Hz).

That is true, however the ML algo they use (I don’t know how it is set up) could, I think, theoretically allow for faster detection than one complete cycle of the fundamental, since at least the first 9 partials of the signal of an electric guitar have significant energy. However, when I have experimented with onset and pitch detection in SC, using vanilla and FluCoMa UGens, the time it takes to detect a note with high pitch confidence is considerably longer; but then, I am not upsampling the signal, which I think could speed up the detection.
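A rough sketch of the vanilla onset/pitch tracking approach mentioned above (FFT size, threshold, and frequency range are arbitrary placeholders):

```
(
OSCdef(\guitarOnset, { |msg|
    var freq = msg[3], hasFreq = msg[4];
    if(hasFreq > 0.9) { "detected ~% Hz".format(freq.round(0.1)).postln };
}, '/guitarOnset');

{
    var in = SoundIn.ar(0);
    var chain = FFT(LocalBuf(512), in);
    var onset = Onsets.kr(chain, 0.5);
    var freq, hasFreq;
    # freq, hasFreq = Pitch.kr(in, minFreq: 70, maxFreq: 1000);
    SendReply.kr(onset, '/guitarOnset', [freq, hasFreq]);
    Silent.ar(1)
}.play;
)
```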

The string detection feature is added in MIDI GUITAR 3, which is in beta-testing.

For my usage of MG (which is most certainly different from most people’s, because I analyze the MIDI output, whereas most people probably just play the output) I would probably prefer a more uniform latency because it would make it easier to analyze.

Are you using the plugin in the Server with VSTPlugin?

I have been using the standalone version of MIDI GUITAR 2 and receiving MIDI through MIDIdefs in SC, but when a VST3 version of MIDI GUITAR 3 is ready, I want to see if I get better results running MG as a VST plugin in SC. I guess it would still have to go through MIDIdefs.
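A hypothetical sketch of what hosting MG inside scsynth with VSTPlugin could look like (the plugin name is a placeholder, since the VST3 version doesn’t exist yet; the midiReceived callback is assumed here as the way plugin-generated MIDI gets back into sclang):

```
// (run VSTPlugin.search once beforehand, so plugins can be opened by name)
(
SynthDef(\mgHost, {
    Out.ar(0, VSTPlugin.ar(SoundIn.ar(0), 1));   // guitar input through the plugin
}).add;
)

(
~mgSynth = Synth(\mgHost);
~mg = VSTPluginController(~mgSynth);
~mg.open("MIDI Guitar 3");                       // placeholder plugin name
// plugin-generated MIDI would come back into sclang via this callback,
// so it would still be handled in the language, much like a MIDIdef:
~mg.midiReceived = { |ctrl, msg| msg.postln };
)
```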

For this kind of live-tracking, Pure Data or Max/MSP might be more appropriate because you can immediately handle the MIDI notes on the audio thread with no additional latency involved.

Just thinking about porting my 7000 lines of code to Pure Data or Max makes my head ache :) I initially chose SC because I think all the logic controlling the behavior of the system would be nightmarish to achieve with Pd or Max, but also, I have no experience with Pd and only very minimal experience with Max. Can you explain what you meant with

…because you can immediately handle the MIDI notes on the audio thread with no additional latency involved.

?

I think you should really look into proper MIDI guitar pickups! Not only is the tracking faster, you can directly receive the MIDI notes in sclang.

At least up until recently, MG was as good as any hardware I have seen or tested. I think, for instance, that MG performs as well as the Roland pickup system. I saw a new Fishman solution which claims to deliver faster tracking, I think around 13 or 15 ms. It is hard to compare the different solutions based on these kinds of specs for a number of reasons: latency is not inherently steady (as you pointed out), the amount of bogus notes (false positives) is definitely also a big factor when analyzing (I do various kinds of filtering on the MIDI roll to remedy this), and also, how high is the pitch confidence? The pitch confidence for MG is extremely high, i.e. for the ‘real notes’ as estimated by me (as opposed to the false positives), MG very rarely misses the pitch – almost to a point where I would prefer slightly lower pitch confidence if it meant faster detection at the expense of missing the pitch of a note now and then.

This would show SC is a truly Hegelian system, wherein, in the master/slave dialectic, the slave is the one with true consciousness. “The truth of the independent consciousness is accordingly the consciousness of the servant…being a consciousness repressed within itself, it will enter into itself, and change around into the real and true independence.” That’s “SC4” right there.

But seriously, changing this hierarchy would be a big qualitative shift, with many different effects.

Ok, wow!

Just thinking about porting my 7000 line code to Pure Data or Max makes my head ache:)

Sure! If you have already written your project in SC, there’s not much point in porting it to Pd or Max unless you really hit a wall.

…because you can immediately handle the MIDI notes on the audio thread with no additional latency involved.

?

Pd is synchronous: for every block of 64 samples, Pd first does all the messaging (including clock timeouts, network I/O, MIDI I/O, etc.) and then does the audio processing. This means you can, for example, receive a MIDI message, send some Pd messages in turn and Pd will immediately turn it into audio. Also, any sequencing in Pd is perfectly sample accurate by default! There is no notion of “Server latency”.

SC is asynchronous: MIDI is received in sclang, then you need to send a message to the Server, which is (hopefully) received as soon as possible and then turned into audio. Even without Server latency, you may get a delay of a full hardware buffer size in the worst case. And then there is also language jitter.

Personally, the synchronous and deterministic nature of Pd is one of the reasons I still prefer it over SuperCollider, even though patching takes much more time than writing sclang code.

How much of it is the lack of optimization of sclang, or is it server latency proper?

Assume that two processes A and B are running independently from each other. Process A might send a message to B just when the latter has already started or finished its cycle, so the message will only be dispatched at the next cycle of B.

Note that the Server computes control blocks in batches. For example, with a hardware buffer size of 256 samples, every audio callback will compute 4 blocks of 64 samples in a row. Now, if you’re out of luck, your message might be received just while or after the last block has been computed, after which the audio thread goes to sleep for the remaining time slice. This is the reason why message dispatching is quantized to the hardware buffer size (in the worst case) and not to the Server block size (typically 64 samples). The actual quantization depends on the CPU load. If the CPU load is low, the audio callback finishes very quickly and spends most of its time sleeping, so quantization approaches the hardware buffer size period (e.g. 5.3 ms for 256 samples @ 48 kHz). If the CPU load is high, the quantization is less pronounced as the callback spends less time sleeping and more time processing blocks, giving messages the opportunity to “sneak in”.
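To put numbers on that worst case (the 5.3 ms figure corresponds to 256 samples at 48 kHz):

```
// worst-case dispatch quantization for some common hardware buffer sizes at 48 kHz
[64, 128, 256, 512, 1024].do { |n|
    "% samples -> % ms".format(n, (n / 48000 * 1000).round(0.1)).postln;
};
```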


To illustrate further why you cannot “directly” turn MIDI data into audio in sclang:

First the MIDI timer thread needs to wake up and read incoming MIDI messages. Then it needs to obtain the global interpreter lock – which might be currently held by another sclang thread! Only then can it dispatch the MIDI message to the user code, which may in turn send messages to the Server. The Server will finally receive the message and dispatch it in the upcoming control block. Depending on the “phase” of the audio callback, the message may be dispatched in a few microseconds – or in a few milliseconds.


I believe not much effort has been put into fine-tuning and optimizing this. There must be ways to mitigate it.
First, optimizing the language: creating larger Slots (128 bits) and parallelizing the computation. MIDI is a sensitive part, and it could also have its own priority. And then, also trying to have some sort of “soft sync” with the server. (I know, we only know whether some “optimization” works by testing it, but it never got much attention.)

Or “hard sync” as you mentioned before.

Or maybe even think of another language (not sclang) that has a dedicated separate real-time clock in sync with the server.

None of this would change the fact that the client needs to send messages to the Server and the exact time of reception cannot be controlled. Even if the Server were to wake up the Client, it might already have finished its cycle before the Client could send its messages. This delay is inherent in asynchronous processes. The upside is that the Client cannot block the Server. That’s the fundamental trade-off!

Or “hard sync” as you mentioned before.

Also known as SC2 :slight_smile:


The SuperCollider Server has pretty good latency, to my understanding. I like this design. One could just try to mitigate scheduling problems on the language side as seriously as the server is optimized. That’s all I’m saying. )))))

I usually use a system with a minimal desktop, a real-time kernel, and low latency. I have never had problems with that playing live. But I would be very concerned if I could perceive jitter, irregularities, or latency in the language when I’m playing. I would maybe find another option.

I don’t care much about how large the std lib is or anything; I’m more concerned that the language implementation starts to become “legacy code” that we got used to but that is not up to the engineering excellence of the server(s).

Even in that case, a message can arrive at the server just a few samples too late, and have to wait for the next callback.

Set your soundcard to a large buffer size, then set up a basic MIDI synth in VCV Rack and you’ll hear the same quantization (i.e. unplayability). Or just about any DAW. These systems are synchronous, unlike SC, so synchronicity by itself isn’t enough to solve this problem.

FWIW even two decades ago, when I was running SC on an underpowered G4 iBook, I didn’t notice significant delays in the handling of incoming MIDI messages. If MIDI hasn’t been optimized, it’s because it already works well enough.

hjh

True! The more important thing is to decrease the hardware buffer size as much as possible.

My point is rather that in a synchronous system, once the MIDI event is received by the audio thread, it is guaranteed to be processed in that audio block. If it is received in another process, you don’t know when it will eventually make its way into the audio callback, so it adds another layer of uncertainty. Probably not a big deal for most people, though.

BTW, there has been a recent discussion about adding MIDI functionality to the Server, but I can’t find it… In general, I can imagine a plugin API that registers UGens for MIDI events. For example, you could have a UGen that outputs the current value of a given MIDI CC channel. Or with the appropriate API functions you could even have UGens that spawn/destroy other Synths based on MIDI note-on resp. note-off messages. Of course, this would be significantly less flexible than handling MIDI events on the Client side, but you could shave off some extra latency. Just to point out some possibilities. I don’t really expect something like this to ever be implemented in SC3.
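For contrast, the current client-side route that such a hypothetical server-side UGen would bypass looks roughly like this (CC 1 and the sine mapping are arbitrary):

```
(
MIDIClient.init;
MIDIIn.connectAll;

~ccBus = Bus.control(s, 1);

MIDIdef.cc(\modWheel, { |val|
    ~ccBus.set(val / 127);       // one extra Client -> Server hop per CC message
}, ccNum: 1);

{ SinOsc.ar(440, 0, In.kr(~ccBus, 1)) }.play;
)
```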

Fair, and I never experienced it to be a big deal.

(BTW, MIDI timing slop occurs in Pd as well. I reproduced ~25 ms timing errors just now.)

hjh

This shouldn’t happen under normal circumstances. Feel free to open an issue on GitHub! (BTW, I have a PR that improves the overall responsiveness of Pd: improve scheduler by Spacechild1 · Pull Request #1756 · pure-data/pure-data · GitHub)

I thought it was unavoidable, when you’re trying to react As Fast As Possible to incoming MIDI, when the hardware block size is too large.

MIDI’s designed for hardware devices with minimum latency, not general-purpose computers with latency on the order of a dozen or more ms. (This is why, when I performed with a friend who was using his Elektron drum machine, SC was the MIDI clock source.)

I was trying to produce some empirical findings related to the notion that optimizing the MIDI receipt chain would make a noticeable difference. What I found is, it doesn’t make a big difference, not even in Pd.

I tried:

  • SC MIDI pattern (with outgoing message latency – in theory allowing Pd to take advantage of timing information) → Pd. Jittery timing.
  • Pd [noteout] → Pd [notein]. Jittery timing.
  • SC MIDI pattern → VSTPluginMIDISender (with latency – VSTPluginMIDISender is my own way of dealing with latency for VSTPlugin MIDI communication). Timing is near-perfect.

This doesn’t eliminate the variable of the ALSA MIDI layer, which could be introducing jitter…? But if the jitter were that bad, it would be unusable and they would have had to fix it a long time ago. (And I got good timing with the aforementioned SC-generated MIDI clock.) Although… what does reduce the impact of that variable is that jitter was considerably reduced by pulling down the HW buffer size.

So my conclusion is that Pd is not immune to the hardware buffer size limitation… which I think is reasonable: if MSPuckette had a magic solution to the latency problem, everyone would have stolen it by now.

hjh

Yes, but 25 ms seems exceedingly high. This would roughly correspond to a HW buffer size of 1024 samples at 44.1 kHz. I would be curious to know your audio settings. However, I have already gone very off-topic, so if you are interested in working this out (I am!) then maybe either open a ticket on GitHub or send me a PM.

Yes, I was deliberately testing a poor performance case, to demonstrate that this isn’t uniquely SC’s problem. I do know how to configure my system for good realtime response, but that wasn’t the point of this test. I was concerned that some folklore might get started that Pd’s timing is substantially better and that we just need to fix timing in sclang… nope.

hjh


Yes, I was deliberately testing a poor performance case,

Ahhh! You should have said that :wink:

to demonstrate that this isn’t uniquely SC’s problem

Of course, the hardware buffer problem is not unique to SC! I am sorry if I gave the opposite impression!

I was concerned that some folklore might get started that Pd’s timing is substantially better and that we just need to fix timing in sclang… nope.

Yes! But that’s not the point I was trying to make. The thing that Pd can do – and SC can’t – is deterministic sequencing without additional latency. In particular, the issue that @Thor_Madsen is trying to solve wouldn’t exist in Pd in the first place. See again Server clock vs. Language clock - #39 by Spacechild1.
