Yes! I think the primary reason has been multi-channel expansion, because it naturally adds the UGens in breadth-first order. This can lead to excessive wire buffer usage; in fact, you can easily run out of wire buffers altogether.
Consider the following snippet:
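(
~numOsc = 10;
SynthDef(\test, { |out, amp = 1.0|
    // multichannel expansion: one SinOsc per frequency (441 .. 440 + ~numOsc)
    var sig = SinOsc.ar((1..~numOsc) + 440);
    // one random amplitude per oscillator, fixed when the SynthDef is built
    var amps = { exprand(0.1, 1.0) } ! ~numOsc;
    sig = sig * amps;
    // mix down to one channel and normalize by the oscillator count
    Out.ar(out, Mix.ar(sig) / ~numOsc * amp);
}).add.dumpUGens;
)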
This is the output without topological sort:
test
[ 0_Control, control, nil ]
[ 1_SinOsc, audio, [ 441, 0.0 ] ]
[ 2_SinOsc, audio, [ 442, 0.0 ] ]
[ 3_SinOsc, audio, [ 443, 0.0 ] ]
[ 4_SinOsc, audio, [ 444, 0.0 ] ]
[ 5_SinOsc, audio, [ 445, 0.0 ] ]
[ 6_SinOsc, audio, [ 446, 0.0 ] ]
[ 7_SinOsc, audio, [ 447, 0.0 ] ]
[ 8_SinOsc, audio, [ 448, 0.0 ] ]
[ 9_SinOsc, audio, [ 449, 0.0 ] ]
[ 10_SinOsc, audio, [ 450, 0.0 ] ]
[ 11_*, audio, [ 1_SinOsc, 0.14781964574475 ] ]
[ 12_*, audio, [ 2_SinOsc, 0.61188809040249 ] ]
[ 13_*, audio, [ 3_SinOsc, 0.87976904193435 ] ]
[ 14_*, audio, [ 4_SinOsc, 0.49579341485918 ] ]
[ 15_*, audio, [ 5_SinOsc, 0.14011180115573 ] ]
[ 16_*, audio, [ 6_SinOsc, 0.61413128730507 ] ]
[ 17_*, audio, [ 7_SinOsc, 0.4674508871795 ] ]
[ 18_*, audio, [ 8_SinOsc, 0.3693163891174 ] ]
[ 19_*, audio, [ 10_SinOsc, 0.16267632509419 ] ]
[ 20_Sum4, audio, [ 14_*, 13_*, 12_*, 11_* ] ]
[ 21_Sum4, audio, [ 18_*, 17_*, 16_*, 15_* ] ]
[ 22_MulAdd, audio, [ 9_SinOsc, 0.23164356857488, 19_* ] ]
[ 23_Sum3, audio, [ 22_MulAdd, 21_Sum4, 20_Sum4 ] ]
[ 24_/, audio, [ 23_Sum3, 10 ] ]
[ 25_*, audio, [ 24_/, 0_Control[1] ] ]
[ 26_Out, audio, [ 0_Control[0], 25_* ] ]
This needs as many wire buffers as SinOscs! With ~numOsc = 100 you already exceed the default number of wire buffers (64).
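As a stopgap, the limit itself can be raised through the server options. A minimal sketch, assuming the default server is stored in `s`; the value 128 is an arbitrary example, not a recommendation:

```supercollider
// Assumption: `s` is the default server; 128 is just an example value.
s.options.numWireBufs = 128; // the default is 64
s.reboot; // server options only take effect when the server is (re)booted
```

This only moves the ceiling; it does not change how many wire buffers the SynthDef needs.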
And this is the output with topological sort:
test
[ 0_Control, control, nil ]
[ 1_SinOsc, audio, [ 441, 0.0 ] ]
[ 2_*, audio, [ 1_SinOsc, 0.21350452668567 ] ]
[ 3_SinOsc, audio, [ 442, 0.0 ] ]
[ 4_*, audio, [ 3_SinOsc, 0.14639397071191 ] ]
[ 5_SinOsc, audio, [ 443, 0.0 ] ]
[ 6_*, audio, [ 5_SinOsc, 0.43959053276532 ] ]
[ 7_SinOsc, audio, [ 444, 0.0 ] ]
[ 8_*, audio, [ 7_SinOsc, 0.30976199993617 ] ]
[ 9_Sum4, audio, [ 8_*, 6_*, 4_*, 2_* ] ]
[ 10_SinOsc, audio, [ 445, 0.0 ] ]
[ 11_*, audio, [ 10_SinOsc, 0.53345690432921 ] ]
[ 12_SinOsc, audio, [ 446, 0.0 ] ]
[ 13_*, audio, [ 12_SinOsc, 0.2133492885536 ] ]
[ 14_SinOsc, audio, [ 447, 0.0 ] ]
[ 15_*, audio, [ 14_SinOsc, 0.7878841337054 ] ]
[ 16_SinOsc, audio, [ 448, 0.0 ] ]
[ 17_*, audio, [ 16_SinOsc, 0.10154009291624 ] ]
[ 18_Sum4, audio, [ 17_*, 15_*, 13_*, 11_* ] ]
[ 19_SinOsc, audio, [ 449, 0.0 ] ]
[ 20_SinOsc, audio, [ 450, 0.0 ] ]
[ 21_*, audio, [ 20_SinOsc, 0.22857845276941 ] ]
[ 22_MulAdd, audio, [ 19_SinOsc, 0.64841385504448, 21_* ] ]
[ 23_Sum3, audio, [ 22_MulAdd, 18_Sum4, 9_Sum4 ] ]
[ 24_/, audio, [ 23_Sum3, 10 ] ]
[ 25_*, audio, [ 24_/, 0_Control[1] ] ]
[ 26_Out, audio, [ 0_Control[0], 25_* ] ]
I think this only needs 4 + ceil(~numOsc / 4) wire buffers: 4 buffers for the Sum4 inputs and 1 buffer for every Sum4 result. (Someone please do the math!)
For 10 SinOscs this is only 10 vs 7, but for 100 SinOscs it is 100 vs 29.
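To make the comparison easy to repeat for other sizes, here is a small sketch that just evaluates the two estimates from this post. The function names are made up for illustration, and the formula for the sorted case is the unverified guess from above, not a measured value:

```supercollider
(
// Hypothetical helpers, not part of SuperCollider.
// Without topological sort: one wire buffer per SinOsc.
~wireBufsUnsorted = { |numOsc| numOsc };
// With topological sort (unverified estimate):
// 4 buffers for the Sum4 inputs plus 1 per Sum4 result.
~wireBufsSorted = { |numOsc| 4 + (numOsc / 4).ceil.asInteger };

[10, 100].do { |n|
    "% SinOscs: % vs %".format(n, ~wireBufsUnsorted.(n), ~wireBufsSorted.(n)).postln;
};
)
```

This posts "10 SinOscs: 10 vs 7" and "100 SinOscs: 100 vs 29", matching the numbers quoted here.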
Note that this could be reduced to only 5 wire buffers (for all values of ~numOsc!) if every Sum4 instance immediately summed into the final output, instead of summing all Sum4 outputs at the very end. On the downside, the final sum would be done by individual + UGens instead of Sum4 UGens. Regarding the Sum4 optimization, there seems to be an implicit tradeoff between the number of UGens and the number of required wire buffers.
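For anyone who wants to experiment with the running-sum idea, it can be written out by hand roughly like this. This is only a sketch: the SynthDef name and the `total` variable are made up for illustration, and I have not verified the resulting UGen order or wire buffer count. The graph optimizer may also fuse the chained additions back into Sum3/Sum4, so check the dumpUGens output:

```supercollider
(
~numOsc = 10;
SynthDef(\testRunningSum, { |out, amp = 1.0|
    var sig = SinOsc.ar((1..~numOsc) + 440);
    var amps = { exprand(0.1, 1.0) } ! ~numOsc;
    var total = 0;
    // sum each group of (up to) four channels, then add it to the
    // running total right away instead of collecting all group sums first
    (sig * amps).clump(4).do { |group|
        total = total + Mix.ar(group);
    };
    Out.ar(out, total / ~numOsc * amp);
}).add.dumpUGens;
)
```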