Proposal: Refactor SynthDef UGen optimizations

For myself, I’m opposed to putting an additional burden on the user.

The user already specifies the order UGens are created. That should be sufficient. My understanding of the spirit of the topo sort was to optimise that order where possible. Making the topo sort implementation simpler at the expense of user burden is putting the cart before the horse, in my view.

Yes, what Jordan proposes has some complexity, but that ship sailed a long time ago, and I think it will be simpler than the existing approach. One could argue that WidthFirstUGen was also ‘forever’. I agree we should not make changes of this sort lightly or often, but as long as we provide a reasonable path to deprecation and update (nudge, don’t break) I think it’s okay.

The optimisations have never worked entirely correctly. If @jordan’s approach can fix it, then respectfully I think it’s worth doing and fulfilling the original intent.

Listen all, there are now over 80 messages in this thread, and this is one of many similar discussions over the years, including one fairly recently.

I think what might be most productive now would be for @jordan to come up with a draft implementation. That could focus the discussion on something tangible, and allow meaningful comparisons between alternatives.

Really glad this long-standing issue is getting such thoughtful attention!


Almost done with the weak antecedents work. Thanks to everyone’s input I had a really clear idea of what I was doing, and it has only taken about 5 hours or so. Thank you!

Haven’t done the sorting of the graph yet, just making sure all the edges are there.

I don’t think it’s very complicated, to be honest.

I’ll make a draft PR soon.


It only does the weak edges right now.


The topological sort now works, and it also implements dead code elimination by default. I wanted to share an example of how it was broken before.

Here is the synthdef.

SynthDef(\a, {
	var ins = 3.collect {|i| 
		if (i == 2){ 
			SendReply.kr(Impulse.kr, '/a', DC.kr(0.1));
			PlayBuf.ar(1, 0);
		} { In.ar(i) }
	};
	ins = ins * 2;
	ins = ins + 1;
	
	ins[1] = BufWr.ar(ins[1], 0, ins[1]);
	ins[0] = SendReply.kr(Dust.kr, '/foo', ins[0]);
	
	Out.ar(0, ins.sum)
}).dumpUGens;

The new result (I haven’t enabled optimisations, so there are no MulAdds yet). As you can see, the read clearly happens before the write.

[0_In, audio, [1]]
[1_*, audio, [0_In[0], 2]]
[2_+, audio, [1_*, 1]]
[3_PlayBuf, audio, [0, 1.0, 1.0, 0.0, 0.0, 0]]  <<< here 
[4_BufWr, audio, [0, 2_+, 1.0, 2_+]] <<< and here
[5_*, audio, [3_PlayBuf[0], 2]]
[6_+, audio, [5_*, 1]]
[7_+, audio, [4_BufWr, 6_+]]
[8_In, audio, [0]]
[9_Out, audio, [0, 7_+]]
[10_*, audio, [8_In[0], 2]]
[11_+, audio, [10_*, 1]]
[12_Dust, control, [0.0]]
[13_DC, control, [0.1]]
[14_Impulse, control, [440.0, 0.0]]
[15_SendReply, control, [14_Impulse, -1, 2, 47, 97, 13_DC[0]]]
[16_SendReply, control, [12_Dust, -1, 4, 47, 102, 111, 111, 11_+]]

The old way, where they happen in the wrong order.

[0_In, audio, [0]]
[1_MulAdd, audio, [0_In[0], 2, 1]]
[2_In, audio, [1]]
[3_MulAdd, audio, [2_In[0], 2, 1]]
[4_BufWr, audio, [0, 3_MulAdd, 1.0, 3_MulAdd]]  <<< here
[5_Impulse, control, [440.0, 0.0]]
[6_DC, control, [0.1]]
[7_SendReply, control, [5_Impulse, -1, 2, 47, 97, 6_DC[0]]]
[8_PlayBuf, audio, [0, 1.0, 1.0, 0.0, 0.0, 0]]  <<< and here
[9_MulAdd, audio, [8_PlayBuf[0], 2, 1]]
[10_+, audio, [4_BufWr, 9_MulAdd]]
[11_Out, audio, [0, 10_+]]
[12_Dust, control, [0.0]]
[13_SendReply, control, [12_Dust, -1, 4, 47, 102, 111, 111, 1_MulAdd]]
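The ordering fix above can be pictured as an ordinary topological sort that is fed extra edges. Here is a minimal sketch in Python (not the actual sclang implementation; the node names and edge lists are invented for illustration) of a Kahn-style sort in which resource-order (“weak”) edges sit alongside data-dependency edges:

```python
from collections import defaultdict, deque

def topo_sort(nodes, data_edges, weak_edges):
    # Kahn's algorithm. Data edges are real signal dependencies; weak
    # edges only record resource ordering (e.g. a buffer read created
    # before a buffer write). Both are hard constraints for the sort.
    succs = defaultdict(set)
    indegree = {n: 0 for n in nodes}
    for a, b in list(data_edges) + list(weak_edges):  # a must precede b
        if b not in succs[a]:
            succs[a].add(b)
            indegree[b] += 1
    ready = deque(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        n = ready.popleft()
        order.append(n)
        for m in succs[n]:
            indegree[m] -= 1
            if indegree[m] == 0:
                ready.append(m)
    if len(order) != len(nodes):
        raise ValueError("cycle in UGen graph")
    return order

# PlayBuf reads the buffer and BufWr writes it: the weak edge keeps the
# user-specified read-before-write order despite no data dependency.
nodes = ["In", "PlayBuf", "BufWr", "Out"]
data = [("In", "BufWr"), ("BufWr", "Out"), ("PlayBuf", "Out")]
weak = [("PlayBuf", "BufWr")]
order = topo_sort(nodes, data, weak)
assert order.index("PlayBuf") < order.index("BufWr")
```

The weak edge PlayBuf before BufWr carries no signal; it only records that the user created the read before the write, so the sort may not swap them.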

It looks good and correct. The (old) algorithm treated the graph as simpler than it was. There is no “overengineering” here; it just reads the types now.

It would be good to test with many, many cases.

Just an update, I’ve now enabled basic optimisations and the graph looks like…

[0_In, audio, [1]]
[1_Sum3, audio, [0_In[0], 0_In[0], 1]]
[2_PlayBuf, audio, [0, 1.0, 1.0, 0.0, 0.0, 0]]   <<<<<< Read
[3_BufWr, audio, [0, 1_Sum3, 1.0, 1_Sum3]]   <<<< Write
[4_Sum4, audio, [3_BufWr, 2_PlayBuf[0], 2_PlayBuf[0], 1]]  <<<<< PlayBuf + PlayBuf
[5_In, audio, [0]]
[6_Out, audio, [0, 4_Sum4]]
[7_Sum3, audio, [5_In[0], 5_In[0], 1]]
[8_Dust, control, [0.0]]
[9_Impulse, control, [440.0, 0.0]]
[10_DC, control, [0.1]]
[11_SendReply, control, [9_Impulse, -1, 2, 47, 97, 10_DC[0]]]
[12_SendReply, control, [8_Dust, -1, 4, 47, 102, 111, 111, 7_Sum3]]

It’s shorter than before because I’ve enabled rewriting a * 2 into a + a, which with Sum3/4 can reduce the number of UGen calls. See GitHub for more information on the details.
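As a rough illustration of the kind of rewrite involved, here is a small Python sketch (the tuple representation and the Sum3/Sum4 folding cutoff are simplifications, not what the PR literally does):

```python
def rewrite(node):
    # Bottom-up peephole pass: ('*', a, 2) becomes ('+', a, a), and
    # nested '+' chains of 3 or 4 terms fold into Sum3/Sum4.
    op, *args = node
    args = [rewrite(a) if isinstance(a, tuple) else a for a in args]
    if op == '*' and len(args) == 2 and args[1] == 2:
        op, args = '+', [args[0], args[0]]
    if op == '+':
        flat = []
        for a in args:
            if isinstance(a, tuple) and a[0] in ('+', 'Sum3', 'Sum4'):
                flat.extend(a[1:])
            else:
                flat.append(a)
        if len(flat) == 3:
            return ('Sum3', *flat)
        if len(flat) == 4:
            return ('Sum4', *flat)
        # longer chains would need splitting; omitted to keep the sketch small
    return (op, *args)

# (a * 2) + 1 collapses to a single Sum3(a, a, 1) call.
assert rewrite(('+', ('*', 'a', 2), 1)) == ('Sum3', 'a', 'a', 1)
```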

Tests will be coming shortly, although I am unsure how best to go about this, as the graphs will be different.

A null test between old and new might be a good approach, but I am unsure how to scale this.

Another test case (harder one – the current topo sort handles this case quite badly) – Troubleshooting: numWireBufs & NamedControls problem - #5 by jamshark70

hjh

I’ll have a look, but I should say that limiting the number of wire buffers isn’t the goal, because @Spacechild1 wasn’t able to observe a performance hit.

I don’t really have time to count the wirebuffs, but I think it is better?

Ex1

(
d = SynthDef(\test, {
	var n = 3;
	var ctls = Array.fill(n, { |i|
		[
			("freq" ++ i).asSymbol.kr(110 * (i+1)),
			("amp" ++ i).asSymbol.kr(1 / (i+1))
		]
	}).flop;  // 'flop' puts all freqs in one subarray, amps in the other
	var sig = SinOsc.ar(ctls[0]) * ctls[1];
	Out.ar(0, sig.sum)
}).dumpUGens;
)

OLD

[ 0_Control, control, nil ]
[ 1_SinOsc, audio, [ 0_Control[0], 0.0 ] ]
[ 2_Control, control, nil ]
[ 3_Control, control, nil ]
[ 4_SinOsc, audio, [ 3_Control[0], 0.0 ] ]
[ 5_Control, control, nil ]
[ 6_*, audio, [ 4_SinOsc, 5_Control[0] ] ]
[ 7_MulAdd, audio, [ 1_SinOsc, 2_Control[0], 6_* ] ]
[ 8_Control, control, nil ]
[ 9_SinOsc, audio, [ 8_Control[0], 0.0 ] ]
[ 10_Control, control, nil ]
[ 11_MulAdd, audio, [ 9_SinOsc, 10_Control[0], 7_MulAdd ] ]
[ 12_Out, audio, [ 0, 11_MulAdd ] ]

NEW

[0_Control, control, [110]]
[1_SinOsc, audio, [0_Control[0], 0.0]]
[2_Control, control, [1.0]]
[3_*, audio, [1_SinOsc, 2_Control[0]]]
[4_Control, control, [220]]
[5_SinOsc, audio, [4_Control[0], 0.0]]
[6_Control, control, [0.5]]
[7_MulAdd, audio, [5_SinOsc, 6_Control[0], 3_*]]
[8_Control, control, [330]]
[9_SinOsc, audio, [8_Control[0], 0.0]]
[10_Control, control, [0.33333333333333]]
[11_MulAdd, audio, [9_SinOsc, 10_Control[0], 7_MulAdd]]
[12_Out, audio, [0, 11_MulAdd]]

Ex2

(
d = SynthDef(\test, {
	var n = 3;
	var freqs = Array.fill(n, { |i|
		("freq" ++ i).asSymbol.kr(110 * (i+1))
	});
	var amps = Array.fill(n, { |i|
		("amp" ++ i).asSymbol.kr(1 / (i+1))
	});
	var sines = SinOsc.ar(freqs);
	var sig = sines * amps;
	Out.ar(0, sig.sum)
}).dumpUGens;
)

OLD

[ 0_Control, control, nil ]
[ 1_SinOsc, audio, [ 0_Control[0], 0.0 ] ]
[ 2_Control, control, nil ]
[ 3_SinOsc, audio, [ 2_Control[0], 0.0 ] ]
[ 4_Control, control, nil ]
[ 5_SinOsc, audio, [ 4_Control[0], 0.0 ] ]
[ 6_Control, control, nil ]
[ 7_Control, control, nil ]
[ 8_*, audio, [ 3_SinOsc, 7_Control[0] ] ]
[ 9_MulAdd, audio, [ 1_SinOsc, 6_Control[0], 8_* ] ]
[ 10_Control, control, nil ]
[ 11_MulAdd, audio, [ 5_SinOsc, 10_Control[0], 9_MulAdd ] ]
[ 12_Out, audio, [ 0, 11_MulAdd ] ]

NEW

[0_Control, control, [110]]
[1_SinOsc, audio, [0_Control[0], 0.0]]
[2_Control, control, [1.0]]
[3_*, audio, [1_SinOsc, 2_Control[0]]]
[4_Control, control, [220]]
[5_SinOsc, audio, [4_Control[0], 0.0]]
[6_Control, control, [0.5]]
[7_MulAdd, audio, [5_SinOsc, 6_Control[0], 3_*]]
[8_Control, control, [330]]
[9_SinOsc, audio, [8_Control[0], 0.0]]
[10_Control, control, [0.33333333333333]]
[11_MulAdd, audio, [9_SinOsc, 10_Control[0], 7_MulAdd]]
[12_Out, audio, [0, 11_MulAdd]]

FYI, the PR is now functional, but the tests and docs need updating, if anyone wants to try it.

Great work!

I don’t really have time to count the wirebuffs, but I think it is better?

Note that for testing purposes you could just print the number of wire buffers inside scsynth so we don’t have to count :slight_smile:

As I said, I only tested with scsynth, and things might be different with Supernova! (In Supernova every Synth needs its own array of wire buffers; with many Synths this can add up quickly.) Also, I only did some quick synthetic benchmarks; these things also need to be tested with real-life workloads.

Ah, I forgot about Supernova. I’m not too familiar with scsynth; is it possible to get the number of wire buffers back into sclang? It would be good to make some tests, at least making sure things scale in a particular way.

GraphDef_Read (in SC_GraphDef.cpp) is the function that actually creates the SynthDef. At the end of the function you can just post the value of mNumWireBufs to the console.


Today I’ve made tests and common subexpression elimination.

The old tests were checking that the optimisation was applied. Now I fill a buffer with an un-optimised SynthDef and an optimised one, ensuring they fulfil a null test (below -240 dB).
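The null-test idea reduces to comparing two rendered signals sample by sample and checking the peak difference in dB. A minimal sketch in Python (not the actual test code in the PR), assuming the two renders are plain arrays of samples:

```python
import math

def null_test_db(a, b, eps=1e-12):
    # Peak absolute difference between two rendered signals, in dB;
    # at or below -240 dB the two graphs are considered to null.
    diff = max(abs(x - y) for x, y in zip(a, b))
    return 20 * math.log10(max(diff, eps))

sig = [math.sin(i / 10) for i in range(64)]
assert null_test_db(sig, sig) <= -239.9                    # identical: nulls
assert null_test_db(sig, [x + 1e-3 for x in sig]) > -240   # differs: fails
```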

Here are some benchmarks from the optimisations. @jamshark70 I’ve taken these from many of the threads on here you’ve been involved in over the years, and I’ve also borrowed heavily from your CSE implementation, so thank you :heart: !

The method SynthDef.newWithoutOptimisations turns off sorting, replacement, and CSE; it literally runs the UGens in the order the user supplied.

Test 1, very wide.

(
f = { |n = 100|
	bench {
		SynthDef.newWithoutOptimisations(\additive, {
			var oscs = Array.fill(n, { SinOsc.ar(ExpRand(200, 300)) });
			Out.ar(0, (oscs.sum * 0.001).dup)
		}).children.size.debug("Num UGens");
	};
	
	bench {
		SynthDef(\additive, {
			var oscs = Array.fill(n, { SinOsc.ar(ExpRand(200, 300)) });
			Out.ar(0, (oscs.sum * 0.001).dup)
		}).children.size.debug("Num UGens");
	};
};
)
f.(2000);
Num UGens: 6001
time to run: 0.049266753999994 seconds.
Num UGens: 4669
time to run: 0.634530913 seconds.

f.(200);
Num UGens: 601
time to run: 0.010466048999994 seconds.
Num UGens: 469
time to run: 0.022342442999999 seconds.

f.(20);
Num UGens: 61
time to run: 0.001505336000001 seconds.
Num UGens: 49
time to run: 0.0039255450000013 seconds.

Test 2, very deep.


(
f = { 
	bench {
		SynthDef.newWithoutOptimisations(\additive, {
			// tw 0011 (f0)
			var s = { |o, i| SinOsc.ar([i, i + 0.0001] ** 2 * f.value(o, i - 1), f.value(o, i - 1) * 0.0001) * f.value(o, i - 1)};
			var f = { |o, i| if(i > 0, { s.value(o, i) }, o)};
			Out.ar(0, (f.value(60, 6) / 60).sum);
		}).children.size.debug("Num UGens");
	};
	
	bench {
		SynthDef(\additive, {
			// tw 0011 (f0)
			var s = { |o, i| SinOsc.ar([i, i + 0.0001] ** 2 * f.value(o, i - 1), f.value(o, i - 1) * 0.0001) * f.value(o, i - 1)};
			var f = { |o, i| if(i > 0, { s.value(o, i) }, o)};
			Out.ar(0, (f.value(60, 6) / 60).sum);
		}).children.size.debug("Num UGens");
	};
};
)

Results. I think this is quite impressive!

f.();
Num UGens: 1944
time to run: 0.025203086000005 seconds.
Num UGens: 48   <<< wow!
time to run: 0.30789417 seconds.
-> 0.30789417

Discovered yet another bug. This one is quite important. #6411

TLDR:
RecordBuf will silently accept a scalar as input, and then records garbage rather than the expected signal.


b = Buffer.alloc(s, 0.01 * s.sampleRate);
{ RecordBuf.ar(0, b, loop: 0, doneAction: 2) }.play;
b.getToFloatArray(action: {|ar| ar.postln });

// Should be filled with zeros
FloatArray[0.0, 0.0, 0.0, 0.0, 0.0, 3.0782323365823e-41, -6.9850717199076e+35, 4.580564420185e-41, 1.0, 2.3510242946256e-38, 0.0, 0.0, 0.0, 0.0, -6.9850970729196e+35, 4.580564420185e-41, 2.0, 2.8025969286496e-44, 0.0, 0.0, 0.0, 3.0782323365823e-41, -6.9851224259316e+35, 4.580564420185e-41, -99.0, 9.4039548065783e-38, -6.9853981399372e+35, 4.580564420185e-41, 0.0, 4.580564420185e-41, -6.9851477789436e+35, 4.580564420185e-41, 0.0, 0.0, -6.9854932137322e+35, 4.580564420185e-41, 2.8025969286496e-45, 9.403954806...etc...

This is important for optimisations such as a * 0 => 0.

b = Buffer.alloc(s, 0.01 * s.sampleRate);

{ RecordBuf.ar(SinOsc.ar * 0, b, loop: 0, doneAction: 2) }.play;

b.getToFloatArray(action: {|ar| ar.postln });

// Should be filled with zeros
FloatArray[0.0, 0.0, 0.0, 0.0, 0.0, 3.0782323365823e-41, -6.9850717199076e+35, 4.580564420185e-41, 1.0, 2.3510242946256e-38, 0.0, 0.0, 0.0, 0.0, -6.9850970729196e+35, 4.580564420185e-41, 2.0, 2.8025969286496e-44, 0.0, 0.0, 0.0, 3.0782323365823e-41, -6.9851224259316e+35, 4.580564420185e-41, -99.0, 9.4039548065783e-38, -6.9853981399372e+35, 4.580564420185e-41, 0.0, 4.580564420185e-41, -6.9851477789436e+35, 4.580564420185e-41, 0.0, 0.0, -6.9854932137322e+35, 4.580564420185e-41, 2.8025969286496e-45, 9.403954806...etc...
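One way to keep a rewrite like a * 0 => 0 safe is a post-optimisation pass in which a consumer UGen patches up inputs it can no longer accept as bare scalars. A Python sketch of that shape, where the DC wrapper and the needs_signal predicate are hypothetical stand-ins, not the PR’s actual API:

```python
def fix_scalar_inputs(ugen, inputs, needs_signal):
    # After constant folding (e.g. SinOsc.ar * 0 -> 0), wrap any bare
    # scalar feeding an input that must be a signal. 'DC' here is a
    # hypothetical stand-in for a constant-signal UGen.
    return [('DC', x) if needs_signal(ugen, i) and isinstance(x, (int, float))
            else x
            for i, x in enumerate(inputs)]

# RecordBuf's first input must be a signal; without a pass like this the
# fold would hand it the raw scalar 0 and it would record garbage.
needs = lambda ugen, i: ugen == 'RecordBuf' and i == 0
assert fix_scalar_inputs('RecordBuf', [0, 'buf'], needs) == [('DC', 0), 'buf']
```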

Edit: I’ve added a method to let UGens rewrite their inputs post optimisation.

I’ve marked the PR as no longer a draft and I think it is now in a state that people can start using it and having a look at the approach taken.

… It’s a big PR (+3,400 −1,790), but quite a lot of that is tests, and I think my approach here is much better than what was there before.

I did some benchmarks between un-optimised and optimised graphs; TLDR: anything between 2x and 37x was measured.

I need to write some docs for UGen authors but it basically boils down to adding these methods if you want the UGen to be optimised correctly. If you don’t add them, only performance suffers, not behaviour.

For a ‘pure’ UGen (the class PureUGen is no longer recommended, and I’ve removed all uses of it from the class library):

resourceManagers { ^[] }
hasObservableEffect { ^false }
canBeReplacedByIdenticalCall { ^true }

For BufRd

resourceManagers { ^[UGenBufferResourceManager] }
bufferAccessType { ^\read }
hasObservableEffect { ^false }
canBeReplacedByIdenticalCall { ^true }

BufWr

resourceManagers { ^[UGenBufferResourceManager] }
bufferAccessType { ^\write }
hasObservableEffect { ^true }
canBeReplacedByIdenticalCall { ^true }  // subsequent identical writes do nothing

Out

resourceManagers { ^[UGenBusResourceManager] }
busAccessType { ^\write }
hasObservableEffect { ^true }
canBeReplacedByIdenticalCall { ^false }  // Out adds to the existing buffer

ReplaceOut

resourceManagers { ^[UGenBusResourceManager] }
busAccessType { ^\overwrite }
hasObservableEffect { ^true }
canBeReplacedByIdenticalCall { ^true }  // replaces buffer

Just coming back to this because I forgot to post the results.
The printing code:

GraphDef* GraphDef_Read(World* inWorld, char*& buffer, GraphDef* inList, int32 inVersion) {
    ...
    scprintf("%s: numWires %d numWireBuffers %d\n", graphDef->mNodeDef.mName, graphDef->mNumWires, graphDef->mNumWireBufs);
}

Example

(
d = SynthDef(\test, {
	var n = 50;
	var freqs = Array.fill(n, { |i|
		("freq" ++ i).asSymbol.kr(110 * (i+1))
	});
	var amps = Array.fill(n, { |i|
		("amp" ++ i).asSymbol.kr(1 / (i+1))
	});
	var sines = SinOsc.ar(freqs);
	var sig = sines * amps;
	Out.ar(0, sig.sum)
});

d.dumpUGens;

d.add;
)

Results

// this PR n == 50
test: numWires 301 numWireBuffers 2

// current Dev n == 50
test: numWires 201 numWireBuffers 50

As you can see, this produces the optimum number of wire buffers.


Cool. I wonder if it’s any different with arrayed controls – it shouldn’t be…? Just with arrayed controls, all the channels’ descendants become “available” all at once, rather than one by one.

	var n = 50;
	var freqs = \freqs.kr(Array.fill(n, { |i| 110 * (i+1) }));
	var amps = \amps.kr( Array.fill(n, { |i| 1 / (i+1) }));

In current dev, this variant produces a wide graph too.

hjh

I’ll check, but it shouldn’t because the antecedent is an output proxy. I had a whole saga dealing with those…

I can’t upload an image directly onto GitHub so I’m just dumping this here, but this is a flame graph of the very large SynthDef that reduces 2k UGens down to 48 with CSE, taking 0.8 seconds. Dictionary appears to be the main cause of the slowdown, as the keys are [rate, inputArray, weakAntecedentsArray], and nested arrays make the equality check slow. If any smart people can think of a better way of hashing this information, let me know :slight_smile:
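One common remedy (sketched here in Python; sclang’s Dictionary and hashing work differently, so this is only the shape of the idea) is to compute the hash of the nested key once, up front, so lookups and rehash-on-grow never deep-walk the arrays again:

```python
class CSEKey:
    # Precompute the hash of the nested [rate, inputs, weakAntecedents]
    # key at construction; dictionary lookups and rehashing then reuse
    # the cached value instead of re-walking the nested structure.
    __slots__ = ('rate', 'inputs', 'weak', '_hash')

    def __init__(self, rate, inputs, weak):
        self.rate = rate
        self.inputs = tuple(inputs)
        self.weak = tuple(weak)
        self._hash = hash((rate, self.inputs, self.weak))

    def __hash__(self):
        return self._hash

    def __eq__(self, other):
        # Cheap hash comparison first; full comparison only on a match.
        return (self._hash == other._hash and self.rate == other.rate
                and self.inputs == other.inputs and self.weak == other.weak)

seen = {}
k1 = CSEKey('audio', ['SinOsc', 440], [])
k2 = CSEKey('audio', ['SinOsc', 440], [])
seen[k1] = 'node1'
assert seen[k2] == 'node1'   # identical calls find the same entry
```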

Also, Dictionary.grow is a little under O(n^2) (a different flame graph). If anyone has any ideas on that, it would also make a big difference!