Proposal: SynthDefs could eliminate duplicate UGens

jamshark70 · July 3, 2021, 2:43am

This morning, I toyed with optimizing ~~Eli~~ Sam’s Maths2 pseudo-UGen, and this got me thinking about cases like the following:

SynthDef(\test, { |out|
	var sig = SinOsc.ar(440.dup, 0, 0.1);
	Out.ar(out, sig);
}).dumpUGens;

[ 0_Control, control, nil ]
[ 1_SinOsc, audio, [ 440, 0 ] ]
[ 2_*, audio, [ 1_SinOsc, 0.1 ] ]
[ 3_SinOsc, audio, [ 440, 0 ] ]
[ 4_*, audio, [ 3_SinOsc, 0.1 ] ]
[ 5_Out, audio, [ 0_Control[0], 2_*, 4_* ] ]

Ooh, fun, both 3 and 4 are redundant.

I think we could catch these cases:

Add a UGen:isDeterministic flag. (OK, big job to go through hundreds of UGen classes.)
Add to UGen, isEquivalent { |that| ... } which returns true if a and b are the same class, and a is deterministic (which implies b to be deterministic as well), and a’s inputs are identical to b’s inputs.
- Nondeterministic UGens, even with identical inputs, would have to fail this test.
For every UGen that is created, if there is a previously-created equivalent UGen, dump the new one and substitute the old one.
- If this scans linearly through the existing array, execution time would be proportional to N^2, which is bad.
- That could be optimized by keeping an IdentityDictionary of UGen classes, so that a new SinOsc would scan only previously created SinOscs. One step further would be to key *aryOpUGens by the operator name instead of the class, so that + operators would scan only other +s and not *s.

Just throwing it out there for discussion for now. I have a show in about a week so I don’t have time to work on this right at the moment. But, I think this might be good, especially if the performance impact could be kept to a minimum. The SinOsc.ar(440.dup) case is pretty common, and even experienced users sometimes write the same a * b 10 times in 15 lines. If the system could automatically streamline many of these, I think that would be good.

Comments?

hjh

Spacechild1 · July 3, 2021, 11:30am

Is it? Why would someone write this? It just doesn’t make any sense with deterministic UGens.

The repeated a * b case is more interesting. But then again, that’s a very easy optimization to make yourself.

Generally, I don’t think the job of the optimizer is to fix bad user code. Instead, it makes sure that idiomatic code performs well. Examples:

topological sort to improve the performance of multi-channel expansion
collapsing a * b + c expressions to a single MulAdd UGen

Every dynamic language has a set of good practices to avoid needlessly slow code. I think we just need to document such pratices for sclang, or more specifically for writing SynthDefs.

Spacechild1 · July 3, 2021, 2:11pm

More specifically, in the case of SinOsc.ar(440.dup) there already is an idiomatic way to write this correctly, so I don’t think we need the optimizer here.

jamshark70 · July 3, 2021, 11:01pm

New users.

It’s actually quite a lot to visualize a UGen graph. You’re right that users should understand what they are writing and avoid redundancy by hand. In practice, it may be years before a user is truly able to do this well.

This thread was inspired by Phaseshaping Osc Algorithms - #15 by Sam_Pluta - which reflects another scenario (writing under time constraints and not having time to pay attention to finer points of SynthDef efficiency). It’s of course completely valid to write quickly; perhaps the system could be more tolerant of less-than-ideal coding circumstances.

hjh

Spacechild1 · July 3, 2021, 11:36pm

What about adding a guide to the documentation about performance tuning? This would mean that users who actually encounter a performance problem can learn how to improve their code.

Examples:
Lua https://www.lua.org/gems/sample.pdf
Python PythonSpeed/PerformanceTips - Python Wiki
JS/Web Web performance | MDN
C++ Optimized C++ [Book] (a whole book!)

I think it’s more reasonable to add some documentation than adding complicated and brittle optimization code that is a nightmare to debug.

In practice, it may be years before a user is truly able to do this well.

It’s expected that beginners write suboptimal code. But we can give them materials to learn and improve.

“I know it’s inefficient but it’s too annoying to clean up everything so I won’t bother”

If it’s not a bottleneck, why bother indeed? The important part is that it can be fixed later.

I really want to stress again the difference between inefficiencies that can be fixed idiomatically by the user and those which can’t (and therefore need the optimizer).

rdd · July 4, 2021, 12:51am

Hello,

As so often there’s a Fredrik Olofsson graph to demonstrate!

// tw 0011 (f0)
{var s = {|o,i| SinOsc.ar([i,i + 0.0001] ** 2 * f.value(o,i-1),f.value(o,i - 1) * 0.0001) * f.value(o,i - 1)}
;var f = {|o,i| if(i > 0,{s.value(o,i)},o)}
;f.value(60,6) / 60}

Here scsynth reports 1950 UGens for this graph.

After common subexpression elimination the graph has 48 UGens.

Best,
Rohan

Ps. I think the Finally tagless, partially evaluated paper has a nice description of the problem (made clearer again in the Implementing Explicit and Finding Implicit Sharing paper) but I don’t know of Smalltalk work in this area.

rdd · July 4, 2021, 12:52am

Pps. A picture of the post-common subexpression elimination graph…

rdd · July 4, 2021, 6:30am

Ppps. Reference had been detached above somehow, apologies, it’s:

// tw 0011 (f0) ; https://twitter.com/redFrik/status/23182604046
{f={|o,i|if(i>0,{SinOsc.ar([i,i+1e-4]**2*f.(o,i-1),f.(o,i-1)*1e-4,f.(o,i-1))},o)};f.(60,6)/60}

Sam_Pluta · July 4, 2021, 6:33am

Or I wrote the code in the back of a tour bus as the battery ran out on my laptop and just wanted to share something I was working on with another user.

Though I do appreciate the valuable feedback that this forum provides (including this feedback), I am finding the use of my code as an “example” on two occasions here, quite inappropriate (if perhaps inadvertently so). Can we please watch how we word things when criticizing the work of another user?

Sam

jamshark70 · July 4, 2021, 7:10am

A couple of a priori assumptions there, no? Do you know for sure that it would be brittle and un-debuggable?

“That sounds hard, so don’t try.”

In fairness, it is worth a look at the costs involved:

Maintenance cost: Unknown. What I’m describing isn’t trivial, but I think it has a good chance of being better than you’re assuming.
Performance cost vs benefit:
- Benefit: Unknown. We have no idea what percentage of SC users are writing well vs poorly optimized SynthDefs.
- Cost: I’ll admit to being a bit concerned about this. The worst case would be large-scale additive synthesis where all of the deterministic oscillators do have different inputs. In that case, you can’t throw any of them away (no benefit) but it would take time to be sure that you can’t throw any of them away – possibly as much time as O(n^2), though we could easily improve on that by partitioning UGens by class (and further by looking up input arrays by hash in a Set, instead of linearly scanning).
  If we could be sure that few users would benefit, or that they would benefit only trivially, and the performance cost could not be reduced, then there would be a concrete argument against adding the logic. (I would find this line of reasoning more convincing than assumptions about the stability of an algorithm that hasn’t been written yet.)

I do understand this point, but I don’t fully agree. User-friendliness matters. It’s important (something that the Pd developer community seems not to understand).

Even I feel sometimes like “Oh why do I have to make another variable for a sub-expression?” What if… users simply didn’t have to worry about that? Wouldn’t it make the platform more relaxed and enjoyable to use? (Now, if it dramatically slows down large-SynthDef construction, then that would negate the benefit… but I do not agree that there is no meaningful benefit to be negated.)

hjh

jamshark70 · July 4, 2021, 10:41am

Ahhhh… Ok I do apologize for that… I’ll go back and edit the wording.

I’m a way, this helps to explain why it could be helpful to do more optimization. If the system eliminates some redundancy automatically, then it simply isn’t an issue… While understanding Christof’s point, sometimes you really are on a tour bus without much battery left, and at those times, it’s nice if the system can help you out more rather than less.

hjh

Sam_Pluta · July 4, 2021, 11:20am

I very much appreciate the apology. I mean…would I have gone back and optimized everything as you did? I don’t know! But I definitely wasn’t done working on it. And if efficiency is the issue, the faust ugen is definitely the way to go anyway.

Sam

Spacechild1 · July 4, 2021, 2:15pm

Of course there is always (some) benefit in optimization. I think my reply was too negative.

I just tried to raise some concerns regarding future maintainance costs (aka technical debt). The Class Library is already huge and - I dare say - convoluted. I would only add additional complexity if the benefit clearly outweights the costs.

I would never claim that Pd is the pinnacle of user-friendlyness

I think your two examples are interesting, but they are also quite different, so let’s break it down a little:

a) SomeUGen.ar(x ! y) vs SomeUGen.ar(x) ! y

These expressions have completely different meanings. If the user wanted A but wrote B instead, it’s their mistake. I don’t think the optimizer should fix user mistakes. Just because in some circumstances both expressions can yield the same sounding result doesn’t mean you can/should ignore the fundamental semantic difference.

b) common subexpression elimination

In theory, this is a good job for the optimizer. It is a technique that is typically employed in compiled languages like C++. However, these optimizations are not trivial. You already mentioned O(N^2) complexity, so you probably have to set a limit. To give another example: Should it deal with associativity?

var a = SinOsc.ar(440), b = SinOsc.ar(220);
var c, d;
...
b = SinOsc.ar(a * 0.5 * b);
c = SinOsc.ar(b * 0.5 * a);

Where do you draw the line? Generally, I think it’s better to have no CSE than half-baked CSE.

I’m not saying it wouldn’t be a nice feature, I’m just really wondering if it can be implemented in a reasonable way… But I don’t want to discourage you from trying

New users.

I think for beginners performance rarely matters in practice. They are happy if they get something playing as intended. Once they start building more complex things and hit a performance bottleneck, they can learn about optimization tricks. Otherwise it just doesn’t matter.

Let’s say we have the following UGen graph function:

var a = SinOsc.ar(440);
...
// do something with 'a'
// lots of code
...
[ SinOsc.ar(a * 0.5), Saw.ar(a * 0.5) ];
}

Well, I could simply do

// lots of code
...
var b = a * 0.5;
[ SinOsc.ar(b), Saw.ar(b) ];

Right?

Nope. Sclang mandates that variables must be declared at the beginning of a function . For me, this is one of the most irritating things about sclang. The only other language I know which does this is C89 (because of memory limitations of computers in the 70s).

(Offtopic: Another thing which really bugs me is that sclang won’t inline functions with variable declarations and yells at me if I try. As a library author I’m expected to avoid such warnings, but as a result the code becomes incredibly awkward).

IMO, those oddities don’t make sclang a very user-friendly language to begin with. I have the feeling that the need for CSE mainly stems from limitations in the language itself.

jamshark70 · July 4, 2021, 11:42pm

Sure, and it’s “as time permits” for me too. Practically speaking, that may mean “forever unfinished.”

It’s a bit unfair to refer to O(n^2) though, as I’ve already suggested using hash lookup for identifying previous input lists. While updating a hash table does have a cost, lookup is certain to be faster than a linear search so the aggregate complexity should be well less than n squared.

IdentityDictionary[
    (SinOsc -> Dictionary[
        [440, 0] -> sinoscInstance
    ])
]

hjh

Spacechild1 · July 4, 2021, 11:49pm

I was referring to common subexpression elimination. But maybe you can use hashtables there as well.

jamshark70 · July 5, 2021, 1:06am

In my view:

a = SinOsc.ar(440);
...
b = SinOsc.ar(440);

The two SinOscs are common subexpressions. So it’s all one problem, not two.

But I know what you mean – math operators – which I did already think of – see first post: “One step further would be to key *aryOpUGens by the operator name instead of the class, so that + operators would scan only other +s and not *s.”

For + and * it would be pretty easy to check both [a, b] and [b, a].

a + b + c vs c + b + a would be harder. I wouldn’t tackle that one, at least not in the first go-round.

I’m thinking now that it might be better to make it a quark. Users could install it and write e.g. SynthDef(...).addOptimal or such, if they want. If it isn’t part of the main class library, then concerns about main library maintenance disappear (and users assume risk).

hjh

Spacechild1 · July 5, 2021, 12:27pm

The two SinOscs are common subexpressions. So it’s all one problem, not two.

Ok, for that to work we would really need the isDeterministic method.

I’m thinking now that it might be better to make it a quark.

I was about to suggest this!

However, I’m not sure if it’s a good idea - or even possible - to optimize the SynthDef after the fact. Instead we could allow users to add additional optimization passes. In the simplest case, it could be a SynthDef.addOptimizationPass class method which takes a user defined function that is called at the right spot during SynthDef compilation (probably in optimizeGraph)

redFrik · July 5, 2021, 12:57pm

However, I’m not sure if it’s a good idea - or even possible - to optimize the SynthDef after the fact. Instead we could allow users to add additional optimization passes. In the simplest case, it could be a SynthDef.addOptimizationPass class method which takes a user defined function that is called at the right spot during SynthDef compilation (probably in optimizeGraph)

a Quarks sounds like a good idea.

though, as a start, how about one that just scan your synthdefs and report back metrics? that’d be super useful on its own. the quark could analyse, sort and flag the worst offenders - the defs containing a lot of duplicated ugens. then it’s easy to find and manually optimise them as needed.
_f

#|
fredrikolofsson.com musicalfieldsforever.com

rdd · July 5, 2021, 9:20pm

Hello,

There is:

WhiteNoise.superclass == UGen
SinOsc.superclass == PureUGen

(Which is not only for nice alignment!)

Best,
Rohan

Ps. Also, to be very clear, I mentioned Fredrik’s lovely graph for being like the Sklansky circuit in the Kiselyov paper, that is as introducing sharing by construction. Also because it is pleasing (and simple!) for a computer program to rewrite.

jamshark70 · July 5, 2021, 10:56pm

But IIRC, LPF.ar.isKindOf(PureUGen) is false.

A question: Is the use of a random number generator the only way for a UGen to be non-deterministic?

hjh