Dot product UGen

Hello everybody!

I came across something that I thought would be pretty straightforward to implement, but somehow it is driving me crazy.

I’d like to calculate the dot product between a vector of x signals and an x*x matrix. In the language I would write it like this:

(
var dot = { |m, v|
	// one output per row of m: multiply the row by v elementwise, then sum
	m.collect({ |row| (row * v).sum })
};
dot.value([[0.5, 0], [0.5, 0.5]], [3, 4]) //  -> [1.5, 3.5]
)

But how could I multiply the output of a UGen with a Matrix?

DC.ar([3,4])*[[0.5,0],[0.5,0.5]]

Any suggestions? Thank you!

ddgg

Actually there’s no difference as multichannel expansion is done in the language.


// inner arrays of m understood as rows, vectors as columns
// of course convention can be changed

~linMap = { |m, v| (m.flop * v).flop.collect(_.sum) };

~linMap.([[10, 20], [50, 100]], [5, 7]);


x = { 
	Saw.ar(
		~linMap.(
			[[10, 20], [50, 100]], 
			[5, 7] * SinOsc.ar(0.1).range(0.98, 1.02)
		).poll,
		0.1
	)
}.play

x.release
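
To connect this back to the original question, here is a minimal sketch that reuses the row-by-row `dot` function from the first post unchanged, once with plain numbers and once with the `DC.ar([3, 4])` signal from the question. The use of `poll` and `Silent.ar` is an assumption made only to inspect the values without sending DC to the speakers.

```supercollider
(
// same function as in the first post
var dot = { |m, v| m.collect({ |row| (row * v).sum }) };
var matrix = [[0.5, 0], [0.5, 0.5]];

// language side: plain numbers
dot.(matrix, [3, 4]).postln;  // [1.5, 3.5], as in the first post

// server side: DC.ar([3, 4]) expands to an array of two UGens,
// so the same arithmetic builds a graph of multiplies and adds
x = {
	dot.(matrix, DC.ar([3, 4])).poll(1);  // post one value per row, once per second
	Silent.ar                             // nothing audible is played
}.play;
)

x.free
```

The polled values should match the language-side result, since the function body is identical in both cases and only the type of the vector elements differs.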

Hello Daniel!
Thank you. That seems pretty straightforward.
In your experience, would this also work for a vector of size x and a matrix of size x*x where x is, let’s say, 40? Wouldn’t the language start to struggle with the computation?

Well, let’s check it out :)

To be exact, the language is busy only at compile time, so let’s see how much CPU the server consumes while running. You might have to increase the server option numWireBufs.

In the following example, after ca. 4 seconds of compilation on my machine, 1768 ugens are engaged at 3.5 % CPU. That looks pretty reasonable as long as you don’t want to run many of these in parallel.

s.options.numWireBufs = 64 * 16;
s.reboot;


~linMap = { |m, v| (m.flop * v).flop.collect(_.sum) };

(
n = 40;

x = { 
	Saw.ar(
		~linMap.(
			{ { rand(5.0) } ! n } ! n, 
			{ rand(5.0) } ! n * SinOsc.ar(0.1).range(0.98, 1.02)
		),
		0.1
	)
}.play
)

s.scope(40)

x.release
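
For reference, one possible way to obtain UGen counts and CPU figures like those quoted above is sketched here. The SynthDef name `\linMapCount` and the smaller `n = 8` are assumptions chosen for illustration; the count is read from the built graph, so nothing needs to be played for the first part.

```supercollider
(
var linMap = { |m, v| (m.flop * v).flop.collect(_.sum) };
var n = 8;  // smaller than 40 to keep the build quick

// hypothetical SynthDef, built only to inspect its graph
var def = SynthDef(\linMapCount, {
	var sig = Saw.ar(
		linMap.(
			{ { rand(5.0) } ! n } ! n,
			{ rand(5.0) } ! n * SinOsc.ar(0.1).range(0.98, 1.02)
		),
		0.1
	);
	Out.ar(0, sig)
});

def.children.size.postln;  // number of UGens in the built graph
)

// while a synth is running, the server reports its own totals
s.numUGens;  // UGens currently running on the server
s.avgCPU;    // average CPU load in percent
```

The build-time count and the server's running total will differ slightly, because `{ }.play` wraps the function with extra units for output and release.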

It gets a bit heavier if you have a vector of independent ugens, though not by much. Of course, it’s much heavier with a matrix of ugens, though from your post I assume that’s not what you want.

Unfortunately this is exactly what I wanted. I’m sorry, I didn’t express myself clearly…
I would like to be able to change either the vector values or the matrix coefficients at each DSP block.
I’ll try to present another example and then we will see…

I’m curious where the weight would come from. AFAIK SomeUnit.ar * float and SomeUnit.ar * OtherUnit.ar would not perform very differently.

hjh

Well, it still looked doable with a size of 40 in the examples I tried. If you have a chance of reducing the matrix (triangular, band, or otherwise sparse), all the better.

I had such cases in mind:


s.options.numWireBufs = 64 * 16;
s.reboot;


~linMap = { |m, v| (m.flop * v).flop.collect(_.sum) };



// constant matrix ...

// 1806 ugens, 5 % CPU

(
n = 40;

x = { 
	Saw.ar(
		~linMap.(
			{ { rand(5.0) } ! n } ! n, 
			SinOsc.ar({ rand(5.0) } ! n).range(0.5, 1.5)
		),
		0.1
	)
}.play
)

x.release


// ... versus constant vector

// 4926 ugens, 15 % CPU

(
n = 40;

x = { 
	Saw.ar(
		~linMap.(
			SinOsc.ar({ { rand(50.0) } ! n } ! n).range(0.5, 1.5), 
			{ rand(5.0) } ! n
		),
		0.1
	)
}.play
)

x.release

Indeed, in this case you couldn’t reduce the number of oscillators, and these will be heavier than math ops, so it’s always going to be more expensive.

As these are all linear operations, though, I wonder if there’s a way to reduce some redundancy by applying the range operation to the sums rather than to the individual oscillators: ab + ac + ad is 3 muls and 2 adds, while a(b + c + d) is 2 adds and 1 mul… and scale up to 1600 MulAdds vs potentially only 40 muls and I guess 39*40 adds. (But then, does the overhead of splitting up the MulAdd units outweigh the elimination of 1560 multiply ops? Optimization is tricky.)

hjh

// special case assuming all oscillators
// are mul'ed and add'ed by the same amount
~linMapRange = { |oscMatrix, vector, oscMul, oscAdd|
	var vectorMul = vector * oscMul;
	var offset = vector.sum * oscAdd;
	(oscMatrix.flop * vectorMul).flop.collect(_.sum) + offset
};

(
n = 40;

x = { 
	Saw.ar(
		~linMapRange.(
			SinOsc.ar({ { rand(50.0) } ! n } ! n), 
			{ rand(5.0) } ! n,
			0.5, 1.0
		),
		0.1
	)
}.play
)

The original has 4926 ugens; this variant has 3366. But tbh, the Saws and SinOscs are responsible for most of the CPU usage.

Derivation:

~linMap.([[a * 0.5 + 1, b * 0.5 + 1], [c * 0.5 + 1, d * 0.5 + 1]], [e, f])

[(a * 0.5 + 1)e + (b * 0.5 + 1)f, (c * 0.5 + 1)e + (d * 0.5 + 1)f]

[ae * 0.5 + e + bf * 0.5 + f, ce * 0.5 + e + df * 0.5 + f]

let g = e * 0.5  // mul factor from 'range'
let h = f * 0.5

let i = e * 1  // add term from 'range'
let j = f * 1

[ag + i + bh + j, cg + i + dh + j]
[ag + bh + i + j, cg + dh + i + j]

let k = i + j
[ag + bh + k, cg + dh + k]

hjh
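
The derivation can also be checked numerically in the language. The following is a minimal sketch that uses random numbers as stand-ins for the oscillator outputs; the local names `raw`, `vec`, `direct` and `factored` are assumptions introduced for illustration. It compares `~linMap` applied to the already scaled matrix with `~linMapRange` applied to the raw matrix.

```supercollider
(
var linMap = { |m, v| (m.flop * v).flop.collect(_.sum) };

var linMapRange = { |oscMatrix, vector, oscMul, oscAdd|
	var vectorMul = vector * oscMul;
	var offset = vector.sum * oscAdd;
	(oscMatrix.flop * vectorMul).flop.collect(_.sum) + offset
};

var n = 4;
var raw = { { 1.0.rand2 } ! n } ! n;  // stand-in for bipolar oscillator values
var vec = { 5.0.rand } ! n;

// scale each matrix entry first, as range(0.5, 1.5) does for a bipolar signal
var direct = linMap.(raw * 0.5 + 1.0, vec);

// factored form: scale the vector once and add a single offset per row
var factored = linMapRange.(raw, vec, 0.5, 1.0);

// largest absolute difference; expected to be zero up to floating-point rounding
(direct - factored).abs.maxItem.postln;
)
```

The mul of 0.5 and add of 1.0 correspond to `range(0.5, 1.5)` for a signal between -1 and 1, which is the case the derivation works through.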