Using freq-info from FFT as arguments in SynthDefs?

nammedit · August 14, 2019, 12:00pm

Hey everyone!
So I sat up all night looking through the FFT helpfiles to figure out how to analyze sound and resynthesize it using sine waves. This is where I am at the moment:

```
(
SynthDef.new("FFT", {
	var in, chain, sig;
	in = PlayBuf.ar(1, 0, doneAction:2);
	~b = LocalBuf(2048, 1);
	chain = FFT(~b, in);
	chain = PV_MaxMagN(chain, 3);
	chain = chain.pvcollect(2048, {
		arg mag, phase, bin, index;
		mag.postln;
	});
	sig = IFFT(chain);
	Out.ar(0, sig!2);
}).add;
)
```

As you can see, I at least figured out how to narrow it down to the 3 strongest bins with PV_MaxMagN, but couldn’t for the life of me figure out how to convert the Unpack1FFT’s stored in the arguments mag and phase under the .pvcollect-iteration into freq and amp information to use in the SinOsc.
Anyone know what to do?

Martin

jamshark70 · August 15, 2019, 2:25am

One note for forum usage: please paste code as text, enclosed in backticks like this:

```
your code here...
```

(If you use a screenshot, the only way to play with your code is that I have to retype it.)

So…

The catch with pvcollect is, in a SynthDef, all of the UGen interconnections must be established at the beginning and they can never change after that (unless you recompile the SynthDef).

pvcollect passes bin magnitudes, phases and indices into the function. Each bin needs to have a signal processing chain, and that processing chain will always go through one and only one path.
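A minimal sketch of what that means in practice (not from the original posts: the SynthDef name `\binChains`, the sawtooth test input and the 0.5 threshold are made-up placeholders, and the threshold is not tuned to real magnitude scales). The function runs once per bin while the SynthDef is built, so a decision based on the bin number can use an ordinary `if`, while a decision based on the magnitude has to be written as signal arithmetic:

```supercollider
(
SynthDef(\binChains, {
	var in = Saw.ar(110, 0.2);               // placeholder test input
	var chain = FFT(LocalBuf(2048), in);
	chain = chain.pvcollect(2048, { |mag, phase, bin, index|
		// `bin` is a plain Integer at build time, so a language-side `if` works.
		if(bin.even) {
			// `mag` is a UGen: gate it with arithmetic instead of branching.
			mag * (mag > 0.5)
		} {
			0                                // odd bins are always silenced
		}
	}, frombin: 0, tobin: 250, zeroothers: 1);
	Out.ar(0, IFFT(chain) ! 2);
}).add;
)

// With the server booted:
// x = Synth(\binChains);
```

Which bins are kept or silenced by the `if` is frozen into the SynthDef; only the arithmetic on `mag` reacts to the sound while it plays.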

Note also that PV_MaxMagN does not give you “the three strongest partials.” It gives you the entire FFT frame – all 513 bins – but it preserves only the top 3 and zeroes out the rest. It does not “move” the three strongest into bins 0-2 – so tobin:2 is definitely not going to work for you. FFT is all based on position. Bin 0 is always DC offset. Bin 1 is always for one sinusoidal cycle per buffer; bin 2 is for two sinusoidal cycles, and so on, up to n/2 = Nyquist frequency. Moving the magnitude and phase into another bin means that you lose all information about frequency.

You could potentially “reroute” the partials dynamically with a conditional counter (index = index + (mag > 0)) and a PanAz (which is AFAIK the best way to route a single input to one of several outputs). I’m afraid I don’t have time right now to work that out.

hjh

nammedit · August 15, 2019, 12:38pm

Hello. I’ll be sure to post my code as you showed next time!

About the PV_MaxMagN, thank you for letting me know. And I didn’t quite catch that the bin position gave the frequency info, thanks for that also. Does that mean bin[56] gives 56 Hz, or does it not work like that, e.g. does it depend on the size of the buffer?

Anyway, I’m not sure I understood how you’d reroute the correct bin into for example a SinOsc using PanAz. If you have time, could you explain it more?

Martin

Sam_Pluta · August 15, 2019, 8:30pm

It doesn’t totally work that way. You know the frequency is “about” at the bin frequency, but not exactly. If that were true, the entire world would just be a sawtooth wave.

Here is a mega-crack version of that Panharmonium thing someone posted about recently. This is a trick that a wise man once showed me (who may not want his name associated with this): you can fake the frequency by finding the loudest bin, then calculating the ratio between its magnitude and that of the bin above it. The true frequency is then roughly the loud bin’s frequency plus ratio * (frequency width of one bin).

I suppose you should probably look and see if the bin above or the bin below is louder and go from there, but I have not done that here.

Sam

```
b = Buffer.read(s, Platform.resourceDir +/+ "sounds/a11wlk01.wav");
c = Buffer.alloc(s, 2048);

(
// Analysis only: FFT keeps writing the latest frame into buffer c.
{
	var chain, in;
	in = PlayBuf.ar(1, b, loop: 1);
	chain = FFT(c, in);
	//in
}.play(s);

~getPeaks = { arg buffer;
	var mags, peaks, freqs;
	// Buffer indices 0 and 1 hold DC and Nyquist; (real, imag) pairs start
	// at index 2, so mags[i] is the magnitude of bin i + 1.
	buffer.getn(2, 600, { arg vals;
		mags = List.newClear(vals.size.div(2));
		vals.size.div(2).do { arg i;
			mags.put(i, ((vals[i * 2] ** 2) + (vals[i * 2 + 1] ** 2)).sqrt);
		};
		// indices of the 6 loudest bins, one per oscillator in "tempy"
		peaks = mags.order.reverse.copyRange(0, 5);
		peaks = peaks.collect { |peak|
			var above = mags[peak + 1] ? 0;
			// fraction of a bin to move up: 0 if the bin above is silent,
			// 0.5 if it is as loud as the peak
			(peak + 1) + (above / (mags[peak] + above).max(1e-9));
		};
		freqs = peaks * 44100 / 2048;
		freqs.postln;
		x.set(\freqs, freqs);
	});
};

SynthDef("tempy", { arg freqs = #[440, 440, 440, 440, 440, 440];
	Out.ar(1, Mix(SinOsc.ar(freqs, 0, 0.05)));
}).add;

x = Synth("tempy");

{
	inf.do {
		~getPeaks.value(c);
		0.1.wait;
	};
}.fork;
)
```
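The idea mentioned above of checking whether the bin above or the bin below is louder could be sketched roughly like this. It is only an illustration, not a tested estimator: `~interpPeak` is a made-up name, `mags[i]` is assumed to hold the magnitude of bin i, the magnitudes are invented, and the 44100 / 2048 scaling is taken from the code above.

```supercollider
(
// Returns a fractional bin position near `peak`, nudged toward the louder neighbour.
~interpPeak = { |mags, peak|
	var below = if(peak > 0) { mags[peak - 1] } { 0 };
	var above = mags[peak + 1] ? 0;          // nil (past the end) counts as silent
	var neighbour = max(below, above);
	var direction = if(above >= below) { 1 } { -1 };
	// 0 when the neighbour is silent, 0.5 when it is as loud as the peak
	var fraction = neighbour / (mags[peak] + neighbour).max(1e-9);
	peak + (direction * fraction)
};

~mags = [0, 0.2, 1.0, 0.5, 0];               // invented magnitudes, peak at bin 2
~bin = ~interpPeak.(~mags, 2);
(~bin * 44100 / 2048).postln;                // estimated frequency in Hz
)
```

Here the bin above is the louder neighbour, so the estimate lands between bins 2 and 3; swapping the 0.2 and 0.5 would push it below bin 2 instead.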

jamshark70 · August 16, 2019, 1:42am

FFT has become one of my favorite topics lately

To be more formal about it – bin 56 is always exactly one and only one frequency = 56 * sampleRate / frameSize. (How do you get this? As noted above, bin 1 places one sinusoidal cycle within the frame. 1 cycle * sr samples/sec / fs samples, “samples” cancel out leaving cycles per second. Bin 2 places two sinusoidal cycles in the frame = 2 * sr / fs, and so on.)
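As a quick check of that formula in sclang (the function name `~binToFreq` is made up for illustration, and the 44100 Hz default is just an assumed sample rate):

```supercollider
(
// Center frequency of an FFT bin: bin * sampleRate / frameSize
~binToFreq = { |bin, sampleRate = 44100, frameSize = 2048|
	bin * sampleRate / frameSize
};

~binToFreq.(1).postln;                // spacing between bins: 21.533203125
~binToFreq.(56).postln;               // 1205.859375, not 56 Hz
~binToFreq.(56, 44100, 1024).postln;  // halving the frame doubles it: 2411.71875
)
```

So bin 56 would only sit at 56 Hz if the frame size happened to equal the sample rate, i.e. a one-second frame.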

FFT does not, never did, and never will give you access to the exact frequency of the input.

So, what if you have a frequency in between bin frequencies?

FFT is always a cyclical signal. (It’s a sum of cosine waves. Cosine waves are always cyclical. So, the sum of cosines must also be cyclical.) If the input frequency is in between, then the frame will include part of the cycle and jump back to the beginning – “hard sync” in synthesizer terms. The resulting spectrum is strongest at the nearest bin frequency, but distributes some of the energy into neighboring bins. Using a rectangular windowing function, some of the energy leaks into high partials – high frequencies are exaggerated. (You can see the Gibbs effect wiggling around the jump point.)

To deal with that, usually one of several window functions (envelopes) is applied – here, a Hann window. (See PS.)

Now the beginning and end of the window connect smoothly, and the high frequencies are under control. (The rest is handled by taking successive windows and overlapping so that the envelope functions add up to a constant 1.)

Some energy still leaks away from bin 1. As Sam said, the amount of leakage is a clue as to how far the input frequency is away from the bin frequency. You may not be able to recover exactly the frequency, but I can imagine you could get a reasonable estimate. The above graphs are for f = 1.4 cycles/frame, and bin1 is a bit stronger than bin2. I also checked 1.5 cycles/frame, and the two are almost exactly equal (so you would go halfway away from the bin frequency).
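The comparison can be reproduced language-side with `Signal`'s FFT. This is a rough sketch rather than the code behind the graphs: `~binMags` is a made-up name, the 1024-sample frame is an arbitrary choice, and it uses a plain Hann window, which (see PS) is not exactly what the FFT UGen applies.

```supercollider
(
// Magnitudes of the first few bins for a Hann-windowed sine that
// completes `cycles` periods within one frame of `size` samples.
~binMags = { |cycles = 1.4, size = 1024, numBins = 5|
	var window = Signal.hanningWindow(size);
	var real = Signal.fill(size, { |i| sin(2pi * cycles * i / size) }) * window;
	var imag = Signal.newClear(size);
	var cosTable = Signal.fftCosTable(size);
	var spectrum = real.fft(imag, cosTable);   // a Complex of two Signals
	spectrum.magnitude.copyRange(0, numBins - 1)
};

~binMags.(1.4).postln;   // compare the values at indices 1 and 2
~binMags.(1.5).postln;
)
```

Sweeping `cycles` between 1.0 and 2.0 shows how the balance between bins 1 and 2 tracks the distance from each bin frequency.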

hjh

PS FFT actually applies a sine window (first half of a sine wave). IFFT applies another one – so the total window being applied is the square of the first half of a sine wave = Hann window. Hann has the nice characteristic that, with overlap = 2 (hop = 0.5), the sum of the envelopes = 1 (sin^2 t + cos^2 t = 1), avoiding amplitude modulation distortion. But, even though the FFT windowing function is slightly different from the above graphs, the observation about adjacent bins to estimate frequency still holds true.
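Both identities are easy to confirm numerically. The sketch below uses an idealised sine window, sin(pi * i / n), with an arbitrary n of 1024; the exact sample positions the FFT UGen uses for its window are not checked here, and the variable names are made up.

```supercollider
(
~n = 1024;
~half = ~n.div(2);

// Sine window applied twice (once by FFT, once by IFFT) = sine window squared.
~sineSquared = Array.fill(~n, { |i| sin(pi * i / ~n).squared });
~hann = Array.fill(~n, { |i| 0.5 * (1 - cos(2pi * i / ~n)) });
~maxDiff = (~sineSquared - ~hann).abs.maxItem;

// Two windows half a frame apart (hop = 0.5): sin^2 + cos^2.
~overlapSum = Array.fill(~half, { |i| ~sineSquared[i] + ~sineSquared[i + ~half] });

~maxDiff.postln;                                    // rounding error only
[~overlapSum.minItem, ~overlapSum.maxItem].postln;  // both should be 1 within rounding
)
```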