How to offset the FFT window in a Paulstretch-style SynthDef

I’m trying to find a way to synchronize the FFT UGen so that I can delay the analysis window by an exact amount.

I’m building a SynthDef that incorporates Paulstretch-style time-stretching, building on jpdrecourt’s work here: http://sccode.org/1-5d6

The relevant bit of code looks like this:

sig = sig.collect({ |item, i|
	chain = FFT(LocalBuf(fftSize), item, hop: 1.0, wintype: -1);
	// PV_Diffuser is only active if its trigger is 1
	// And it needs to be reset for each grain to get the smooth envelope
	chain = PV_Diffuser(chain, 1 - trig);
	item = IFFT(chain, wintype: -1);
});

Here we have two signals, and we’re running an FFT on each signal. The windows of the two FFTs are aligned. What I want to do is shift the second FFT window so that the two calculations are out of phase. Note that I’m not considering enveloping windows here: I’m just interested in the exact moment that the last N samples are used for an FFT calculation, and in finding a way to have those moments interwoven between the two chains.

The FFT UGen has an active parameter, which I’ve tried using to leave it ‘turned off’ for the first half of the window, but unfortunately it doesn’t seem to affect the positioning of the window. What would be ideal is an offset parameter to FFT that I could set to i * (fftSize/2) (assuming sig holds two items). But perhaps there is another way to do it.

In jpdrecourt’s example, this need is overcome by introducing a delay line after the FFT process. However, this introduces a delay between the input parameters and the output sound which I’m trying to eliminate.

I hope my question is clear - let me know if I need to explain it better!

Thanks,

Tim

PS - This diagram may also help explain the intention behind having the two FFT chains out of phase.

In jpdrecourt’s PaulStretch the FFT frames are aligned because the frames of each FFT chain are calculated within the same number of control cycles. But each pair of aligned FFT frames has different audio content, because the UGens that read the audio buffer have a phase offset with respect to each other:

// Extraction of 2 consecutive grains
// Both grains need to be treated together for superposition afterwards    
sig = [GrainBuf.ar(1, trig, trigPeriod, bufnum, 1, pos, envbufnum: envBufnum),
       GrainBuf.ar(1, trig, trigPeriod, bufnum, 1, pos + (trigPeriod/(2*stretch)), envbufnum: envBufnum)]*amp;

Are you avoiding this because you want to work with a live input in real time?

What are you trying to do, exactly, and how is this different from JP’s solution? From your description, I wonder why you wouldn’t just delay the input parameters.

BTW - this is incorrect in his code:

    sig = sig*PlayBuf.ar(1, envBufnum, 1/(trigPeriod), loop:1);

The FFT delays the output by fftSize-BlockSize.ir samples, so the env needs to be delayed by (fftSize-BlockSize.ir)/SampleRate.ir seconds for the window to line up correctly. In this case, just delay the PlayBuf by that value.
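
As a rough sketch of that compensation: the first region below just does the arithmetic on the language side, and the second applies the same delay to a looping envelope inside a SynthDef. The SynthDef name \delayedEnvSketch, the WhiteNoise source, the default values, and the assumption that envBufnum holds one cycle of the envelope are all hypothetical and not taken from jpdrecourt’s code; only the PlayBuf rate of 1/trigPeriod is copied from the line quoted above.

```supercollider
(
// Language-side arithmetic for the FFT -> IFFT latency.
// Assumed values: 8192-point FFT, block size of 64, 44.1 kHz sample rate.
var fftSize = 8192, blockSize = 64, sampleRate = 44100;
var delaySamples = fftSize - blockSize;
var delaySeconds = delaySamples / sampleRate;
[delaySamples, delaySeconds].postln;
)

(
// Hypothetical SynthDef: the envelope is delayed by the FFT latency
// so that it lines up with the resynthesised signal.
SynthDef(\delayedEnvSketch, { |out = 0, envBufnum = 0, trigPeriod = 0.2|
	var fftSize = 8192;
	// same formula as above, evaluated on the server at init time
	var fftDelay = (fftSize - BlockSize.ir) / SampleRate.ir;
	var src = WhiteNoise.ar(0.1); // placeholder input, an assumption
	var chain = FFT(LocalBuf(fftSize), src, hop: 1.0, wintype: -1);
	var sig = IFFT(chain, wintype: -1);
	// looping envelope, as in the quoted line, then delayed to match
	var env = PlayBuf.ar(1, envBufnum, 1 / trigPeriod, loop: 1);
	env = DelayN.ar(env, fftDelay, fftDelay);
	Out.ar(out, sig * env);
}).add;
)
```

The delay is computed from BlockSize.ir and SampleRate.ir rather than hard-coded, so it should follow whatever block size and sample rate the server was booted with.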

Sam

Thank you for the responses.

Are you avoiding this because you want to work with a live input in real time?

Exactly that, yes. I want to update the position parameter in realtime, and with the aligned FFT calculations in jpdrecourt’s implementation, updates to pos don’t propagate until we’ve processed both FFTs. If you skip pos around then you end up with pairs of adjacent grains even when continuously modulating pos.

When I woke up this morning I thought I had the solution, which was to set the hop parameter of the FFT to 0.5 giving a 50% overlap, and then to use a PV_Freeze to keep just alternate frames. This nearly seems to work except now I’m finding the original signal is leaking past the PV_Diffuser. It’s strange because if I cut out PV_Diffuser then I can still reconstruct the signal exactly as expected, but there seems to be some issue with the buffers in between.

There are lots of debug printouts in my code, but here is a summarised form:

fftDelayCompensation = (fftSize-BlockSize.ir)/SampleRate.ir;
t_fft_changed = trig[i];
isInPhase = Trig.ar(t_fft_changed, trigPeriod/2);
isInPhase = DelayN.ar(isInPhase, fftDelayCompensation, fftDelayCompensation);
chain = FFT(buffer: LocalBuf(fftSize), in: grain, hop: 0.5, active:1, wintype: -1);
chain = PV_Freeze(chain, 1 - isInPhase);
// I've tested that this triggers only when Spectrum(chain) changes
t_frozen_fft_changed = DelayN.ar(t_fft_changed, fftDelayCompensation, fftDelayCompensation);
chain = PV_Diffuser(chain, 1 - t_frozen_fft_changed);
grain = IFFT(chain, wintype: -1);

The FFT delays the output by fftSize-BlockSize.ir, so the env needs to be delayed by (fftSize-BlockSize.ir)/SampleRate.ir for the window to line up correctly. In this case, just delay the PlayBuf by that value.

@Sam_Pluta Is there any documentation anywhere on how delays and chunking are handled with the FFT? It makes sense that the output would be delayed by fftSize, but I wouldn’t have expected fftSize-BlockSize.ir. Presumably that is why jpdrecourt didn’t bother delaying the envelopes at all, as they are looping with the same length.

I’m going to try a little harder with this hop and freeze technique. Otherwise though I think it may be simpler to just preprocess all my audio files with FFT and then just sample grains from them in frequency space.

What I really want is a demand rate FFT.

The reason is that you can’t hear the misalignment until the window size gets very small, and PaulStretch works best with longer windows, so in practice it isn’t really noticeable.

Is this the vibe you are going for:

{
	// b is assumed to be a mono Buffer loaded beforehand
	var fftSize = 8192;
	var sig = PlayBuf.ar(1, b, loop: 1);

	// one trigger per hop (every fftSize/2 samples)
	var trig = Impulse.ar(2 * SampleRate.ir / fftSize);
	// alternate hop triggers, so each stream fires once per full frame
	var trig0 = PulseDivider.ar(trig, 2, 1);
	var trig1 = PulseDivider.ar(trig, 2, 0);

	// flips between 0 and 1 on every trig0
	var toggle = ToggleFF.ar(trig0);

	var chain = FFT(buffer: LocalBuf(fftSize), in: sig, hop: 0.5, active: toggle, wintype: 0);
	var chain2 = FFT(buffer: LocalBuf(fftSize), in: sig, hop: 0.5, active: 1 - toggle, wintype: 0);

	chain = PV_Diffuser(chain, 1 - trig0);
	chain2 = PV_Diffuser(chain2, 1 - trig1);

	sig = [IFFT(chain), IFFT(chain2)];
	//sig = (IFFT(chain)+IFFT(chain2)).dup;
	sig
}.play

Yes yes getting closer for sure! I tried to use the active parameter on the FFT like that but couldn’t get it to work - this is a massive help - thank you!