FFT random phase implementation

Dear list,

i’d like to implement a specific form of phase randomization that had originally been implemented in python. I tried asking GPT for suggestions, but to no avail, as the resulting SC code is unusable.
The description of the code im trying to translate is the following:

“The random phase are complex numbers whose real and imaginary parts squared sum to 1. Then after computing the FFT magnitudes, the first half of FFT (corresponding to [negative frequencies] is multiplied with this random phase, and the second half (corresponding to positive frequencies) with the conjugate (flip sign of the imaginary part and keep the real part) of these same random phase.”

Would anyone know how this would like in SClang? Unfortunately it’s mathematics are well beyond my capacities…I’d assume .pvcalc would be the right way to go?

Thanks,
Jan

Still trying to implement this command line transformation in SC!

Some things things that are a riddle to me from the description above:

  • how can the real and imaginary parts of complex numbers be square summed to use them for PV processes?
  • how can fft be split into positive and negative frequencies to multiply them further?

is this possible with the current tools in SC?

Thanks for any hints!
Jan

Not too sure, but I think the negative frequencies are just mirrored versions of the positive when using real audio input. So supercollider doesn’t give you access to them because they are useless for audio applications?

I’m not sure either, these were also described as first and second half of FFT and linked to this as explanation
https://dsp.stackexchange.com/questions/431/what-is-the-physical-significance-of-negative-frequencies
They apparently are of use for certain fft processing.
But all in all it doesn’t help me in understanding how to go about the implementation. I guess PV_Mul and PV_Conj would be of use but the two questions still elude me…

What is wrong with PV_Diffuser?

nothing wrong with it, i’m just looking to recreate this specific implementation because of the resulting sound, which is not arbitrary randomization like with PV_Diffuser.

That is the definition of the magnitude. For some complex representing a frequency bin, a + bi, the magnitude is sqrt( a.pow(2) + b.pow(2) ).

Not too sure on this, but I think if you give a real signal (not complex) to an fft the negative and positive frequencies are the same.

This quote doesn’t really make sense, but if the positive and negative freqs are the same…

I think it is trying to say, ‘randomise the phase but keep the sign’.
Where did you get the quote from?

Ah, I see, so you think it’s essentially an arbitrary randomization of the phase, just expressed a bit more complicatedly?
I got the explanation from a programmer I was working with who is assisting in the development of the scattering transform https://www.kymat.io/
this specific phase implementation was from one of the main developers of the scattering transform.
I could try to find the code that actually reproduces the process in python, maybe that’s more clarifying than the verbal explanation (to someone who understands the maths at least)

To clarify one thing, the FFT Ugen in SC is a real FFT, not a full FFT, so there are no negative frequencies/mirror bins. This is super confusing in SC and in Max, since the terms are used interchangeably. FYI, in Max the fft~ object uses an fft, but the pfft~ object uses a real fft. In pd it looks like the object is correctly labeled as an rfft object. Clear as mud, I know.

In SC you can do a full fft on a signal in the language using the Signal.fft method. So you could implement your algorithm in the language if necessary. In that case, the code would non-realtime and could look very similar to the python code.

Sam

Thanks for clarifying this! It’s very good to know to not amount to even more confusion, given the complexity of fft. I was hoping to get around it with PV_Ugens, but good to now its not the approach for this case!
So to do what i’m after with Signal means that to basically adapt the python syntax to sclang. That sounds like a too challenging task given the maths of it! :exploding_head:

That is:

var fftSize = ..... you decide this.....;
var halfSize = fftSize div: 2;
var randPhases = Array.fill(fftSize div: 2, {
    var theta = 2pi.rand;
    Complex(theta.cos, theta.sin)
});

randPhases[0] = Complex(1, 0);  // phase shift of DC is pointless
randPhases = randPhases.add(Complex(1, 0));  // phase shift of Nyquist is pointless

(halfSize - 1, halfSize - 2 .. 1).do { |i|
    randPhases = randPhases.add(randPhases[i].conjugate);
};

// edit: match fft format
randPhases = Complex(
    randPhases.collect(_.real).as(Signal),
    randPhases.collect(_.imag).as(Signal)
);

That should be your window of random phases. Then, given a same-size chunk of audio, as a Signal:

sig = sig * Signal.hanningWindow(sig.size);

fft = sig.fft(Signal.newClear(sig.size), Signal.fftCosTable(sig.size));

fft = fft * randPhases;

ifft = fft.real.ifft(fft.imag, Signal.fftCosTable(sig.size));

sig = ifft.real;

You could save some load on the garbage collector by caching the Hann window and the cos table in their own variables, instead of repeatedly creating them fresh.

Actually how to do the windowing is a whole other conversation, but this at least is the basic format.

hjh

I think this does it…? Though I don’t hear a big impact.

(
~fftStuff = (
	fftSize: 1024,
	prep: { |self|
		self.use {
			var halfSize = ~fftSize div: 2;
			var randPhases = Array.fill(halfSize, {
				var theta = 2pi.rand;
				Complex(theta.cos, theta.sin)
			});

			randPhases[0] = Complex(1, 0);  // phase shift of DC is pointless
			randPhases = randPhases.add(Complex(1, 0));  // phase shift of Nyquist is pointless

			(halfSize - 1, halfSize - 2 .. 1).do { |i|
				randPhases = randPhases.add(randPhases[i].conjugate);
			};

			// edit: match fft format
			randPhases = Complex(
				randPhases.collect(_.real).as(Signal),
				randPhases.collect(_.imag).as(Signal)
			);

			~randPhases = randPhases;
			// let's cache a couple things for speed too
			~cosTab = Signal.fftCosTable(~fftSize);
			~sineWindow = Signal.hanningWindow(~fftSize).sqrt;
		};
		self
	},

	processOneWindow: { |self, signal|
		self.use {
			var fft;
			signal = signal * ~sineWindow;
			fft = signal.fft(Signal.newClear(~fftSize), ~cosTab);
			fft = fft * ~randPhases;
			signal = fft.real.ifft(fft.imag, ~cosTab);
			signal.real * ~sineWindow
		};
	}
);
)

~f1024 = ~fftStuff.copy.put(\fftSize, 1024).prep;


(
var infile = SoundFile.openRead(Platform.resourceDir +/+ "sounds/a11wlk01.wav");
var outPath = "~/test.wav".standardizePath;
var outfile = SoundFile.new
.numChannels_(infile.numChannels)  // btw, wish there were a less verbose way to do this
.sampleRate_(infile.sampleRate)
.headerFormat_(infile.headerFormat)
.sampleFormat_(infile.sampleFormat);

var fftThing = ~f1024;
var chunkSize = fftThing.fftSize, hopSize = fftThing.fftSize div: 2, pos = 0;
var overlapBuffer = Signal.newClear(chunkSize);
var chunk;

if(infile.isNil) {
	Error("Couldn't open input file").throw;
};

if(outfile.openWrite(outPath).not) {
	infile.close;
	Error("Couldn't open output file").throw;
};

protect {
	while {
		chunk = Signal.newClear(chunkSize);
		infile.seek(pos, 0);
		infile.readData(chunk);
		chunk.size > 0
	} {
		pos = pos + hopSize;
		if(chunk.size < chunkSize) {
			chunk = chunk.extend(chunkSize, 0);  // zero-pad if needed
		};
		chunk = fftThing.processOneWindow(chunk);
		overlapBuffer = overlapBuffer.drop(hopSize).extend(chunkSize, 0.0)
		+ chunk;
		outfile.writeData(overlapBuffer.keep(hopSize));
	};
	// maybe...? probably
	outfile.writeData(overlapBuffer.drop(hopSize));  // flush the rest
} {
	infile.close;
	outfile.close;
};
)

hjh

1 Like

Although… (last one for now, promise!)… having done all that, I think it’s probably not necessary. I think you should be able to get the same result with PV_Mul, and probably the same result with PV_Diffuser (but I don’t have time to check the source code to be sure).

The important characteristic of a real FFT is that what the quote calls negative and positive frequency bins are mirror images about the X axis (reflected vertically).

The “complex numbers whose real and imaginary parts squared sum to 1” are polar vectors with magnitude 1 and angle theta. A “negative frequency bin” would be rotated counterclockwise by theta radians. The corresponding positive frequency, being multiplied by the complex conjugate, would be rotated clockwise by the same amount – so the rotated point will still be a mirror image about the X axis.

So if you start with a real FFT, as SC PV units do, and perform this operation, you will end up with a different real FFT frame – but still real.

So I believe PV units can handle it natively. I could be wrong but I doubt it.

hjh

This actually comes pretty close to what I’m after, thank you for the extensive piece of code!
In the case of trying to recreate it with PV Ugens, you mean that the combo PV_Mul and PV_Diffuser would be the equivalent to the Signal process you’ve proposed?

a follow up question: inversely, is it possible to do the transformations that PV Ugens provide with the Signal class? for example to multiply magnitudes of two different sound files?

If you load the random phases into a buffer and access it in the SynthDef with FFTTrigger, then PV_Mul might get the same result as the complex multiplication that the quote describes.

PV_Diffuser might do the whole thing internally.

That is, either/or.

hjh

will try this, thank you!

Btw if I recall, the format in the buffer is

  • 0: DC magnitude
  • 1: Nyquist magnitude
  • 2: Bin 1 magnitude
  • 3: Bin 1 phase
  • 4: Bin 2 magnitude
  • 5: Bin 2 phase
  • etc… up to the last bin below Nyquist

(But I’m not sure if it should be polar or complex. EDIT: Ah, didn’t check the help: FFTTrigger has a polar flag!)

hjh

thank you! actually it is very convenient to have this code as an quasi NRT process that can render the files. would this code also allow for further fft processing like convolving one signal with another, or use phases and mags of two different files? i found ways to tweak the magnitudes on the costable but how to insert another files fft data is not clear to me.

The convolve operation is part of the SignalBox quark. That quark contains lots of goodies for operating on fft signals.

To convolve two sound files of any duration you can use the Convolve quark (which requires the SignalBox quark to be installed).

To see a more elaborate version of what you are doing, check out the TimeStretch class from the TimeStretch quark.

Sam

Sam