FFT processing on Audio and Mic

richfield006 · December 19, 2021, 11:10am

Dear team,

Hope this mail finds you well. SuperCollider is the first programming language I’m learning, and I have taught myself the basics of SC using The SuperCollider Book, YouTube and the scsynth forums. I’m trying to do a project based on Digital Signal Processing (DSP) using FFTs (the Fast Fourier Transform UGen). I’ve looked at forums and the help files in the SC IDE, and created my own question (FFT processing using SoundIn and an audio track), but couldn’t get a complete understanding of, or answer to, what I need to do.

My idea → Take 2 inputs:

Input from internal mic
An audio track being played through a buffer.

Do real-time FFT processing with input mic and an audio track.

I found a flow diagram online (https://stash.reaper.fm/37301/STFT%20sketch%20small.png) and am thinking of applying the exact concept.

So my idea is to play the audio track and affect its phase from the input mic to create an experimental audio output track. (Once I get a working template, I can do other processing by modifying the parameters or doing something different.)

So far, I have figured out how to play with FFT bins (audio and mic separately). I’m stuck here, not knowing how to:

1) Combine them,
2) Compare the audio and mic FFTs using a loop or something, and then
3) Modify (process) one with the other.

Could you please help me in this regard and point me in the right direction? I’m desperate to get a working model by the end of this year if possible, please.

s.boot;

//Mic Input processing - without using synthdef
(
x= {
	var sig, chain, size=2048;
	sig=SoundIn.ar(0, 1);
	chain= FFT(LocalBuf(size), sig, 0.5, 0);
	chain = PV_BinScramble(chain, \wipe.kr(0), \width.kr(0), \trig.tr(0)); // Processing to be done
	sig=IFFT(chain)*0.1!2;
}.play;
)

x.set(\wipe, 0.8, \width, 0.2, \trig, 1);
x.free;


//Audio FFT processing using SynthDef
~buf = Buffer.read(s,"/Users/Kirthan/Desktop/real.wav"); //choose any audio file

(
SynthDef(\fftaudio, {
	arg loop=1, da=0;
	var sig, chain, size=2048;
	sig = PlayBuf.ar(1, \buf.kr(0), BufRateScale.ir(\buf.kr(0)), loop:loop, doneAction: da);
	chain = FFT(LocalBuf(size), sig);
	chain = PV_BinScramble(chain, \wipe.kr(0), \width.kr(0), \trig.tr(0)); //Processing to be done
	sig = IFFT(chain) * \amp.kr(0.5)!2;
	Out.ar(\out.kr(0), sig);
}).add;
)

a = Synth(\fftaudio, [\buf, ~buf]);
a.set(\loop, 0, \da, 2);
a.set(\wipe, 0.5, \width, 0.1, \trig, 1);
a.free;


//Mic FFT processing using SynthDef
(
SynthDef(\fftmic, {
	var sig, chain, size=2048;
	sig=SoundIn.ar(0, 1);
	chain= FFT(LocalBuf(size), sig, 0.5, 0);
	chain = PV_BinScramble(chain, \wipe.kr(0), \width.kr(0), \trig.tr(0)); //Processing to be done
	sig = IFFT(chain) * \amp.kr(1)!2; // Amplitude set to unity gain - Please reduce or use it with caution!!!
	Out.ar(\out.kr(0), sig);
}).add;
)

m = Synth(\fftmic);
m.set(\wipe, 0.5, \width, 0.1, \trig, 1);
m.free;

shiihs · December 19, 2021, 12:56pm

As a general approach, you could add a synth that streams the microphone input to one audio bus (instead of to the speakers), and a second synth that plays the loaded audiofile to a second audio bus (also instead of sending it to the speakers), then create a third synth that reads input from both audio busses and combines both inputs into something new, and then send the new thing to the speakers.
You will want to investigate using groups to ensure that the third synth is executed after the first two synths.
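
A rough sketch of that routing, in case it helps. All the names here (`~micBus`, `~trackBus`, `~srcGroup`, `~fxGroup`, `\micSrc`, `\trackSrc`, `\combine`) are made up for illustration, and the sample path assumes the `a11wlk01.wav` file that ships with standard SuperCollider installs. `PV_CopyPhase` is only a stand-in for whatever combining step you end up wanting: it keeps the track's magnitudes and takes the phases from the mic.

```supercollider
(
s.waitForBoot {
	// Private audio buses: the source synths write here instead of to the speakers
	~micBus = Bus.audio(s, 1);
	~trackBus = Bus.audio(s, 1);
	// Assumption: the standard example sound file; swap in your own mono file
	~buf = Buffer.read(s, Platform.resourceDir +/+ "sounds/a11wlk01.wav");

	SynthDef(\micSrc, { |out|
		Out.ar(out, SoundIn.ar(0));
	}).add;

	SynthDef(\trackSrc, { |out, buf|
		Out.ar(out, PlayBuf.ar(1, buf, BufRateScale.kr(buf), loop: 1));
	}).add;

	SynthDef(\combine, { |out = 0, micIn, trackIn, amp = 0.1|
		var mic = In.ar(micIn, 1);
		var track = In.ar(trackIn, 1);
		var chainTrack = FFT(LocalBuf(2048), track);
		var chainMic = FFT(LocalBuf(2048), mic);
		// Magnitudes from the track, phases from the mic (placeholder processing)
		var chain = PV_CopyPhase(chainTrack, chainMic);
		Out.ar(out, Limiter.ar(IFFT(chain) * amp).dup);
	}).add;

	s.sync; // wait until the buffer and SynthDefs are on the server

	// Sources first, combiner after: the group order guarantees execution order
	~srcGroup = Group.new(s);
	~fxGroup = Group.after(~srcGroup);
	~mic = Synth(\micSrc, [\out, ~micBus], ~srcGroup);
	~track = Synth(\trackSrc, [\out, ~trackBus, \buf, ~buf], ~srcGroup);
	~fx = Synth(\combine, [\micIn, ~micBus, \trackIn, ~trackBus], ~fxGroup);
};
)

// Stop everything
[~fx, ~mic, ~track].do(_.free);
```

Use headphones while testing, since the mic is live.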

richfield006 · December 19, 2021, 1:06pm

Thanks a lot for your swift response @shiihs, will get cracking on with that!

jamshark70 · December 19, 2021, 2:24pm

A post was merged into an existing topic: FFT processing using SoundIn and an audio track

jamshark70 · December 19, 2021, 2:26pm

Oh, sorry about that, with two threads for basically the same question, I got confused about which post belongs where.

I’d suggest choosing one and following up only there. Splitting the conversation between two threads is probably not very effective as it dilutes attention.

hjh

richfield006 · December 19, 2021, 2:35pm

Thanks for doing that. This forum UI is new to me and I’m not quite certain what I have done. I hope you’ve fixed it.

jamshark70 · December 19, 2021, 2:42pm

Actually I should move that post back here … Since the other post refers to an attachment missing from this thread.

Bit messy…

hjh

richfield006 · December 19, 2021, 11:16am

Received an automated mail saying the .scd can’t be accepted. So have attached a screenshot of my code.

Thank You

Kirthan

richfield006 · December 19, 2021, 2:49pm

Would you want me to do something? Remove something?

Spacechild1 · December 19, 2021, 3:35pm

@richfield006 always post your code as text with code tags (paste the code, select it and then click the “</>” icon or type Ctrl+E):

a = b + c;
d = e + f;

richfield006 · December 19, 2021, 3:43pm

Thank you. Updated my post!

jamshark70 · December 23, 2021, 11:40pm

A bit more information from a private message – pulling it public as it’s a more detailed problem spec, better to have more eyes on it than only mine.

“bins with lower energy need to be boosted” sounds like a PV_Compander with slopeBelow set to a number less than 1, and slopeAbove = 1 (i.e. the opposite of a normal compressor). If that doesn’t work, try a normal compressor (slopeBelow = 1, slopeAbove < 1) with make-up gain (with a PV_ math UGen).

The catch is the relationship between the two inputs, where I’m confused. Do you mean a typical sidechain, but spectrally? That is, boost quiet bins by an amount related to the corresponding sidechain input’s bin loudness above the threshold? (Hm, if that’s the case, then slopeBelow < 1 wouldn’t work because it would boost where the sidechain is quiet. So you might have to take the compressor → gain approach.)

Or is it compared to the total signal level from the mic?

Unfortunately PV_Compander doesn’t have a sidechain input, so it’s tricky. I’d like to have a totally clear understanding of exactly what you want to happen to one bin before putting a lot more time into it. I suspect it might be to subtract the sidechain (so that louder sidechain bins would make the source bin much quieter), then compand, but I’m not clear that I’m reading you correctly. A clear problem spec leads more quickly to a solution.

hjh

richfield006 · December 24, 2021, 5:27am

Thanks a lot for your time and detailed explanation. I guess you have understood my idea.
I’ll try to explain my idea a bit more clearly. Consider that you are listening to a track on a product with a speaker and a mic (smart speakers, phones etc.) in a busy outdoor environment (streets etc.). Since the external noise sometimes gets louder, quieter parts of the track you are listening to become masked by the noise. I am proposing to boost the quieter bits (increase the energy) of the track to overcome the noise energy, while keeping the louder parts of the track the same. It’s more like an expander. Hope this explains my idea.
So I’m taking the mic input (Input 1) as my side-chain signal. I’m interested in the FFT bins that have magnitudes above a certain threshold, so I would have to collect the bin number and magnitude from this input.
Input 2 is the audio track you’re listening to. Taking the bin and magnitude info from Input 1, I’m going to increase the magnitudes of the corresponding bins of this track, triggered from the mic input.
Have attached a pic (a flow diagram):

jamshark70 · December 26, 2021, 12:08am

Been thinking this over, though I didn’t have time to test anything.

Couple of ideas:

With pvcalc2, you could use the magnitude of the sidechain bin to calculate a gain factor for the audio bin, using math operators. You couldn’t use “if” here, but for example a standard compressor gain curve (without knee), i.e. the amount of attenuation, not the final gain, would be something like max(0, noisemag.ampdb - threshold) * (1 - ratio) and you could apply this as a positive gain to the other signal (where a standard compressor applies this as a negative gain, more or less). For a large number of bins, this will make a huge SynthDef and it wouldn’t be CPU-cheap, but it would let you express the math as you like. (Btw if you have too many bins, there’s more time between FFT frames so it would react slower.)
PV_ units would be faster, but I’m not sure you can do exactly the above with them. One idea I had was to start with the main audio magnitudes and subtract the noise magnitudes – so that louder noise bins would result in much quieter magnitudes. Then, PV_Compander with slopeBelow < 1. This would amplify the quieter bins (and since a louder noise bin produces a smaller magnitude, this would amplify more where the noise input is louder). Then add the noise magnitudes back in. The gain per bin, however, would depend on both input magnitudes instead of depending only on the noise mag, so this might not do exactly what you want.

A third approach would be to write your own PV_ unit in C++, but I guess you’d rather prototype in a SynthDef first.

I don’t promise that either of these would actually sound good.

hjh
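
An untested sketch of the first idea (pvcalc2 with a gain computed from the sidechain bin), for anyone who wants a starting point. The numbers here (`fftSize`, `threshold`, `ratio`, `maxBoost`, `tobin: 100`) and the sample file are assumptions picked for illustration. FFT magnitudes are not normalised to 1, so the threshold has no fixed meaning in dBFS and needs tuning by ear. `maxBoost` is an extra safety clamp that is not part of the formula above.

```supercollider
// Run with the server booted. Assumption: the standard example sound file.
~buf = Buffer.read(s, Platform.resourceDir +/+ "sounds/a11wlk01.wav");

(
x = {
	var fftSize = 1024, threshold = 0, ratio = 0.5, maxBoost = 12;
	var track, noise, chain, chainNoise;
	track = PlayBuf.ar(1, ~buf, BufRateScale.kr(~buf), loop: 1);
	noise = SoundIn.ar(0); // sidechain input; use headphones to avoid feedback
	chain = FFT(LocalBuf(fftSize), track);
	chainNoise = FFT(LocalBuf(fftSize), noise);
	chain = chain.pvcalc2(chainNoise, fftSize, { |mags, phases, noiseMags, noisePhases|
		// dB over threshold in the noise bin, scaled by (1 - ratio), as in the post
		var boostDb = (noiseMags.ampdb - threshold).max(0) * (1 - ratio);
		// apply it as a positive gain to the matching track bin; phases unchanged
		[mags * boostDb.min(maxBoost).dbamp, phases]
	}, frombin: 0, tobin: 100, zeroothers: 0);
	Limiter.ar(IFFT(chain) * 0.2).dup
}.play;
)

x.free;
```

Limiting the bin range with `tobin` keeps the SynthDef from getting huge, at the cost of leaving the higher bins unprocessed.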

richfield006 · December 27, 2021, 10:08am

Thanks a lot, will give those ideas a try today. As I’m currently trying hard to get a working model, I’m not much bothered about it sounding nice 😁. Since I installed the sc3-plugins, I have been trying some units which do FFT stuff, and came across FFTPeak. Is there a way I can send the freq and mag information of the mic input to a bus, then receive those numbers and use them to process the audio track?


s.boot;
(
SynthDef(\mic, {
	arg size=512; // Sent it to a different bus
    var sig, chain, freq, mag;
	sig = SoundIn.ar(0, 1);
	chain = FFT(LocalBuf(size), sig, 0.5, 1, winsize: 256);
	chain = PV_LocalMax(chain, 10); // Only send Bins above a threshold
	# freq, mag = FFTPeak.kr(chain, 20, 20000).poll; // I guess I can set the range of FFT bins I need and output the freq and mag.
    }).add;
)

Synth(\mic);


~voice = Buffer.read(s,"/Users/Kirthan/Desktop/real.wav"); // Load a MONO Audio file

(
SynthDef(\audio, {
	arg size=512;
    var sig, chain;
	sig = PlayBuf.ar(1, ~voice.bufnum, BufRateScale.ir(~voice.bufnum), loop: 1);
	chain = FFT(LocalBuf(size), sig, 0.5, 1, winsize: 256);
	chain = PV_MagBelow(chain, 0.01); //Only send FFT bins below a threshold
	//Or is there a way I can receive the Freq and Mags info from Mic play those in the audio file?
	
	
	sig = IFFT(chain) * 1 ! 2;
	Out.ar(0, sig); // a SynthDef needs an explicit output, unlike { }.play
    }).add;
)

Synth(\audio);

s.quit;
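
One possible way to pass those numbers between synths is a control bus: one synth writes the `FFTPeak` output with `Out.kr`, another reads it with `In.kr`. This is only a sketch under assumptions. The names `~peakBus`, `\micPeak` and `\peakReader` are invented for the example, and `FFTPeak` (from sc3-plugins) is called the same way as in the code above. The reader just polls the values; the actual processing of the audio track would go there instead.

```supercollider
(
// Two-channel control bus: channel 0 = peak frequency, channel 1 = peak magnitude
~peakBus = Bus.control(s, 2);

SynthDef(\micPeak, { |out|
	var chain = FFT(LocalBuf(512), SoundIn.ar(0));
	// FFTPeak returns [freq, mag] for the strongest bin in the given range
	var peak = FFTPeak.kr(chain, 20, 20000);
	Out.kr(out, peak);
}).add;

SynthDef(\peakReader, { |in|
	var freq, mag;
	# freq, mag = In.kr(in, 2);
	// Placeholder: print twice a second; use freq and mag in your processing here
	freq.poll(2, "peak freq");
	mag.poll(2, "peak mag");
}).add;
)

// After the SynthDefs have been added:
~writer = Synth(\micPeak, [\out, ~peakBus]);
~reader = Synth(\peakReader, [\in, ~peakBus], ~writer, \addAfter);

// The language side can read the same bus too:
~peakBus.get { |vals| vals.postln };

// Clean up
~reader.free; ~writer.free; ~peakBus.free;
```

Placing the reader after the writer with `\addAfter` means it sees the values written in the same control block.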

richfield006 · December 27, 2021, 3:29pm

Sorry to be a pain. I have been trying to implement this for the past 4 hours but failed miserably, as I couldn’t completely understand how to do this. Could you please help me out? Thank you

jamshark70 · December 28, 2021, 1:00am

I’m sorry, I have 6.5 hours of classes today, it won’t happen until tomorrow at the earliest.

Anyone else take a stab?

hjh

richfield006 · December 28, 2021, 8:27pm

Yes, many thanks to @nathan for his time and support helping me out making the model without the use of FFT. Will post the code once I get it working.

bovil43810 · December 28, 2021, 9:05pm

Thank you for this and all of the other useful synthesis recipes you posted on here!

richfield006 · December 31, 2021, 5:02pm

Here’s an approach to achieve this with EQs. I have done some real-world tests in loud environments, and these values are reasonable for the final audio (tested on the MBP 2018 mic and speakers). Please tweak them to your preference.
As this is now working, I’m interested in comparing it with the FFT approach to see what the trade-off in audio quality is.

s.boot;

~music= Buffer.read(s,"LOAD AN AUDIO FILE MONO or STEREO - IF STEREO change channel num to 2");

(
a = ({
	var mic, sig, loFreq = 20, hiFreq = 20000, numBands = 16, threshold = -20, bandFreqs, dB, maxBoost;
	bandFreqs = Array.geom(numBands, loFreq, (hiFreq/loFreq)**(1/numBands));
	mic = SoundIn.ar(0, 1); //Input Mic Signal
	mic = BPF.ar(mic, bandFreqs, 0.34);
	dB = Amplitude.ar(mic, 0.1, 0.5).ampdb.poll; //Convert to dB scale
	maxBoost = 6; //Increase or decrease the value to your preference.
	mic = min(max(dB - threshold, 0), maxBoost);
	sig = PlayBuf.ar(1, ~music.bufnum, BufRateScale.ir(~music.bufnum), loop: 1); //Set for mono audio signal
	numBands.do{ arg i;
		sig = BPeakEQ.ar(sig, bandFreqs[i], 0.34, mic[i]);
	};
	sig = Limiter.ar(sig, 1);
}).play;
)

s.quit;

Massive thanks to @nathan for helping me achieve this.
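
A small side note on the band spacing, offered as a sketch and not a correction: with the growth factor `(hiFreq/loFreq)**(1/numBands)`, the 16 band centres start at 20 Hz and the highest one lands at roughly 13 kHz, not at `hiFreq`. If the intention is for the top band to sit at `hiFreq`, dividing by `numBands - 1` does that. The variable names `asPosted` and `fullRange` are just for this comparison.

```supercollider
(
var loFreq = 20, hiFreq = 20000, numBands = 16;
// Spacing as in the code above: the last centre stays below hiFreq
var asPosted = Array.geom(numBands, loFreq, (hiFreq / loFreq) ** (1 / numBands));
// Alternative spacing: first centre at loFreq, last centre at hiFreq
var fullRange = Array.geom(numBands, loFreq, (hiFreq / loFreq) ** (1 / (numBands - 1)));
asPosted.round(1).postln;
fullRange.round(1).postln;
)
```

Whether a band centred that close to Nyquist is useful at a 44.1 kHz sample rate is a separate tuning decision, so the posted spacing may well be the better choice in practice.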