Using FFT UGens to create a concatenative-synthesis thingy?

Hey!

I’m looking into the FFT UGens because I want to write code that analyses incoming sound and, at a regular interval (for example every second), picks the sample from a folder whose spectral centroid best matches that of the input.
I’m trying to use SpecCentroid.kr, but I don’t know how to go about analysing the result, because I can’t get any numbers out of the UGen that I can work with.

(
~b1 = Buffer.alloc(s, 2048, bufnum:1);
~b2 = Buffer.read(s, "/Volumes/DATABSE M/codes/supercollider/disklavier/soundfiles/state3.wav", bufnum:3);
)

(
a = {
	var chain, sig, centroid;
	sig = PlayBuf.ar(1, bufnum:3, doneAction:2);
	chain = FFT(~b1, sig);
	centroid = SpecCentroid.kr(chain);
	sig;
}.play;
)

Can anyone lead me in a good direction?

I have also been working on something like this!
SCMIR is still the resource I would use for this, especially if you want to analyze a corpus of buffers offline and then use that dataset to “reconstruct” another sound live.
IMO the pipeline would be more or less like this:
  0. Analyze your buffers and store the data
  1. Load the data and put it in something like a KDTree
  2. Analyze a live source, send the data from server to client using SendReply/OSCFunc
  3. Search for nearest neighbors in your dataset: get the buffer number and frame position
  4. Create one or more synths to play the retrieved segment

It looks like you’re asking about point 2. Would this help to move one step forward?

(
a = { |pollRate = 10|
	var chain, sig, centroid;
	sig = PlayBuf.ar(1, bufnum:3, doneAction:2);
	chain = FFT(~b1, sig);
	centroid = SpecCentroid.kr(chain);
	// send centroid to the language pollRate times per second
	SendReply.kr(Impulse.kr(pollRate), "/anal", centroid);
	sig;
}.play;
)

(
// this function will receive centroids from the server
OSCdef(\recvAnal,
    {|msg| 
        var centroid = msg[3];
         // do something with your received centroid
         // ....
    },
    path: "/anal",
    // only accept replies sent by the synth stored in a
    argTemplate: [a.nodeID]
);
)
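
One possible way to fill in the "do something" part, as a rough sketch rather than a finished design: `~corpus` is a hypothetical list with one entry per sample, each holding an already loaded buffer and a centroid you measured beforehand (the value below is a placeholder, not real analysis data). It reuses `~b2` from the first post and assumes mono buffers. Re-using the key `\recvAnal` replaces the responder defined above.

```supercollider
(
// Hypothetical corpus: fill this from your own offline analysis.
~corpus = [
    (buf: ~b2, centroid: 1200)  // placeholder centroid in Hz
];

OSCdef(\recvAnal, { |msg|
    var centroid = msg[3];
    // linear search: the entry whose stored centroid is closest to the live one
    var best = ~corpus.minItem { |entry| (entry[\centroid] - centroid).abs };
    // play the matching buffer once, then free the synth
    { PlayBuf.ar(1, best[\buf], BufRateScale.kr(best[\buf]), doneAction: 2) * 0.5 }.play;
}, "/anal");
)
```

Every reply starts a new synth, so this only makes sense at a low poll rate (for example once per second), otherwise the playbacks pile up.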

Dude, I can’t thank you enough. I’ve been low-key looking for a way to get data from the server back to the language for ages, thank you so much!!

Question: what is a KDTree? Is it part of the SCMIR library (which I haven’t downloaded yet)?

KDTree is a Quark of its own. It gives you a data structure that knows how to find nearest neighbors in an n-dimensional space (nice if you plan to use more than one feature).

Quarks.install("KDTree")
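
To make "nearest neighbor" concrete, here is a minimal brute-force sketch in plain sclang. It does not use the KDTree Quark; it simply measures the distance to every point, which is the same question a KDTree answers without visiting every row. The feature values and labels are made up for illustration.

```supercollider
(
// Hypothetical dataset: each row is [centroid in Hz, amplitude].
~points = [[800, 0.2], [1500, 0.5], [3200, 0.1]];
~labels = [\kick, \snare, \hat];

// Squared Euclidean distance from the query to every row,
// then return the label of the closest row.
~nearest = { |query|
    var dists = ~points.collect { |p| (p - query).squared.sum };
    ~labels[dists.minIndex];
};

~nearest.([1400, 0.4]).postln;  // posts: snare
)
```

Note that the centroid (hundreds of Hz) dominates the amplitude (0 to 1) in this distance, so in practice you would probably want to scale the features to comparable ranges first.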

I would really like to hear how other people would approach this task!

@elgiano I can second this approach, and especially emphasize the requirement for a KDTree for real time searching.

In my experience with concatenative synthesis, adding more features makes this process more interesting. SpecCentroid can be misleading at times if the intention is to match pitch. You could also try time-domain periodicity measures, which are computationally cheaper than spectral analysis requiring an FFT. I have found that the envelope also has a huge influence on ‘aesthetic’ matching.
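
As a sketch of what sending several features at once could look like: the synth below adds two time-domain measures to the centroid. ZeroCrossing is used here as a crude stand-in for a periodicity measure and Amplitude as a simple envelope follower; the names `~featSynth`, `\recvFeatures` and `'/features'` are made up for this example, and the input is assumed to be the first hardware input.

```supercollider
(
~featSynth = { |pollRate = 10|
    var sig = SoundIn.ar(0);
    var chain = FFT(LocalBuf(2048), sig);
    var centroid = SpecCentroid.kr(chain);
    // time-domain features, no FFT needed
    var zc = A2K.kr(ZeroCrossing.ar(sig));   // rough frequency estimate
    var amp = Amplitude.kr(sig, 0.01, 0.1);  // envelope follower
    // one message carrying all three values
    SendReply.kr(Impulse.kr(pollRate), '/features', [centroid, zc, amp]);
    Silent.ar;  // analysis only, no audio output
}.play;

OSCdef(\recvFeatures, { |msg|
    // msg is [path, nodeID, replyID, centroid, zc, amp]
    msg[3..].postln;
}, '/features');
)
```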

The preprocessing analysis step may take an appreciable amount of time if there are a lot of samples in the directory, but searching for nearest neighbors in the KDTree will be fast. The one real-time latency issue I would watch out for is returning a file path or file index from the KDTree search, then opening the file and streaming the audio. If you are using short audio samples, you could instead preload the audio files into buffers.

Another thing to consider is how the file lengths compare to the analysis window.
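
The preloading idea mentioned above could be sketched like this, assuming a folder of short WAV files. The folder path and the variable names are placeholders.

```supercollider
(
// Hypothetical folder; replace with your own sample directory.
~folder = "/path/to/samples/";

s.waitForBoot {
    // read every .wav file in the folder into its own Buffer
    ~buffers = PathName(~folder).files
        .select { |f| f.extension.toLower == "wav" }
        .collect { |f| Buffer.read(s, f.fullPath) };
    s.sync;  // wait until the server has finished loading
    "loaded % buffers".format(~buffers.size).postln;
};
)
```

With the buffers already on the server, the search only needs to return an index into `~buffers`, and playback can start without touching the disk.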


@elgiano do you have some basic code covering the whole process that you could share?

I have to look for it and tidy it up, but I can do it over the weekend! Please remind me if I forget :)


Super interesting! What about other indices/descriptors for that task? I’m still at a very beginner level, so excuse my ignorance, but is the general way to bring such descriptors/indices into SuperCollider to use a library or method that someone has already created?
Take for example the bass ratio (proAV / data and information, lists, tables and links): in essence it’s a very simple analysis plus a bit of math. If there isn’t already an implementation, could this be realized in SuperCollider in a reasonable amount of time (and with the necessary knowledge, of course)?
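
To illustrate the "bit of math" part only: assuming the usual room-acoustics definition of bass ratio (reverberation times in the 125 Hz and 250 Hz octave bands divided by those in the 500 Hz and 1000 Hz bands), the calculation is a one-liner in sclang. Measuring the reverberation times is the harder part and is not shown; the values below are placeholders, and `~bassRatio` and `~rt` are names invented for this sketch.

```supercollider
(
// rt maps octave-band centre frequency (Hz) to reverberation time (s)
~bassRatio = { |rt|
    (rt[125] + rt[250]) / (rt[500] + rt[1000])
};

// placeholder values, not real measurements
~rt = Dictionary[125 -> 2.0, 250 -> 1.6, 500 -> 1.5, 1000 -> 1.5];
~bassRatio.(~rt).postln;
)
```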

Just released:

https://www.flucoma.org/download/

Lots of descriptors. Very well organized, and it also works the same way in Max and PD.
