Crossfading a polarity flip

Example:
Let there be two possible stereo images.
The first stereo image is straight in the middle, i.e. left and right are duplicates of each other.
The second one is the widest image possible, i.e. left and right are polarity-flipped versions of each other: left = right * -1.
What I’m trying to accomplish is crossfading between the two stereo images: from straight in the middle to the widest possible stereo field. A kind of effect of a sound opening up before you.
EDIT:
I realised I didn’t clarify the main problem this entails. If you want one channel to fade from original to flipped polarity, you’d have to cross 0, and it is this dip that I want to avoid, while keeping the signal as close to the original (spectrum-wise, I guess) as possible during the fade. I also realised my example didn’t match what I was trying to achieve, so I made a new one. The idea is that between sig and sig.neg I use an intermediate state, which is (maybe?) roughly halfway between sig and sig.neg phase-wise, so there is little phase cancellation. There is still some comb filtering going on.

(
{
	var sig = Saw.ar(123), panner;
	
	panner = SelectX.ar(MouseX.kr(0, 2), [sig, DelayN.ar(sig, 0.3, 0.0018), sig.neg]);
	
	[panner, sig];
	
}.play;
)

Also note that this only works for this specific input signal. If you put a sine wave at one of the frequencies where the comb filtering occurs, you’d get heavy phase cancellation.
But OK, I just wondered if there are any existing methods for what I’m trying to accomplish. I can’t find anything on the internet about this. Does anyone know?

Let’s rewrite your version like this:

{ Saw.ar(123) * [1, MouseX.kr(1, -1)] / 10 }.play

When the mouse is left, both channels are the same; when the mouse is right, they are out of phase. XFade2 uses quarter-sine curves instead of the linear fade we use here… but both versions will have a severe volume dip at some point: if you fade the level of the right channel between in-phase (1) and out-of-phase (-1), it has to pass through 0, in other words be silent…
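
To make the dip concrete, here is a minimal language-side sketch (no server needed) that tabulates the net right-channel gain at five fader positions. The names ~linear and ~equalPower are only for illustration, and the equal-power curve is my assumption about how a quarter-sine crossfade between sig and sig.neg combines, not a description of XFade2’s internals:

```supercollider
(
// fader position x, from 0 (mouse left) to 1 (mouse right)
var x = (0, 0.25 .. 1);
// linear fade of the right-channel gain from 1 to -1, as in the one-liner above
~linear = 1 - (2 * x);
// equal-power fade between sig and sig.neg: the net gain is cos minus sin
~equalPower = cos(x * 0.5pi) - sin(x * 0.5pi);
~linear.postln;
~equalPower.round(0.001).postln;
)
```

In both lists the middle entry (x = 0.5) comes out as zero, up to rounding, so the curve shape changes how you approach silence but not whether you hit it.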

Hello @Aueh,

I’m going to suggest you have a look at Classic stereo imaging transforms—a review. This paper presents an overview of MS theory in the context of stereo transforms… and hopefully can add some further insight into what you’re after.


Yes, it’s a problem that it has to pass through 0. With the straightforward method, the one you describe, it’s not possible to avoid the silence. But I wondered whether there is some clever way to bypass this problem. The reason I crossfaded between sig.neg and DelayN.ar(sig) is that if you crossfade between sig.neg and plain sig, you have to cross 0, but because the delayed copy is slightly offset in time, it doesn’t have to. I do realise this means the delayed signal is already more similar to the flipped-polarity signal than the original is, so the starting point is not actually dual mono. But maybe you could chain a series of faders, where each fader fades between delayed signals that are similar to each other, and each one takes it a step further until you reach flipped polarity. I’m just thinking out loud, I don’t know if it makes sense. I understand that, when you think about it, it seems impossible not to cross 0, but sometimes people come up with stuff you’d never think of yourself. The more general question is whether this topic has been thought about before and whether there are any existing methods.

Thank you! I’ll check it out, I hope I can find the thing I’m looking for.

I have this plugin, Wider by Polyverse Music, which is free. But I don’t know what’s under the hood. It says it has a diverse allpass comb filtering algorithm, haha.

Additionally found this one by a quick search: https://www.dafx.de/paper-archive/2024/papers/DAFx24_paper_92.pdf


Thanks, I don’t have time to read the paper now, but I’ll check it out. I appreciate the help.

Just thinking this through now… sharing in case helpful

If flipped polarity is equivalent to 180° out of phase (like with a sine wave) you can gradually shift the phase of one channel, so they are synchronized when mouse is all the way left, and phase inverted when it’s all the way right:

(
{
  var phase = Phasor.ar(rate: 123 * SampleDur.ir);
  sin(phase + [0, MouseX.kr(0, 0.5)] * 2pi);
}.scope
)

And if you can build something up out of sine waves, you can shift each one in this way. Here is an approximated saw wave from adding 60 sine waves together, each of which you can polarity invert the same way:

(
{
  var phases = Phasor.ar(rate: 123 * (1..60) * SampleDur.ir);
  phases.collect { |phase, i|
    sin(phase + [0, MouseX.kr(0, 0.5)] * 2pi) * (1/(i + 1));
  } .sum * 0.2;
}.scope
)

For me this is expensive on CPU, and it doesn’t make a very convincing saw wave. A much cheaper version of this same idea is to use FFT, which uses math to chop up an input audio signal into a bunch of sine waves and then resynthesize it – by shifting the phases of the sines you can get this same effect:

(
{
  var sig = Saw.ar(123);
  var chain = FFT(LocalBuf(4096), sig);
  var ifft1 = IFFT(chain);
  var ifft2;
  chain = PV_PhaseShift(chain, MouseX.kr(0, pi));
  ifft2 = IFFT(chain);
  [ifft1, ifft2];
}.scope;
)

(and if what you’re going for is purely stereo width, there are better ways to do this, as others have mentioned)


For a phase shift of a general signal, you could try a Hilbert transform. Of course it does change the spectrum a little, but it shifts every partial by pi/2, i.e. 90°.


(
{
	var a = SinOsc.ar([300, 443, 782]);
	var b = Hilbert.ar(a);
	[a, b].lace
}.plot
)

Hilbert returns a pair of signals, which are 90° apart from each other… I don’t think it’s guaranteed to be 90° from the input:

( // perfect circle on x/y scope
{
  var sig = SinOsc.ar(MouseX.kr(100, 1000));
  var b = Hilbert.ar(sig);
  b
}.scope
)
( // shifting oblong
{
  var sig = SinOsc.ar(MouseX.kr(100, 1000));
  var b = Hilbert.ar(sig);
  [sig, b[0]]
}.scope
)

that’s right – still maybe it could be a solution in some cases?


Well yeah actually for the original challenge this works well – converting to polar coordinates to get a phase control which is then possible to shift by 180° or arbitrarily. Smoother sounding transition than the FFT method, although it does add some extra buzz (maybe aliasing?), and the waveform looks nothing like what came in

(
{
  // input signal
  var in = Saw.ar(123);
  // get two signals out which are 90° out of phase
  var hilbert = Hilbert.ar(in);
  // convert x/y to polar
  var polar = Complex(*hilbert).asPolar;
  // then extract the phase ranging from -pi to pi
  var phase = polar.theta;
  // now can reconstitute original signal...ish
  sin(phase + [0, MouseX.kr(0, pi)]);
}.scope
)
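
One possible variant that keeps the amplitude information, offered as an untested sketch rather than a recipe: instead of extracting only the phase, rotate the Hilbert pair directly and listen to the real part of the result. This treats the two Hilbert outputs as the real and imaginary parts of a complex signal and multiplies by exp(i * theta). The variable names (re, im, theta, rotated) are mine, and the * 0.1 is just headroom. The left channel uses the first Hilbert output rather than the dry input, on the assumption that both channels should pass through the same allpass filtering:

```supercollider
(
{
	var in = Saw.ar(123) * 0.1;
	var theta = MouseX.kr(0, pi);
	var re, im, rotated;
	// Hilbert.ar returns two signals that are about 90° apart
	#re, im = Hilbert.ar(in);
	// real part of (re + i*im) * exp(i*theta): each partial is phase-shifted
	// by theta (the direction depends on which output leads)
	rotated = (re * cos(theta)) - (im * sin(theta));
	// mouse left: both channels equal; mouse right: right channel is re.neg
	[re, rotated]
}.scope
)
```

Because the magnitude is never thrown away, the per-partial levels should stay roughly constant across the fade, although the waveform shape and peak level still change with theta.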

Oh yeah fft, nice!


Not sure what’s going on here though. When the phase approaches pi/2, that top bar goes up a lot. I don’t know what that is, is it true peak? It’s weird because right and left sound the same loudness-wise. Something else strange happens when the original signal is already close to yellow: the shifted signal actually gets softer approaching pi/2, maybe because it clips so much that it goes silent?
Other than that, this is the best method so far.

Also, I wondered whether it’s possible to use the realm of imaginary numbers. If you picture the unit circle, you can go from 1 to i to -1, so the magnitude stays 1 the whole time and you circumvent 0. I’m just thinking abstractly, so I’m not sure if this is applicable to audio.

The Hilbert example is actually doing this: the two signals can be interpreted as the real and imaginary parts of a complex oscillator, which is what I did in order to get the phase. You can’t listen to a complex number though, so it doesn’t practically help you avoid hitting 0 (the speaker moves along a single axis from -1 to 1).
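
A small language-side check of the unit-circle picture, assuming an ideal 90° pair built by hand from four sine partials (a real Hilbert filter only approximates this). The names ~re, ~im, ~rms and ~rotate are made up for the illustration. It rotates the pair by a few angles between 0 and pi and posts the RMS of the real part each time:

```supercollider
(
var n = 1024;
var phase = Array.fill(n, { |i| i / n * 2pi });
// test signal: four partials with 1/k amplitudes
~re = (1..4).collect { |k| sin(phase * k) / k }.sum;
// its ideal 90°-shifted partner (each sine becomes a negated cosine)
~im = (1..4).collect { |k| cos(phase * k).neg / k }.sum;
~rms = { |x| (x.squared.sum / x.size).sqrt };
// rotate the pair by theta and keep only the real part
~rotate = { |theta| (~re * cos(theta)) - (~im * sin(theta)) };
[0, 0.25pi, 0.5pi, 0.75pi, pi].do { |theta|
	[theta.round(0.01), ~rms.value(~rotate.value(theta)).round(0.0001)].postln;
};
)
```

The RMS column should stay the same (about 0.84) for every angle, even though the real part on its own still swings through zero on every cycle, and at theta = pi the result is simply the polarity-flipped test signal.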

Just looking at four sine waves, and then their corresponding cosine waves (i.e. pi/2 phase shift)… adding the cosines together results in a bigger peak (which makes sense since they all are 1 at the start and end of the cycle), but the RMS (which measures loudness or energy) is almost the same.

(
var phase = Env([0, 2pi], [1]).discretize;
var phases = (1..4).collect(_ * phase);
var sines = sin(phases);
var cosines = cos(phases);
sines.plot;
cosines.plot;
~sinesum = (sines.sum * 0.25);
~cosinesum = (cosines.sum * 0.25);
~sinesum.plot;
~cosinesum.plot;
)
~sinesum.peak // -> 0.80807173252106
~cosinesum.peak // -> 1.0
~sinesum.rms // -> 0.35338071540025
~cosinesum.rms // -> 0.3547597649238

If you clip the cosine sum to the same peak value as the sine sum, you get less RMS:

~cosinesum.clip(-0.808, 0.808).rms // -> 0.33009959751974

I can’t tell you much more about the math behind this, although I’m curious :) but practical experience says that peak level doesn’t say much about overall loudness, and the example above gives some intuition about why.

(And so in practice you should probably give yourself enough headroom, so you can muck about without clipping. In SuperCollider I default to multiplying synth outputs by 0.1, i.e. -20 dB.)

…

p.s. For a different sort of intuition, consider a sine vs a square wave:

(
var phase = Env([0, 2pi], [1]).discretize;
~sine = sin(phase);
~square = (~sine * 1e20).clip(-1, 1);
[~sine, ~square].plot
)

~sine.rms
~square.rms

Both waveforms range from -1 to 1, but the square sounds significantly louder, which makes sense because the square wave spends all its time at peak: its RMS is close to 1, versus about 0.707 for the sine.
