I am using live guitar input in my code and playing back sampled guitar notes at lower rates to create basslines. My problem is very strong aliasing when playing at slow rates, like < 0.1. I am using XPlayBuf which has cubic interpolation.
Firstly, I am trying to understand the math behind the aliasing, which I think I understand for very high frequencies, but what is causing the aliasing when working with low frequencies? Are the frequencies bouncing off of the low end of the spectrum rather than the high end?
Secondly, what can I do to remedy the aliasing other than filtering out the high end, and which filter would be suitable for that: LPF.ar or a steeper filter? Would an oversampling buffer reader help, and does one exist that can also crossfade?
The highest note on my guitar is midinote 85, but I can limit the buffers used to something lower than that, e.g. midinote 74 = 587 Hz. Electric guitar signals don't have much information above the 9th partial, so the upper limit of the frequency band would be around 5000 Hz. I am wondering if pre-filtering at record time would create better results than filtering at playback time, or maybe a combination of the two?
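To make the question concrete, here is roughly what I have in mind, as a sketch (the cutoffs are guesses, and ~buf is assumed to be an allocated one-channel Buffer): low-pass once while recording into the buffer, and again at playback with the cutoff scaled by the rate.

// record-time pre-filtering: keep only the band the guitar actually uses
{ RecordBuf.ar(LPF.ar(SoundIn.ar(0), 5000), ~buf) }.play;

// playback-time filtering: scale the cutoff down with the playback rate
{ LPF.ar(PlayBuf.ar(1, ~buf, 0.1), 5000 * 0.1) }.play;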
This can happen when frequency shifting (which adds a constant amount to the sound's partials). But transposition multiplies frequencies by a pitch ratio. It's impossible to begin with a positive frequency, multiply it by a positive-but-very-small ratio, and get a negative frequency: a 5000 Hz partial played at rate 0.1, for instance, lands at 500 Hz, still positive. So it can't be inverted aliasing.
It is probably an artifact of the cubic interpolation. A cubic-interpolated function is smoother than linear, but it's still piecewise, and the pieces may not fit together perfectly smoothly.
I'm just guessing though; could perhaps come up with a better theory if I could zoom in on the waveform.
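For reference, the 4-point, third-order Hermite polynomial is the usual meaning of "cubic interpolation" in buffer readers; here is a sclang sketch for offline experiments (my own transcription of the textbook formula, not lifted from XPlayBuf's source):

(
~cubicRead = { |data, pos|   // data: Array of samples, pos: fractional index
    var i = pos.floor.asInteger;
    var f = pos - i;         // fractional part, 0 <= f < 1
    var y0 = data.clipAt(i - 1);
    var y1 = data.clipAt(i);
    var y2 = data.clipAt(i + 1);
    var y3 = data.clipAt(i + 2);
    var c0 = y1;
    var c1 = 0.5 * (y2 - y0);
    var c2 = y0 - (2.5 * y1) + (2 * y2) - (0.5 * y3);
    var c3 = (0.5 * (y3 - y0)) + (1.5 * (y1 - y2));
    ((((c3 * f) + c2) * f + c1) * f) + c0  // evaluate the cubic in Horner form
};
)

The pieces match value and slope where they join, but higher derivatives can jump, which is the piecewise-ness I mean.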
Also interested in this! I've spent several dozen hours recently reimplementing a non-realtime cubic spline interpolation in SC to understand what was going on. I also had aliasing problems in the lower part of the spectrum. I can share the code if needed.
I was about to try to answer but, honestly, @jamshark70 has better answers than I do.
Here is a plot of a guitar note. I also uploaded it in case you want to experiment. (Guitar Note.wav - Google Drive) (48k, 24 bit, wav). I really appreciate the help.
If b is the file loaded into a buffer, this reproduces the problem:
{ PlayBuf.ar(1, b, 0.1) }.play
EDIT: I see there is a lot of garbage way up high, which might come from the virtual guitar amp I am using. Let me record a straight note with SoundIn. Will update…
Just ran some more tests and I think I found the root of the problem: bogus information in the high end coming from my virtual guitar amp (NAM, Neural Amp Modeler). If I tap the signal before NAM, the problem seems to be gone.
EDIT: the bogus high end is not only coming from NAM; I will try low-passing the signal before it goes into the buffer.
The interpolation is needed because, when playing back a soundfile at different speeds, the output sample frames are unlikely to line up with the sample points in the buffer. If you play a buffer back at the original speed, the samples line up with the sample grid; if you play it back slower, the buffer samples are stretched out, so the interpolation estimates the buffer data between samples. I guess this could be more an issue of data estimation, which produces wrong data the more you stretch the waveform, than aliasing (some kind of downsampling). Could be wrong.
I still have difficulties understanding interpolation properly. One thing I think I might have overlooked at first is folding. Could this be the reason for the aliasing here? I might be totally wrong, so if someone knows better, please correct me.
In my interpolation implementation, I simply thought I could expand my original signal and then select the relevant points (the original sample is 44100 samples, the intermediate signal is 44100 * 48000 samples, the final signal is 48000 samples). But I think you can get aliasing if you don't filter that intermediate signal (the one with 44100 * 48000 samples): I believe you need to apply a low-pass filter to it before reducing it to your desired sample rate.
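A toy, non-realtime version of that expand-then-select idea on a plain Array (zero-order expansion for brevity; the low-pass on the expanded signal is exactly the step this toy skips):

(
var src = Array.fill(16, { |i| sin(2pi * i / 8) });    // two cycles of a sine
var up = 3, down = 2;                                   // resample by a ratio of 3/2
var expanded = src.collect({ |x| x.dup(up) }).flatten;  // each sample held up times
// a real converter would low-pass `expanded` here, before the decimation below
var out = (0, down .. expanded.size - 1).collect({ |i| expanded[i] });
out.postln;
)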
In my case there is no sample rate conversion, just playing back the sampled signal at (mostly) rates < 1. I found out my guitar pickups are picking up very high-pitched noise from time to time (nothing surprising), and that high-pitched noise is moved down into the audible spectrum when playing at e.g. rate = 0.1. My solution has been to low-pass the incoming signal at 10 kHz, which is good enough for guitar, as the highest note on my guitar has a fundamental frequency of about 1100 Hz.
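In code it is just the obvious thing, roughly this (assuming ~buf is the Buffer I record into):

// low-pass the live input before it is written into the buffer
{ RecordBuf.ar(LPF.ar(SoundIn.ar(0), 10000), ~buf) }.play;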
If you have a buffer sampled at your playback sample rate and you play it back at the original speed, you get one data point of your buffer for each playback sample. If you reduce the playback speed to, say, half the original speed, you have to represent 2 output samples with one sample in your buffer; without interpolation this results in a stepped waveform (think of a sample and hold), so you have to interpolate between samples. The slower the playback rate, the more data points you have to estimate, and those estimated points do not necessarily represent the original file. The type of estimation depends on the interpolation algorithm you use; the simplest one is linear interpolation. Additionally, if your playback rate is a non-integer value, it is very unlikely that the output sample grid lines up with the buffer sample grid.
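As a concrete illustration, a plain sclang sketch of linear interpolation at a fractional read position (on an Array, not a server Buffer):

(
~linRead = { |data, pos|     // data: Array of samples, pos: fractional index
    var i = pos.floor.asInteger;
    var frac = pos - i;      // how far between sample i and sample i + 1
    var a = data.clipAt(i);
    var b = data.clipAt(i + 1);
    a + ((b - a) * frac)     // weighted average of the two neighbours
};
// at rate 0.5 every second read position falls between two buffer samples:
~linRead.([0, 1, 0, -1], 0.5).postln;  // -> 0.5
)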
An oversampled UGen, say 8x or 16x, using a more sophisticated interpolation algorithm, can make a difference, though it comes with an increase in CPU load.
I never compared, but there are NRT applications that use the highest-quality sample rate conversion.
Under the right conditions, ignoring other factors, would they be the same? If that's the case, is it a trade-off between a huge memory buffer and an increase in CPU load?
If someone has technical reasons or A/B comparisons, I would be delighted to know a bit more.
I tried myself, but never in an A/B situation or thinking about other factors I'm forgetting now.
True. A good interpolation formula (cubic, or sinc) will only add an aurally undetectable amount of noise (because deviation from "true" in the sample values is nothing more than noise). A bad one (linear) may introduce sharp corners, and that could add aliased frequencies.
Would it make sense for a DSP process to adapt the algorithm according to the other aspects? I wrote an oversampling buffer player, and this idea crossed my mind. My instinct is that it would not work so well.
Also, I wonder if expensive interpolation algorithms, such as higher-order Lagrange, are possible in NRT.
For comparison, naive Lagrange-style interpolation with n points costs on the order of n^2 operations per sample: linear interpolation uses 2 points, so about 2^2 = 4; cubic uses 4 points, so already about 4^2 = 16.
Finally, 8th-order Lagrange (9 points): 9^2 = 81 operations per sample. It's 81, no typo. I don't think musicians use that much, but scientific computing does.
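If it helps, here is a minimal sclang sketch of that n-point Lagrange evaluation (a naive textbook form, not from any particular UGen); the nested loop is where the roughly n^2 multiplies per output sample come from:

(
~lagrange = { |xs, ys, x|
    ys.collect({ |yj, j|
        var term = yj;
        xs.do { |xk, k|
            // skip k == j; each of the n terms touches the other n - 1 points
            if(k != j) { term = term * (x - xk) / (xs[j] - xk) };
        };
        term
    }).sum
};
// interpolate halfway between the 2nd and 3rd of 4 points:
~lagrange.([0, 1, 2, 3], [0, 1, 0, -1], 1.5).postln;  // -> 0.625
)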
There are all flavors of interpolation algorithms, and most of the time it's not hard to say when one is a waste of computation and when it's insufficient for the process at hand.
(Of course, I imagine we are also thinking about extreme transformations. Sometimes I want to push things a lot but also avoid the "cheap" quality of the artifacts.)