Irregular speaker array panner/decoder

In the project I’m currently working on, I will have to move some sound sources in a multi-channel setup of about 40 speakers arranged more or less evenly along the walls of a corridor about ten meters long.

Some 20 speakers will therefore be arranged on the left wall and the same number along the right wall. The speakers will be positioned at different heights between 50 and 180 cm.

I think this setup could fall under the name shoebox setup (or at least irregular array of speakers).

I would need a decoder/panner which, given the x, y and z position of a sound source, sends an appropriately compensated audio signal to each of the speakers.

Personally, even if the possibility exists, I would discard the option of an ambisonic panner/decoder, precisely because:

  • there is no real preferential listening sweet-spot (the listener is free to move around and walk down the corridor);
  • it is not a regular ring or dome/semi-dome setup;

I would prefer instead a ‘simplified’ approach such as the VBAP vector panning system could provide. However, I don’t think VBAP is for me either, since, AFAICT, even VBAP assumes that the drivers are arranged in a regular periphonic or pantophonic setup.

A system that would be useful to me is the one implemented in Reaper’s ReaSurround/ReaSurroundPan plugins, where, among other things, controls such as the influence of each speaker or the diffusion/divergence level of the source are available.

Below are two images where, as a demonstration, I have placed 5+5 speakers and a mono sound source.

Note: in my setup every speaker points towards the opposite wall, instead of towards the centre of the coordinate system.

Which approach do you think would work best?
Is there inside SC a decoder/panner that will do the job out of the box?
If not, are there any externals I can use?

Thank you so much, as always, for your help and suggestions

I know this isn’t the answer you are looking for, but I would just use PanAz and see if it is close enough for government work. This way you get equal power panning between pairs of speakers and you only have to deal with a circle in your code. I would at least try it and see if it works for you.
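For reference, what PanAz does boils down to an equal-power crossfade between adjacent speakers on a ring. Here is a rough Python sketch of that math (not SC code; the `width` handling is simplified compared to the real UGen, and the position convention is an assumption for illustration):

```python
import math

def pan_az(num_chans, pos, width=2.0):
    """Equal-power circular panning, loosely modeled on SC's PanAz.

    pos: position around the ring in [-1, 1), spanning the full circle.
    width: how many adjacent channels the source spreads across
    (2 = pairwise crossfade). Returns one gain per channel.
    """
    gains = []
    for chan in range(num_chans):
        # Distance from the source to this channel, in channel units,
        # wrapped around the ring (take the shortest way around).
        d = (pos * num_chans / 2.0 - chan) % num_chans
        d = min(d, num_chans - d)
        if d < width / 2.0:
            # Cosine (equal-power) taper across `width` channels.
            gains.append(math.cos(d / width * math.pi))
        else:
            gains.append(0.0)
    return gains
```

With `width=2`, a source halfway between two speakers gets cos(π/4) ≈ 0.707 on each, so the summed power stays constant as it moves around the ring.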

Sam


As a first approach, I would also go with PanAz/VBAP + playing a little bit with reverberation and a LPF to simulate the distance effect, and also first test the traditional acousmonium/diffusion approach.
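The reverb/LPF distance trick can be prototyped as a couple of control curves driven by source distance. A Python sketch, where the 1/r gain law is the standard amplitude model but the cutoff taper is an arbitrary placeholder to be tuned by ear:

```python
def distance_cues(distance_m, ref_dist=1.0):
    """Rough distance cues for a panner: inverse-distance gain plus a
    low-pass cutoff that closes as the source recedes (a crude stand-in
    for air absorption). The cutoff curve is made up, not a physical
    model -- tune it by ear in the actual room."""
    d = max(distance_m, ref_dist)
    gain = ref_dist / d                                  # 1/r amplitude law
    cutoff_hz = 18000.0 / (1.0 + 0.3 * (d - ref_dist))   # placeholder taper
    return gain, cutoff_hz
```

One could also scale a reverb send with `1 - gain` so that distant sources get proportionally more room sound.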

If you have the possibility of setting all the speakers at the same height (and time for experimenting with different setups), I guess wave field synthesis would be worth at least a rough try. However, it is a quite complicated/problematic technique, especially regarding the movement of the sound sources…

Is the site you are working in acoustically dry? If it is quite reverberant, the challenge increases…

One interesting place to ask these questions would be the “Spatial Audio in VR/AR/MR” Facebook group. It is full of true experts in this area (and they are super friendly and kind :wink:)


sc3-plugins’ VBAP implementation includes a class VBAPSpeakerArray, which allows speakers to be positioned at arbitrary angles, in 2 or 3 dimensions.

So it might be the case in VBAP that a regular arrangement of speakers provides the best imaging for any arbitrary audio-source position, but you can put the speakers wherever you want (and accept that imaging wouldn’t be ideal everywhere – in your setup, the quality of imaging will be very different between C and D vs E and F, but VBAP should let you do it).
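To illustrate why arbitrary angles are fine in the 2D case: VBAP just picks the pair of speakers bracketing the source direction and solves a 2×2 linear system for their gains. A Python sketch of that pairwise math (an illustration, not the sc3-plugins implementation):

```python
import math

def vbap_2d(speaker_angles_deg, source_angle_deg):
    """Minimal 2D VBAP sketch: given an arbitrary (not necessarily
    regular) ring of speaker azimuths, find the pair bracketing the
    source, solve for the two gains, normalise to constant power.
    Returns (sorted angles, gain per speaker)."""
    angles = sorted(speaker_angles_deg)
    vecs = [(math.cos(math.radians(a)), math.sin(math.radians(a)))
            for a in angles]
    p = (math.cos(math.radians(source_angle_deg)),
         math.sin(math.radians(source_angle_deg)))
    gains = [0.0] * len(angles)
    for i in range(len(angles)):
        j = (i + 1) % len(angles)            # wrap around the ring
        l1, l2 = vecs[i], vecs[j]
        det = l1[0] * l2[1] - l1[1] * l2[0]
        if abs(det) < 1e-9:
            continue
        # Solve p = g1*l1 + g2*l2 by Cramer's rule.
        g1 = (p[0] * l2[1] - p[1] * l2[0]) / det
        g2 = (l1[0] * p[1] - l1[1] * p[0]) / det
        if g1 >= -1e-9 and g2 >= -1e-9:      # source lies between this pair
            norm = math.hypot(g1, g2)
            gains[i], gains[j] = g1 / norm, g2 / norm
            break
    return angles, gains
```

Nothing in the solve requires the angles to be evenly spaced; irregular spacing only changes how wide (and therefore how blurry) each active pair is.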

I guess “influence of each speaker” could be handled by adjusting channel gains independently (but I’m out of my depth here).

hjh


Isn’t VBAP an approach for projecting an audio source onto a set of virtual positions matching the real-life speakers, such that the reconstruction makes it seem like the source is there in the room? I didn’t quite like the spatialization this provides, because there’s a certain amount of leakage into other speakers, and sometimes none at all, and it doesn’t sound realistic enough.

With ambisonics, the recording of the audio source is independent of where it is located in the room; it is placed in the space at decoding time. The decoding stage depends on the final positions and number of speakers, so the array doesn’t have to be a hexadome. The various open-source ambisonics implementations should include a utility to generate matrix transforms that convert the recording to any speaker-array configuration.
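For concreteness, here is what that encode/decode split amounts to in the simplest horizontal first-order case: encoding is a fixed set of harmonic weights per source direction, and a naive decoder just samples the encoded field at each speaker azimuth. This is a toy "projection" decoder for illustration; practical decoders for irregular arrays use pseudoinverse, max-rE or AllRAD designs instead:

```python
import math

def encode_fo(azimuth_deg):
    """Encode a mono source at a given azimuth into horizontal
    first-order ambisonics (W, X, Y), with the common 1/sqrt(2)
    W weighting."""
    a = math.radians(azimuth_deg)
    return (1.0 / math.sqrt(2.0), math.cos(a), math.sin(a))

def decode_projection(bformat, speaker_azimuths_deg):
    """Naive projection decoder: sample the encoded field at each
    speaker direction. Speaker positions enter only here, which is
    why the same encoding can target any array."""
    w, x, y = bformat
    n = len(speaker_azimuths_deg)
    feeds = []
    for az in speaker_azimuths_deg:
        a = math.radians(az)
        feeds.append((math.sqrt(2.0) * w
                      + x * math.cos(a)
                      + y * math.sin(a)) / n)
    return feeds
```

A source encoded at 0° and decoded to a square array gets its strongest feed from the front speaker and nothing from the rear one, with the sides in between.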

One thing I’ve felt about how ambisonics works, though, is that it’s very listener-centric. Great for VR/AR, where there’s only one listener in a given spot. This is reflected in how sound sources are placed in the ambisonic space: there’s a direction (azimuth and elevation angles), but distance is purely an inverse-power-law amplitude loss relative to a central point.

That is, until I visited an installation: a large, three-room-sized video projection with ambisonics decoded to what was perhaps a flat array of speakers. The resulting spatialization made it seem as if the audio surrounded the listener everywhere in the room, though it was mostly environmental sounds of forests, machinery, etc.

I noticed this with my audio mixes as well. Mixing with ambisonics really gives a lot of space to the mix. I think the power/gain relationship is handled somewhat better than normal panning laws on stereo speakers, which gives better soundstaging.


IIRC distance-based amplitude panning (DBAP) is the solution that best fits this case - I thought that VBAP algorithms still have some assumptions about where the listener might be / SOME regularity in the speaker placements. @woolgathering made a DBAP SuperCollider plugin here: DBAP Spatialization Plugin
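The core of DBAP is small enough to sketch: each speaker's gain falls off with its distance from the virtual source, with a "spatial blur" term so the gain doesn't blow up when the source lands exactly on a speaker, and a final normalisation to constant power. A Python illustration after Lossius et al. (parameter values are just defaults for experimentation, not tuned for any particular room):

```python
import math

def dbap_gains(source, speakers, rolloff_db=6.0, blur=0.2):
    """Distance-based amplitude panning sketch.

    source: (x, y) position of the virtual source.
    speakers: list of (x, y) speaker positions -- any layout.
    rolloff_db: attenuation per doubling of distance (6 dB is typical).
    blur: spatial blur added to every distance, in the same units.
    Returns a power-normalised gain per speaker."""
    a = rolloff_db / (20.0 * math.log10(2.0))    # rolloff exponent
    dists = [math.sqrt(sum((s - p) ** 2 for s, p in zip(source, pos))
                       + blur ** 2)
             for pos in speakers]
    raw = [1.0 / d ** a for d in dists]
    k = 1.0 / math.sqrt(sum(g * g for g in raw))  # constant-power norm
    return [k * g for g in raw]
```

Because only distances matter, the speaker layout can be completely irregular, which is why DBAP is the usual suggestion for corridor- and room-shaped arrays like yours.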

One other option: DBAP / VBAP approaches are still ones that try to collapse a somewhat complex, arbitrary listener + room + speaker configuration into a low-dimensionality set of controls (e.g. X,Y parameters). This is most useful if you want to have an abstracted relationship between pieces being performed and the space/configuration - either because multiple pieces are being presented (and you need more “generalized” control), or you’re making a piece for performance in multiple different spaces/configurations.

If this is a scenario where you’re making ONE piece for performance on ONE speaker setup and don’t need that level of abstraction, it might be worth ditching ANY low-dimensionality, generic panning system and just creating semi-arbitrary panning setups per-instrument / per-sound-source. Meaning: play sound A only from speaker 1, pan sound B between speakers 2 3 and 4, give sound C area-effect panning in speakers 5-8. Three or four specific panning gestures (plus a more generic nearest-neighbor panning for sounds where close localization isn’t important) can already give a ton of possibility for articulation.

DBAP is already compromised enough that very precise spatial gestures may simply not work at all - you might end up with a highly “washed out” version of whatever you’re envisioning. Composing with specific + esoteric panning strategies (rather than a generic approach) may give you a better chance to nail specific gestures in a way that’s articulate and memorable - especially if you’re presenting to an audience that may not have deep experience with listening to and decoding complex sound spaces.


Thanks everyone for your interest and your advice and suggestions,

I completely agree with @scztt regarding the approach of opting for semi-arbitrary panning.

Perhaps because I am now at a preliminary stage of the project, I am still trying to build up tools that will let me control the panning as agnostically as possible, independently of the type of sound source and the speaker arrangement (which, I confess, I know will vary until the last day).

I trust that as the project progresses and its needs become clearer, I will be able to draw up a more precise set of panning gestures to apply on a per-instrument/per-sound-source basis.

As also suggested by @fmiramar and @Sam_Pluta, I think I will then move on and try to build panning based on the classic instruments I am familiar with (like PanAz). As suggested by @jamshark70, I will also have a VBAP panner ready, although I still think the concern expressed by @muzikman about ambisonics applies to this technique too: both VBAP and ambisonics are particularly listener-centric systems, based on the assumption of one or a few listeners positioned in a specific, very precise sweet spot.

I would like to experiment more and more in this field of sound spatialisation, and that is precisely the point of my dissertation: as time goes by, I am trying to get to know new modes of expression and, possibly, to apply them critically and knowledgeably.

On this occasion, I would like to share with you the book that has been a great source of knowledge and inspiration for me on this topic: “Immersive Sound” (Routledge, link), a well-written and very comprehensive book.

I would also like to point out - but perhaps you already know about it - that in one of the rooms of the civic museums of the city of Pesaro, here in Italy, there is a permanent installation by David Monacchi called ‘Sonosfera’.

A spherical acoustic environment, carefully isolated from external noise and acoustically treated against internal reverberation, equipped with about 40 speakers in a regular grid for listening to ambisonic recordings (including ‘Fragments of Extinction’).
It was a relatively recent discovery for me and a very interesting experience.

To be sure, the DBAP plugin needs updating. The DBAP algorithm itself suffers from other problems, too.

I ended up writing a research paper and developing a newer version that doesn’t suffer from the same issues: [2109.08704] Speaker Placement Agnosticism: Improving the Distance-based Amplitude Panning Algorithm. It works better in some respects (as in not blowing up at the edges). It has yet to be implemented.
