A Guide to Ambisonics and HOA
Understanding Ambisonics
Ambisonics Spatialization Algorithm
Ambisonics is achieved in two steps: encoding and decoding. The encoding step pans the virtual sources and renders them as an Ambisonic stream. This stream then passes through an Ambisonics decoder, which produces the audio signals for the loudspeakers of the given layout.
This two-step process makes Ambisonics highly flexible and portable, as an encoded stream can be decoded to a wide variety of speaker layouts.
With HOLOPHONIX, this encoding/decoding process is seamlessly integrated. You can use Ambisonics to spatialize virtual sources, but also to decode externally created streams in B-Format or HOA, or streams coming directly from a microphone in A-Format or HOA.
The Ambisonics stream is composed of several audio channels, derived from the decomposition of the acoustic field into an appropriate basis. Individually, these channels are of no interest. Only their combination is relevant, and it is used to compute the loudspeaker feed signals.
This section explores how an Ambisonic stream is encoded and decoded.
First Order Ambisonics
The principle behind Ambisonics encoding and decoding is very similar to M/S recording, which can be considered one-dimensional Ambisonics.
The M/S pair allows stereophonic scenes to be recorded without relying on a classic microphone pair. Instead, it uses a cardioid microphone to record the middle (M) of the scene and a bidirectional (figure-of-eight) microphone to record the left and right sides (S).
Recordings produced by this technique are not supposed to be played on speakers as they are. The signals must be decoded first, to be played on a stereo system.
The decoding process thus consists in matrixing the directivity components for a given speaker layout (in the case of M/S, a stereo setup):
- the left channel signal is obtained by adding the (M) and (S) channels: the out-of-phase lobe of (S) will cancel the right part of the (M) signal,
- the right channel is obtained by adding (M) and (S) channels, with S having its phase inverted. The out of phase lobe will now be the left lobe; it will cancel the left part of the middle signal.
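The matrixing described above can be sketched in a few lines. This is only an illustration: the function name `ms_decode` and the sample values are hypothetical, and the sum and difference are left unscaled, whereas practical decoders usually apply a gain factor.

```python
def ms_decode(mid, side):
    """Matrix mid (M) and side (S) samples into left and right samples.

    Assumption: unscaled sum and difference, with positive S pointing left.
    """
    left = [m + s for m, s in zip(mid, side)]    # L = M + S
    right = [m - s for m, s in zip(mid, side)]   # R = M - S (S phase-inverted)
    return left, right


# Hypothetical samples where M and S are identical: the signal ends up
# entirely in the left channel and cancels in the right one.
mid = [0.5, 0.25, -0.5]
side = [0.5, 0.25, -0.5]
left, right = ms_decode(mid, side)
```

With these values the right channel is silent, because the inverted (S) signal cancels (M) exactly.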
Ambisonics works on the same principle. For the 2D case, instead of a cardioid middle (M) channel, it uses an omnidirectional component called W. The equivalent of the side (S) channel is a left-right bidirectional component, called Y in the usual B-Format convention. It also features a second bidirectional channel, X, that covers the front and rear.
This 2D stream can then be decoded to a speaker layout containing at least three loudspeakers around the listener.
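As a rough illustration of the 2D case, the sketch below encodes one sample into W, X and Y and decodes it to a regular ring of loudspeakers. Several things here are assumptions rather than a description of any particular product: the function names, the axis naming (X towards the front, Y towards the left, azimuth counted counter-clockwise from the front), the unnormalized W component, and the simple projection decoder, which is only suitable for evenly spaced rings.

```python
import math


def encode_2d(sample, azimuth_deg):
    """Encode one sample into first-order 2D components (W, X, Y).

    Assumption: W is unscaled; other conventions weight W differently.
    """
    az = math.radians(azimuth_deg)
    return sample, sample * math.cos(az), sample * math.sin(az)


def decode_2d(w, x, y, speaker_azimuths_deg):
    """Basic projection decode for a regular (evenly spaced) ring."""
    n = len(speaker_azimuths_deg)
    feeds = []
    for spk_deg in speaker_azimuths_deg:
        spk = math.radians(spk_deg)
        directional = x * math.cos(spk) + y * math.sin(spk)
        feeds.append((w + 2.0 * directional) / n)
    return feeds


# Encode once, then decode the same stream to two different layouts.
w, x, y = encode_2d(1.0, 45.0)
square = [45.0, 135.0, 225.0, 315.0]
hexagon = [0.0, 60.0, 120.0, 180.0, 240.0, 300.0]
square_feeds = decode_2d(w, x, y, square)
hexagon_feeds = decode_2d(w, x, y, hexagon)
```

The loudspeaker closest to the encoded direction receives the largest feed, and the same three-channel stream serves both layouts without being re-encoded.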
With 3D Ambisonics, the stream simply gains one more component: the Z channel, on the vertical axis. It can then be decoded to a layout of at least four loudspeakers around the listener that includes elevation.
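A minimal sketch of first-order 3D encoding follows, showing how elevation feeds the Z channel. The function name `encode_3d`, the axis naming (X front, Y left, Z up) and the unnormalized gains are assumptions made for illustration; real encoders apply a specific normalization and channel ordering.

```python
import math


def encode_3d(sample, azimuth_deg, elevation_deg):
    """Encode one sample into first-order 3D components (W, X, Y, Z).

    Assumption: unnormalized gains, azimuth counter-clockwise from the
    front, elevation positive upwards.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                   # omnidirectional
    x = sample * math.cos(az) * math.cos(el)     # front-back
    y = sample * math.sin(az) * math.cos(el)     # left-right
    z = sample * math.sin(el)                    # up-down
    return w, x, y, z


# A source at ear level on the left, and a source directly overhead.
horizontal = encode_3d(1.0, 90.0, 0.0)
overhead = encode_3d(1.0, 0.0, 90.0)
```

For the ear-level source the Z component is zero, so the stream reduces to the 2D case. For the overhead source almost all of the directional energy sits in Z.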
Higher Order Ambisonics
The decomposition of space into multiple directional components can be extended with additional components that have more precise directivity patterns. This is called Higher Order Ambisonics.
Ambisonics components are called "spherical harmonics," by analogy with the Fourier series decomposition of sound. Just as any signal can be considered as a sum of sinusoidal functions of different frequencies and amplitudes (its harmonics), space can be divided into spatial functions which, once summed, describe the whole space.
This larger number of Ambisonics components, along with their increased directivity, allows a more precise rendering of the sound field with a speaker system.
The HOA order (m) determines the number of Ambisonics components (n), i.e. the number of audio channels in the stream, through the following equations:
- In 2D: n = 2m + 1
- In 3D: n = (m + 1)².
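The two formulas above translate directly into a small helper. The function name `hoa_channel_count` and its `three_d` parameter are hypothetical, chosen only for this example.

```python
def hoa_channel_count(order, three_d=True):
    """Return the number of Ambisonics components for a given HOA order."""
    if order < 0:
        raise ValueError("order must be non-negative")
    if three_d:
        return (order + 1) ** 2   # 3D: n = (m + 1)^2
    return 2 * order + 1          # 2D: n = 2m + 1


# Channel counts for orders 1 to 3, as (order, 2D count, 3D count).
counts = [(m, hoa_channel_count(m, three_d=False), hoa_channel_count(m))
          for m in range(1, 4)]
```

For example, a 3rd order stream carries 7 channels in 2D and 16 channels in 3D, while 1st order gives the 3 and 4 channels described earlier.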
As with first-order Ambisonics, this stream is decoded for the loudspeaker setup by combining these multiple components, using different techniques and optimizations that are covered in the HOA Bus page.