A Guide to Wave Field Synthesis
Spatialization algorithms provide accurate localization over a wider listening area than traditional systems. Wave Field Synthesis (WFS) is the most efficient algorithm in that field, but others, such as amplitude panning (VBAP, LBAP, etc.) or Higher Order Ambisonics (HOA), also give excellent results and can be useful in a wide range of situations with different audio content or system designs.
Origins in Scientific Research
Wave Field Synthesis is a sound spatialization algorithm that was initially researched in the 1980s at Delft University of Technology (Netherlands), and later by several research centers such as Ircam’s STMS lab.
It is based on Huygens’s principle, formulated in 1690 in his Treatise on Light, which states that every point on a wave front can be regarded as the source of a secondary wavelet. When a sound is emitted, it propagates through space as a wave front. WFS aims to reproduce that wave front using an array of loudspeakers. The algorithm relies only on the level and delay differences between the loudspeakers. Together, the contributions of those loudspeakers build a simulated wave front (called the secondary wave front) that is, in theory, identical to the primary wave front.
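As a rough illustration of the level and delay differences mentioned above, here is a minimal sketch in Python. It is not the actual WFS driving function: it assumes a virtual point source behind a straight array, a speed of sound of 343 m/s, and a plain 1/r attenuation. The function name, the loudspeaker layout and the source position are all hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed value)


def delays_and_gains(source, speakers, c=SPEED_OF_SOUND):
    """Return one (delay_in_seconds, gain) pair per loudspeaker.

    Simplified model, not a full WFS driving function: each loudspeaker is
    delayed by its extra distance to the virtual source and attenuated with
    a plain 1/r law, both relative to the loudspeaker nearest the source.
    """
    distances = [math.dist(source, speaker) for speaker in speakers]
    nearest = min(distances)
    if nearest == 0.0:
        raise ValueError("the virtual source must not coincide with a loudspeaker")
    return [((d - nearest) / c, nearest / d) for d in distances]


# Hypothetical layout: five loudspeakers one meter apart along the x axis,
# virtual source three meters behind the middle of the array.
speakers = [(x, 0.0) for x in (-2.0, -1.0, 0.0, 1.0, 2.0)]
source = (0.0, -3.0)

for (x, _), (delay, gain) in zip(speakers, delays_and_gains(source, speakers)):
    print(f"speaker at x = {x:+.1f} m: delay {delay * 1000:.2f} ms, gain {gain:.2f}")
```

In this sketch the central loudspeaker fires first at full level, and the outer loudspeakers fire later and more quietly, so the combined output approximates the curved wave front that a real source at that position would produce.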
Why Choose WFS over Stereo?
The Issues with Stereophony for Live Sound
Stereophony is a technique that relies on a pair of loudspeakers, allowing the listener to perceive sounds as coming from a continuum of positions between the left and the right loudspeaker. To position a sound between those two speakers, it simply relies on a ‘phantom image’ created by level differences (for example, when a sound is placed slightly to the left, the right loudspeaker receives slightly less signal than the left one).
However, stereo requires the listeners to be placed at the center of the venue. If they are not exactly at that position, they will not be able to perceive the accurate localization, because of the way our perception works.
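The level-difference panning described above can be sketched as follows. The text does not say which pan law is used, so this example assumes the common sine/cosine (constant-power) law. The function name and the position convention (-1 for fully left, +1 for fully right) are hypothetical.

```python
import math


def constant_power_pan(position):
    """Return (left_gain, right_gain) for a pan position in [-1.0, 1.0].

    Assumed convention: -1.0 is fully left, 0.0 is center, +1.0 is fully
    right. The sine/cosine law keeps left**2 + right**2 equal to 1.
    """
    if not -1.0 <= position <= 1.0:
        raise ValueError("position must be between -1.0 and 1.0")
    angle = (position + 1.0) * math.pi / 4.0  # maps [-1, 1] to [0, pi/2]
    return math.cos(angle), math.sin(angle)


# A source placed slightly to the left: the right loudspeaker gets a
# slightly lower gain than the left one.
left, right = constant_power_pan(-0.2)
print(f"left gain {left:.3f}, right gain {right:.3f}")
```

Note that only gains are involved: both loudspeakers emit the signal at the same instant, which is why the result depends so strongly on where the listener sits.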
Sweet-spot: the Weak Spot of Stereo
To determine the localization of a sound, our hearing system mainly relies on a simple principle: it measures, between our two ears, the differences in intensity and time of arrival of the perceived sounds. Our hearing system is also subject to the ‘precedence effect’: if two loudspeakers produce the same sound, we perceive the sound as coming from the closest loudspeaker, even if the more distant speaker has a higher level.
Therefore, as soon as listeners move towards the side of the venue, the sound of the closest loudspeaker arrives first at their ears, and they start perceiving sounds as coming only from that loudspeaker. The left-right panning that stereo relies on for localization no longer works.
Consequently, the perception of sound localization in stereo systems is only valid when the listeners are sitting at the center of the audience, a position called the ‘sweet spot’. This is a major drawback of stereo systems.
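To give a sense of scale for the arrival-time argument above, the following sketch computes how much earlier the closest loudspeaker is heard at a given seat. The loudspeaker positions, the seats and the speed of sound (343 m/s) are assumptions chosen for illustration, and the function name is hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed value)


def arrival_time_difference_ms(listener, left, right, c=SPEED_OF_SOUND):
    """Return how much later the left loudspeaker arrives, in milliseconds.

    A positive value means the right loudspeaker is heard first, a negative
    value means the left one is heard first.
    """
    path_difference = math.dist(listener, left) - math.dist(listener, right)
    return path_difference / c * 1000.0


# Hypothetical venue: loudspeakers 10 m apart, listeners 10 m from the stage.
left_speaker, right_speaker = (-5.0, 0.0), (5.0, 0.0)

for seat in ((0.0, 10.0), (3.0, 10.0)):
    diff = arrival_time_difference_ms(seat, left_speaker, right_speaker)
    print(f"seat at x = {seat[0]:+.1f} m: right speaker leads by {diff:.2f} ms")
```

With this layout, the centered seat hears both loudspeakers at the same time, while the seat three meters to the right hears the right loudspeaker about 7.6 ms earlier. Amplitude panning does nothing to compensate for that offset.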
When Stereo Ends up Being... Mono
Due to the poor localization performance of stereo, sound engineers often have to position all the sounds at the center, resulting in a mono mix. Not only might the artistic qualities of the mix suffer from the lack of spatialization, but such a mono mix also increases the frequency masking phenomenon, another drawback described below.
Mono is also enforced by the way standard systems are designed to balance the limited loudspeaker coverage of ‘stereo’ systems. The main ‘left’ and ‘right’ speakers only overlap at the center of the venue, even though designs tend to make this overlapping region as large as possible. Outside the overlapping area, listeners do not get a good and intelligible perception of the opposite speaker.
One solution is to add ‘fill’ loudspeakers, used to cover sub-areas. But those additional speakers receive a mono sum of the left and right signals, even if the sound engineer uses stereo panning in the mix. To optimize their mix, engineers have to take into consideration that part of the audience is only covered by a mono mix.
Consequences of the ‘Frequency-Masking’ Phenomenon
Frequency masking occurs when two signals with content in the same frequency range are summed into one channel (which is the case in the two situations described above).
It causes a loss of intelligibility that sound engineers usually manage with processing such as equalization and compression. But such tools, even if sometimes used for artistic purposes, alter the sound and can make it appear less clear or natural.
The Benefits of WFS
An Algorithm with no Sweet Spot
Because the algorithm focuses on recreating a natural wave front using finely computed delays and level differences, audience members perceive the sound as coming from the desired direction wherever they are placed in the venue. The sweet spot disappears, and almost every single seat becomes a privileged position.
Unlike with stereo, sound engineers using Wave Field Synthesis can experiment with virtual source positioning and give each sound the desired localization. Thanks to the core characteristics of WFS (individual gains and delays), there is less frequency masking between different sources. The sound becomes more intelligible, clear and natural, and fewer compressors and equalizers are needed, except for artistic purposes.
WFS on High-Density Loudspeaker Arrays
Theoretically, to recreate a perfect wave front, WFS would require an infinite and continuous array of loudspeakers (i.e. so many of them that there would be no spacing between them). In practice, an interacoustical distance of about one meter between the loudspeakers is sufficient for live sound applications to provide a wide localization area in the audience with no artifacts.
However, the spacing between the loudspeakers limits the maximum frequency up to which the wave front can be correctly reconstructed. By using more loudspeakers with the smallest possible distance between them, correct wave front reconstruction becomes possible for most of the audio spectrum. This makes it possible to achieve an effect called ‘holophony’, where listeners are able to perceive the distance of the virtual source in addition to its lateral position.
Such a high-density loudspeaker array can be heard at Ircam’s ESPRO auditorium, which has 280 Amadeus PMX5 loudspeakers forming four WFS arrays, driven by the technology at the heart of the HOLOPHONIX audio engine. On the frontal array, the loudspeakers have an interacoustical distance of 16 cm (i.e. measured between the centers of adjacent loudspeakers).
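The link between loudspeaker spacing and the upper frequency limit of correct reconstruction can be estimated with the usual worst-case spatial aliasing approximation, f ≈ c / (2 × spacing). This formula is not given in this guide and is used here only as an assumption. The real limit also depends on the positions of the source and the listener. The function name and the speed of sound (343 m/s) are likewise assumptions.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed value)


def aliasing_frequency_hz(spacing_m, c=SPEED_OF_SOUND):
    """Worst-case estimate of the frequency above which a discrete array
    no longer reconstructs the wave front correctly: f = c / (2 * spacing).
    """
    if spacing_m <= 0.0:
        raise ValueError("spacing must be positive")
    return c / (2.0 * spacing_m)


# The two interacoustical distances quoted in this guide: about one meter
# for live sound, and 16 cm for the high-density frontal array.
for spacing in (1.0, 0.16):
    limit = aliasing_frequency_hz(spacing)
    print(f"{spacing:.2f} m spacing: reconstruction limit around {limit:.0f} Hz")
```

Under this estimate, dividing the spacing by roughly six raises the limit by the same factor, which is consistent with the idea that denser arrays extend correct reconstruction to a larger part of the spectrum.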
For standard sound reinforcement situations, an array with a larger distance between the loudspeakers (approximately one meter) offers satisfying results. Only the lower-pitched sounds will benefit from the ‘holophony’ effect. But even though this interacoustical spacing doesn’t allow a correct reconstruction of the wave front for high-pitched sounds, the full audio spectrum will still be perceived as coming from the direction of the sound source, thanks to the delays applied by the algorithm. Wave Field Synthesis therefore fulfills the main goal of offering accurate source localization to every audience member.