A New Monitoring Tool for 5.1 Audio

 

Richard Brice

 

Abstract

 

A new visual-display monitoring tool for sound engineers working with 5.1 audio is described. It combines the analytical power of the familiar stereo Lissajous display with the visualisation of the periphonic sound-field pioneered in the “Jellyfish” display. The background and theory are discussed and a practical analogue circuit implementation is given.

 

Background

 

Multi-channel audio has its historical roots in the cinema industry, where a sense of periphonic sound has long been thought a great benefit to the overall entertainment. Despite a multiplicity of products, a standard has gradually emerged which, whilst it fails to provide accurate periphonic localisation, nonetheless provides a degree of audio “envelopment” which is deemed by film makers and audiences alike to be the most important factor in the enhancement of their entertainment. That standard has become known as 5.1 multi-channel audio, the numbers referring to the fact that the system comprises five full-bandwidth channels and one reduced-bandwidth, low frequency enhancement (LFE) channel, arranged as shown in Fig. 1.

 

Fig. 1 The standard 5.1 listening arrangement

 

 

The low frequency enhancement (LFE) channel was originally termed the “Baby Boom” channel on its first adoption in Star Wars in the late nineteen-seventies, and is reserved and engineered to provide the physical sensation we associate with deep space explosions (albeit that these take place in a vacuum!).

 

 

Monitoring

 

Because of the increasingly widespread adoption of 5.1 audio, the requirement for a suitable monitoring device is becoming similarly widespread. At the present time, the most common approach is the presentation of three quasi-stereo channels, the 5.1 audio being broken down into three pairs in the following way:

 

 

Unfortunately, the presentation of these signals, either on peak-reading type or power-averaging type meters, is both difficult to interpret and gives very little visual information about the “enveloping” 5.1 sound-field.  An attempt has been made to improve upon this situation by DK-AUDIO A/S of Denmark in what they have termed the “Jellyfish display” as illustrated in Fig. 2. 

 

 

 

Fig. 2 The “Jellyfish display” from DK-AUDIO

 

In this computer-generated presentation, the positions of the five full-range loudspeakers are marked on a graticule and the amplitude distribution of the sound-field is used to modulate a visual “blob” which sits in the middle of the screen. This amplitude-induced distortion of the “blob” is very highly damped, such that if a signal of consistent energy is used to energise, for example, the left front loudspeaker, then a tentacle grows out of the blob in the direction of the speaker position. When energised with complex multi-channel programme the overall effect resembles a dancing jellyfish!

 

Whilst this approach is rather fun, in my own experimental 5.1 mixing sessions, I have found it to be not terribly useful.  The problem is, the damping is so high that the display fails to register all but the largest contours of programme dynamics. In addition, it simply displays the energy distribution about the periphery of the listening space: which is the one thing your ears can reliably tell you! What is required is a much “faster” display, and one which gives an indication of phase relationships between the channels.

 

Phase

 

The phase relationships that exist between channels of a multi-channel audio system represent critical information to a recording or quality-control engineer. This is because – although multi-channel audio systems are largely based on amplitude-derived stereophony – faults in microphone placement and in subsequent engineering and processing can produce phase-errors and anomalies that result in poor localisation or bass cancellation and comb-filter effects; especially when down-mixed to stereo or to mono. An indication of the phase relationships between the channels can alert the sound engineer to these possible problems in a way that tired, over-worked ears cannot always do.

 

The requirement to view the phase relationships between the channels of a multi-channel audio system relying on summing localisation has been well established for many years: especially so in television, where rapid quality judgements have to be made in perhaps less than ideal conditions. The solution is a display of a complex Lissajous Figure[1], derived from the left and right of the standard stereo inputs.  In this type of display the plates of an oscilloscope are fed with an amplified audio signal. This two-dimensional display has a particular advantage that it permits the engineer easily to inspect the degree to which the left and right signals are correlated; which is to say the degree to which a stereo signal contains in-phase, mono components and the degree to which it contains out-of-phase or stereo components.

 

In the usual arrangement, the Y plates inside the oscilloscope are driven with a signal that is the sum of the left and right input signals (suitably amplified). The X plates are driven with a signal derived from the stereo difference signal (R-L), as shown in Fig. 3. Note that the left signal alone will create a single moving line along the diagonal L axis as shown. The right signal does the same thing along the R axis. A mono (L = R) signal will create a single vertical line, and an out-of-phase mono signal will produce a horizontal line. A stereo signal produces a woolly ball centred on the origin, its vertical extent governed by the degree of L/R correlation and its horizontal extent governed by L/R de-correlation. And herein lies the polar display's particular power: it can be used to assess the character of a stereo signal, alerting the engineer to possible transmission or recording problems.
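The deflection arithmetic just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names `lissajous_point` and `trace` are hypothetical, amplifier gain is ignored, and the test signal is a single sine cycle.

```python
import math

def lissajous_point(left, right):
    """Map one pair of instantaneous L/R samples to an (x, y) deflection.

    x follows the stereo difference (R - L) and y the sum (L + R);
    any common amplifier gain is omitted.
    """
    return (right - left, left + right)

def trace(left_gain, right_gain, n=64):
    """Trace produced by one cycle of a test tone fed to L and R at the given gains."""
    points = []
    for k in range(n):
        s = math.sin(2 * math.pi * k / n)
        points.append(lissajous_point(left_gain * s, right_gain * s))
    return points

mono = trace(1.0, 1.0)        # L = R: vertical line (x stays at zero)
antiphase = trace(1.0, -1.0)  # L = -R: horizontal line (y stays at zero)
left_only = trace(1.0, 0.0)   # L alone: diagonal line on which y = -x
```

Feeding partially correlated signals to the same function would fill in the region between these lines, which is the "woolly ball" of ordinary stereo programme.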

 

 

Fig. 3 The Stereo Lissajous Display

 

The presentation of simultaneous left and right signals in a Lissajous display may usefully be thought of as the presentation of a complex plane, such that any instantaneous sound pressure, caused by the combination of the signals issuing from the left and right loudspeakers, may be thought of as a complex number in which the difference component is the real part and the sum the imaginary part.

 

 

A New Monitoring Display

 

This article outlines the development of a new visual display device for the mixing and quality-monitoring of 5.1 audio signals. It combines the agility of the peak programme meter, the Jellyfish display's presentation of the distribution of the overall sound field, and the analytical power of the complex Lissajous display.

 

 

Theory

 

Practical 5.1 audio systems create phantom auditory events on the periphery of the circle, on which lie the five cardinal loudspeaker positions, by means of a piecewise stereophony. A study of books and articles on 5.1 audio, as well as investigation of practical implementations, reveals that the orthodoxy is the following:

 

Phantom images are reliably created by energising, with appropriate amplitude differences, the two channels adjacent to the particular phantom position.

 

In the case of the forward arc, this is familiar from conventional stereophony. However, the Centre channel loudspeaker complicates the situation and this is dealt with later. Interestingly, contemporary usage tends still towards the use of conventional, two-loudspeaker stereophony (LF, RF only) for music-bed and front effects; with the Centre channel being reserved for dialogue or a mix of dialogue and an arithmetically derived average of left and right.  This theory is extended to cover the rear arc (between LS and RS) and for side images between LF and LS and RF and RS. This monitoring tool supports the orthodoxy of 5.1 multi-channel audio and reflects the theoretical and practical presentation of the virtual sound field according to that orthodoxy, as will be shown below.

 

The schematic of the initial, experimental prototype of the proposed monitoring system is given in Figure 4. Central to the concept are the half-wave rectifiers. Later I will give an analogue circuit implementation of the complete monitoring system, but both this rectification part and the subsequent display part could easily be adapted (and improved) using digital techniques and/or a software implementation.

 

 

 

 

Fig. 4 The initial experimental prototype

Why half wave rectification of each of the input signals?  The answer lies in the piecewise, two-channel stereophonic approach to periphony adopted in 5.1 audio.  Think back to the complex visual display used for stereophonic monitoring in which we saw two signals energising the X and Y plates of a cathode ray oscilloscope display.  As discussed, one way of rationalising this display was to imagine that one signal represented real values and the other imaginary, and that any instantaneous sound pressure - caused by the combination of these two signals - was represented at a point on the complex plane.  Half-wave rectification of the signals (the transformation of bipolar signals into unipolar ones) ensures that the presentation of the signals on the display is confined to one quadrant of the complex plane.  And this is exactly what is required given the piecewise, two channel stereophonic approach employed in 5.1 audio. 

 

In the experimental set-up shown in Figure 4, it can be seen that, with the appropriate direction of rectifier, LF and RF contribute the imaginary and real values in the first quadrant and LS and RS represent the real and imaginary components in the third quadrant. Note also that the Centre channel is added equally to the real and imaginary values in the first quadrant. This ensures that the Centre channel contributes only to a special vector at +π/4. (The Centre channel is shown greyed-out in Figure 4 because there is a limitation with the technique as shown, which will be dealt with below.)
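As a rough software analogue of this mapping, the following sketch sends each rectified channel to its own half-axis. It is an idealisation, not the circuit of Figure 4: the rectifiers are assumed perfect, the surround channels are assumed to be rectified and then inverted, the Centre channel is omitted, and the names `half_wave`, `display_point` and `tone` are hypothetical.

```python
import math

def half_wave(v):
    """Ideal half-wave rectifier: pass positive values, block negative ones."""
    return v if v > 0.0 else 0.0

def display_point(lf, rf, ls, rs):
    """Map instantaneous channel samples to a point (x, y) = (real, imaginary).

    LF drives +imaginary, RF +real, LS -real and RS -imaginary.
    The Centre channel is left out of this sketch.
    """
    x = half_wave(rf) - half_wave(ls)
    y = half_wave(lf) - half_wave(rs)
    return (x, y)

def tone(k, n=64, phase=0.0):
    """Sample k of a unit sine wave with n samples per cycle."""
    return math.sin(2 * math.pi * k / n + phase)

# Front pair only: every point has x >= 0 and y >= 0 (first quadrant).
front = [display_point(tone(k), tone(k, phase=1.0), 0.0, 0.0) for k in range(64)]
# Surround pair only: every point has x <= 0 and y <= 0 (third quadrant).
rear = [display_point(0.0, 0.0, tone(k), tone(k, phase=1.0)) for k in range(64)]
```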

 

Now consider a sound panned from front to back to the left of the listener. A little thought will demonstrate that this will appear on the display as a phasor of length l which rotates between the positive imaginary axis and the negative real one, as illustrated in Fig. 5.

 

Fig. 5 A front-to-rear left pan

 

 

In fact, ignoring the Centre channel for a moment and imagining the four remaining speakers as contributors in a conventional quadraphonic loudspeaker set up, one can imagine that, with the appropriate amplitude panning, it is possible to produce a phasor of a certain length rotating around the origin on the complex plane.  Clearly, this is exactly what the recording engineer requires; since it gives an accurate picture of the amplitude and position of phantom images within the listening “circle”.
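The rotation of the phasor during a side pan can be checked numerically. The sketch below works with channel gains rather than waveforms and assumes a constant-power (sine-cosine) law for the LF-to-LS pan; the law and the function names `left_side_pan_tip` and `polar` are assumptions made for illustration only.

```python
import math

def left_side_pan_tip(position):
    """Phasor tip for a source panned along the left side.

    position runs from 0.0 (fully LF) to 1.0 (fully LS). A constant-power
    (sine-cosine) law is assumed here for illustration.
    """
    phi = position * math.pi / 2
    g_lf, g_ls = math.cos(phi), math.sin(phi)
    # LF deflects along +imaginary (y); LS deflects along -real (x).
    return (-g_ls, g_lf)

def polar(point):
    """Return (length, angle in degrees) of a point on the display."""
    x, y = point
    return (math.hypot(x, y), math.degrees(math.atan2(y, x)))

for step in range(5):
    length, angle = polar(left_side_pan_tip(step / 4))
    print(f"pan {step / 4:.2f}: length {length:.3f}, angle {angle:.1f} deg")
```

Under this assumed law the phasor keeps unit length while its angle sweeps from the positive imaginary axis to the negative real axis, passing through 135 degrees at mid-pan.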

 

 

Phase display

 

It is instructive to consider the resulting display when, for example, two coherent signals are presented to adjacent stereophonic speakers but at different phase relationships. Here I will take the example of LS and RS. Imagine that LS is fed with a tone of 1 kHz and that RS is presented with a similar tone but with a varying phase relationship with respect to the signal in the LS channel. Experiments have shown that the resulting display is highly informative; it is summarised at five important phase relationships in Figure 6.

 

 

Fig. 6
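The experiment can be imitated in software. The sketch below assumes ideal rectifiers, equal-level tones and the five phase offsets 0, 45, 90, 135 and 180 degrees; those particular offsets and the names `surround_trace` and `off_axis_fraction` are assumptions, since the panels of Fig. 6 are not reproduced here.

```python
import math

def half_wave(v):
    """Ideal half-wave rectifier."""
    return v if v > 0.0 else 0.0

def surround_trace(phase_deg, n=360):
    """Trace for equal-level tones in LS and RS, RS leading by phase_deg."""
    phase = math.radians(phase_deg)
    points = []
    for k in range(n):
        t = 2 * math.pi * k / n
        ls, rs = math.sin(t), math.sin(t + phase)
        # LS deflects along -real (x), RS along -imaginary (y).
        points.append((-half_wave(ls), -half_wave(rs)))
    return points

def off_axis_fraction(points, eps=1e-9):
    """Fraction of samples lying strictly inside the quadrant (not on an axis)."""
    inside = sum(1 for x, y in points if abs(x) > eps and abs(y) > eps)
    return inside / len(points)

for phase_deg in (0, 45, 90, 135, 180):
    frac = off_axis_fraction(surround_trace(phase_deg))
    print(f"{phase_deg:3d} deg: {frac:.2f} of the trace lies off the axes")
```

In this model the in-phase case puts every off-axis sample on the 225 degree line, the 90 degree case traces a quarter-circle arc, and the anti-phase case leaves nothing off the axes.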

 

 

When both signals are entirely coherent and in phase, the result is a phasor at 225° (Fig. 6a). As the phase begins to change, the display changes with it and, at the 90° point, has become a hemi semi-circle (a quarter-circle) in the third quadrant (c). As the signals' phase relationship moves beyond the 90° point, the hemi semi-circle degenerates (d), becoming two entirely separate phasors on the real and imaginary axes (e). This is remarkably intuitive and correlates well with the subjective experience. This demonstrates that the proposed display combines the virtues of;

 

 

 

Incorporating the Centre channel

 

As described above, if the conventional, two-loudspeaker, stereo panning technique is used, the proposed display will accurately represent the perceived phantom image position as a phasor which rotates between the Y(I) axis and the X(R) axis. However, there exists a problem with the proposed display in relation to the way the Centre channel is introduced as shown above.

 

The Centre channel represents a complication in the piecewise-stereophonic approach of 5.1 theory, because sounds may be panned across the front arc in two ways. Imagine a left-to-right pan across the arc bounded by the loudspeakers LF and RF. This may be accomplished either by means of a conventional stereo pan between LF and RF, or as a pan from LF to C and thence to RF. Several authors recommend the second technique as the preferred method (Holman 2000). However, there is great disagreement between authors on the preferred control law (Rumsey 2001). Gerzon (1992) goes so far as to state that no simple law can ever exist for such a control.

 

 

Fig. 7

 

 

So what will the proposed display indicate as a source is panned between the three front loudspeakers? Fig. 7 represents the result of a LF to Centre pan[2] when the primitive circuit of Fig. 4 is employed. To understand the graph you have to read each point left to right as the position of the tip of the phasor as it moves from extreme left (LF) to centre (C) in quadrant 1 as the pan is operated, the various points indicating one-tenth of the overall pan-control rotation. The reason for this non-linear result is the effect of the rectifiers. Looking at the circuit in Fig. 4, you can see that the positive values presented to the Y(I) output will be the rectified result of the LF signal and the Centre signal and, in each case, the instantaneous, positive value will be whichever is the greater of these two signals. That accounts for the inflection in the curve in Fig. 7.
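The "greater of the two" behaviour is easy to reproduce. The following sketch assumes ideal diodes, in-phase signals, a constant-power law (as in footnote 2) and RF silent; `primitive_tip` is a hypothetical name and the numbers are those of this idealised model, not measurements from the circuit.

```python
import math

def primitive_tip(g_lf, g_c):
    """Phasor tip in the first quadrant for the primitive circuit.

    Each output is taken as the greater of the two rectified contributions.
    RF is silent in this pan, so x is driven by Centre alone.
    """
    x = max(0.0, g_c)   # greater of RF (zero here) and Centre
    y = max(g_lf, g_c)  # greater of LF and Centre
    return (x, y)

tips = []
for step in range(11):
    phi = (step / 10) * math.pi / 2           # pan rotation in tenths
    g_lf, g_c = math.cos(phi), math.sin(phi)  # constant-power law
    tips.append(primitive_tip(g_lf, g_c))

for step, (x, y) in enumerate(tips):
    print(f"{step / 10:.1f}: x = {x:.3f}, y = {y:.3f}")
```

In this model y falls until mid-pan, where the Centre signal overtakes LF, and then rises again, which is one way to picture the inflection described above.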

Fig. 8 – An improved circuit giving better results in the first quadrant

 

 

This result is neither accurate nor intuitive and something better is evidently required. One way of approaching the problem is to say, “what is really needed is a fifth oscilloscope deflection plate between the positive Y plate and the positive X plate, energised directly by the Centre signal” (as shown in Fig. A1 in the Appendix). Naturally this would be impossibly expensive and, in any case, unnecessary, because it is possible to produce an identical effect to this extra plate by energising the positive X and Y plates with a Centre signal multiplied by sine 45 degrees and cosine 45 degrees respectively[3]. This is accomplished in circuitry by rectifying the Centre signal, scaling its amplitude by 1/√2 and summing it with the signals for LF and RF to generate the X(R) and Y(I) signals for the first quadrant. A simplified circuit for so doing is given in Fig. 8.
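Written out, and assuming ideal rectification and unity-gain summing, the first-quadrant signals take the form below. The superscript + is notation introduced here to denote a half-wave rectified signal; it does not appear in the circuit diagrams.

```latex
\[
X(R) = RF^{+} + \frac{C^{+}}{\sqrt{2}}, \qquad
Y(I) = LF^{+} + \frac{C^{+}}{\sqrt{2}}
\]
% Centre alone, with rectified amplitude A:
\[
X = Y = \frac{A}{\sqrt{2}}
\;\Rightarrow\;
\sqrt{X^{2} + Y^{2}} = A, \qquad \arg = 45^{\circ}
\]
```

A Centre-only signal therefore appears as a phasor of its own amplitude at 45 degrees, exactly as a dedicated plate at that angle would show it.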

 

Interestingly, this alternative approach has exposed the unexpected result that, depending on the pan law, at some phasor arguments, the phasor magnitude appears distorted as revealed in Fig. 9. In the figure, the phasor positions are given for twenty points between full LF and RF passing through Centre. Two laws are plotted, constant-power (diamonds) and constant-gain (squares).

 

 

 

Fig. 9 – Various pan laws as displayed on the proposed display

 

 

However, as Fig. 9 reveals, in spite of the phasor amplitude distortion, the phasor arguments accord well with the pan-control deflection irrespective of pan-law, and this is a big improvement over the original circuit. Furthermore, it is widely recognised that neither the piecewise, constant-power (sine-cosine-sine) approach nor the constant-gain approach is ideal for pan-controls for three loudspeakers in the forward arc, both being deplored on psychoacoustic (Rumsey 2001) and theoretical (Gerzon 1992) grounds. Certainly the phasor distortion on the proposed display during LF-C-RF pans supports the reservations of these authors. (See Appendix 1 for a further discussion.)
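The behaviour of the two laws on the improved display can be explored with a short calculation. The sketch below is an idealised model (perfect rectifiers, in-phase signals, the LF-to-Centre half of the pan only); the function names are hypothetical and the figures it prints are not taken from Fig. 9.

```python
import math

def improved_tip(g_lf, g_c, g_rf=0.0):
    """Phasor tip with Centre resolved onto X and Y at 1/sqrt(2) each."""
    x = g_rf + g_c / math.sqrt(2)
    y = g_lf + g_c / math.sqrt(2)
    return (x, y)

def constant_power(t):
    """LF and Centre gains for pan position t in [0, 1] (sine-cosine law)."""
    return (math.cos(t * math.pi / 2), math.sin(t * math.pi / 2))

def constant_gain(t):
    """LF and Centre gains that always sum to one."""
    return (1.0 - t, t)

def sweep(law, steps=10):
    """Return (magnitude, argument in degrees) at each pan step from LF to C."""
    out = []
    for k in range(steps + 1):
        x, y = improved_tip(*law(k / steps))
        out.append((math.hypot(x, y), math.degrees(math.atan2(y, x))))
    return out

for name, law in (("constant-power", constant_power), ("constant-gain", constant_gain)):
    for k, (mag, arg) in enumerate(sweep(law)):
        print(f"{name:15s} step {k:2d}: |z| = {mag:.3f}, arg = {arg:5.1f} deg")
```

In this model both laws move the argument steadily from 90 to 45 degrees and agree at mid-pan, while the magnitude departs from unity in opposite directions: it swells under constant-power and dips under constant-gain.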

 

 

The Final Design

 

Comparing Figs. 5 and 6 with the disposition of the speakers in relation to the listener as shown in Fig. 1, it is evident that rotating the entire display by 45 degrees would be an advantage to the user, because the mapping of the resulting phasors would better coincide with the listening room arrangement. Now, any point (x , y) may be rotated by any angle (a) by means of the matrix multiplication:

 

\[
(x,\ y)
\begin{pmatrix}
\cos a & \sin a \\
-\sin a & \cos a
\end{pmatrix}
= (x_R,\ y_R)
\]

where \((x_R,\ y_R)\) is the point after rotation.

 

Because the required rotation is constant, these multiplications are constant too, and simply represent the appropriate scaling of the summing resistors shown in Fig. 11, which is the final, prototype circuit. Notice too that a precision rectifier circuit has replaced the simple rectifiers of the initial prototype. This is important because the complex exponentials in the two (I / Vak) relationships of the simple diodes translate to strange curly things when viewed on the complex plane!
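The effect of the fixed rotation can be checked with the matrix written out as code. The sketch assumes a rotation angle of +45 degrees in the row-vector convention used above (the sign is an assumption) and unit deflections for the four corner channels; `rotate` and `speakers` are hypothetical names.

```python
import math

def rotate(point, angle_deg):
    """Rotate (x, y) using the row-vector convention of the text:

    (x, y) . [[cos a, sin a], [-sin a, cos a]] = (xR, yR)
    """
    a = math.radians(angle_deg)
    x, y = point
    x_r = x * math.cos(a) - y * math.sin(a)
    y_r = x * math.sin(a) + y * math.cos(a)
    return (x_r, y_r)

# Unit deflections before rotation (Centre excluded; it lies at 45 degrees).
speakers = {"LF": (0.0, 1.0), "RF": (1.0, 0.0), "LS": (-1.0, 0.0), "RS": (0.0, -1.0)}
rotated = {name: rotate(p, 45.0) for name, p in speakers.items()}
for name, (x, y) in rotated.items():
    print(f"{name}: ({x:+.3f}, {y:+.3f})")
```

With this choice of sign LF lands upper left, RF upper right, LS lower left and RS lower right, and a Centre phasor at 45 degrees would point straight up. Every coefficient has magnitude cos 45° = sin 45°, which is why the rotation reduces to fixed resistor ratios.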

 

Further Improvements

 

Nowadays, an analogue implementation is outdated but, as stated above, these ideas are easy to translate to either a digital hardware or a software implementation. One limitation of the practical implementation described above is that it does not monitor the LFE channel; this might easily be addressed in a software implementation by adding a conventional linear PPM meter. The proposed final graticule for the new monitoring device is shown in Fig. 10.

 

 

Fig. 10 The final graticule for the new monitoring device


Fig. 11 – Full circuit for prototype display device

 


Appendix 1

 

 

Is there a better LF-C-RF pan-law?

 

Given that the constant-power (sine-cosine) panning law is universally accepted as being the best approach to standard, two-channel stereo panning, and that this law produces on the proposed display a result that is both theoretically and perceptually justified, it seems not unreasonable to turn the argument “on its head” and use the display to suggest an improved panning law which, when employed, will produce a similarly consistent result when displayed in the manner described in this article.

 

Fig. A1 – The proposed display rationalised as a five-plate oscilloscope

 

 

If – as shown in Figure A1 - we imagine the proposed display as a five-plate oscilloscope with an extra plate placed at +45 degrees between the positive X and Y plates (and it is mathematically justifiable to think of it in this way), we can solve simultaneous equations to find the ideal law so that we get an identical display for the three-channel, LF-C-RF pan and for the two-channel LF-RF pan.

 

This requires that we resolve any phasor into component phasors separated by 45 degrees. In Fig A2, phasor (r , s) is the resultant of phasor (c , d) and (a , b).

 

 

Fig. A2 - Phasor (r , s) is the vector sum of phasor (c , d) and (a , b)

 

More generally, we can say that,

 

r = a + c and

s = d + b

 

We need to solve for a, b, c and d in terms of r and s. But there are too many unknowns. Fortunately, we can simplify. Firstly, because one component of the vector is always aligned with the Y axis, the value of c will always be zero. Secondly, because the second component is always at 45 degrees, a will always equal b. So we can re-write this,

 

                        r = a , or  a = r   and

 

                        s = d + a  , or,  s = d + r

 

                        d = ( s – r )

 

But, because we are thinking in terms of the voltages applied to these plates, we must think in terms of the magnitudes of each phasor, M(Y) and M(C). Because c is always zero, M(Y) = d. The magnitude of the phasor (a , b) is given by,

\[ M(C) = \sqrt{a^{2} + b^{2}} \]

which simplifies (because a = b) to,

\[ M(C) = \sqrt{2}\,a = \sqrt{2}\,r \]

 

We can now solve for a and d so that the phasor (r , s) describes a circular path, just as it does when it is generated using the sine-cosine relationship in X and Y. The results are given in Fig. A3.

 

 

 

 

Fig. A3 – A proposed new panning law

 

Note that the LF and RF responses are based on a (sin θ - cos θ) curve and the Centre channel pan is based on an amplified version of the cosine curve between 0 and 45 degrees and its mirror image (both these functions having a similar form). Happily for the analogue designer, the results are very nearly linear and a proposed circuit is given in Fig. A5.
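The derivation can be checked numerically for the LF-to-Centre half of the pan. The sketch below assumes the phasor angle θ is measured from the positive X axis, so that 90 degrees is full LF and 45 degrees is full Centre; that convention and the names `proposed_law` and `displayed` are assumptions made for illustration.

```python
import math

def proposed_law(theta_deg):
    """Plate magnitudes (M_Y, M_C) for a phasor at theta_deg, 45 <= theta <= 90.

    theta is measured from the positive X axis, so 90 degrees is full LF
    and 45 degrees is full Centre (an assumed convention).
    """
    theta = math.radians(theta_deg)
    r, s = math.cos(theta), math.sin(theta)  # target point on the unit circle
    m_y = s - r                              # d = s - r
    m_c = math.sqrt(2) * r                   # M(C) = sqrt(2) * r
    return (m_y, m_c)

def displayed(m_y, m_c):
    """Recombine the Y-plate and 45-degree-plate deflections into (x, y)."""
    return (m_c / math.sqrt(2), m_y + m_c / math.sqrt(2))

for theta_deg in (90, 80, 70, 60, 50, 45):
    m_y, m_c = proposed_law(theta_deg)
    x, y = displayed(m_y, m_c)
    print(f"theta {theta_deg}: LF = {m_y:.3f}, C = {m_c:.3f}, |z| = {math.hypot(x, y):.3f}")
```

Under these assumptions the recombined phasor keeps unit length across the whole pan, which is the property the simultaneous equations were set up to deliver.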

 

The results shown in Fig. A4 are plotted on the same axes as Fig. 9 to compare this proposed law with the constant-power and constant-gain regimes discussed in the main text. For comparison, the results of the circuit are plotted with the theoretical curve in Fig. A6.

 

Returning to the analogy of the five-plate oscilloscope, the proposed display is a consistent analogue of the physical, acoustical situation when five loudspeakers are arranged as in a 5.1 set-up. It thereby represents a consistent, theoretical framework for analysing the “pan problem” discussed by the authors in the references. Therein lies the possible legitimacy of the new pan-law. Nevertheless, the display is not a perceptual model and the validity (or otherwise) of the proposed pan-law would need to be tested in listening experiments.

 

 

Fig. A4 – The result of the new panning law when displayed on proposed display (triangles). Constant-gain (squares) and constant-power laws are plotted for comparison.

 

 

Fig. A5 – Proposed new pan control and values

 

 

 

 

Fig. A6 – Practical pan-control law and its display on proposed display (theoretical curve is also given – triangles)

 

 

 

 

 

 

References

 

Gerzon, M. A. (1992) Panpot laws for multispeaker stereo. 92nd Convention of the Audio Engineering Society, Vienna. Preprint 3309.

Holman, T. (2000) 5.1 Surround Sound Up and Running. Focal Press.

Rumsey, F. (2001) Spatial Audio. Focal Press.

 



[1] After Jules Lissajous, who was professor of mathematics at the Lycée Saint-Louis in Paris in the mid eighteen-hundreds and who studied vibrations by means of mirrors attached to perpendicularly mounted tuning forks of related frequencies.

[2] Assuming a constant-power (sine-cosine) law.

[3] Assuming linear amplifiers and linear deflection.