Demystifying the black art of digital audio

Alan Wheable

Published 1st August 2013

Issue 79 - July 2013
At the pure analogue end of the audio world (which these days is very narrow, with much of the audio content being captured digitally) it is quite straightforward to check that an audio channel is present, that it sounds right and that it has the right levels. Likewise, if the audio channel has hum or a buzz on it, it is relatively easy to identify the problem and fix it.
In the digital audio world, where multiple digital audio channels are being delivered at different bit rates, as PCM or in a variety of compression formats, as mono, stereo or surround sound (in many flavours) for use in HDTV, DVD, cable, satellite, internet, mobile phone, etc., it can be nearly impossible to spot where things are going wrong, let alone work out how to fix them.
The daily use of technologies such as Dolby E, Dolby Digital and Dolby Digital Plus in all areas of broadcast television content delivery requires an understanding of the importance of metadata and of the transport mechanism, as well as the tools to check them.
Digital audio is a very broad and sophisticated subject area, so here I will focus only on the specific challenges around Dolby E, Dolby Digital and Dolby Digital Plus as part of an SDI transport stream.

Dolby formats in common use in media distribution

There are a number of Dolby digital audio standards that transport audio data within the SMPTE 337M-2000 data burst or AES audio and embedded within the SDI data stream. These include:
- Dolby E
- Dolby Digital
- Dolby Digital Plus
These standards can be used to transport mono, stereo, 5.1 and 7.1 audio programmes:
Dolby 5.1 uses five channels for normal-range speakers (20 Hz to 20,000 Hz): right front, centre, left front, surround right and surround left. It uses one further channel (20 Hz to 120 Hz) for the subwoofer-driven low-frequency effects.
Dolby 7.1 (non-broadcast) uses six channels in the primary programme (Independent Substream) for a standard 5.1 surround sound mix, and the remaining two channels in an ancillary programme (Dependent Substream) to provide the additional down-mix version.
Dolby E
Dolby E is a video frame-based audio encoding and decoding technology developed by Dolby Laboratories that allows up to 8 channels of audio (mono, stereo, 5.1 or 7.1) for a primary programme (Programme 1) and optional ancillary programmes. These 8 channels are compressed into a digital stream that can be transferred between compatible devices and stored on a standard stereo pair of audio tracks.
Dolby E is primarily a production format that allows the relatively transparent movement of a finished Dolby Surround Sound audio programme, via SD-SDI, HD-SDI or 3G-SDI, through the production chain until it reaches the point of transmission where it is converted to Dolby Digital or Dolby Digital Plus. As Dolby E is frame-based it allows simple editing such as cuts that do not affect the audio.
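To put the "8 channels on a stereo pair" point in numbers, here is a rough bit-budget sketch. The 48 kHz sample rate and 20-bit word length are assumptions chosen for illustration, not figures from this article, and the calculation ignores the guard band and metadata overhead, so the real reduction has to be somewhat greater than the figure shown.

```python
# Rough bit budget for carrying an 8-channel programme on one AES pair.
# Assumed figures (not from the article): 48 kHz sampling, 20-bit words.
SAMPLE_RATE_HZ = 48_000
BITS_PER_SAMPLE = 20
CARRIER_CHANNELS = 2      # one stereo pair of audio tracks
PROGRAMME_CHANNELS = 8    # up to 8 channels, as described above


def pcm_bit_rate(channels, sample_rate=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE):
    """Bit rate in bit/s of uncompressed PCM with the given layout."""
    return channels * sample_rate * bits


carrier_rate = pcm_bit_rate(CARRIER_CHANNELS)    # capacity of the stereo pair
source_rate = pcm_bit_rate(PROGRAMME_CHANNELS)   # the 8 channels as plain PCM
min_ratio = source_rate / carrier_rate           # reduction needed, before overhead

print(f"carrier capacity : {carrier_rate / 1e6:.2f} Mbit/s")
print(f"8-channel PCM    : {source_rate / 1e6:.2f} Mbit/s")
print(f"minimum reduction: {min_ratio:.0f}:1")
```

Under these assumptions the stereo pair offers 1.92 Mbit/s against 7.68 Mbit/s of source PCM, so the coding has to achieve at least a 4:1 reduction.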
Dolby Digital
Dolby Digital (AC-3) is a perceptual audio system for digital audio that allows the reduction of data needed to deliver high-quality sound. This system relies on the fact that the human ear will screen out certain levels of sound that are perceived to be noise. The removal of this noise reduces the amount of data needed to deliver the sound to the listener. This system was developed primarily for DTV, DVD and HDTV.
Dolby Digital technology was developed by Dolby Laboratories to allow up to six channels of sound (mono, stereo or 5.1) in the form of a single programme that can be delivered at different bit rates. These 6 channels are compressed (lossy) into a digital stream that can be broadcast.

Dolby Digital Plus
Dolby Digital Plus (E-AC-3) is a more advanced version of Dolby Digital with a more efficient encoding algorithm. It provides enough bandwidth to support sophisticated multi-programme content combining mono, stereo, 5.1, 7.1 and 13.1 for a primary programme (Programme 1) and optional ancillary programmes, and it can be delivered at much lower bit rates than Dolby Digital. These programme channels are compressed (lossy) into multiple independent digital data streams plus up to 8 dependent substreams that can be transferred between compatible devices and stored on a standard stereo pair of audio tracks.

HANC Encoding

Unlike PCM audio, the Dolby data burst contains both the encoded audio channels and metadata. This Dolby metadata carries specific information about the encoded surround sound audio, including the Dolby encoding method, the number and type of audio channels and the specific matrix coefficients required to re-assemble the surround sound audio at the receiver. The Dolby metadata is delivered to the receiver with the encoded audio channels to ensure the correct audio levels and the correct channel separation.

The complexity of Dolby encoding, its metadata and its transport as HANC SMPTE 337M data packets embedded within the SDI data means that it is susceptible to video timing and switching issues, to decoding and encoding, and to the insertion of additional broadcast metadata within the broadcast chain. It is therefore important to be able to analyse the metadata at each stage to ensure that all data is transmitted transparently and decoded successfully at its final destination.
The header information within the Dolby E and Dolby Digital encoded audio package is used by Dolby decoders to identify the specific encoding method used.
In the case of Dolby E, if this header information is missing then the audio is assumed to be PCM, which can cause a full-scale noise burst that can damage audio monitoring equipment. Only when the next frame with a valid Dolby E header appears will the Dolby circuitry be able to decode the data correctly.
In the case of Dolby Digital and Dolby Digital Plus, if this header information is missing, or if the Dolby programme is interrupted, then the audio is assumed to be PCM, which can cause a full-scale noise burst that can damage audio monitoring equipment. Only when the next valid Dolby Digital header appears will the Dolby circuitry decode the data correctly. Note that this occurs every 3072 audio data samples.
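The header detection described above can be sketched as a scan for the SMPTE 337 preamble. This is a minimal illustration, not how any particular decoder is implemented. It assumes 16-bit mode, where the sync words are Pa = 0xF872 and Pb = 0x4E1F, followed by a burst-info word (Pc) and a length word (Pd). The data-type codes in the table are assumptions that should be checked against SMPTE 338 before being relied on, and the function names are hypothetical.

```python
# SMPTE 337 preamble sync words in 16-bit mode.
PA_16 = 0xF872
PB_16 = 0x4E1F

# Data-type codes carried in the low 5 bits of Pc.
# ASSUMPTION: verify these values against SMPTE 338 before relying on them.
DATA_TYPES = {
    1: "Dolby Digital (AC-3)",
    16: "Dolby Digital Plus (E-AC-3)",
    28: "Dolby E",
}


def find_bursts(words):
    """Return (index, data_type_name, pd_value) for each Pa/Pb preamble found.

    `words` is a sequence of 16-bit integers taken from the audio pair.
    """
    bursts = []
    for i in range(len(words) - 3):
        if words[i] == PA_16 and words[i + 1] == PB_16:
            pc, pd = words[i + 2], words[i + 3]
            data_type = pc & 0x1F  # low 5 bits identify the payload type
            name = DATA_TYPES.get(data_type, f"unknown type {data_type}")
            bursts.append((i, name, pd))
    return bursts


def classify(words):
    """Name the first burst found, or report that no preamble is present."""
    bursts = find_bursts(words)
    if bursts:
        return bursts[0][1]
    return "no preamble found: would be treated as PCM"


# A made-up fragment: two words of silence, then a preamble and payload.
example = [0, 0, PA_16, PB_16, 0x0001, 0x3000, 7, 7]
print(classify(example))
print(classify([0, 1, 2, 3]))
```

If the preamble is lost, the second case applies: nothing marks the words as non-PCM, which is why a missing header can end up being played out as noise.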

Things to look out for
In most cases it has to be assumed that the actual Dolby Surround Sound programme data is correct, as it is difficult to interpret without first decoding it to its baseband channels. When moving SDI video containing Dolby audio around a facility, the highest risks of failure are likely to be caused by timing issues that critically affect the detection of the Dolby header, as well as interruptions or corruptions of the data stream.

Dolby E Framing Values

A Dolby E encoded audio programme is a video frame-based system whose data occupies the area of ancillary data normally occupied by the AES/EBU PCM audio. Unlike PCM audio that can tolerate video switching anywhere within a large range of lines in the vertical interval, Dolby E has a narrow guard band which contains no audio and in which video switching can take place without loss of critical Dolby header information.
The Dolby Reference Point (immediately following the Dolby guard band) is where the encoded Dolby E audio packet starts. This can be defined as specific video lines and timed approximately 700 µs ±80 µs from the SMPTE RP156 Reference Point. It is important for the Dolby E packet to be positioned well away from the video switching line so that Dolby E packets are not corrupted by downstream switchers. Test equipment such as the Sx hand held range with Dolby option can measure the timing of the Dolby E packet relative to the SDI input or the External reference as shown in Figure 1.
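As a worked example of the timing check, the sketch below reads the figures above as a nominal 700 microseconds with a tolerance of 80 microseconds (the units are an assumption here) and converts a measured offset into video lines. The 1125-line, 25 frames-per-second raster is likewise an assumed example format, and the function names are hypothetical.

```python
def line_duration_us(total_lines, frame_rate):
    """Duration of one video line in microseconds."""
    return 1_000_000 / (total_lines * frame_rate)


def in_window(offset_us, nominal_us=700.0, tolerance_us=80.0):
    """True if a measured packet offset lies within nominal +/- tolerance."""
    return abs(offset_us - nominal_us) <= tolerance_us


def offset_in_lines(offset_us, total_lines=1125, frame_rate=25):
    """Express a time offset as a (fractional) number of video lines."""
    return offset_us / line_duration_us(total_lines, frame_rate)


# Assumed example raster: 1125 total lines at 25 frames per second.
print(f"line duration : {line_duration_us(1125, 25):.3f} us")
print(f"700 us        : {offset_in_lines(700):.2f} lines")
print(f"window        : {offset_in_lines(620):.2f} to {offset_in_lines(780):.2f} lines")
print(f"measured 790 us in window? {in_window(790)}")
```

On that raster a line lasts about 35.6 microseconds, so the nominal position sits a little under 20 lines after the reference point and the tolerance window spans roughly four and a half lines.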
Corruption of Dolby Metadata
As mentioned earlier, the Dolby metadata contained within the audio data burst is as important as the encoded audio itself. Equipment such as play-out servers, which store complete Dolby Digital and Dolby Digital Plus video/audio programmes, must deliver the Dolby data burst exactly as it was created, otherwise it will not be decoded correctly at the receiver.

Equipment that decodes and re-encodes the video/audio data stream must re-assemble the data stream exactly to ensure that it can be decoded by the receiver as intended.
It can be difficult to inspect or interpret a Dolby data stream as it passes through a broadcast chain. It can be more useful to inject a known Dolby programme with known metadata values and then check at each stage in the broadcast chain that the injected programme and metadata are unchanged.
Equipment such as the PHABRIX Sx hand held and PHABRIX Rx rack mount systems allow the generation of Dolby E, Dolby Digital and Dolby Digital Plus metadata and test tones embedded within an SDI data stream so that closed loop testing of any equipment can be performed.
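The closed-loop idea can be sketched in a few lines: hold the metadata of the injected test programme as a reference, read it back at each point in the chain, and report the first stage at which anything differs. The field names, values and stage names below are hypothetical placeholders, not the output of any particular analyser.

```python
def diff_metadata(reference, measured):
    """Return {field: (expected, actual)} for every field that differs."""
    fields = sorted(set(reference) | set(measured))
    return {
        f: (reference.get(f), measured.get(f))
        for f in fields
        if reference.get(f) != measured.get(f)
    }


def check_chain(reference, stages):
    """Find the first stage whose metadata no longer matches the reference.

    `stages` is a list of (stage_name, metadata_dict) in signal-flow order.
    Returns (stage_name, differences), or None if every stage matches.
    """
    for name, measured in stages:
        changes = diff_metadata(reference, measured)
        if changes:
            return name, changes
    return None


# Hypothetical metadata of the injected test programme.
reference = {"format": "Dolby Digital", "channel_mode": "3/2", "lfe": True, "dialnorm": -27}

# Hypothetical readings taken along the chain.
stages = [
    ("ingest", dict(reference)),
    ("playout server", {**reference, "dialnorm": -31}),
    ("encoder output", dict(reference)),
]

print(check_chain(reference, stages))
```

Reporting only the first mismatching stage keeps the result pointed at the piece of equipment that introduced the change rather than at everything downstream of it.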

Dolby CRC and SMPTE Pa/Pb Sync Word Spacing
There are a number of critical measurements that can be used to ensure that the Dolby data burst is correct:
The Dolby data burst provides a CRC word for each data burst. This allows the integrity of the Dolby audio data to be checked by the receiving equipment.
With Dolby Digital and Dolby Digital Plus, the audio data burst is a constant length each frame and is bounded by the SMPTE 337M-2000 Pa and Pb sync words. If this spacing is incorrect, or changes during a programme, it can indicate that the data burst is not generated correctly or that it is corrupt. It may not be possible to detect random Pa/Pb spacing changes or CRC errors during a programme transmission, so it may be necessary to log these errors for later analysis.
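Both measurements can be illustrated with a short sketch. The first part logs any Pa/Pb preamble whose distance from the previous one differs from the expected spacing, assuming 16-bit mode sync words. The second part shows the general principle of a CRC check using a generic CRC-16. The polynomial 0x8005, the zero initial value and the span of data covered are assumptions for illustration only, and the actual Dolby CRC rules are set by the format specifications. All function names are hypothetical.

```python
PA_16, PB_16 = 0xF872, 0x4E1F  # SMPTE 337 sync words, 16-bit mode


def preamble_positions(words):
    """Indices at which a Pa word is immediately followed by Pb."""
    return [i for i in range(len(words) - 1)
            if words[i] == PA_16 and words[i + 1] == PB_16]


def spacing_errors(words, expected=None):
    """Return (position, spacing) for each preamble at an unexpected spacing.

    If `expected` is None, the first observed spacing is used as reference.
    """
    pos = preamble_positions(words)
    spacings = [b - a for a, b in zip(pos, pos[1:])]
    if not spacings:
        return []
    ref = spacings[0] if expected is None else expected
    return [(p, s) for p, s in zip(pos[1:], spacings) if s != ref]


def crc16(data, poly=0x8005):
    """Generic MSB-first CRC-16 with zero initial value (assumed parameters)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def burst_is_intact(payload, crc_word):
    """A payload followed by its own CRC leaves a zero remainder."""
    return crc16(payload + crc_word.to_bytes(2, "big")) == 0


# Made-up stream: three bursts 8 words apart, then one arriving 2 words late.
burst = [PA_16, PB_16, 0x0001, 64, 0, 0, 0, 0]
stream = burst * 3 + [0, 0] + burst
print(spacing_errors(stream))

payload = bytes(range(32))
crc_word = crc16(payload)
corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]
print(burst_is_intact(payload, crc_word), burst_is_intact(corrupted, crc_word))
```

A real logger would also have to allow for payload data that happens to contain the Pa/Pb pattern, for example by skipping ahead by the burst length once a preamble has been found.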
Dolby Programme Combinations
Dolby programmes can consist of mono channels, stereo pairs, 5.1 surround sound, 7.1 surround sound and even 13.1 surround sound as Main, Dependent or Independent programme streams. A large number of programme combinations are permitted by Dolby E, Dolby Digital and Dolby Digital Plus, and equipment manufacturers and installers therefore need to test all of these combinations to ensure that they are compliant. Products such as the PHABRIX Sx hand held and Rx rack mount equipment allow the automated testing of each of the possible combinations to ensure compliance.
Audio Buzz
When Dolby encoded audio is not processed or decoded correctly it can produce a distinctive audio buzz. If the audio programme is interrupted, the decoder may repeat the same audio segment until the programme is restored.

Conclusion
At first sight Dolby audio encoding looks like a black art, but in reality it is only a black art if you don't have an understanding of the principles that have been adopted and the equipment to analyse it. As I said in my introduction, with analogue audio, if you can hear the audio you can tell if it is all right. With Dolby E, Dolby Digital and Dolby Digital Plus it's a bit like the Matrix, where you have to concentrate on the moving letters and symbols to see the picture. Or in this case see the audio[1].
Digital audio is here to stay and will steadily become more sophisticated as more and more programme content is delivered over increasingly narrow channels but with the expectation of increasingly higher quality. To meet this expectation we need to tool up, both in our understanding of digital audio and in the equipment required to analyse it.
[1] If you view the audio data stream on a monitor as pulse cross, it looks a bit like the cascading symbol view of the Matrix from the film of the same name.
[2] Dolby® E, Dolby Digital and Dolby Digital Plus are trademarks of Dolby Laboratories.

© KitPlus (tv-bay limited). All trademarks recognised. Reproduction of this content is strictly prohibited without written consent.