Audio Basics

Some detail-motivated people, myself included, find themselves on a life quest for good sound. One could consider it a hobby. We seek all the musical clarity and detail that lies within the response limits of our hearing, the highest highs and the lowest lows. We seek musicality without coloration or distortion. We seek the maximum impact of musical transients. We seek all the subtleties of room acoustics, and of the shadings/sibilants of the human voice. We want to sense that the music makers are playing in the room with us.

We normally do not use background music just to keep us company. We know that only through intense listening can we truly appreciate what the composer and performers are trying to communicate. It is this focus that causes us to demand as much authenticity as we can achieve. (And because we can imagine the performers in the room with us, listening intently also demonstrates good manners. Most public performers learn to tolerate unmannerly audiences, but are genuinely appreciative of those that at least pretend to pay attention.)

Good sound always begins with a faithfully recorded musical performance. For the last 60 years or so, two-channel (stereo) music has provided the optimal music reproduction experience to the most people. Stereo music was initially recorded onto analog master tapes, then reproduced on vinyl records. In the 1980s, the CD was introduced to store a digitized version of the stereo analog tape signals. Later, the recording itself was made direct-to-digital (as indicated by the first D in the DDD SPARS code on a CD).

To most, a CD sounded cleaner (fewer media-induced artifacts) than the analog vinyl and tape it replaced. Better yet, the CD would not degrade with use and could be copied exactly. It was and still is the holy grail for most good-sound seekers. The limitations of human hearing, which grow with age, lessen any perceivable benefit of higher-end digital technologies.

How does one digitize sound? An analog waveform is converted to digital information by sampling the amplitude of the waveform at a high frequency and recording each sampled amplitude as a number. Each sample occupies some number of bits (its resolution) within the digital audio stream, usually 16 to 24 bits for PCM formats and as few as 1 bit for the DSD format described later. In linear PCM, each sample records the instantaneous amplitude of the waveform on a linear scale, not a difference from neighboring samples.
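As a rough illustration of the sampling step, here is a minimal Python sketch that samples a sine wave and stores each sample as a signed integer. The function names (`quantize`, `sample_sine`) and the 1 kHz test tone are my own choices for the example, and a real converter does much more (anti-alias filtering, dither), so treat it as a sketch only.

```python
import math

def quantize(x, bits):
    """Map an amplitude in [-1.0, 1.0] to a signed integer code of `bits` bits."""
    max_code = 2 ** (bits - 1) - 1      # 32767 for 16 bits
    x = max(-1.0, min(1.0, x))          # clamp out-of-range input
    return round(x * max_code)

def sample_sine(freq_hz, sample_rate_hz, bits, n_samples):
    """Sample a full-scale sine wave and quantize each sample."""
    return [
        quantize(math.sin(2 * math.pi * freq_hz * n / sample_rate_hz), bits)
        for n in range(n_samples)
    ]

# About one millisecond of a 1 kHz tone at CD resolution (16 bits, 44.1 kHz).
samples = sample_sine(1000, 44_100, 16, 44)
```

Each number in `samples` is one amplitude reading. More bits give finer amplitude steps, and a higher sample rate gives more readings per second.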

Thus there are two parameters that characterize a digital audio format: sample resolution and sample frequency. The more bits of resolution and the higher the sampling frequency provided by a digital format, the truer the digital waveform. The CD digital format is linear pulse code modulation (LPCM), as specified by the Red Book standard for audio CDs. CD LPCM uses a sampling frequency of 44.1 kHz and a sample size of 16 bits per channel. Most people think well-produced CD sound is excellent, but it requires large bandwidth (about 1.4 Mbps for stereo) and storage size (about 10 MB per minute).
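The bandwidth and storage cost of CD audio follows directly from those two parameters and the channel count. A small Python sketch of the arithmetic (the helper names are mine, purely for illustration):

```python
def lpcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Bits per second of an uncompressed LPCM stream."""
    return sample_rate_hz * bits_per_sample * channels

def lpcm_bytes(seconds, sample_rate_hz, bits_per_sample, channels):
    """Bytes needed to store `seconds` of uncompressed LPCM audio."""
    return lpcm_bit_rate(sample_rate_hz, bits_per_sample, channels) * seconds // 8

cd_bps = lpcm_bit_rate(44_100, 16, 2)               # 1_411_200 bits per second
cd_bytes_per_minute = lpcm_bytes(60, 44_100, 16, 2)  # 10_584_000 bytes
```

So stereo CD audio runs at about 1.4 Mbps and uses a little over 10 MB per minute of music, before any error-correction overhead on the disc itself.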

With the advent of personal music players with small flash memory, compressed music formats such as MP3 (don’t ask) have become prevalent in the last decade. Most music available for online download is encoded as MP3 or, more recently, in Advanced Audio Coding (AAC, the format for current iTunes downloads). For MP3/AAC, it is claimed that only inaudible information (sounds masked by louder ones, and very high frequencies) is removed during compression.

MP3/AAC comes in various qualities from iffy to good-enough, depending on the chosen encoding bit rate and on whether a variable bit rate (VBR) is used. 192 kbps is the lowest MP3 bit rate that typically will satisfy critical listening by the average consumer. At 256 kbps (the default iTunes AAC bit rate), I have found no credible reports of people being able to identify differences between the MP3/AAC encoding and a CD of the same music.
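To put those bit rates in perspective, it helps to compare them with the raw data rate of stereo CD audio (44.1 kHz, 16 bits, 2 channels). The sketch below is only arithmetic; the function name is my own, and the ratio says nothing about how good a given encoder sounds.

```python
CD_KBPS = 44_100 * 16 * 2 / 1000   # raw stereo CD audio, 1411.2 kbps

def reduction_vs_cd(lossy_kbps):
    """How many times smaller a lossy stream is than raw CD audio."""
    return CD_KBPS / lossy_kbps

for kbps in (128, 192, 256):
    print(f"{kbps} kbps: about {reduction_vs_cd(kbps):.1f}x smaller than CD audio")
```

Even at 256 kbps the encoder is discarding more than four fifths of the original data, which is why the choice of what to discard matters so much.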

For all but true golden ears, the advantages of a lowered storage requirement probably outweigh any disadvantages of ‘good enough’ compressed formats. My experience, however, has been that low-bit-rate MP3 can be audibly inferior to CD. For example, with the original iTunes 128 kbps MP3 format, I hear orchestral strings gain a fuzziness when played back on good audio equipment. Yet I find bell-like sounds to be acceptably reproduced, so perhaps quality depends on the complexity of the waveform.

Those of us on a quest for good sound avoid compromise and never listen to lossy compressed music from iTunes. We would rather buy CDs and rip them into an iTunes database using Apple Lossless (ALAC) or similar lossless encoding. Thus we get the best of both worlds: the quality of a CD and the convenience of an iTunes database on our computer. (And we do not allow Apple to convert us into micro revenue streams.) While ALAC requires several times the storage space of 256 kbps AAC, disk space is cheap and backups are automated using Apple’s Time Machine software.

With my roughly a thousand albums, ripping more than once is not an option. A lossless format guarantees my library can always provide the quality of the original recording when that quality is desired. Since lossy compressed audio has in the past sounded inferior to CD sound to my ears, no lossy musical performances will ever emanate from the Great Wall of Stuff in my home. Movie soundtracks are a different story though, because for them, compressed audio is usually the only game in town.

If stereo is good, might not more simultaneous channels be better? Multichannel music is not new; in the 1970s, 4-channel quadraphonic analog tapes were briefly available as an enhancement to stereo music. Today’s surround sound, as it is often called, puts the listener in the middle of the performance (music). It also can put them in the middle of the action (movie).

The 1940 Disney film, Fantasia, was the first use of surround sound in a movie. Surround sound technologies for creating movie soundtracks have migrated from the theater to home video equipment, where they have assumed many different identities over the last two decades. Labels with these various names populate the faceplates of new A/V components, advertising their compatibilities.

The most common surround format has six discrete channels, designated 5.1. It offers 5 full frequency channels, and one low frequency effects (LFE) channel for carrying low frequency movie sound effects below 120Hz (cannons, bombs, and the like). The LFE is called .1 because it represents only a small fraction of the frequency range of a full range audio channel. The 5.1 channels are the front left, center and right speakers, the rear left/right surround speakers, and the subwoofer. In addition to LFE signals, bass management facilities of some components can direct some of the lower frequency content of the other 5 channels to the subwoofer as well.
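Bass management can be pictured as a simple crossover: the low end of each main channel is split off and added to the LFE signal to form the subwoofer feed. The Python sketch below is a toy version under my own assumptions: an 80 Hz crossover, a 48 kHz sample rate, and a very gentle one-pole filter. Real A/V components use steeper filters and apply level adjustments that are not shown here.

```python
import math

def one_pole_lowpass(signal, cutoff_hz, sample_rate_hz):
    """Gentle (6 dB/octave) low-pass filter; real crossovers are steeper."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    out, y = [], 0.0
    for x in signal:
        y += alpha * (x - y)            # move a fraction of the way toward the input
        out.append(y)
    return out

def bass_manage(mains, lfe, cutoff_hz=80.0, sample_rate_hz=48_000):
    """Return (mains with bass removed, subwoofer feed).

    mains: dict of channel name -> list of samples (all the same length)
    lfe:   list of samples for the LFE channel
    """
    sub = list(lfe)
    managed = {}
    for name, samples in mains.items():
        low = one_pole_lowpass(samples, cutoff_hz, sample_rate_hz)
        managed[name] = [x - l for x, l in zip(samples, low)]  # left for the speaker
        sub = [s + l for s, l in zip(sub, low)]                # redirected to the sub
    return managed, sub

# One second of a constant (zero-frequency) signal ends up entirely in the sub.
mains = {"L": [1.0] * 48_000, "R": [1.0] * 48_000}
managed, sub = bass_manage(mains, lfe=[0.5] * 48_000)
```

Because each main channel is split into a low part and whatever remains, nothing is lost: the speaker and subwoofer feeds together still add up to the original signals.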

DVD-Video media is almost always mixed in surround sound. Some HDTV content is broadcast in surround sound as well. For example, live events such as sports programming typically will mix audience noise in the surround tracks. Dolby Labs has dominated this surround market with two basic audio technologies, Dolby Digital and Dolby Pro Logic. Dolby Digital (aka AC-3 encoding) is a 5.1 discrete channel compressed digital soundtrack recorded onto most DVD-Videos.

A/V compression is required for producing DVD-Video discs (due to storage size restrictions), and also for broadcast TV (due to transmission bandwidth restrictions). DVD-Video compresses its video with the MPEG-2 standard, while its audio soundtracks are typically encoded via AC-3 compression at 448 kbps for 5.1 channels. ATSC broadcast sound uses AC-3 compression as well. The Japanese HDTV broadcast standard uses AAC compression, but Dolby exerts lobbying influence in the USA. Thus broadcast sound and DVD-Video sound are similarly compromised, but good enough. They are seldom meant for hyper-critical listening.

Dolby’s other decoding technology, Dolby Pro Logic II, can turn any stereo (e.g. LPCM) audio track into 5.1 channel sound by synthesizing the other channels using sophisticated algorithms. This technology does not record an audio format onto the DVD, but rather processes stereo soundtracks to produce 5.1 sound on playback. Dolby Digital and Dolby Pro Logic II are pretty much the only technologies required for providing state-of-the-art surround sound from DVD or CD or stereo TV broadcast. They do the job as well as or better than any other available technology.

Higher quality sound is available on some DVDs. In addition to or in place of Dolby Digital, some DVD-Video recordings have either an LPCM CD-quality soundtrack or a Digital Theater Systems (DTS) 5.1 channel compressed soundtrack. Because it uses less compression than AC-3, DTS is often the choice for soundtracks of music performance videos, as is uncompressed LPCM. For example, the Eagles Hell Freezes Over concert DVD-Video comes with DTS and LPCM soundtracks. Other sound enhancement technologies, identified by names such as Hall, Stadium, Arena, Club …, provide added ambience to the soundtrack via DSP processing in the A/V playback component, but such artificial sound effects are not usually of use to ‘good sound’ listeners.

So far, CD quality audio is the best of the previous attempts at providing good recorded sound (I have to be careful here and avoid offending any vinyl advocates). However, CD recording practices of the last two decades, particularly in pop music, have succumbed to the ‘loudness wars’, wherein music producers try to outdo each other in making all the music on the CD sound loud. The general public, it is imagined, will prefer the CD that plays loudest. CD masters that have had their volume pushed to the maximum and beyond to the point of digital distortion do not faithfully reproduce the actual performance. Good sound reproduction equipment will mercilessly expose these flaws. Audio enthusiasts should exercise care in choosing well-mastered CDs.
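One rough way to spot a master that has been pushed too hard is to measure how many samples sit at digital full scale, and how small the gap is between the peak level and the average (RMS) level. The Python sketch below is a minimal illustration with function names of my own; the 16-bit full-scale value of 32767 and the synthetic sine-wave test signal are assumptions for the example, and serious loudness measurement uses more elaborate methods.

```python
import math

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB; heavily squashed masters tend to score lower."""
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20.0 * math.log10(peak / rms)   # assumes the signal is not pure silence

def clipped_fraction(samples, full_scale=32767):
    """Fraction of samples sitting at (or beyond) digital full scale."""
    hits = sum(1 for s in samples if abs(s) >= full_scale)
    return hits / len(samples)

# A clean full-scale sine versus the same sine boosted 4x and clipped.
clean = [32767 * math.sin(2 * math.pi * n / 100) for n in range(1000)]
loud = [max(-32767.0, min(32767.0, 4 * s)) for s in clean]

clean_crest = crest_factor_db(clean)
loud_crest = crest_factor_db(loud)
loud_clipped = clipped_fraction(loud)
```

The clipped version has a much smaller crest factor and spends most of its time pinned at full scale, which is the flattened-top distortion that good equipment exposes.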

There have been a few attempts to transcend the audio quality of CDs, by upping the resolution and the number of channels. This has meant finding physical media with greater storage capacity than a CD (700 MBytes). None of these attempts have taken the market by storm, because the sound is not audibly superior to plain old CD for most people. However, good sound seekers will pay a premium for them, perhaps in the faint hope of finding a superior audible experience, and perhaps in the expectation that old familiar stereo music will take on new life in surround sound. But a more tangible benefit might be that these high-end recordings will have been produced or re-produced with more care, so the new digital master itself will likely be more faithfully rendered.

Two attempts at ultra-high-fidelity sound were made between 1999 and 2005. The DVD-Audio (DVD-A) and Super Audio CD (SACD) formats were offered on DVD media and on a new high-density disc the size of a CD, respectively, each offering multiple channels of high-quality sound. A format war ensued. DVD-A had a brief run of some 5 years and then became obsolete. SACD is still being made available, but the number of titles is small. Special DVD players can process all four formats: CD, SACD, DVD-V and DVD-A.

Like CDs, DVD-Audio uses LPCM encoding and can provide 2-channel music at 24-bit, 192 kHz quality, or 5.1 channel audio at 24-bit, 96 kHz quality. DVD-A sometimes uses Meridian Lossless Packing (MLP) to reduce the storage required without losing any data. SACD uses Direct Stream Digital (DSD) encoding, representing the waveform by 1-bit samples, sampled at a frequency of 2.8224 MHz (64 times the CD sampling rate of 44.1 kHz). Each single bit carries very little information by itself; it is the local density of ones and zeros that tracks the amplitude of the waveform. [Aside: Essentially, such sampling is the front-end step of most current analog-to-digital conversion (ADC). SACD chose to record this base signal, whereas LPCM encoding passes this signal through a decimation filter, which down-samples the signal to remove unnecessary bandwidth and lower the noise floor.]
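The 1-bit idea is easier to see in a toy example. The Python sketch below implements a first-order delta-sigma modulator, the simplest member of the family of converters that produce such bitstreams. It is an illustration under my own assumptions, not the actual SACD encoder, which uses higher-order noise shaping; the function name is hypothetical.

```python
DSD_RATE_HZ = 64 * 44_100   # 2_822_400 one-bit samples per second

def delta_sigma_1bit(samples):
    """First-order delta-sigma modulator.

    samples: input amplitudes in [-1.0, 1.0]
    returns: list of +1 / -1 values, one per input sample
    """
    integrator = 0.0
    feedback = 0.0
    bits = []
    for x in samples:
        integrator += x - feedback       # accumulate the error so far
        feedback = 1.0 if integrator >= 0.0 else -1.0
        bits.append(int(feedback))
    return bits

# A steady input of 0.5 yields a stream that is +1 about three quarters of the time.
bits = delta_sigma_1bit([0.5] * 6400)
average = sum(bits) / len(bits)
```

Averaging (low-pass filtering) the bitstream recovers the input level, which is roughly what happens on playback, and also what a decimation filter does on the way to LPCM.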

Countless words have been written about which technology sounds better after reproduction, but I know of no blind A-B test that has demonstrated an audible difference. Being a golden ear wannabe, I have built a small library (only 25 albums or so) of DVD-A and SACD titles. Most are re-masterings of classic recordings, but some are new material recorded specifically for these media.

There may be a cautionary tale here. These high-resolution discs are higher priced than CDs, and the small number of titles available indicates that the general public was not convinced of a need for audio quality higher than the CD offers. Quite the opposite: the majority of the public is satisfied with lower quality at half the price per track, as evidenced by the MP3 online kiosks and the difficult times experienced now by CD vendors. Convenience and price are trumping quality big time in our brave new world.

The latest ultra-high-fidelity media format is Blu-ray Disc (BD), which provides audio and video together, an ultra-high quality version of DVD-Video. BD won a format war with a competing technology, HD DVD. BD will store nearly six times more information than DVD-Video, so that six channels of high-quality lossless sound can accompany video recorded at 1080p. Lossless audio on BD can be carried as uncompressed LPCM, or losslessly compressed as DTS-HD Master Audio or Dolby TrueHD. The latter two arrived with the high-definition disc formats, continuing the audio marketing competition between DTS and Dolby. In quality, they are equivalent to DVD-Audio, but may support more channels.

At some point, BD might be expected to entirely replace DVD-Video, DVD-Audio, and SACD, a new holy grail. But the BD title producers and equipment makers have not observed the cautionary tale above. The public may not sign up to pay a higher price for something they don’t appreciate. Until these titles and their players compete on price with DVD-Video and SACD, they will be a novelty. And since the advent of streaming A/V titles on demand from the Internet, physical media will inevitably become a niche market for those few like myself who still value the highest perceptible quality. Hopefully this niche is large enough to keep the golden-ear flame alive.
