An audio pro takes a closer look at the idea that “high-rez is better”… and comes up with controversial results
By Ethan Winer
There’s a lot of interest in “high-resolution” audio lately, spurred in part by audiophile record labels that offer uncompressed 24-bit files at 96 kHz, as well as Apple’s recent Mastered for iTunes initiative, which aims to increase the fidelity of music processed with lossy data compression. (Not to be confused with the volume compression used as a studio effect.) Mastering is an important final step that ensures a pleasing tonal balance, among other enhancements, and even great mixes often benefit from a second opinion by an expert with a fresh outlook. Everyone wants their music to sound as good as possible. But is the push for high-resolution formats a true advance, or just a ploy to sell us the same music again?
The weakest link in lossy-compressed music is the compression process itself. Portions of the music deemed “unimportant” are discarded based on whether listeners can actually hear them, a phenomenon called masking. A loud cymbal crash can mask a soft tambourine, so it’s acceptable to discard the tambourine for a second or two until the cymbal fades out sufficiently. Using more aggressive compression makes the file smaller by reducing its bit rate, but it also removes more of the music, and at some point the loss becomes audible. Therefore, it’s important to balance the amount of data compression against the demands of the music to minimize degradation.
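To put the bit-rate trade-off in concrete terms, here is a quick back-of-envelope sketch in Python. The four-minute track length and the lossy bit rates are illustrative assumptions; real encoder output varies with the music.

```python
# Rough file-size arithmetic: uncompressed CD audio vs. common lossy bit rates.
# The track length and bit rates below are illustrative, not from the article.

CD_BITRATE = 44_100 * 16 * 2   # samples/sec * bits/sample * 2 channels = 1,411,200 bps
TRACK_SECONDS = 4 * 60         # a typical 4-minute track

def megabytes(bitrate_bps: int, seconds: int) -> float:
    """Convert a constant bit rate and duration to a file size in MB."""
    return bitrate_bps * seconds / 8 / 1_000_000

print(f"Uncompressed CD: {megabytes(CD_BITRATE, TRACK_SECONDS):.1f} MB")
for kbps in (320, 256, 128, 96):
    size = megabytes(kbps * 1000, TRACK_SECONDS)
    ratio = CD_BITRATE / (kbps * 1000)
    print(f"{kbps:>4} kbps lossy: {size:4.1f} MB  (~{ratio:.0f}:1 reduction)")
```

The point of the arithmetic: every step down in bit rate buys a smaller file by discarding more masked material, which is why the encoder setting has to be matched to the music.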
On Mastering
So what does this have to do with Mastered for iTunes? Apple’s Technology Brief [1] claims that high-resolution audio sounds better than normal CD quality at 16 bits and 44.1 kHz, but offers only anecdotal evidence:
“…many experts feel that using higher resolution PCM files during production provides better-quality audio and a superior listening experience in the end product. For this reason, 96/24 resolution is quickly becoming a standard format in the industry, and it’s also common to see higher resolution files, such as 192/24.”
I’m not convinced that using sample rates and bit depths beyond those of standard CDs is useful for a distribution medium, because 44.1 kHz at 16 bits has repeatedly proven sufficient in controlled listening tests. One early test, done in 1984, showed that inserting a 44/16 digital “bottleneck” into the audio path was not audible [2]. A more recent, and far more exhaustive, series of tests by Meyer and Moran [3][4], published in the Journal of the Audio Engineering Society, employed 60 listeners in 554 trials over a period of one year. Listeners asked to identify when the CD-quality loop was in the signal path answered correctly 49.82 percent of the time, statistically indistinguishable from random chance.
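For readers who want to check that figure themselves, here is a short sketch using SciPy’s standard binomial test. The trial count comes from the Meyer/Moran result cited above; everything else is textbook statistics.

```python
# Check that 49.82% correct over 554 trials is statistically indistinguishable
# from coin-flipping. Trial counts are from the Meyer/Moran figures cited above.
from scipy.stats import binomtest

n_trials = 554
n_correct = round(0.4982 * n_trials)   # ~276 correct answers

result = binomtest(n_correct, n_trials, p=0.5)
print(f"{n_correct}/{n_trials} correct ({n_correct / n_trials:.2%})")
print(f"two-sided p-value vs. chance: {result.pvalue:.3f}")
# The p-value comes out near 1.0, i.e. no evidence listeners heard a difference.
```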
A few years ago, a company specializing in high-resolution music files was found to be selling CD-quality files that had been up-converted to 96/24. This came to light only after someone analyzed the files and discovered spectra that rolled off steeply above 22 kHz, the telltale signature of a 44.1 kHz master. None of their customers ever noticed, which makes sense given that nobody can hear frequencies much higher than 20 kHz.
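The kind of analysis that exposed those files is easy to reproduce. Below is a minimal sketch using NumPy and the soundfile library; the file name is a placeholder, and taking one windowed FFT of the whole file is a simplification (a more careful analysis would average spectra over many short blocks).

```python
# Read a purported 96 kHz file and check how much energy sits above 22 kHz.
# "suspect_96k.wav" is a hypothetical file name.
import numpy as np
import soundfile as sf

data, rate = sf.read("suspect_96k.wav")
if data.ndim > 1:
    data = data.mean(axis=1)              # fold stereo to mono for one spectrum

spectrum = np.abs(np.fft.rfft(data * np.hanning(len(data))))
freqs = np.fft.rfftfreq(len(data), d=1.0 / rate)

low = spectrum[freqs <= 22_000].sum()
high = spectrum[freqs > 22_000].sum()
print(f"energy above 22 kHz: {100 * high / (low + high):.4f}%")
# A genuine 96 kHz recording usually shows at least some ultrasonic content;
# a file up-converted from 44.1 kHz rolls off to essentially nothing here.
```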
Even if 44/16 is sufficient for distribution, does recording source tracks at higher resolution improve fidelity? Some people believe that plug-ins sound better at high sample rates and bit depths, but controlled tests I’ve done for distortion, noise, and frequency response—the parameters that affect audio fidelity—failed to confirm this.
Most audio software processes music using 32-bit floating-point math, regardless of the source file’s resolution. This yields extremely low distortion for the DSP (Digital Signal Processing) operations that perform EQ, compression, and other effects. Further, 16 bits provides a dynamic range of 96 dB, roughly 6 dB per bit, which is sufficient for any type of music, including gentle classical and jazz.
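The arithmetic behind those figures is simple enough to show. This sketch computes both the familiar 6 dB-per-bit rule of thumb and the slightly larger textbook signal-to-noise figure for a full-scale sine wave (6.02N + 1.76 dB).

```python
# Theoretical dynamic range of linear PCM, two common ways of stating it.

def rule_of_thumb_db(bits: int) -> float:
    return 6.0 * bits                 # the commonly quoted 96 dB / 144 dB figures

def sine_snr_db(bits: int) -> float:
    return 6.02 * bits + 1.76         # textbook full-scale sine vs. quantization noise

for bits in (16, 24):
    print(f"{bits}-bit PCM: ~{rule_of_thumb_db(bits):.0f} dB "
          f"(sine SNR ~{sine_snr_db(bits):.0f} dB)")
# Either way, 16 bits already exceeds the noise floor of real rooms and preamps.
```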
It may seem that 24 bits offers more resolution than 16 bits, but all the extra bits buy you is a lower noise floor. Even in a quiet professional studio you’d be hard pressed to make a recording where the noise of the 16-bit medium exceeds the ambient noise in the room or the noise of the microphone preamps. You can easily prove this for yourself: record any group of musicians in any venue at 24 bits, then play the file in audio-editor software and watch the playback meter during a silent passage. If the background noise is above –90 dBFS, the medium is not the limiting factor. Some types of DSP math can be more accurate at higher sample rates, but that’s handled by upsampling within the code when appropriate. There’s no need to record tracks at a higher sample rate.
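If you’d rather measure than eyeball a meter, here is a sketch of the same test using NumPy and soundfile. The file name and the time range of the silent passage are placeholders for your own recording.

```python
# Measure the RMS level of a silent passage in dBFS.
# "session_24bit.wav" and the 5-7 second range are placeholders.
import numpy as np
import soundfile as sf

data, rate = sf.read("session_24bit.wav")
if data.ndim > 1:
    data = data.mean(axis=1)

start, stop = 5 * rate, 7 * rate        # pick two seconds where nobody plays
silence = data[start:stop]

rms = np.sqrt(np.mean(silence ** 2))
print(f"noise floor: {20 * np.log10(rms + 1e-12):.1f} dBFS")  # epsilon avoids log(0)
# If this reads above -90 dBFS, room and preamp noise, not the 16-bit medium,
# set your real-world dynamic range.
```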
Nobody is arguing against obtaining the highest quality possible throughout the entire production process, including the final medium consumers receive. Further, improving the code that implements lossy compression is always worth striving for. But that’s the responsibility of the programmers, not recording engineers. In my opinion, the notion that sample rates or bit depths greater than 44/16 are audibly superior should be re-examined, and this is easy to assess for yourself with a proper blind test.
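On that last point, a simple way to run such a test on yourself is the ABX procedure: a script secretly copies one of two candidate files to X, you decide which one X matches, and scoring near 50 percent means you heard no difference. The sketch below handles only the bookkeeping, and the file names are placeholders. For a fair comparison, render both versions to identical container settings (for example, the 44/16 version resampled back up to 96/24) so your player’s format display gives nothing away.

```python
# A bare-bones single-person ABX helper. Each trial it secretly copies one of
# the two files to X.wav; listen to A, B, and X with any bit-exact player.
import random
import shutil

FILE_A = "master_4416.wav"   # hypothetical CD-quality render
FILE_B = "master_9624.wav"   # hypothetical 96/24 render
TRIALS = 10

score = 0
for trial in range(1, TRIALS + 1):
    x = random.choice([FILE_A, FILE_B])
    shutil.copyfile(x, "X.wav")          # the blind copy; don't peek at x!
    guess = input(f"Trial {trial}: listen to X.wav, is it A or B? [a/b] ").strip().lower()
    score += (x == FILE_A) == (guess == "a")

print(f"{score}/{TRIALS} correct; around {TRIALS // 2} is what pure guessing gives you")
```

A score well above chance across many trials, with the selection truly hidden, is the kind of evidence the high-resolution claim needs; casual sighted comparisons are not.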
[1] http://images.apple.com/itunes/mastered-for-itunes/docs/mastered_for_itunes.pdf
[2] http://www.bostonaudiosociety.org/bas_speaker/abx_testing2.htm
[3] http://www.aes.org/e-lib/browse.cfm?elib=14195
[4] http://www.bostonaudiosociety.org/explanation.htm
Ethan Winer is co-owner of RealTraps, a popular acoustic treatment company based in New Milford, CT, and his new book The Audio Expert from Focal Press has received rave reviews. You can find Ethan at www.realtraps.com and www.ethanwiner.com.