Sunday, May 14, 2017

Clipping In Sampling Rate Converters

In my last post, I investigated clipping of intersample peaks that happens in DACs. But as I started exploring the entire path of sound delivery, I discovered that digital sound data can arrive at the DAC already "pre-clipped". Thus even a DAC with headroom will render it with audible inharmonic distortions.

Theory

The reason behind this is the inevitable sample rate conversion when the sampling rates of the source material and of the DAC do not match. Unfortunately, this happens quite often because during the evolution of digital audio multiple sampling rates came into use. The major "base" sample rates are 44100 Hz originating from CDs (Red Book Audio standard), and 48000 Hz coming from digital video. Plus, there are whole multiples of those rates: 88200, 176400, 96000, 192000 etc.

Having this variety, it's not surprising that sampling rate converters are ubiquitous. Without them it would be impossible to correctly play, say, 44100 Hz CD audio via a 48000 Hz DAC: the source audio would be rendered at the wrong rate and would have incorrect pitch.

But doing the conversion isn't trivial. What a sample rate converter has to do is basically render the sound wave into a mathematical curve, and then resample the values of this curve using the target sample rate. The problem that can occur here is that in a sound wave normalized to 0 dBFS the points of the target sample rate can overshoot this limit.

For example, below is a graph of a 11025 Hz sine wave at 45° phase shift sampled at 44100 Hz (blue dots), and sampled at 48000 Hz (red dots):

As you can see, at the 48 kHz sampling rate the dots are closer to each other, and some of the red dots have values above (or below) the margins of the original 44.1 kHz sampling rate.

Had the source 44.1 kHz wave been normalized to 0 dBFS, the blue dots that currently have approximate values of 0.5 and -0.5 would be at 1 and -1, respectively. Thus, the values of the 48 kHz sampling would end up above 1 (or below -1). This means that if the converter is using integer representation for samples (16-bit or 24-bit), and doesn't provide headroom, it will not be possible for the converter to render those values, as they will exceed the limit of the integer. Thus, they will be clipped, and this will result in a severe distortion of the source wave.
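The overshoot can be checked numerically. Below is a minimal sketch, assuming an ideal converter that reproduces the exact sine curve (a real resampler only approximates it with a filter). All names in it are made up for illustration. It normalizes the 44.1 kHz samples to 0 dBFS and then evaluates the same curve on the 48 kHz grid:

```python
import math

TONE_HZ = 11025.0
PHASE = math.pi / 4  # 45 degrees


def sample_tone(rate_hz, count, gain=1.0):
    """Sample the ideal continuous sine at the given rate."""
    return [gain * math.sin(2 * math.pi * TONE_HZ * k / rate_hz + PHASE)
            for k in range(count)]


# Source: 44.1 kHz samples, and the gain that normalizes them to 0 dBFS.
source = sample_tone(44100, 400)
norm_gain = 1.0 / max(abs(v) for v in source)

# Target: the same normalized curve evaluated on the 48 kHz grid.
# 640 samples cover one full repetition of the 48 kHz sampling pattern.
target = sample_tone(48000, 640, norm_gain)

source_peak = max(abs(v) * norm_gain for v in source)
target_peak = max(abs(v) for v in target)
overs = sum(1 for v in target if abs(v) > 1.0)

# Converting to 16-bit integers without headroom forces a hard clip.
clipped_16bit = [max(-32768, min(32767, round(v * 32767))) for v in target]

print(f"44.1 kHz peak: {source_peak:.3f}")
print(f"48 kHz peak:   {target_peak:.3f} ({20 * math.log10(target_peak):+.2f} dBFS)")
print(f"samples over full scale: {overs} of {len(target)}")
```

With this particular tone the 48 kHz peak comes out at about 1.414, i.e. roughly 3 dB over full scale, and about half of the target samples land above 1 or below -1.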

The same thing can happen in a conversion from 48 kHz down to 44.1 kHz, or when upsampling from 48 kHz to 96 or 192 kHz. Basically, any conversion that creates new sample values can produce values that exceed the peak value in the source wave. The only potentially "safe" conversion is when the source wave gets downsampled by a whole factor, e.g. from 96 to 48 kHz, because this operation can be performed by simply throwing out every other sample.

Practical Examples

Google Nexus Player

Here I am examining the sound paths that I have at home. Let's start with Google Nexus Player. It's a rather old thing, and I don't think it pretends to be a "Hi-Fi" player, but nevertheless I use it from time to time, and I would like to see what it does to sound.

This is my setup: the HDMI output from Nexus Player goes into an LG TV, and it separates audio via TOSLINK connection that goes into E-MU 0404 music interface, and then to SPL Phonitor Mini. As in the last post, for measurements I will be using E-MU Tracker Pre card connected to a laptop on battery power.

I use two sound files for test: one is the same as the last time (11025 Hz sine wave at 45° phase in a 44.1 kHz FLAC), and another is 12 kHz sine wave at 45° in a 48 kHz FLAC. Both files were uploaded to my Play Music locker. I'm aware that Play Music uses lossy 320 kbps MP3 on their servers, but for these simple sine wave files this generous bitstream is effectively equivalent to lossless. At least, Play Music doesn't perform any resampling.

Since TVs are designed to be used with video content, their preferred sampling rate for audio is 48 kHz. I haven't found any way to change that setting for my TV. So first in order to test the signal path, I played the 12 kHz sine wave file (48 kHz SR), and captured it from the line output of E-MU 0404 also using 48 kHz sampling rate on Tracker Pre. The result on the frequency analysis is a beautiful clean peak at 12 kHz with no distortions at all:

However, 48 kHz isn't the typical sampling rate for the content on Play Music store—since their source is CD content, most of the albums are using 44.1 kHz sampling rate. Even YouTube uses 48 kHz sampling rate audio as I have discovered (I've checked with VLC player, it can open YouTube video streams). Not sure about the sampling rate used in Play Movies, though.

So let's now play the 44.1 kHz sine wave file using the same setup. The only change I've made is setting the capturing sampling rate to 44.1 kHz on Tracker Pre. And the result is pretty ugly:

If I wasn't really happy about how the frequency analysis looked for Benchmark DAC1, this one simply made my hair stand on end. The resampler in Nexus Player clips severely. What's even worse, there is not much I can do about that, since there are no controls over digital attenuation or sampling rate. Too bad. At least now I know why the snare drum on "Gaslighting Abbie" by Steely Dan doesn't sound good when played via this setup.
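To get a feeling for where such extra peaks come from, here is a rough simulation. It is only a sketch under stated assumptions: an ideal 44.1 to 48 kHz conversion followed by a hard clip at full scale, which is not necessarily what the Nexus Player's resampler does internally. The helper names are invented for this example. It runs a plain DFT over exactly 147 periods of the tone, so there is no spectral leakage to confuse with distortion:

```python
import cmath
import math

RATE_HZ = 48000
TONE_HZ = 11025
N = 640  # 147 whole tone periods at 48 kHz, so the DFT has no leakage


def resampled_tone(clip):
    """Ideal 48 kHz samples of the tone whose 44.1 kHz samples peak at 0 dBFS."""
    gain = math.sqrt(2.0)  # true peak of the normalized source
    tone = [gain * math.sin(2 * math.pi * TONE_HZ * k / RATE_HZ + math.pi / 4)
            for k in range(N)]
    if clip:
        tone = [max(-1.0, min(1.0, v)) for v in tone]  # no headroom: hard clip
    return tone


def spectrum_db(samples):
    """Plain DFT; level of each bin in dB relative to a full-scale sine."""
    n = len(samples)
    levels = []
    for b in range(n // 2 + 1):
        acc = sum(v * cmath.exp(-2j * math.pi * b * k / n)
                  for k, v in enumerate(samples))
        levels.append(20 * math.log10(max(2 * abs(acc) / n, 1e-12)))
    return levels


def spurs(samples, floor_db=-60.0):
    """Frequencies (Hz) other than the tone that rise above floor_db."""
    bin_hz = RATE_HZ / N
    tone_bin = round(TONE_HZ / bin_hz)
    return [(b * bin_hz, round(db, 1))
            for b, db in enumerate(spectrum_db(samples))
            if b != tone_bin and db > floor_db]


clean_spurs = spurs(resampled_tone(clip=False))
clipped_spurs = spurs(resampled_tone(clip=True))
print("clean:  ", clean_spurs)
print("clipped:", clipped_spurs[:5])
```

The unclipped tone shows nothing besides 11025 Hz. The clipped one grows odd harmonics that fold back below 24 kHz: the third harmonic at 33075 Hz, for instance, lands at 14925 Hz, which is not a multiple of 11025 Hz. That is why the distortion is inharmonic.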

Dune HD Smart H1

I also have an old Dune HD player connected to the same LG TV. Unlike Nexus Player, Dune offers a lot of control over playback. It also supports FLAC format. Again, I started with playing a 12 kHz sine wave at 48 kHz SR just to make sure that the sound path is clean, and it was all OK.

Then I played a 11025 Hz sine at 44.1 kHz SR, and again got a lot of distortion (although the level of distortion peaks is lower than on Nexus Player):

But here at least I can do something to fix that. I can't change the sampling rate, but Dune offers digital volume control, even in dB scale. I used it to reduce the volume by 4 dB, providing enough headroom for the resampler, and the result is a beautiful clean 11025 Hz peak:
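The arithmetic behind the 4 dB figure can be sketched as follows. It assumes the worst case of this particular test tone, whose true peak is the square root of 2 times full scale. Other material may overshoot by a different amount. The function names are just for illustration:

```python
import math


def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def gain_to_db(gain):
    """Convert a linear amplitude factor to dB."""
    return 20 * math.log10(gain)


true_peak = math.sqrt(2.0)              # intersample peak of the 0 dBFS test tone
needed_db = gain_to_db(true_peak)       # attenuation needed to just fit
peak_at_minus_3 = true_peak * db_to_gain(-3.0)
peak_at_minus_4 = true_peak * db_to_gain(-4.0)

print(f"attenuation needed: {needed_db:.2f} dB")
print(f"peak after -3 dB:   {peak_at_minus_3:.4f}")
print(f"peak after -4 dB:   {peak_at_minus_4:.4f}")
```

So 3 dB is just barely not enough for this tone, while 4 dB brings the peak to about 0.89 of full scale and leaves a small margin.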

Great, now I have much more confidence in my setup.

PC-based Playback

By PC I mean Macs as well. On desktops and laptops there is a lot more control over the parameters of the digital audio signal path—it's easy to change the sampling rate on the DAC to match the sampling rate of the source material, also the majority of digital players offer digital attenuation. So there is no problem ensuring that nothing clips the digital signal on its way to the DAC.

The practical advice here is—if you are not sure about the sampling rate of the source material, use the digital volume control on the player to reduce the volume and thus provide some headroom for the sampling rate converter. Setting volume down to -4 dB (or about 80-85% if the volume control uses percents) should do the job.

Conclusion

Sampling rate converters are ubiquitous, and conveniently adapt the source audio stream to ensure that it will play regardless of the sampling rate set on the DAC. However, as we have found out, they are not transparent and can easily clip intersample peaks, thus producing audible inharmonic distortions.

To avoid that, make sure the sampling rates match between the played material and the DAC, or at least reduce the digital volume a bit to offer some headroom for the sampling rate converter.

Sunday, May 7, 2017

DAC Clipping on Intersample Peaks

The article "Intersample Overs in CD Recordings" on Benchmark Media raises interesting topics of intersample peaks, and DAC headroom. In short, this is what the article states:

  • 16-bit 44.1 kHz digital samples can be interpolated to achieve signal-to-noise ratio equivalent of 20-bit systems, and modern DAC chips are capable of that;
  • but these chips don't provide digital headroom, and intersample peaks, when they occur, get clipped, producing audible non-harmonic distortions.
  • Benchmark DAC1 is susceptible to this problem, whereas in DAC2 and DAC3 this issue was addressed by introducing a design involving using an external interpolator, and driving DAC chips at -3.5 dB.
  • Maintaining headroom in DAC is important because in audio recordings normalized to 0 dBFS intersample peaks can easily occur.

So I decided to test the DACs I use on the subject of headroom, and also figure out what can be done to address the clipping problem without resorting to buying DAC2 or DAC3 converters.

Let's take some measurements. I don't have Audio Precision, so I was taking my measurements using an old trusty E-MU Tracker Pre connected to a notebook on battery power. In Audacity I created a 16-bit 44.1 kHz sound file containing 11025 Hz sine wave phase shifted to 45° and normalized to 0 dBFS.

Creating Test Sample

BTW, generating this sine wave is not as straightforward as it may seem. The "Generate Tone" Audacity function unfortunately doesn't allow specifying the phase. The workaround is to use the very powerful but not so straightforward "Nyquist Prompt" effect instead.

First, generate 10 seconds of silence (it will become selected automatically). Then in "Effect" menu choose "Nyquist Prompt", enter the following, and press "OK":

(osc (hz-to-step 11025) 10 *table* 45)

This will replace the silence with a 11025 Hz sine wave phase-shifted to 45°. Afterwards, normalize it to 0 dBFS by choosing "Effect > Normalize" and entering "0.0 dB" as the target value. The result should look like the left channel on the screenshot below (with "View > Show clipping" option enabled):

The left channel represents the sine wave normalized to 0 dBFS, the right channel shows the same wave normalized to -6 dBFS. Note that Audacity doesn't render sine wave images, like Adobe Audition does, instead it just connects the dots representing sample values.

The red bars on the left channel warn us that these samples will overshoot 0 dBFS when rendered by DAC—that's because the "hat" of the rendered analog sine wave will connect these dots and thus will end up above the maximum value that can be represented using integer values.
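For those who would rather script this than click through Audacity, the same kind of test tone can be produced with Python's standard library. This is only a sketch of an alternative route, not what I used for the measurements. The function names and the output file name in the comment are made up:

```python
import io
import math
import struct
import wave

RATE_HZ = 44100
TONE_HZ = 11025


def make_tone(seconds=10, peak_dbfs=0.0):
    """16-bit samples of the 45-degree tone, normalized to peak_dbfs."""
    count = int(RATE_HZ * seconds)
    raw = [math.sin(2 * math.pi * TONE_HZ * k / RATE_HZ + math.pi / 4)
           for k in range(count)]
    scale = 10 ** (peak_dbfs / 20) / max(abs(v) for v in raw)
    return [round(v * scale * 32767) for v in raw]


def wav_bytes(samples):
    """Encode mono 16-bit PCM samples as a WAV file held in memory."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)  # 2 bytes per sample = 16 bit
        out.setframerate(RATE_HZ)
        out.writeframes(struct.pack(f"<{len(samples)}h", *samples))
    return buffer.getvalue()


tone = make_tone(seconds=1)
data = wav_bytes(tone)
# To keep the file (hypothetical name):
# open("tone_11025_45deg.wav", "wb").write(data)
print(f"{len(tone)} samples, peak {max(tone)}, {len(data)} bytes")
```

Every sample of the 0 dBFS version sits at full scale (plus or minus 32767), which is exactly the situation where the reconstructed wave has to go above the samples. Passing `peak_dbfs=-6.0` gives the quieter version shown on the right channel.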

Let's look at this sine wave in the frequency domain ("Analyze > Plot Spectrum" in Audacity):

I have changed the default settings of the analysis panel to use Blackman-Harris window and 4096 FFT buckets. This provides the most accurate result for the sine wave. As you can see, the panel shows that the peak of the sine wave is at +3.0 dBFS.
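The +3.0 dBFS reading agrees with a quick back-of-the-envelope derivation. Here A stands for the amplitude of the underlying sine and x[k] for its samples (both symbols are introduced just for this note):

```latex
% 11025 Hz sampled at 44100 Hz advances by a quarter period per sample:
%   2\pi \cdot 11025 / 44100 = \pi/2
x[k] = A \sin\!\left(\tfrac{\pi}{2}k + \tfrac{\pi}{4}\right)
\quad\Rightarrow\quad
|x[k]| = \frac{A}{\sqrt{2}} \ \text{for every } k

% Normalizing the samples to 0 dBFS means |x[k]| = 1, hence
A = \sqrt{2}, \qquad
20 \log_{10} \sqrt{2} \approx 3.01\ \text{dB}
```

In other words, once the samples touch full scale, the sine passing through them peaks about 3 dB higher.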

Tests

For each of the DACs I tested I was using the following sequence of steps:

  1. Load the test signal wave into VLC audio player, ensure that its volume is set to 100% (unity). Also check the OS sound level, it needs to be at 100% as well.
  2. Connect the outputs of the DAC to the inputs of E-MU, and play the sample several times in order to set up input sensitivity on E-MU at the maximum level right before it starts to clip—this is to maximize signal-to-noise ratio at the input end.
  3. Now record the signal, check in Audacity that the input isn't shown as clipped, so if there was clipping it could only happen at the output DAC, not at the input ADC.
  4. Check the frequency domain to see if there are any extra frequencies in the recorded signal besides 11025 Hz. The presence of extra frequencies means that the DAC has clipped output and produced inharmonic distortions.
  5. If the DAC is clipping, check whether reducing volume at the player or at the OS level helps to get rid of distortions.

I started with Benchmark DAC1 since it is known that it doesn't provide headroom and will clip. And indeed it does:

Note that E-MU's input sensitivity is not as good as of the Audio Precision frontend used by Benchmark Media for their post, so we don't see the noisy spikes below -90 dBFS, but the presence of extra spikes around the input signal frequency confirms that we indeed can detect whether the DAC clips by using this technique.

The next thing I tested was the Objective DAC made by JDS Labs. It turned out to produce even harsher distortions:

It was also interesting to find out that, due to enormous distortions, the 0 dBFS wave on the left channel came out at a lower level than the quieter -6 dBFS wave on the right channel, which has enough headroom. That's clearly a disaster.

Do all DACs clip?

Indeed, the results were a bit disappointing—the "audiophile grade" DACs are not very good at dealing with normalized CD recordings. Also, the following statement from the Benchmark Media's post seems to be leaving no hope:

Every D/A chip and SRC chip that we have tested here at Benchmark has an intersample clipping problem! To the best of our knowledge, no chip manufacturer has adequately addressed this problem. For this reason, virtually every audio device on the market has an intersample overload problem. This problem is most noticeable when playing 44.1 kHz sample rates.

I started testing the other DACs I had lying around:

And to my surprise, I found that none of them has the audible clipping problem! Look at the frequency analysis for MB Air (the only one among the listed that has shown any IHD at all):

There are very minor (I would say, inaudible) spikes from IHD, but it looks much cleaner than the results of Benchmark DAC1!

The music production oriented sound interfaces (E-MU and MOTU) actually have no intersample clipping at all: they provide enough headroom. I guess most devices aimed at music pros do, since during recording and mixing quite loud transients can be produced, and these devices need to handle them.

A bit surprising was the absence of clipping on another version of Objective DAC (the Mayflower version). I don't have a good enough explanation for that except that the versions of ODAC they use are different:

  • the JDS Labs one uses "UAC1 DAC" (the old revision of ODAC);
  • Mayflower uses "ODAC-revB" (the newer revision, see this post by JDS Labs).

But JDS Labs never mention that "revB" has added headroom, and in fact acknowledge that performance of the DAC at 0 dBFS level is slightly worse than at lower levels. So, still a mystery to me.

Workarounds

But what if you have a DAC that is susceptible to clipping, like Benchmark DAC1 or an old version of ODAC? I tried two things as separate experiments: first, reducing the output volume level in the VLC player (this reduction happens in the digital domain), and then reducing it on the DAC itself, using the OS volume control that the DAC exposes as part of the USB Audio standard.

Not surprisingly, scaling the peaks below 0 dBFS by reducing the volume level at the player gets rid of distortions.

What's more surprising is that for ODAC reducing the volume level with OS volume controls (I've set them to -6 dB) also remedies the clipping. That was something new for me since my understanding was that USB Audio volume control would apply to the analog wave that comes out from the DAC chip. But it turns out that at least for ODAC, the chip itself scales down the input digital signal before processing it.

Benchmark DAC1 doesn't provide external volume control via USB Audio protocol, and the volume knob that it has applies the volume control in the analog domain to the signal that has left the DAC chip (already clipped), so it's not helping. The only option to avoid clipping with DAC1 is to use the volume control at the music player.
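Why the order matters can be shown with a toy model. The sketch below assumes the DAC chip interpolates the samples and then hard-clips anything over full scale. The 8x grid and all the names are hypothetical stand-ins for illustration, not a description of any particular chip:

```python
import math

OVERSAMPLE = 8  # hypothetical interpolation factor, for illustration only


def interpolated_tone():
    """One period of the 0 dBFS test tone on an 8x oversampled grid."""
    gain = math.sqrt(2.0)  # true peak once the 44.1 kHz samples sit at +/-1.0
    per_period = 4 * OVERSAMPLE  # 11025 Hz is a quarter of 44.1 kHz
    return [gain * math.sin(2 * math.pi * k / per_period + math.pi / 4)
            for k in range(per_period)]


def attenuate(samples, db):
    """Scale the samples by a level change given in dB."""
    factor = 10 ** (db / 20)
    return [v * factor for v in samples]


def hard_clip(samples):
    """Limit every sample to the full-scale range."""
    return [max(-1.0, min(1.0, v)) for v in samples]


def worst_error(samples, reference):
    """Largest absolute difference between two equally long sequences."""
    return max(abs(a - b) for a, b in zip(samples, reference))


tone = interpolated_tone()
ideal = attenuate(tone, -6.0)  # what a clean -6 dB output would look like

digital_first = hard_clip(attenuate(tone, -6.0))  # player or OS volume
analog_after = attenuate(hard_clip(tone), -6.0)   # knob after the DAC chip

digital_error = worst_error(digital_first, ideal)
analog_error = worst_error(analog_after, ideal)
print(f"digital attenuation error: {digital_error:.4f}")
print(f"analog attenuation error:  {analog_error:.4f}")
```

Attenuating before the clip leaves the wave intact, while attenuating after it merely makes the already flattened tops quieter.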

Conclusions

First of all, big kudos to Benchmark Media for raising awareness about the facts that DACs can clip intersample overs, and that a lot of music recordings actually have them.

But then I would like to steer away from their (not explicit but assumed) conclusion that you should only buy their DAC2 and DAC3 products if you want to avoid the clipping problem. In fact, using pro sound interfaces may be an answer, as well as simply reducing the output volume level. Just don't hesitate to test the resulting signals yourself.

UPDATE

After reading some docs on ODAC / O2 interconnection I have discovered that the line out of my ODAC revB is accessible via the "line in" jack on O2's front panel (so it's actually a dual purpose jack: it can serve either as line input for the O2 amp or as line output for ODAC. Wicked smart!). I have repeated my measurements on intersample clipping. Nothing changed, however: the results look the same as the ones recorded via O2's headphone output, with no inharmonic distortions.