Saturday, July 7, 2018

My Setup for Headphone Listening, Part 2

Continuing the topic of my desktop setup for headphone listening, let's recap what we had covered in Part 1. We have set up a transparent hardware chain at moderate cost, and decided to make all the necessary adjustments on the software side using DSP plugins. In order to route audio from any player program via the DSP processing chain, on Mac we use Audio Hijack, and on Windows—a combination of virtual loopback cables and a plugin host program. I'm not covering Linux and mobile platforms here, sorry.

The Processing Chain

I don't believe in the existence of a perfect playback chain that would suit all commercial recordings. Even if the chain itself is transparent, the combination of recording's frequency balance and the headphone's frequency curve may not suit your taste. Also, due to non-linearity of human hearing, even changing playback volume affects perceived tonality. So clearly, an ability to tweak tonal balance is required.

Also, when using closed headphones the reproduction sounds unnatural due to super-stereo effect—each ear can hear its own channel only. This is especially noticeable on recordings that employ some form of "spatial" processing intended for speakers.

So our goals are pretty clear: being able to easily adjust levels of high and low frequencies, and have a crossfeed. In addition, we can try adding some psycho-acoustic enhancement by injecting 2nd or 3rd order harmonics (this is roughly equivalent to using a tube amplifier). Previously, I was also enthusiastic about the idea of headphone frequency response normalization. Now I'm less excited, and I will explain why later. But if headphones used are known to have some particular tonal issue, like the 6 kHz bump of Sennheiser HD800, adding a "normalizing" plugin could be a good idea.

So here is a conceptual diagram of the DSP chain I use:
First comes a simple 2- or 3-band equalizer employing Baxandall curves. I find these to be more pleasant sounding than typical shelving filters of multi-band parametric equalizers.

The next block adds harmonic distortions. It helps to liven up some recordings if they sound too dry and lack "dimension". I think, in small controlled quantities harmonics sometimes can help. However, I prefer to add them with a DSP plugin rather than with an amplifier.

Then comes a crossfeed plugin. An alternative is to use the crossfeed feature of the headphone amplifier or DAC, if it has one. But using a plugin makes crossfeed available on any DAC / amp, so it's more versatile. Also, if crossfeed is implemented as a plugin, it's possible to add a headphone "normalization" plugin after it. I think that having crossfeed after normalization defeats the purpose of the latter since crossfeed will most likely change the carefully tuned frequency response.
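To make the idea of crossfeed more concrete, here is a minimal sketch of the general principle: each output channel receives its own signal plus a low-passed, attenuated copy of the opposite channel. This is not how Redline Monitor or any particular plugin works. The function name, the 700 Hz cutoff and the -6 dB feed level are illustrative assumptions, and real implementations usually add a small interaural delay as well.

```python
import math

def crossfeed(left, right, sample_rate=96_000, cutoff_hz=700.0, feed_db=-6.0):
    """Mix a low-passed, attenuated copy of each channel into the opposite one.

    All parameter defaults are illustrative assumptions, not values taken
    from any specific crossfeed plugin.
    """
    # Coefficient of a one-pole low-pass filter for the cross-fed signal.
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    # Linear gain of the cross-fed signal.
    feed = 10.0 ** (feed_db / 20.0)
    # Scale the sum down so that the output never exceeds the input peak level.
    norm = 1.0 / (1.0 + feed)

    lp_left = lp_right = 0.0
    out_left, out_right = [], []
    for l, r in zip(left, right):
        lp_left += alpha * (l - lp_left)     # low-passed left channel
        lp_right += alpha * (r - lp_right)   # low-passed right channel
        out_left.append(norm * (l + feed * lp_right))
        out_right.append(norm * (r + feed * lp_left))
    return out_left, out_right

# A signal that exists only in the left channel now also appears,
# quieter and duller, in the right channel.
out_l, out_r = crossfeed([1.0] * 2000, [0.0] * 2000)
print(f"left: {out_l[-1]:.3f}, right: {out_r[-1]:.3f}")
```

Because the cross-fed signal changes the level and the tonal balance of each channel, any frequency response correction applied before this stage would no longer hold exactly after it.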

I run my chain at 96 kHz, even when the source material is at 44.1 kHz. Higher sampling rates are common in the music production world, as they allow using smoother antialiasing filters during processing, and also help reduce quantization noise. Going up to 192 kHz or higher will consume more CPU resources, and considering the modest number of effects used, I don't think it's really needed.

At first, I hesitated over whether I should use an integer multiple of the source sampling rate, that is, 88.2 kHz instead of 96 kHz, but then I realized that converting from 44.1 kHz to 48 kHz can be expressed conceptually as first upsampling with a multiplier of 160, and then downsampling by 147, both being integer multipliers (44100 * 160 / 147 = 48000). Also, a lot of good DAC units have an upsampling DSP processor connected before the DAC chip, for upsampling to 192 kHz (TEAC UD-x01), 384 kHz (Cambridge Audio DacMagic Plus and Azur), 768 kHz (!) (Pro-ject DacBox DS2 Ultra), or even an "odd" value of 110 kHz (Benchmark DAC1). So the DAC chip would never "know" what was the track's original sampling rate.
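The up/down factors for any pair of sampling rates can be checked with a few lines of Python. This is only a sketch of the arithmetic (the helper name `resampling_ratio` is made up for this example); it says nothing about how a real resampler implements the conversion internally.

```python
from fractions import Fraction

def resampling_ratio(source_hz: int, target_hz: int) -> Fraction:
    """Return target/source as a reduced fraction.

    Conceptually: upsample by the numerator, then downsample by the denominator.
    """
    return Fraction(target_hz, source_hz)

for target in (48_000, 88_200, 96_000, 192_000):
    ratio = resampling_ratio(44_100, target)
    print(f"44100 -> {target}: up by {ratio.numerator}, down by {ratio.denominator}")
```

For 44.1 kHz to 96 kHz this gives 320 / 147, so the "non-integer" conversion is just another ratio of two integers.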

Thus, there is no reason to worry about going from 44.1 kHz to 96 kHz on the processing chain side, as modern software resamplers should be transparent. This is assuming that the input signal doesn't have intersample peaks. And we took care of this by lowering the digital volume of the player (see Part 1), giving some headroom to the audio signal before it gets upsampled.


DSP plugins still need to be measured, even though their effects are usually better documented than those of hardware units. Why? Because there can be surprises or discoveries, as we will see. Also, some plugins for some reason have uncalibrated sliders, labelled in a very generic fashion like "0..10", and it's not clear what change each step introduces. So unless you have a very trained ear, it's better to measure first.

And the audio transport channels that we use, despite being fully digital and thus supposedly "bit-perfect" still can introduce distortions, or cause losses in audio resolution. This is an imperfect world, and we need to be prepared.

The Empty Chain

As an example, let's measure an empty processing chain consisting of Windows 10 Pro (Build 17134.rs4_release.180410-1804), Hi-Fi Virtual Cable (from player to the effects host), VB-Audio Cable (from the effects host to the analyzer input), and DDMF EffectRack (64-bit version). The virtual stream on the EffectRack uses "Windows Audio Exclusive mode".

One problem that I've noticed is that using a test sine signal at 0 dBFS causes distortion:
But after lowering the input signal level by just 0.1 dB it's gone. I've double checked that the test signal is not overshooting 0 dBFS. I've also checked with Bitter plugin that the signal is not clipping:
Given that a lot of modern recordings have peaks normalized to 0 dBFS, the advice I gave in Part 1 about lowering the digital volume on the player by at least 3.5 dB seems especially useful in this case.

I'm not sure where this distortion is happening. It could be anywhere in the kernel audio transport, in virtual cables, or in EffectRack. However, I've also tried this experiment with DDMF virtual streams, and with another effects host, PedalBoard 2, and the result was the same, so I suspect the Windows audio chain. But I must note that a 0 dBFS sine plays fine via the physical loopback of lots of sound cards, thus most likely it's a combination of Windows and virtual cable drivers that causes this behavior.

The lesson from this measurement is that I should not use a 0 dBFS test signal when testing the processing chain.

Another curious thing I've found is that with some virtual cables EffectRack causes distortion when there are no effects in the chain (stream's audio input is directly shorted to audio output), but the distortion is gone as soon as I insert a processing plugin, even if it does nothing to the audio stream (all processing knobs are at zero position).

By the way, the Bitter plugin is also helpful for verifying the actual bit resolution of the processing chain. As we have seen on the screenshot above, on Windows I do actually have 24-bit resolution. It's interesting that on Mac with Audio Hijack the resolution seems to be even better, at 32 bits:

The Equalizer

There is no shortage of equalizer plugins. My long time favorite was basiQ by Kuassa because it's free, simple to use, and it implements a good old 3-band Baxandall equalizer. Typically I used it in very moderate amounts, never going more than 5 or 6 steps from the zero settings. This is what the equalization curves look like:

Note that even in the "all zeroes" setting the frequency response isn't entirely flat. I'm not sure if it's intentional, but it's better to be aware of this (that's why we measure!). Also note that the amount of correction resulting from the same number of knob steps is not the same for low, mid, and high frequencies. I think this is to account for the fact that the human ear is less sensitive to changes in bass frequencies.

Another important thing is that when boosting any frequencies, the resulting increase in the sound power must be compensated by decreasing the output level of the plugin (the small knob at the bottom), otherwise clipping may occur on loud music passages.
But I've said that basiQ used to be my favorite plugin. I've migrated to Tone Control by GoodHertz, and this is why. Although it only has 2 bands (vs 3 on basiQ), Tone Control offers an interesting option: a linear phase filter. That means constant group delay: all frequencies are delayed by the same amount, so no frequency group lags behind the others. This is important because some musical sounds (e.g. a hi-hat crash) are wide band signals, and delaying some parts of such a signal "smears" the sound in the time domain, which according to some theories affects its localization by the brain.

Tone Control isn't free, and actually it's quite expensive for a 2-band equalizer ($95!). However, using it I could easily replicate my setups in basiQ, and Tone Control can create web shortcuts for them to use anywhere. Before that I tried DDMF LP10 equalizer, which also offers linear phase filters, but replicating delicate tone curves of basiQ with it was very hard, so I decided to pay a bit more for Tone Control.

Harmonics Enhancement

I decided to experiment with adding harmonics after reading this post by Bob Katz. I've found Fielding DSP Reviver plugin, which costs just $29. I measured it in order to calibrate the scales of its controls—they just go from "0" to "100", and also to verify that they don't have aliasing problems.

After measuring the levels of THD of Reviver, I decided never to go higher than the "5" mark for both 2nd and 3rd harmonics. For reference, the "1" mark adds 2nd harmonic at ~0.2% THD (about -55 dBFS), and for "5" it's a bit higher than 1% THD (about -40 dBFS). And for the 3rd harmonic the figures are somewhat lower, so when both sliders are at "5", this creates a natural harmonics picture:
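The relation between a distortion percentage and a level in decibels is simple to compute. The sketch below uses hypothetical helper names and gives the level relative to the fundamental; it matches a dBFS reading on the analyzer only under the assumption that the fundamental itself sits close to full scale.

```python
import math

def thd_percent_to_db(percent: float) -> float:
    """Level of a distortion product relative to the fundamental, in dB."""
    return 20.0 * math.log10(percent / 100.0)

def db_to_thd_percent(db: float) -> float:
    """Inverse conversion: relative level in dB to a percentage."""
    return 100.0 * 10.0 ** (db / 20.0)

for percent in (0.2, 1.0):
    print(f"{percent}% -> {thd_percent_to_db(percent):.1f} dB")

# The cursor reading of -39.95 dB for the 2nd harmonic, converted back to percent.
print(f"-39.95 dB -> {db_to_thd_percent(-39.95):.3f}%")
```

So 0.2% corresponds to roughly -54 dB, 1% to exactly -40 dB, and the -39.95 dB reading is indeed just a bit above 1%.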

(I've put the cursor over the 2nd harmonic to show that it's at -39.95 dBFS, while the 3rd as we can see is lower than -40 dBFS.)

Turning on "Serial" mode also adds 4th and 5th harmonics:
Subjectively, adding harmonics may add "dimension" to sound, make it a bit "fatter". It in fact helps some recordings from the 70s and 80s to sound better. My hypothesis is that tube amplifiers were used in the studio for their production, so they sounded "rich" there. But when played via a transparent solid state chain on headphones they sound more "bleak" and "flat". So adding back some distortions helps.

However, I would not recommend abusing the harmonics plugin because, unfortunately, adding those "euphonic" harmonic distortions also brings in unpleasant non-harmonics. Dr. Uli Brüggemann of AudioVero explains the reasons in his article. And indeed, if we look at IMD SMPTE and CCIF measurements for Reviver, the level of SMPTE-measured distortions is quite high. So use it with caution: keeping it turned on all the time defeats the purpose of having a transparent reproduction chain. The effect of non-linear distortions can also be seen on the frequency response graph which becomes noticeably "fuzzier":

Crossfeed and Headphone Normalization

I covered both Redline Monitor and a couple of headphone normalization plugins in my earlier posts. For headphone normalization I would also prefer a plugin that has "linear phase" mode.

Now I would like to explain why I'm actually not currently using headphone normalization. From my experience, normalization indeed makes the headphones sound different from their original tuning, which can be exciting at first. But is it really a setup that you would want to use all the time? I doubt that. I actually have doubts that normalization can serve as a "reference", here is why.

There are several factors that can affect the headphone normalization process: first, the same model of headphones isn't necessarily consistent from instance to instance, and besides that, pad wear can affect bass response. OK, some companies offer measuring your headphones, and imagine we have done that. Then the second factor comes in—your head. The dummy heads used in measurement use statistical averages for head, ear pinna, and ear canal dimensions. But they are obviously not the same as your head, and this will affect the shape of the frequency response at the ear drum (see this interesting thesis work for details). And finally, the target response is not set in stone. There are several versions of Harman target curve, diffuse field curve, and your actual room curve.

So, there are just a lot of variables in the normalization process. There is a solution that takes them all into account, the Smyth Realizer, but it's too expensive for ordinary folks. Thus, since we are not interested in music production, but only in pleasantly sounding reproduction, I've found that simply using tone controls delivers a desired sound with much less effort.


For me, using a simple DSP processing chain and a transparent reproduction chain has become a flexible and not too expensive way to enjoy my music in headphones. This setup offers endless ways to experiment with tonalities, "warmth", and soundstage perception while staying with the same hardware.

Wednesday, June 27, 2018

My Setup for Headphone Listening, Part 1

I listen to music on headphones a lot. This is my retreat from distracting noise that often surrounds me at work and at home. I don't normally use portable audio players or a mobile phone, instead I have what is called a "desktop" system: a computer, a desktop DAC, a desktop headphone amp, and closed over-ear headphones. At home, I also use a couple of pairs of open over-ears when it's quiet around.

When listening on headphones, I can notice more issues with the reproduction chain and in the recording, compared to listening on speakers. That's why I pay a lot more attention to details for the headphone setup. My goal here is to be able to relax and enjoy the music on headphones the same way I can enjoy it on speakers.


The hardware part of the chain consists of three components: USB DAC, headphone amplifier, and headphones. The criterion for choosing them is easy to formulate: be as transparent as possible. That means adding as little distortion and coloration as possible, and having precise inter-channel balance and low crosstalk levels. I tend to avoid doing any sound processing in the analog domain, relying on DSP plugins running on the computer instead.


As far as electronic components are concerned, it's quite easy to fulfill the transparency requirements. Any modern DAC with the price starting from $200 does the job. Here are some not very expensive DACs that I'm familiar with.

Cambridge Audio DacMagic series. The entry level models, DacMagic 100 and DacMagic Plus, used to be expensive when they were introduced, but have now become cheaper because they don't handle DSD and MQA. So for people not interested in those formats these DACs now represent a good deal, especially DacMagic Plus with its internal operating sampling rate of 384 kHz and selectable output filters. Ken Rockwell has published a very thorough review of this unit. Note that due to the high impedance of its headphone output (50 Ohm), DacMagic Plus should not be used as a headphone amplifier, but rather only considered as a DAC.

E-MU 0404. This legendary external sound card of the past is now a bargain because it's not USB Audio Class compatible, and E-MU / Creative Labs have abandoned updating drivers for it, so it's not usable on modern OS versions. However, it has an SPDIF input, so it can be used as an SPDIF DAC driven by a USB Audio Class compliant pro audio card, or the optical output of the computer. For example, I connect it to my MOTU Microbook IIc which has a coaxial SPDIF output. 0404 only supports sampling rates up to 96 kHz over SPDIF. The other caveat is that an instance of an old OS (e.g. WinXP running in a virtual machine) is still needed in order to set up the sampling rate of this card.

JDS Labs EL DAC. Haven't tried it personally, but the price fits the budget. JDS Labs generally follow the principle of designing transparent equipment with good objective characteristics. The measurements are published here.

TEAC UD-301. A cheaper option than the UD-5xx series. The UD-501 model was measured by Archimago and looks really solid. UD-301 uses the same DAC chip, but doesn't have the option for selecting the type of the output filter.

Headphone Amplifier

Decent transparent headphone amplifiers are not hard to find either, as we can see from my previous post on measurements of AMB M3 and SPL Phonitor Mini. I also use the desktop version of Objective2 headphone amplifier.


Headphones are more tricky as there are no clear objective criteria for how headphones should sound. I'm aware of the Harman target equalization curve, but first, it's still under development, and second, not every headphone manufacturer follows it. Anyway, the frequency balance of the headphones can be corrected in the software chain, so the main requirement is low distortion levels. Personally, I've stuck with Shure SRH1540. I was also enjoying Beyerdynamic T5p until they broke. Both of those are closed over-ear headphones.

I've got some open over-ears as well: Beyerdynamic T90, Massdrop Sennheiser HD6xx, and AKG K240. These are all different sounding, with K240 being the most uncolored but also adding the most distortions, T90 sounding the most "airy", and adding extra high frequencies, with HD6xx being somewhere in the middle.

To summarize, I strive to have the reproduction chain as transparent as possible. I do not use tube amplifiers, for example. I know they can sound nice, but the distortions they add can't be taken out if needed. On the other hand, if the chain is transparent, it's easy to add any tweaks and "euphonic" distortions at the prior stage—on the computer.


The Player

The software chain starts with the music player. I'm not very picky about them. My primary sources are Google Play Music for streamed content, and Foobar2000 or VLC for grabbed CDs and high resolution (24/96) files.

The only thing I need to tweak in the player is to set its output level so it has headroom for intersample peaks. As was demonstrated in that post, a digital sound file can contain encoded sound waves that, when converted into analog, would exceed the normal level of 0 dBFS. And thus, having slightly more than 3 dB of headroom is recommended. For the Play Music player this means setting the volume control two steps below the maximum output volume:

This provides 6 dB of attenuation (one step attenuates by 3 dB), which is more than enough.

For VLC I settled on 82% of output volume (about 4 dB of attenuation), and for Foobar2000, setting the volume control to -3.5 dB provides the necessary headroom. This is a very important step, as any further sound processing step could result in clipping, and distortions caused by clipping can't be removed afterwards.
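The intersample peak problem behind this headroom recommendation can be illustrated with a textbook example: a sine at a quarter of the sampling rate, sampled 45 degrees away from its crests. This is a minimal sketch with an idealized signal (all names and constants are assumptions for illustration); real music can overshoot by less or, with pathological content, by more.

```python
import math

SAMPLE_RATE = 44_100
FREQ = SAMPLE_RATE / 4      # sine at a quarter of the sampling rate
PHASE = math.pi / 4         # samples land 45 degrees away from the crests

def sine(t: float) -> float:
    """The continuous-time waveform that the samples represent."""
    return math.sin(2.0 * math.pi * FREQ * t + PHASE)

# Sample the sine and normalize so that the highest sample is exactly at 0 dBFS.
samples = [sine(n / SAMPLE_RATE) for n in range(64)]
gain = 1.0 / max(abs(s) for s in samples)

# Evaluate the same (normalized) waveform on a 16x denser time grid,
# which approximates what an ideal reconstruction filter would output.
OVERSAMPLE = 16
true_peak = max(
    abs(gain * sine(n / (SAMPLE_RATE * OVERSAMPLE)))
    for n in range(64 * OVERSAMPLE)
)
overshoot_db = 20.0 * math.log10(true_peak)
print(f"Peak between samples exceeds 0 dBFS by {overshoot_db:.2f} dB")
```

For this signal the waveform between the samples rises about 3 dB above the highest sample, which is why slightly more than 3 dB of attenuation is a sensible minimum.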

Plugin Host and Audio Capture

The most important component of the processing chain is the plugin host. I use hosts that allow intercepting system audio or audio from a specific application. On Mac I use Audio Hijack. This is an easy to use and stable application that includes a kernel module for capturing sound output. I think it can only host AudioUnit plugins, but generally it's not a problem since all the plugin makers provide their modules in different formats.
On Windows things are more complicated. There is a free open-source app called Equalizer APO which installs itself as a filter for the selected audio interface. It can host VST plugins. However, I've got a couple of issues with it. First, it doesn't allow VST plugins to show their meters. Second, it crashed when I was attempting to add Redline Monitor, my current favorite crossfeed plugin. Since Equalizer APO is open source it should be possible to fix both of these annoyances, but I haven't got to this yet.

Instead, I found another plugin host app called "Virtual Audio Stream". It allows using 4 independent effect racks. In order to capture applications or system sound output, VAS provides virtual audio devices, but they are limited to 44.1 kHz. However, any other "virtual cable" device can be used instead. I use "Virtual Audio Cable", where the "Hi-Fi" version supports sampling rates up to 384 kHz.

The next big topic is the list of plugins that I use with these hosts, and their settings. This will be covered in the next post.

Thursday, June 14, 2018

Measuring AMB M3 vs. SPL Phonitor Mini

I built AMB's M3 headphone amplifier more than a year ago and I have enjoyed it all this time. Judging purely from listening experience, I was quite sure that my build doesn't have any major flaws. Also, I was confident in the M3 measurements that Bob Katz has done for his unit. However, I decided to perform some of my own. As my measurement rig is not super precise, I decided to measure M3 side by side with a commercial headphone amp to have a better grip on reality. I've chosen SPL Phonitor Mini because I think it's in the same "weight category" as M3.

AMB M3 is a two stage amplifier, with the first stage based on opamps, and the second stage on MOSFET transistors (with big heat sinks!). Although formally the amp has Class AB topology, its enormous power allows it to stay in Class A for most of the use cases. Another distinguishing feature of M3 is "active ground": the ground channel also goes through the same amplification stages as left and right channels. Personally, I side with NwAvGuy, who said that it's not a good idea. But it's interesting to see what practical consequences this design choice actually has.

SPL Phonitor Mini has been one of my favorite headphone amplifiers for a long time. Initially this was due to its awesome crossfeed implementation. Now that I've found some comparable DSP implementations, this is of less importance. But I still enjoy Phonitor for its power, reliability, and the fact that it has both unbalanced and balanced inputs. Besides crossfeed, Phonitor has another feature: a high-voltage rail (120 VDC), which helps to achieve a low noise floor.

Notes on Measurements

From my previous experiments, I've found that I can trust my measurements of frequency response, THD, channel balance, and output impedance. I put less faith into my IMD measurements, but for comparison between two amplifiers this should be OK. So this is the set I decided to stick with.

I learned that when measuring an amp with a driven ground like in M3 (or a fully balanced amp), the ground channel of the probe must be left floating. I'm not entirely sure about whether this only applies to mains-powered measuring equipment (mine isn't) or not. But just in case, I decided to stick to this method. And for consistency I decided to measure both amplifiers the same way. Also for consistency, I was using unbalanced inputs on Phonitor (that's the only input on my M3).

I let both amps warm up for an hour before measuring them. For most measurements, I set the volume level on both amplifiers to output 400 mV into a 33 Ohm resistive load using a 1 kHz sine wave. The line output of MOTU Microbook IIc was attenuated to -3 dBFS.



Here I was pleasantly surprised by the superiority of M3. Below is a graph of THD for a 1 kHz sine; M3 is in the front, in orange, Phonitor is in the back, in cyan:

It can be seen that M3 has almost no harmonic distortions from the test signal, and its 60 Hz hum spike is at noise level. Here is the same graph with M3 alone:

That's very impressive. Even considering that Phonitor's harmonics are below audible threshold, on M3 they are practically absent. It's interesting that on Bob Katz's graphs (here and here) the 60 Hz spike is more prominent. Perhaps that depends on the power supply?

The results for a 20 Hz sine are also very good, this is M3:

And this is Phonitor Mini:

Apparently, 400 mV output level is a piece of cake for both amplifiers. I decided to crank them both up to produce 3 V RMS into 33 Ohm load. Distortion levels are now noticeably higher in both amps, but still below audibility (again, M3 is orange, Phonitor is cyan):

It's interesting to note that the level of THD of M3 at 3 V output: 0.0061% is still lower than Phonitor's level of THD into 440 mV: 0.0074%. That demonstrates how much power M3 has. And just a reminder, please don't rely on THD+N numbers on these graphs—they are quite high due to relatively high noise floor of my measurement rig.
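To put the two test levels into perspective, the power delivered into the 33 Ohm resistive load follows directly from P = V² / R. This is only a sketch of the arithmetic with made-up helper names, not part of the measurement procedure.

```python
import math

def power_mw(v_rms: float, load_ohm: float) -> float:
    """Average power into a resistive load, in milliwatts."""
    return 1000.0 * v_rms ** 2 / load_ohm

def level_change_db(v_from: float, v_to: float) -> float:
    """Voltage ratio expressed in decibels."""
    return 20.0 * math.log10(v_to / v_from)

LOAD_OHM = 33.0
for v_rms in (0.4, 3.0):
    print(f"{v_rms} V RMS into {LOAD_OHM:.0f} Ohm: {power_mw(v_rms, LOAD_OHM):.1f} mW")

print(f"Level increase: {level_change_db(0.4, 3.0):.1f} dB")
```

Going from 400 mV to 3 V raises the output from about 5 mW to about 273 mW, an increase of roughly 17.5 dB.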


Here M3 also demonstrated better performance. Here is SMPTE IMD for M3:

And for Phonitor Mini:

As we can see, there are a lot more sidebands on the 7 kHz signal caused by 60 Hz signal played along with it.

And here is CCIF IMD for M3:

And for Phonitor Mini:

The 1 kHz signal, the result of interaction between the 19 kHz and 20 kHz signals, is more visible, although its level is at -100 dBFS, which is inaudible.
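For reference, the frequencies at which these intermodulation products show up can be predicted from the test tones alone. The sketch below uses hypothetical helper names and the tone pairs mentioned in this post (60 Hz + 7 kHz for SMPTE, 19 kHz + 20 kHz for CCIF); it lists where to look on the spectrum, not how strong the products are.

```python
def smpte_sidebands(low_hz: int, high_hz: int, max_order: int = 2) -> list:
    """Sidebands around the high tone at high +/- k * low, for k = 1..max_order."""
    bands = []
    for k in range(1, max_order + 1):
        bands.extend((high_hz - k * low_hz, high_hz + k * low_hz))
    return sorted(bands)

def ccif_products(f1_hz: int, f2_hz: int) -> dict:
    """Difference tone and the two nearest third-order products of a twin-tone signal."""
    return {
        "difference": abs(f2_hz - f1_hz),
        "third_order": sorted((2 * f1_hz - f2_hz, 2 * f2_hz - f1_hz)),
    }

print(smpte_sidebands(60, 7_000))
print(ccif_products(19_000, 20_000))
```

With these tones the SMPTE sidebands sit at 6880, 6940, 7060 and 7120 Hz, and the CCIF difference tone lands at 1 kHz.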

Frequency Response

I would not expect anything but a ruler flat response from both of these amplifiers, and indeed this was the case:

The channel balance is also exemplary. It's 0.078 dB for Phonitor, and 0.061 dB for M3. And remember that I've built M3 by hand!

Stereo Separation (Crosstalk)

It's the only measurement where M3 has shown worse results than Phonitor. Here is Phonitor:

The crosstalk level stays at -74 dBFS until 1 kHz, and then climbs up to -64 dBFS. It's definitely better than Behringer UCA202 was showing. Now let's look at M3:

Here variation is less, within 4 dB, but the overall level is higher, at -60.5 dBFS. Why is that? Bob Katz obtained a similarly high figure: -42 dBFS into a 20 Ohm load. But the crosstalk was improving (becoming lower) as the load impedance grew higher. Bob explains this by the fact that the driven (active) ground of M3 has output impedance.

As for the worse absolute value, Bob says that anything better than -30 dBFS is insignificant. Thus, -60 dBFS isn't a big deal.

Output Impedance

Another victory of Phonitor: 0.06 Ohm of output impedance versus 0.11 Ohm on M3. Although, I'm not sure this measurement has the same meaning considering the driven ground of M3.


M3 is a transparent amplifier. The absence of distortions is due to its enormous power capacity. I actually doubt that the driven ground has much influence on its performance. It would be interesting to build a version of M3 with a classical passive ground channel. Like LXmini speakers, it's a great design that can be reliably built and provide a consistent level of performance.

Thursday, June 7, 2018

Linkwitz LXmini—First Impressions

Initially I was planning to do the next post about my measurements of FiiO E5 headphone amplifier—another attempt to "calibrate" my measurement rig against NwAvGuy's Prism dScope, but I've got sidetracked by another project.

It was a long time ago that I learned about loudspeakers designed by Siegfried Linkwitz. His promise is to deliver great sound in untreated domestic rooms. Sounds challenging, and his speaker designs depart greatly from traditional "boxes." One particular model, LXmini, looked very unusual: made from plastic drain pipes, with drivers positioned orthogonal to each other. I got a chance to attend a demo at the Burning Amp festival, and they indeed sounded quite nice to me.

I bought build plans for LXmini, but was endlessly procrastinating actually building them. Finally, I've made an effort—ordered the kit from Madisound and bought the rest of parts at Home Depot. Building took several days mainly because I had to paint the parts, wait until they dry out, then glue them together, then wait again. I've made the speakers in black. Here they are:

Compared to powered monitors on stands they look less bulky, giving back to the room the sense of space.

Choosing the Amplifier

All of Linkwitz designs use active crossovers based traditionally on miniDSP boards (there are of course variations due to DIY nature of this project). With this approach, each speaker driver requires its own amplifier, so for the pair of speakers I had to provide 4 channels of amplification.

I decided to look for an amplifier in a half-rack width body so I can fit it into my gear rack. The woofer driver of LXmini is rated for 8 Ohm impedance and "long term power handling" for 80 Watts. The second—full range driver is 4 Ohm and requires less power. So I decided to look for a 4 channel amplifier rated for 100 Watts into 8 Ohms to have some headroom.

The choice of half-rack width amplifiers has turned out to be not very wide. I've found some models from pro audio equipment makers: Atlas, Crestron, Parasound, QSC, and Stewart Audio. All of them were class D—not surprising because heat sinks that are required for delivering this amount of power via class AB would never fit into half-rack format. But I wasn't afraid of class D amp, as my JBL LSR305 monitors use them, and I can't see any difference from class AB amplifiers in KRK Rokit G5.

I've chosen SPA4-100 from QSC. It was matching my requirements exactly, and the specs state very flat frequency response. It isn't cheap though—costing above $800, but QSC is a well known brand of pro amplifiers, so my hopes were for good quality and long term reliability.

This is how it looks in my rack:

Initial Setup and Check

I decided to put LXmini at the front, replacing my KRKs. I also decided to try to get rid of my center channel because the speaker (E3c) had a non-uniform frequency response that was quite hard to correct, and I could always distinguish it by ear from other speakers. This left me with a bit unusual 4.1 configuration. However, it's not the infamous "quadro" setup, but rather the traditional 5.1 layout, just without the center.

Since LXmini require an additional DSP which has a non-negligible delay, I used REW to make sure the speakers are time aligned with each other. This process is based upon frequency response measurement, and when I looked at the measured FR I was pleasantly surprised how well the speakers are matched:

The graphs use "psychoacoustic" smoothing. Obviously, the irregularities at low frequencies are due to room modes. Judging by the right channel (red), the natural roll off of the speakers starts at 50 Hz. BTW, I've configured my audio chain so that I can drive LXminis either on their own in stereo configuration, or as part of the surround setup with a subwoofer. These graphs are for the stereo configuration.

The interesting thing about LXmini is that unlike traditional designs, they have quite low crossover point—around 700 Hz, and the full frequency range above is covered by the top speaker. Thus, the top speaker in LXminis is properly called "full range", not "tweeter."

Then I ran LEDR test. In short, it's a synthesized signal that exploits HRTF in order to achieve 3D positioning of the test sound in order to help evaluating "imaging." In a room-speaker system with tamed early reflections and reasonably flat FR playing this test signal produces a remarkable effect of a sound moving in an arc in different planes, including vertical one.

Previously I tried this test with my JBL and KRK speakers set as fronts. The KRKs were producing a more realistic picture, although the perception of vertical movement was quite weak. With JBL, everything was smeared. In fact, even a simple test of playing pink noise through both stereo channels wasn't producing any sensible phantom center image with JBLs. That's why some time ago I put them to rear channels position where they do their job better.

Directionality and Energy Time Curve

The key to understanding why all those speakers have different ability to resolve the sound stage in my room lies in the character of their interaction with it. From my previous comparison of my JBLs with KRKs I know that JBLs have wider dispersion, due to the construction of their high frequency horn. And LXmini has the narrowest radiation pattern—dipole (figure 8).

My listening area is not symmetric, with a wall and a large book shelf on the left side. I do have space behind the speakers, and on the right side. Due to the reflective surfaces being close on the left, I always have to compensate for additional sound energy there by slightly reducing the volume level of the left speakers (yes, for the rear one, too). Another issue with the room is that the ceiling is quite low: 2.4 meters (8'). On the other hand, there is a large sofa with a cloth cover and the floor is carpeted, creating some natural sound absorption.

Apparently, there are lots of reflections in my room. The question is, how harmful are they for the sound localization cues. A good hint for answering this question is provided by the Energy Time Curve graph. Here is a very good introduction from Gik Acoustics on how to interpret it. Bob Katz's book "Mastering Audio" also contains useful information about the ETC.

Let's look at the ETC graphs for LXminis in my room (listening position, the first 30 milliseconds):

I would say, they look really good. The initial impulse decays to almost -20 dB during the first millisecond. And all the reflections arriving within the first 30 ms are below -15 dB. That's an exemplary performance for an untreated room. For comparison, this is how ETC graph looks for my KRKs:

I left the LXmini graphs as shadowed plots for comparison. Here we see much stronger reflections arriving within the first 5 ms. They must be caused by wider radiation pattern. Some of the sound radiated to the sides immediately reflects from a closely positioned surface and reaches the listening position almost together with the main impulse. For the JBLs the situation is even worse:

Here we see a series of strong reflections arriving within the first 5 ms, and also that later reflections are stronger. I'm pretty sure this is due to the much wider radiation pattern of the JBLs.
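To put these arrival times into perspective, a delay on the ETC can be converted into the extra distance the reflected sound has travelled compared to the direct sound. Below is a minimal Python sketch. The speed of sound value (343 m/s, roughly dry air at 20 °C) and the function name are assumptions made for illustration, not something taken from the measurements.

```python
# Assumed speed of sound in dry air at about 20 °C.
SPEED_OF_SOUND_M_S = 343.0

def extra_path_m(delay_after_direct_ms: float) -> float:
    """Extra distance (in meters) travelled by a reflection that arrives
    delay_after_direct_ms milliseconds after the direct sound."""
    return SPEED_OF_SOUND_M_S * delay_after_direct_ms / 1000.0

# Time spans mentioned in the text: the first 5 ms and the 30 ms ETC window.
for delay in (1.0, 5.0, 30.0):
    print(f"{delay:5.1f} ms -> {extra_path_m(delay):.2f} m of extra path")
```

Under that assumption, a reflection arriving 5 ms after the direct sound has made a detour of roughly 1.7 m, which is consistent with reflective surfaces located close to the speakers.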

But don't get me wrong, I'm not saying that JBL LSR305 is a bad speaker—no, it's in fact a good one, especially considering its price. It has a flat frequency response and very good directionality. It's just not for a room where reflective surfaces are located close to it. I'm sure, in a more spacious room, or in an acoustically treated room where strong early reflections are eliminated, it will sound great and will not have any problems with imaging.

In fact, even in my room these JBLs work great as rear speakers, due to their proximity to the listening area. In this case, their direct sound dominates over any reflections and they sound very true to life.


LXmini is a fantastic speaker for a small untreated room due to its narrow radiation pattern. The phantom center image created by a stereo pair of LXminis is so strong that I've got rid of my center speaker in surround configuration.

Sunday, May 13, 2018

Measuring Behringer UCA202

I decided to exercise my measurement rig by measuring the Behringer UCA202 music interface and comparing the results with the ones NwAvGuy published a long time ago. He was using a Prism dScope, and I was curious how close I could get to repeating his results on my equipment.

What Is UCA202

UCA202 is quite an old and basic music interface. Its digital section is built using the BurrBrown/TI PCM2902 chip, which is only capable of 16-bit 48 kHz resolution. The good thing about UCA202 is that it costs less than $30, and is USB Audio Class 1.1 compatible, so it works with any modern operating system without drivers.

Behringer produces these interfaces in large batches, so I was assuming they have a good level of quality control and thus little variance between the characteristics of each device. Which again is a good thing when you are trying to compare your measurements with results from 2012.

But before diving into UCA202 measurements let's digress for one technical observation.

MOTU Microbook IIc Microphone Input

From my previous examinations of Microbook, we have seen that it has a quite uneven noise floor on its line input:

I decided to try other inputs: the microphone input and the instrument input. Their main difference from the line input is that they have higher sensitivity and are equipped with an amplifier. The instrument input was no better than the line input, but the microphone input has turned out to have lower noise floor:

As we can see, the microphone input (cyan plot) has less noise at low frequencies and does not exhibit the peak between 2 and 3 kHz. Its noise level is below -102 dBFS, and reaches -103 dBFS when "pad" (input signal attenuation by 20 dB) is activated.

Then I tried connecting the Millet Soundcard Interface (SCI) to the microphone input instead of the line input. Here I found that without padding, SCI's output signal level is too high for the microphone input, but with padding activated it becomes a bit low, so I turned up input amplification in CueMix FX (MOTU's control panel). One advantage of this approach is that I could make a 0 dBFS test signal reach almost 0 dBFS on the input after passing through SCI. And this did improve the measured THD+N and IMD levels, despite the fact that due to amplification the overall noise level has also risen.

For illustration, here is 0 dBFS 1 kHz signal played from MOTU's line output via SCI back to MOTU's microphone input, padded and attenuated:

The cyan plot is microphone input's noise floor that we have seen before.

This is how distortion figures compare with my previous measurements connecting SCI to MOTU's line input:

                                     Line      Mic
                        THD  1 kHz: 0.00067% 0.00064%
          THD+N (20Hz-48kHz) 1 kHz: 0.0087%  0.0075%
                         THD 20 Hz: 0.0048%  0.0041%
          THD+N (20Hz-48kHz) 20 Hz: 0.010%   0.0090%
                        THD 20 kHz: 0.025%   0.0027%
         THD+N (20Hz-48kHz) 20 kHz: 0.026%   0.0067%
                  IMD CCIF -3 dBFS: 0.016%   0.0034%
                 IMD SMPTE -3 dBFS: 0.0036%  0.0038%

So the microphone input provides the best characteristics you can possibly get from the Microbook / SCI pair. Also it is better isolated from the line output. I have noticed that high frequency tones played over line output can interfere with line input (one possible explanation why there were some issues with measuring IMD CCIF via line input.)

UCA202 Line Out


Now back to UCA202. Here is what my test setup looked like:

Everything was running on battery power, and there were no loops besides the audio signal loop. Note that UCA202 was running at 44.1 kHz sampling rate to match NwAvGuy's measurements.

One good point of using SCI even for line level measurements is that it has the input impedance of 100 kOhm, the same as of dScope. As we can see from the photo on the NwAvGuy's post, the line outs of UCA202 were connected directly into dScope's inputs, so it was acting as a load by itself.

One thing I have noticed from NwAvGuy's measurements is that dScope allows specifying both lower and upper range for THD+N measurements, so on his picture we can see that the FFT was showing frequencies up to 96 kHz, but from the text we see that "The distortion measurements include all frequencies up to 22 Khz."
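To illustrate why the upper bound matters, here is a minimal Python sketch of a band-limited THD+N calculation over a list of spectrum bins. It is a simplification and not how dScope or ARTA actually compute the figure (real analyzers also deal with windowing and leakage). The function name, the bin format, and the example amplitudes are all made up for illustration.

```python
import math

def thd_plus_n_percent(spectrum, fundamental_hz, lo_hz, hi_hz, tol_hz=1.0):
    """Return THD+N in percent, counting only bins within [lo_hz, hi_hz].

    spectrum is a list of (frequency_hz, rms_amplitude) pairs, e.g. FFT bins.
    Bins within tol_hz of the fundamental are treated as the signal itself.
    """
    signal_power = 0.0
    residual_power = 0.0
    for freq, amplitude in spectrum:
        if abs(freq - fundamental_hz) <= tol_hz:
            signal_power += amplitude ** 2
        elif lo_hz <= freq <= hi_hz:
            residual_power += amplitude ** 2
    if signal_power == 0.0:
        raise ValueError("no bins found at the fundamental frequency")
    return 100.0 * math.sqrt(residual_power / signal_power)

# Made-up bins: a 1 kHz tone, two harmonics, and one ultrasonic noise component.
bins = [(1000.0, 1.0), (2000.0, 1e-4), (3000.0, 5e-5), (30000.0, 2e-4)]
audio_band = thd_plus_n_percent(bins, 1000.0, 20.0, 22000.0)
full_band = thd_plus_n_percent(bins, 1000.0, 20.0, 48000.0)
print(f"20 Hz-22 kHz: {audio_band:.4f}%   20 Hz-48 kHz: {full_band:.4f}%")
```

With these made-up numbers the ultrasonic component roughly doubles the result once it falls inside the measurement band, which is why figures taken with different upper bounds are not directly comparable.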

My attempt to repeat the measurement in the same way has revealed a shortcoming of ARTA: it only allows specifying the lower bound for THD+N measurements, while the upper bound is always half of the sampling rate. So, in order to check the ultrasonic noise bump from noise shaping I had to run the measurement at a 96 kHz sampling rate, but for the THD+N measurement I had to switch to a 48 kHz sampling rate. So here is the measurement showing the noise shaping bump:

Please, don't compare the THD(+N) figures here with the result from NwAvGuy. Here is the measurement at 48 kHz sampling rate:

Now we can compare the figures:

                               dScope           MOTU/SCI
                    THD 1 kHz: 0.0079%            0.01%
     THD+N (20Hz-22kHz) 1 kHz: 0.0089% (+0.001%)  0.012% (+0.002%)

So the measurements are in the same ball park, but the figures obtained with dScope are lower by 0.002–0.003% (that's 2–2.6 dB difference.)
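The decibel differences quoted above follow from the ratio of the two distortion percentages. A minimal Python sketch of that arithmetic (the helper name is made up for illustration):

```python
import math

def ratio_db(a: float, b: float) -> float:
    """Express the amplitude ratio a / b in decibels."""
    return 20.0 * math.log10(a / b)

# Distortion figures from the table above, in percent.
thd_diff_db = ratio_db(0.01, 0.0079)     # MOTU/SCI vs dScope, THD
thdn_diff_db = ratio_db(0.012, 0.0089)   # MOTU/SCI vs dScope, THD+N
print(f"THD: {thd_diff_db:.2f} dB, THD+N: {thdn_diff_db:.2f} dB")
```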


NwAvGuy reports "excellent IMD result, run at –2 dBFS ... IMD of 0.0009%." From the graph screenshot, he was using two tones: 60 Hz and 7 kHz with amplitude ratio 4:1 (SMPTE standard).
First I have tried this test with MOTU loopback through SCI:

Even here IMD is 0.0033%, which is about 3.7 times higher than he measured with UCA202. So I wasn't expecting any good result when measuring the UCA202 myself. Indeed, the number is quite bad:

It's 0.019% (21 times bigger). However, the patterns of 60 Hz products and 7 kHz sidebands look similar to NwAvGuy's picture.

The inconsistency with NwAvGuy's figures may be due to the fact that ARTA uses the DIN standard for IMD calculation. The manual says that "This intermodulation factor is very close to the value of intermodulation distortion that can be measured by SMPTE analog instrumentation," but I'm not sure what their definition of "very close" is.
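For reference, the two-tone stimulus used in this test (60 Hz and 7 kHz at a 4:1 amplitude ratio) can be sketched in Python as shown below. This is not the generator used by dScope or ARTA. In particular, the sketch assumes that the "-2 dBFS" level refers to the peak of the combined waveform, which is a guess; the function name and defaults are made up for illustration.

```python
import math

def smpte_two_tone(sample_rate=48000, duration_s=1.0, peak_dbfs=-2.0,
                   f_low=60.0, f_high=7000.0, ratio=4.0):
    """Generate a two-tone test signal as a list of floats in [-1, 1].

    Assumption: peak_dbfs is the largest possible peak of the sum of both
    tones, so the individual amplitudes are scaled to add up to that peak.
    """
    peak = 10.0 ** (peak_dbfs / 20.0)
    a_low = peak * ratio / (ratio + 1.0)   # amplitude of the 60 Hz tone
    a_high = peak / (ratio + 1.0)          # amplitude of the 7 kHz tone
    n = int(sample_rate * duration_s)
    return [
        a_low * math.sin(2.0 * math.pi * f_low * i / sample_rate)
        + a_high * math.sin(2.0 * math.pi * f_high * i / sample_rate)
        for i in range(n)
    ]

samples = smpte_two_tone()
print(f"{len(samples)} samples, max |x| = {max(abs(s) for s in samples):.3f}")
```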

UCA202 Headphone Out

I used a 330 Ohm resistive load for UCA202's headphone output (NwAvGuy was using 150 Ohm), which in theory should result in even less distortion.

Note that there are two ways to exercise the headphone output on UCA202: the first is to emit stimulus signal via its USB DAC, and the second is to turn on "Monitor" switch on UCA202 and connect an external generator to its line inputs. I don't know for sure if the signal passes through the digital section in this case, or the line input connects to the headphone output in the analog domain only (from my measurements, I suppose it's the latter.)

In fact, NwAvGuy didn't specify which way he was using for measuring the headphone output. I was providing stimuli via the USB DAC most of the time, and only used the "Monitor" option when obtaining the frequency response, as in this case I could "subtract" the transfer functions of MOTU/SCI from the measurements.


I have set the headphone volume level on UCA202 to output 400 mV (same as NwAvGuy), and measured THD using 48 kHz sampling rate on MOTU:

So we have 0.0074% THD compared to 0.0070% measured by NwAvGuy. That's very close!

Maximum Output

Here my experience was a bit different. NwAvGuy says that at the maximum volume his UCA202 was producing 660 mV into a 150 Ohm load, and had 0.97% THD. Again, this is when playing a 0 dBFS 1 kHz sine wave. My UCA202, on the other hand, was producing about 1 V RMS into a 330 Ohm load at the maximum volume, albeit with enormous distortion. The 1% THD level was reached at 782 mV into the 330 Ohm load.

So maybe Behringer have beefed up the headphone output on later revisions of UCA202? Looks so. I couldn't resist opening up the unit, and found that instead of 4558 opamp that was used in NwAvGuy's unit, my unit uses 4556A. So it's definitely possible that the opamp gain was also tweaked to accommodate high impedance headphones better.

Channel Separation

Here ARTA isn't very helpful—it doesn't provide a dedicated measurement mode. But I could use frequency response (FR) measurement mode, play the test signal into one channel and capture another channel to see how much of the test signal did leak in—this is exactly what the channel separation test is about, right? I attempted to do this and found that ARTA always uses both channels for output, so this approach didn't work out.

As a workaround, I have saved the test signal (periodic pink noise) into a file, one channel only. Then I started playing this file via UCA202, and analyzing the input in ARTA. This somewhat worked, but the FR at high frequencies was looking very noisy.
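A file with the stimulus in one channel only can also be produced with a few lines of Python using the standard `wave` module. The sketch below is an assumption about how one might do it, not the procedure used for this post: it writes a 16-bit stereo WAV with the signal in the left channel and silence in the right one, and uses a plain 1 kHz sine as a stand-in for the periodic pink noise.

```python
import io
import math
import struct
import wave

def write_left_only_wav(target, samples, sample_rate=48000):
    """Write samples (floats in [-1, 1]) to the left channel of a 16-bit
    stereo WAV, leaving the right channel silent.

    target is a file path or a writable, seekable binary file object.
    """
    frames = bytearray()
    for s in samples:
        clipped = max(-1.0, min(1.0, s))
        left = int(round(clipped * 32767))
        frames += struct.pack("<hh", left, 0)  # (left, right) sample pair
    with wave.open(target, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)          # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))

# Stand-in stimulus: one second of a 1 kHz sine at half of full scale.
tone = [0.5 * math.sin(2.0 * math.pi * 1000.0 * i / 48000) for i in range(48000)]
buffer = io.BytesIO()                # use a file path to write to disk instead
write_left_only_wav(buffer, tone)
```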

So, I installed good old Room EQ Wizard (REW) and measured the FR using it. I had to connect both UCA202 and Microbook to the same laptop, so I plugged UCA202 in through a USB galvanic isolator.

Doing this measurement with REW was much easier as it allows specifying which channels to use for playback and capture. The resulting graph was also noisy at high frequencies, but REW is good at smoothing. And after "calibrating" the graphs—offsetting them so the direct channel FR goes at 0 dBFS, I've got a graph which looks very much like NwAvGuy's:

My output was into a 330 Ohm load, but it didn't affect the results too much. Also note that on my graph the separation at high frequencies is closer to -60 dBFS than on NwAvGuy's. But I still consider the results to be pretty close to expected.


The jitter measurement wasn't very conclusive. First I measured jitter using ARTA on the MOTU loopback to establish a baseline (SCI, being a fully analog device, shouldn't affect jitter at all.) The picture was looking very clean. I guess all of MOTU's jitter got lost in the noise level. Even with averaging I couldn't see any sidebands emerging.

Then I switched to UCA202. Note that I was still hosting it on the same measurement laptop (through the USB isolator), so I had to run it at the same sampling rate as MOTU, 48 kHz. This is yet another limitation of ARTA: it can't use different sampling rates for input and output devices. Like NwAvGuy, I was using the headphone output on UCA202 with an RMS level of 400 mV. Here I could see some jitter:

So there are sidebands of about 300 Hz. The absolute levels don't match what NwAvGuy's graph shows, but on the other hand, there is less low frequency "spread."

And note yet another limitation of ARTA: there is only one marker available, so I had to add the labels for the sidebands in a graphics editor.

Output Impedance

Measuring output impedance was easy—I didn't even need ARTA for that. I simply used my Agilent DMM to capture open circuit voltage when outputting 1 kHz tone at the same volume level as needed to achieve 400 mV into 330 Ohm. Then I used this value with this online calculator in order to get the result:

And the result, 48 Ohm, agrees with both NwAvGuy's measurement and Behringer's specs.
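The calculation behind this kind of measurement is presumably the usual voltage divider formula: the output impedance and the load form a divider, so Z_out = R_load × (V_open / V_loaded − 1). Here is a minimal Python sketch of it. The function names are made up, and since the open circuit voltage reading isn't quoted in this post, the example derives a hypothetical one from the 48 Ohm result just to show the round trip.

```python
def output_impedance_ohm(v_open: float, v_loaded: float, r_load: float) -> float:
    """Output impedance from open circuit and loaded voltages (same units)."""
    return r_load * (v_open / v_loaded - 1.0)

def open_circuit_voltage(v_loaded: float, r_load: float, z_out: float) -> float:
    """Inverse of the above: the unloaded voltage implied by z_out."""
    return v_loaded * (1.0 + z_out / r_load)

# Hypothetical open circuit reading implied by 400 mV into 330 Ohm and 48 Ohm.
v_open = open_circuit_voltage(0.400, 330.0, 48.0)
z_out = output_impedance_ohm(v_open, 0.400, 330.0)
print(f"V_open = {v_open * 1000:.0f} mV, Z_out = {z_out:.1f} Ohm")
```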

Frequency Response

Here is a problem: neither MOTU nor SCI provides a perfectly flat frequency response by itself. When measuring line level equipment which can be connected to MOTU's line input directly, that's not a problem: the dual channel FR measurement mode in ARTA and STEPS can take care of removing MOTU's transfer function from the measurement.

But I had to test the headphone amplifier of UCA202 which needs to be connected via SCI. For that, I have fired up REW once again. First I have measured the frequency response of MOTU/SCI loopback. Then I connected MOTU's line out to UCA202 line in and enabled "Monitor" mode to send the signal directly to the headphone out of UCA202. After measuring both left and right channels, I used "Trace Arithmetic" in REW in order to remove MOTU/SCI loopback response from the measurement to leave only UCA202's own response. This is what I've got:

It doesn't look like NwAvGuy's graph at all. Here I have realized that NwAvGuy must have been sending the stimulus signal via UCA202's USB DAC, rather than using "Monitor" mode. I did this, and got something looking more similar:

A couple of caveats here:

  • I had to use 48 kHz sampling rate because MOTU doesn't support 44.1 kHz which UCA202 was using in NwAvGuy's test, and REW needs both input and output devices to use the same sampling rate.
  • Apparently, the graphs are over-compensated at low frequencies because now the signal only goes through the input path of MOTU, but the loopback measurement includes both input and output paths.
I have a channel level difference at 20 kHz of 0.28 dB in "Monitor" mode, and 0.55 dB from UCA202 USB. NwAvGuy's result was "0.25 dB." I'm not so surprised by a different result here because, as we have found out, the headphone section on my unit uses somewhat different hardware compared to what he had.

So for FR, it's easy to obtain good measurements for analog power or line level equipment, but not so for digital sources.
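The loopback compensation described above boils down to dividing one transfer function by another, which for magnitude traces in dB means subtracting them point by point. Below is a minimal Python sketch of that idea. It is not REW's implementation, and the traces are made-up numbers at three frequency points, chosen only to show the arithmetic.

```python
def subtract_db(measured_db, reference_db):
    """Remove a reference response from a measured one (both in dB,
    sampled at the same frequency points)."""
    if len(measured_db) != len(reference_db):
        raise ValueError("traces must share the same frequency points")
    return [m - r for m, r in zip(measured_db, reference_db)]

# Hypothetical magnitude traces at 20 Hz, 1 kHz and 20 kHz.
freqs_hz = [20.0, 1000.0, 20000.0]
loopback_db = [-0.5, 0.0, -1.0]        # measurement rig alone
left_raw_db = [-0.8, 0.0, -1.6]        # rig + device, left channel
right_raw_db = [-0.8, -0.05, -1.88]    # rig + device, right channel

left_dut_db = subtract_db(left_raw_db, loopback_db)
right_dut_db = subtract_db(right_raw_db, loopback_db)

# Channel level difference at the highest frequency point.
imbalance_at_top_db = abs(left_dut_db[-1] - right_dut_db[-1])
print(f"Channel difference at {freqs_hz[-1]:.0f} Hz: {imbalance_at_top_db:.2f} dB")
```

Note that the subtraction is only as good as the reference: if the reference includes a path that the actual measurement doesn't go through, the result ends up over-compensated.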


What measurements done using MOTU/SCI rig I can trust:
  • Frequency response for analog power and line level equipment.
  • THD and THD+N, if taken at a proper sampling rate for limiting the upper range. Although, they will likely be slightly higher than actual.
  • Channel separation.
  • Output impedance.
What measurements I cannot trust:
  • Frequency response for digital equipment.
  • IMD—will likely be much higher than actual.
  • Jitter—the absolute values can't be trusted but still can be used for comparisons or detecting gross issues.
Shortcomings of ARTA:
  • Unable to specify the upper range for THD(+N) measurements.
  • No channel separation measurement mode.
  • Always emits the stimulus signal into both channels of the output device.
  • Both input and output device have to run at the same sample rate.
  • Single marker in FFT windows.
Shortcomings of Microbook/SCI rig:
  • No support for 44.1 kHz sampling rate.
  • High noise floor.
  • Not perfectly flat own FR, need to compensate for that.
  • Only one channel on SCI so it's impossible to use dual channel FR measurement.
So the capabilities of my rig are good enough for exploratory testing and comparisons, but not so good for doing "absolute" measurements. However, for its price that's a good enough result.