Sunday, August 14, 2022

Modular Audio Processing System, Part III

Finally, after Part I and Part II, we are getting to the last part of my audio system description. First I'll say a few words about the power unit, and then go into detail about iterating on the digital audio path, also presenting some measurements.

The Power Unit

When a bunch of hardware units are stacked in a rack and each of them requires a power outlet, a natural desire arises to have only a single power cord for the entire rack. For a long time I was using a simple metal-housed power strip which I bolted to the side wall of the rack. At some point I decided that I wanted to provide a more serious level of protection for the equipment. I also wanted the power unit to be implemented in the same half-rack form factor as the rest of the equipment, and of course I wanted it to have no fans.

Around that time I also learned about a principle with a somewhat spiritual-sounding name: the principle of "non-sacrificial" power protection. There is nothing supernatural in it though. Most power filter and protector strips used at home employ electrical elements that are intended to take the impact of an electrical power surge and thus "sacrifice" themselves, protecting the equipment this way. The elements used for this noble role are called "metal oxide varistors" (MOVs). The problem is that power protectors never indicate how many of the MOVs contained in them are still in good shape, thus it's always a lottery when such a protector will fail, possibly taking down the downstream equipment with it.

In contrast, the power protector I bought, the Middle Atlantic PD-415R-SP, was intentionally built using a MOV-free design. Another feature advertised by the manufacturer is EMI filtering between the sockets, which is nice to have, at least in theory, when one has to mix digital and analog equipment and use switched power supplies. I must admit, I partially defeated the last feature by using power socket splitters, because the power unit unfortunately only has 4 sockets, while I have 7 pieces of equipment to power. However, since there is a lot of equipment and wiring packed into a compact rack, there is a lot of EMI "flying" around, thus filtering just at connection points is not enough anyway.

To close the topic of the power unit, another drawback besides not having enough sockets is its relatively high price: around $350–$500, depending on the dealer. However, if we divide this price per socket, and compare it to the price of the equipment it protects, it seems like a reasonable investment.

The Digital Path

Finally, the fun part. My aim was to ensure that practically any digital source of audio could be connected to the input of the DSP. This is because I don't want to limit myself to the use of certain streaming services or stick to media software like Roon. I'm a long time Google Play / YouTube Music user, I ripped my audio CDs, and I also might want to play something via a browser. In addition, I recently decided to subscribe to Apple Music because they have switched to lossless streaming, and even offer "high resolution" versions of many popular albums. With a monthly fee which is less than the cost of a typical CD, this was an easy decision.

In order to be able to use a wide variety of software-based audio sources, one needs to use a real computer, or at least a mobile device. Initially I tried using the same Mac Mini which runs my DSP, however the performance of this 8-year-old machine is clearly not enough to avoid glitches when running a browser alongside Reaper. Also I realized that I want a device with its own screen and keyboard input, so I can use it while I work. So I took an old MacBook Air off the shelf and connected it directly to the MOTU card by Ethernet via the AVB protocol. After a short period of use it became obvious that modern browsers can pose a heavy load for any old computer: after half an hour of streaming YouTube Music the MacBook Air was always turning its fan on and ruining the listening experience.

I firmly decided that I need a fanless device, so I restored another "Air" device, this time an iPad Air. It had its screen broken and I worked around this with the help of an adhesive "screen protector" film. Then I started considering options for getting digital audio out of the iPad (mine has a 3.5 mm analog output, but...) and I realized there are plenty of ways:

  • AirPlay, which can be used either over wireless or over wired network connection. Since iPad needs to be connected to a charger, the wired option seems to be more appropriate, especially if an Ethernet dongle with PoE (Power-over-Ethernet) is used—just a single wire needs to go into the device!

  • HDMI output via a dongle—since it's an old iPad model, it has a Lightning output, thus use of an Apple-made dongle is preferred.

  • USB output via a different type of dongle. Obviously, this dongle needs a power input, too. Unfortunately, USB audio interfaces that can provide power are less frequent to encounter than I would like.

Let's compare these options more thoroughly.

Comparison of Digital Output Alternatives for iPad


The AirPlay protocol: it's no secret that it is based on the open RTSP streaming protocol, and since the encryption key that Apple uses was extracted, plenty of open-source clients have appeared. I have a Raspberry Pi lying around (naturally, I amassed a lot of old computing devices), and I found the DigiOne SPDIF "hat" for RPi from a company that seems to care about audio quality. Another option is to connect the Pi to a USB Audio Class compliant sound card.

I decided to try the shairport-sync AirPlay client. After going through its docs I realized that unlike the AVB protocol, AirPlay does not have a notion of a "master clock," which means that the sender and the receiver of audio essentially run "freewheeling." Thus, even if both use the same nominal sampling rate (the AirPlay protocol always uses a 44.1 kHz sampling rate), the effective sampling rates differ (for example, the sender ends up running at 44099 Hz and the receiver at 44101 Hz), so frames have to be dropped or zero frames need to be stuffed into the stream, and glitches are unavoidable without special precautions. In order to avoid glitches, shairport-sync resamples the received audio to the sampling rate of the playback device. The effective sampling rate of the sender is discovered from timestamps sent together with the audio data.
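To put numbers on the clock mismatch described above, here is a minimal Python sketch. It is only arithmetic on the two example rates from the text; the function names are made up for illustration and have nothing to do with how shairport-sync is actually implemented.

```python
def slip_interval_seconds(sender_hz: float, receiver_hz: float) -> float:
    """Seconds between single-frame slips for two free-running clocks."""
    drift = abs(sender_hz - receiver_hz)  # surplus or missing frames per second
    if drift == 0:
        return float("inf")  # perfectly matched clocks never slip
    return 1.0 / drift


def frames_slipped(sender_hz: float, receiver_hz: float, seconds: float) -> float:
    """Frames that must be dropped or stuffed over the given duration."""
    return abs(sender_hz - receiver_hz) * seconds


def drift_ppm(sender_hz: float, receiver_hz: float) -> float:
    """Relative rate error of the receiver against the sender, in ppm."""
    return (receiver_hz / sender_hz - 1.0) * 1e6


if __name__ == "__main__":
    sender, receiver = 44099.0, 44101.0  # example rates from the text
    print("one slip every", slip_interval_seconds(sender, receiver), "s")
    print("slips in a 4 minute track:", frames_slipped(sender, receiver, 240.0))
    print("rate error: %.1f ppm" % drift_ppm(sender, receiver))
```

With these example rates a frame has to be added or removed twice per second, which is why some form of resampling or re-synchronization is needed on the receiving side.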

After playing with shairport-sync, I must admit that I'm impressed by the efforts of its author, Mike Brady, to make software that "just works." However, since I didn't really have to transport audio far away and over wireless networks, I decided that perhaps all this complexity of re-synchronization at the receiver side can be avoided. Another important shortcoming of shairport-sync is that it only supports sampling rates that are multiples of 44.1 kHz, and according to this answer by Mike, use of another base rate (that is, 48 kHz) is not possible without a substantial rework of the software.


The second option was to use the HDMI output. I dismissed HDMI on the grounds that I only need a stereo output and have no interest in multichannel encoded content. Also, use of HDMI has some additional shortcomings:

  • iOS always uses 48 kHz sampling rate (at least, with the HDMI audio splitter that I have), which enforces resampling at playback time of all the content that Apple Music offers: the majority of albums on Music use 44.1 kHz sampling rate (so far I've only encountered one album in 48 kHz, it was "Waiting for Cousteau" by J.M. Jarre). The "Hi-Res" content uses either 96 kHz or 192 kHz. Note that Apple Music may still display a "Hi-Res Lossless" logo while resampling the "Hi-Res" content down to 48 kHz, which is clearly misleading.

  • No volume control on the digital side. Since iPad assumes it plays on some TV or an AVR which offers its own volume control, it always outputs at digital full scale. This obviously leads to clipping of intersample peaks.

  • iPad bears extra load because it also streams its screen along with the audio signal. The HDMI dongle gets warm pretty fast, too.
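To illustrate the first point in the list above, the following sketch computes the exact conversion ratios involved when everything is forced to 48 kHz. It uses only the sample rates mentioned in the text; the helper name is hypothetical.

```python
from fractions import Fraction


def conversion_ratio(src_hz: int, dst_hz: int) -> Fraction:
    """Exact output/input sample rate ratio, reduced to lowest terms."""
    return Fraction(dst_hz, src_hz)


if __name__ == "__main__":
    target = 48000  # the rate iOS uses on the HDMI output, per the text
    for src in (44100, 48000, 96000, 192000):
        ratio = conversion_ratio(src, target)
        kind = "no conversion" if ratio == 1 else "resampled"
        print(f"{src} Hz -> {target} Hz: ratio {ratio} ({kind})")
```

The 44.1 kHz material needs the awkward 160/147 ratio, while the "Hi-Res" rates are simply reduced by 2 or 4, which in both cases means the stream is no longer the original one.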

Thus, use of HDMI output is not an optimal solution for my scenario. This leaves us with the USB output option via "camera kit."

Dealing with USB Output Reliability Issues

The camera kit has a fat ugly wire which goes into the iPad, and needs two incoming wires: one for the USB device, and one for power. I really wanted to hide the dongle inside the equipment rack and started looking for a Lightning extender. This has revealed an interesting fact: there are no "MFi certified" (that is, approved by Apple) Lightning extenders. The extenders which claim to be "certified" are absent from Apple's database. Apple does not make them either, nor does Belkin (the only accessories manufacturer which I would trust). Nevertheless, I still tried to use an extender wire which at least was shielded (a lot of extenders sold on Amazon are not even shielded, making them suitable only for charging purposes), and it was mostly working, except when it didn't. From time to time Apple Music stopped playing in the middle of a song: the playback was still "going" on the screen, but there was no sound until the next song.

Finding the source of instability turned out to be a challenging task. Besides trying different Lightning extender wires I also tried 3 different USB transports: Douk Audio U2 Pro, Xing AF200, and finally RME FireFace. None of them were working reliably, including FireFace, and this was really suspicious, knowing that RME is usually rock-solid. Luckily, RME provides an iOS app which allows checking the state of the audio card mixer, and there I could see that whenever audio stops playing, it actually just stops coming from the software, despite the fact that the software (for example, Apple Music) was happily showing that it is playing. Also, while configuring FireFace to work as a USB Audio Class device, I have read some insightful information in its manual regarding connection to Apple devices. RME strongly recommends connecting the USB dongle to any Apple device directly. Finally, this is how I ended up connecting my dongle, and this has solved the stability problem for all USB transports I used.

I chose Xing as my USB transport because it has a screen which shows the current sample rate, attenuation, and playback state. Also, it offers the best variety of digital outputs, including an AES3 balanced digital output which I connected to the input of the sample rate converter.

Power Sources for iPad and Xing AF200

I need to mention that the difficulty with finding the source of iPad's playback instability was exacerbated by the need to find the right power supply for it and for the USB transport. I thought I could just plug the iPad into any USB power outlet and be done with it. However, life is not so easy. First, iPad is picky about power sources—there are various proprietary charging protocols used by Apple, and apparently iPad has some expectations. Obviously, Apple's own charger is accepted, however I've found that it creates a voltage offset between the ground of the output signal and the power ground.

The voltage offset results from a combination of unwanted AC and DC voltages. The AC part is usually some kind of harmonics of the 60 Hz from the power outlet or oscillations produced by the conversion circuitry. Having an offset (either DC or AC) is undesirable because if the USB transport is powered from the same charger, this offset is propagated to the "ground" wire of any electrical unbalanced output, and this can easily cause instability in reception on the input side.

I tried the Anker PowerPort 6 charging unit, however its output offset from the power ground is just enormous, around 37 V, and this is clearly problematic. I guess nobody at Anker envisioned the use of this charger in an AV setup.

Finally, I ended up using one of the USB ports on the Mac Mini. The iPad had no problem charging itself from this port, and it has no significant voltage offset. However, its output power is limited, and I have to power both the iPad and the USB transport. Luckily, the Xing USB transport can also accept an external DC power input, thus leaving the USB connection for data transfer only. Unfortunately, it cannot send power to the iPad. If only it could do that, it would obviate the need for an extra USB power wire.

Since all these wires going back and forth and making loops between devices can easily become sources of noise voltages due to differences in ground potential, when choosing a power supply for the Xing AF200 I was looking for something flexible. Thankfully, this is covered as well by an excellent 5V low-noise power supply called "Nirvana," which offers a "ground lift" switch for the DC output, as well as a ground connector. Thus, one can always configure it in a way which eliminates the difference in ground potentials.

Later I found that a powered USB hub by Amazon Basics is also properly engineered to have only a negligible voltage offset on its USB outputs (relative to the power ground), however it's not as flexible as Nirvana SMPS. All in all, the resulting connection scheme looks like this:

Needless to say, after all these adventures I'm not a big fan of consumer digital audio equipment. Although I have found a stable configuration, it still feels a bit fragile. For example, once I tried to use a different USB-A/C cable between the "camera kit" dongle and the Xing, and this immediately broke the stability of playback. Apparently, consumer-grade equipment is not designed for use in complex AV systems.

Mutec MC-6 Sample Rate Converter

A quick note on the company name: please don't confuse "Mutec" with "Mytek," a company with a similar-sounding name which also makes audio equipment.

In a world where each digital device is capable of doing sample rate conversion, and the state of modern software converters is really good, why would one still need to use a hardware sample rate converter? For me, this is just a matter of convenience. The MC-6 unit has 4 inputs of various formats: AES3 (balanced XLR), AES3id (BNC), SPDIF (RCA), and TOSLink (optical), as well as the same outputs, plus BNC ins and outs for word clock. In normal SRC mode, only one of the inputs is active. The ability to choose among multiple inputs places the MC-6 unit into the same role which a pre-amp plays in "classic" hi-fi setups.

I also need to mention that the unit is perfectly engineered—switching between inputs and locking onto the source, as well as losing sync if the source gets disconnected does not cause any audible pops. Unlike consumer devices like iPads this unit is built for a constant 24/7 use and works absolutely reliably. I prefer to use XLR and TOSLink ports because they have a good tolerance for surrounding EMI and differences between ground potentials of units. As we have seen in the section about power sources, use of consumer-oriented power supplies can easily result in huge voltage offsets.

Regarding the clipping of intersample peaks: this is something I always try to prevent because I often listen to recordings which accidentally or deliberately leave no digital headroom. Unfortunately, the MC-6 does not provide any headroom (it's not really possible with digital-to-digital conversion done in the integer domain), thus it clips intersample peaks. Being aware of that, in order to avoid clipping on albums that are mastered without any digital headroom, I use attenuation on the USB transport (Xing). I usually set it to -4 dB, switching to unattenuated output for classical recordings, which are usually mastered the right way.
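A textbook case shows why samples that never exceed full scale can still describe a waveform that does. The sketch below uses a sine at a quarter of the sampling rate, sampled 45 degrees away from its crests; this test signal is an assumption chosen for illustration, not something measured on the MC-6.

```python
import math


def quarter_rate_sine(num_samples: int, phase: float = math.pi / 4) -> list:
    """Unit-amplitude sine at fs/4: the phase advances by pi/2 per sample."""
    return [math.sin(phase + k * math.pi / 2) for k in range(num_samples)]


def sample_peak(samples) -> float:
    """Largest absolute sample value."""
    return max(abs(s) for s in samples)


def intersample_overshoot_db(samples, true_amplitude: float = 1.0) -> float:
    """How far the continuous waveform rises above the largest sample, in dB."""
    return 20.0 * math.log10(true_amplitude / sample_peak(samples))


if __name__ == "__main__":
    tone = quarter_rate_sine(64)
    overshoot = intersample_overshoot_db(tone)
    # If the samples are normalized to 0 dBFS, the waveform peaks at +overshoot dBFS.
    print("sample peak: %.4f" % sample_peak(tone))
    print("intersample overshoot: %.2f dB" % overshoot)
    print("true peak after -4 dB attenuation: %.2f dBFS" % (overshoot - 4.0))
```

For this particular signal the overshoot is about 3 dB, so 4 dB of attenuation keeps the reconstructed peak just under full scale. Real program material can behave differently, so this is an illustration of the mechanism rather than a guarantee.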


Measuring digital paths is trickier than measuring analog ones. In order to look at a digital signal in the analog domain, which can reveal issues with cables, one needs a wide bandwidth oscilloscope, which I don't have. Another "classical" test is the J-Test, which can reveal issues in digital paths by looking purely from the digital side. The idea of the J-Test is that it creates a certain pattern of bits which can provoke unwanted modulations in the digital signal while it's being transmitted. These bit patterns are specific to the sampling rate being used. Thus, when a sampling rate converter is inserted, these bit patterns do not really work as intended when we look at the output from the converter.

I decided to apply a different approach. For sampling rate converters, there is a set of tests proposed by the Infinite Wave team which evaluates how well low-pass filters of the converter suppress aliased frequencies, and also characterizes the properties of the filters. As it can be seen from the results page, the primary application of tests is for software sample rate converters. Hardware implementation can introduce more nonlinearity, and also can have "imperfect" effective sample rates, deviating from the nominal sample rate. Thus, in addition to the tests done by Infinite Wave I also did THD measurements using Acourate.
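For readers unfamiliar with how a THD figure is derived, here is a rough sketch. It is not what Acourate does internally; it assumes a capture that contains a whole number of periods of the test tone, so that each harmonic falls exactly on one DFT bin, and it ignores noise (so it yields THD, not THD+N). The synthetic signal and all names are made up for the example.

```python
import math


def bin_amplitude(samples, cycles: int) -> float:
    """Amplitude of the sinusoid that completes `cycles` periods in the window."""
    n = len(samples)
    re = sum(s * math.cos(2 * math.pi * cycles * k / n) for k, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * cycles * k / n) for k, s in enumerate(samples))
    return 2.0 * math.hypot(re, im) / n


def thd_percent(samples, fundamental_cycles: int, max_harmonic: int = 5) -> float:
    """Harmonics 2..max_harmonic (RMS sum) relative to the fundamental, in percent."""
    fundamental = bin_amplitude(samples, fundamental_cycles)
    harmonics = math.sqrt(sum(
        bin_amplitude(samples, h * fundamental_cycles) ** 2
        for h in range(2, max_harmonic + 1)
    ))
    return 100.0 * harmonics / fundamental


if __name__ == "__main__":
    fs, f0, n = 48000, 1000, 4800  # 4800 samples hold exactly 100 cycles of 1 kHz
    # Synthetic test tone with a third harmonic 60 dB below the fundamental.
    tone = [
        math.sin(2 * math.pi * f0 * k / fs) + 0.001 * math.sin(2 * math.pi * 3 * f0 * k / fs)
        for k in range(n)
    ]
    print("THD: %.4f %%" % thd_percent(tone, fundamental_cycles=100))
```

With real captures the tone rarely lands on an exact bin, so measurement tools apply windowing and averaging; the sketch only shows the principle.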

Since the team at Infinite Wave uses specialized software, I investigated how I can make similar tests using Audacity and REW. I used a regular log sweep for checking aliases using the Spectral analysis in REW. Note that the measurement log sweep has a regular frequency change rate, compared to the more specialized sweep used by Infinite Wave. This explains the difference in curve shapes, while the idea of the test is preserved. And for characterizing the low pass filter I simply passed a Dirac pulse through the chain. Since the chain is digital, we have a transfer system which is very close to the textbook one.
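Both test signals are easy to produce without any audio software. The sketch below generates an exponential ("log") sweep and a single-sample Dirac pulse and writes them as WAV files. The sweep range, duration, 16-bit depth, pulse level and file names are assumptions made for the example; only the -6 dBFS sweep level and the 44.1 kHz rate come from the text.

```python
import math
import struct
import wave


def dbfs_to_linear(level_dbfs: float) -> float:
    """Convert a level in dBFS to a linear amplitude (0 dBFS == 1.0)."""
    return 10.0 ** (level_dbfs / 20.0)


def log_sweep(f_start: float, f_end: float, seconds: float, fs: int, peak: float) -> list:
    """Exponential sine sweep: the frequency rises by a constant ratio per second."""
    n = int(seconds * fs)
    k = math.log(f_end / f_start)
    scale = 2.0 * math.pi * f_start * seconds / k
    return [peak * math.sin(scale * (math.exp(k * i / n) - 1.0)) for i in range(n)]


def dirac(length: int, position: int, peak: float) -> list:
    """Silence with a single non-zero sample."""
    pulse = [0.0] * length
    pulse[position] = peak
    return pulse


def write_wav_16(path: str, samples, fs: int) -> None:
    """Write mono 16-bit PCM; samples are floats in the range -1.0..1.0."""
    frames = b"".join(struct.pack("<h", int(round(s * 32767))) for s in samples)
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)
        out.setframerate(fs)
        out.writeframes(frames)


if __name__ == "__main__":
    fs = 44100
    level = dbfs_to_linear(-6.0)
    write_wav_16("sweep_44k1.wav", log_sweep(20.0, fs / 2, 10.0, fs, level), fs)
    write_wav_16("dirac_44k1.wav", dirac(fs, fs // 2, level), fs)
```

Running the sweep all the way up to half the sampling rate matters for this kind of test, because the converter's filter is exercised only by content close to that limit.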

My primary interest was to check what happens to a digital signal played at 44.1 kHz from the iPad while it is passing through my digital equipment chain to the MOTU interface, which runs DSP at a 96 kHz sampling rate. I also tried sending test signals via AirPlay to the Raspberry Pi equipped with a DigiOne SPDIF "hat" and running shairport-sync.

Below is the diagram of both chains. It demonstrates similarities between them and shows differences. The difference in the resulting sampling rates: 96 kHz vs. 88.2 kHz can be ignored. Note that I used a wired Ethernet connection between iPad and Raspberry Pi in order to avoid possible packet losses over WiFi. It was the same iPad in both cases, and I used VLC to play test signals.

Mutec MC-6

Let's start with the impulse response of MC-6 which was obtained by passing a Dirac pulse through it:

We can see that MC-6 uses a linear phase low-pass filter. Let's look at its passband and transition bands (this is the same frequency response graph, just framed differently):

We can see that the response is flat up to 20 kHz, then it abruptly goes down, essentially trimming out everything past 24 kHz. I suppose the ripples on the passband graph are artefacts of REW's own processing. Note two interesting points:

  • The low pass filter does not sufficiently attenuate frequencies in the region from 22 kHz to about 23.5 kHz. I think this is an engineering trade-off to limit pre-ringing.

  • The phase does not stay flat and lags, which I find surprising for a linear phase filter. I will dig deeper into the reason behind this. It can be caused by REW processing, or it can be an actual lag due to asynchronous sample rate conversion.
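As background for the second point: a linear phase FIR filter has a symmetric impulse response, which is equivalent to a constant delay at all frequencies. Once that delay is removed, the phase is flat; any delay left uncompensated, even a fraction of a sample, shows up as a phase lag growing with frequency. The sketch below demonstrates both facts on a generic windowed-sinc low-pass filter. It is not the MC-6 filter, and the tap count and cutoff are arbitrary assumptions.

```python
import math


def windowed_sinc_lowpass(num_taps: int, cutoff_hz: float, fs: float) -> list:
    """Hann-windowed sinc low-pass filter; use an odd num_taps for a centre tap."""
    mid = (num_taps - 1) / 2
    fc = cutoff_hz / fs  # cutoff as a fraction of the sampling rate
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def is_symmetric(taps, tolerance: float = 1e-12) -> bool:
    """True if the impulse response mirrors around its centre (linear phase)."""
    return all(abs(a - b) <= tolerance for a, b in zip(taps, reversed(taps)))


def phase_lag_degrees(freq_hz: float, delay_samples: float, fs: float) -> float:
    """Phase shift produced by a pure delay of `delay_samples` at `freq_hz`."""
    return -360.0 * freq_hz * delay_samples / fs


if __name__ == "__main__":
    taps = windowed_sinc_lowpass(num_taps=255, cutoff_hz=22000.0, fs=96000.0)
    print("symmetric impulse response:", is_symmetric(taps))
    print("constant delay: %.1f samples" % ((len(taps) - 1) / 2))
    # A residual delay of only 0.1 sample already tilts the phase at 20 kHz:
    print("phase at 20 kHz: %.1f degrees" % phase_lag_degrees(20000.0, 0.1, 96000.0))
```

So a sloping phase plot does not by itself contradict a linear phase design; it may only mean that the measurement did not remove the whole delay.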

After figuring out the time- and frequency-domain properties of the filter we can now explain the spectrogram of a log sweep:

We can see that near the end there is some aliasing. I can explain that by the fact that aliases that have appeared after converting from 44.1 kHz were not sufficiently attenuated by the low pass filter. I think the approach to filtering used by MC-6 is similar to "allow aliasing" option of the sample rate converter of the SoX toolkit.
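The frequencies involved can be checked with simple arithmetic. A tone at frequency f in a 44.1 kHz stream has its first spectral image at 44100 - f, and the converter's low-pass filter is supposed to remove that image. The sketch below, with hypothetical helper names, finds which source tones have images inside the weakly attenuated 22 to 23.5 kHz region mentioned earlier.

```python
def first_image_hz(tone_hz: float, source_fs: float) -> float:
    """Frequency of the first spectral image of a tone sampled at source_fs."""
    return source_fs - tone_hz


def tones_imaging_into(band_lo_hz: float, band_hi_hz: float, source_fs: float):
    """Range of source tones whose first image lands inside the given band."""
    lowest = max(0.0, source_fs - band_hi_hz)
    highest = min(source_fs / 2, source_fs - band_lo_hz)  # capped at source Nyquist
    return lowest, highest


if __name__ == "__main__":
    fs_in = 44100.0
    print("image of a 21.5 kHz tone:", first_image_hz(21500.0, fs_in), "Hz")
    print("tones with images in 22-23.5 kHz:", tones_imaging_into(22000.0, 23500.0, fs_in))
```

Only tones above roughly 20.6 kHz are affected, which agrees with the artifacts showing up at the very end of the sweep.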

Finally, let's look at the distortion for a 1 kHz sine wave. Below are graphs for the sine at -0.1 dBFS and -60 dBFS:

We can see that the -60 dBFS is very clean, whereas the "almost full scale" wave exhibits some non-linearity. As we can see, even digital paths are not completely free from level-dependent non-linearities. However, measured distortion and noise levels are far below audibility thresholds:

And just to confirm to myself that my choice of the protocols and equipment used for the digital audio path was correct, I made a couple of measurements of the AirPlay path via shairport-sync running on the Raspberry Pi (recall the diagram above).

AirPlay via shairport-sync

After I looked at the waveform of the recorded log sweep as it was produced by shairport-sync, I already had some doubts:

Note spurious "hairs" above the 0.5 mark—those were not present in the original signal, as the entire sweep is at -6 dBFS (that's 0.5 on this scale). We can see that the frequency response graph is also rather jaggy:

Unsurprisingly, the spectrogram of the sweep shows a lot of artifacts as well:

Note that shairport-sync was built with support for resampling via SoX, and I have enabled it in the config file. Just to make sure that the rest of the chain (Raspberry Pi, DigiOne, and RME FireFace) works correctly, I pushed the same logsweep file onto Raspberry Pi, and played it using SoX, also with resampling—and there were no artifacts on the recorded sine wave and on the spectrogram. That means, the artifacts we see with shairport-sync are caused either by the AirPlay protocol itself (maybe, it's not actually lossless after all?), or resynchronization done by shairport-sync. In any case, this has confirmed that my choice of using digital output over USB was the right one.


Building an audio processing and playback chain is not a trivial task even when using only off-the-shelf components. It's not only the performance of each individual component that matters, but also the way they are connected together. Even in a chain where audio is transmitted predominantly via digital paths, there can still be non-linear effects and losses of the signal. In my opinion, this illustrates very well the idea which Rod Elliott has expressed in his article, that "digital" is just an abstraction (a very powerful one, but still an abstraction), and that "analog" aspects like voltages and currents must nevertheless always be taken into account.

Wednesday, June 29, 2022

Modular Audio Processing System, Part II

I finished my last post about my audio system with a promise to tell about its digital part and the power supply. However, since that time I've already managed to make some changes to the analog part. The changes were caused by my intent to swap the KRK monitors for something that has less distortion. This has turned out to be a project of its own: a conversion of a pair of LXmini speakers into a desktop version, and re-tuning them to have lower distortion. The conversion is a topic for some future post, while the topic of this post is integrating the 4 channel QSC amplifier into the system and a comparison of the volume control circuits used by the amplifier and the RDL RU-VCA6A unit.

I use the QSC SPA4-100 for driving the speakers in LXmini. You might recall that LXmini uses an "active crossover" approach and some DSP in order to turn a 4" midrange driver into a full range (albeit bass limited) dipole. One interesting aspect of the SPA line of amplifiers by QSC is that they have some "pro" features like the ability to drive 70V/100V long-range lines, and to be remotely controlled. The remote control includes volume adjustment via a 10k variable resistor (pot). This appeared to be interesting to me because with this volume control I could bypass the need to use the RDL RU-VCA6A and connect the amplifier to the output of the sound card directly. I still needed to retain the RDL control for the subwoofer though, thus the question was: is it possible to use the same dual ganged volume control pot for both the RDL and QSC units, and will this work as expected?

The 10k Pot "Interface"

There is an unpublished standard for professional amplifiers and home installations to use a 10 kOhm linear taper potentiometer to control voltage. The control line uses 3 wires: ground, full scale DC voltage, and attenuated voltage. This naturally corresponds to the 3 posts of a variable resistor. The DC voltage used by control circuits can be as high as 10 VDC, which is in fact what the SPA4-100 uses, although the current is only 12 mA, whereas the RU-VCA6A uses 5 VDC at 50 mA as a control signal.

You can often find potentiometer assemblies ready for in-wall or in-rack installation for ridiculous prices around $50. There is no real need to buy them, because there is absolutely nothing special in these potentiometers. They don't have to be linear—I use a logarithmic pot and even find it more convenient because I can quickly lower the volume and have a finer control at the "louder" end.

The input is thus a "standard." Yet, since it's not a real standard, manufacturers of amplifier and VCA units are free to choose how control voltage values map to the gain or attenuation applied. Recall that VCA units typically have no gain of their own: at most they provide unity gain (0 dB, or a 1.0 multiplier), so they can only attenuate the incoming line level signal. In amplifiers with gain control, on the other hand, there is a stage with fixed gain, for example 20 dB, and the gain control is implemented as a VCA feeding that stage. The maximum gain on amplifiers, which corresponds to unity gain on VCA units, maps to the zero resistance position on the controlling pot, that is, the maximum value of the controlling voltage. This is where the agreement between manufacturers practically ends. What happens as you start turning down the controlling pot, thus increasing its resistance and decreasing the controlling voltage, depends purely on the intentions of the unit manufacturer. As I've learned, QSC and Radio Design Labs (RDL) see it in very different ways.
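The relation between the pot position and the control voltage described above can be sketched as an ideal voltage divider. This assumes that the control input draws negligible current from the wiper and that "pot resistance" means the resistance between the full scale post and the wiper; both are simplifications for illustration, and the function name is made up.

```python
def control_voltage(pot_resistance_ohm: float, full_scale_v: float,
                    pot_total_ohm: float = 10_000.0) -> float:
    """Wiper voltage of an unloaded divider: full scale at 0 Ohm, 0 V at the far end."""
    if not 0.0 <= pot_resistance_ohm <= pot_total_ohm:
        raise ValueError("resistance must be within the range of the pot")
    return full_scale_v * (pot_total_ohm - pot_resistance_ohm) / pot_total_ohm


if __name__ == "__main__":
    # Full scale voltages as stated in the text: 10 VDC (SPA4-100), 5 VDC (RU-VCA6A).
    for name, v_full in (("SPA4-100", 10.0), ("RU-VCA6A", 5.0)):
        for r in (0.0, 2500.0, 8000.0, 10000.0):
            print(f"{name}: {r:7.0f} Ohm -> {control_voltage(r, v_full):.2f} V")
```

Both units see the same fraction of their own full scale voltage at any pot position, so any difference in behavior comes from how each unit maps that fraction to gain.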

Volume Curves

In order to obtain a complete picture of SPA4-100 and RU-VCA6A behavior, I have traced gain changes corresponding to controlling voltage changes. The resulting function is known as a "volume curve." I used two multimeters by Agilent, both connected to a PC via a data logger. One multimeter logs the current resistance of the pot, the other logs the voltage of the audio output from the unit:

The Agilent multimeter can display dBV levels by itself. Thus, by feeding the device under test with a sine wave at the reference level of 0 dBV, the reading of the multimeter on the output provides the device's gain directly. Then we can use the log timestamps to establish the correspondence between the controlling voltage and the resulting gain. Note that because both the amplifier and the VCA unit accept balanced input, feeding them with unbalanced input (and that is what I did) halves the measured gain, that is, lowers it by 6 dB. This is not relevant to the experiment because our intent is to measure both devices under the same conditions and compare their volume curves.
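Matching the two logs by timestamp can be done with a few lines of code. The sketch below pairs every resistance reading with the output level reading closest to it in time. The log format (lists of timestamp and value pairs), the tolerance and the sample numbers are assumptions for the example, not the actual data logger output.

```python
import bisect
import math


def gain_db(v_out: float, v_in: float = 1.0) -> float:
    """Voltage gain in dB; with a 1 V (0 dBV) input this equals the output in dBV."""
    return 20.0 * math.log10(v_out / v_in)


def pair_by_time(resistance_log, level_log, max_skew_s: float = 0.5) -> list:
    """Pair each (time, resistance) entry with the nearest (time, level) entry.

    Both logs must be sorted by time. Entries without a partner within
    max_skew_s seconds are skipped.
    """
    times = [t for t, _ in level_log]
    pairs = []
    for t, resistance in resistance_log:
        i = bisect.bisect_left(times, t)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        if not candidates:
            continue
        nearest = min(candidates, key=lambda j: abs(times[j] - t))
        if abs(times[nearest] - t) <= max_skew_s:
            pairs.append((resistance, level_log[nearest][1]))
    return pairs


if __name__ == "__main__":
    # Made-up readings: (seconds, Ohm) and (seconds, dBV).
    resistance_log = [(0.0, 0.0), (1.0, 1000.0), (2.0, 2000.0)]
    level_log = [(0.1, 24.0), (1.1, 24.0), (1.9, 23.5), (9.0, -60.0)]
    for resistance, level in pair_by_time(resistance_log, level_log):
        print(f"{resistance:7.0f} Ohm -> {level:+.1f} dB")
    print("halving the voltage changes the gain by %.2f dB" % gain_db(0.5))
```

The last line confirms the 6 dB figure: half the voltage corresponds to about -6.02 dB.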

Using the approach above, I established that the amplifier goes from +24 dB down to -60 dB of gain, while the VCA unit goes from -6 dB down to -90 dB. Thus, the range of applied gains is actually similar, and is around 84 dB. However, the volume curves are totally different. In order to compare them, I've "normalized" the gain of the VCA unit, as if it were connected to a 30 dB fixed gain amplifier. Let's look at the shapes of the curves:

We can see that the volume of RU-VCA6A has a uniform slope (various bumps on it are likely due to imperfections in my measurement setup), whereas the volume curve of SPA4-100 has a linear region in the beginning up to about 2.5 kOhm resistance of the controlling pot, then it starts sloping, albeit slower than the curve of the VCA unit, and then at about 8 kOhm the amplifier basically drops the gain to the floor value of -60 dB.
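The normalization used for this comparison is plain arithmetic on the figures measured earlier, as the short sketch below shows. The constant and function names are mine; the numbers are the ones quoted in the text.

```python
AMP_RANGE_DB = (24.0, -60.0)   # SPA4-100: measured maximum and minimum gain
VCA_RANGE_DB = (-6.0, -90.0)   # RU-VCA6A: measured maximum and minimum gain
FIXED_GAIN_DB = 30.0           # imaginary fixed gain amplifier after the VCA


def span_db(gain_range) -> float:
    """Distance between the maximum and minimum gain of a unit."""
    top, bottom = gain_range
    return top - bottom


def normalize(gain_range, fixed_gain_db: float = FIXED_GAIN_DB):
    """Shift a gain range as if a fixed gain stage followed the unit."""
    return tuple(g + fixed_gain_db for g in gain_range)


if __name__ == "__main__":
    print("amplifier span:", span_db(AMP_RANGE_DB), "dB")
    print("VCA span:", span_db(VCA_RANGE_DB), "dB")
    print("normalized VCA range:", normalize(VCA_RANGE_DB))
```

After the shift both units cover the same range from +24 dB to -60 dB, so whatever differs between the curves is their shape, not their extent.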

Thus, it's not possible to achieve synchronous gain when using the same controlling pot for both devices. The fact that the VCA unit which controls the sub, lowers the gain at a faster pace than the amplifier for the main speakers leaves no hope to use nonlinearity of hearing—equal loudness curves—because they work exactly in the opposite way. And then I found yet another problem...

Noise and Distortion of the QSC VCA

As I started experimenting with the volume control on the QSC amplifier, I noticed that although the setup has practically no audible noise at the highest volume, the noise becomes audible as soon as I start turning the volume down. Normally, a power amplifier amplifies any noise fed to its input, and the higher the gain, the louder that noise will sound. The absence of noise at the highest volume proves that the noise from the DAC output is negligibly low. Why does the noise start appearing as we lower the gain?

I took some measurements of noise alone and of THD+N to demonstrate this phenomenon:

The blue FFT is taken at 2.5 kOhm, and the green FFT is at 3 kOhm of the controlling resistor value. The increase in noise and distortion is substantial and can be heard easily.

Now, looking at the volume curve of the amplifier, I've got an idea of where the noise comes from: it's from the amplifier's VCA unit, which apparently has lower quality than the constant gain part. It seems that the VCA unit is bypassed when there is no attenuation. This explains the presence of the horizontal part on the volume curve. Once the volume control passes a certain point, the VCA is engaged and the noise from it kicks in.

I guess, this was some sort of an engineering tradeoff. Understanding that not everyone really needs a volume control on a power amplifier, engineers at QSC provided a lower quality solution to save some money, however they were smart enough to get this lower quality VCA part out of the way on the critical path.

The Final Arrangement

Now for completeness, this is how my modular system looks after installing the QSC amplifier:

And this is the updated diagram of component connections:

Next time, I will finish this trilogy with a story about the digital part of the chain.

Saturday, March 26, 2022

Modular Audio Processing System, Part I

Recently I have completed assembling a system which I use for my casual audio playback needs and for audio experiments. I decided to reflect on the process of choosing and tuning the components of the system, also mentioning other options I was considering.

The goal of the system is to accept audio from various sources (files, mobile devices, web browsers), then process it to apply the necessary room/speaker correction and binaural rendering, and then play it on speakers and on headphones. There are of course numerous ready made commercial solutions for this, packing everything into one unit, however my intention was to have a truly modular system where each component can be replaced, and new functionality can be added if needed. Essentially, the system is built around a Mac computer with a pro soundcard. The challenging part was to figure out what additional equipment I need, and how to organize it physically, so that it does not just lie in a pile on the table, entangled in a web of cables.

For a long time already I have stuck to the "half rack" (9.5 inches) equipment form factor. I have a couple of racks and some audio equipment which either was designed for this format, or can be easily adapted to it. Here is what my current rack looks like:

Below is the schematic of connections between the blocks:

As I've mentioned before, the heart of the system is a Mac—an old model Mini from 2014, which I'm also using to type this post in. I've highlighted the inputs and the outputs of the system. Some of the input and output ports are mounted on the back panel:

All other interconnections are hidden inside the rack, and this makes the result look nice, if not "professional." Now let's consider the system component by component.

Mac Mini + MOTU UltraLite AVB

These two components essentially make a single unit equivalent to a dedicated DSP system, but in my opinion, more flexible. I wrote about the capabilities of UltraLite before: here and here. The features that I use for my needs are:

  • The routing matrix which is convenient for collecting various inputs: from applications, from external hardware, and via Ethernet. Note that I only use digital inputs on UltraLite in order to avoid adding noise.
  • AVB I/O is worth mentioning on its own as, I think, it's much more flexible than traditional point-to-point digital audio interfaces: SPDIF and USB.
  • DSP with some basic EQ functionality. As I wrote in the earlier post on the speaker system setup, it's enough for basic corrections, but it is incapable of "serious" linear phase processing, which is done on the Mac.

As for the functionality where UltraLite falls short, Reaper DAW running on Mac Mini helps to fill in the gaps. Here I can run linear phase FIR filters, linear phase equalizers, crossfeed and reverberation for headphones, and create arbitrary audio delays for synchronizing audio outputs. Note that although Mac Mini isn't a fanless computer, it stays perfectly quiet while running all this processing. Only sometimes does it briefly turn on the fan (this sounds like a loud exhale) and then stays silent for a while.

Having AVB input is nice as it allows using other computers for providing audio. Although Mac Mini runs Reaper alone with no glitches, launching a web browser inevitably introduces them: modern browsers are very heavy CPU-wise and perform a lot of disk I/O. That's why whenever I have to use a browser-based streaming client, I prefer to run it on a separate computer or a mobile device. The beauty of AVB is that, on Mac at least, one does not need any extra hardware audio interfaces. The only thing that is needed is a Thunderbolt to Ethernet dongle.

I also use UltraLite as a DAC: it has 8 line level outputs. Although interfaces by MOTU are considerably cheaper than functionally equivalent interfaces from RME, the quality of their analog outputs is on par, if not better. For example, below is a comparison of THD+N measurements for a 0 dBFS 1 kHz tone at +13 dBu from UltraLite's line out vs. the line out of RME FireFace UCX (1st gen). Both interfaces were running at a 48 kHz sampling rate and were measured by an E-MU 0404 (this eliminates any possible bias that arises when a card measures itself via an analog loopback):

A note on the output level: RME has a selectable output level for its line outputs with the following modes: -10 dBV, +4 dBu, and Hi Gain (see the tech specs). From measurements, I've found the +4 dBu mode to be the "sweet spot": the signal level is high enough, yet distortions are lower than on Hi Gain. On MOTU UltraLite, the same output level is achieved by setting the output trim to -6 dB (that is, the level of full volume output on UltraLite is equivalent to Hi Gain mode on FireFace).
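
For readers who want to relate these level figures to actual voltages, here is a minimal Python sketch of the conversions. It assumes only the standard reference levels (0 dBu = 0.7746 V RMS, 0 dBV = 1 V RMS). The function names are made up for illustration and do not come from any tool mentioned in this post.

```python
import math

# Standard reference levels (RMS volts).
DBU_REF_VOLTS = math.sqrt(0.6)  # 0 dBu: the voltage giving 1 mW into 600 Ohm, ~0.7746 V
DBV_REF_VOLTS = 1.0             # 0 dBV: 1 V


def dbu_to_volts(dbu: float) -> float:
    """Convert a level in dBu to RMS volts."""
    return DBU_REF_VOLTS * 10 ** (dbu / 20)


def dbv_to_volts(dbv: float) -> float:
    """Convert a level in dBV to RMS volts."""
    return DBV_REF_VOLTS * 10 ** (dbv / 20)


def volts_to_dbu(volts: float) -> float:
    """Convert RMS volts to a level in dBu."""
    return 20 * math.log10(volts / DBU_REF_VOLTS)


def gain_db_to_voltage_ratio(gain_db: float) -> float:
    """Convert a gain or trim value in dB to a voltage ratio."""
    return 10 ** (gain_db / 20)


if __name__ == "__main__":
    for label, volts in [
        ("+13 dBu", dbu_to_volts(13)),
        ("+4 dBu", dbu_to_volts(4)),
        ("-10 dBV", dbv_to_volts(-10)),
    ]:
        print(f"{label}: {volts:.3f} V RMS ({volts_to_dbu(volts):+.2f} dBu)")
    print(f"-6 dB trim: x{gain_db_to_voltage_ratio(-6):.3f} voltage")
```

The last line illustrates why a -6 dB trim is a convenient step: it corresponds to roughly half the output voltage.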

As we can see from the direct comparison, the output from MOTU is cleaner. This could also be seen from comparing the tech specs, but I wanted to double-check it. MOTU also specifies the THD+N of its outputs as low as -112 dB, which I could not confirm. Note that the 2nd generation of RME FireFace UCX declares better specs, but it's still much more expensive than MOTU. The huge benefit of the RME interface is its rock solid stability. With MOTU I occasionally run into an issue where a high-pitched, signal-dependent noise emerges, and fixing it requires rebooting the interface.

Lossless Wireless Output via Audreio

All the connections in my system use wires. However, sometimes it's nice to be able to drop the headphone wire, especially when I'm listening to music while doing things away from the computer. Of course, I did try a Bluetooth option, using a transmitter based on a Qualcomm chipset which supports the aptX HD codec (still lossy, but sounding good nevertheless). However, it seems that BT transmitters, despite what their marketing materials say about the range of the connection, are mostly designed for the case when the listener sits in front of a TV. Once there is no direct line of sight between the TX and the RX, or when I move a bit further away, the connection switches to a lower bitrate, and finally degrades to the SBC codec, which sounds noticeably lossy.

Because of this limitation of Bluetooth I decided to use WiFi. Its stations have more powerful transmitters and can turn up the power, even beaming the radio waves in a particular direction, doing everything they can to keep a high bitrate connection with the receiver. I use my phone as an endpoint and run the Audreio plugin inside Reaper on the Mac to transmit audio to their app running on the phone. Since I'm not interested in low latency, I set the largest buffer size, and this works well, with almost no dropouts as I'm moving around my house. Unfortunately, further development of Audreio has been canceled. However, the last released app and plugin versions are stable and I haven't run into any issues while using them.

Naturally, with lossless audio delivered to the phone, it would be unwise to use wireless headphones. Instead, I usually use ER4SR by Etymotic, powered by a simple headphone DAC dongle by UGreen. We will check its performance later.

Drop + THX AAA 789 Headphone Amp

I picked this amp because it has balanced line inputs and a variable gain setting. I prefer balanced line inputs because there can be strong electromagnetic fields inside the rack due to the presence of power supplies. The Drop amplifier is indeed very linear, thanks to the THX circuit design.

Other options that I tried:

  • Built-in output on MOTU UltraLite. This one I only use for "debugging" or quick A/B comparisons. MOTU's output has relatively high output impedance, is not as linear as Drop, and its volume control is not very comfortable.
  • Phonitor Mini. I used these for a long time and they were my reference for the crossfeed implementation. They also have balanced inputs and a dedicated mute switch, so you can leave the volume setting intact when you need a brief pause. However, both units that I had suddenly broke at some point.
  • AMB M3 (my build). Due to its high power, this amplifier is also very linear under normal listening conditions. However, it has two shortcomings: lack of balanced inputs, and a high level of cross-talk between channels (not to be confused with crossfeed) due to its "active ground" design. More about this in my old post.

I don't consider a headphone amplifier a tool for making listening more "enjoyable." In my setup all psychoacoustic improvements are achieved by DSP processing. Thus, what is left to the headphone amp is to be as "transparent" as possible, and that means:

  • linearity and low noise,
  • close to zero output impedance, and
  • consistent left-right balance across the volume range.

It's interesting to compare output from Drop's amplifier with the mobile DAC by UGreen into the same resistive 33 Ohm load. Drop is driven by the line out of MOTU. UGreen dongle is connected to an iPhone running Audreio app. The signal source is the same—REW. Volumes are set to provide the same output voltage of about 2 dBu: for UGreen this is maximum volume, Drop still has some room for increasing the volume, and the gain setting is I (minimal gain):

Unfortunately, there was some mains noise during the measurement, which degraded the calculated THD+N value. However, the THD figure is the same as we have seen on MOTU's line out directly, which is very good.

It's clear that the dongle is struggling at this output level. Better THD (fewer harmonics) is seen when the output volume is reduced to about 3/4, however this lowers the signal-to-noise ratio. Still, I think for a $15 dongle the results are good. I was also impressed that the output impedance of the dongle is only 0.3 Ohms. I'm glad that the manufacturer follows good engineering practices.
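
The post does not describe how the 0.3 Ohm figure was obtained. One common way to estimate output impedance, sketched below as an assumption rather than as the method used here, is to treat the output as a voltage divider: measure the output voltage with no load, then with a known resistive load, and solve for the source resistance. The function names and the voltage readings in the example are hypothetical.

```python
import math


def output_impedance(v_open: float, v_loaded: float, r_load: float) -> float:
    """Estimate source output impedance from two RMS voltage readings.

    v_open   -- output voltage with no load connected
    v_loaded -- output voltage across the load resistor
    r_load   -- load resistance in Ohms

    The loaded voltage follows the divider v_loaded = v_open * r_load / (r_load + z_out).
    """
    if r_load <= 0 or v_loaded <= 0:
        raise ValueError("load resistance and loaded voltage must be positive")
    if v_loaded > v_open:
        raise ValueError("loaded voltage cannot exceed the open-circuit voltage")
    return r_load * (v_open / v_loaded - 1.0)


def level_drop_db(z_out: float, r_load: float) -> float:
    """Level drop in dB caused by loading a source of impedance z_out with r_load."""
    return 20 * math.log10(r_load / (r_load + z_out))


if __name__ == "__main__":
    # Hypothetical readings, chosen only to illustrate the arithmetic.
    v_open, v_loaded, r_load = 1.000, 0.991, 33.0
    z_out = output_impedance(v_open, v_loaded, r_load)
    print(f"estimated output impedance: {z_out:.2f} Ohm")
    print(f"level drop into {r_load:.0f} Ohm: {level_drop_db(z_out, r_load):.3f} dB")
```

With a source impedance this low, the voltage difference between the two readings is under 1%, so the estimate is only as good as the resolution and repeatability of the voltmeter or measurement interface.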

RDL RU-RCA6A Multichannel Volume Control

With the excellent hybrid analog-digital volume controls that modern audio interfaces offer, one could wonder whether there is still a need for an external analog volume control. Well, sometimes there is. In my topology, only some of the outputs from UltraLite go to speakers, and there are more than two outputs that need to be controlled simultaneously. Surprisingly, doing this in a convenient manner is still a challenge! Usually volume controls in sound cards only work for stereo pairs of channels or for each individual channel. Even in MOTU's excellent web interface there is no possibility to "bind" several channels into a group for volume control purposes.

That's why I decided to use an external volume control unit. Another valid use case for it is when there is a need to build a multichannel output out of multiple stereo DAC units. "Multichannel" does not necessarily mean surround sound; it might be a stereo setup with line level "active" crossovers for the purpose of bi- or tri-amping.

Side note: these days there are plenty of "multi-room" playback systems offering time and volume level synchronization while playing over wireless or wired computer networks. It's an interesting option to explore, but keep in mind that most of those consumer network protocols, like AirPlay, are limited to CD audio quality (44.1 kHz / 16 bits). I will talk about AirPlay in the next part of this post.

My specific requirement for the unit was the form factor. It's easy to find a multichannel AV "prepro", like the Marantz unit I wrote about some time ago, however they are all made in the standard 19" "full rack width" format. Luckily, "pro" equipment in the half-rack format does exist, but unfortunately it's typically quite expensive. The two "mid-price" pieces of equipment that I could find are RCA6A by Radio Design Labs (RDL) and Volume8 by Sound Performance Lab (SPL, the same company that makes Phonitors). I plan to do an in-depth comparison of these units some time later. For now, I can tell that Volume8 is all about the quality of the audio, while RCA6A was built with a focus on providing remote control options. The option which I chose to use with RCA6A is a simple 10 kOhm pot which I put in an enclosure:

The matte finish of the big black aluminum knob I found on Amazon pairs nicely with the plastic of the box.

What about the quality of RCA6A? Yes, for the purist in me, it could be better. Below is the output from the RCA6A at unity gain, fed by UltraLite's line output at the same output volume setting I was using to make the THD+N measurement:

As we can see, RCA6A adds some non-negligible odd harmonics. Note that the level of the 2nd harmonic which originates from UltraLite remains the same, while the 3rd harmonic gets almost 13 dB higher. The level of noise goes up by 3 dB. However, because I still use inexpensive KRK Rokit monitors, I doubt that RCA6A is a "bottleneck" in terms of audio quality.
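
To get a feel for how much a single harmonic rising by 13 dB can affect the overall THD figure, here is a minimal Python sketch that sums harmonic levels on a power basis. The harmonic levels in the example are hypothetical placeholders, not values read from the measurements above, and the function name is made up for illustration.

```python
import math
from typing import Iterable


def thd_db(harmonic_levels_dbc: Iterable[float]) -> float:
    """Combine harmonic levels (dB relative to the fundamental) into a THD figure in dB.

    Harmonics are uncorrelated in level terms here, so their powers are summed.
    """
    total_power = sum(10 ** (level / 10) for level in harmonic_levels_dbc)
    if total_power <= 0:
        raise ValueError("at least one harmonic level is required")
    return 10 * math.log10(total_power)


if __name__ == "__main__":
    # Hypothetical levels: 2nd harmonic unchanged, 3rd harmonic raised by 13 dB.
    before = [-110.0, -115.0]        # [2nd, 3rd] in dBc
    after = [-110.0, -115.0 + 13.0]  # [2nd, 3rd] in dBc

    print(f"THD before: {thd_db(before):.2f} dB")
    print(f"THD after:  {thd_db(after):.2f} dB")
    print(f"13 dB as an amplitude ratio: x{10 ** (13 / 20):.2f}")
    print(f"3 dB as a power ratio: x{10 ** (3 / 10):.2f}")
```

Because the sum is dominated by the strongest harmonic, a 13 dB rise in a harmonic that was previously below the others is enough to make it the main contributor to THD.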

To be continued

As shown in the photo of the rack and in the schematic, there are also a couple of digital audio units and the power supply. I will discuss them in the next post.