Sunday, October 6, 2019

Case Study of LXmini in Our New Living Room

This summer we moved into a new rented house, and I finally got some time to set up the LXminis in this new environment. I learned a lot while doing this and hope that sharing the experience will be useful to other people.

Initially I was planning to recreate my old 4.1 surround setup with two pairs of LXminis as front and surround speakers plus a KRK 10s subwoofer (used only for the LFE channel). However, after watching a couple of movies on a temporary stereo setup of LXminis, I decided that the stereo image they create is immersive enough, and I don't want to complicate the setup with another pair.

The challenges I faced while getting the stereo setup right were different from the ones in our old apartment. First, we bought a tall, wide console for the computer and the Xbox, and I learned that the console creates strong reflections if the speakers are placed too close to it. On the other hand, if I move the speakers farther from the console, they get too close either to the couch or to the side wall. Second, this time I decided to use the subwoofer as a low frequency extension for the LXminis, but I didn't want to compromise their excellent output.

Minimizing Reflections

This is a schematic drawing of the room. Note that the ceiling is quite high and sloped. This reduces vertical room modes significantly. The bad news is that the listening space is asymmetric and narrow. Below are views from the top and from the side, all lengths are in meters:

Blue circles represent the positions of the speakers in my temporary setup. The orange circles are the final setup. I spent some time looking for the best placement and used a number of "spatially challenging" test tracks:
  • tom-tom drum naturally panned around (track 28 "Natural stereo imaging" from "Chesky Records Jazz Sampler & Audiophile Test Compact Disc, Vol. 3");
  • LEDR test—HRTF-processed rattle sound (track 11 "LEDR" from "Chesky Records Jazz Sampler & Audiophile Test Compact Disc, Vol. 1");
  • phantom center test files from Linkwitz Lab page.
When the speakers were placed too close to the console, LEDR sounded smeared, and so did the phantom center tests. ETC curves also showed some strong early (< 6 ms) reflections:

I moved the speakers farther from the console and placed them wider, so they didn't get too close to the couch. However, the right speaker was now too close to the right wall. Fortunately, reflections from the wall can be defeated by rotating the speaker appropriately. The hint I read in S. Linkwitz's notes was to put a mirror on the wall and ensure that from the listening position I see the speaker from the side. Since the LXmini is a dipole speaker, there is a null at its side, so the most harmful reflection from the nearby wall is minimized. We can see that on the ETC graphs from the new position (the graphs from the initial position are blended in for comparison):

For the left speaker, instead of the two reflections above -20 dB within the first 6 ms, there is now only one of slightly lower power. For the right speaker, the overall level of reflections arriving during the first 6 ms is significantly reduced, and its ETC graph now resembles the ETC of the left speaker.
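Finding early reflections on an ETC is easy to script. Below is a minimal sketch with a synthetic impulse response (a real IR would come from a measurement tool such as REW); the ETC is computed as the Hilbert envelope of the IR, in dB relative to the direct sound:

```python
import numpy as np
from scipy.signal import hilbert

def etc_db(ir):
    """Energy-Time Curve: Hilbert envelope of the IR, in dB re. its peak."""
    env = np.abs(hilbert(ir))
    return 20 * np.log10(env / env.max() + 1e-12)

# Synthetic IR: direct sound plus a single reflection 3 ms later at -15 dB.
fs = 48000
ir = np.zeros(fs // 10)
ir[100] = 1.0
ir[100 + int(0.003 * fs)] = 10 ** (-15 / 20)

etc = etc_db(ir)
peak = np.argmax(etc)
skip = int(0.001 * fs)             # skip the direct sound's own envelope skirt
win = etc[peak + skip : peak + int(0.006 * fs)]
i = np.argmax(win)
print("strongest reflection in the first 6 ms: %.1f ms, %.1f dB"
      % ((skip + i) / fs * 1000, win[i]))
```

On a real IR the same windowed search over the first 6 ms flags the reflections discussed above.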

Playing the test tracks also confirmed the improvement—now I can clearly hear the rattle sound in LEDR moving in the vertical and front-back directions. Also, by avoiding strong reflections for the right speaker, I've made it essentially equivalent to the more "spacious" left speaker placement, so the asymmetry of the listening space doesn't matter anymore. However, the resulting aggressive toe-in of the right speaker has narrowed the listening "sweet spot". Apparently, it's not easy to achieve a perfect setup under real life conditions.

Equalizing Speakers

From my previous measurements I knew that the quality of the speaker drivers used in the LXminis makes them well matched. However, my initial measurements showed some discrepancy which I wanted to correct:

I'm not a fan of excessive equalization—I believe that our brain is a much more powerful computer than any audio analyzer. But adding a couple of filters to correct for speaker placement seems reasonable here. In this case, I reduced the amplitude of one of the notch filters in the LXmini equalization and added a couple more filters:

Note that I didn't do anything below 50 Hz because I plan to use the subwoofer with the crossover frequency at 45 Hz.
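For reference, the filters involved here are ordinary parametric (peaking) biquads. Here is a sketch using the well-known RBJ Audio EQ Cookbook formulas; the center frequency, gain, and Q below are made-up illustration values, not my actual LXmini corrections:

```python
import numpy as np
from scipy.signal import freqz

def peaking_biquad(fs, f0, gain_db, q):
    """RBJ Audio EQ Cookbook peaking filter; negative gain gives a dip."""
    a = 10 ** (gain_db / 40)
    w0 = 2 * np.pi * f0 / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 + alpha * a, -2 * np.cos(w0), 1 - alpha * a])
    den = np.array([1 + alpha / a, -2 * np.cos(w0), 1 - alpha / a])
    return b / den[0], den / den[0]

fs = 48000
# Hypothetical correction: a -4 dB dip at 120 Hz with Q = 2.
b, a = peaking_biquad(fs, f0=120.0, gain_db=-4.0, q=2.0)
w, h = freqz(b, a, worN=[120.0], fs=fs)
print(round(20 * np.log10(abs(h[0])), 1))  # -4.0 dB at the center frequency
```

The same coefficients can be typed directly into a miniDSP or any other biquad-based EQ.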

Then I adjusted the KRK 10s, attenuating its output in the range of 30–60 Hz to relatively "boost" its output at 20 Hz. Here I used the filters suggested by Room EQ Wizard for the listening position:

Subwoofer Alignment in Time Domain

This was the most challenging part. I connected the subwoofer using a cascaded miniDSP 2x4 HD in the following way:

Additional processing delay, phase shifts, and asymmetric positioning together create a system which is challenging to analyze. Instead, I decided to apply the approach suggested by Dr. Ulrich Brüggemann, the author of the Acourate software. The procedure consists of the following steps:
  1. Capture the impulse response of the main speaker using Acourate without the subwoofer.
  2. Capture the impulse response of high-passed main speaker plus the subwoofer. The high frequency part of the response allows Acourate to align these IRs in time.
  3. Convolve both impulse responses with a sine wave from the overlapping region.
  4. By comparing the mutual offsets of the resulting sine waves at the initial transient and during the sustained period, deduce the time delay and a possible phase inversion.
As I've learned from my experience, aligning based on a single frequency in Step 3 may not provide the best results, as at low frequencies the phase and the group delay of speakers may fluctuate severely. So instead of using a single sine wave, I used a log sweep over the bass region. This doesn't provide data for aligning initial transients, but for bass frequencies I think the sustained stage is much more important.
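The sweep-based variant of the procedure can be sketched as follows: convolve both impulse responses with the same log sweep and estimate their mutual delay from the cross-correlation peak. The IRs here are toy single-impulse responses; real ones would come from Acourate or any other measurement tool:

```python
import numpy as np
from scipy.signal import chirp, fftconvolve

fs = 48000
t = np.arange(int(0.5 * fs)) / fs
sweep = chirp(t, f0=40, t1=t[-1], f1=100, method='logarithmic')

# Toy impulse responses: the sub path lags the main speaker by 2.5 ms.
delay = int(0.0025 * fs)                 # 120 samples
ir_main = np.zeros(1024); ir_main[0] = 1.0
ir_sub = np.zeros(1024); ir_sub[delay] = 1.0

a = fftconvolve(sweep, ir_main)
b = fftconvolve(sweep, ir_sub)

# The lag of the cross-correlation peak gives the mutual delay.
xcorr = fftconvolve(a, b[::-1])
lag = np.argmax(xcorr) - (len(b) - 1)
print("estimated delay: %.2f ms" % (-lag / fs * 1000))
```

With real measurements the correlation peak is less sharp, which is exactly why looking at the convolved waveforms themselves, as in the graphs below, is informative.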

Here is how the convolutions with a log sweep from 40 to 100 Hz looked initially for the left and right speakers:

The left graph is mostly aligned, while the right one shows a delay of the main speaker by 2.5 ms. It can be seen that even for the left speaker, the alignment in the low bass region is poorer than at higher frequencies. I don't consider that a problem because the contribution of the LXminis there is negligible. It's much more important to time align the region where both the sub and the LXminis can be heard together. It's also easy to see that if we attempted to use the crossover frequency (45 Hz) as the anchor point for time alignment, the speakers would be out of phase at higher frequencies, which would result in a "sagged" frequency response.

To avoid compromising the alignment of the left speaker, I decided to delay the sub by 1.25 ms, which improves the alignment for the right speaker but doesn't degrade it too much for the left one. Below are the graphs of the LXminis filtered with a Linkwitz-Riley 24 dB/oct crossover at 45 Hz, with the subwoofer added:
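As a sanity check of the crossover math: an LR4 (24 dB/oct) filter is two cascaded 2nd-order Butterworth sections, so each branch is 6 dB down at the crossover frequency, and the two branches sum flat when they are in phase. The 1.25 ms sub delay also happens to be an exact number of samples at 48 kHz:

```python
import numpy as np
from scipy.signal import butter, freqz

fs = 48000
fc = 45.0
# LR4 = two cascaded 2nd-order Butterworth sections, so the response at
# the crossover is the Butterworth -3 dB point squared.
b_lp, a_lp = butter(2, fc, btype='low', fs=fs)
w, h = freqz(b_lp, a_lp, worN=[fc], fs=fs)
lr4_lp = h[0] ** 2                           # cascading multiplies responses
print(round(20 * np.log10(abs(lr4_lp)), 1))  # -6.0 dB at 45 Hz
# The 1.25 ms subwoofer delay as an integer sample count:
print(int(round(0.00125 * fs)), "samples")   # 60 samples
```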

We can definitely see the extended bass range. You can also feel it :) I think setting the crossover point low gets the maximum fidelity from the LXminis + subwoofer combination.

With all this laborious setup done, it's time to enjoy music!

Thursday, September 12, 2019

AES Conference on Headphones, Part 2—ANC, MEMS, Manufacturing

I continue to share my notes from the recent AES Conference on Headphones. This post is about Active Noise Cancellation (ANC), Microelectromechanical (MEMS) technologies for speakers and microphones, and topics on headphones manufacturing, measurement, and modelling.

Active Noise Cancelling

Apparently, Active Noise Cancelling (ANC) is a big thing for the consumer market and is an interesting topic for research because it involves both acoustics and DSP. ANC technologies save our ears because they allow listening to music in noisy environments without blasting it at levels that damage hearing. Typically, ANC is implemented on closed headphones or earphones, as their physical construction attenuates some of the noise passively, especially at middle and high frequencies. Low frequency noise has to be eliminated using active technologies. Since this requires embedding electronics into headphones, even for wired models, it also gives headphone designers a good opportunity to add "sound enhancing" features like active equalization and crossfeed.

The obvious approach to active noise cancellation is to put a microphone on the outer side of the ear cup and generate, via the speaker, an inverse sound that cancels the noise at the eardrum. However, as there is always some leakage of the noise into the ear, "slow" or too aggressive noise inversion will create an unpleasant comb filter effect due to the summing of the noise with its delayed inverted copy.

An interesting idea that helps to win some time for more precise noise cancelling is to capture the noise from the ear cup on the opposite ear, since in that case the acoustic wave of the noise has to travel some extra distance around the head. However, as an engineer from Bose explained to me, the actual results depend on the orientation of the sound wave's plane with respect to the listener.

One consideration that has to be taken into account when generating inverse noise is avoiding the creation of high peaks in the inverse transfer function from notches in the original function. The process that helps avoid this is called "regularization". It is described in this AES paper from 2016.

Use of ANC puts additional requirements on the drivers used in the headphones. As low frequency noise needs the most attenuation, a high displacement speaker driver is required to produce adequate counter pressure. This typically requires increasing the size of the driver, which in turn increases distortion at higher frequencies. This paper contains an interesting analysis of these effects for two commercial drivers.

"Hear Through"

"Hear Through" is a commonly used term for the technology that presents environmental sounds to a listener wearing noise cancelling headphones. This is achieved by playing the sound captured by the outer microphone into the corresponding ear (basically, performing a real-time "dummy head" sound field capture, which I described in the previous post). The Sennheiser AMBEO headset and AKG N700NC headphones implement "hear through", though not perfectly in my experience—the external sound has some coloration and some localization problems. Although that doesn't affect the ability to understand speech, it still feels unnatural, and there is ongoing research into making "hear through" more transparent.

According to the study described in the paper "Study on Differences between Individualized and Non-Individualized Hear-Through Equalization...", there are two factors that affect the "transparency" of the played back sound. First, there is the fact that closed headphones act as a filter when interacting with the ear, and this filter has to be compensated. Second, there is the already mentioned sound leakage. Because "hear through" involves sound capture, processing, and playback, it has non-negligible latency that creates a comb filter with the original leaked noise. The researchers demonstrated that the use of personal "hear through" equalization (HT EQ, specific both to the person and the headphones) can achieve a very convincing reproduction. However, the acquisition of HT EQ parameters has to be performed in an anechoic room (similar to classic HRTF acquisition), and thus is not yet feasible for commercial products.

MEMS Technologies

I must admit, I was completely unaware of this acronym before I attended the conference. But it turns out this technology isn't something new. The portable computer you are using to read this article contains several MEMS microphones. The key point about this technology is that the resulting devices are miniature and can be produced using the same processes as integrated circuits (ICs). The resulting device is packaged into a surface mounted (SMD) component. The use of an IC process means huge production volumes are easily possible, and the variation between components is quite low.

Initially I thought that MEMS means piezoelectric technology, but in fact any existing transducer technology can be used for engineering MEMS speakers and microphones: electret, piezo, and even electrostatic as was demonstrated in the paper "Acoustic Validation of Electrostatic All-Silicon MEMS-Speakers".

MEMS microphones are ubiquitous. The biggest manufacturer is Knowles. For example, their popular model SPH1642HT5H-1, which has high SNR and low THD, costs less than $1 when bought in batches. Due to their miniature size, MEMS microphones are omnidirectional across the whole audio range. Because of this, I was wondering whether MEMS microphones can be used for acoustic measurements. It turns out researchers have been experimenting with them for this purpose since 2003 (see this IEEE paper). However, the only commercially available MEMS measurement microphone I could find—from IK Multimedia—doesn't seem to provide stellar performance.

Engineering a MEMS speaker is more challenging than a microphone due to the miniature size. Apparently, the output sound level decays very quickly, so currently they can't be used for laptop or phone speakers. The only practical application for MEMS speakers at the moment is in-ear headphones, where the pressure chamber effect in an occluded ear canal boosts their level a bit. A prototype of MEMS earphones was presented in the paper "Design and Electroacoustic Analysis of a Piezoelectric MEMS In-Ear Headphone". The earphone is very minimalist and could even be made DIY, since it basically consists of a small PCB with a soldered-on MEMS speaker and a 3D-printed enclosure. The performance isn't satisfying yet, but there is definitely some potential.

Headphones Manufacturing, Measurement, and Modelling

This is a collection of notes that I've gathered from the workshop on "Practical Considerations of Commercially Viable Headphones" (a "workshop" format means that there was no paper submitted to AES), my chats with engineers from headphone companies, and conversations with the representatives of measurement equipment companies.

Speaker driver materials

The triangle of driver diaphragm properties:
Low mass results in better sensitivity, as less force is required to move the diaphragm. Good mechanical damping is needed for reproducing transients truthfully and without post-ringing. And the higher the rigidity, the more the diaphragm resembles a theoretical "piston", and thus the lower its distortion.

In practice, it is hard to satisfy all of these properties at once. For example, the classical paper cone diaphragm has good rigidity but high mass. Rigid metal diaphragms can lack good damping and "ring". I would also add a fourth dimension here—the price. There are some diaphragms on the market that satisfy all three properties, but they are very expensive due to the use of precision materials and a complicated manufacturing process.

Driver diaphragms for headphone speakers are typically produced from various polymers as they can be stamped easily. In terms of the resulting diaphragm quality, the following arrangement has been presented, from worst to best:
However, it looks like even better results are achieved with beryllium foil (used by Focal company), but these diaphragms are quite expensive.

If we step away from dynamic drivers, planar magnetic drivers are very well damped, have a lightweight diaphragm, and move as a plane. The problem with their production is a high defect rate: each speaker has to be checked individually, which is why they are mostly used in expensive hi-end headphones. Big companies like AKG, Beyerdynamic, Sennheiser, and Shure use classic dynamic drivers even in their flagship models.

Measurements and Tuning

Regarding the equipment, here is a measurement rig for over-ear and in-ear headphones from Audio Precision. It consists of APx555 Audio Analyzer, APx1701 Test Interface (basically a high-quality wide bandwidth amplifier), and AECM206 Test Fixture to put the headphones on.

The APx555 is modular. The one in the picture is equipped with a Bluetooth module, and I've been told that it supports HD audio codecs: AAC, aptX, and LDAC.

Besides AP's AECM206 test fixture, a full head and torso simulator (HATS), e.g. KEMAR from GRAS can be used. For earphone measurements it is sufficient to use an ear simulator as earphones do not interact with the pinna.

The companies Brüel & Kjær and Listen Inc also presented their measurement rigs and software. Prices for this equipment are on the order of tens of thousands of dollars, which is expected.

Measuring the headphones correctly is a challenging problem. There is a nice summary in these slides, courtesy of CJS Labs. The resulting frequency response curves can vary due to variations in placement of the headphones on the fixture. Usually, multiple trials are required with re-positioning of the headphones and averaging.

When measuring distortion, the first requirement is to perform the measurement in quiet conditions so that external noise doesn't affect the results. Second, since measurement microphones are typically band-limited to the audio frequency range, THD at high frequencies can't be measured adequately using the single tone method: the harmonics fall outside the microphone's range. Instead, non-linearity is measured using the two tone method (intermodulation distortion).
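A toy numerical illustration of why the two tone method works: the difference-order products of two high-frequency tones fall back inside the measurable band, while the harmonics would not. The cubic "driver" model and the levels below are arbitrary:

```python
import numpy as np

fs = 96000
n = fs                          # 1 second -> 1 Hz bin resolution
t = np.arange(n) / fs
f1, f2 = 18000.0, 19000.0       # two-tone stimulus near the top of the band
x = 0.5 * np.sin(2 * np.pi * f1 * t) + 0.5 * np.sin(2 * np.pi * f2 * t)

# Toy weakly nonlinear "driver": small quadratic and cubic terms.
y = x + 0.01 * x**2 + 0.001 * x**3

spec = np.abs(np.fft.rfft(y * np.hanning(n)))
freqs = np.fft.rfftfreq(n, 1 / fs)

def level_db(f):
    """Level of the strongest bin near f, relative to the fundamental."""
    i = np.argmin(np.abs(freqs - f))
    return 20 * np.log10(spec[i - 2:i + 3].max() / spec.max())

# The difference products land inside the band, while the harmonics
# (36 and 38 kHz) would be beyond a typical measurement microphone.
print("2nd-order IMD at f2-f1:", round(level_db(f2 - f1)), "dB")
print("3rd-order IMD at 2*f1-f2:", round(level_db(2 * f1 - f2)), "dB")
```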

The usual "theoretical" target for headphones is to imitate the sound of good (linear) stereo loudspeakers. The deviation of the headphones' frequency response from the response recorded from loudspeakers using a HATS is called "insertion gain". Ideally, it should be flat. However, listening to speakers can happen under different conditions: the extremes are free field and diffuse field. So the real insertion gain of headphones is never flat, and it is usually tweaked according to the taste of the headphones designer.

There is one interesting effect which occurs when using closed-back headphones or earphones. Due to ear canal occlusion, the sound level from headphones must be approximately 6 dB higher to create the same perceived loudness as a loudspeaker. This is called the “Missing 6 dB Effect”, and a full description can be found in this paper. Interestingly, the use of ANC could help reduce the effects of occlusion; see the paper "The Effect of Active Noise Cancellation on the Acoustic Impedance of Headphones", which was presented at the conference.

Speaking of ANC, measuring its performance is yet another challenge due to the absence of industry-wide standards. This is explained in the paper "Objective Measurements of Headphone Active Noise Cancelation Performance".

Modelling and Research

Thanks to one of the attendees of the conference, I've learned about the works of Carl Poldy (he used to work at AKG Acoustics, then at Philips), for example his AES seminar from 2006 on Headphone Fundamentals. It provides classical modelling approaches using electrical circuits and the two-port ABCD model. The two-port model can be used for simulation in the frequency domain. Time domain simulation can be done using SPICE; see this publication by Mark Kahrs.

However, these modelling approaches are of a more "academic" nature. "Practical" modelling was presented by representatives of the COMSOL company. Their Multiphysics software can simulate the creation of acoustic waves inside the headphones and how they travel through the ear's acoustic canals and bones. This was quite impressive.

Another interesting paper related to research, "A one-size-fits-all earpiece with multiple microphones and drivers for hearing device research" presents a device that can be used in hearables research. It consists of an ear capsule with two dynamic drivers and two microphones. It is called "Transparent earpiece", more details are available here.

Thursday, September 5, 2019

AES Conference on Headphones, Part 1—Binaural Audio

I was lucky to attend the AES Conference on Headphones held in San Francisco on August 27–29. The conference represents an interesting mix of research, technologies, and commercial products.
I learned a lot of new things and was happy to have conversations with both researchers and representatives of commercial companies that produce headphones and other audio equipment.

There were several main trends at the conference:
  • Binaural audio for VR and AR applications
    • HRTF acquisition, HCF
    • Augmented reality
    • Capturing of sound fields
  • Active noise cancellation
    • "Hear through"
  • MEMS technologies for speakers and microphones
    • Earphones and research devices based on MEMS
  • Headphone production: modeling, materials, testing, and measurements.
In this post I'm publishing my notes on binaural audio topics.

"Binaural audio" here means "true to life" rendering of spatial sounds in headphones. The task here is as follows—using headphones (or earphones) produce exactly the same sound pressure on the eardrums as if it was from a real object emitting sound waves from a certain position in a given environment. It is presumed that by doing so we will trick the brain into believing that this virtual sound source is real.
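Computationally, the core of binaural rendering reduces to convolving a dry source signal with a pair of head-related impulse responses (HRIRs), one per ear. Here is a minimal sketch with a fake HRIR pair; a real one would be measured or taken from an HRTF database (e.g. a SOFA file):

```python
import numpy as np
from scipy.signal import fftconvolve

fs = 48000
# Hypothetical HRIR pair for one source direction (values are made up).
hrir_left = np.zeros(256); hrir_left[10] = 0.9    # nearer ear: earlier, louder
hrir_right = np.zeros(256); hrir_right[40] = 0.5  # farther ear: ITD and ILD

mono = np.random.default_rng(0).standard_normal(fs)  # 1 s dry "source"

# Binaural rendering: convolve the dry signal with each ear's HRIR.
out = np.stack([fftconvolve(mono, hrir_left),
                fftconvolve(mono, hrir_right)], axis=1)
print(out.shape)  # a stereo signal ready for headphone playback
```

Real HRIRs of course encode much more than the interaural time and level differences sketched here, notably the pinna filtering that resolves up/down and front/back directions.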

And this task is not easy! When using loudspeakers, commercially available technologies usually require multiple speakers located all around the listener. The produced sound waves interact with the listener's body and ears, which helps the listener determine the positions of virtual sound sources. While implementing convincing surround systems is still far from trivial, anyone who has ever visited a Dolby Atmos theater can confirm that the results sound plausible.

HRTF Acquisition, HCF

When a person is using headphones, there is only one speaker per ear. The speakers are positioned close to the ears (or even inside the ear canals), so the sound waves skip the interaction with the body and pinnae. In order to render a correct binaural representation, a personal Head-related Transfer Function (HRTF) is needed. Traditional approaches to HRTF acquisition require half-spheres with speakers mounted around the person, or moving arcs with speakers. The acquisition is done in an anechoic room, with measurement microphones inserted into the person's ear canals.

Apparently, this is not a viable approach for the consumer market. The HRTF needs to be acquired quickly and under "normal" (home) conditions. There are several approaches that propose alternatives to the traditional methods, namely:
  • 3D scanning of the person's body using consumer equipment, e.g. Xbox Kinect;
  • AI-based approach that uses photos of the person's body and ears;
  • self-movement of a person before a speaker in a domestic setting, wearing some kind of earphones with microphones on them.
At the conference there were presentations and demos from Earfish and Visisonics. These projects are still in the stage of active research and invite individuals to try them in order to gather more data. Speaking of research, while talking with one of the participants I learned about the structural decomposition of HRTF, where the transfer function is split into separate parts for the head, torso, and pinnae, which are then combined linearly. This results in simpler transfer functions and shorter filters.

There was an interesting observation mentioned by several participants: people can adapt to an “alien” HRTF after some time and even switch back and forth. This is why research on HRTF compensation is difficult—researchers often get used to a model even if it represents their own HRTF incorrectly, and always have to ask somebody unrelated to check the model (a similar problem exists in lossy codec research—people train themselves to look for specific artifacts but might miss some obvious audio degradation). There is also a difficulty due to the room divergence effect: when sounds are played via headphones in the same room they were recorded in, they are perfectly localizable, but localization breaks down when the same sounds are played in a different room.

Using a “generic” HRTF is also possible, but in order to minimize front / back confusion, head tracking is required. Without head tracking, the use of long (RT60 > 1.5 s) reverberation can help.

But knowing the person's HRTF constitutes only one half of the binaural reproduction problem. Even assuming that a personal HRTF has been acquired, it's still impossible to create exact acoustic pressure on eardrums without taking into account the headphones used for reproduction. Unlike speakers, headphones are not designed to have a flat frequency response. Commercial headphones are designed to recreate the experience of listening over speakers, and their frequency response curve is designed to be close to one of the following curves:
  • free field (anechoic) listening environment (this is less and less used);
  • diffuse field listening environment;
  • "Harman curve" designed by S. Olive and T. Welti (gaining more popularity).
And the actual curve is often neither of those, but rather tuned to the taste of the headphone designer. Moreover, the actual pressure on the eardrums depends on the person wearing the headphones, due to the interaction of the headphones with the pinnae and the ear canal resonance.

Thus, complementary to the HRTF is the Headphone Compensation Function (HCF), which "neutralizes" the headphone transfer function and makes the headphone frequency response flat. Like the HRTF, the HCF can be either "generic" (measured on a dummy head) or individualized for a particular person.

The research described in the "Personalized Evaluation of Personalized BRIRs..." paper explores whether the use of an individual HRTF and HCF results in better externalization, localization, and absence of coloration for sound reproduced binaurally in headphones, compared to real sound from a speaker. The results confirm this; however, even with a "generic" HRTF it's possible to achieve a convincing result if it's paired with a "generic" HCF from the same dummy head. It turns out it's not a good idea to mix individualized and generic transfer functions.

Speaking of commercial applications of binaural audio, there was a presentation and a demo of how Dolby Atmos can be used for binaural rendering. Recently a recording of Henry Brant's symphony “Ice Field” was released on HD streaming services as a binaural record (for listening with headphones). The symphony was recorded using 100 microphones and then mixed using Dolby Atmos production tools.

It seems that the actual challenge while making this recording was arranging the microphones and mixing all those 100 individual tracks. The rendered "3D image", in my opinion, isn't very convincing. Unfortunately, Dolby does not disclose the details of the Atmos for Headphones implementation, so it's hard to tell what listening conditions (e.g. what type of headphones) they target.

Augmented Reality

Implementing audio for augmented reality (AR) is even more challenging than for virtual reality (VR), as the presented sounds must not only be positioned correctly but also blend with environmental sounds and respect the acoustical conditions (e.g. the reverberation of the room, and the presence of objects that block, reflect, or diffract sound). That means an ideal AR system must somehow "scan" the room to find out its acoustical properties, and keep doing so during the entire AR session.

Another challenge is that AR requires very low latency: less than 30 ms between a human's expectation and the sound being presented. The expectation is tricky to define, as it can come from different sources: a virtual rendering of an object in AR glasses, or a sound captured from a real object. Similarly to how a video AR system can display virtual walls surrounding a person and might need to modify a captured image for proper shading, an audio AR system would need to capture the sound of a voice coming from that person, process it, and render it with the reverberation from those virtual walls.

There was an interesting AR demo presented by Magic Leap using their AR glasses (Magic Leap One) and the Sennheiser AMBEO headset. In the demo, the participant could “manipulate” virtual and recorded sound sources, which also had AR representations as pulsating geometrical figures.

Another example of an AR processing application is “active hearing”, that is, boosting certain sound sources, analogous to the cocktail party effect performed by the human brain, but done artificially. In order to make that possible, the sound field must first be "parsed" by AI into sound sources localized in space. Convolutional Neural Networks can do that from recordings made by arrays of microphones, or even from binaural recordings.

Capturing of Sound Fields

This means recording environmental sounds so they can be reproduced later, recreating the original environment as closely as possible. The capture can serve several purposes:
  • consumer scenarios—accompanying your photos or videos from vacation with realistic sound recordings from the place;
  • AR and VR—use of captured environmental sounds for boosting an impression of "being there" in a simulation;
  • acoustic assessments—capturing noise inside a moving car or acoustics of the room for further analysis;
  • audio devices testing—when making active audio equipment (smart speakers, noise cancelling headphones etc) it's important to be able to test it in a real audio environment: home, street, subway, without actually taking the device out of the lab.

The most straightforward approach is to use a dummy or real head with a headset that has microphones on it. Consumer-oriented equipment is affordable—the Sennheiser AMBEO Headset costs about $200—but it usually has a low dynamic range and higher distortion levels that can affect sound localization. Professional equipment costs much more—the Brüel & Kjær type 4101-B binaural headset costs about $5000, and that doesn't include a microphone preamp, so the entire rig would cost as much as a budget car.

HEAD Acoustics offers an interesting solution called 3PASS where a binaural recording captured using their microphone surround array can later be reproduced on a multi-loudspeaker system in a lab. This is the equipment that can be used for audio devices testing. The microphone array looks like a headcrab. Here is the picture from HEAD's flyer:
When doing a sound field capture with a binaural microphone, the resulting recording can't be further transformed (e.g. rotated), which limits its applicability in AR and VR scenarios. For these, the sound field must be captured using an ambisonics microphone. In this case the recording is decomposed into spherical harmonics and can be further manipulated in space. Sennheiser offers the AMBEO VR Mic for this; it costs $1300, but the plugins for the initial audio processing are free.
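This manipulability is what distinguishes an ambisonics capture from a binaural one. For first-order B-format, for instance, rotating the whole sound field about the vertical axis is just a 2x2 rotation of the X and Y channels; a sketch:

```python
import numpy as np

def rotate_bformat_z(w, x, y, z, theta):
    """Rotate a first-order B-format recording about the vertical axis.
    W (omni) and Z (vertical) are unaffected; X and Y rotate as a pair."""
    xr = x * np.cos(theta) - y * np.sin(theta)
    yr = x * np.sin(theta) + y * np.cos(theta)
    return w, xr, yr, z

# A source encoded straight ahead puts its signal into the X channel.
sig = np.ones(4)
w, x, y, z = 0.707 * sig, sig.copy(), np.zeros(4), np.zeros(4)
w2, x2, y2, z2 = rotate_bformat_z(w, x, y, z, np.pi / 2)
print(np.allclose(x2, 0), np.allclose(y2, sig))  # the source is now at 90°
```

Nothing comparable is possible with a two-channel binaural recording, where the head's filtering is already baked in.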

Friday, May 31, 2019

Measuring Bridged and "Balanced" Amplifier Outputs

For a long time this topic was troubling me—how to measure bridged mode amplifiers properly. The problem is that without taking precautions it's possible to end up with an amp ruined by a short circuit. I think I've now got enough understanding of this matter and obtained some interesting results by measuring one of the amps I use.

Bridged Mode of Power Amplifiers

A lot of the commercial stereo amplifiers I've seen have a "bridged mode" feature which turns the unit into a mono amplifier of higher power. E.g. on my Monoprice Unity amplifier, one needs to set the mode switch accordingly, connect the "+" wire of the speaker to the right "+" output, and the "-" wire of the speaker to the left "-" output. Obviously, only one input (left) is used in this case.

This mode is implemented in the amplifier by dedicating each of the channels to one wire of the load, and inverting the input to one of the amplifiers. Schematically, it looks like this:

This configuration doubles the voltage across the load compared to regular stereo mode. In theory, this would result in a 4x power increase into the same load, but in reality, due to various losses, it's usually only a bit higher than 3x. For example, the Monoprice Unity 100W amp is specified as delivering 50 W/channel into an 8 Ohm load in stereo mode, and 120 W into the same load when bridged, a 2.4x ratio. The exemplarily engineered AHB2 amplifier from Benchmark offers a much higher increase of 3.8x into the same load when bridged.
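To sanity-check these figures, the ideal bridged-mode gain follows directly from P = V² / R: doubling the voltage across the same load quadruples the power. Here is a minimal sketch (the function names are mine, not from any measurement software):

```cpp
// Power dissipated in a resistive load: P = V^2 / R.
double power_watts(double v_rms, double load_ohms) {
  return v_rms * v_rms / load_ohms;
}

// Ideal bridged-to-stereo power ratio for the same load: bridging
// doubles the voltage across the load, so the ratio is 4x. Real
// amplifiers achieve less (2.4x for the Monoprice Unity, 3.8x for
// the Benchmark AHB2) due to losses into the halved per-channel load.
double ideal_bridged_ratio(double v_rms, double load_ohms) {
  return power_watts(2 * v_rms, load_ohms) / power_watts(v_rms, load_ohms);
}
```

For instance, 20 Vrms into 8 Ohm gives 50 W per channel, so the ideal bridged figure would be 200 W; the specified 120 W shows how far a real amp falls short of the theoretical 4x.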

However, the bridged configuration can potentially add more distortion because each channel effectively "sees" half the load impedance (e.g. 4 Ohm if an 8 Ohm speaker is connected). Thus, it would be interesting to measure the difference in distortion between bridged and regular mode. But here is the catch: the "-" wire of the load is now connected to the second amplifier's output. We can't connect it to the signal ground of an audio analyzer anymore, as this would short-circuit the amplifier.

Here is why it happens. Normally, the ground plane of the input audio signal is the same as the ground plane of the output. When using an audio analyzer, this allows directly comparing the input signal from the signal generator to the output:
However, in the bridged configuration the zero voltage point (reference potential) for amp's output is virtual and located "in between" the terminals of the load:
The same situation can be encountered with Class-D amplifiers designed for maximum efficiency. In this case, the so-called H-bridge configuration is used. That means these amplifiers do not offer a "single-ended" mode at all and always run bridged. Not every Class-D amp uses an H-bridge, but measurements for this class of amplifiers must be done with caution.

"Balanced" and "Active Ground" Headphone Amplifiers

And we encounter the same problem when we want to measure a headphone amplifier with "balanced" or "active ground" output. Note that the implementation of "balanced" output may vary—in the simplest case it only means that left and right outputs do not share the ground point. This is done to reduce channel crosstalk that occurs due to common-impedance coupling. In this case there is no additional amplifier on the "-" wire, and thus connecting it to the ground of the analyzer input does not cause any issues.

However, if "balanced" headphone output means "doubled circuitry" (essentially, this is the same as "bridging" for a power amplifier), or if the ground channel has a dedicated amplifier path, as in the AMB M3 amplifier (this is called "active ground"), then we must avoid connecting the ground of the output to the ground of the analyzer input.

Measurement Techniques

Since we must avoid connecting the ground of the output to the ground of the input, the simplest solution would be to leave the second wire of the output "floating" and only connect the "+" wire to the signal input of the analyzer. That's what I used myself in the past. In this case, the analyzer still uses the input ground as a reference. The result might be off due to the difference in levels between the "virtual ground" point in the middle of the load and the input ground.

For example, I created a symmetric load consisting of two 4 Ohm resistors. In this case, theoretically there is a 0 V point right between them. In practice, the measured difference between the potentials of the output and input grounds was 0.35 V. That means, it's better to avoid connecting them because this voltage will induce current into the input ground.

However, it's possible to use a second, floating analyzer unit on the output. We can use a battery-powered voltmeter for measuring the voltage across the load, right? The same way, we can use a full analyzer, as long as it's not connected to the input. This way, the analyzer on the output measures the output voltage relative to the output ground, which gives correct results. But operating two analyzers (one for generating signals and another for measuring the output) can be cumbersome.

Also, what if we can't split the load, e.g. if we are using a real speaker instead of a resistor load? In this case we need to make a differential measurement. For oscilloscopes, there are special probes for this purpose. QuantAsylum QA401 has differential inputs (marked "+" and "-"). We need to connect one side of the load to the "+" input wire and the other to the "-", leaving the input ground floating. That's OK because the ground is not used as a signal reference anymore. Here is how the wiring looks:
Another advantage of a differential input is that any common-mode noise on the probes gets cancelled. I often noticed a 60 Hz spike in single-ended measurements, but it disappeared immediately after I switched to the differential input, with the same amp, same probes, and same connections. That means the 60 Hz hum is induced into the probes' wires by electromagnetic fields from nearby mains wiring.

Measuring Monoprice Unity 100W Amp

As a practical exercise, I've measured THD and IMD of the Monoprice Unity 100W Class-D amplifier. It does not use an H-bridge configuration, which means that in stereo mode the channels are driven single-ended and the "-" wire of the speaker is at the input ground plane's potential.

Bridged mode into 8 Ohm load, differential measurement

First I set the amp to maximum volume and checked with a true RMS voltmeter the potential difference across an 8 Ohm load while driving the input with a 1 kHz sine wave at -10 dBV (the nominal consumer line level). The voltmeter was showing 19.55 Vrms. Note that the resulting power value (from the V² / R formula) is ~48 W, which is less than half of the 120 W specified in the amp's manual (perhaps the manufacturer was using a higher input signal level). However, these levels seem right to me; in fact, I don't usually run the amp even at maximum volume.

But even that output level is close to QA401's input voltage limit (20 Vrms), so I decided to use a split load (2 x 4 Ohm resistors in series) and lowered the input signal to -12 dBV. This got me 14.47 Vrms across the 8 Ohm load, which is a mere 26 W. Over the same load, a differential measurement with QA401 shows a 23 dBV peak (which agrees with the Vrms figure), and with the load specified as 8 Ohm, QA401 also shows 25 W output power. Nice.

I also tried measuring with QA401 over half the load (4 Ohm). The peak was now 17 dBV (7 Vrms, half of the full-load voltage), so I had to specify the load in QA401 as 2 Ohm in order to get the same 25 W figure.

Here is what I saw in terms of THD and IMD:

Definitely not outstanding results, especially if we consider that this is at less than 1/4 of the advertised power. One particularly interesting issue is the amount of ultrasonic noise on the IMD measurement. I suppose, this is caused by the fact that this amp uses a weak anti-aliasing filter, as we can see from its frequency response measurement:

The graph is quite fuzzy due to the amplifier's non-linearity, but we can still clearly see that the downward slope on the right is very gentle. This could be a good property for a Class-A or Class-AB amplifier, but since Class-D effectively applies sampling to the input signal, the output had better be treated with a brick-wall filter.

Single-ended mode into 8 Ohm load

I tried to achieve the same modest 25 W into an 8 Ohm load (remember that the manual states the amp outputs 50 W into 8 Ohm in single-ended configuration), however with the volume at maximum the voltmeter reading was only 10.45 Vrms, which is less than 14 W of output power. I increased the input signal level to the nominal -10 dBV, which got me about 22 W. And even at this lower power, THD doubled compared to bridged mode, and the dual-tone signal for IMD was overloading the amplifier, so I had to cut the IMD input back to -12 dBV (and it still seemed to overload).


Conclusions

1. Bridged amplifiers can be measured properly using the differential mode of the QuantAsylum QA401 analyzer. If the output voltage is too large, the load can be split to reduce it. Necessary corrections have to be applied if we want QA401 to display proper power figures. It's always possible to double-check the results using a true RMS voltmeter.

2. Differential measurement also helps to defeat noise induced into the probe wires by electromagnetic fields, especially the notorious 60 Hz hum.

3. The performance of the Monoprice Unity 100W amp in single-ended mode is quite bad. For driving an 8 Ohm load I would prefer using it in bridged mode.

4. And this result was contrary to my expectations: on this amplifier, bridged mode driven at lower levels has much less distortion than single-ended mode at nominal level. That's why it's always better to measure first.

Tuesday, March 5, 2019

Amp & DAC box for LXmini (miniBox)

I'm rebuilding my audio system around RME Fireface UCX. According to numerous user posts, RME devices are very reliable, so I'm hoping to achieve overall better stability than with MOTU Ultralite AVB which goes crazy approximately every three weeks. Personally I would tolerate this, but since the system is used by my entire family, hearing from them that "sound is not working AGAIN" has become somewhat annoying.

My surround system is a 4.1 configuration. As I've explained in my post about LXmini, their super-stable and super-focused phantom center eliminates the need for a center speaker in the surround configuration. I use LXmini both for the front pair and for the surround pair. The surround pair is about 4 meters away from the audio electronics, and that's the shortest path; the speaker cables run along the wall and would be perhaps twice as long. This is why, together with the second pair of LXminis, I had built a dedicated amplifier and DAC box located close to them.

While I'm rebuilding the main system, I've made a small upgrade to this surround box as well. The main change is that I've connected it using a digital SPDIF link, and started using Neutrik SpeakON ports for connecting LXminis. Overall, the box now looks more like a complete system on its own, so I decided to do a quick post about it.

Here is what I've got in it:
It's a 2U half-rack enclosure by All Metal Parts company in the UK. On the lower level I put the QSC SPA4-100 amplifier—I use it for the front pair of LXminis, too. Then there is miniDSP 2x4 HD (one of the options recommended by S. Linkwitz), and an inexpensive SPDIF coaxial to TOSLink optical converter by Monoprice.

This is what I put on the rear side:
Here we have a pair of 4-pole SpeakON connectors for the speakers, and an SPDIF input (RCA). All the power inputs are connected directly: the standard 3-prong AC receptacle on the back of the amplifier, and power adapters from miniDSP and the SPDIF converter. I've also provided an USB wire to miniDSP for the purposes of tuning and diagnostics.

Here is a diagram of the device:
Now the obvious question: why did I choose a digital connection, and specifically coaxial? A digital connection requires only one wire, and since miniDSP is then connected using optical TOSLink, there is complete electrical isolation between the DAC / amp circuitry of the miniBox and the main system. Then why didn't I just run a TOSLink cable directly? Mainly because optical cables require a bit more care at corners. A high quality, low impedance coaxial cable seems like a more robust option.

The usual concern with using SPDIF is jitter. Would the noise in the cable create more jitter? My particular concern was also due to the fact that the Monoprice coax to optical converter does not isolate the shield of the coaxial input from the ground. This means the shield of the cable connects the ground planes of audio devices plugged into different power outlets: a perfect opportunity for ground loop-induced noise to occur. Although the coaxial cable I use (Belden 8241) has very low shield resistance, some noise voltage can still be added to the digital signal.

To check whether there is a real problem, I ran a jitter test (at 48 kHz, 24-bit resolution) in ARTA. First using miniDSP's USB input, then using TOSLink connected directly to Fireface's optical output, and finally in the actual working condition where Fireface is connected to another power outlet, with a 5.5 meter coaxial cable running from Fireface's coax output to the miniBox. The resulting spectrum, measured using QuantAsylum QA401 on the outputs of miniDSP, was always the same:
Jitter is minimal and does not depend on how the digital signal gets delivered to miniDSP. Note that similar results for USB and TOSLink input are published at Audio Science Review forum. So I think there is no reason to worry about jitter here.

One more discovery I've made is that the Monoprice coaxial to TOSLink converter I use only supports sampling rates up to 48 kHz. Not a problem for me, because this is the sampling rate I intend to run the Fireface at, so it can accept an optical input from the Xbox directly. However, should I decide to increase the sampling rate, I would need to look for a different converter.

Saturday, February 9, 2019

rvalue references and move semantics in C++11, Part 2

Continued from Part 1.

Previously we have seen that the compiler can match functions accepting rvalue reference arguments to temporary values. But as we have discussed before, it also helps for program efficiency to avoid copying when the caller abandons ownership of an object. In this case, we need to treat a local variable as if it were a temporary. And this is exactly what std::move function is for.

On the slide example, there are two overloads for the function "foo": one for lvalue references, and one for rvalue references. The calling code prepares a string, and then transfers ownership to the callee by calling std::move on the variable. The agreement is that the caller must not use this variable after calling std::move on it. The object that has been moved from remains in a valid, but not specified state.
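Since the slides themselves aren't reproduced in this post, here is a hypothetical reconstruction of the "foo" example (the names and return values are mine, added for illustration):

```cpp
#include <string>
#include <utility>

// Two overloads: one binds to lvalues, one to rvalues (temporaries and
// explicitly moved-from values).
inline const char* foo(const std::string&) { return "lvalue: caller keeps ownership"; }
inline const char* foo(std::string&&) { return "rvalue: callee may steal the contents"; }

inline const char* call_foo() {
  std::string s = "payload";
  foo(s);                    // picks the const& overload: 's' is an lvalue
  return foo(std::move(s));  // picks the && overload: ownership transferred
  // 's' must not be used below this point: valid but unspecified state
}
```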

Roughly, calling std::move on a value is equivalent to doing a static cast to rvalue reference type. But there are subtle differences with regard to the lifetime of returned temporaries stemming from the fact that std::move is a function (kudos to Jorg Brown for this example). Also, it's more convenient to call std::move because the compiler infers the return type from the type of the argument.
There are well known caveats with the usage of std::move.

First, remember that std::move does not move any data; it's effectively a type cast. It is used as an indication that the value will not be used after the call to std::move.

Second, the practice of making objects immutable using the const keyword interferes with the usage of std::move. As we remember, an rvalue reference is a writable reference that binds to temporary objects. Writability is an important property of rvalue references that distinguishes them from const lvalue references. Thus, an rvalue reference to an immutable value is rarely useful.
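A small illustration of this pitfall (my own example, not from the slides): std::move applied to a const object yields a const rvalue reference, which can't bind to the move constructor, so the copy constructor is silently chosen instead.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Compiles without errors, but copies: const std::string&& cannot bind
// to string's move constructor (which takes a non-const std::string&&).
inline std::string take(const std::string& src) {
  return std::move(src);  // a copy in disguise
}

static_assert(std::is_same<decltype(std::move(std::declval<const std::string&>())),
                           const std::string&&>::value,
              "std::move preserves constness");
```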
This one I see a lot. Remember, we said that expressions have two attributes: the type and the value category, and that an expression which names a variable has the lvalue category. Thus, if you are assigning an rvalue reference to something, or passing it to another function, the expression has the lvalue category and thus uses copy semantics.

This in fact makes sense, because after getting some object via an rvalue reference, we can make as many copies of it as we need. And then, as the last operation, use std::move to steal its value for another object.
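Sketching this caveat in code (a hypothetical function of my own):

```cpp
#include <string>
#include <utility>

// 'v' is declared as an rvalue reference, but the expression 'v' is an
// lvalue inside the function body, so plain uses of it copy.
inline std::pair<std::string, std::string> consume(std::string&& v) {
  std::string copy = v;               // copy: 'v' is an lvalue expression
  std::string stolen = std::move(v);  // the last use moves the value out
  return {copy, stolen};
}
```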
And finally: sometimes people want to "help" the compiler by telling it that they don't need the value they are returning from a function. There is absolutely no need for that. As we have seen earlier, copy elision and RVO have been in place since C++98.

Moreover, since calling std::move changes the type of the value, a not very sophisticated compiler can call a copy constructor because now the type of the function return value and the type of the actual value we are returning do not match. Newer compilers emit a warning about return value pessimization, or even optimize out the call to std::move, but it's better not to do this in the first place.
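A hedged sketch of both forms (the vector payload is arbitrary):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Correct: returning a local by value allows NRVO, and if elision
// doesn't happen, C++11 moves the vector anyway.
inline std::vector<int> make_good(std::size_t n) {
  std::vector<int> v(n, 42);
  return v;
}

// Anti-pattern: std::move blocks copy elision; modern compilers warn
// about this ("pessimizing move"). Shown here only for illustration.
inline std::vector<int> make_pessimized(std::size_t n) {
  std::vector<int> v(n, 42);
  return std::move(v);
}
```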
We have discussed how the use of move semantics improves the efficiency of the code by avoiding unnecessary data copying. Does it mean that simply recompiling the same code with support for C++11 and newer standards enabled would improve its performance?

The answer depends on the actual codebase we are dealing with. If the code mostly uses simple PODs aggregated from C++ standard library types, and the containers are from the standard library too, then yes, there can be performance improvements. This is because the containers will use move semantics where possible, and the compiler will be able to add move constructors and move assignment operators to user types automatically.

But if the code uses POD types aggregating primitive values, or homebrew containers, then performance will not improve. There is a lot of work to be done on the code in order to employ the benefits of move semantics.
In order to consider the changes between C++98 and C++11 in more detail, I would like to bring up the question of efficient parameter passing practices. Those who programmed in C++ for long enough know these conventions by heart.

On the horizontal axis, we roughly divide all the types we deal with into three groups:
  - cheap to copy values—small enough to fit into CPU's registers;
  - not very expensive to copy, like strings or POD structures;
  - obviously expensive to copy, like arrays; polymorphic types for which passing by value can result in object slicing are also in this category.

On the vertical axis, we consider how the value is used by the function: as an explicitly returned result, as a read only value, or as a modifiable parameter.

There is not much to comment here except for the fact that unlike the C++ standard library, the Google C++ coding style prohibits use of writable references for in/out function arguments in order to avoid confusion.
What changes with C++11? Not much! The first difference is obvious—C++11 gets move semantics, thus functions can be overloaded for taking rvalue references.

The second change is due to introduction of move-only types, like std::unique_ptr. These have to be passed the same way as cheap to copy types—by value.

Then, instead of considering whether a type is expensive to copy, we need to consider whether it is expensive to move. This brings the std::vector type into the second category.

Finally, for returning expensive to move types, consider wrapping them into a smart pointer instead of returning a raw pointer.
As a demonstration of why it's now efficient to pass vectors by value, let's consider the case when copy elision is not applicable due to dynamic choice of returned value.

In C++98, this code results in copying the contents of the local vector, this is obviously inefficient.
However, a C++11 compiler will call a move constructor in this case. This demonstrates why a type with efficient move implementation can always be returned by value from functions.
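A sketch of such a function (my example; the slide's exact code isn't shown here):

```cpp
#include <vector>

// Which local is returned is decided at run time, so NRVO cannot apply.
// A C++98 compiler copies the chosen vector; a C++11 compiler moves it,
// because a returned local variable is treated as an rvalue first.
inline std::vector<int> pick(bool first) {
  std::vector<int> a(1000, 1);
  std::vector<int> b(1000, 2);
  if (first) return a;  // moved, not copied, in C++11
  return b;
}
```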
Let's return to our slides explaining different ways of passing data between functions. In C++98, it makes no difference for the caller how the callee will use a parameter passed by a const reference. For the caller, it only matters that it owns this data and the callee will not change it.

If we consider the callee implementation, if it's possible for it not to make a copy of the data, it's likely a different action or algorithm from the one that requires making a copy. I've highlighted this difference on the slide by giving the callee functions different names.

And we have no performance problems with the function that doesn't need to make a copy—function B. It's already efficient.
C++11 can help us with the second case. Now we can prepare two versions of the callee: one for the case when the caller can't disown the data, and one for the case when it can. Clearly, these callers are different now, that's why they are given different names on the slide.

In the first case we still need to make a copy, but in the second case we can move from the value that the caller abandons. It's interesting that after obtaining a local value, it doesn't really matter how it has been obtained. The rest of both overloads of the function C can proceed in the same way.
Which brings us to the idea that we can unify these two overloads, and require the caller to either make a copy, or to transfer ownership for the value it used to own.

This relieves us from the burden of writing two overloads of the same function. As you can see, the call sites do not change!

In fact, this is the approach that the Google C++ Style Guide strongly recommends.
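The contrast can be sketched with a setter (a hypothetical Widget class of mine, mirroring the style guide's classic setName example):

```cpp
#include <string>
#include <utility>

// Overload-based approach: efficient, but every parameter doubles the
// number of overloads needed.
class WidgetOverloaded {
 public:
  void setName(const std::string& n) { name_ = n; }        // caller keeps its value
  void setName(std::string&& n) { name_ = std::move(n); }  // caller abandons it
  const std::string& name() const { return name_; }
 private:
  std::string name_;
};

// Unified pass-by-value approach: one function; the parameter is
// copy-constructed or move-constructed at the call site, then moved in.
class WidgetByValue {
 public:
  void setName(std::string n) { name_ = std::move(n); }
  const std::string& name() const { return name_; }
 private:
  std::string name_;
};
```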
This idiom doesn't come for free. As you can see, there is always an additional move operation that happens on the callee side.

And since the copy operation now happens at the caller side, there is no way for the callee to employ delayed copying.

However, the pass-by-value idiom works very nicely with return-by-value, because the compiler allocates a temporary value for the returned value, and then the callee moves from it.
Since the pass-by-value idiom apparently has costs, why does the Google C++ Style Guide favor it so much?

The main reason is that it's much simpler to write code this way. There is no need to create const lvalue and rvalue reference overloads—this problem becomes especially hard if you consider functions with multiple arguments. We would need to provide all the combinations of parameter reference types in order to cover all the cases. This could be simplified with templates and perfect forwarding, but just passing by value is much simpler.

The benefit of passing by value also becomes clear if we consider types that can be created by conversion, like strings. Taking a string parameter by value lets the compiler deal with all the required implicit conversions.

Also, if we don't use rvalue references as function arguments, we don't need to remember about the caveat that we need to move from them.

But we were talking about performance previously, right? It seems like we are abandoning it. Not really, because not all of our code contributes equally to program performance. So it's a usual engineering tradeoff—avoiding premature optimization in favor of simpler code.
Does it mean we can apply pass-by-value idiom to any code base? The answer is similar to the one that we had on the "silver bullet" slide. Apparently, that depends on whether types in your codebase implement efficient move operations.

Also, applying this idiom to an existing code base shifts the cost of copying to callers. So any code conversion must be accompanied with performance testing.
Having pass-by-value in mind, let's revise our table with recommended parameter passing. Types that can be moved efficiently, can be passed by value both as in and out parameters. Also, instead of passing an in/out parameter using a pointer, you can pass it as an input parameter, and then get back as an out parameter, both times by value.

Expensive to move and polymorphic types should still use pass-by-reference approach.

As you can see, there is no place for rvalue reference arguments here. As per the Google C++ Style Guide recommendations, they should only be used as move constructor and move assignment arguments. A couple of obvious exceptions are classes that have very wide usage across your codebase, and functions and methods that are proven to be invoked on performance-critical code paths.
So we have figured out that we need to know how to write move constructors and move assignments. Let's practice that.

There is a trivial case when a compiler adds a move constructor and a move assignment automatically to a structure or a class, and they will move the fields. This happens when the class doesn't have a user-declared copy constructor, copy assignment operator, or destructor.

For this example, I've chosen a simple Buffer class—we do use a lot of buffers in audio code.

Let's start with the class fields and constructors. I've made it a struct just to avoid writing extra code. So we have a size field and a data array. To leverage automatic memory management, I've wrapped the data array into a std::unique_ptr. I've also specified the default value for the 'size' field. I don't have to do that for the smart pointer because it's a class, so the compiler will initialize it to null.

I defined a constructor that initializes an empty buffer of a given size. Note that by adding empty parentheses to the array new-expression you initialize its contents with zeroes. I marked the constructor as explicit because it's a single-argument constructor, and I don't want it to be called for implicit conversions from size_t values.

Note that I don't have to define a destructor because unique_ptr will take care of releasing the buffer when the class instance gets destroyed.

And we don't have to write anything else because as I've said, the compiler will provide a move constructor and a move assignment operator automatically. But it will not be possible to copy an instance of this class because unique_ptr is move-only.
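My reconstruction of this first version of the Buffer class (the slide code isn't reproduced in this post, so details may differ; the default constructor is my addition for convenience):

```cpp
#include <cstddef>
#include <memory>
#include <type_traits>

struct Buffer {
  Buffer() = default;
  // Empty parentheses after new float[n] value-initialize the array,
  // i.e. fill it with zeroes. 'explicit' blocks implicit conversions
  // from size_t.
  explicit Buffer(std::size_t n) : size(n), data(new float[n]()) {}

  std::size_t size = 0;           // default member initializer
  std::unique_ptr<float[]> data;  // a class type: default-initialized to null
};

// No user-declared copy operations or destructor, so the move operations
// are generated implicitly; copying is disabled because unique_ptr is
// move-only.
static_assert(std::is_move_constructible<Buffer>::value, "moves are implicit");
static_assert(!std::is_copy_constructible<Buffer>::value, "copying is disabled");
```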
So let's define and implement a copy constructor. It takes a const reference to a source instance. I use a feature of C++11 called delegating constructors, which allows me to reuse the allocation code from the regular constructor. After allocating the data in the copy recipient, I copy the data from the source. Note that a call to std::copy with all parameters being nullptr is valid, thus we don't need special code to handle copying of an empty buffer.

In this case the compiler does not add move constructor and move assignment by itself. And that means, any "move" operations on this class will in fact make a copy of it. That's not what we want.

So, we tell the compiler to use the default move assignment and move constructor. These will move all the fields that the class has. Also simple.
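Continuing the sketch, here is the copy constructor with a delegating constructor, plus the explicitly defaulted move operations (again my reconstruction of the slide):

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>

struct Buffer {
  Buffer() = default;
  explicit Buffer(std::size_t n) : size(n), data(new float[n]()) {}

  // Delegates allocation to the size constructor, then copies the data.
  // std::copy over an empty (null, null) range is valid, so an empty
  // source buffer needs no special handling.
  Buffer(const Buffer& src) : Buffer(src.size) {
    std::copy(src.data.get(), src.data.get() + size, data.get());
  }

  // The user-declared copy constructor suppresses the implicit move
  // operations, so we explicitly ask for the defaults back.
  Buffer(Buffer&&) = default;
  Buffer& operator=(Buffer&&) = default;

  std::size_t size = 0;
  std::unique_ptr<float[]> data;
};
```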
By the way, we haven't provided a copy assignment yet. The compiler generates a default version, but suppose we need to write one by hand.

This is one of the idiomatic approaches; it's called copy-and-swap. I've seen it mentioned in early Scott Meyers books on C++. We make a copy of the parameter into a local variable; this calls the copy constructor we have defined earlier. Then we use the standard swap function to exchange the contents of our current instance with the copy. As we exit the function, the local buffer, now holding our old contents, gets destroyed.

The advantage of this approach is that we don't need to handle self-assignment specially. Obviously, in the case of self-assignment an unneeded copy of the data happens, but on the other hand, there is no branching (which can also hurt performance).
Now suppose we also want to implement a custom move assignment. A naive attempt would be to use the same approach as for copying. In a move assignment we receive a reference to a temporary, so we could swap with it.

Unfortunately, we are creating infinite recursion here, because the standard implementation of the swap function uses move assignment for swapping values! That makes sense, but what should we do instead?
In order to break the infinite loop, we need to implement the swap function for our class. In fact, this function is implemented by virtually any C++ standard library class.

Note that it takes its in / out parameter by a non-const reference. This is allowed by the Google C++ Style Guide specifically for this function, because it's a standard library convention.

It's important to mark this function as noexcept because we will use it in other noexcept methods.

The implementation of this function simply calls the swap function for every field of the class. As I've said, this function is defined for all language types and C++ standard library classes, so it knows how to perform swapping of size_t and std::unique_ptr values.

Note that the idiomatic way to call the swap function is to have a "using std::swap" statement and then just say "swap", instead of calling "std::swap" directly. This pattern allows user-defined swap functions to be called. The full explanation is available in the notes on argument-dependent lookup.
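The member swap might look like this (a reconstructed sketch; the constructors and assignment operators from the previous listings are omitted for brevity):

```cpp
#include <cstddef>
#include <memory>
#include <utility>

struct Buffer {
  // ... constructors and assignment operators as in the previous listings ...

  // noexcept: this function is used inside other noexcept operations.
  void swap(Buffer& other) noexcept {
    using std::swap;         // enables ADL: a user-defined swap would win,
    swap(size, other.size);  // std::swap is the fallback
    swap(data, other.data);
  }

  std::size_t size = 0;
  std::unique_ptr<float[]> data;
};
```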
We also define an out-of-class swap function which takes two Buffers and swaps them.

Note that the friend keyword makes this function an out-of-class (namespace-scope) function, even though we declare and define it within our class definition.
Now we can implement move constructor and the move assignment trivially using the swap function. A move constructor is simply a call to swap. A move assignment must in addition return a reference to our instance.

Both functions are marked noexcept.
In fact, by employing pass-by-value, we could merge the implementations of copy assignment and move assignment into one function as demonstrated here. This is called a "unified" assignment operator.

As we have discussed before, this implementation costs an additional move operation. Also, the copying now happens at the caller side. Maybe not the best approach for a class that will be used widely.
Here is the complete code for the class, and it fits on one slide. As you can see, writing small classes that support move semantics is straightforward. But as usual with C++, you have an infinite number of alternatives to choose from.
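Since the slide itself isn't reproduced in this post, here is my reconstruction of the complete listing (details may differ from the original):

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>

struct Buffer {
  Buffer() = default;
  explicit Buffer(std::size_t n) : size(n), data(new float[n]()) {}  // () zero-fills

  Buffer(const Buffer& src) : Buffer(src.size) {  // delegating constructor
    std::copy(src.data.get(), src.data.get() + size, data.get());
  }
  Buffer(Buffer&& src) noexcept { swap(src); }  // steal contents via swap

  Buffer& operator=(const Buffer& src) {
    Buffer copy(src);  // copy-and-swap: no self-assignment branch needed
    swap(copy);        // old contents go into 'copy' and die with it
    return *this;
  }
  Buffer& operator=(Buffer&& src) noexcept {
    swap(src);
    return *this;
  }

  void swap(Buffer& other) noexcept {
    using std::swap;
    swap(size, other.size);
    swap(data, other.data);
  }
  friend void swap(Buffer& a, Buffer& b) noexcept { a.swap(b); }  // out-of-class via friend

  std::size_t size = 0;
  std::unique_ptr<float[]> data;
};
```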
What I have explained so far were the basics. It's very important to understand them before going any further.

And when you are ready, here are some references. I've used these materials while preparing this talk. The first is the Scott Meyers book on effective modern C++ techniques; it contains an entire chapter on rvalue references and move semantics.
Then it's the Google C++ Style Guide which I was quoting often in this talk. It provides sensible guidelines that help writing understandable code.
Abseil library C++ tips contain a lot of explanations and examples on not so obvious behavior of C++.
Thomas Becker's article is a good place to dive into rvalues and move semantics details.
The "Back to Basics!" talk by Herb Sutter contains interesting discussions regarding the C++11 and C++14 features.

And, for getting more insight into the history of C++, and of rvalue references and move semantics in particular, there is Bjarne Stroustrup's book, and the original proposals to the standard.