Monday, July 23, 2018

Recreating miniDSP filters with Acourate

I'm getting ready to build a second pair of Linkwitz LXmini—this time for rear channels. The original design of LXminis uses miniDSP processors for implementing the crossover and speaker linearization. I use a miniDSP 2x4 HD for the first pair of LXminis, but I decided I don't want to buy a second one. The reason is that 2x4 HD has unbalanced line out connections, but for the rear speakers I would like to put the amplifier further away and would prefer to use balanced lines between the DSP and the power amplifier.

There are some balanced miniDSP units: 2x4 Bal, 4x10 HD, and 10x10 HD, but their form factors do not fit into my half-rack stack. So I decided to go another way—build a dedicated mini PC to run Acourate Convolver via my MOTU UltraLite AVB card. Another reason for choosing Acourate over miniDSP is that the former offers practically unlimited abilities to build filters, because it's all software.

Thus, my first task was to re-create LXmini's DSP crossovers and filters using Acourate. For starters, I decided to follow the original design of the filters as closely as possible (which means replicating their phase in addition to amplitude). The end result that I want to achieve is doing all the necessary speaker processing (crossovers, time alignment, speaker linearization, and room correction) in one unit: the software DSP. Thus my Oppo BD unit would be left only with the tasks of decoding Dolby and DTS streams, and upmixing stereo into multichannel.

miniDSP 2x4 HD

Let's briefly describe the capabilities and structure of the miniDSP unit. It has a stereo (2 channel) input (switchable between analog, TOSLink, and USB), and 4 channels of analog output. Here is how the processing and routing chain is organized:

When connected to USB, besides 2 output channels the unit also offers 4 input channels that allow capturing processed audio data. This is in fact a very useful feature for our task.

The DSP in "HD" products operates at a 96 kHz sampling rate. If digital input arrives at a different rate, it automatically gets resampled. The DSP implements 10 biquad IIR filters for each of the two input channels, then 18 biquads for EQ and crossovers for each output channel. It also allows a total of 4096 FIR filter taps to be arbitrarily distributed over all 4 output channels (with the limitation that a single channel can't have more than 2048 taps).

That means, the processing in this miniDSP has low latency (due to low number of taps), but minimum phase and thus non-constant group delay. The FIR filter section has limited applicability due to short filter length, which gives relatively low resolution in frequency domain. This fact makes me think that miniDSP is optimized for Audio-Video applications where low latency is required, and the quality of the filters can be sacrificed, because when watching movies we normally pay more attention to the picture than to the sound.

miniDSP units are configured using specialized software called "plugins". They can work even without a board connected, which makes them very useful for studying a provided configuration of the processing system: it's more convenient than trying to decipher the contents of config files manually.

Acourate and Acourate Convolver

"Acourate" is a family of products developed by Audio-Vero company (which as I understand consists of one man—Dr. Ulrich Brüggemann). Acourate is a filter creation tool which also has macro procedures for developing room correction filters. Then there are several variants of software that applies the filters created. Acourate Convolver is designed as a real-time audio processor for Windows, using ASIO interface for low latency access to sound card. Thus, running Convolver on a Windows PC with a good multichannel soundcard effectively turns it into a custom-built DSP box.

Acourate was created with critical listening in mind, so it allows creating linear phase FIR filters with a large number of taps. However, it's also possible to cut filters to a desired length, trading filter quality for lower latency. As Convolver supports several configurations, you can have separate setups for A/V and audio-only scenarios. It's definitely more flexible than hardware-backed miniDSP boxes. Also, by choosing appropriate PC hardware and the soundcard, the software DSP box can be scaled to the required number of audio channels. And the channel count grows quickly when the active crossover approach (as in Linkwitz speakers) is employed for creating a surround sound setup.

The Method for Filter Re-creation

In a nutshell, there are two approaches for re-creating an existing filter with Acourate. If the filter is already implemented in software or hardware, you can measure it with Acourate or ARTA (or any other compatible analyzer), and proceed based on the measurement results. However, there are some caveats. First, even if the filter can be captured fully in digital domain, there is still possibility for noisy behavior, especially at high frequencies. Thus, some smoothing will be required.

Second, since the filter has some delay, it will manifest itself as phase shift in the measurement. It's easy to understand that by looking at the picture below:

Here we have got two sine waves of the same frequency, but the blue one is lagging behind the red one. If we capture a piece of each wave at the same moment in time and take a Fourier transform (this is what analyzers do), the amplitude response will come out the same, but the phase components will be shifted relative to one another. That means, in order to obtain an exact phase response of the system being measured, we will have to compensate for the processing time delay by shifting the phase back.
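To make the delay argument concrete, here is a minimal sketch in plain Python. It is not how Acourate or any analyzer is implemented; the block size, lag, and bin number are hypothetical values chosen for illustration. It computes one DFT bin for an impulse and for the same impulse delayed by a few samples, and then rotates the phase back.

```python
import cmath
import math

def dft_bin(samples, k):
    """Return the k-th DFT coefficient of a list of real samples."""
    n = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * i / n)
               for i, x in enumerate(samples))

def delayed_impulse(length, delay):
    """A unit impulse placed `delay` samples after the start of the block."""
    return [1.0 if i == delay else 0.0 for i in range(length)]

def compensate_delay(coefficient, k, length, delay):
    """Undo the linear phase term that a `delay`-sample lag adds to bin k."""
    return coefficient * cmath.exp(2j * math.pi * k * delay / length)

# Hypothetical values: 64-sample block, 5-sample lag, bin number 3.
N, DELAY, K = 64, 5, 3
reference = dft_bin(delayed_impulse(N, 0), K)
lagging = dft_bin(delayed_impulse(N, DELAY), K)
restored = compensate_delay(lagging, K, N, DELAY)

print(abs(reference), abs(lagging))   # the magnitudes are the same
print(cmath.phase(lagging))           # phase equals -2*pi*K*DELAY/N
print(cmath.phase(restored))          # phase is back to (nearly) zero
```

The phase offset grows linearly with frequency, which is why an uncompensated delay shows up as a steadily falling phase curve in a measurement.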

The second approach is to use pure math and re-create the filter from its parameters using Acourate as an editor. This way it's possible to obtain the filter with exactly the same amplitude and phase characteristics. Also, it will be more precise than a captured one because Acourate calculates in 64-bit (double precision) floating point, whereas the capture will be in 32-bit (single precision) floating point at best. However, it will still help to capture the existing filter in order to verify the analytically obtained one against it, see the "Verification" section below.

Re-creating a Biquad

The LXmini configuration for miniDSP only uses biquad IIR filters. Thus, it's crucial to understand how to re-create them with Acourate. For EQ filters, there are two ways. One is to use the filter parameters: type (shelving or peak), frequency, gain, and Q. They are displayed by the miniDSP plugin app in Basic mode:

And we can enter the same parameters into Acourate's Generate > IIR-Filter dialog and then press Calculate:

And we get the same filter:

Sometimes the definition of the "Q" parameter doesn't match between different DSP vendors, but luckily miniDSP and Acourate use the same definition. What's also convenient about this approach is that it doesn't depend on the sampling rates used. So we can use any target sampling rate in Acourate, and the filter will still affect the same frequency.
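As an illustration of why the parameter-based approach is independent of the sampling rate, below is a sketch of a peaking EQ built from frequency, gain, and Q. It uses the widely published RBJ "Audio EQ Cookbook" formulas; whether miniDSP or Acourate use exactly these formulas and this Q definition is an assumption on my side, and the function names are made up for this example.

```python
import cmath
import math

def peaking_biquad(f0, gain_db, q, fs):
    """RBJ cookbook peaking EQ (assumed formulas, not taken from either tool).

    Returns (b0, b1, b2, a1, a2) normalized so that a0 = 1, for
    H(z) = (b0 + b1 z^-1 + b2 z^-2) / (1 + a1 z^-1 + a2 z^-2).
    """
    amp = 10 ** (gain_db / 40)
    w0 = 2 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2 * q)
    a0 = 1 + alpha / amp
    return ((1 + alpha * amp) / a0,
            -2 * math.cos(w0) / a0,
            (1 - alpha * amp) / a0,
            -2 * math.cos(w0) / a0,
            (1 - alpha / amp) / a0)

def gain_db_at(coeffs, f, fs):
    """Magnitude of the biquad response in dB at frequency f."""
    b0, b1, b2, a1, a2 = coeffs
    z1 = cmath.exp(-2j * math.pi * f / fs)   # z^-1 on the unit circle
    h = (b0 + b1 * z1 + b2 * z1 * z1) / (1 + a1 * z1 + a2 * z1 * z1)
    return 20 * math.log10(abs(h))

# Hypothetical notch: 1 kHz, -6 dB, Q = 2, generated for two sampling rates.
for fs in (48000, 96000):
    coeffs = peaking_biquad(1000, -6.0, 2.0, fs)
    print(fs, round(gain_db_at(coeffs, 1000, fs), 3))
```

The coefficients differ between the two rates, but in both cases the dip sits at 1 kHz with the requested depth, because the rate is taken into account when the coefficients are computed.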

There is also another way for re-creating a biquad filter—use the filter coefficients directly. In miniDSP plugin, they are displayed in Advanced mode in a text box:

The box is quite small and doesn't fit all the parameters on this picture. There are 5 of them: b0, b1, b2, a1, and a2. They define the filter completely, but in the normalized radian frequency range: from 0 to π. The actual angle depends on the sampling rate used. So for example, at 96000 Hz sampling rate the frequency of 960 Hz is π / 50 (that is, 2π · 960 / 96000), but it becomes π / 25 at 48 kHz. That's why when re-creating filters using biquad coefficients the sampling rate at the source and the target must match. Since miniDSP HD uses 96 kHz sampling rate, the same rate must be set for the project in Acourate.

Another thing that needs to be taken care of is the sign of a1 and a2 coefficients. Acourate and miniDSP use different conventions, and thus the signs of a1 and a2 coefficients taken from miniDSP must be negated when being entered into Acourate's dialog box. Acourate also asks for a0 parameter which must always be set to 1:

Assuming that sampling rates match between the miniDSP plugin and Acourate, this should create the same filter.
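The sign flip can be checked with a small sketch. It assumes that miniDSP lists a1 and a2 as the values that are added in the feedback part of the recursion, while Acourate expects the textbook denominator form 1 + a1 z^-1 + a2 z^-2; the source only states that the signs must be negated, so treat both forms as assumptions. The coefficient values are hypothetical.

```python
def added_feedback_filter(x, b0, b1, b2, a1, a2):
    """Biquad recursion with feedback terms ADDED (assumed miniDSP listing)."""
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 + a1 * y1 + a2 * y2
        y.append(yn)
        x2, x1, y2, y1 = x1, xn, y1, yn
    return y

def textbook_filter(x, b0, b1, b2, a0, a1, a2):
    """Biquad recursion with feedback terms SUBTRACTED (assumed Acourate form)."""
    y, x1, x2, y1, y2 = [], 0.0, 0.0, 0.0, 0.0
    for xn in x:
        yn = (b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2) / a0
        y.append(yn)
        x2, x1, y2, y1 = x1, xn, y1, yn
    return y

def to_textbook(b0, b1, b2, a1, a2):
    """Convert the added-feedback listing: set a0 = 1 and negate a1 and a2."""
    return (b0, b1, b2, 1.0, -a1, -a2)

# Hypothetical (stable) coefficients as they might be copied from a text box.
copied = (0.5, 0.2, 0.1, 0.3, -0.2)
impulse = [1.0] + [0.0] * 15

original = added_feedback_filter(impulse, *copied)
converted = textbook_filter(impulse, *to_textbook(*copied))
print(all(abs(p - q) < 1e-12 for p, q in zip(original, converted)))
```

If the signs were entered without negation, the two impulse responses would diverge after the first couple of samples, which is an easy symptom to spot.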

The second way seems to be more involved and requires great care. Why use it at all? I would use it when, for some reason, Acourate does not produce the desired filter from a high-level definition.

Joining Filters

Now we know how to re-create each EQ filter. The next step is joining them. miniDSP does this automatically. E.g. if we define two EQ filters, the resulting graph will show the result of applying both of them. Here I've added a second EQ notch filter to the previous one:

In Acourate, after re-creating this filter in another curve, we need to apply an operation of convolution (TD-Functions > Convolution) and save the result either in a third curve or overwrite one of the previous curves:

The result is the same curve as we had with miniDSP:

Do not confuse convolution with addition, however. Addition of filters happens when they run in parallel and then their results get summed. This is different from running filters in sequence. Sometimes adding filters may produce a result that looks similar to convolution, but it's in fact not the same.

If all ten EQ filters are engaged in miniDSP, the process of recreating them with Acourate might get tedious, as we will need to perform the convolution operation 9 times. It's better to save each individual EQ curve in case a mistake has been made while generating it. Note that convolution is a commutative operation, thus the order in which the convolutions are made doesn't matter.
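The difference between joining filters in sequence and summing them in parallel can be shown on two tiny FIR filters. This is a sketch in plain Python with hypothetical tap values; it only illustrates the math and says nothing about how Acourate implements its Convolution function.

```python
def convolve(a, b):
    """Direct convolution of two tap lists (filters applied in sequence)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def add(a, b):
    """Sample-wise sum of two tap lists (filters running in parallel)."""
    n = max(len(a), len(b))
    a = list(a) + [0.0] * (n - len(a))
    b = list(b) + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

# Two hypothetical short filters and a short test signal.
eq1 = [1.0, 0.5]
eq2 = [1.0, -0.25, 0.125]
signal = [1.0, 2.0, 3.0]

joined = convolve(eq1, eq2)
print(joined)                                   # sequence of both filters
print(convolve(eq2, eq1) == joined)             # order does not matter
print(convolve(convolve(signal, eq1), eq2) == convolve(signal, joined))
print(add(eq1, eq2))                            # parallel sum is different
```

Running the signal through both filters one after another gives the same output as running it once through the joined filter, while the parallel sum produces a different set of taps altogether.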

Crossovers

Here we have a difference between the capabilities of miniDSP and Acourate. miniDSP HD plugin offers the following types of crossovers:

Acourate has all of these plus Neville-Thiele and Horbach-Keele crossovers. It can also generate them either with minimum phase (as in miniDSP) or with linear phase, see Generate > Crossover menu.

Besides using the Crossover dialog, it's also possible to use an alternative approach for entering biquad coefficients directly and then convolving intermediate curves. In miniDSP, a crossover can consist of up to 8 biquads, and their coefficients are listed on the Advanced tab of the plugin's Xover dialog. Remember that in this case the project sampling rate in Acourate must match the sampling rate of miniDSP HD: 96 kHz.

Combining It All Together

After input and output EQ filters and the crossover filter have been created they need to be joined using the convolution operation. Again, the order in which the convolutions are performed doesn't matter.

Note that it's not possible to re-create a compressor using FIR or IIR filters because its behavior is amplitude-dependent. However, at least for Linkwitz speakers the compressor is not used.

Polarity (Phase), Delay, Gain

In miniDSP, any output channel can be delayed, attenuated, and have its polarity inverted. If Acourate Convolver is used for processing, these settings can be set in it directly:

However, it's also possible to use Acourate in order to modify the filter:

  • Gain: use TD-Functions > Gain;
  • Polarity: use TD-Functions > Change Polarity;
  • Delay: use TD-Functions > Rotation or Leading/Trailing Zeros. The difference between them is that Rotation preserves filter length, but the filter must have enough zero samples at the end prior to this operation.
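In terms of the filter taps, the three operations listed above are simple. The sketch below shows what they amount to mathematically; the function names are invented for this example, and the actual behavior of Acourate's menu items is assumed rather than verified.

```python
def apply_gain(taps, gain_db):
    """Scale every tap by a gain given in dB."""
    scale = 10 ** (gain_db / 20)
    return [t * scale for t in taps]

def invert_polarity(taps):
    """Flip the sign of every tap."""
    return [-t for t in taps]

def delay_with_leading_zeros(taps, samples):
    """Delay by prepending zeros; the filter becomes longer."""
    return [0.0] * samples + list(taps)

def delay_by_rotation(taps, samples):
    """Delay by rotating to the right; the length is preserved.

    The last `samples` taps wrap around to the start, so they must be zero.
    """
    if samples > len(taps) or any(t != 0.0 for t in taps[len(taps) - samples:]):
        raise ValueError("not enough zero samples at the end of the filter")
    return list(taps[len(taps) - samples:]) + list(taps[:len(taps) - samples])

# Hypothetical 6-tap filter with two zero samples at the end.
taps = [1.0, 0.5, 0.25, 0.125, 0.0, 0.0]
print(apply_gain(taps, -6.0)[:2])
print(invert_polarity(taps)[:2])
print(delay_with_leading_zeros(taps, 2))   # 8 taps
print(delay_by_rotation(taps, 2))          # still 6 taps
```

At 96 kHz one sample corresponds to roughly 0.0104 ms, so a delay given in milliseconds has to be converted into a whole number of samples first.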

Cutting Filter Length

By default, Acourate generates very long FIR filters—typically consisting of 131072 taps. They create a noticeable delay: e.g. for 96 kHz sampling rate it will be 1.365 seconds. It's OK for audio only applications—who cares if play / pause button does not react immediately. But for audio-video that's a lot—imagine having a 1 second delay between actor opening their mouth on the screen and us actually hearing their voice.

Thus, for A/V scenarios we need to cut the filters to a usable length. Depending on what other processing stages (e.g. surround decoding) are in the chain, the time "budget" for filtering can be from 20 to 60 milliseconds before the audio delay becomes noticeable. For 96 kHz processing sampling rate, this translates into an FIR filter length of 2048 or 4096 taps. More taps is better because this increases the filter's frequency resolution. The resolution of a 2048-tap minimum phase FIR filter at 96 kHz is ~47 Hz, and for a 4096-tap filter it's twice as fine: about 23.5 Hz. The resolution is especially important for bass equalization, where spacing between notes is only 1–2 Hz!
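The numbers above follow from two divisions, shown here as a small sketch (the function names are my own). The duration is the full time span of the filter; how much of it turns into actual latency depends on where the impulse peak sits.

```python
def filter_duration_ms(taps, fs):
    """Time span covered by an FIR filter of `taps` samples, in milliseconds."""
    return 1000.0 * taps / fs

def frequency_resolution_hz(taps, fs):
    """Spacing between frequency points that `taps` samples can resolve."""
    return fs / taps

FS = 96000
for taps in (2048, 4096, 131072):
    print(taps,
          round(filter_duration_ms(taps, FS), 2), "ms,",
          round(frequency_resolution_hz(taps, FS), 3), "Hz")
```

For 131072 taps this gives the 1.365 seconds mentioned earlier and a resolution below 1 Hz, which shows what is being traded away when the filter is cut.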

Acourate has TD-Functions > Cut'N Window function for cutting filters to length. Cutting is some sort of an engineering black art, because the result depends on the interaction between the filter and the windowing function being used for cutting. By default, Acourate uses "Blackman Optimal" window when cutting. In order to use any other function, it is possible to cut first without any windowing, and then apply the desired window via TD-Functions > Windows... dialog.
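For readers who want to see what cutting with a window means, here is a rough sketch. It uses the falling half of the classic Blackman window, which is an assumption made for illustration: Acourate's "Blackman Optimal" window and the exact way Cut'N Window positions it are not described in this post. The impulse response is a hypothetical decaying exponential that starts at its peak, as a minimum phase filter roughly does.

```python
import math

def falling_half_blackman(n):
    """Right half of a classic Blackman window: 1.0 at index 0, ~0 at index n-1."""
    if n < 2:
        raise ValueError("need at least 2 points")
    span = 2 * n - 2   # index distance across the full symmetric window
    return [0.42
            - 0.5 * math.cos(2 * math.pi * (i + n - 1) / span)
            + 0.08 * math.cos(4 * math.pi * (i + n - 1) / span)
            for i in range(n)]

def cut_and_window(taps, n):
    """Keep the first n taps and fade them out smoothly towards the cut point."""
    if n > len(taps):
        raise ValueError("filter is shorter than the requested length")
    window = falling_half_blackman(n)
    return [t * w for t, w in zip(taps[:n], window)]

# Hypothetical 64-tap impulse response, cut down to 16 taps.
long_filter = [0.9 ** i for i in range(64)]
short_filter = cut_and_window(long_filter, 16)
print(len(short_filter), short_filter[0], short_filter[-1])
```

Cutting without a window would leave an abrupt step at the last tap, which shows up as ripple in the frequency response; the fade-out trades that ripple for some loss of low-frequency detail.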

I've noticed that for filters having bass equalization, it may be helpful before cutting to move the impulse start a bit to the right using TD-Functions > Leading/Trailing Zeroes function. But remember that this introduces a delay which also needs to be added to other channels.

Verification

After re-creating a miniDSP configuration in Acourate we need to verify that our filter indeed replicates the original. This can be done in a lot of ways. We can choose to only use Acourate, and in that case what we need to do is to analyze the transfer function of the miniDSP configuration. As I've mentioned in the beginning, miniDSP also has USB inputs that are in fact returns of the processed signals. So we can open Acourate's LogSweep > LogSweep Recorder, choose the ASIO driver for miniDSP, specify input and output channels, and also make sure there are no fade-ins and fade-outs, and no peak optimization in the test signal (they are not needed for digital measurements):

Alternatively, we can also use other analyzer programs like FuzzMeasure or RoomEQ Wizard. Both allow analyzing a measurement recorded "offline"—outside of the app. So we can save the measurement log sweep, use Acourate's FIR-Functions > WAV Player in order to process the log sweep with the filter, and load the result back into FM or REW for analysis and comparison with the signal recorded from miniDSP.

Finally, we can use Acourate Convolver looped back through a sound card that has routing controls and check the filters using any analyzer app, even one that doesn't offer offline processing, like ARTA. This approach is useful if we do final adjustments to the filter's gain and polarity in Acourate Convolver.

When comparing filter phases, depending on the analyzer app it might be necessary to calculate minimum phase first, otherwise it will not look like the actual phase of the filter. In Acourate this can be achieved using TD-Functions > Phase Extraction dialog. Also note that due to the processing delay, the phase may appear shifted (recall the sine waves picture from the section on the method for filter re-creation).
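Whichever capture route is used, the comparison itself can also be scripted once both impulse responses are available as lists of samples. The sketch below is a hypothetical helper, not a feature of any of the tools mentioned: it evaluates the magnitude response of two tap lists at a few frequencies and reports the largest deviation in dB.

```python
import cmath
import math

def response(taps, f, fs):
    """Complex frequency response of an FIR tap list at frequency f."""
    w = 2 * math.pi * f / fs
    return sum(t * cmath.exp(-1j * w * n) for n, t in enumerate(taps))

def max_deviation_db(taps_a, taps_b, freqs, fs):
    """Largest magnitude difference in dB between two filters."""
    def level(taps, f):
        return 20 * math.log10(abs(response(taps, f, fs)))
    return max(abs(level(taps_a, f) - level(taps_b, f)) for f in freqs)

FS = 96000
FREQS = [20, 100, 1000, 5000, 20000]

# Hypothetical data: an analytic filter, the same filter captured with a
# 3-sample processing delay, and a capture that is 0.5 dB too loud.
analytic = [1.0, 0.5]
delayed_capture = [0.0, 0.0, 0.0] + analytic
loud_capture = [t * 10 ** (0.5 / 20) for t in analytic]

print(max_deviation_db(analytic, delayed_capture, FREQS, FS))  # ~0 dB
print(max_deviation_db(analytic, loud_capture, FREQS, FS))     # ~0.5 dB
```

A pure delay leaves the magnitude untouched and only affects the phase, so a magnitude comparison like this one is insensitive to the processing delay of the capture path.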

Conclusion

There are several reasons for going with a fully software DSP solution. I certainly like modularity of this approach—you choose the form factor for the PC, and a soundcard with required number of channels and desired quality for DACs. Then you can have different configurations for audio only and AV scenarios, free from any limits of the hardware, and only being constrained by actual physical limit of the filters' time delay.

Also, what I've covered in this post is just the tip of what Acourate can do. I will certainly examine linear phase crossovers and room correction soon. One thing I'm missing in Acourate Convolver is IIR filters, which could help with achieving the required processing latency. However, I do have them on the MOTU UltraLite AVB card, so it's not a problem.

Saturday, July 7, 2018

My Setup for Headphone Listening, Part 2

Continuing the topic of my desktop setup for headphone listening, let's recap what we had covered in Part 1. We have set up a transparent hardware chain at moderate cost, and decided to make all the necessary adjustments on the software side using DSP plugins. In order to route audio from any player program via the DSP processing chain, on Mac we use Audio Hijack, and on Windows—a combination of virtual loopback cables and a plugin host program. I'm not covering Linux and mobile platforms here, sorry.

The Processing Chain

I don't believe in the existence of a perfect playback chain that would suit all commercial recordings. Even if the chain itself is transparent, the combination of recording's frequency balance and the headphone's frequency curve may not suit your taste. Also, due to non-linearity of human hearing, even changing playback volume affects perceived tonality. So clearly, an ability to tweak tonal balance is required.

Also, when using closed headphones the reproduction sounds unnatural due to super-stereo effect—each ear can hear its own channel only. This is especially noticeable on recordings that employ some form of "spatial" processing intended for speakers.

So our goals are pretty clear: being able to easily adjust levels of high and low frequencies, and have a crossfeed. In addition, we can try adding some psycho-acoustic enhancement by injecting 2nd or 3rd order harmonics (this is roughly equivalent to using a tube amplifier). Previously, I was also enthusiastic about the idea of headphone frequency response normalization. Now I'm less excited, and I will explain why later. But if headphones used are known to have some particular tonal issue, like the 6 kHz bump of Sennheiser HD800, adding a "normalizing" plugin could be a good idea.

So here is a conceptual diagram of the DSP chain I use:

First comes a simple 2- or 3-band equalizer employing Baxandall curves. I find these to be more pleasant sounding than typical shelving filters of multi-band parametric equalizers.

The next block adds harmonic distortions. It helps to liven up some recordings if they sound too dry and lack "dimension". I think, in small controlled quantities harmonics sometimes can help. However, I prefer to add them with a DSP plugin rather than with an amplifier.

Then comes a crossfeed plugin. An alternative is to use the crossfeed feature of the headphone amplifier or DAC, if it has one. But using a plugin allows having crossfeed on any DAC / amp, so it's more versatile. Also, if crossfeed is implemented as a plugin, it's possible to add a headphone "normalization" plugin after it. I think that having crossfeed after normalization defeats the purpose of the latter since crossfeed will most likely change the carefully tuned frequency response.

I run my chain at 96 kHz, even when the source material is at 44.1 kHz. Use of higher sampling rates is common in the music production world, as they allow using smoother antialiasing filters during processing, and also help reduce the quantization noise. Going up to 192 kHz or higher will consume more CPU resources, and considering the modest number of effects used, I don't think it's really needed.

At first, I hesitated a bit over whether I should use an integer multiple of the source sampling rate, that is, 88.2 kHz instead of 96 kHz, but then I realized that converting from 44.1 kHz to 48 kHz can be expressed conceptually as first upsampling with a multiplier of 160, and then downsampling by 147, both being integer multipliers (44100 * 160 / 147 = 48000). Also, a lot of good DAC units have an upsampling DSP processor connected before the DAC chip, for upsampling to 192 kHz (TEAC UD-x01), 384 kHz (Cambridge Audio DacMagic Plus and Azur), 768 kHz (!) (Pro-ject DacBox DS2 Ultra), or even an "odd" value of 110 kHz (Benchmark DAC1). So the DAC chip would never "know" what the track's original sampling rate was.
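The integer multipliers can be found by reducing the ratio of the two rates to lowest terms. A short sketch using Python's standard fractions module (the function name is my own):

```python
from fractions import Fraction

def resampling_ratio(source_hz, target_hz):
    """Return (upsampling factor, downsampling factor) in lowest terms."""
    ratio = Fraction(target_hz, source_hz)
    return ratio.numerator, ratio.denominator

print(resampling_ratio(44100, 48000))   # 160 up, 147 down
print(resampling_ratio(44100, 96000))   # 320 up, 147 down
print(resampling_ratio(44100, 88200))   # 2 up, 1 down
```

Going from 44.1 kHz to 96 kHz is therefore the same kind of rational conversion as going to 48 kHz, only with the upsampling factor doubled.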

Thus, there is no reason to worry about going from 44.1 kHz to 96 kHz on the processing chain side, as modern software resamplers should be transparent. This is assuming that the input signal doesn't have intersample peaks. And we took care of this by lowering the digital volume of the player (see Part 1), giving some headroom to the audio signal before it gets upsampled.
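A classic illustration of an intersample peak is a sine at a quarter of the sampling rate whose samples all land 45 degrees away from the crests. The sketch below is an idealized example with an analytically known waveform, not a model of any particular resampler: it compares the largest sample value with the largest value of the underlying continuous wave.

```python
import math

def tone(t):
    """Sine at fs/4 with a 45-degree phase offset; t is measured in samples."""
    return math.sin(2 * math.pi * 0.25 * t + math.pi / 4)

# Peak as seen by looking at the samples only.
sample_peak = max(abs(tone(n)) for n in range(8))

# Peak of the underlying waveform, evaluated on a 64 times denser grid.
true_peak = max(abs(tone(n / 64)) for n in range(8 * 64))

overshoot_db = 20 * math.log10(true_peak / sample_peak)
print(round(sample_peak, 4), round(true_peak, 4), round(overshoot_db, 2))
```

If such a signal were normalized so that its samples reach 0 dBFS, the reconstructed waveform would exceed full scale by about 3 dB, which is in line with leaving a few dB of headroom before upsampling.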

Measurements

DSP plugins still need to be measured even though their effects are usually better documented than those of hardware units. Why? Because there can be surprises or discoveries, as we will see. Also, some plugins for some reason have uncalibrated sliders, labelled in a very generic fashion like "0..10", and it's not clear what change each step introduces. So unless you have a very trained ear, it's better to measure first.

And the audio transport channels that we use, despite being fully digital and thus supposedly "bit-perfect" still can introduce distortions, or cause losses in audio resolution. This is an imperfect world, and we need to be prepared.

The Empty Chain

As an example, let's measure an empty processing chain consisting of Windows 10 Pro (Build 17134.rs4_release.180410-1804), Hi-Fi Virtual Cable (from player to the effects host), VB-Audio Cable (from the effects host to the analyzer input), and DDMF EffectRack (64-bit version). The virtual stream on the EffectRack uses "Windows Audio Exclusive mode".

One problem that I've noticed is that using a test sine signal at 0 dBFS causes distortion:

But after lowering the input signal level by just 0.1 dB it's gone. I've double checked that the test signal is not overshooting 0 dBFS. I've also checked with Bitter plugin that the signal is not clipping:

Given that a lot of modern recordings have peaks normalized to 0 dBFS, the advice I gave in Part 1 about lowering the digital volume on the player by at least 3.5 dB seems especially useful in this case.

I'm not sure where this distortion is happening. It could be anywhere in the kernel audio transport, in virtual cables, or in EffectRack. However, I've also tried this experiment with DDMF virtual streams, and with another effects host, PedalBoard 2, and the result was the same, so I suspect the Windows audio chain. But I must note that a 0 dBFS sine plays fine via the physical loopback of lots of sound cards, thus most likely it's a combination of Windows and virtual cable drivers that causes this behavior.

The lesson from this measurement is that I should not use a 0 dBFS test signal when testing the processing chain.

Another curious thing I've found is that with some virtual cables EffectRack causes distortion when there are no effects in the chain (stream's audio input is directly shorted to audio output), but the distortion is gone as soon as I insert a processing plugin, even if it does nothing to the audio stream (all processing knobs are at zero position).

By the way, using Bitter plugin is also helpful for verifying the actual bit resolution of the processing chain. As we have seen on the screenshot above, on Windows I do actually have 24-bit resolution. It's interesting that on Mac with Audio Hijack the resolution seems to be even better, 32-bit:

The Equalizer

There is no shortage of equalizer plugins. My long time favorite was basiQ by Kuassa because it's free, simple to use, and it implements a good old 3-band Baxandall equalizer. Typically I used it in very moderate amounts, never going more than 5 or 6 steps from the zero settings. This is what the equalization curves look like:

Note that even in the "all zeroes" setting the frequency response isn't entirely flat. I'm not sure if it's intentional, but it's better to be aware of this (that's why we measure!) Also note that the amount of correction resulting from the same number of knob steps is not the same for low, mid, and high frequencies. I think this is to account for the fact that the human ear is less sensitive to changes in bass frequencies.

Another important thing is that when boosting any frequencies, the resulting increase in the sound power must be compensated by decreasing the output level of the plugin (the small knob at the bottom), otherwise clipping may occur on loud music passages.

But I've said that basiQ used to be my favorite plugin. I've migrated to Tone Control by GoodHertz, and this is why. Although it only has 2 bands (vs 3 on basiQ), Tone Control offers an interesting option of using a linear phase filter. That means constant group delay: all frequencies are delayed by the same amount, so no frequency group lags behind the others. This is important because some musical sounds (e.g. a hi-hat crash) are wide band signals, and delaying some parts of this signal "smears" the sound in time domain, which according to some theories affects its localization by the brain.

Tone Control isn't free, and actually it's quite expensive for a 2-band equalizer ($95!). However, using it I could easily replicate my setups in basiQ, and Tone Control can create web shortcuts for them to use anywhere. Before that I tried DDMF LP10 equalizer, which also offers linear phase filters, but replicating delicate tone curves of basiQ with it was very hard, so I decided to pay a bit more for Tone Control.

Harmonics Enhancement

I decided to experiment with adding harmonics after reading this post by Bob Katz. I've found Fielding DSP Reviver plugin, which costs just $29. I measured it in order to calibrate the scales of its controls—they just go from "0" to "100", and also to verify that they don't have aliasing problems.

After measuring the levels of THD of Reviver, I decided never to go higher than the "5" mark for both 2nd and 3rd harmonics. For reference, the "1" mark adds the 2nd harmonic at ~0.2% THD (about -55 dBFS), and for "5" it's a bit higher than 1% THD (about -40 dBFS). For the 3rd harmonic the figures are somewhat lower, so when both sliders are at "5", this creates a natural harmonics picture:

(I've put the cursor over the 2nd harmonic to show that it's at -39.95 dBFS, while the 3rd as we can see is lower than -40 dBFS.)
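The relation between a distortion percentage and a level in dB is a plain logarithm, sketched below with function names of my own. The result is a level relative to the fundamental; it only reads directly as dBFS when the fundamental itself sits at 0 dBFS, which is an assumption about how the measurement was taken.

```python
import math

def percent_to_db(percent):
    """Level of a distortion component relative to the fundamental, in dB."""
    return 20 * math.log10(percent / 100)

def db_to_percent(db):
    """Inverse conversion: relative level in dB to a percentage."""
    return 100 * 10 ** (db / 20)

print(round(percent_to_db(0.2), 2))     # about -54 dB
print(round(percent_to_db(1.0), 2))     # -40 dB
print(round(db_to_percent(-39.95), 3))  # a bit higher than 1%
```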

Turning on "Serial" mode also adds 4th and 5th harmonics:

Subjectively, adding harmonics may add "dimension" to sound, make it a bit "fatter". It in fact helps some recordings from the '70s and '80s to sound better. My hypothesis is that for their production, tube amplifiers were used in the studio, so they were sounding "rich" there. But while being played via a transparent solid state chain on headphones they sound more "bleak" and "flat". So adding back some distortions helps.
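To show where such harmonics come from, here is a toy waveshaper. It is only a sketch of the general principle (a small polynomial non-linearity) and makes no claim about how Reviver or a tube stage works internally; the coefficients are hypothetical. A squared term produces the 2nd harmonic and a cubed term produces the 3rd.

```python
import cmath
import math

def waveshape(x, k2, k3):
    """Memoryless non-linearity: the input plus small x^2 and x^3 terms."""
    return x + k2 * x * x + k3 * x ** 3

def harmonic_level(samples, harmonic, cycles):
    """Amplitude of a harmonic in a capture holding `cycles` whole periods."""
    n = len(samples)
    k = harmonic * cycles
    c = sum(s * cmath.exp(-2j * math.pi * k * i / n)
            for i, s in enumerate(samples))
    return 2 * abs(c) / n

N, CYCLES = 256, 4
sine = [math.sin(2 * math.pi * CYCLES * i / N) for i in range(N)]
shaped = [waveshape(x, 0.02, 0.02) for x in sine]

for h in (1, 2, 3, 4):
    print(h, round(harmonic_level(shaped, h, CYCLES), 5))
```

With these coefficients the 2nd harmonic comes out at 1% of the input amplitude and the 3rd at 0.5%, while the 4th stays at zero. Feeding two tones instead of one into the same function would also create sum and difference frequencies, which are not harmonically related to either tone.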

However, I would not recommend abusing the harmonics plugin because, unfortunately, adding those "euphonic" harmonic distortions also brings in unpleasant non-harmonics. Dr. Uli Brüggemann of AudioVero explains the reasons in his article. And indeed, if we look at IMD SMPTE and CCIF measurements for Reviver, the level of SMPTE-measured distortions is quite high. So use it with caution: keeping it turned on all the time defeats the purpose of having a transparent reproduction chain. The effect of non-linear distortions can also be seen on the frequency response graph which becomes noticeably "fuzzier":

Crossfeed and Headphone Normalization

I covered both Redline Monitor and a couple of headphone normalization plugins in my earlier posts. For headphone normalization I would also prefer a plugin that has "linear phase" mode.

Now I would like to explain why I'm actually not currently using headphone normalization. From my experience, normalization indeed makes the headphones sound different from their original tuning, which can be exciting at first. But is it really a setup that you would want to use all the time? I doubt that. I actually have doubts that normalization can serve as a "reference", here is why.

There are several factors that can affect the headphone normalization process: first, the same model of headphones isn't necessarily consistent from instance to instance, and besides that, pad wear can affect bass response. OK, some companies offer measuring your headphones, and imagine we have done that. Then the second factor comes in—your head. The dummy heads used in measurement use statistical averages for head, ear pinna, and ear canal dimensions. But they are obviously not the same as your head, and this will affect the shape of the frequency response at the ear drum (see this interesting thesis work for details). And finally, the target response is not set in stone. There are several versions of Harman target curve, diffuse field curve, and your actual room curve.

So, there are just a lot of variables in the normalization process. There is a solution that takes them all into account, the Smyth Realizer, but it's too expensive for ordinary folks. Thus, since we are not interested in music production, but only in pleasantly sounding reproduction, I've found that simply using tone controls delivers the desired sound with much less effort.

Conclusion

For me, using a simple DSP processing chain and a transparent reproduction chain has become a flexible and not too expensive way to enjoy my music in headphones. This setup offers endless ways to experiment with tonalities, "warmth", and soundstage perception while staying with the same hardware.