3 Advanced Audio Concepts You’ve Never Heard Of

Bit Shifting

Bit-shifting is pretty rare in audio processors - it’s a method for altering the placement of bits to increase or decrease gain.

Typically, when we alter the gain of a track within a plugin or with a fader, or when we adjust the amplitude of the left or right channel using panning, we change the word length.

By changing the word length, our computers can output the signal at nearly any division or multiplication of the original amplitude.

For example, say I reduce the gain by something specific like 1.7dB - since each bit represents roughly 6dB of amplitude, the word length needs to extend to represent a change finer than 6dB.
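
As a quick check on that figure: each added bit doubles the number of amplitude values a sample can take, and doubling amplitude works out to about 6dB. A one-line Python sanity check:

```python
import math

# each bit doubles the representable amplitude range,
# and doubling amplitude is a gain of 20*log10(2) dB
print(f"{20 * math.log10(2):.2f} dB per bit")  # -> 6.02 dB per bit
```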

This is convenient for sure, and it gives users more freedom to fine-tune a signal’s amplitude, but there are some potential drawbacks.

Some argue that these divisions cause increased quantization distortion, since the amplitude of the wave is set to a value so specific that the bit depth can't represent it exactly.

For example, 16-bit mixing can handle a word length with 5 decimal places - so if we alter the gain to a level that requires more than 5 places, which is both possible and common, we get a rounding error.
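
Here's a minimal Python sketch of that rounding idea - the sample value and gain amount are purely illustrative:

```python
# a -1.7dB gain change expressed as a linear multiplier
gain = 10 ** (-1.7 / 20)   # roughly 0.822

sample = 12345             # an illustrative 16-bit integer sample
ideal = sample * gain      # what the math asks for - a non-integer
stored = round(ideal)      # what integer sample storage can actually hold
print(f"ideal: {ideal:.5f}  stored: {stored}  error: {ideal - stored:+.5f}")
```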

But, with DAWs using 32 or 64-bit processing, which allows for much higher word lengths and subsequently greater precision, any rounding errors will be pretty minimal, and likely inaudible.

With all of that in mind, bit-shifting is an alternative to changing the word length when making amplitude changes.

Shifting the entire bit series one place to the right divides the value by 2, causing a 6dB decrease in amplitude.

Shifting it one place to the left multiplies the value by 2, causing a 6dB increase.

The word length doesn't change and there's no fractional rounding to perform; the trade-off is that all changes come in 6dB increments.

The same could be done for panning - instead of changing the word length to reduce the amplitude of the right channel to pan the signal to the left, or vice versa, bit shifting can be used.

Again, the idea is that the result stays closer to the original recorded signal by retaining the original word length and sequence - just shifted in either direction, with 6dB being the smallest amplitude change possible.
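
If you're curious about the mechanics, here's a rough Python sketch of bit-shift gain and a crude bit-shift pan on raw integer samples - purely to illustrate the concept, not how any particular DAW or plugin implements it:

```python
import numpy as np

# illustrative 16-bit integer samples
samples = np.array([8192, -4096, 12288, -16000], dtype=np.int16)

quieter = samples >> 1   # shift right: value / 2, a -6dB change
louder  = samples << 1   # shift left:  value * 2, a +6dB change
# (shifting left overflows if a sample exceeds the 16-bit range,
# and shifting right drops the lowest bit outright)

# a crude bit-shift "pan left": attenuate only the right channel
# by a whole shift instead of scaling it by a fractional factor
left  = samples.copy()
right = samples >> 1

print(quieter)   # [ 4096 -2048  6144 -8000]
print(louder)    # [ 16384 -8192  24576 -32000]
```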

I'm not advocating for it over your DAW's built-in gain and pan controls, but if you're curious, some AirWindows plugins use this method for changing amplitude and placement - they're free, so it's worth giving it a try.

Let's take a listen to a rough mix adjusted and panned using the DAW's panpots and faders, then to the same mix adjusted and panned using bit-shifting. Let me know if you hear a difference.

Watch the video to learn more >

Temporal or Time-based Masking

Usually, when I discuss masking I talk about simultaneous masking.

That’s when either a loud sound or a specific frequency covers up or masks another occurring at the same time.

For example, if the vocal is difficult to hear, odds are the region around 250Hz is too loud and occurs while the vocalist is performing. Energy around 250Hz masks 3-5kHz, making the vocal's consonants difficult to hear when both are present.

Temporal masking is masking between signals that occur at different times.

For example, say I have a loud, single sample click sound.

Right before and after the click is a softly plucked note on a guitar.

The portion of the note right before the click, and the portion right after it, will be masked.

When masking occurs before the loud sound - in this case, the loud click - that's called pre-masking.

When it occurs after, it's called post-masking.

Interestingly, this masking decays exponentially - the degree to which another signal is masked depends on how close in time it is to the masking sound.

Roughly 20ms before the loud click and 100ms after it will be masked, with the sounds closest to the click masked most heavily.
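
Here's a toy Python model of that window - the decay constants are picked only to roughly match the ~20ms/~100ms figures above, not taken from any psychoacoustic standard:

```python
import numpy as np

def temporal_masking(dt_ms):
    """Toy masking amount (0 to 1) for a sound dt_ms away from a loud
    masker: negative dt_ms = before it (pre-masking), positive = after
    (post-masking). Decay constants chosen so roughly 5% of the effect
    remains at -20ms and +100ms."""
    if dt_ms < 0:
        return float(np.exp(dt_ms / 7.0))    # short pre-masking window
    return float(np.exp(-dt_ms / 33.0))      # longer post-masking window

for dt in (-20, -5, 0, 10, 50, 100):
    print(f"{dt:+4d} ms -> {temporal_masking(dt):.2f}")
```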

This is why, at least in my opinion, the ring of a snare is one of the most difficult things to get right in the mix.

Not only is it a softer sound that’s competing with other louder signals occurring simultaneously, but it also occurs right after a loud signal that can mask it for up to 100ms.

A possible remedy for this is upward processing or maximization that applies subtle waveshaping to the lower-amplitude aspects of the signal.

By bringing up quieter details of the snare, we increase their amplitude relative to the peak, in turn reducing the effect of both temporal and simultaneous masking.
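
Here's a minimal Python sketch of that upward-compression idea - a simple per-sample curve, not any particular plugin's algorithm:

```python
import numpy as np

def upward_compress(x, threshold=0.25, ratio=4.0):
    """Raise material below the threshold toward it; louder samples
    pass through untouched. x is a float signal normalized to -1..1.
    (A real design would add a floor so silence isn't boosted.)"""
    out = x.copy()
    mag = np.abs(x)
    below = mag < threshold
    # close the gap to the threshold by the ratio, keeping the sign
    out[below] = np.sign(x[below]) * (threshold - (threshold - mag[below]) / ratio)
    return out

# an illustrative "snare": a loud peak followed by its quiet ring
snare = np.array([0.9, 0.4, 0.12, 0.05, 0.02])
print(upward_compress(snare))   # the ring comes up, the peak stays at 0.9
```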

To understand this better, let’s take a listen to the first example I gave - that is the loud click and soft note example.

I’ll loop a soft piano note, and then introduce a loud click. Although subtle, notice how the area around the click gets covered up.

Watch the video to learn more >

Audio Distortion Occurring in Nature

When we think about distortion, we usually think of an amplifier adding harmonics, or maybe a DAW causing clipping if the signal goes above 0dBFS at the output.

But sounds can distort naturally due to extreme pressure put on air molecules, or the natural mechanisms in our ears.

For example, we’ve become somewhat familiar with the sound of a rocket taking off during a space launch - the quintessential crackling sound seems to be just part of the experience.

But it’s an interesting example of air molecules departing from their typical behavior.

Usually, air molecules compress and expand gradually, resulting in sinusoidal waves, or the gradually curved peaks and troughs we're used to seeing represented in DAWs or oscilloscopes.

But if the compression and rarefaction are intense enough - if the air molecules are displaced quickly enough and the air temperature changes rapidly - the air no longer carries a sine wave.

Instead, the waveform begins to resemble saw, triangle, and square waves.
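
One way to see why those shapes read as distortion is to compare harmonic content - a sine carries a single partial, while a saw carries a whole harmonic series. A small Python sketch, with illustrative values:

```python
import numpy as np

fs, f0, n = 48000, 100, 48000        # one second of audio at 48kHz
t = np.arange(n) / fs

sine = np.sin(2 * np.pi * f0 * t)
saw  = 2 * ((f0 * t) % 1.0) - 1      # naive sawtooth at the same pitch

for name, sig in (("sine", sine), ("saw", saw)):
    spectrum = np.abs(np.fft.rfft(sig)) / (n / 2)
    harmonics = [spectrum[k * f0] for k in range(1, 6)]  # 100Hz-500Hz
    print(name, " ".join(f"{h:.2f}" for h in harmonics))
# the sine shows one partial; the saw shows a full harmonic series -
# those extra partials are what we hear as distortion
```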

Running audio from a rocket launch through an oscilloscope shows how distorted the sound is. This isn't the microphone or preamps clipping - it's the actual sound of the rocket, without significant distortion added during recording.

Tying this back to mixing and mastering, this offers a good reason to monitor at reasonable levels. Granted, you'll never mix anywhere near this loud if you still want to hear - but it shows how significant air pressure begins to reshape waveforms and can give you an inaccurate representation of your mix, even if the effect is much subtler than this.

The next way sound can distort naturally is through our auditory reflex.

When we hear a loud sound, usually a rapid loud sound, our ears compress the incoming signal. The muscles of the middle ear contract, stiffening the ossicles and reducing the sound pressure passed along to the inner ear - an effect very similar to optical compression with some adjusted parameters.

Depending on the sound pressure level, it can take about 20 to 100ms before the maximum amount of attenuation occurs.

After a couple of seconds, the attenuation relaxes to about 50% of its maximum before hearing gradually returns to normal - though various measurements show different results.
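
Here's a rough Python sketch of a compressor shaped loosely like that reflex - a tens-of-milliseconds attack and a multi-second release. The threshold, ratio, and time constants are illustrative, not measured values:

```python
import numpy as np

def reflex_compressor(x, fs, threshold=0.5, ratio=3.0,
                      attack_ms=50.0, release_s=2.0):
    """Toy compressor loosely modeled on the acoustic reflex:
    attenuation builds over tens of milliseconds and relaxes over
    seconds. x is a float signal normalized to -1..1."""
    att = np.exp(-1.0 / (fs * attack_ms / 1000.0))
    rel = np.exp(-1.0 / (fs * release_s))
    env, out = 0.0, np.empty_like(x)
    for i, s in enumerate(x):
        level = abs(s)
        coeff = att if level > env else rel   # fast in, slow out
        env = coeff * env + (1.0 - coeff) * level
        gain = (threshold + (env - threshold) / ratio) / env if env > threshold else 1.0
        out[i] = s * gain   # the time-varying gain reshapes the wave,
                            # which is where the distortion comes from
    return out

fs = 48000
t = np.arange(fs) / fs
loud_sine = 0.9 * np.sin(2 * np.pi * 220 * t)
print(reflex_compressor(loud_sine, fs)[:5])
```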

Compression causes mild waveshaping, resulting in harmonic distortion - so just like the rocket launch example, if the waveform is being shaped from a sine wave to a different waveform, the result is distortion.

Let’s take a listen to compression designed to mimic how the ear reacts to loud sounds when other noises are present.

The distortion is very subtle, but it’s there nonetheless.

Watch the video to learn more >