Dynamic Audio Normalizer

Created by LoRd_MuldeR <mulder2@gmx.de> – Please check http://muldersoft.com/ for news and updates!

Dynamic Audio Normalizer is a library and a command-line tool for audio normalization. It applies a certain amount of gain to the input audio in order to bring its peak magnitude to a target level (e.g. 0 dBFS). However, in contrast to more "simple" normalization algorithms, the Dynamic Audio Normalizer dynamically adjusts the gain factor to the input audio. This allows for applying extra gain to the "quiet" parts of the audio while avoiding distortions or clipping the "loud" parts. In other words, the volume of the "quiet" and the "loud" parts will be harmonized.

Contents:

  1. How It Works
  2. Command-Line Usage
  3. Configuration
  4. API Documentation
  5. Source Code
  6. Changelog
  7. Frequently Asked Questions
  8. License Terms
  9. Acknowledgement

How It Works

The "standard" audio normalization algorithm applies the same constant amount of gain to all samples in the file. Consequently, the gain factor must be chosen in a way that won't cause clipping/distortion – even for the input sample that has the highest magnitude. So if S_max denotes the highest magnitude sample in the input audio and Peak is the desired peak magnitude, then the gain factor will be chosen as G=Peak/abs(S_max). This works fine, as long as the volume of the input audio remains more or less constant all the time. If, however, the volume of the input audio varies significantly over time – as is the case with many "real world" recordings – the standard normalization algorithm will not give a satisfying result. That's because the "loud" parts cannot be amplified any further (without distortions) and thus the "quiet" parts will remain quiet too.
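The constant-gain formula above can be sketched in a few lines. This is only an illustration: the function name `constantGain` is hypothetical, samples are assumed to be doubles in the -1.0 to 1.0 range, and leaving a completely silent input unchanged is an assumption made here to avoid a division by zero.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Constant-gain ("standard") peak normalization: G = Peak / abs(S_max).
// Hypothetical helper for illustration only.
double constantGain(const std::vector<double> &samples, const double peak)
{
    double sMax = 0.0;
    for (const double s : samples)
    {
        sMax = std::max(sMax, std::fabs(s));
    }
    // Assumption: an all-zero input keeps a gain of 1.0.
    return (sMax > 0.0) ? (peak / sMax) : 1.0;
}
```

A single loud sample anywhere in the file determines the gain for every other sample, which is exactly the limitation described above.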

Dynamic Audio Normalizer solves this problem by processing the input audio in small chunks, referred to as frames. A frame typically has a length of 500 milliseconds, but the frame size can be adjusted as needed. It then finds the highest magnitude sample within each frame. Finally it computes the maximum possible gain factor (without distortions) for each individual frame. So if S_max[n] denotes the highest magnitude sample within the n-th frame, then the maximum possible gain factor for the n-th frame will be G[n]=Peak/abs(S_max[n]). Unfortunately, simply amplifying each frame with its own "local" maximum gain factor G[n] would not give satisfying results either. That's because the maximum gain factors can vary strongly and unsteadily between neighbouring frames! Therefore, applying the maximum possible gain to each frame without taking neighbouring frames into account would result in a strong dynamic range compression – which not only has a tendency to destroy the "vividness" of the audio but could also result in the "pumping" effect, i.e. fast changes of the gain factor that become clearly noticeable to the listener.
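As a rough sketch of the per-frame step, the following computes G[n]=Peak/abs(S_max[n]) for a single channel. The name `localMaxGains` is hypothetical, the frame length is given in samples, and treating a silent frame as gain 1.0 is an assumption of this sketch, not something the text above specifies.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Local maximum gain per frame: G[n] = Peak / abs(S_max[n]).
// The last frame may be shorter than frameLen.
std::vector<double> localMaxGains(const std::vector<double> &samples,
                                  const std::size_t frameLen, const double peak)
{
    std::vector<double> gains;
    if (frameLen == 0) return gains;
    for (std::size_t pos = 0; pos < samples.size(); pos += frameLen)
    {
        const std::size_t end = std::min(pos + frameLen, samples.size());
        double sMax = 0.0;
        for (std::size_t i = pos; i < end; ++i)
        {
            sMax = std::max(sMax, std::fabs(samples[i]));
        }
        // Assumption: a silent frame gets gain 1.0 to avoid dividing by zero.
        gains.push_back((sMax > 0.0) ? (peak / sMax) : 1.0);
    }
    return gains;
}
```

Applying these raw factors directly would produce the unsteady gain changes described above, which is why further filtering is needed.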

The Dynamic Audio Normalizer tries to avoid these issues by applying an advanced dynamic normalization algorithm. Essentially, when processing a particular frame, it also takes into account a certain neighbourhood around the current frame, i.e. the frames preceding and succeeding the current frame will be considered as well. However, while information about past frames can simply be stored as long as it is needed, information about future frames is not normally available beforehand. Older versions of the Dynamic Audio Normalizer applied a 2-Pass algorithm in order to solve this challenge, i.e. the entire audio file was simply processed twice. Newer versions of the Dynamic Audio Normalizer now use a huge "look ahead" buffer, which means that the audio frames will progress through a FIFO (first in, first out) buffer. The size of this buffer is chosen sufficiently large, so that a frame's complete neighbourhood, including the subsequent frames, will already be present in the buffer at the time when that frame is being processed. The "look ahead" buffer eliminates the need for 2-Pass processing and thus gives an improved performance.

With information about the frame's neighbourhood available, a Gaussian smoothing kernel can be applied on those gain factors. Put simply, this smoothing filter "mixes" the gain factor of the n-th frame with those of its preceding frames (n-1, n-2, …) as well as with its subsequent frames (n+1, n+2, …) – where "nearby" frames have a stronger influence (weight), while "distant" frames have a declining influence. This way, abrupt changes of the gain factor are avoided and, instead, we get smooth transitions of the gain factor over time. Furthermore, since the filter also takes into account future frames, Dynamic Audio Normalizer avoids applying strong gain to "quiet" frames located shortly before "loud" frames. In other words, Dynamic Audio Normalizer adjusts the gain factor early and thus nicely prevents clipping/distortion or abrupt gain reductions.
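A minimal sketch of such a smoothing step is given here. The names `gaussianWeights` and `smoothGains` are hypothetical, the window is expressed as a radius (window size = 2*radius+1), and both the choice of sigma and the clamping of out-of-range neighbours to the nearest frame are assumptions of this sketch rather than details taken from the library.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Normalized Gaussian weights for a window of 2*radius+1 frames (radius >= 0).
// Assumption: sigma is one sixth of the window size.
std::vector<double> gaussianWeights(const int radius)
{
    const double sigma = (2.0 * radius + 1.0) / 6.0;
    std::vector<double> weights(2 * radius + 1);
    double sum = 0.0;
    for (int k = -radius; k <= radius; ++k)
    {
        weights[k + radius] = std::exp(-(k * k) / (2.0 * sigma * sigma));
        sum += weights[k + radius];
    }
    for (double &w : weights) w /= sum;   // weights now add up to 1.0
    return weights;
}

// Weighted average of each frame's gain with its neighbours.
// Assumption: neighbours beyond the ends are clamped to the nearest frame.
std::vector<double> smoothGains(const std::vector<double> &gains, const int radius)
{
    const int count = static_cast<int>(gains.size());
    const std::vector<double> weights = gaussianWeights(radius);
    std::vector<double> result(gains.size());
    for (int n = 0; n < count; ++n)
    {
        double acc = 0.0;
        for (int k = -radius; k <= radius; ++k)
        {
            const int idx = std::min(std::max(n + k, 0), count - 1);
            acc += weights[k + radius] * gains[idx];
        }
        result[n] = acc;
    }
    return result;
}
```

Because the weights add up to 1.0, a constant sequence of gain factors passes through unchanged, while an isolated spike is spread over its neighbours.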

One more subject to consider is that applying the Gaussian smoothing kernel alone can not solve all problems. That's because the smoothing kernel will not only smoothen/delay increasing gain factors but also declining ones! If, for example, a very "loud" frame follows immediately after a sequence of "quiet" frames, the smoothing causes the gain factor to decrease early but slowly. As a result, the filtered gain factor of the "loud" frame could actually turn out to be higher than its (local) maximum gain factor – which results in distortion/clipping, if not taken care of! For this reason, the Dynamic Audio Normalizer additionally applies a "minimum" filter, i.e. a filter that replaces each gain factor with the smallest value within the neighbourhood. This is done before the Gaussian smoothing kernel in order to ensure that all gain transitions will remain smooth.
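The "minimum" filter described above can be sketched as a sliding-window minimum. The function name `minimumFilter` and the radius parameter are hypothetical; the window is simply truncated at the ends of the sequence in this sketch.

```cpp
#include <algorithm>
#include <vector>

// Replaces each gain factor with the smallest value within
// the neighbourhood [n - radius, n + radius] (radius >= 0).
std::vector<double> minimumFilter(const std::vector<double> &gains, const int radius)
{
    const int count = static_cast<int>(gains.size());
    std::vector<double> result(gains.size());
    for (int n = 0; n < count; ++n)
    {
        const int lo = std::max(n - radius, 0);
        const int hi = std::min(n + radius, count - 1);
        result[n] = *std::min_element(gains.begin() + lo, gains.begin() + hi + 1);
    }
    return result;
}
```

The filtered value never exceeds the frame's own local maximum gain factor, so smoothing it afterwards keeps the result close to or below the clipping limit.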

The following example shows the results from a "real world" audio recording that has been processed by the Dynamic Audio Normalizer. The chart shows the maximum local gain factors for each individual frame (blue) as well as the minimum filtered gain factors (green) and the final smoothed gain factors (orange). Note how smooth the progression of the final gain factors is, while approaching the maximum local gain factors as closely as possible. Also note how the smoothed gain factors never exceed the maximum local gain factor in order to avoid distortions.

Chart
Figure 1: Progression of the gain factors for each audio frame.

So far it has been discussed how the optimal gain factor for each frame is determined. However, since each frame contains a large number of samples – at a typical sampling rate of 44,100 Hz and a standard frame size of 500 milliseconds we have 22,050 samples per frame – it is also required to infer the gain factor for each individual sample in the frame. The simplest approach, of course, is applying the same gain factor to all samples in a given frame. But this would lead to abrupt changes of the gain factor at each frame boundary, while the gain factor remains constant within the frames. A better approach, as implemented in the Dynamic Audio Normalizer, is interpolating the per-sample gain factors. In particular, the Dynamic Audio Normalizer applies a linear interpolation in order to compute the gain factors for the samples inside the n-th frame from the gain factors G'[n-1], G'[n] and G'[n+1], where G'[k] denotes the final gain factor for the k-th frame. The following graph shows how the per-sample gain factors (orange) are interpolated from the gain factors of the preceding (green), current (blue) and subsequent (purple) frame.

Interpolation
Figure 2: Linear interpolation of the per-sample gain factors.
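One possible reading of this interpolation is sketched here. It assumes that G'[n] applies at the centre of the n-th frame and that the gain moves linearly towards G'[n-1] and G'[n+1] on either side; the exact anchoring used by the library is not stated in the text, and the function name `interpolateFrameGains` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Per-sample gains for one frame, interpolated from the final gain
// factors of the previous, current and next frame.
// Assumption: each frame's gain factor is anchored at the frame centre.
std::vector<double> interpolateFrameGains(const double prev, const double curr,
                                          const double next, const std::size_t frameLen)
{
    std::vector<double> gains(frameLen);
    for (std::size_t i = 0; i < frameLen; ++i)
    {
        // Position of sample i relative to the frame centre, in (-0.5, 0.5).
        const double t = (static_cast<double>(i) + 0.5) / static_cast<double>(frameLen) - 0.5;
        gains[i] = (t < 0.0) ? (curr + (prev - curr) * (-t))
                             : (curr + (next - curr) * t);
    }
    return gains;
}
```

With this anchoring, the gain at the end of one frame and at the start of the next both approach the midpoint of the two frames' gain factors, so there is no jump at the frame boundary.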

Finally, the following waveform view illustrates how the volume of a "real world" audio recording has been harmonized by the Dynamic Audio Normalizer. The upper view shows the unprocessed original recording while the lower view shows the output as created by the Dynamic Audio Normalizer. As can be seen, the significant volume variation between the "loud" and the "quiet" parts that existed in the original recording has been rectified to a great extent, while retaining the dynamics of the input and avoiding clipping or distortion.

Waveform
Figure 3: Waveform before and after processing.

Command-Line Usage

The Dynamic Audio Normalizer program can be invoked via the command-line interface (CLI), either manually from the command prompt or automatically by a batch file.

Basic CLI syntax:

Note that the input file and the output file always have to be specified, while all other parameters are optional. But take care, an existing output file will be overwritten!

Also note that the Dynamic Audio Normalizer program uses libsndfile for input/output, so a wide range of file formats (WAV, W64, FLAC, Ogg/Vorbis, AIFF, AU/SND, etc) as well as various sample types (ranging from 8-Bit Integer to 64-Bit floating point) are supported.

Passing "raw" PCM data via pipe is supported too. Just specify the file name "-" in order to read from or write to the stdin or stdout stream, respectively. When reading from the stdin, you have to explicitly specify the input sample format, channel count and sampling rate.

For a list of all available options, please run DynamicAudioNormalizerCLI.exe --help from the command prompt or refer to the following chapter.

Usage examples:

Configuration

This chapter describes the configuration options that can be used to tweak the behaviour of the Dynamic Audio Normalizer.

While the default parameters of the Dynamic Audio Normalizer have been chosen to give satisfying results with a wide range of audio sources, it can be advantageous to adapt the parameters to the individual audio file as well as to your personal preferences.

Options:

Gaussian Filter Window Size

Probably the most important parameter of the Dynamic Audio Normalizer is the "window size" of the Gaussian smoothing filter. It can be controlled with the --gauss-size option. The filter's window size is specified in frames, centered around the current frame. For the sake of simplicity, this must be an odd number. Consequently, the default value of 31 takes into account the current frame, as well as the 15 preceding frames and the 15 subsequent frames. Using a larger window results in a stronger smoothing effect and thus in less gain variation, i.e. slower gain adaptation. Conversely, using a smaller window results in a weaker smoothing effect and thus in more gain variation, i.e. faster gain adaptation. In other words, the more you increase this value, the more the Dynamic Audio Normalizer will behave like a "traditional" normalization filter. On the contrary, the more you decrease this value, the more the Dynamic Audio Normalizer will behave like a dynamic range compressor. The following graph illustrates the effect of different filter sizes – 11 (orange), 31 (green), and 61 (purple) frames – on the progression of the final filtered gain factor.

FilterSize
Figure 4: The effect of different "window sizes" of the Gaussian smoothing filter.

Target Peak Magnitude

The target peak magnitude specifies the highest permissible magnitude level for the normalized audio file. It is controlled by the --peak option. Since the Dynamic Audio Normalizer represents audio samples as floating point values in the -1.0 to 1.0 range – regardless of the input and output audio format – this value must be in the 0.0 to 1.0 range. Consequently, the value 1.0 is equal to 0 dBFS, i.e. the maximum possible digital signal level (± 32767 in a 16-Bit file). The Dynamic Audio Normalizer will try to approach the target peak magnitude as closely as possible, but at the same time it also makes sure that the normalized signal will never exceed the peak magnitude. A frame's maximum local gain factor is imposed directly by the target peak magnitude. The default value is 0.95 and thus leaves a headroom of 5%. It is not recommended to go above this value!
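To relate the linear peak value to a level in dBFS, the usual conversion is 20*log10(peak). The helper name `peakToDecibels` is hypothetical and is not part of the library.

```cpp
#include <cmath>

// Converts a linear peak magnitude (0.0 < peak <= 1.0) to dBFS.
double peakToDecibels(const double peak)
{
    return 20.0 * std::log10(peak);
}
```

A peak of 1.0 maps to 0 dBFS, and the default of 0.95 corresponds to roughly -0.45 dBFS.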

Channel Coupling

By default, the Dynamic Audio Normalizer will amplify all channels by the same amount. This means the same gain factor will be applied to all channels, i.e. the maximum possible gain factor is determined by the "loudest" channel. In particular, the highest magnitude sample for the n-th frame is defined as S_max[n]=Max(s_max[n][1],s_max[n][2],…,s_max[n][C]), where s_max[n][k] denotes the highest magnitude sample in the k-th channel and C is the channel count. The gain factor for all channels is then derived from S_max[n]. This is referred to as channel coupling and for most audio files it gives the desired result. Therefore, channel coupling is enabled by default. However, in some recordings, it may happen that the volume of the different channels is uneven, e.g. one channel may be "quieter" than the other one(s). In this case, the --no-coupling option can be used to disable the channel coupling. This way, the gain factor will be determined independently for each channel k, depending only on the individual channel's highest magnitude sample s_max[n][k]. This allows for harmonizing the volume of the different channels. The following wave view illustrates the effect of channel coupling: It shows an input file with uneven channel volumes (left), the same file after normalization with channel coupling enabled (center) and again after normalization with channel coupling disabled (right).

Coupling
Figure 5: The effect of channel coupling.
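The difference between coupled and uncoupled channels can be sketched as follows. The function name `referencePeaks` and the data layout (one vector of samples per channel, for a single frame) are assumptions made for this illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// frame[k] holds the samples of channel k for one frame.
// Returns, per channel, the magnitude from which that channel's gain is derived.
std::vector<double> referencePeaks(const std::vector<std::vector<double>> &frame,
                                   const bool channelsCoupled)
{
    std::vector<double> peaks(frame.size(), 0.0);
    double overall = 0.0;
    for (std::size_t k = 0; k < frame.size(); ++k)
    {
        for (const double s : frame[k])
        {
            peaks[k] = std::max(peaks[k], std::fabs(s));   // s_max[n][k]
        }
        overall = std::max(overall, peaks[k]);             // S_max[n]
    }
    if (channelsCoupled)
    {
        // With coupling, every channel uses the loudest channel's peak.
        std::fill(peaks.begin(), peaks.end(), overall);
    }
    return peaks;
}
```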

DC Bias Correction

An audio signal (in the time domain) is a sequence of sample values. In the Dynamic Audio Normalizer these sample values are represented in the -1.0 to 1.0 range, regardless of the original input format. Normally, the audio signal, or "waveform", should be centered around the zero point. That means if we calculate the mean value of all samples in a file, or in a single frame, then the result should be 0.0 or at least very close to that value. If, however, there is a significant deviation of the mean value from 0.0, in either positive or negative direction, this is referred to as a DC bias or DC offset. Since a DC bias is clearly undesirable, the Dynamic Audio Normalizer provides optional DC bias correction, which can be enabled using the --correct-dc switch. With DC bias correction enabled, the Dynamic Audio Normalizer will determine the mean value, or "DC correction" offset, of each input frame and subtract that value from all of the frame's sample values – which ensures those samples are centered around 0.0 again. Also, in order to avoid "gaps" at the frame boundaries, the DC correction offset values will be interpolated smoothly between neighbouring frames. The following wave view illustrates the effect of DC bias correction: It shows an input file with positive DC bias (left), the same file after normalization with DC bias correction disabled (center) and again after normalization with DC bias correction enabled (right).

DCCorrection
Figure 6: The effect of DC Bias Correction.
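The core of the correction, subtracting a frame's mean value, might look like the sketch below. The names `frameMean` and `removeDcOffset` are hypothetical, and the smooth interpolation of the offset between neighbouring frames mentioned above is deliberately left out to keep the example short.

```cpp
#include <numeric>
#include <vector>

// Mean value of all samples in a frame (the "DC correction" offset).
double frameMean(const std::vector<double> &frame)
{
    if (frame.empty()) return 0.0;
    return std::accumulate(frame.begin(), frame.end(), 0.0)
         / static_cast<double>(frame.size());
}

// Subtracts the frame's mean so that the samples are centered around 0.0.
// Simplification: no interpolation of the offset across frame boundaries.
void removeDcOffset(std::vector<double> &frame)
{
    const double offset = frameMean(frame);
    for (double &s : frame) s -= offset;
}
```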

Maximum Gain Factor

The Dynamic Audio Normalizer determines the maximum possible (local) gain factor for each input frame, i.e. the maximum gain factor that does not result in clipping or distortion. The maximum gain factor is determined by the frame's highest magnitude sample. However, the Dynamic Audio Normalizer additionally bounds the frame's maximum gain factor by a predetermined (global) maximum gain factor. This is done in order to avoid excessive gain factors in "silent" or almost silent frames. By default, the maximum gain factor is 10.0, but it can be adjusted using the --max-gain switch. For most input files the default value should be sufficient and it usually is not recommended to increase this value. Though, for input files with an extremely low overall volume level, it may be necessary to allow even higher gain factors. Note, however, that the Dynamic Audio Normalizer does not simply apply a "hard" threshold (i.e. cut off values above the threshold). Instead, a "sigmoid" threshold function will be applied, as depicted in the following chart. This way, the gain factors will smoothly approach the threshold value, but never exceed that value.

Threshold
Figure 7: The Gain Factor Threshold-Function.
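The text does not spell out which sigmoid function is used. As one plausible illustration, the sketch below uses the error function, scaled so that small gain factors pass through almost unchanged while large ones approach the threshold. The function name `boundGain` and the choice of `std::erf` are assumptions.

```cpp
#include <cmath>

// Soft limit: approaches `threshold` smoothly but never exceeds it.
// Assumption: an erf-based sigmoid, chosen here only for illustration.
double boundGain(const double threshold, const double gain)
{
    // sqrt(pi)/2, so that the slope is 1.0 for gain factors near zero.
    const double scale = 0.8862269254527580;
    return std::erf(scale * (gain / threshold)) * threshold;
}
```

With a threshold of 10.0, a gain factor of 1.0 is mapped to slightly below 1.0, whereas very large gain factors are mapped to values close to, but not above, 10.0.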

Target RMS Value

By default, the Dynamic Audio Normalizer performs "peak" normalization. This means that the maximum local gain factor for each frame is defined (only) by the frame's highest magnitude sample. This way, the samples can be amplified as much as possible without exceeding the maximum signal level, i.e. without clipping. Optionally, however, the Dynamic Audio Normalizer can also take into account the frame's root mean square, abbreviated RMS. In electrical engineering, the RMS is commonly used to determine the power of a time-varying signal. It is therefore considered that the RMS is a better approximation of the "perceived loudness" than just looking at the signal's peak magnitude. Consequently, by adjusting all frames to a constant RMS value, a uniform "perceived loudness" can be established. With the Dynamic Audio Normalizer, RMS processing can be enabled using the --target-rms switch. This specifies the desired RMS value, in the 0.0 to 1.0 range. There is no default value, because RMS processing is disabled by default. If a target RMS value has been specified, a frame's local gain factor is defined as the factor that would result in exactly that RMS value. Note, however, that the maximum local gain factor is still restricted by the frame's highest magnitude sample, in order to prevent clipping. The following chart shows the same file normalized without (green) and with (orange) RMS processing enabled.

RMS
Figure 8: Root Mean Square (RMS) processing example.
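A sketch of how the RMS-based gain and the peak-based limit could be combined for one frame follows. The function name `rmsLimitedGain` is hypothetical, and returning 1.0 for a silent frame is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Gain that would bring the frame to targetRms, but never higher than
// the gain allowed by the frame's highest magnitude sample.
double rmsLimitedGain(const std::vector<double> &frame,
                      const double peak, const double targetRms)
{
    double sMax = 0.0, sumSquares = 0.0;
    for (const double s : frame)
    {
        sMax = std::max(sMax, std::fabs(s));
        sumSquares += s * s;
    }
    if (frame.empty() || sMax <= 0.0) return 1.0;   // assumption: silence is left as-is
    const double rms = std::sqrt(sumSquares / static_cast<double>(frame.size()));
    const double peakGain = peak / sMax;            // clipping limit
    const double rmsGain = targetRms / rms;         // gain for the desired RMS
    return std::min(peakGain, rmsGain);
}
```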

Frame Length

The Dynamic Audio Normalizer processes the input audio in small chunks, referred to as frames. This is required, because a peak magnitude has no meaning for just a single sample value. Instead, we need to determine the peak magnitude for a contiguous sequence of sample values. While a "standard" normalizer would simply use the peak magnitude of the complete file, the Dynamic Audio Normalizer determines the peak magnitude individually for each frame. The length of a frame is specified in milliseconds. By default, the Dynamic Audio Normalizer uses a frame length of 500 milliseconds, which has been found to give good results with most files, but it can be adjusted using the --frame-len switch. Note that the exact frame length, in number of samples, will be determined automatically, based on the sampling rate of the individual input audio file.
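The conversion from milliseconds to samples is simple arithmetic. In the sketch below, the function name `frameLengthInSamples` is hypothetical and rounding to the nearest sample is an assumption; the library may round differently.

```cpp
#include <cstdint>

// Frame length in samples for a given sampling rate and frame length in ms.
// Assumption: rounded to the nearest whole sample.
std::uint32_t frameLengthInSamples(const std::uint32_t sampleRate,
                                   const std::uint32_t frameLenMsec)
{
    const std::uint64_t product = static_cast<std::uint64_t>(sampleRate) * frameLenMsec;
    return static_cast<std::uint32_t>((product + 500u) / 1000u);
}
```

For 44,100 Hz and the default 500 milliseconds, this yields 22,050 samples per frame.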

Boundary Mode

As explained before, the Dynamic Audio Normalizer takes into account a certain neighbourhood around each frame. This includes the preceding frames as well as the subsequent frames. However, for the "boundary" frames, located at the very beginning and at the very end of the audio file, not all neighbouring frames are available. In particular, for the first few frames in the audio file, the preceding frames are not known. And, similarly, for the last few frames in the audio file, the subsequent frames are not known. Thus, the question arises which gain factors should be assumed for the missing frames in the "boundary" region. The Dynamic Audio Normalizer implements two modes to deal with this situation. The default boundary mode assumes a gain factor of exactly 1.0 for the missing frames, resulting in a smooth "fade in" and "fade out" at the beginning and at the end of the file, respectively. The alternative boundary mode can be enabled by using the --alt-boundary switch. The latter mode assumes that the missing frames at the beginning of the file have the same gain factor as the very first available frame. It furthermore assumes that the missing frames at the end of the file have the same gain factor as the very last frame. The following chart illustrates the difference between the default (green) and the alternative (red) boundary mode. Note that, for simplicity's sake, a file containing constant volume white noise was used as input here.

Boundary
Figure 9: Default boundary mode vs. alternative boundary mode.
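The two boundary modes can be pictured as two ways of answering "what is the gain factor of a frame that does not exist?". The helper `gainAt` below is hypothetical and only illustrates that lookup.

```cpp
#include <cstddef>
#include <vector>

// Gain factor assumed for frame `index`, which may lie before the first
// or after the last available frame.
double gainAt(const std::vector<double> &gains, const long index, const bool altBoundary)
{
    if (gains.empty()) return 1.0;
    if (index < 0)
    {
        // Default mode: 1.0; alternative mode: repeat the first frame's gain.
        return altBoundary ? gains.front() : 1.0;
    }
    if (index >= static_cast<long>(gains.size()))
    {
        // Default mode: 1.0; alternative mode: repeat the last frame's gain.
        return altBoundary ? gains.back() : 1.0;
    }
    return gains[static_cast<std::size_t>(index)];
}
```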

Write Log File

Optionally, the Dynamic Audio Normalizer can create a log file. The log file name is specified using the --log-file option. If that option is not used, then no log file will be written. The log file uses a simple text format. The file starts with a header, followed by a list of gain factors. In that list, there is one line per frame. In each line, the first column contains the maximum local gain factor, the second column contains the minimum filtered gain factor, and the third column contains the final smoothed gain factor. This sequence is repeated once per channel.

DynamicAudioNormalizer Logfile v2.00-5
CHANNEL_COUNT:2

10.00000  8.59652  5.07585      10.00000  8.59652  5.07585
 8.59652  8.59652  5.64167       8.59652  8.59652  5.64167
 9.51783  8.59652  6.17045       9.51783  8.59652  6.17045
...
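For readers who want to post-process such a log file themselves, one data line could be parsed as sketched here. The struct `GainTriple` and the function `parseLogLine` are hypothetical names; the sketch assumes whitespace-separated columns, three per channel, as in the excerpt above.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One channel's entry in a log line.
struct GainTriple
{
    double localMax;     // maximum local gain factor
    double minFiltered;  // minimum filtered gain factor
    double smoothed;     // final smoothed gain factor
};

// Parses one data line into one GainTriple per channel.
// Lines that do not start with three numbers (e.g. the header) yield an empty result.
std::vector<GainTriple> parseLogLine(const std::string &line)
{
    std::istringstream in(line);
    std::vector<GainTriple> result;
    GainTriple t;
    while (in >> t.localMax >> t.minFiltered >> t.smoothed)
    {
        result.push_back(t);
    }
    return result;
}
```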

The log file can be displayed as a graphical chart using, for example, the Log Viewer application (DynamicAudioNormalizerGUI) that is included with the Dynamic Audio Normalizer:

LogViewer
Figure 10: Dynamic Audio Normalizer - Log Viewer.

API Documentation

This chapter describes the MDynamicAudioNormalizer class, as defined in the DynamicAudioNormalizer.h header file. It allows software developers to call the Dynamic Audio Normalizer library from their own application code.

Please note that all methods of the MDynamicAudioNormalizer class are reentrant, but not thread-safe! This means that it is safe to use the MDynamicAudioNormalizer class in multi-threaded applications, but only as long as each thread uses its own separate MDynamicAudioNormalizer instance. In other words, it is strictly forbidden to call the same MDynamicAudioNormalizer instance concurrently from different threads, but it is perfectly fine to call different MDynamicAudioNormalizer instances concurrently from different threads (provided that each thread will access only its "own" instance). If the same MDynamicAudioNormalizer instance needs to be accessed by different threads, then the application is responsible for serializing all calls to that MDynamicAudioNormalizer instance, e.g. by means of a Mutex. Otherwise, it will result in undefined behaviour!

Also note that C++ applications can access the MDynamicAudioNormalizer class directly, while C applications cannot. For pure C applications, the Dynamic Audio Normalizer library provides wrapper functions around the MDynamicAudioNormalizer class. Those wrapper functions are equivalent to the corresponding methods of the MDynamicAudioNormalizer class, except that you need to pass a "handle" value as an additional argument. Each MDynamicAudioNormalizer instance created through the C API will have its own distinct but opaque handle value.

Synopsis:

  1. Create a new MDynamicAudioNormalizer instance.
  2. Call initialize() in order to initialize the MDynamicAudioNormalizer instance.
  3. Call processInplace() in a loop, until all input samples have been processed.
  4. Call flushBuffer() in a loop, until all pending output samples have been flushed.
  5. Destroy the MDynamicAudioNormalizer instance.
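The call sequence above can be sketched as a small driver function. To keep the example compilable without the library, it is written as a template over any type that offers initialize(), processInplace() and flushBuffer() with the signatures documented in this chapter; in an application, that type would be MDynamicAudioNormalizer. The function name `runNormalizer`, the chunking scheme and the in-memory input/output vectors are assumptions of this sketch, and all channels are assumed to have the same length. Creating and destroying the instance (steps 1 and 5) is left to the caller.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Steps 2 to 4 of the synopsis: initialize, process in a loop, flush in a loop.
template <typename Normalizer>
bool runNormalizer(Normalizer &normalizer,
                   const std::vector<std::vector<double>> &input,
                   std::vector<std::vector<double>> &output,
                   const std::int64_t chunkSize)
{
    const std::size_t channels = input.size();
    if (channels == 0 || chunkSize <= 0) return false;
    const std::int64_t total = static_cast<std::int64_t>(input[0].size());

    // One scratch buffer per channel, exposed as the double** the API expects.
    std::vector<std::vector<double>> scratch(
        channels, std::vector<double>(static_cast<std::size_t>(chunkSize)));
    std::vector<double*> pointers(channels);
    for (std::size_t c = 0; c < channels; ++c) pointers[c] = scratch[c].data();
    output.assign(channels, std::vector<double>());

    if (!normalizer.initialize()) return false;

    for (std::int64_t pos = 0; pos < total; pos += chunkSize)
    {
        const std::int64_t count = std::min(chunkSize, total - pos);
        for (std::size_t c = 0; c < channels; ++c)
            std::copy(input[c].begin() + pos, input[c].begin() + pos + count, scratch[c].begin());
        std::int64_t produced = 0;
        if (!normalizer.processInplace(pointers.data(), count, produced)) return false;
        for (std::size_t c = 0; c < channels; ++c)
            output[c].insert(output[c].end(), scratch[c].begin(), scratch[c].begin() + produced);
    }

    for (;;)
    {
        std::int64_t produced = 0;
        if (!normalizer.flushBuffer(pointers.data(), chunkSize, produced)) return false;
        for (std::size_t c = 0; c < channels; ++c)
            output[c].insert(output[c].end(), scratch[c].begin(), scratch[c].begin() + produced);
        if (produced < chunkSize) break;   // fewer samples than requested: buffer is empty
    }
    return true;
}
```

Note that the number of samples returned by each processInplace() call may be smaller than the number passed in, so the output is appended based on the reported output size rather than on the chunk size.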

Functions:

MDynamicAudioNormalizer::MDynamicAudioNormalizer()

MDynamicAudioNormalizer(
    const uint32_t channels,
    const uint32_t sampleRate,
    const uint32_t frameLenMsec,
    const uint32_t filterSize,
    const double peakValue,
    const double maxAmplification,
    const double targetRms,
    const bool channelsCoupled,
    const bool enableDCCorrection,
    const bool altBoundaryMode,
    FILE *const logFile
);

Constructor. Creates a new MDynamicAudioNormalizer instance and sets up the normalization parameters.

Parameters:

MDynamicAudioNormalizer::~MDynamicAudioNormalizer()

virtual ~MDynamicAudioNormalizer(void);

Destructor. Destroys the MDynamicAudioNormalizer instance and releases all memory that it occupied.

MDynamicAudioNormalizer::initialize()

bool initialize(void);

Initializes the MDynamicAudioNormalizer instance. Validates the parameters and allocates/initializes the required memory buffers.

This function must be called once for each new MDynamicAudioNormalizer instance. It must be called before processInplace() is called.

Return value:

MDynamicAudioNormalizer::processInplace()

bool processInplace(
    double **samplesInOut,
    int64_t inputSize,
    int64_t &outputSize
);

This is the main processing function. It usually is called in a loop by the application until all input audio samples have been processed.

The function works "in place": It reads the original input samples from the specified buffer and then writes the normalized output samples, if any, back into the same buffer. The content of samplesInOut will not be preserved!

It's possible that a specific call to this function returns fewer output samples than the number of input samples that have been read! The pending samples are buffered internally and will be returned in a subsequent function call. This also means that the i-th output sample does not necessarily correspond to the i-th input sample. However, the samples are always returned in a strict FIFO (first in, first out) order. At the end of the process, when all input samples have been read, the application should call flushBuffer() in order to flush all pending output samples.

Parameters:

Return value:

MDynamicAudioNormalizer::flushBuffer()

bool flushBuffer(
    double **samplesOut,
    const int64_t bufferSize,
    int64_t &outputSize
);

This function can be called at the end of the process, after all input samples have been processed, in order to flush the pending samples from the internal buffer. It writes the next pending output samples into the output buffer, in FIFO (first in, first out) order, iff there are any pending output samples left in the internal buffer. Once this function has been called, you must call reset() before calling processInplace() again! If this function returns fewer output samples than the specified output buffer size, then this indicates that the internal buffer is empty now.

Parameters:

Return value:

MDynamicAudioNormalizer::reset()

void reset(void);

Resets the internal state of the MDynamicAudioNormalizer instance. It normally is not required to call this function at all! The only exception is when you want to process multiple independent audio files with the same normalizer instance. In the latter case, call reset() after all samples of the n-th audio file have been processed and before processing the first sample of the (n+1)-th audio file. Also do not forget to flush the pending samples of the n-th file from the internal buffer before calling reset(); those samples would be lost permanently otherwise!

MDynamicAudioNormalizer::getVersionInfo() [static]

static void getVersionInfo(
    uint32_t &major,
    uint32_t &minor,
    uint32_t &patch
);

This static function can be called to determine the Dynamic Audio Normalizer library version.

Parameters:

MDynamicAudioNormalizer::getBuildInfo() [static]

static void getBuildInfo(
    const char **date,
    const char **time,
    const char **compiler,
    const char **arch,
    bool &debug
);

This static function can be called to determine more detailed information about the specific Dynamic Audio Normalizer build.

Parameters:

MDynamicAudioNormalizer::setLogFunction() [static]

static LogFunction *setLogFunction(
    LogFunction *const logFunction
);

This static function can be called to register a callback function that will be called by the Dynamic Audio Normalizer in order to provide additional log messages. Note that initially no callback function will be registered. This means that until a callback function is registered by the application, all log messages will be discarded. Thus it is recommended to register your callback function before creating the first MDynamicAudioNormalizer instance. Also note that at most one callback function can be registered. This means that registering another callback function will replace the previous one. However, since a pointer to the previous callback function will be returned, multiple callback functions can be chained. Finally note that this function is not thread-safe! This means that the application must ensure that all calls to this function are properly serialized. In particular, calling this function while there exists at least one instance of MDynamicAudioNormalizer can result in race conditions and has to be avoided! Usually, an application will call this function early in its "main" function in order to register its callback function and then does not call it again.

Parameters:

Return value:

Callback Function:

The signature of the callback function must be exactly as follows, with standard cdecl calling convention:

void LogFunction(
    const int logLevel,
    const char *const message
);

Parameters:

Source Code

The source code of the Dynamic Audio Normalizer is available from one of the official Git repository mirrors:

Supported build environments:

Build prerequisites:

Changelog

Version 2.03 (2014-08-11)

Version 2.02 (2014-08-03)

Version 2.01 (2014-08-01)

Version 2.00 (2014-07-26)

Version 1.03 (2014-07-09)

Version 1.02 (2014-07-06)

Frequently Asked Questions (FAQ)

Q: How does Dynamic Audio Normalizer differ from dynamic range compression?

A traditional audio compressor will prune all samples whose magnitude is above a certain threshold. In particular, the portion of the sample's magnitude that is above the pre-defined threshold will be reduced by a certain ratio, typically 2:1 or 4:1. In other words, the signal peaks will be flattened, while all samples below the threshold are passed through unmodified. This leaves a certain "headroom", i.e. after flattening the signal peaks the maximum magnitude remaining in the compressed file will be lower than in the original. For example, if we apply 2:1 reduction to all samples above a threshold of 80%, then the maximum remaining magnitude will be at 90%, leaving a headroom of 10%. After the compression has been applied, the resulting sample values will (usually) be amplified again, by a certain fixed gain factor. This factor will be chosen as high as possible without exceeding the maximum allowable signal level, similar to a traditional normalizer. Clearly, the compression allows for a much stronger amplification of the signal than what would be possible otherwise. However, due to the flattening of the signal peaks, the dynamic range, i.e. the ratio between the largest and smallest sample value, will be reduced significantly – which has a strong tendency to destroy the "vividness" of the audio signal! The excessive use of dynamic range compression in many recent productions is also known as the "loudness war".
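The threshold-and-ratio rule described above can be written down directly. The function name `compressSample` is hypothetical; the sketch handles a single sample and ignores the attack and release behaviour that real compressors add.

```cpp
#include <cmath>

// Reduces the portion of the magnitude above `threshold` by `ratio`
// (e.g. 2.0 for 2:1); samples below the threshold pass through unmodified.
double compressSample(const double sample, const double threshold, const double ratio)
{
    const double magnitude = std::fabs(sample);
    if (magnitude <= threshold) return sample;
    const double reduced = threshold + (magnitude - threshold) / ratio;
    return (sample < 0.0) ? -reduced : reduced;
}
```

With a threshold of 0.8 and a 2:1 ratio, a full-scale sample of 1.0 is reduced to 0.9, which matches the 10% headroom from the example above.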

The following waveform view shows an audio signal prior to dynamic range compression (left), after the compression step (center) and after the subsequent amplification step (right). It can be seen that the original audio had a large dynamic range, with each drumbeat causing a significant peak. It can also be seen how those peaks have been eliminated for the most part after the compression. This makes the drum sound much less catchy! Last but not least, it can be seen that the final amplified audio now appears much "louder" than the original, but the dynamics are mostly gone…

Compression
Figure 11: Example of dynamic range compression.

In contrast, the Dynamic Audio Normalizer also implements dynamic range compression of some sort, but it does not prune signal peaks above a fixed threshold. Actually it does not prune any signal peaks at all! Furthermore, it does not amplify the samples by a fixed gain factor. Instead, an "optimal" gain factor will be chosen for each frame. And, since a frame's gain factor is bounded by the highest magnitude sample within that frame, 100% of the dynamic range will be preserved within each frame! The Dynamic Audio Normalizer only performs a "dynamic range compression" in the sense that the gain factors are dynamically adjusted over time, allowing "quieter" frames to get a stronger amplification than "louder" frames. This means that the volume differences between the "quiet" and the "loud" parts of an audio recording will be harmonized – but still the full dynamic range is being preserved within each of these parts. Finally, the Gaussian filter applied by the Dynamic Audio Normalizer ensures that any changes of the gain factor between neighbouring frames will be smooth and seamless, avoiding noticeable "jumps" of the audio volume.

Q: But what if I do not want the "quiet" and "loud" parts to be harmonized?

In this case, the Dynamic Audio Normalizer simply may not be the right tool for what you are trying to achieve. Still, by using a larger filter size, the Dynamic Audio Normalizer may be configured to act much more like a "traditional" normalization filter.
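Why a larger filter size moves the result towards a "traditional" normalizer can be seen in a rough sketch of Gaussian smoothing applied to a list of per-frame gain factors. Everything here is an assumption made for illustration: the choice `sigma = size / 6`, the handling of the edges and the function names are not taken from the Dynamic Audio Normalizer sources. The sketch also does nothing to keep a smoothed gain below a frame's maximum gain, so it shows the smoothing effect only.

```python
import math


def gaussian_weights(size, sigma):
    """Normalized Gaussian weights for an odd window of `size` frames."""
    half = size // 2
    raw = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in range(-half, half + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def smooth_gains(gains, size):
    """Gaussian-weighted average of neighbouring gains; `size` must be odd."""
    sigma = size / 6.0  # assumption made for this sketch only
    weights = gaussian_weights(size, sigma)
    half = size // 2
    smoothed = []
    for n in range(len(gains)):
        acc = 0.0
        norm = 0.0
        for k, w in enumerate(weights):
            idx = n + k - half
            if 0 <= idx < len(gains):  # ignore neighbours outside the signal
                acc += w * gains[idx]
                norm += w
        smoothed.append(acc / norm)
    return smoothed


def spread(values):
    """Difference between the largest and the smallest value."""
    return max(values) - min(values)


gains = [1.25, 5.0, 1.5, 4.0, 1.25, 6.0, 2.0, 5.0] * 4
narrow = smooth_gains(gains, size=3)
wide = smooth_gains(gains, size=31)

# The wider the filter, the less the gain factor varies from frame to frame.
print(spread(gains), spread(narrow), spread(wide))
```

With a very large window the smoothed gain factors become almost identical for all frames, which is close to applying a single constant gain to the whole file.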

Q: Why does the program crash with "GURU MEDITATION" error every time?

This error message indicates that the program has encountered a serious problem. One possible reason is that your processor does not support the SSE2 instruction set. That's because the official Dynamic Audio Normalizer binaries have been compiled with SSE and SSE2 code enabled – like pretty much any compiler does by default nowadays. So without SSE2 support, the program cannot run, obviously. This can be fixed either by upgrading your system to a less antiquated processor, or by recompiling Dynamic Audio Normalizer from the sources with SSE2 code generation disabled. Note that SSE2 is supported by the Pentium 4 and Athlon 64 processors as well as all later processors. Also, every 64-bit x86 processor supports SSE2, because x86-64 has adopted SSE2 as "core" instructions. That means that every processor from the last decade almost certainly supports SSE2.
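If you are unsure whether your processor supports SSE2, one way to check on a Linux system is to look for the `sse2` flag in `/proc/cpuinfo`. The following sketch assumes Linux on an x86 processor and is not part of the Dynamic Audio Normalizer; the helper names are made up for this example. On other operating systems a different method is needed.

```python
def has_cpu_flag(cpuinfo_text, flag="sse2"):
    """Return True if `flag` is listed in any 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags" and flag in value.split():
            return True
    return False


def read_cpuinfo(path="/proc/cpuinfo"):
    """Read the kernel's CPU description; returns '' if the file is unavailable."""
    try:
        with open(path, encoding="utf-8") as handle:
            return handle.read()
    except OSError:
        return ""


if __name__ == "__main__":
    text = read_cpuinfo()
    if not text:
        print("Could not read /proc/cpuinfo (not a Linux system?)")
    else:
        print("SSE2 supported:", has_cpu_flag(text))
```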

If your processor does support SSE2, but you still get the above error message, you probably have found a bug. In this case it is highly recommended to create a debug build and use a debugger in order to track down the cause of the problem.

License Terms

Dynamic Audio Normalizer Library

The Dynamic Audio Normalizer library (DynamicAudioNormalizerAPI) is released under the GNU Lesser General Public License, Version 2.1.

Dynamic Audio Normalizer - Audio Processing Library
Copyright (C) 2014 LoRd_MuldeR <mulder2@gmx.de>. Some rights reserved.

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

http://www.gnu.org/licenses/lgpl-2.1.html

Dynamic Audio Normalizer CLI

The Dynamic Audio Normalizer command-line program (DynamicAudioNormalizerCLI) is released under the GNU General Public License, Version 2.

Dynamic Audio Normalizer - Audio Processing Utility
Copyright (C) 2014 LoRd_MuldeR <mulder2@gmx.de>. Some rights reserved.

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License
as published by the Free Software Foundation; either version 2
of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.

http://www.gnu.org/licenses/gpl-2.0.html

Dynamic Audio Normalizer GUI

The Dynamic Audio Normalizer log viewer program (DynamicAudioNormalizerGUI) is released under the GNU General Public License, Version 3.

Dynamic Audio Normalizer - Audio Processing Utility
Copyright (C) 2014 LoRd_MuldeR <mulder2@gmx.de>. Some rights reserved.

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see <http://www.gnu.org/licenses/>.

http://www.gnu.org/licenses/gpl-3.0.html

Acknowledgement

The Dynamic Audio Normalizer command-line program (DynamicAudioNormalizerCLI) incorporates the following third-party software:

The Dynamic Audio Normalizer log viewer program (DynamicAudioNormalizerGUI) incorporates the following third-party software:


e.o.f.