Audio Terminology Glossary

Artifact: Undesirable sounds around words, such as random humming noises and metallic-sounding breaths. Artifacts can be introduced into the original audio by excessive or incorrect noise reduction, a result of the technical limitations of such processing.

Attack Time: The amount of time it takes for a dynamics processor to begin adjusting gain once the signal exceeds the threshold setting.

Attenuate: To reduce in force, or make quieter.

Bandwidth: A measure of a range of frequencies in Hertz (Hz), or musical octaves. See “Q” also.

Boost: To increase, raise or make louder.

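Brickwall Limiting: A type of hard limiting that allows no signal to pass above the set ceiling. When driven hard, it flattens waveform peaks, pushing them toward a square-wave shape. See “Limiter” also.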

Clipping: Also called “digital clipping”, clipping occurs when a digital signal peak reaches or tries to rise above 0dBFS (decibels relative to full scale). It is usually heard as an undesirable distorted sound, and should always be avoided. To avoid clipping, reduce the signal’s level before the gain stage in which the clipping occurs.
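As a rough illustration of the definition above, the sketch below counts samples that sit at or beyond full scale. It assumes floating-point samples where 1.0 represents 0dBFS; the function name is hypothetical and not taken from any particular DAW or library.

```python
def count_clipped_samples(samples, full_scale=1.0):
    """Count samples whose magnitude reaches or exceeds full scale (0dBFS).

    Assumes floating-point samples in which `full_scale` (1.0 by default)
    corresponds to 0dBFS.
    """
    return sum(1 for s in samples if abs(s) >= full_scale)


# Two of these four samples touch full scale.
print(count_clipped_samples([0.2, -1.0, 0.99, 1.0]))  # 2
```

A non-zero count suggests lowering the level ahead of the stage where the clipping occurs.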

Compressor: A dynamics processor used to narrow an audio signal’s overall dynamic range by reducing the volume of the loud portions. Make-up gain can then raise the overall level, which brings up the quiet portions. Adjustable parameters generally include threshold, ratio, attack, release, and make-up gain.
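The level-dependent behavior of a compressor can be sketched as a simple static curve. This is a minimal illustration that ignores attack and release timing; the function name and the default threshold and ratio are assumptions chosen for the example, not recommended settings.

```python
def compressor_output_db(input_db, threshold_db=-18.0, ratio=3.0, makeup_db=0.0):
    """Static compressor curve, with all levels in dB.

    Below the threshold the level passes through unchanged. Above it,
    every `ratio` dB of input above the threshold yields 1 dB of output
    above the threshold. Make-up gain is then added to everything.
    """
    if input_db <= threshold_db:
        compressed = input_db
    else:
        compressed = threshold_db + (input_db - threshold_db) / ratio
    return compressed + makeup_db


print(compressor_output_db(-6.0))                  # -14.0 (loud part reduced)
print(compressor_output_db(-30.0))                 # -30.0 (quiet part untouched)
print(compressor_output_db(-30.0, makeup_db=4.0))  # -26.0 (raised by make-up gain)
```

With these example settings, a 24 dB difference between the loud and quiet inputs shrinks to 16 dB, which is the narrowing of dynamic range the definition describes.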

Constant Bitrate (CBR): An encoding option for audio files that forces the codec to use the same bitrate for all of its output, regardless of how complex the audio is.

Cut: Remove a portion of audio.

Digital Audio Workstation (DAW): Software designed solely or primarily for recording, editing, and playback of digital audio.

Decay: The progressive reduction in amplitude of a sound or electrical signal over time.

Decibel (dB): The standard unit of measurement used to represent sound volume or sound level. In the digital audio world, it is often assumed that when referring to “dB”, it actually refers to decibels relative to full scale (dBFS), where “0dBFS” represents the maximum possible digital level. This means that measurements in the digital audio realm are generally represented in negative values (-).
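The relationship between a linear sample value and dBFS can be sketched as follows. The sketch assumes floating-point samples where 1.0 is full scale; the function names are hypothetical.

```python
import math


def amplitude_to_dbfs(amplitude, full_scale=1.0):
    """Convert a linear sample amplitude to dBFS (full_scale maps to 0dBFS)."""
    magnitude = abs(amplitude)
    if magnitude == 0:
        return float("-inf")  # digital silence has no finite dB value
    return 20.0 * math.log10(magnitude / full_scale)


def dbfs_to_amplitude(dbfs, full_scale=1.0):
    """Convert a dBFS value back to a linear amplitude."""
    return full_scale * 10.0 ** (dbfs / 20.0)


print(amplitude_to_dbfs(1.0))             # 0.0
print(round(amplitude_to_dbfs(0.5), 2))   # -6.02
print(round(dbfs_to_amplitude(-3.0), 3))  # 0.708
```

Any amplitude below full scale gives a negative result, which is why digital levels are normally written with a minus sign.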

Distortion: The audio garble that is heard when an audio waveform has been altered. Distortion, which is undesirable in audiobook narration, usually occurs when the maximum output of an audio system is exceeded.

Dynamic Range: The ratio of the amplitude between the maximum and minimum sound levels in a recording. This ratio is usually expressed in decibels as the difference between the loudest possible undistorted level, and the level of the noise floor.

Edited Master: Raw (unprocessed) audio that has gone through the editing/quality control pass (QC pass) stage. This audio has been edited and corrected, but has not yet been processed (mastered).

Encoding: The process of converting your uncompressed audio files into a format more suitable for certain applications. In audiobook production, this often means converting WAV files to MP3.

Equalization (EQ): The process of boosting or attenuating frequency ranges for the purpose of enhancing sound.

Fader: Another term used for an audio level control, which today refers to a straight-line slide control, rather than a rotary control.

Frequency: The number of times an event repeats itself in a given period of time. Generally, the time period for audio frequencies is one second. Frequency is measured in cycles per second (Hz), and one Hz equals one cycle per second. One kHz (Kilohertz) is 1,000 cycles per second. The audio frequency range for human hearing is generally 20 Hz to 20,000 Hz. This range covers the fundamental pitch and most overtones of musical instruments.

Gain: The amount of amplification (voltage, current or power) of an audio signal, usually expressed in units of dB, i.e. the ratio of the output level to the input level.
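The output-to-input ratio mentioned above converts to dB as sketched here. Voltage-like quantities (including sample amplitudes) use a factor of 20, while power uses a factor of 10; the function names are hypothetical.

```python
import math


def voltage_gain_db(output_level, input_level):
    """Gain in dB for voltage-like quantities such as sample amplitude."""
    return 20.0 * math.log10(output_level / input_level)


def power_gain_db(output_power, input_power):
    """Gain in dB for power quantities."""
    return 10.0 * math.log10(output_power / input_power)


print(round(voltage_gain_db(2.0, 1.0), 2))  # 6.02  (doubling the amplitude)
print(round(power_gain_db(2.0, 1.0), 2))    # 3.01  (doubling the power)
print(round(voltage_gain_db(0.5, 1.0), 2))  # -6.02 (halving is negative gain)
```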

Headroom: A term related to dynamic range, expressed in decibels (dB) as the difference between the typical operating level and the maximum operating level in an audio system. The maximum output level of a Digital Audio Workstation (DAW) is 0dBFS, though many DAWs have additional headroom built into the master fader which allows sound to be output between +3dBFS and +6dBFS. At Audible Studios, audiobook recordings are limited to a maximum peak of -3dBFS in order to leave headroom and avoid clipping (distortion caused by audio peaks exceeding 0dBFS). This limit allows for 3dB of headroom, leaving room for any surprise peaks that may occur when converting or exporting audiobooks to

Interleaved Stereo: A stereo audio file that contains information for the left and right channels as one continuous block of data. If your files must be presented to listeners in stereo, encoding in this manner is required.

Joint Stereo: A type of stereo MP3 format which cycles through several different kinds of processes to determine the best-sounding technique for a given frame of audio. This format is prone to errors and glitches. ACX does not accept Joint Stereo files.

Level: Also referred to as ‘volume’, level is the amount of signal strength or amplitude, especially the average amplitude.

Limiter: A type of compressor with a fast attack and release, and a fixed ratio of 20:1 or greater. The dynamic action effectively prevents the audio signal from rising above the output ceiling setting. See “Brickwall limiting” also.

MP3: A common lossy audio format for consumer audio streaming and storage, and a de facto standard for digital audio compression.

Mastering: The process of preparing and transferring an edited and mixed audio file to a data storage device; the source from which all copies will be produced (via methods such as pressing, duplication or replication). Typically, mastering involves dynamic processing, such as limiting, and tonal processing, such as equalization and filtering.

Mono: Single-channel sound playback, usually deriving from a single sound source.

Noise Floor: The level of the noise below the audio signal in decibels (dB). Generally considered to be the audible level of background noise in a recording, where no narration is taking place. See “Room Tone” also.

Noise Reduction (NR): A signal processing function used to reduce the amount of background noise and lower the noise floor.

Normalize: The process of scaling all digital samples linearly, by the same amount, so that the audio reaches a given target level, based on either its peak or its RMS value. With peak normalization, the largest original sample lands exactly on the target.
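Peak normalization can be sketched as follows. The sketch assumes floating-point samples where 1.0 is 0dBFS, and the -3dBFS default target is only an example value; the function name is hypothetical.

```python
def normalize_peak(samples, target_dbfs=-3.0):
    """Scale every sample by one gain factor so the largest peak hits the target.

    Assumes floating-point samples where 1.0 corresponds to 0dBFS.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0:
        return list(samples)  # nothing to scale in silence or an empty list
    target_amplitude = 10.0 ** (target_dbfs / 20.0)
    gain = target_amplitude / peak
    return [s * gain for s in samples]


normalized = normalize_peak([0.1, -0.5, 0.25], target_dbfs=-6.0)
print([round(s, 3) for s in normalized])  # [0.1, -0.501, 0.251]
```

Because every sample is multiplied by the same factor, the relative levels within the audio are unchanged; only the overall level moves.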

Peak: The maximum instantaneous level of a signal.

Phasing: A phenomenon that occurs when two similar audio signals engage one another in an interfering fashion, causing an undesirable ‘sweeping’ effect. This is most commonly heard when summing a stereo audio file into mono.

Production Master: The final, retail-ready audiobook. At this point, the audio has gone through the editing/quality control pass (QC pass) stage, and has been mastered (processed).

Q: An equalizer control that determines how wide or narrow the bandwidth of a selected frequency range will be.
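The link between Q and bandwidth can be sketched with the commonly used relations below: Q is the center frequency divided by the bandwidth in Hz, and a bandwidth given in octaves can be converted to an equivalent Q. The function names are hypothetical, and individual equalizers may define their Q control somewhat differently.

```python
def q_from_bandwidth(center_hz, bandwidth_hz):
    """Q as center frequency divided by bandwidth, both in Hz."""
    return center_hz / bandwidth_hz


def q_from_octaves(octaves):
    """Approximate Q for a bandwidth expressed in octaves."""
    ratio = 2.0 ** octaves  # upper band edge divided by lower band edge
    return ratio ** 0.5 / (ratio - 1.0)


print(q_from_bandwidth(1000.0, 500.0))  # 2.0
print(round(q_from_octaves(1.0), 3))    # 1.414
print(round(q_from_octaves(2.0), 3))    # 0.667
```

A higher Q means a narrower band, and a lower Q means a wider one.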

Raw Audio: Unprocessed recorded audio, and the first state of your audio files before the editing/quality control pass (QC pass).

Root Mean Square (RMS): A conventional way to measure the effective average level of an audio signal. RMS corresponds more closely to how loud the signal is perceived to be than the peak level does.
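An RMS measurement squares each sample, averages the squares, and takes the square root of that average. The sketch below assumes floating-point samples where 1.0 is 0dBFS; the function names are hypothetical.

```python
import math


def rms(samples):
    """Root mean square of a sequence of samples (0.0 for an empty sequence)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_dbfs(samples):
    """RMS level in dBFS, assuming 1.0 corresponds to full scale."""
    value = rms(samples)
    return float("-inf") if value == 0 else 20.0 * math.log10(value)


# One cycle of a full-scale sine wave: the peak is 0dBFS, the RMS is lower.
sine = [math.sin(2 * math.pi * k / 100) for k in range(100)]
print(round(rms(sine), 3))       # 0.707
print(round(rms_dbfs(sine), 2))  # -3.01
```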

Room Tone: The background noise in a room. For audiobook purposes, room tone should be the resting sound in your studio, and as close to silence as possible.

Signal: A generic name used for audio recording purposes that refers to one of the many forms of sound in the audio chain.

Stereo Interleaved: See “Interleaved Stereo”.

Threshold: The level at which a dynamics processor will activate, that is, begin to change gain.

VBR (Variable Bit Rate): An encoding option for audio files which tries to minimize file size by encoding to a sliding quality scale instead of a fixed bitrate. ACX does not accept VBR files.

WAV: The most common uncompressed audio format. You should record, edit, and master your audio as WAV files until you are ready to convert to MP3.