Audio bit depth
In digital audio using pulse-code modulation (PCM), bit depth is the number of bits of information in each sample, and it directly corresponds to the resolution of each sample. Examples of bit depth include Compact Disc Digital Audio, which uses 16 bits per sample, and DVD-Audio and Blu-ray Disc, which can support up to 24 bits per sample.
In basic implementations, variations in bit depth primarily affect the noise level from quantization error—thus the signal-to-noise ratio (SNR) and dynamic range. However, techniques such as dithering, noise shaping, and oversampling can mitigate these effects without changing the bit depth. Bit depth also affects bit rate and file size.
Bit depth is useful for describing PCM digital signals. Non-PCM formats, such as those using lossy compression, do not have associated bit depths.[a]
Binary representation
A PCM signal is a sequence of digital audio samples containing the information needed to reconstruct the original analog signal. Each sample represents the amplitude of the signal at a specific point in time, and the samples are uniformly spaced in time. The amplitude is the only information explicitly stored in the sample, and it is typically stored as either an integer or a floating-point number, encoded as a binary number with a fixed number of digits – the sample's bit depth, also referred to as word length or word size.
The resolution indicates the number of discrete values that can be represented over the range of analog values. The resolution of binary integers increases exponentially as the word length increases: adding one bit doubles the resolution, adding two quadruples it, and so on. The number of possible values that an integer bit depth can represent can be calculated as 2^n, where n is the bit depth.[1] Thus, a 16-bit system has a resolution of 65,536 (2^16) possible values.
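The doubling-per-bit relationship can be sketched in a few lines (the bit depths chosen here are just common examples):

```python
# Illustrative sketch: the number of discrete values an integer bit depth
# can represent is 2**n, so each additional bit doubles the resolution.
for bits in (8, 16, 24):
    levels = 2 ** bits
    print(f"{bits}-bit: {levels:,} levels")
# → 8-bit: 256 levels / 16-bit: 65,536 levels / 24-bit: 16,777,216 levels
```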
Integer PCM audio data is typically stored as signed numbers in two's complement format.[2]
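As a small sketch of what two's complement storage looks like in practice (the little-endian byte order and the sample values here are illustrative assumptions, though little-endian is the common convention in WAV files):

```python
import struct

# Sketch: 16-bit signed PCM samples span -32768..32767 and are commonly
# stored little-endian in two's complement ("<h" in struct notation).
samples = [0, 32767, -32768, -1]
raw = struct.pack("<4h", *samples)
print(raw.hex())  # -1 appears as ffff, its two's complement bit pattern
```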
Today, most audio file formats and digital audio workstations (DAWs) support PCM formats with samples represented by floating-point numbers.[3][4][5][6] Both the WAV file format and the AIFF file format support floating-point representations.[7][8] Unlike integers, whose bit pattern is a single series of bits, a floating-point number is instead composed of separate fields whose mathematical relation forms a number. The most common standard is IEEE 754, which is composed of three fields: a sign bit representing whether the number is positive or negative, a mantissa, and an exponent determining a power-of-two factor to scale the mantissa. The mantissa is expressed as a binary fraction in IEEE base-two floating-point formats.[9]
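The three IEEE 754 fields can be inspected directly. This sketch (the helper name is ours, not a standard API) unpacks a single-precision value into its 1 sign bit, 8 exponent bits, and 23 mantissa bits:

```python
import struct

# Sketch: decompose an IEEE 754 single-precision float into its fields.
def f32_fields(x: float):
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF  # stored with a bias of 127
    mantissa = bits & 0x7FFFFF      # binary fraction; implicit leading 1 for normals
    return sign, exponent, mantissa

print(f32_fields(-1.5))  # → (1, 127, 4194304): -1.5 = -(1 + 0.5) * 2**0
```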
Floating point
The resolution of floating-point samples is less straightforward than integer samples because floating-point values are not evenly spaced. In floating-point representation, the space between any two adjacent values is in proportion to the value.
The trade-off between floating-point and integer formats is that the spacing between large floating-point values is greater than the spacing between large integer values of the same bit depth. Rounding a large floating-point number therefore produces a greater error than rounding a small floating-point number, whereas rounding an integer always produces the same level of error. In other words, integers have uniform round-off, always rounding the LSB to 0 or 1, while the floating-point format has uniform SNR: the quantization noise level is always a fixed proportion of the signal level.[21] A floating-point noise floor rises as the signal rises and falls as the signal falls, resulting in audible variation if the bit depth is low enough.[22]
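The proportional spacing can be observed with Python's `math.ulp`, which returns the gap between a value and the next representable double (the sample values below are arbitrary):

```python
import math

# Sketch of the uneven spacing of floating-point values: the gap (ULP)
# between adjacent representable values grows in proportion to the value,
# so quantization error stays a roughly fixed fraction of the signal level.
for x in (1.0, 256.0, 65536.0):
    print(f"value {x:>9}: spacing {math.ulp(x):.3e}, ratio {math.ulp(x) / x:.3e}")
# The absolute spacing grows with x, but the spacing-to-value ratio is constant.
```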
Audio processing
Most processing operations on digital audio involve the re-quantization of samples and thus introduce additional rounding errors analogous to the original quantization error introduced during analog-to-digital conversion. To prevent rounding errors larger than the implicit error during ADC, calculations during processing must be performed at higher precisions than the input samples.[23]
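A minimal sketch of this practice, assuming a simple gain stage (the function, its parameters, and the sample values are illustrative, not a standard API): intermediate math is done in double precision, and the result is re-quantized to 16 bits only once, so the only new rounding error is a single quantization step.

```python
# Sketch: process 16-bit samples at higher (double) precision, then
# re-quantize once at the end instead of rounding after every operation.
def process(samples_16bit, gain_db=-3.0, pan=0.5):
    gain = 10 ** (gain_db / 20) * pan          # combined into one factor
    out = []
    for s in samples_16bit:
        y = s * gain                           # full-precision intermediate
        out.append(max(-32768, min(32767, round(y))))  # single re-quantization
    return out

print(process([1000, -20000, 32767]))
```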
Digital signal processing (DSP) operations can be performed in either fixed-point or floating-point precision. In either case, the precision of each operation is determined by the precision of the hardware operations used to perform each step of the processing, not by the resolution of the input data. For example, on x86 processors, floating-point operations are performed with single or double precision, and fixed-point operations at 16-, 32- or 64-bit resolution. Consequently, all processing on Intel-based hardware is subject to these constraints regardless of the source format.[c]
Fixed-point digital signal processors often support specific word lengths tailored to particular signal resolutions. For example, the Motorola 56000 DSP chip uses 24-bit multipliers and 56-bit accumulators to perform multiply-accumulate operations on two 24-bit samples without overflow or truncation.[24] On devices that do not support large accumulators, fixed-point results may be truncated, reducing precision. Errors compound through multiple stages of DSP at a rate that depends on the operations being performed. For uncorrelated processing steps on audio data without a DC offset, errors are assumed to be random with zero mean. Under this assumption, the standard deviation of the distribution represents the error signal, and quantization error scales with the square root of the number of operations.[25] High levels of precision are necessary for algorithms that involve repeated processing, such as convolution,[23] and for recursive algorithms such as infinite impulse response (IIR) filters.[26] In the particular case of IIR filters, rounding error can degrade frequency response and cause instability.[23]
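The square-root growth of accumulated rounding error can be illustrated with a rough Monte Carlo sketch (the gain range, trial count, and helper name are assumptions for the simulation, not values from the cited analysis):

```python
import random

# Rough simulation: re-quantizing to integer after each of n uncorrelated
# gain steps accumulates zero-mean rounding errors whose standard deviation
# grows roughly like sqrt(n).
def error_std(n_steps, trials=2000, rng=random.Random(0)):
    errs = []
    for _ in range(trials):
        exact = acc = 1000.0
        for _ in range(n_steps):
            g = rng.uniform(0.9, 1.1)
            exact *= g                 # full-precision reference
            acc = round(acc * g)       # re-quantize after every step
        errs.append(acc - exact)
    mean = sum(errs) / len(errs)
    return (sum((e - mean) ** 2 for e in errs) / len(errs)) ** 0.5

# Quadrupling the step count should roughly double the error spread.
print(error_std(4), error_std(16))
```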
Bit rate and file size
Bit depth affects bit rate and file size. Bits are the basic unit of data used in computing and digital communications. Bit rate refers to the amount of data, specifically bits, transmitted or received per second. In MP3 and other lossy compressed audio formats, bit rate describes the amount of information used to encode an audio signal. It is usually measured in kb/s.[51]
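For uncompressed PCM, the relationship is simple arithmetic: bit rate is sample rate × bit depth × channels, and file size is bit rate × duration. A short sketch using CD-audio parameters (44,100 Hz, 16-bit, stereo) as the example:

```python
# Sketch of the uncompressed-PCM arithmetic; the parameter values are
# just the familiar CD-audio example.
def pcm_bit_rate(sample_rate_hz, bit_depth, channels):
    return sample_rate_hz * bit_depth * channels   # bits per second

cd = pcm_bit_rate(44_100, 16, 2)
print(cd, "bit/s")                          # 1,411,200 bit/s for CD audio
print(cd * 60 / 8 / 1e6, "MB per minute")   # ~10.6 MB per minute, uncompressed
```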