Pulse-code modulation

Pulse-code modulation (PCM) is a method used to digitally represent analog signals. It is the standard form of digital audio in computers, compact discs, digital telephony and other digital audio applications. In a PCM stream, the amplitude of the analog signal is sampled at uniform intervals, and each sample is quantized to the nearest value within a range of digital steps.

"PCM" redirects here. For other uses, see PCM (disambiguation).

Filename extension: .L16, .WAV, .AIFF, .AU, .PCM^[1]
Internet media type: audio/L16, audio/L8,^[2] audio/L20, audio/L24^[3]^[4]
Type code: "AIFF" for L16,^[1] none^[3]
Magic number: Varies
Type of format: Uncompressed audio
Contained by: Audio CD, AES3, WAV, AIFF, AU, M2TS, VOB, and many others
Open format?: Yes
Free format?: Yes^[5]

Linear pulse-code modulation (LPCM) is a specific type of PCM in which the quantization levels are linearly uniform.^[5] This is in contrast to PCM encodings in which quantization levels vary as a function of amplitude (as with the A-law algorithm or the μ-law algorithm). Though PCM is a more general term, it is often used to describe data encoded as LPCM.

A PCM stream has two basic properties that determine the stream's fidelity to the original analog signal: the sampling rate, which is the number of times per second that samples are taken; and the bit depth, which determines the number of possible digital values that can be used to represent each sample.
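As a minimal sketch of how these two properties interact, the following Python samples a sine wave at uniform intervals and rounds each sample to the nearest signed integer code. The function names and the choice of a unit-amplitude sine input are illustrative assumptions, not part of any PCM standard.

```python
import math

def sample_and_quantize(freq_hz, sample_rate_hz, bit_depth, n_samples):
    """Sample a unit-amplitude sine wave and quantize it to signed LPCM codes."""
    max_code = 2 ** (bit_depth - 1) - 1  # e.g. 32767 for 16 bits
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz  # time of the n-th sample, in seconds
        amplitude = math.sin(2 * math.pi * freq_hz * t)
        codes.append(round(amplitude * max_code))  # nearest digital step
    return codes

def levels(bit_depth):
    """Number of distinct digital values available per sample."""
    return 2 ** bit_depth

def bit_rate(sample_rate_hz, bit_depth, channels):
    """Raw data rate of an uncompressed stream, in bits per second."""
    return sample_rate_hz * bit_depth * channels
```

For example, `bit_rate(44100, 16, 2)` gives the raw rate of 16-bit stereo audio at the 44.1 kHz CD sampling frequency, 1,411,200 bit/s.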

The 4ESS switch introduced time-division switching into the US telephone system in 1976, based on medium-scale integrated circuit technology.^[22]

LPCM is used for the lossless encoding of audio data in the compact disc Red Book standard (informally also known as Audio CD), introduced in 1982.

AES3 (specified in 1985, upon which S/PDIF is based) is a particular format using LPCM.

LaserDiscs with digital sound have an LPCM track on the digital channel.

On PCs, PCM and LPCM often refer to the format used in WAV (defined in 1991) and AIFF audio container formats (defined in 1988). LPCM data may also be stored in other formats such as AU, raw audio format (header-less file) and various multimedia container formats.

LPCM has been defined as a part of the DVD (since 1995) and Blu-ray (since 2006) standards.^[23]^[24]^[25] It is also defined as a part of various digital video and audio storage formats (e.g. DV since 1995,^[26] AVCHD since 2006^[27]).

LPCM is used by HDMI (defined in 2002), a single-cable digital audio/video connector interface for transmitting uncompressed digital data.

The RF64 container format (defined in 2007) uses LPCM and also allows non-PCM bitstream storage: various compression formats contained in the RF64 file as data bursts (Dolby E, Dolby AC3, DTS, MPEG-1/MPEG-2 Audio) can be "disguised" as linear PCM.^[28]

PCM is the method of encoding typically used for uncompressed digital audio.^{[note 3]}

Demodulation

The electronics involved in producing an accurate analog signal from the discrete data are similar to those used for generating the digital signal. These devices are digital-to-analog converters (DACs). They produce a voltage or current (depending on type) that represents the value presented on their digital inputs. This output would then generally be filtered and amplified for use.

To recover the original signal from the sampled data, a demodulator can apply the procedure of modulation in reverse. After each sampling period, the demodulator reads the next value and transitions the output signal to the new value. As a result of these transitions, the signal retains a significant amount of high-frequency energy due to imaging effects. To remove these undesirable frequencies, the demodulator passes the signal through a reconstruction filter that suppresses energy outside the expected frequency range (greater than the Nyquist frequency $f_s/2$).^{[note 4]}
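The two stages described above can be sketched roughly as follows. The staircase output is modeled by holding each sample for a fixed number of output steps, and a simple moving average stands in for the reconstruction filter. Both function names and the choice of a moving average are assumptions made for illustration; a real reconstruction filter is a properly designed low-pass filter.

```python
def zero_order_hold(samples, factor):
    """Hold each sample value for `factor` output steps (staircase output)."""
    held = []
    for value in samples:
        held.extend([value] * factor)
    return held

def moving_average(signal, width):
    """Crude low-pass filter: average each point with its preceding neighbours.

    This only softens the abrupt transitions of the staircase; it is a
    stand-in for a real reconstruction filter, not an equivalent.
    """
    out = []
    for i in range(len(signal)):
        window = signal[max(0, i - width + 1): i + 1]
        out.append(sum(window) / len(window))
    return out
```

Applying `moving_average(zero_order_hold(samples, 4), 4)` smooths the steps between successive sample values, which is the qualitative effect the reconstruction filter has on the high-frequency imaging energy.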

Standard sampling precision and rates

Common sample depths for LPCM are 8, 16, 20 or 24 bits per sample.^[1]^[2]^[3]^[29]

LPCM encodes a single sound channel. Support for multichannel audio depends on file format and relies on synchronization of multiple LPCM streams.^[5]^[30] While two channels (stereo) is the most common format, systems can support up to 8 audio channels (7.1 surround)^[2]^[3] or more.
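One common way a container keeps several channels synchronized is to interleave them, storing one sample from each channel per time step. The sketch below assumes that layout with 16-bit little-endian samples, as used by WAV files; other formats may order or pack samples differently, and the helper names are hypothetical.

```python
import struct

def interleave(channels):
    """Merge equal-length per-channel sample lists into one frame-ordered list."""
    frames = zip(*channels)  # one tuple per time step: (left, right, ...)
    return [sample for frame in frames for sample in frame]

def pack_16bit_le(samples):
    """Serialize signed 16-bit samples as little-endian bytes."""
    return struct.pack("<%dh" % len(samples), *samples)
```

For a stereo pair, `interleave([left, right])` yields left and right samples alternately, so each pair of consecutive values belongs to the same sampling instant.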

Common sampling frequencies are 48 kHz as used with DVD format videos, or 44.1 kHz as used in CDs. Sampling frequencies of 96 kHz or 192 kHz can be used on some equipment, but the benefits have been debated.^[31]

Choosing a discrete value that is near but not exactly at the analog signal level for each sample leads to quantization error.^{[note 5]}
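A small sketch can make the size of this error concrete. Assuming uniform (linear) quantization of values in the range -1.0 to 1.0 by rounding to the nearest code, the error for any in-range input is at most half of one quantization step, so it shrinks as the bit depth grows. The helper names below are illustrative only.

```python
def quantize(x, bit_depth):
    """Map x in [-1.0, 1.0] to the nearest of the uniformly spaced signed codes."""
    max_code = 2 ** (bit_depth - 1) - 1
    return round(x * max_code)

def dequantize(code, bit_depth):
    """Map a signed code back to the range [-1.0, 1.0]."""
    max_code = 2 ** (bit_depth - 1) - 1
    return code / max_code

def max_quantization_error(bit_depth, n_points=1001):
    """Largest round-trip error over evenly spaced test inputs in [-1.0, 1.0]."""
    worst = 0.0
    for i in range(n_points):
        x = -1.0 + 2.0 * i / (n_points - 1)
        err = abs(x - dequantize(quantize(x, bit_depth), bit_depth))
        worst = max(worst, err)
    return worst
```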

Between samples no measurement of the signal is made; the sampling theorem guarantees non-ambiguous representation and recovery of the signal only if it has no energy at frequency $f_s/2$ or higher (one half the sampling frequency, known as the Nyquist frequency); higher frequencies will not be correctly represented or recovered and add aliasing distortion to the signal below the Nyquist frequency.
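To illustrate aliasing, the sketch below computes the frequency at or below the Nyquist frequency that a sampled tone becomes indistinguishable from, and generates the samples themselves so the two can be compared. The function names and the 8000 Hz example rate are assumptions chosen for illustration.

```python
import math

def alias_frequency(freq_hz, sample_rate_hz):
    """Frequency in [0, fs/2] whose samples match those of a tone at freq_hz."""
    folded = freq_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)

def sample_cosine(freq_hz, sample_rate_hz, n_samples):
    """Samples of a unit-amplitude cosine taken at uniform intervals."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]
```

At a sampling frequency of 8000 Hz, a 5000 Hz cosine produces the same samples (up to rounding) as a 3000 Hz cosine, so after sampling the two cannot be told apart.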

As samples are dependent on time, an accurate clock is required for accurate reproduction. If either the encoding or decoding clock is not stable, these imperfections will directly affect the output quality of the device.^{[note 6]}

The Nyquist–Shannon sampling theorem shows PCM devices can operate without introducing distortions within their designed frequency bands if they provide a sampling frequency at least twice that of the highest frequency contained in the input signal. For example, in telephony, the usable voice frequency band ranges from approximately 300 Hz to 3400 Hz.^[32] For effective reconstruction of the voice signal, telephony applications therefore typically use an 8000 Hz sampling frequency which is more than twice the highest usable voice frequency.

Regardless, the potential sources of impairment described above (quantization error, aliasing, and clock instability) are implicit in any PCM system.

Linear PCM (LPCM) is PCM with linear quantization.^[5]

Differential PCM (DPCM) encodes the PCM values as differences between the current and the predicted value. An algorithm predicts the next sample based on the previous samples, and the encoder stores only the difference between this prediction and the actual value. If the prediction is reasonable, fewer bits can be used to represent the same information. For audio, this type of encoding reduces the number of bits required per sample by about 25% compared to PCM.
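A minimal sketch of the idea, assuming the simplest possible predictor (the next sample is predicted to equal the previous one) and no quantization of the differences, so the round trip is exact. Practical DPCM coders use better predictors and quantize the residual; the function names here are illustrative.

```python
def dpcm_encode(samples):
    """Store each sample as its difference from the predicted value.

    The predictor is simply the previous sample (0 before the first one).
    """
    residuals = []
    prediction = 0
    for s in samples:
        residuals.append(s - prediction)
        prediction = s  # next prediction is the sample just seen
    return residuals

def dpcm_decode(residuals):
    """Rebuild the samples by adding each difference to the running prediction."""
    samples = []
    prediction = 0
    for r in residuals:
        s = prediction + r
        samples.append(s)
        prediction = s
    return samples
```

For a slowly varying input such as `[100, 104, 109, 111, 108]`, the differences after the first value are small numbers, which is why they can be represented with fewer bits than the samples themselves.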

Adaptive differential pulse-code modulation (ADPCM) is a variant of DPCM that varies the size of the quantization step, to allow further reduction of the required bandwidth for a given signal-to-noise ratio.

Delta modulation is a form of DPCM that uses one bit per sample to indicate whether the signal is increasing or decreasing compared to the previous sample.
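The one-bit scheme can be sketched as below. This version compares each input sample with the decoder's running estimate rather than with the raw previous sample, which is the usual formulation but an assumption beyond the description above; the fixed step size and function names are also illustrative.

```python
def delta_encode(samples, step):
    """Emit 1 when the input is above the running estimate, otherwise 0."""
    bits = []
    estimate = 0
    for s in samples:
        bit = 1 if s > estimate else 0
        bits.append(bit)
        estimate += step if bit else -step  # track what the decoder will rebuild
    return bits

def delta_decode(bits, step):
    """Rebuild a staircase approximation by stepping up or down per bit."""
    out = []
    estimate = 0
    for bit in bits:
        estimate += step if bit else -step
        out.append(estimate)
    return out
```

Because the estimate can only move by one step per sample, the reconstruction follows slow changes well but lags behind rapid ones.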

Some forms of PCM combine signal processing with coding. Older versions of these systems applied the processing in the analog domain as part of the analog-to-digital process; newer implementations do so in the digital domain. These simple techniques have been largely rendered obsolete by modern transform-based audio compression techniques, such as modified discrete cosine transform (MDCT) coding.

In telephony, a standard audio signal for a single phone call is encoded as 8,000 samples per second, of 8 bits each, giving a 64 kbit/s digital signal known as DS0. The default signal compression encoding on a DS0 is either μ-law (mu-law) PCM (North America and Japan) or A-law PCM (Europe and most of the rest of the world). These are logarithmic compression systems where a 12- or 13-bit linear PCM sample number is mapped into an 8-bit value. This system is described by international standard G.711.
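The logarithmic mapping can be illustrated with the continuous μ-law curve, using μ = 255. This is only a sketch: G.711 specifies a segmented approximation of this curve with its own bit layout, so the code below is not a bit-exact G.711 encoder, and the function names and the scaling to the range -127 to 127 are assumptions.

```python
import math

MU = 255  # companding parameter used for 8-bit mu-law

def mu_law_compress(x):
    """Continuous mu-law curve for x in [-1.0, 1.0]; output is also in [-1.0, 1.0]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)

def encode_8bit(x):
    """Illustrative 8-bit code: the compressed value scaled to -127..127."""
    return round(mu_law_compress(x) * 127)

# 8,000 samples per second at 8 bits each, as stated in the text.
DS0_BIT_RATE = 8000 * 8  # 64000 bit/s
```

The curve spends more of the 8-bit range on quiet signals: an input at 1% of full scale already maps to more than 20% of the output range.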

Where circuit costs are high and loss of voice quality is acceptable, it sometimes makes sense to compress the voice signal even further. An ADPCM algorithm is used to map a series of 8-bit μ-law or A-law PCM samples into a series of 4-bit ADPCM samples. In this way, the capacity of the line is doubled. The technique is detailed in the G.726 standard.
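The doubling follows directly from the bit rates. As a short worked sketch, assuming the 8,000 samples per second telephony rate given earlier and hypothetical helper names:

```python
SAMPLE_RATE_HZ = 8000  # telephony sampling rate from the text

def channel_bit_rate(bits_per_sample):
    """Bit rate of one voice channel, in bits per second."""
    return SAMPLE_RATE_HZ * bits_per_sample

def channels_per_line(line_bit_rate, bits_per_sample):
    """How many voice channels fit into a line of the given bit rate."""
    return line_bit_rate // channel_bit_rate(bits_per_sample)
```

With 4-bit samples each call needs 32 kbit/s, so a 64 kbit/s line that carried one 8-bit call can carry two.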

Audio coding formats and audio codecs have been developed to achieve further compression. Some of these techniques have been standardized and patented. Advanced compression techniques, such as modified discrete cosine transform (MDCT) and linear predictive coding (LPC), are now widely used in mobile phones, voice over IP (VoIP) and streaming media.

Nomenclature

The word pulse in the term pulse-code modulation refers to the pulses to be found in the transmission line. This perhaps is a natural consequence of this technique having evolved alongside two analog methods, pulse-width modulation and pulse-position modulation, in which the information to be encoded is represented by discrete signal pulses of varying width or position, respectively. In this respect, PCM bears little resemblance to these other forms of signal encoding, except that all can be used in time-division multiplexing, and the numbers of the PCM codes are represented as electrical pulses.

Beta encoder

Equivalent pulse code modulation noise

Signal-to-quantization-noise ratio (SQNR), one method of measuring quantization error

Franklin S. Cooper; Ignatius Mattingly (1969). "Computer-controlled PCM system for investigation of dichotic speech perception". Journal of the Acoustical Society of America. 46 (1A): 115. Bibcode:1969ASAJ...46..115C. doi:10.1121/1.1972688.

Ken C. Pohlmann (1985). Principles of Digital Audio (2nd ed.). Carmel, Indiana: Sams/Prentice-Hall Computer Publishing. ISBN 978-0-672-22634-2.

D. H. Whalen, E. R. Wiley, Philip E. Rubin, and Franklin S. Cooper (1990). "The Haskins Laboratories pulse code modulation (PCM) system". Behavior Research Methods, Instruments, and Computers. 22 (6): 550–559. doi:10.3758/BF03204440.

Bill Waggener (1995). Pulse Code Modulation Techniques (1st ed.). New York, NY: Van Nostrand Reinhold. ISBN 978-0-442-01436-0.

Bill Waggener (1999). Pulse Code Modulation Systems Design (1st ed.). Boston, MA: Artech House. ISBN 978-0-89006-776-5.

PCM description on MultimediaWiki

Ralph Miller and Bob Badgley invented multi-level PCM independently in their work at Bell Labs on SIGSALY: U.S. patent 3,912,868 filed in 1943: N-ary Pulse Code Modulation.

Information about PCM: A description of PCM with links to information about subtypes of this format (for example linear pulse-code modulation), and references to their specifications.

Summary of LPCM – Contains links to information about implementations and their specifications.

How to control internal/external hardware using Microsoft's Media Control Interface – Contains information about, and specifications for, the implementation of LPCM used in WAV files.

RFC 4856 – Media Type Registration of Payload Formats in the RTP Profile for Audio and Video Conferences – audio/L8 and audio/L16 (March 2007)

RFC 3190 – RTP Payload Format for 12-bit DAT Audio and 20- and 24-bit Linear Sampled Audio (January 2002)

RFC 3551 – RTP Profile for Audio and Video Conferences with Minimal Control – L8 and L16 (July 2003)
