Voice over IP

Voice over Internet Protocol (VoIP^[a]), also called IP telephony, is a method and group of technologies for the delivery of voice communication sessions over Internet Protocol (IP) networks,^[2] such as the Internet.

The broader terms Internet telephony, broadband telephony, and broadband phone service specifically refer to the provisioning of voice and other communications services (fax, SMS, voice messaging) over the Internet, rather than via the public switched telephone network (PSTN), also known as plain old telephone service (POTS).

Overview

The steps and principles involved in originating VoIP telephone calls are similar to traditional digital telephony and involve signaling, channel setup, digitization of the analog voice signals, and encoding. Instead of being transmitted over a circuit-switched network, the digital information is packetized and transmitted as IP packets over a packet-switched network. VoIP systems transport media streams using special media delivery protocols that encode audio and video with audio codecs and video codecs. Various codecs exist that optimize the media stream based on application requirements and network bandwidth; some implementations rely on narrowband and compressed speech, while others support high-fidelity stereo codecs.
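The packetization step can be illustrated with a minimal sketch. It assumes an 8 kHz sampling rate, one encoded byte per sample, and a 20 ms packetization interval, none of which are stated above; the 6-byte header (sequence number plus timestamp) is a hypothetical layout chosen for illustration, not the format of any particular protocol.

```python
import struct

SAMPLE_RATE_HZ = 8000   # assumed narrowband sampling rate
FRAME_MS = 20           # assumed packetization interval
SAMPLES_PER_FRAME = SAMPLE_RATE_HZ * FRAME_MS // 1000  # 160 samples


def packetize(encoded: bytes, bytes_per_sample: int = 1) -> list:
    """Split an encoded voice stream into small packets.

    Each packet gets a hypothetical 6-byte header: a 16-bit sequence
    number and a 32-bit timestamp counted in samples, both in network
    byte order. The receiver can use them to reorder packets and to
    schedule playback.
    """
    frame_bytes = SAMPLES_PER_FRAME * bytes_per_sample
    packets = []
    for seq, offset in enumerate(range(0, len(encoded), frame_bytes)):
        payload = encoded[offset:offset + frame_bytes]
        timestamp = seq * SAMPLES_PER_FRAME
        header = struct.pack("!HI", seq & 0xFFFF, timestamp & 0xFFFFFFFF)
        packets.append(header + payload)
    return packets


# One second of (silent) encoded audio becomes 50 packets of 20 ms each.
packets = packetize(bytes(8000))
```

Short frames keep the packetization delay low at the cost of more header overhead per second of audio, which is one reason the interval is usually a tunable parameter.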

The most widely used speech coding standards in VoIP are based on the linear predictive coding (LPC) and modified discrete cosine transform (MDCT) compression methods. Popular codecs include the MDCT-based AAC-LD (used in FaceTime), the LPC/MDCT-based Opus (used in WhatsApp), the LPC-based SILK (used in Skype), μ-law and A-law versions of G.711, G.722, the open-source voice codec iLBC, and G.729, a codec that uses only 8 kbit/s each way.
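Codec bit rates such as the 8 kbit/s of G.729 describe the payload only; on the wire, each packet also carries protocol headers. The following back-of-the-envelope sketch estimates per-direction bandwidth at the IP layer. It assumes IPv4, UDP, and RTP headers without options (40 bytes in total), a 20 ms packetization interval, and the 64 kbit/s rate of G.711; these figures are assumptions added for the example rather than values taken from the text above.

```python
IP_UDP_RTP_HEADER_BYTES = 20 + 8 + 12  # assumed: IPv4 + UDP + RTP, no options


def ip_bandwidth_kbps(codec_kbps: float, frame_ms: int = 20) -> float:
    """Estimate one-way IP-layer bandwidth for a single call."""
    payload_bytes = codec_kbps * 1000 / 8 * frame_ms / 1000
    packets_per_second = 1000 / frame_ms
    total_bits = (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8
    return total_bits * packets_per_second / 1000


g711 = ip_bandwidth_kbps(64)  # 160-byte payload per packet
g729 = ip_bandwidth_kbps(8)   # 20-byte payload per packet
```

Under these assumptions G.711 needs about 80 kbit/s and G.729 about 24 kbit/s per direction, so header overhead dominates for low-bit-rate codecs. Link-layer framing would add further overhead that this sketch ignores.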

Early providers of voice-over-IP services used business models and offered technical solutions that mirrored the architecture of the legacy telephone network. Second-generation providers, such as Skype, built closed networks for private user bases, offering the benefit of free calls and convenience while potentially charging for access to other communication networks, such as the PSTN. This limited the freedom of users to mix-and-match third-party hardware and software. Third-generation providers, such as Google Talk, adopted the concept of federated VoIP.^[3] These solutions typically allow dynamic interconnection between users in any two domains of the Internet, when a user wishes to place a call.

In addition to VoIP phones, VoIP is also available on many personal computers and other Internet access devices. Calls and SMS text messages may be sent via Wi-Fi or the carrier's mobile data network.^[4] VoIP provides a framework for consolidation of all modern communications technologies using a single unified communications system.

Voice over IP has been implemented with proprietary protocols and protocols based on open standards in applications such as VoIP phones, mobile applications, and web-based communications.

A variety of functions are needed to implement VoIP communication. Some protocols perform multiple functions, while others perform only a few and must be used in concert. These functions include:

Network and transport – Creating reliable transmission over unreliable protocols, which may involve acknowledging receipt of data and retransmitting data that wasn't received.

Session management – Creating and managing a session (sometimes glossed as simply a "call"), which is a connection between two or more peers that provides a context for further communication.

Signaling – Performing registration (advertising one's presence and contact information) and discovery (locating someone and obtaining their contact information), dialing (including reporting call progress), negotiating capabilities, and call control (such as hold, mute, transfer/forwarding, dialing DTMF keys during a call [e.g. to interact with an automated attendant or IVR], etc.).

Media description – Determining what type of media to send (audio, video, etc.), how to encode/decode it, and how to send/receive it (IP addresses, ports, etc.).

Media – Transferring the actual media in the call, such as audio, video, text messages, files, etc.

Quality of service – Providing out-of-band content or feedback about the media, such as synchronization, statistics, etc.

Security – Implementing access control, verifying the identity of other participants (computers or people), and encrypting data to protect the privacy and integrity of the media contents and/or the control messages.

VoIP protocols include:

Dedicated VoIP phones connect directly to the IP network using technologies such as wired Ethernet or Wi-Fi. These are typically designed in the style of traditional digital business telephones.

An analog telephone adapter connects to the network and implements the electronics and firmware to operate a conventional analog telephone attached through a modular phone jack. Some residential Internet gateways and cable modems have this function built in.

Softphone application software is installed on a networked computer that is equipped with a microphone and speaker, or headset. The application typically presents a dial pad and display field to the user to operate the application by mouse clicks or keyboard input.

IEEE 802.11e is an approved amendment to the IEEE 802.11 standard that defines a set of quality-of-service enhancements for wireless LAN applications through modifications to the media access control (MAC) layer. The standard is considered of critical importance for delay-sensitive applications, such as voice over wireless IP.

IEEE 802.1p defines 8 different classes of service (including one dedicated to voice) for traffic on layer-2 wired Ethernet.

The ITU-T G.hn standard provides a way to create a high-speed (up to 1 gigabit per second) local area network (LAN) using existing home wiring (power lines, phone lines and coaxial cables). G.hn provides QoS by means of Contention-Free Transmission Opportunities (CFTXOPs), which are allocated to flows (such as a VoIP call) that require QoS and which have negotiated a contract with the network controllers.
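As an illustration of how the eight IEEE 802.1p classes are signaled, the sketch below builds the 4-byte IEEE 802.1Q VLAN tag whose 3-bit priority code point (PCP) field carries the class of service. The choice of priority 5 for voice and the VLAN ID 100 are assumptions made for the example; actual assignments depend on the network's configuration.

```python
import struct

TPID_8021Q = 0x8100  # EtherType value that identifies an 802.1Q tag
PCP_VOICE = 5        # assumed priority for voice traffic


def vlan_tag(pcp: int, vlan_id: int, dei: int = 0) -> bytes:
    """Build an 802.1Q tag: TPID followed by PCP (3 bits), DEI (1), VID (12)."""
    if not 0 <= pcp <= 7:
        raise ValueError("PCP must fit in 3 bits (0-7)")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    tci = (pcp << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)


tag = vlan_tag(PCP_VOICE, vlan_id=100)  # hypothetical voice VLAN
```

Because the priority field is only three bits wide, exactly eight classes are available, which matches the number of classes described for IEEE 802.1p.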

Performance metrics

The quality of voice transmission is characterized by several metrics that may be monitored by network elements and by the user agent hardware or software. Such metrics include network packet loss, packet jitter, packet latency (delay), post-dial delay, and echo. The metrics are determined by VoIP performance testing and monitoring.^[24]^[25]^[26]^[27]^[28]^[29]
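Two of these metrics, jitter and packet loss, can be estimated from per-packet timestamps and sequence numbers. The sketch below uses the smoothed interarrival jitter estimator defined for RTP in RFC 3550 (a running average with gain 1/16); choosing that estimator, and the millisecond input format, are assumptions for the example, since the text above does not prescribe a specific method.

```python
def interarrival_jitter(packets) -> float:
    """Smoothed jitter estimate in the style of RFC 3550.

    `packets` is an iterable of (send_time_ms, receive_time_ms) pairs
    in arrival order. Only differences in transit time matter, so the
    sender and receiver clocks do not need to be synchronized.
    """
    jitter = 0.0
    prev_transit = None
    for sent, received in packets:
        transit = received - sent
        if prev_transit is not None:
            delta = abs(transit - prev_transit)
            jitter += (delta - jitter) / 16.0
        prev_transit = transit
    return jitter


def loss_fraction(received_seqs, expected_count: int) -> float:
    """Fraction of expected packets that never arrived (duplicates ignored)."""
    if expected_count <= 0:
        raise ValueError("expected_count must be positive")
    return 1.0 - len(set(received_seqs)) / expected_count
```

A stream whose packets all experience the same delay has zero jitter regardless of how large that delay is, which is why latency and jitter are tracked as separate metrics.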

Fax support

Sending faxes over VoIP networks is sometimes referred to as Fax over IP (FoIP). Transmission of fax documents was problematic in early VoIP implementations, as most voice digitization and compression codecs are optimized for the representation of the human voice and the proper timing of the modem signals cannot be guaranteed in a packet-based, connectionless network.

A standards-based solution for reliably delivering fax-over-IP is the T.38 protocol. The T.38 protocol is designed to compensate for the differences between traditional packet-less communications over analog lines and packet-based transmissions which are the basis for IP communications. The fax machine may be a standard device connected to an analog telephone adapter (ATA), or it may be a software application or dedicated network device operating via an Ethernet interface.^[41] Originally, T.38 was designed to use UDP or TCP transmission methods across an IP network.

Some newer high-end fax machines have built-in T.38 capabilities and connect directly to a network switch or router. In T.38, each packet contains a portion of the data stream sent in the previous packet, so two successive packets have to be lost before any data is actually lost.
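The redundancy idea can be shown with a simplified model in which every packet carries its own chunk of fax data plus a copy of the previous chunk. The dictionary-based packet layout and the function names are hypothetical; the sketch illustrates the principle only and does not reproduce the actual T.38 packet encoding.

```python
def send_with_redundancy(chunks):
    """Wrap each chunk in a packet that also repeats the previous chunk."""
    packets = []
    previous = None
    for seq, chunk in enumerate(chunks):
        packets.append({"seq": seq, "primary": chunk, "redundant": previous})
        previous = chunk
    return packets


def receive(packets, total: int):
    """Rebuild the chunk sequence, filling gaps from redundant copies.

    Positions that cannot be recovered are left as None.
    """
    recovered = [None] * total
    for packet in packets:
        seq = packet["seq"]
        recovered[seq] = packet["primary"]
        if seq > 0 and recovered[seq - 1] is None:
            recovered[seq - 1] = packet["redundant"]
    return recovered


chunks = [b"a", b"b", b"c", b"d", b"e"]
sent = send_with_redundancy(chunks)

# Losing only packet 2 is harmless: packet 3 repeats chunk 2.
single_loss = receive([p for p in sent if p["seq"] != 2], len(chunks))

# Losing packets 1 and 2 in a row loses chunk 1 for good.
double_loss = receive([p for p in sent if p["seq"] not in (1, 2)], len(chunks))
```

In this model the final chunk has no following packet to repeat it, so a real implementation needs some other way to protect the end of the stream.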

Power requirements

Telephones for traditional residential analog service are usually connected directly to telephone company phone lines which provide direct current to power most basic analog handsets independently of locally available electrical power. The susceptibility of phone service to power failures is a common problem even with traditional analog service where customers purchase telephone units that operate with wireless handsets to a base station, or that have other modern phone features, such as built-in voicemail or phone book features.

VoIP phones and VoIP telephone adapters connect to routers or cable modems which typically depend on the availability of mains electricity or locally generated power.^[42] Some VoIP service providers use customer premises equipment (e.g., cable modems) with battery-backed power supplies to assure uninterrupted service for up to several hours in case of local power failures. Such battery-backed devices typically are designed for use with analog handsets. Some VoIP service providers implement services to route calls to other telephone services of the subscriber, such as a cellular phone, in the event that the customer's network device is inaccessible to terminate the call.

Security

Secure calls are possible using standardized protocols such as Secure Real-time Transport Protocol. Most of the facilities for creating a secure telephone connection over traditional phone lines, such as digitizing and digital transmission, are already in place with VoIP. It is necessary only to encrypt and authenticate the existing data stream. Automated software, such as a virtual PBX, may eliminate the need for personnel to greet and switch incoming calls.
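To make the authentication half of that statement concrete, the sketch below appends a truncated HMAC tag to each media packet and checks it on receipt. It uses only the Python standard library and assumes a pre-shared key and an 80-bit HMAC-SHA1 tag; it omits encryption, key derivation, and replay protection, so it is an illustration of the idea and not an implementation of SRTP.

```python
import hashlib
import hmac

TAG_BYTES = 10  # assumed tag length: HMAC-SHA1 truncated to 80 bits


def protect(key: bytes, packet: bytes) -> bytes:
    """Append an authentication tag computed over the whole packet."""
    tag = hmac.new(key, packet, hashlib.sha1).digest()[:TAG_BYTES]
    return packet + tag


def verify(key: bytes, protected: bytes) -> bytes:
    """Return the original packet, or raise if the tag does not match."""
    if len(protected) < TAG_BYTES:
        raise ValueError("packet too short to contain a tag")
    packet, tag = protected[:-TAG_BYTES], protected[-TAG_BYTES:]
    expected = hmac.new(key, packet, hashlib.sha1).digest()[:TAG_BYTES]
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return packet
```

A receiver that discards packets failing `verify` cannot be fed forged or modified audio by someone who does not hold the key, although without encryption the audio itself remains readable on the network.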

The security concerns for VoIP telephone systems are similar to those of other Internet-connected devices. This means that hackers with knowledge of VoIP vulnerabilities can perform denial-of-service attacks, harvest customer data, record conversations, and compromise voicemail messages. Compromised VoIP user account or session credentials may enable an attacker to incur substantial charges from third-party services, such as long-distance or international calling.

The technical details of many VoIP protocols create challenges in routing VoIP traffic through firewalls and network address translators, used to interconnect to transit networks or the Internet. Private session border controllers are often employed to enable VoIP calls to and from protected networks. Other methods to traverse NAT devices involve assistive protocols such as STUN and Interactive Connectivity Establishment (ICE).
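As an example of such an assistive protocol, a client behind a NAT can send a STUN Binding request to a server on the public Internet and learn its externally visible address from the reply. The sketch below only constructs the 20-byte request header as laid out in RFC 5389 (message type, length, magic cookie, transaction ID); sending it over UDP and parsing the response are left out, and the function name is hypothetical.

```python
import os
import struct

STUN_BINDING_REQUEST = 0x0001
STUN_MAGIC_COOKIE = 0x2112A442  # fixed value defined by RFC 5389


def build_binding_request(transaction_id: bytes = None) -> bytes:
    """Return a STUN Binding request with no attributes."""
    if transaction_id is None:
        transaction_id = os.urandom(12)  # random 96-bit transaction ID
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be exactly 12 bytes")
    header = struct.pack(
        "!HHI",
        STUN_BINDING_REQUEST,
        0,  # message length: no attributes follow the header
        STUN_MAGIC_COOKIE,
    )
    return header + transaction_id


request = build_binding_request()
```

The transaction ID lets the client match a response to its request, which matters because the exchange normally runs over UDP with no connection state.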

Standards for securing VoIP are available in the Secure Real-time Transport Protocol (SRTP) and the ZRTP protocol for analog telephony adapters, as well as for some softphones. IPsec is available to secure point-to-point VoIP at the transport level by using opportunistic encryption. Though many consumer VoIP solutions do not support encryption of the signaling path or the media, securing a call is conceptually easier with VoIP than on traditional telephone circuits. A result of the lack of widespread support for encryption is that it is relatively easy to eavesdrop on VoIP calls when access to the data network is possible.^[43] Free open-source solutions, such as Wireshark, facilitate capturing VoIP conversations.

Government and military organizations use various security measures to protect VoIP traffic, such as voice over secure IP (VoSIP), secure voice over IP (SVoIP), and secure voice over secure IP (SVoSIP).^[44] The distinction lies in whether encryption is applied in the telephone endpoint or in the network.^[45] Secure voice over secure IP may be implemented by encrypting the media with protocols such as SRTP and ZRTP. Secure voice over IP uses Type 1 encryption on a classified network, such as SIPRNet.^[46]^[47]^[48]^[49] Public Secure VoIP is also available with free GNU software and in many popular commercial VoIP programs via libraries, such as ZRTP.^[50]

In June 2021, the National Security Agency (NSA) released comprehensive documents describing the four attack planes of a communications system – the network, perimeter, session controllers and endpoints – and explaining security risks and mitigation techniques for each of them.^[51]^[52]

Caller ID

Voice over IP protocols and equipment provide caller ID support that is compatible with the PSTN. Many VoIP service providers also allow callers to configure custom caller ID information.^[53]

Hearing aid compatibility

Wireline telephones which are manufactured in, imported to, or intended to be used in the US with Voice over IP service, on or after February 28, 2020, are required to meet the hearing aid compatibility requirements set forth by the Federal Communications Commission.^[54]

Operational cost

VoIP has drastically reduced the cost of communication by sharing network infrastructure between data and voice.^[55]^[56] A single broadband connection can carry multiple telephone calls.

1966: Linear predictive coding (LPC) proposed by Fumitada Itakura of Nagoya University and Shuzo Saito of Nippon Telegraph and Telephone (NTT).^[80]

1973: Packet voice application by Danny Cohen.

1974: The Institute of Electrical and Electronics Engineers (IEEE) publishes a paper entitled "A Protocol for Packet Network Interconnection".^[84]

1974: Network Voice Protocol (NVP) tested over ARPANET in August 1974, carrying barely intelligible 16 kbit/s CVSD encoded voice.^[80]

1974: The first successful real-time conversation over ARPANET achieved using 2.4 kbit/s LPC, between Culler-Harrison Incorporated in Goleta, California, and MIT Lincoln Laboratory in Lexington, Massachusetts.^[80]

1977: Danny Cohen and Jon Postel of the USC Information Sciences Institute, and Vint Cerf of the Defense Advanced Research Projects Agency (DARPA), agree to separate IP from TCP, and create UDP for carrying real-time traffic.

1981: IPv4 is described in RFC 791.

1985: The National Science Foundation commissions the creation of NSFNET.^[85]

1985: Code-excited linear prediction (CELP), a type of LPC algorithm, developed by Manfred R. Schroeder and Bishnu S. Atal.^[82]

1986: Proposals from various standards organizations for Voice over ATM, in addition to commercial packet voice products from companies such as StrataCom.

1991: Speak Freely, a voice-over-IP application, was released to the public domain.^[87]^[86]

1992: The Frame Relay Forum conducts development of standards for voice over Frame Relay.

1992: InSoft Inc. announces and launches its desktop conferencing product Communique, which includes VoIP and video.^[86]^[88] The company is credited with developing the first generation of commercial, US-based VoIP, Internet media streaming and real-time Internet telephony/collaborative software and standards that would provide the basis for the Real Time Streaming Protocol (RTSP) standard.

1993: Release of VocalChat, a commercial packet network PC voice communication software from VocalTec.

1994: MTALK, a freeware LAN VoIP application for Linux.^[89]

VocalTec

ITU-T

1997: Level 3 began development of its first softswitch, a term they coined in 1998.^[96]

Session Initiation Protocol

2001: INOC-DBA, the first inter-provider SIP network, is deployed; this is also the first voice network to reach all seven continents.^[102]

2003: Skype released in August 2003. This was the creation of Niklas Zennström and Janus Friis, in cooperation with four Estonian developers. It quickly became a popular program that helped democratize VoIP.

2004: Commercial VoIP service providers proliferate.

2005: PhoneGnome VoIP service is launched by TelEvolution, Inc. of California.^[103]

2006: G.729.1 wideband codec introduced, using MDCT and CELP (LPC) algorithms.^[104]

2007: VoIP device manufacturers and sellers boom in Asia, specifically in the Philippines where many families of overseas workers reside.^[105]

2009: SILK codec introduced, using LPC algorithm,^[106] and used for voice calling in Skype.^[107]

2010: Apple introduces FaceTime, which uses the LD-MDCT-based AAC-LD codec.^[108]

WebRTC

2012: Opus codec introduced, using MDCT and LPC algorithms.^[110]
