Phoneme

In linguistics and specifically phonology, a phoneme (/ˈfoʊniːm/) is any set of similar phones (speech sounds) that, within a given language, is perceptually regarded as a single distinct sound and helps distinguish one word from another.^[1]


For example, in dialects of English, the sound patterns /sɪn/ (sin) and /sɪŋ/ (sing) are two separate words distinguished entirely by the substitution of one phoneme, /n/, for another, /ŋ/.^[a] Two words like these, which differ in meaning through the contrast of a single phoneme, form a minimal pair. If, in another language, two sequences differing only in whether the final sound is pronounced [n] or [ŋ] are perceived as having the same meaning, then the two sounds are interpreted in that language as phonetic variants of a single phoneme, which linguists call allophones. For example, the sound sequences [pan] and [paŋ] are interpreted in Spanish as the same word (pan, the Spanish word for bread) because in Spanish, unlike in English, [n] and [ŋ] are not separate phonemes but rather regional or dialect-specific allophones of the same phoneme.

In transcriptions using the International Phonetic Alphabet (IPA), linguists use slashes to transcribe phonemes but square brackets to transcribe more exact pronunciation details; they describe this basic distinction as phonemic versus phonetic. Thus, minimal pairs such as tap vs tab, or pat vs bat, can be transcribed phonemically and are written between slashes (including /p/, /b/, etc.), while nuances of exactly how a speaker pronounces /p/ are phonetic and written between brackets, such as [p] (for the p in spit) versus [pʰ] (for the p in pit, which in English is an aspirated allophone of /p/, pronounced with an extra burst of air).
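The formal side of the minimal-pair test can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name is_minimal_pair is hypothetical, each word is given as a hand-segmented list of phoneme symbols, and the vowel /æ/ in tap, tab, pat, and bat is supplied here rather than taken from the text above. The sketch checks form only; whether the two words actually differ in meaning still has to be established separately.

```python
def is_minimal_pair(a, b):
    """Return True if two phoneme sequences differ in exactly one position.

    Assumption: each word is already segmented into a list of phoneme
    symbols. Only the form is compared; a difference in meaning must be
    confirmed independently.
    """
    if len(a) != len(b):
        return False
    differences = sum(1 for x, y in zip(a, b) if x != y)
    return differences == 1


# Hand-segmented phonemic transcriptions (illustrative).
sin = ["s", "ɪ", "n"]
sing = ["s", "ɪ", "ŋ"]
tap = ["t", "æ", "p"]
tab = ["t", "æ", "b"]
bat = ["b", "æ", "t"]

print(is_minimal_pair(sin, sing))  # True: only the final phoneme differs
print(is_minimal_pair(tap, tab))   # True: /p/ contrasts with /b/
print(is_minimal_pair(tap, bat))   # False: two positions differ
```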

There are differing views as to exactly what phonemes are and how a given language should be analyzed in phonemic (or phonematic) terms. Generally, a phoneme is regarded as an abstraction of a set (or equivalence class) of spoken sound variations (phones) that are nevertheless perceived as a single unit, a single sound, by the ordinary native speakers of a given language. For example, in American English, the sound spelled with the symbol t is usually articulated with: a glottal stop [ʔ] (or a similar glottalized sound) in the word cat, an alveolar flap [ɾ] in dating, an alveolar plosive [t] in stick, and an aspirated alveolar plosive [tʰ] in tie; however, English speakers perceive or "hear" all of these sounds as merely being variants (allophones) of a single phoneme that is traditionally transcribed as /t/. Allophones each have technically different articulations, yet their differences do not create meaningful distinctions between words. Sometimes, allophonic variation is conditioned, in which case a certain phoneme is realized as a certain allophone in particular phonological environments, or it may otherwise be free, or may vary by exact dialect or even by speaker. Therefore, phonemes are often considered to constitute an abstract underlying representation for segments of words, while speech sounds make up the corresponding phonetic realization, or the surface form.

Notation

Phonemes are conventionally placed between slashes in transcription, whereas speech sounds (phones) are placed between square brackets. Thus, /pʊʃ/ represents a sequence of three phonemes, /p/, /ʊ/, /ʃ/ (the word push in Standard English), and [pʰʊʃ] represents the phonetic sequence of sounds [pʰ] (aspirated p), [ʊ], [ʃ] (the usual pronunciation of push). This should not be confused with the similar convention of the use of angle brackets to enclose the units of orthography, graphemes. For example, ⟨f⟩ represents the written letter (grapheme) f.
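The three bracketing conventions can be told apart mechanically. The following Python sketch is illustrative only; the function name classify_transcription and the returned labels are hypothetical choices, and the check looks at nothing but the enclosing delimiters.

```python
def classify_transcription(text):
    """Classify a transcription by its enclosing delimiters.

    Returns a (kind, content) tuple. The kind labels are illustrative
    names chosen for this sketch, not standard terminology.
    """
    delimiters = {
        ("/", "/"): "phonemic",      # slashes enclose phonemes
        ("[", "]"): "phonetic",      # square brackets enclose phones
        ("⟨", "⟩"): "orthographic",  # angle brackets enclose graphemes
    }
    if len(text) >= 2:
        kind = delimiters.get((text[0], text[-1]))
        if kind is not None:
            return kind, text[1:-1]
    return "unknown", text


print(classify_transcription("/pʊʃ/"))   # ('phonemic', 'pʊʃ')
print(classify_transcription("[pʰʊʃ]"))  # ('phonetic', 'pʰʊʃ')
print(classify_transcription("⟨f⟩"))     # ('orthographic', 'f')
```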

The symbols used for particular phonemes are often taken from the International Phonetic Alphabet (IPA), the same set of symbols most commonly used for phones. For computer-typing purposes, systems such as X-SAMPA exist to represent IPA symbols using only ASCII characters. However, descriptions of particular languages may use different conventional symbols to represent the phonemes of those languages. For languages whose writing systems employ the phonemic principle, ordinary letters may be used to denote phonemes, although this approach is often hampered by the complexity of the relationship between orthography and pronunciation (see § Correspondence between letters and phonemes below).

Distribution of allophones

When a phoneme has more than one allophone, the one actually heard at a given occurrence of that phoneme may depend on the phonetic environment (surrounding sounds). Allophones that normally cannot appear in the same environment are said to be in complementary distribution. In other cases, the choice of allophone may depend on the individual speaker or other unpredictable factors. Such allophones are said to be in free variation, although the choice among them is still made within a specific phonetic context.
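As a rough illustration, complementary distribution can be modelled as a disjointness check over the sets of environments in which each allophone is observed. The sketch below makes several assumptions: the function name in_complementary_distribution is hypothetical, the environment labels are toy strings rather than a real phonological description, and the English /p/ data is simplified from the pit and spit examples.

```python
from itertools import combinations


def in_complementary_distribution(environments):
    """Return True if no two allophones share an environment label.

    `environments` maps each allophone to the set of environments in
    which it has been observed (toy labels, assumed for this sketch).
    """
    return all(
        envs_a.isdisjoint(envs_b)
        for envs_a, envs_b in combinations(environments.values(), 2)
    )


# Simplified, illustrative data for English /p/.
p_allophones = {
    "pʰ": {"syllable-initial before a stressed vowel"},  # as in pit
    "p": {"after /s/"},                                   # as in spit
}

# Two hypothetical variants observed in the same environment.
free_variants = {
    "variant A": {"word-final"},
    "variant B": {"word-final"},
}

print(in_complementary_distribution(p_allophones))   # True
print(in_complementary_distribution(free_variants))  # False
```

A shared environment, as in the second data set, is compatible with free variation but does not by itself establish it.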

Background and related ideas

The term phonème (from Ancient Greek: φώνημα, romanized: phōnēma, "sound made, utterance, thing spoken, speech, language"^[6]) was reportedly first used by A. Dufriche-Desgenettes in 1873, but it referred only to a speech sound. The term phoneme as an abstraction was developed by the Polish linguist Jan Baudouin de Courtenay and his student Mikołaj Kruszewski during 1875–1895.^[7] The term used by these two was fonema, the basic unit of what they called psychophonetics. Daniel Jones became the first linguist in the western world to use the term phoneme in its current sense, employing the word in his article "The phonetic structure of the Sechuana Language".^[8] The concept of the phoneme was then elaborated in the works of Nikolai Trubetzkoy and others of the Prague School (during the years 1926–1935), and in those of structuralists like Ferdinand de Saussure, Edward Sapir, and Leonard Bloomfield. Some structuralists (though not Sapir) rejected the idea of a cognitive or psycholinguistic function for the phoneme.^[9]^[10]

Later, it was used and redefined in generative linguistics, most famously by Noam Chomsky and Morris Halle,^[11] and remains central to many accounts of the development of modern phonology. As a theoretical concept or model, though, it has been supplemented and even replaced by others.^[12]

Some linguists (such as Roman Jakobson and Morris Halle) proposed that phonemes may be further decomposable into features, such features being the true minimal constituents of language.^[13] Features overlap each other in time, as do suprasegmental phonemes in oral language and many phonemes in sign languages. Features could be characterized in different ways: Jakobson and colleagues defined them in acoustic terms,^[14] Chomsky and Halle used a predominantly articulatory basis, though retaining some acoustic features, while Ladefoged's system^[15] is a purely articulatory system apart from the use of the acoustic term 'sibilant'.

In the description of some languages, the term chroneme has been used to indicate contrastive length or duration of phonemes. In languages in which tones are phonemic, the tone phonemes may be called tonemes. Though not all scholars working on such languages use these terms, they are by no means obsolete.

By analogy with the phoneme, linguists have proposed other sorts of underlying objects, giving them names with the suffix -eme, such as morpheme and grapheme. These are sometimes called emic units. The latter term was first used by Kenneth Pike, who also generalized the concepts of emic and etic description (from phonemic and phonetic respectively) to applications outside linguistics.^[16]

Languages do not generally allow words or syllables to be built of any arbitrary sequences of phonemes. There are phonotactic restrictions on which sequences of phonemes are possible and in which environments certain phonemes can occur. Phonemes that are significantly limited by such restrictions may be called restricted phonemes.

In English, examples of such restrictions include the following:

/ŋ/, as in sing, occurs only at the end of a syllable, never at the beginning (in many other languages, such as Māori, Swahili, Tagalog, Thai, and Setswana, /ŋ/ can appear word-initially).

/h/ occurs only at the beginning of a syllable, never at the end (a few languages, such as Arabic and Romanian, allow /h/ syllable-finally).

In non-rhotic dialects, /ɹ/ can occur only immediately before a vowel, never before a consonant.

/w/ and /j/ occur only before a vowel, never at the end of a syllable (except in interpretations in which a word like boy is analyzed as /bɔj/).

Some phonotactic restrictions can alternatively be analyzed as cases of neutralization. See Neutralization and archiphonemes below, particularly the example of the occurrence of the three English nasals before stops.

Biuniqueness

Biuniqueness is a requirement of classic structuralist phonemics. It means that a given phone, wherever it occurs, must unambiguously be assigned to one and only one phoneme. In other words, the mapping between phones and phonemes is required to be many-to-one rather than many-to-many. The notion of biuniqueness was controversial among some pre-generative linguists and was prominently challenged by Morris Halle and Noam Chomsky in the late 1950s and early 1960s.

An example of the problems arising from the biuniqueness requirement is provided by the phenomenon of flapping in North American English. This may cause either /t/ or /d/ (in the appropriate environments) to be realized with the phone [ɾ] (an alveolar flap). For example, the same flap sound may be heard in the words hitting and bidding, although it is intended to realize the phoneme /t/ in the first word and /d/ in the second. This appears to contradict biuniqueness.
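The flapping problem can be made concrete with a small check that looks for phones assigned to more than one phoneme. This is a minimal sketch under stated assumptions: the function name biuniqueness_violations is hypothetical, and the list of (phoneme, phone) pairs is a toy sample in which the [tʰ] and [d] realizations are added purely for contrast.

```python
from collections import defaultdict


def biuniqueness_violations(realizations):
    """Return the phones that realize more than one phoneme.

    `realizations` is an iterable of (phoneme, phone) pairs. Under
    biuniqueness every phone must map to exactly one phoneme, so any
    phone returned here breaks the requirement.
    """
    phonemes_by_phone = defaultdict(set)
    for phoneme, phone in realizations:
        phonemes_by_phone[phone].add(phoneme)
    return {
        phone: phonemes
        for phone, phonemes in phonemes_by_phone.items()
        if len(phonemes) > 1
    }


# Toy sample: the flap realizes /t/ in hitting and /d/ in bidding.
observed = [
    ("t", "ɾ"),   # hitting
    ("d", "ɾ"),   # bidding
    ("t", "tʰ"),  # added for contrast (assumption)
    ("d", "d"),   # added for contrast (assumption)
]

violations = biuniqueness_violations(observed)
print(violations)  # the flap is the only phone mapped to both /t/ and /d/
```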

For further discussion of such cases, see the next section.

Numbers of phonemes in different languages

All known languages use only a small subset of the many possible sounds that the human speech organs can produce, and, because of allophony, the number of distinct phonemes will generally be smaller than the number of identifiably different sounds. Different languages vary considerably in the number of phonemes they have in their systems (although apparent variation may sometimes result from the different approaches taken by the linguists doing the analysis). The total phonemic inventory in languages varies from as few as 9–11 in Pirahã and 11 in Rotokas to as many as 141 in ǃXũ.^[19]^[20]^[21]

The number of phonemically distinct vowels can be as low as two, as in Ubykh and Arrernte. At the other extreme, the Bantu language Ngwe has 14 vowel qualities, 12 of which may occur long or short, making 26 oral vowels, plus six nasalized vowels, long and short, making a total of 38 vowels; while ǃXóõ achieves 31 pure vowels by varying the phonation, not counting its additional variation by vowel length. As regards consonant phonemes, Puinave and the Papuan language Tauade each have just seven, and Rotokas has only six. ǃXóõ, on the other hand, has somewhere around 77, and Ubykh 81. The English language uses a rather large set of 13 to 21 vowel phonemes, including diphthongs, although its 22 to 26 consonants are close to average. Across all languages, the average number of consonant phonemes per language is about 22, while the average number of vowel phonemes is about 8.^[22]

Some languages, such as French, have no phonemic tone or stress, while Cantonese and several of the Kam–Sui languages have six to nine tones (depending on how they are counted), and the Kam–Sui language Dong has nine to 15 tones by the same measure. One of the Kru languages, Wobé, has been claimed to have 14 tones,^[23] though this is disputed.^[24]

The most common vowel system consists of the five vowels /i/, /e/, /a/, /o/, /u/. The most common consonants are /p/, /t/, /k/, /m/, /n/.^[25] Relatively few languages lack any of these consonants, although it does happen: for example, Arabic lacks /p/, standard Hawaiian lacks /t/, Mohawk and Tlingit lack /p/ and /m/, Hupa lacks both /p/ and a simple /k/, colloquial Samoan lacks /t/ and /n/, while Rotokas and Quileute lack /m/ and /n/.

The non-uniqueness of phonemic solutions

During the development of phoneme theory in the mid-20th century, phonologists were concerned not only with the procedures and principles involved in producing a phonemic analysis of the sounds of a given language, but also with the reality or uniqueness of the phonemic solution. These were central concerns of phonology. Some writers took the position expressed by Kenneth Pike: "There is only one accurate phonemic analysis for a given set of data",^[26] while others believed that different analyses, equally valid, could be made for the same data. Yuen Ren Chao (1934), in his article "The non-uniqueness of phonemic solutions of phonetic systems"^[27] stated "given the sounds of a language, there are usually more than one possible way of reducing them to a set of phonemes, and these different systems or solutions are not simply correct or incorrect, but may be regarded only as being good or bad for various purposes". The linguist F. W. Householder referred to this argument within linguistics as "God's Truth" (i.e. the stance that a given language has an intrinsic structure to be discovered) vs. "hocus-pocus" (i.e. the stance that any proposed, coherent structure is as good as any other).^[28]

Different analyses of the English vowel system may be used to illustrate this. The article English phonology states that "English has a particularly large number of vowel phonemes" and that "there are 20 vowel phonemes in Received Pronunciation, 14–16 in General American and 20–21 in Australian English". Although these figures are often quoted as fact, they reflect just one of many possible analyses, and later in the English phonology article an alternative analysis is suggested in which some diphthongs and long vowels may be interpreted as comprising a short vowel linked to either /j/ or /w/. The fullest exposition of this approach is found in Trager and Smith (1951), where all long vowels and diphthongs ("complex nuclei") are analyzed as two phonemes: a short vowel combined with /j/, /w/ or /h/ (plus /r/ for rhotic accents).^[29] Under this analysis, the vowel normally transcribed /aɪ/ would instead be /aj/, /aʊ/ would be /aw/, and /ɑː/ would be /ah/, or /ar/ in a rhotic accent if there is an ⟨r⟩ in the spelling. It is also possible to treat English long vowels and diphthongs as combinations of two vowel phonemes, with long vowels treated as a sequence of two short vowels, so that 'palm' would be represented as /paam/. English can thus be said to have around seven vowel phonemes, or even six if schwa is treated as an allophone of /ʌ/ or of other short vowels.

In the same period there was disagreement about the correct basis for a phonemic analysis. The structuralist position was that the analysis should be made purely on the basis of the sound elements and their distribution, with no reference to extraneous factors such as grammar, morphology or the intuitions of the native speaker; this position is strongly associated with Leonard Bloomfield.^[30] Zellig Harris claimed that it is possible to discover the phonemes of a language purely by examining the distribution of phonetic segments.^[31] Referring to mentalistic definitions of the phoneme, Twaddell (1935) stated "Such a definition is invalid because (1) we have no right to guess about the linguistic workings of an inaccessible 'mind', and (2) we can secure no advantage from such guesses. The linguistic processes of the 'mind' as such are quite simply unobservable; and introspection about linguistic processes is notoriously a fire in a wooden stove."^[9] This approach was opposed to that of Edward Sapir, who gave an important role to native speakers' intuitions about where a particular sound or group of sounds fitted into a pattern. Using English [ŋ] as an example, Sapir argued that, despite the superficial appearance that this sound belongs to a group of three nasal consonant phonemes (/m/, /n/ and /ŋ/), native speakers feel that the velar nasal is really the sequence [ŋɡ].^[32] The theory of generative phonology, which emerged in the 1960s, explicitly rejected the structuralist approach to phonology and favoured the mentalistic or cognitive view of Sapir.^[33]^[11]

These topics are discussed further in English phonology § Controversial issues.