eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

Phonemes


Evan Kirshenbaum created an ASCII transcription of the International Phonetic Alphabet (IPA)[1], [2]. As well as using ASCII characters for specific IPA phonemes, this transcription provides a set of 3-letter feature abbreviations allowing a phoneme to be described as a sequence of features.

This document describes the IPA phonemes using the features used by Kirshenbaum. Where Kirshenbaum does not specify a feature name, the feature name from Cainteoir Text-to-Speech[5] is used. This is to provide a consistent naming scheme for the extended feature set. Where there is still no feature available, a custom 3-letter feature name is chosen.

The aim of the feature set described in this document is to specify the underlying phonetics and phonemics of the sounds being produced in a way that is consistent between languages and voices. While this feature set is modelled on the IPA, it is not meant to round-trip transcriptions: a phoneme transcription is not guaranteed to be preserved when the same transcription scheme is used as both the input and the output phoneme set. Where authors differ in how they use the associated IPA symbols, this document provides commentary on the intended usage of these features.

This document is grouped into two sections. The first section displays the IPA charts using the feature names instead of the IPA names, showing the IPA phoneme at each position in the chart. This makes it easier to look up the features for a given IPA phoneme.

The second section lists the features and their associated name. This section does not describe what these mean. Their meaning is described in phonetics articles, books and Wikipedia. The Wikipedia IPA[4] article can be used as a starting point, as it links to topics and descriptions of the various phonemes.

The diacritics and suprasegmental feature lists also show their corresponding IPA symbol. This is to avoid duplicating the lists in the IPA Phonemes and Feature sections.

IPA Phonemes

Consonants (Pulmonic)

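|         | blb | lbd | dnt | alv | pla | rfx | alp | pal | vel | uvl | phr | glt |
|---------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| nas     | m | ɱ | | n | | ɳ | | ɲ | ŋ | ɴ | | |
| stp     | p b | | | t d | | ʈ ɖ | | c ɟ | k ɡ | q ɢ | ʡ | ʔ |
| sib afr | | | | t͡s d͡z | t͡ʃ d͡ʒ | ʈ͡ʂ ɖ͡ʐ | t͡ɕ d͡ʑ | | | | | |
| afr     | p͡ɸ b͡β | p̪͡f b̪͡v | t͡θ d͡ð | | | | | c͡ç ɟ͡ʝ | k͡x ɡ͡ɣ | q͡χ ɢ͡ʁ | ʡ͡ħ ʡ͡ʕ | ʔ͡h |
| lat afr | | | | t͡ɬ d͡ɮ | | | | | | | | |
| sib frc | | | | s z | ʃ ʒ | ʂ ʐ | ɕ ʑ | | | | | |
| frc     | ɸ β | f v | θ ð | | | | | ç ʝ | x ɣ | χ ʁ | ħ ʕ | h ɦ |
| lat frc | | | | ɬ ɮ | | | | | | | | |
| apr     | | ʋ | | ɹ | | ɻ | | j | ɰ | | | |
| lat apr | | | | l | | ɭ | | ʎ | ʟ | | | |
| flp     | | | | ɾ | | ɽ | | | | | | |
| lat flp | | | | ɺ | | | | | | | | |
| trl     | ʙ | | | r | | | | | | ʀ | ʜ ʢ | |

Where a cell holds two symbols, the one on the left has vls phonation and the one on the right has mdv phonation. Symbols shown alone are mdv, except ʡ, ʔ and ʔ͡h, which are vls.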
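
As a rough illustration of how the chart is read, the sketch below stores the phonation, column (place) and row (manner) for a handful of symbols and joins them in the phonation, place, manner order used for the entries under Other Symbols. The `PULMONIC` dictionary and the `describe` helper are hypothetical names chosen for this example; they are not part of eSpeak NG.

```python
# Hypothetical lookup table: symbol -> (phonation, place, manner).
# Only a few entries from the pulmonic consonant chart are included.
PULMONIC = {
    "p": ("vls", "blb", "stp"),
    "b": ("mdv", "blb", "stp"),
    "m": ("mdv", "blb", "nas"),
    "s": ("vls", "alv", "sib frc"),
    "z": ("mdv", "alv", "sib frc"),
    "ɬ": ("vls", "alv", "lat frc"),
}


def describe(symbol: str) -> str:
    """Return the feature string for a symbol as 'phonation place manner'."""
    phonation, place, manner = PULMONIC[symbol]
    return f"{phonation} {place} {manner}"


print(describe("p"))  # vls blb stp
print(describe("z"))  # mdv alv sib frc
```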

Consonants (Non-Pulmonic)

blb lbd dnt alv pla rfx pal vel uvl phr glt
clk ʘ ǀ ǃ ǂ
lat clk ǁ
mdv imp ɓ ɗ ʄ ɠ ʛ
ejc ʈʼ ʡʼ
ejc frc θʼ ʃʼ ʂʼ χʼ
lat ejc frc ɬʼ

Other Symbols

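| Symbol | Alternative | Features |
|--------|-------------|----------|
| ʍ | ɰ̊ʷ | vls vel ptr apr |
| w | ɰʷ | mdv vel ptr apr |
| ɥ | | mdv pal ptr apr |
| ɧ | | vls vzd pla frc |
| ɫ | | mdv fzd alv lat apr |
| ɚ | | unr mid cnt rzd vwl |
| ɝ | | unr lmd cnt rzd vwl |
| k͡p | | vls lbv stp |
| ɡ͡b | | mdv lbv stp |
| ŋ͡m | | mdv lbv nas |
| p͡f | | vls bld afr |
| b͡v | | mdv bld afr |

Gemination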

Gemination is found in several languages, including Italian and Japanese. It is also present in the suprasegmental phonology where two identical consonants meet at a boundary inside or between words, as in “lamppost” and “evenness”.

Some linguists use the long suprasegmental for geminate consonants. The eSpeak NG convention is to reserve the length suprasegmentals for cases where consonant length is distinctive without gemination occurring.

Gemination is represented in eSpeak NG by duplicating the phoneme, with the first copy using the unx feature. For example, a geminated n is written n̚.n. This reflects how, for the stp and nas consonants, the mouth remains closed (unx) during the first of the geminated consonants.
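
The convention can be sketched as a small string helper. This is only an illustration: `geminate` is a hypothetical name, and the sketch assumes that the `.` in the example is the syllable break (sbr) listed under Rhythm.

```python
UNX = "\u031a"  # combining diacritic used by the IPA for "no audible release" (unx)
SBR = "."       # assumed to be the syllable break (sbr) between the two copies


def geminate(phoneme: str) -> str:
    """Duplicate a phoneme, marking the first copy as unx."""
    return f"{phoneme}{UNX}{SBR}{phoneme}"


print(geminate("n"))  # n̚.n
```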

Vowels

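|     | fnt | cnt | bck |
|-----|-----|-----|-----|
| hgh | i y | ɨ ʉ | ɯ u |
| smh | ɪ ʏ | | ʊ |
| umd | e ø | ɘ ɵ | ɤ o |
| mid | | ə | |
| lmd | ɛ œ | ɜ ɞ | ʌ ɔ |
| sml | æ | ɐ | |
| low | a ɶ | | ɑ ɒ |

Where a cell holds two symbols, the one on the left is unr and the one on the right is rnd. Of the symbols shown alone, ʊ is rnd and the others are unr.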

NOTE: The smh vowels are more cnt than the other vowels. However, this distinction is not needed to classify these vowels, so is not included in the above table.
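
A vowel can be described the same way as the ɚ and ɝ entries under Other Symbols: rounding, height, backness, then vwl. The sketch below shows this for a few vowels from the chart. The `VOWELS` table and `vowel_features` function are hypothetical names used only for this example.

```python
# Hypothetical lookup table: vowel -> (rounding, height, backness).
VOWELS = {
    "i": ("unr", "hgh", "fnt"),
    "y": ("rnd", "hgh", "fnt"),
    "u": ("rnd", "hgh", "bck"),
    "ə": ("unr", "mid", "cnt"),
    "ɑ": ("unr", "low", "bck"),
}


def vowel_features(symbol: str) -> str:
    """Return the feature string for a vowel as 'rounding height backness vwl'."""
    rounding, height, backness = VOWELS[symbol]
    return f"{rounding} {height} {backness} vwl"


print(vowel_features("i"))  # unr hgh fnt vwl
print(vowel_features("u"))  # rnd hgh bck vwl
```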

Positioning Diacritics

The following IPA diacritics are only used by eSpeak NG to fill out positions in the IPA consonant and vowel charts. As such, a phoneme written with one of these diacritics is transcribed using the features of the chart position it ends up in, not using the features of the base phoneme plus a feature for each positioning diacritic.

Symbol Name
◌̟ advanced
◌̠ retracted
◌̈ centralized
◌̽ mid-centralized
◌̝ raised
◌̞ lowered

Features

Manner of Articulation

The manner of articulation is described in terms of several distinct feature types. The possible manners of articulation are:

| Manner of Articulation | Feature | Symbol | Features |
|------------------------|---------|--------|----------|
| nasal | nas | | pmc egs nsl occ |
| plosive (stop) | stp | | pmc egs orl occ |
| affricate | afr | | pmc egs orl occ frr |
| fricative | frc | | pmc egs orl frv |
| tap/flap | flp | | pmc egs orl fla |
| trill | trl | | pmc egs orl tri |
| approximant | apr | | pmc egs orl app |
| click | clk | | vlc igs orl |
| ejective | ejc | ◌ʼ | gtc egs orl occ |
| implosive | imp | | gtc igs |
| vowel | vwl | | pmc egs orl vow |

The features for these manners of articulation are provided for convenience, and to make it easier to describe the IPA consonants. Internally, the distinct feature types are used.

The imp consonants use the features of the base phoneme, with gtc and igs in place of the pmc and egs features. Thus, a nas imp is a gtc igs nsl occ.
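
The substitution for imp consonants can be sketched as follows. The feature lists for the pulmonic manners are copied from the table above; `MANNER_FEATURES` and `implosive` are hypothetical names for this example and do not reflect how eSpeak NG stores the features internally.

```python
# Distinct feature types for some pulmonic manners of articulation.
MANNER_FEATURES = {
    "nas": ["pmc", "egs", "nsl", "occ"],
    "stp": ["pmc", "egs", "orl", "occ"],
    "frc": ["pmc", "egs", "orl", "frv"],
    "apr": ["pmc", "egs", "orl", "app"],
}


def implosive(manner: str) -> list[str]:
    """Swap the pulmonic egressive airstream for a glottalic ingressive one."""
    swap = {"pmc": "gtc", "egs": "igs"}
    return [swap.get(feature, feature) for feature in MANNER_FEATURES[manner]]


print(implosive("nas"))  # ['gtc', 'igs', 'nsl', 'occ']
```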

The vwl phonemes are described using vowel height and backness features, while consonants (the other manners of articulation) are described using place of articulation features.

Additionally, the manner of articulation can be refined using the following features:

Feature Name Description
lat lateral The air flow is directed along the sides of the tongue.
sib sibilant The air flow is directed through the teeth with the tongue.

Air Flow

| Feature | Symbol | Name | Description |
|---------|--------|------|-------------|
| egs | ↑ | egressive | The air flow is moving outwards from the initiator to the target. |
| igs | ↓ | ingressive | The air flow is moving inwards from the target to the initiator. |

The ↑ and ↓ symbols are from the extended IPA[7]. They only need to be used when the air flow is different to the base IPA phoneme (e.g. using ↓ on pulmonic consonants).

Initiator

Feature Name Description
pmc pulmonic The diaphragm and lungs are used to generate the airstream.
gtc glottalic The glottis is used to generate the airstream.
vlc velaric The velum is closed and the tongue is used to generate the airstream.
pcv percussive There is no airstream used to produce this sound.

Target

Feature Name Description
nsl nasal The air flows through the nose.
orl oral The air flows through the mouth.

Manner

Feature Name Description
occ occlusive The air flow is blocked within the vocal tract.
frv fricative The air flow is constricted, causing turbulence.
fla flap A single tap of the tongue against the secondary articulator.
tri trill A rapid vibration of the primary articulator against the secondary articulator.
app approximant The vocal tract is narrowed at the place of articulation without being turbulent.
vow vowel The phoneme is articulated as a vowel instead of a consonant.

Phonation

The phonation features describe the degree to which the glottis (vocal cords) is open or closed.

| Feature | Symbol | Name | Description |
|---------|--------|------|-------------|
| vls | | voiceless | The glottis is fully open, such that the vocal cords do not vibrate. |
| brv | ◌̤ | breathy voice | The glottis is closed slightly, to produce a whispered or murmured sound. |
| slv | ◌̥ | slack voice | The glottis is opened wider than mdv, but not enough to be brv. |
| mdv | | modal voice | The glottis is opened to provide the optimal vibration of the vocal cords. |
| stv | ◌̬ | stiff voice | The glottis is closed narrower than mdv, but not enough to be crv. |
| crv | ◌̰ | creaky voice | The glottis is closed to produce a vocal or glottal fry. |
| glc | ʔ͡◌ | glottal closure | The glottis is fully closed. |

The IPA ◌̥ diacritic is also used to fill the vls spaces in the IPA consonant charts. Thus, when ◌̥ is used with a mdv consonant that does not have an equivalent vls consonant, the resulting consonant is vls, not slv.
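
The rule for ◌̥ can be sketched as a lookup on whether the mdv base consonant has a vls counterpart in the chart. The `VOICELESS_COUNTERPART` table below is a small hypothetical subset, and `ring_below_phonation` is a name invented for this example.

```python
# Hypothetical subset of mdv consonants that have a vls counterpart in the chart.
VOICELESS_COUNTERPART = {"b": "p", "d": "t", "ɡ": "k", "z": "s", "v": "f"}


def ring_below_phonation(base: str) -> str:
    """Phonation of a mdv consonant combined with the ring-below diacritic."""
    # With a vls counterpart available, the diacritic keeps its slv meaning;
    # otherwise it fills the empty vls position in the chart.
    return "slv" if base in VOICELESS_COUNTERPART else "vls"


print(ring_below_phonation("m"))  # vls
print(ring_below_phonation("b"))  # slv
```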

Place of Articulation

The place of articulation is described in terms of an active articulator and one or more passive articulators[9]. The possible places of articulation are:

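| Place of Articulation | Feature | Symbol | Active | Lips | Teeth | Passive |
|-----------------------|---------|--------|--------|------|-------|---------|
| bilabial | blb | | lbl | ulp | | |
| linguolabial | lgl | ◌̼ | lmn | ulp | | |
| labiodental | lbd | ◌̪ | lbl | | utt | |
| bilabial-labiodental | bld | | bld | ulp | utt | |
| interdental | idt | ◌̪͆ | lmn | | utt | |
| dental | dnt | ◌̪ | apc | | utt | |
| denti-alveolar | dta | | lmn | | utt | alf |
| alveolar | alv | | lmn | | | alf |
| apico-alveolar | apa | ◌̺ | apc | | | alf |
| palato-alveolar | pla | | lmn | | | alb |
| apical retroflex | arf | | sac | | | alb |
| retroflex | rfx | ◌̺ | apc | | | hpl |
| alveolo-palatal | alp | | dsl | | | alb |
| palatal | pal | | dsl | | | hpl |
| velar | vel | | dsl | | | spl |
| labio-velar | lbv | | dsl | ulp | | spl |
| uvular | uvl | | dsl | | | uvu |
| pharyngeal | phr | | rdl | | | prx |
| epiglotto-pharyngeal | epp | | lyx | | | prx |
| (ary-)epiglottal | epg | | lyx | | | egs |
| glottal | glt | | lyx | | | gts |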

The features for these places of articulation are provided for convenience, and to make it easier to describe the IPA consonants. Internally, the active and passive articulators are used.
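
A sketch of this expansion, using a few rows from the table above, might look as follows. `PLACE_ARTICULATORS` and `expand_place` are hypothetical names for this example; the internal representation used by eSpeak NG is not shown in this document.

```python
# Hypothetical subset of the table: place -> (active, passive articulators).
PLACE_ARTICULATORS = {
    "blb": ("lbl", ("ulp",)),
    "lbd": ("lbl", ("utt",)),
    "dta": ("lmn", ("utt", "alf")),
    "alv": ("lmn", ("alf",)),
    "vel": ("dsl", ("spl",)),
    "lbv": ("dsl", ("ulp", "spl")),
}


def expand_place(place: str) -> list[str]:
    """Expand a place of articulation into its active and passive articulators."""
    active, passive = PLACE_ARTICULATORS[place]
    return [active, *passive]


print(expand_place("alv"))  # ['lmn', 'alf']
print(expand_place("lbv"))  # ['dsl', 'ulp', 'spl']
```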

The ◌̪ diacritic is lbd when used on blb consonants, and dnt when used on alv consonants.
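
The context-dependent reading of ◌̪ amounts to a two-entry mapping on the place of the base consonant. The sketch below is illustrative only; `apply_dental_diacritic` is a hypothetical helper, and it rejects any other place because this document only defines the diacritic for blb and alv consonants.

```python
# Place of articulation that results from adding the dental diacritic (U+032A).
DENTAL_DIACRITIC_PLACE = {"blb": "lbd", "alv": "dnt"}


def apply_dental_diacritic(place: str) -> str:
    """Return the place of articulation after applying the dental diacritic."""
    try:
        return DENTAL_DIACRITIC_PLACE[place]
    except KeyError:
        raise ValueError(f"no rule for the dental diacritic on {place!r}") from None


print(apply_dental_diacritic("blb"))  # lbd
print(apply_dental_diacritic("alv"))  # dnt
```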

The bld place of articulation is used for afr consonants that have a blb onset and a lbd release, e.g. in the p͡f consonant.

The alv consonant is lmn as found in French and Spanish, while apa is apc as found in English; as such, ◌̻ (laminal) is not needed.

NOTE: The IPA charts make a distinction between pharyngeal and epiglottal consonants, but Wikipedia does not. This model uses the Wikipedia descriptions.

Active Articulators

Feature Name Articulator
lbl labial lower lip
lmn laminal tongue blade
apc apical tongue tip
sac subapical underside of the tongue
dsl dorsal tongue body
rdl radical tongue root
lyx laryngeal larynx

Passive Articulators

Feature Articulator
ulp upper lip
utt upper teeth
alf alveolar ridge (front)
alb alveolar ridge (back)
hpl hard palate
spl soft palate
uvu uvula
prx pharynx
egs epiglottis
gts glottis

Rounding and Labialization

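| Feature | Symbol | Name | Rounded | Position |
|---------|--------|------|---------|----------|
| unr | | unrounded | No | Close to the jaw. |
| ptr | ◌ʷ, ◌ᶣ | protruded | Yes | Protrude outward from the jaw. |
| cmp | ◌ᵝ | compressed | Yes | Close to the jaw. |
| rnd | | rounded | Yes | ptr if bck or cnt; cmp if fnt |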

The unr and rnd features are used for vowels to describe their default labialization. Consonants are unr by default, and can use the ◌ʷ, ◌ᶣ and ◌ᵝ annotations to specify the type of labialization. Vowels can use these to change their labialization from the default one specified by rnd.
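
The default meaning of rnd given in the table can be sketched as a function over a vowel's feature list. `resolve_rounding` is a hypothetical helper written for this example. It treats any vowel that is not fnt as ptr, which matches the table for bck and cnt vowels.

```python
def resolve_rounding(features: list[str]) -> list[str]:
    """Replace rnd with ptr or cmp according to the vowel's backness."""
    if "rnd" not in features:
        return list(features)
    kind = "cmp" if "fnt" in features else "ptr"
    return [kind if feature == "rnd" else feature for feature in features]


print(resolve_rounding(["rnd", "hgh", "bck", "vwl"]))  # ['ptr', 'hgh', 'bck', 'vwl']
print(resolve_rounding(["rnd", "hgh", "fnt", "vwl"]))  # ['cmp', 'hgh', 'fnt', 'vwl']
```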

Vowel Height

Feature Name
hgh close (high)
smh near-close (semi-high)
umd close-mid (upper-mid)
mid mid
lmd open-mid (lower-mid)
sml near-open (semi-low)
low open (low)

Vowel Backness

Feature Name
fnt front
cnt center
bck back

Syllabicity

Feature Symbol Name
syl ◌̩ syllabic
nsy ◌̯ non-syllabic

Consonant Release

Feature Symbol Name
frr fricative release
asp ◌ʰ aspirated
nrs ◌ⁿ nasal release
lrs ◌ˡ lateral release
unx ◌̚ no audible release (unexploded)

Fortis and Lenis

| Feature | Kirshenbaum | IPA | Name |
|---------|-------------|-----|------|
| fts | | ◌͈ | fortis |
| lns | | ◌͉ | lenis |

The extended IPA[7] ◌͈ and ◌͉ diacritics are used to specify greater (fts) and lesser (lns) oral pressure than the unmodified voiced or voiceless phoneme. This distinction is made by the Ewe, Tabasaran, Archi, and other languages[8].

Where fortis and lenis are used to contrast consonant durations (e.g. in the Jawoyn, Ojibwe, and Zurich German languages[8]), the length suprasegmentals are used instead.

Co-articulation

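| Feature | Kirshenbaum | IPA | Name |
|---------|-------------|-----|------|
| mrd | | ◌̹ | more rounded |
| lrd | | ◌̜ | less rounded |
| pzd | pzd | ◌ʲ | palatalized |
| vzd | vzd | ◌ˠ | velarized |
| fzd | fzd | ◌ˤ | pharyngealized |
| atr | | ◌̘ | advanced tongue root |
| rtr | | ◌̙ | retracted tongue root |
| nzd | nzd | ◌̃ | nasalized |
| rzd | rzd | ◌˞ | rhoticized |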

NOTE: The IPA supports ◌̴ for velarized or pharyngealized consonants. Unicode has deprecated this combining character, while keeping the combined forms. As such, only the combined forms are supported, using the fzd feature. Cainteoir Text-to-Speech uses vfz for this combining character, but eSpeak NG does not preserve the distinction between ◌ˤ and ◌̴.

Suprasegmentals

Stress

| Feature | Kirshenbaum | IPA | Name |
|---------|-------------|-----|------|
| st1 | | ˈ◌ | primary stress |
| st2 | | ˌ◌ | secondary stress |
| st3 | | ˈˈ◌ | extra stress |

Length

| Feature | Kirshenbaum | IPA | Name |
|---------|-------------|-----|------|
| est | | ◌̆ | extra short |
| hlg | | ◌ˑ | half-long |
| lng | lng | ◌ː | long |

Rhythm

| Feature | Kirshenbaum | IPA | Name |
|---------|-------------|-----|------|
| sbr | | ◌.◌ | syllable break |
| lnk | | ◌‿◌ | linked (no break) |

Intonation

| Feature | Kirshenbaum | IPA | Name |
|---------|-------------|-----|------|
| fbr | | \| | minor (foot) break |
| ibr | | ‖ | major (intonation) break |
| glr | | ↗ | global rise |
| glf | | ↘ | global fall |

Tones

Tone IPA Start Middle End
extra high (top) ◌˥ ts5 tm5 te5
high ◌˦ ts4 tm4 te4
mid ◌˧ ts3 tm3 te3
low ◌˨ ts2 tm2 te2
extra low (bottom) ◌˩ ts1 tm1 te1
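
As an illustration of how the start, middle and end features could describe a contour tone, the sketch below maps a sequence of two or three IPA tone letters to features. How eSpeak NG actually combines these features is not described in this document, so the composition rule and the `contour_features` name are assumptions made for this example.

```python
# Tone level for each IPA tone letter, numbered as in the table above.
TONE_LEVEL = {"˥": 5, "˦": 4, "˧": 3, "˨": 2, "˩": 1}


def contour_features(tone_letters: str) -> list[str]:
    """Map two or three IPA tone letters to start/middle/end features."""
    if len(tone_letters) not in (2, 3):
        raise ValueError("expected two or three tone letters")
    levels = [TONE_LEVEL[letter] for letter in tone_letters]
    # Assumption: the first letter is the start, the last is the end,
    # and a letter between them is the middle.
    prefixes = ["ts", "te"] if len(levels) == 2 else ["ts", "tm", "te"]
    return [f"{prefix}{level}" for prefix, level in zip(prefixes, levels)]


print(contour_features("˩˥"))   # ['ts1', 'te5']
print(contour_features("˧˩˧"))  # ['ts3', 'tm1', 'te3']
```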

Tone Stepping

| Feature | Kirshenbaum | IPA | Name |
|---------|-------------|-----|------|
| dst | | ꜜ◌ | downstep |
| ust | | ꜛ◌ | upstep |

References

  1. Kirshenbaum, Evan, Representing IPA phonetics in ASCII (HTML). 1993.

  2. Kirshenbaum, Evan, Representing IPA phonetics in ASCII (PDF). 2001.

  3. International Phonetic Association, The International Phonetic Alphabet and the IPA Chart. 2015. Creative Commons Attribution-Sharealike 3.0 Unported License (CC-BY-SA).

  4. Wikipedia. International Phonetic Alphabet. 2017. Creative Commons Attribution-Sharealike 3.0 Unported License (CC-BY-SA).

  5. Dunn, R. H., Cainteoir Text-to-Speech Phoneme Features. 2013-2015.

  6. Wikipedia. Voiced glottal fricative. 2017. Creative Commons Attribution-Sharealike 3.0 Unported License (CC-BY-SA).

  7. Wikipedia. Extensions to the International Phonetic Alphabet. 2017. Creative Commons Attribution-Sharealike 3.0 Unported License (CC-BY-SA).

  8. Wikipedia. Fortis and lenis. 2017. Creative Commons Attribution-Sharealike 3.0 Unported License (CC-BY-SA).

  9. Wikipedia. Place of articulation. 2017. Creative Commons Attribution-Sharealike 3.0 Unported License (CC-BY-SA).