eSpeak NG is an open source speech synthesizer that supports more than a hundred languages and accents.

Phonemes


Evan Kirshenbaum created an ASCII transcription of the International Phonetic Alphabet (IPA)[1], [2]. As well as using ASCII characters for specific IPA phonemes, this transcription provides a set of 3-letter feature abbreviations allowing a phoneme to be described as a sequence of features.

This document is grouped into three sections. The first section displays the IPA charts using the feature abbreviations in place of the usual row and column names, showing the IPA phoneme at each position in the chart. This makes it easier to look up the features for a given IPA phoneme.

The second section lists the features and their associated names. The Wikipedia IPA[4] article can be used as a starting point into the various phonetic topics contained in this document.

The third section describes the properties (named values) used to describe the phonemes. These, in addition to the features, should allow all possible phonemes from any language to be described.

The goal of this document is not to provide a detailed guide on phonetics. Instead, it is designed to be a transcription guide on how to specify phonemes in a language or voice so that the narrow transcriptions are consistent between the two.

Consonants

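| | blb | lbd | dnt | alv | pla | rfx | alp | pal | vel | uvl | phr | glt |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| nas | m | ɱ | | n | | ɳ | | ɲ | ŋ | ɴ | | |
| stp | p b | | | t d | | ʈ ɖ | | c ɟ | k ɡ | q ɢ | ʡ | ʔ |
| sib afr | | | | t͡s d͡z | t͡ʃ d͡ʒ | ʈ͡ʂ ɖ͡ʐ | t͡ɕ d͡ʑ | | | | | |
| afr | p͡ɸ b͡β | p̪͡f b̪͡v | t͡θ d͡ð | | | | | c͡ç ɟ͡ʝ | k͡x ɡ͡ɣ | q͡χ ɢ͡ʁ | ʡ͡ħ ʡ͡ʕ | ʔ͡h |
| lat afr | | | | t͡ɬ d͡ɮ | | | | | | | | |
| sib frc | | | | s z | ʃ ʒ | ʂ ʐ | ɕ ʑ | | | | | |
| frc | ɸ β | f v | θ ð | | | | | ç ʝ | x ɣ | χ ʁ | ħ ʕ | h ɦ |
| lat frc | | | | ɬ ɮ | | | | | | | | |
| apr | | ʋ | | ɹ | | ɻ | | j | ɰ | | | |
| lat apr | | | | l | | ɭ | | ʎ | ʟ | | | |
| flp | | | | ɾ | | ɽ | | | | | | |
| lat flp | | | | ɺ | | | | | | | | |
| trl | ʙ | | | r | | | | | | ʀ | ʜ ʢ | |
| clk | ʘ | | ǀ | ǃ | | | | ǂ | | | | |
| lat clk | | | | ǁ | | | | | | | | |
| imp | ɓ | | | ɗ | | | | ʄ | ɠ | ʛ | | |
| ejc | | | | | | ʈʼ | | | | | ʡʼ | |
| ejc frc | | | θʼ | | ʃʼ | ʂʼ | | | | χʼ | | |
| lat ejc frc | | | | ɬʼ | | | | | | | | |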

Symbols to the left are vls, and to the right are vcd.
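
As a minimal sketch of how a chart cell can be read, the following Python snippet turns a row label, a column label, and a voiceless/voiced pair into feature sequences. The function name `cell_features` is hypothetical, the pair is assumed to be written with a space between its two symbols, and the voice, place, manner ordering is an assumption modelled on the entries in the Other Symbols table; none of this is part of eSpeak NG itself.

```python
def cell_features(row: str, column: str, cell: str) -> dict[str, str]:
    """Map each phoneme in a two-symbol chart cell to a feature string.

    row    -- manner label from the chart, e.g. "stp" or "lat frc"
    column -- place label from the chart, e.g. "blb"
    cell   -- voiceless/voiced pair separated by whitespace, e.g. "p b"
    """
    symbols = cell.split()
    if len(symbols) != 2:
        raise ValueError("expected a voiceless/voiced pair such as 'p b'")
    voiceless, voiced = symbols
    # Left-hand symbol is vls, right-hand symbol is vcd.
    return {
        voiceless: f"vls {column} {row}",
        voiced: f"vcd {column} {row}",
    }


print(cell_features("stp", "blb", "p b"))
# {'p': 'vls blb stp', 'b': 'vcd blb stp'}
```

Cells that hold a single symbol are deliberately rejected here, because the left/right rule alone does not say which voicing such a symbol has.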

Other Symbols

Symbol Alternative Features
ʍ ɰ̊ʷ vls vel ptr apr
w ɰʷ vcd vel ptr apr
ɥ vcd pal ptr apr
ɧ vls vzd pla frc
ɫ vcd fzd alv lat apr
ɚ unr mid cnt rzd vwl
ɝ unr lmd cnt rzd vwl
k͡p vls lbv stp
ɡ͡b vcd lbv stp
ŋ͡m vcd lbv nas
p͡f vls bld afr
b͡v vcd bld afr

Gemination

Gemination is found in several languages including Italian and Japanese. It also arises in the suprasegmental phonology where two word parts meet, as in “lamppost” (lamp + post) and “evenness” (even + ness).

Some linguists use the long suprasegmental for geminate consonants. The eSpeak NG convention is to reserve the length suprasegmentals for cases where consonant length is distinctive without gemination occurring.

Gemination is represented in eSpeak NG by duplicating the phoneme, with the first copy carrying the unx feature; for example, n̚.n for a geminated n. This reflects how, with the stp and nas consonants, the mouth remains closed (unx) for the first of the geminated consonants.
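
The duplication convention above can be sketched in a few lines of Python. The helper name `geminate` is hypothetical, and the use of the syllable break between the two copies simply follows the n̚.n example; this is an illustration of the notation, not eSpeak NG code.

```python
UNX = "\u031A"  # COMBINING LEFT ANGLE ABOVE, the IPA "no audible release" mark
SBR = "."       # syllable break, as used in the n̚.n example


def geminate(phoneme: str) -> str:
    """Return the geminated transcription: unreleased copy, break, plain copy."""
    return f"{phoneme}{UNX}{SBR}{phoneme}"


print(geminate("n"))  # n̚.n
```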

Manner of Articulation

Feature Symbol Name
nas nasal
stp plosive (stop)
afr affricate
frc fricative
flp tap/flap
trl trill
apr approximant
clk click
ejc ◌ʼ ejective
imp implosive
vwl vowel

The vwl phonemes are described using vowel height and backness features, while consonants (the other manners of articulation) are described using place of articulation features.

Additionally, the manner of articulation can be refined using the following features:

Feature Name
lat lateral
sib sibilant

Place of Articulation

Feature Name
blb bilabial
lbd labiodental
bld bilabial-labiodental
dnt dental
alv alveolar
pla palato-alveolar
rfx retroflex
alp alveolo-palatal
pal palatal
vel velar
lbv labio-velar
uvl uvular
phr pharyngeal
glt glottal

The bld place of articulation is used for afr consonants that have a blb onset and a lbd release, e.g. in the German p͡f consonant.

NOTE: The IPA charts make a distinction between pharyngeal and epiglottal consonants, but Wikipedia does not. This model uses the Wikipedia descriptions.

Voice

Feature Name
vls voiceless
vcd voiced
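
To illustrate how a feature sequence reads as a conventional phonetic description, the sketch below expands the manner, place, and voice abbreviations listed so far into their names. The dictionary is copied from the tables above; the function name `describe` is hypothetical, and features from later sections (vowel height, rounding, and so on) are intentionally left out, so they raise a `KeyError`.

```python
FEATURE_NAMES = {
    # Manner of articulation, including the lat and sib refinements
    "nas": "nasal", "stp": "plosive", "afr": "affricate", "frc": "fricative",
    "flp": "tap/flap", "trl": "trill", "apr": "approximant", "clk": "click",
    "ejc": "ejective", "imp": "implosive", "vwl": "vowel",
    "lat": "lateral", "sib": "sibilant",
    # Place of articulation
    "blb": "bilabial", "lbd": "labiodental", "bld": "bilabial-labiodental",
    "dnt": "dental", "alv": "alveolar", "pla": "palato-alveolar",
    "rfx": "retroflex", "alp": "alveolo-palatal", "pal": "palatal",
    "vel": "velar", "lbv": "labio-velar", "uvl": "uvular",
    "phr": "pharyngeal", "glt": "glottal",
    # Voice
    "vls": "voiceless", "vcd": "voiced",
}


def describe(features: str) -> str:
    """Expand a space-separated feature sequence into feature names."""
    return " ".join(FEATURE_NAMES[feature] for feature in features.split())


print(describe("vls blb stp"))      # voiceless bilabial plosive
print(describe("vcd alv lat apr"))  # voiced alveolar lateral approximant
```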

Vowels

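| | fnt | cnt | bck |
|---|---|---|---|
| hgh | i y | ɨ ʉ | ɯ u |
| smh | ɪ ʏ | | ʊ |
| umd | e ø | ɘ ɵ | ɤ o |
| mid | | ə | |
| lmd | ɛ œ | ɜ ɞ | ʌ ɔ |
| sml | æ | ɐ | |
| low | a ɶ | | ɑ ɒ |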

Symbols to the left are unr, and to the right are rnd.

NOTE: The smh vowels are more cnt than the other vowels. However, this distinction is not needed to classify these vowels, so is not included in the above table.

Diacritics

Articulation

Feature Symbol Name
lgl ◌̼ linguolabial
idt ◌̪͆ interdental
◌̪ dental
apc ◌̺ apical
lmn ◌̻ laminal
◌̟ advanced
◌̠ retracted
◌̈ centralized
◌̽ mid-centralized
◌̝ raised
◌̞ lowered

The articulations that do not have a corresponding feature name are recorded using the features of their new location in the consonant or vowel charts, not using the features of the base phoneme.

Air Flow

Feature Symbol Name
egs ↑ egressive
igs ↓ ingressive

The ↑ and ↓ symbols are from the extended IPA[7]. They only need to be used when the air flow is different to the base IPA phoneme (e.g. using ↓ on pulmonic consonants).

Phonation

Feature Symbol Name
brv ◌̤ breathy voice
slv ◌̥ slack voice
stv ◌̬ stiff voice
crv ◌̰ creaky voice
glc ʔ͡◌ glottal closure

The IPA ◌̥ diacritic is also used to fill the vls spaces in the IPA consonant charts. Thus, when ◌̥ is used with a vcd consonant that does not have an equivalent vls consonant, the resulting consonant is vls, not slv.
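
The rule above can be sketched as a small lookup. The set `HAS_VLS_EQUIVALENT` is a hypothetical, deliberately partial list of voiced consonants that share a chart cell with a voiceless partner, and the `slv` branch is an assumption read off the phonation table rather than something the text states outright.

```python
# Partial, illustrative set: voiced consonants that have a voiceless
# counterpart in the consonant chart (p/b, t/d, k/ɡ, s/z, f/v).
HAS_VLS_EQUIVALENT = {"b", "d", "ɡ", "z", "v"}


def ring_below_feature(base: str) -> str:
    """Return the feature that the ring-below diacritic adds to a voiced consonant.

    If the base already has a voiceless chart partner, the diacritic is read
    as slack voice (slv); otherwise it fills the empty voiceless slot (vls).
    """
    return "slv" if base in HAS_VLS_EQUIVALENT else "vls"


print(ring_below_feature("m"))  # vls
print(ring_below_feature("b"))  # slv
```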

Features

Rounding and Labialization

Feature Symbol Name Rounded Position
unr unrounded No Close to the jaw.
ptr ◌ʷ, ◌ᶣ protruded Yes Protrude outward from the jaw.
cmp ◌ᵝ compressed Yes Close to the jaw.
rnd rounded Yes ptr if bck or cnt; cmp if fnt

The unr and rnd features are used for vowels to describe their default labialization. Consonants are unr by default, and can use the ◌ʷ, ◌ᶣ and ◌ᵝ annotations to specify the type of labialization. Vowels can use these to change their labialization from the default one specified by rnd.
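
As a small sketch of the `rnd` row in the table above, the function below resolves the default rounding type from a vowel's backness. The name `resolve_rounding` is hypothetical; the mapping itself is taken directly from the table (ptr if bck or cnt; cmp if fnt).

```python
def resolve_rounding(backness: str) -> str:
    """Return the rounding type implied by rnd for a vowel of the given backness."""
    if backness in ("bck", "cnt"):
        return "ptr"  # protruded
    if backness == "fnt":
        return "cmp"  # compressed
    raise ValueError(f"unknown backness feature: {backness!r}")


print(resolve_rounding("bck"))  # ptr
print(resolve_rounding("fnt"))  # cmp
```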

Additionally, the degree of rounding/labialization can be specified using the following features:

Feature Symbol Name
mrd ◌̹ more rounded
lrd ◌̜ less rounded

Vowel Height

Feature Name
hgh close (high)
smh near-close (semi-high)
umd close-mid (upper-mid)
mid mid
lmd open-mid (lower-mid)
sml near-open (semi-low)
low open (low)

Vowel Backness

Feature Name
fnt front
cnt center
bck back

Syllabicity

Feature Symbol Name
syl ◌̩ syllabic
nsy ◌̯ non-syllabic

Consonant Release

Feature Symbol Name
frr fricative release
asp ◌ʰ aspirated
nrs ◌ⁿ nasal release
lrs ◌ˡ lateral release
unx ◌̚ no audible release (unexploded)

Co-articulation

Feature Symbol Name Co-Articulator Type
pzd ◌ʲ palatalized hpl Passive Articulator
vzd ◌ˠ velarized spl Passive Articulator
fzd ◌ˤ pharyngealized prx Passive Articulator
nzd ◌̃ nasalized nsl Target
rzd ◌˞ rhoticized rfx Place of Articulation

Additionally, the tongue root position can be specified using the following features:

Feature Symbol Name
atr ◌̘ advanced tongue root
rtr ◌̙ retracted tongue root

Fortis and Lenis

Feature Symbol Name
fts ◌͈ fortis
lns ◌͉ lenis

The extended IPA[7] ◌͈ and ◌͉ diacritics are used to specify greater (fts) and lesser (lns) oral pressure, respectively, than the unmodified voiced or voiceless phoneme. This distinction is made by the Ewe, Tabasaran, Archi, and other languages[8].

Where fortis and lenis are used to contrast consonant durations (e.g. in the Jawoyn, Ojibwe, and Zurich German languages[8]), the length suprasegmentals are used instead.

Suprasegmentals

Stress

Feature Symbol Name
st1 ˈ◌ primary stress
st2 ˌ◌ secondary stress
st3 ˈˈ◌ extra stress

Length

Feature Symbol Name
est ◌̆ extra short
hlg ◌ˑ half-long
lng ◌ː long

Rhythm

Feature Symbol Name
sbr ◌.◌ syllable break
lnk ◌‿◌ linked (no break)

Intonation

Feature Symbol Name
fbr | minor (foot) break
ibr ‖ major (intonation) break
glr ↗ global rise
glf ↘ global fall

Tone Stepping

Feature Symbol Name
ust ꜛ◌ upstep
dst ꜜ◌ downstep

Properties

Tones

Tones are defined using the following 3 properties:

tone_start  <value>
tone_middle <value>
tone_end    <value>

The <value> field for these properties is a number with one of the following values:

Tone Symbol <value>
extra high (top) ◌˥ 5
high ◌˦ 4
mid ◌˧ 3
low ◌˨ 2
extra low (bottom) ◌˩ 1

A level tone can be specified by just using the tone_start value. A rising or falling tone can be specified using the tone_start and tone_end values. A rising-falling (peaking) or falling-rising (dipping) tone can be specified using all three values.
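
The three cases can be illustrated with a short Python sketch that turns one, two, or three tone levels into property lines. The helper `tone_properties` is hypothetical, and the single-space separator between property name and value is an assumption; only the property names and the 1 to 5 value range come from the text above.

```python
def tone_properties(*levels: int) -> list[str]:
    """Build tone property lines for a level, contour, or peaking/dipping tone."""
    names_by_count = {
        1: ("tone_start",),
        2: ("tone_start", "tone_end"),
        3: ("tone_start", "tone_middle", "tone_end"),
    }
    if len(levels) not in names_by_count:
        raise ValueError("a tone takes one, two, or three levels")
    for level in levels:
        if level not in range(1, 6):
            raise ValueError(f"tone level must be between 1 and 5: {level}")
    names = names_by_count[len(levels)]
    return [f"{name} {level}" for name, level in zip(names, levels)]


print(tone_properties(3))        # level mid tone
print(tone_properties(5, 1))     # falling tone, extra high to extra low
print(tone_properties(2, 4, 2))  # peaking tone, low to high to low
```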

References

  1. Kirshenbaum, Evan, Representing IPA phonetics in ASCII (HTML). 1993.

  2. Kirshenbaum, Evan, Representing IPA phonetics in ASCII (PDF). 2001.

  3. International Phonetic Association, The International Phonetic Alphabet and the IPA Chart. 2015. Creative Commons Attribution-Sharealike 3.0 Unported License (CC-BY-SA).

  4. Wikipedia. International Phonetic Alphabet. 2017. Creative Commons Attribution-Sharealike 3.0 Unported License (CC-BY-SA).

  5. Dunn, R. H., Cainteoir Text-to-Speech Phoneme Features. 2013-2015.

  6. Wikipedia. Voiced glottal fricative. 2017. Creative Commons Attribution-Sharealike 3.0 Unported License (CC-BY-SA).

  7. Wikipedia. Extensions to the International Phonetic Alphabet. 2017. Creative Commons Attribution-Sharealike 3.0 Unported License (CC-BY-SA).

  8. Wikipedia. Fortis and lenis. 2017. Creative Commons Attribution-Sharealike 3.0 Unported License (CC-BY-SA).

  9. Wikipedia. Place of articulation. 2017. Creative Commons Attribution-Sharealike 3.0 Unported License (CC-BY-SA).