
Fix and update the phontab.md documentation; merge in the relevant parts of the phonemes documentation.

master
Reece H. Dunn 9 years ago
parent
commit
75719bf814
6 changed files with 468 additions and 958 deletions
  1. Makefile.am (+1, -0)
  2. docs/index.md (+0, -1)
  3. docs/phonemes.html (+0, -168)
  4. docs/phonemes.md (+0, -109)
  5. docs/phontab.html (+0, -387)
  6. docs/phontab.md (+467, -293)

Makefile.am (+1, -0)

@@ -65,6 +65,7 @@ docs: docs/index.html \
docs/add_language.html \
docs/dictionary.html \
docs/mbrola.html \
docs/phontab.html \
docs/voices.html \
src/espeak-ng.1.html \
README.html \

docs/index.md (+0, -1)

@@ -6,7 +6,6 @@
- [Text to Phoneme Translation](dictionary.md)
- [Voice Files](voices.md)
- [MBROLA Voices](mbrola.md)
- [Phonemes](phonemes.md)
- [Phoneme Tables](phontab.md)
- [Intonation](intonation.md)
- [Markup Tags](ssml.md)

docs/phonemes.html (+0, -168)

@@ -1,168 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>

<head>
<title>eSpeak: Phonemes</title>
<meta name="GENERATOR" content="Quanta Plus">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>

<A href="docindex.html">Back</A>
<hr>
<h2>PHONEMES</h2>
<hr>
In general a different set of phonemes can be defined for each language.
<p>
In most cases different languages inherit the same basic set of consonants. They can add to these or modify them as needed.
<p>
The phoneme mnemonics are based on the scheme by Kirshenbaum which represents International Phonetic Alphabet symbols using ascii characters. See: <a href="http://www.kirshenbaum.net/IPA/ascii-ipa.pdf">www.kirshenbaum.net/IPA/ascii-ipa.pdf</a>.
<p>
Phoneme mnemonics can be used directly in the text input to <strong>espeak-ng</strong>. They are enclosed within double square brackets. Spaces are used to separate words, and all stressed syllables must be marked explicitly. eg:<br>
<code>[[D,Is Iz sVm f@n'EtIk t'Ekst 'InpUt]]</code>
<h3>English Consonants</h3>
<table>
<tbody valign=top>
<tr>
<td width=25><code>[p]</code><td width=150>
<td width=25><code>[b]</code><td width=150>
<tr>
<td><code>[t]</code><td>
<td><code>[d]</code><td>
<tr>
<td><code>[tS]</code><td><b>ch</b>urch
<td><code>[dZ]</code><td><b>j</b>udge
<tr>
<td><code>[k]</code><td>
<td><code>[g]</code><td>
<tr><td><p>

<tr>
<td><code>[f]</code><td>
<td><code>[v]</code><td>
<tr>
<td><code>[T]</code><td><b>th</b>in
<td><code>[D]</code><td><b>th</b>is
<tr>
<td><code>[s]</code><td>
<td><code>[z]</code><td>
<tr>
<td><code>[S]</code><td><b>sh</b>op
<td><code>[Z]</code><td>plea<b>s</b>ure
<tr>
<td><code>[h]</code><td>
<tr><td><p>

<tr>
<td><code>[m]</code><td>
<td><code>[n]</code><td>
<tr>
<td><code>[N]</code><td>si<b>ng</b>
<tr>
<td><code>[l]</code><td>
<td><code>[r]</code><td><b>r</b>ed (Omitted if not immediately followed by a vowel).
<tr>
<td><code>[j]</code><td><b>y</b>es
<td><code>[w]</code><td>
<tr><td><p>

<tr><td colspan=3><strong>Some Additional Consonants</strong></td>
<p>
<tr>
<td><code>[C]</code><td>German i<b>ch</b>
<td><code>[x]</code><td>German bu<b>ch</b>
<tr>
<td><code>[l^]</code><td>Italian <b>gl</b>i
<td><code>[n^]</code><td>Spanish <b>ñ</b>

</tbody>
</table>


</tbody>
</table>


<h3>English Vowels</h3>
These are the phonemes which are used by the English spelling-to-phoneme translations (en_rules and en_list). In some varieties of English different phonemes may have the same sound, but they are kept separate because they may differ in another variety.
<p>
In rhotic accents, such as General American, the phonemes <code>[3:], [A@], [e@], [i@], [O@], [U@] </code> include the "r" sound.
<p>

<table>
<tbody valign=top>
<tr><td width=25><code>[@]</code>
<td width=60>alph<b>a</b><td width=400>schwa

<tr><td><code>[3]</code>
<td>bett<b>er</b><td>rhotic schwa. In British English this is the same as <code>[@]</code>, but it includes 'r' colouring in American and other rhotic accents. In these cases a separate <code>[r]</code> should not be included unless it is followed immediately by another vowel.

<tr><td><code>[3:]</code><td>n<b>ur</b>se
<tr><td><code>[@L]</code><td>simp<b>le</b>
<tr><td><code>[@2]</code><td>the<td>Used only for "the".
<tr><td><code>[@5]</code><td>to<td>Used only for "to".
<tr><td><p>

<tr><td><code>[a]</code><td>tr<b>a</b>p
<tr><td><code>[aa]</code><td>b<b>a</b>th<td>This is <code>[a]</code> in some accents, <code>[A:]</code> in others.
<tr><td><code>[a#]</code><td><b>a</b>bout<td>This may be <code>[@]</code> or may be a more open schwa.
<tr><td><code>[A:]</code><td>p<b>al</b>m
<tr><td><code>[A@]</code><td>st<b>ar</b>t
<tr><td><p>

<tr><td><code>[E]</code><td>dr<b>e</b>ss
<tr><td><code>[e@]</code><td>squ<b>are</b>
<tr><td><p>

<tr><td><code>[I]</code><td>k<b>i</b>t
<tr><td><code>[I2]</code><td><b>i</b>ntend<td>As <code>[I]</code>, but also indicates an unstressed syllable.
<tr><td><code>[i]</code><td>happ<b>y</b><td>An unstressed "i" sound at the end of a word.
<tr><td><code>[i:]</code><td>fl<b>ee</b>ce
<tr><td><code>[i@]</code><td>n<b>ear</b>
<tr><td><p>

<tr><td><code>[0]</code><td>l<b>o</b>t
<tr><td><p>

<tr><td><code>[V]</code><td>str<b>u</b>t
<tr><td><p>

<tr><td><code>[u:]</code><td>g<b>oo</b>se
<tr><td><code>[U]</code><td>f<b>oo</b>t
<tr><td><code>[U@]</code><td>c<b>ure</b>
<tr><td><p>

<tr><td><code>[O:]</code><td>th<b>ou</b>ght
<tr><td><code>[O@]</code><td>n<b>or</b>th
<tr><td><code>[o@]</code><td>f<b>or</b>ce
<tr><td><p>


<tr><td><code>[aI]</code><td>pr<b>i</b>ce
<tr><td><code>[eI]</code><td>f<b>a</b>ce
<tr><td><code>[OI]</code><td>ch<b>oi</b>ce
<tr><td><code>[aU]</code><td>m<b>ou</b>th
<tr><td><code>[oU]</code><td>g<b>oa</b>t

<tr><td><code>[aI@]</code><td>sc<b>ie</b>nce
<tr><td><code>[aU@]</code><td>h<b>our</b>
</tbody>
</table>

<h3>Some Additional Vowels</h3>
Other languages will have their own vowel definitions, eg:

<table>
<tbody valign=top>
<tr><td width=30><code>[e]</code><td>German <b>eh</b>, French <b>é</b>
<tr><td><code>[o]</code><td>German <b>oo</b>, French <b>o</b>
<tr><td><code>[y]</code><td>German <b>ü</b>, French <b>u</b>
<tr><td><code>[Y]</code><td>German <b>ö</b>, French <b>oe</b>

</tbody>
</table>
<p>
<code> [:] </code> can be used to lengthen a vowel, eg <code> [e:]</code>

</body>
</html>

docs/phonemes.md (+0, -109)

@@ -1,109 +0,0 @@
# Table of contents

* [Phonemes](#phonemes)
* [English Consonants](#english-consonants)
* [Some Additional Consonants](#some-additional-consonants)
* [English Vowels](#english-vowels)
* [Some Additional Vowels](#some-additional-vowels)

# Phonemes

In general a different set of phonemes can be defined for each language.

In most cases different languages inherit the same basic set of
consonants. They can add to these or modify them as needed.

The phoneme mnemonics are based on the scheme by Kirshenbaum which
represents International Phonetic Alphabet symbols using ascii
characters. See:
[www.kirshenbaum.net/IPA/ascii-ipa.pdf](http://www.kirshenbaum.net/IPA/ascii-ipa.pdf).

Phoneme mnemonics can be used directly in the text input to
**espeak-ng**. They are enclosed within double square brackets. Spaces
are used to separate words, and all stressed syllables must be marked
explicitly. eg:
\[[D,Is Iz sVm f@n'EtIk t'Ekst 'InpUt]\]
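
The bracketed input format described above can be sketched in Python. This is only an illustration, not part of eSpeak NG: the helper name `phoneme_input` is hypothetical, and it does nothing more than join phoneme-mnemonic words with spaces and wrap them in double square brackets.

```python
def phoneme_input(words):
    """Wrap phoneme-mnemonic words in [[ ]] for direct phoneme input.

    Hypothetical helper: words are separated by single spaces, and the
    caller is responsible for marking stressed syllables explicitly.
    """
    cleaned = [w.strip() for w in words if w.strip()]
    if not cleaned:
        raise ValueError("at least one phoneme word is required")
    return "[[" + " ".join(cleaned) + "]]"


text = phoneme_input(["D,Is", "Iz", "sVm", "f@n'EtIk", "t'Ekst", "'InpUt"])
print(text)
```

The resulting string can then be supplied as the text input to **espeak-ng**.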

## English Consonants

Phoneme|Phoneme
----------------|-----------
\[p\] | \[b\]
\[t\] | \[d\]
\[tS\] **ch**urch | \[dZ\] **j**udge
\[k\] | \[g\]
\[f\] | \[v\]
\[T\] **th**in | \[D\] **th**is
\[s\] | \[z\]
\[S\] **sh**op | \[Z\] plea**s**ure
\[h\] |
\[m\] | \[n\]
\[N\] si**ng** |
\[l\] | \[r\] **r**ed (Omitted if not immediately followed by a vowel). |
\[j\] **y**es | \[w\]

## Some Additional Consonants

Phoneme|Phoneme
------------------------|-----------
\[C\] German i**ch** | \[x\] German bu**ch**
\[l^\] Italian **gl**i | \[n^\] Spanish **ñ**

## English Vowels

These are the phonemes which are used by the English spelling-to-phoneme
translations (en\_rules and en\_list). In some varieties of English
different phonemes may have the same sound, but they are kept separate
because they may differ in another variety.

In rhotic accents, such as General American, the phonemes
`[3:], [A@], [e@], [i@], [O@], [U@]` include the "r" sound.

Phoneme|Pronunciation|Description
----------|-------------------------|--------------------------------------------
\[@\] | alph**a** | schwa
\[3\] | bett**er** | rhotic schwa. In British English this is the same as \[@\], but it includes 'r' colouring in American and other rhotic accents. In these cases a separate \[r\] should not be included unless it is followed immediately by another vowel.
\[3:\] | n**ur**se |
\[@L\] | simp**le** |
\[@2\] | the | Used only for "the".
\[@5\] | to | Used only for "to".
\[a\] | tr**a**p |
\[aa\] | b**a**th | This is \[a\] in some accents, \[A:\] in others.
\[a#\] | **a**bout | This may be \[@\] or may be a more open schwa.
\[A:\] | p**al**m |
\[A@\] | st**ar**t |
\[E\] | dr**e**ss |
\[e@\] | squ**are** |
\[I\] | k**i**t |
\[I2\] | **i**ntend | As \[I\], but also indicates an unstressed syllable.
\[i\] | happ**y** | An unstressed "i" sound at the end of a word.
\[i:\] | fl**ee**ce |
\[i@\] | n**ear** |
\[0\] | l**o**t |
\[V\] | str**u**t |
\[u:\] | g**oo**se |
\[U\] | f**oo**t |
\[U@\] | c**ure** |
\[O:\] | th**ou**ght |
\[O@\] | n**or**th |
\[o@\] | f**or**ce |
\[aI\] | pr**i**ce |
\[eI\] | f**a**ce |
\[OI\] | ch**oi**ce |
\[aU\] | m**ou**th |
\[oU\] | g**oa**t |
\[aI@\] | sc**ie**nce |
\[aU@\] | h**our** |

## Some Additional Vowels

Other languages will have their own vowel definitions, eg:

Phoneme|Description
--------|------------------------
\[e\] | German **eh**, French **é**
\[o\] | German **oo**, French **o**
\[y\] | German **ü**, French **u**
\[Y\] | German **ö**, French **oe**

**\[:\]** can be used to lengthen a vowel, eg \[e:\]

docs/phontab.html (+0, -387)

@@ -1,387 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>

<head>
<title>eSpeak: Phoneme tables</title>
<meta name="GENERATOR" content="Quanta Plus">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<A href="docindex.html">Back</A>
<hr>
<h2>PHONEME TABLES</h2>
<hr>
A phoneme table defines all the phonemes which are used by a language, together with their properties and the data for their production as sounds.
<p>
Generally each language has its own phoneme table, although additional phoneme tables can be used for different voices within the language. These alternatives are referenced from Voice files.
<p>
A phoneme table does not need to define all the phonemes used by a language. It can inherit the phonemes from a previously defined phoneme table. For example, a phoneme table may redefine (or add) some of the vowels that it uses, but inherit most of its consonants from a standard set.
<p>
The source files for the phoneme data are in the "phsource" directory in the espeakedit download package. "Vowel files", which are referenced in FMT(), VowelStart(), and VowelEnding() instructions are made using the espeakedit program.
</blockquote>
<p>&nbsp;<hr>
<h3>Phoneme files</h3>
The phoneme tables are defined in a master phoneme file, named <strong>phonemes</strong>. This starts with the <strong>base</strong> phoneme table followed by phoneme tables for other languages and voices. These inherit phonemes from the <strong>base</strong> table or previously defined tables.
<p>
In addition to phoneme definitions, the phoneme file can contain the following:
<dl>
<dt><strong>include</strong> &lt;filename&gt;
<dd>Includes the text of the specified file at this point. This allows different phoneme tables to be kept in different text files, for convenience. &lt;filename&gt; is a relative path. The included file can itself contain <strong>include</strong> statements.
<p>
<dt><strong>phonemetable</strong> &lt;name&gt; &lt;parent&gt;
<dd>Starts a new phoneme table, and ends the previous table.<br>
&lt;name&gt; Is the name of this phoneme table. This name is used in Voice files.<br>
&lt;parent&gt; Is the name of a previously defined phoneme table whose phoneme definitions are inherited by this one. The name <strong>base</strong> indicates the first (base) phoneme table.

</dl>
<p>&nbsp;<hr>
<h3>Phoneme definitions</h3>
Note: These new Phoneme definitions apply to eSpeak version 1.42.20 and later.
<p>
A phoneme table contains a list of phoneme definitions. Each starts with the keyword <strong>phoneme</strong> and the phoneme name (this is the name used in the pronunciation rules in a language's *_rules and *_list files), and ends with the keyword <strong>endphoneme</strong>. For example:
<pre> phoneme aI
vowel
starttype #a endtype #i
length 230
FMT(vowels/ai)
endphoneme

phoneme s
vls alv frc sibilant
voicingswitch z
lengthmod 3
Vowelin f1=0 f2=1700 -300 300 f3=-100 80
Vowelout f1=0 f2=1700 -300 250 f3=-100 80 rms=20

IF nextPh(isPause) THEN
WAV(ufric/s_)
ELIF nextPh(p) OR nextPh(t) OR nextPh(k) THEN
WAV(ufric/s!)
ENDIF
WAV(ufric/s)
endphoneme

</pre>
<p>
A phoneme definition contains both static properties and executed instructions. The instructions may contain conditional statements, so that the effect of the phoneme may be different depending on adjacent phonemes, whether the syllable is stressed, etc.
<p>
The instructions of a phoneme are interpreted in two different phases. In the first phase, the instructions may change the phoneme and replace it by a different phoneme. In the second phase, instructions are used to produce the sound for the phoneme.
<p>
The <strong>import_phoneme</strong> statement can be used to copy a previously defined phoneme from a specified phoneme table. For example:
<pre>
phoneme t
import_phoneme base/t[
endphoneme
</pre>
means: <code>phoneme t</code> in this phoneme table is a copy of <code>phoneme t[</code> from phoneme table "base". A <strong>length</strong> instruction can be used after <strong>import_phoneme</strong> to vary the length from the original.

<p>&nbsp;<hr>
<h3>Phoneme Properties</h3>

Within the phoneme definition the following lines may occur: ( (V) indicates only for vowels, (C) only for consonants)
<p>
<ul>
<dl><dt>Type. One of these must be present.
<dd><table>
<tr><TD width="100"><b>vowel</b></TD></tr>
<tr><TD><b>liquid</b></TD><td>semi-vowels, such as:&nbsp; <code> r, l, j, w</code></td></tr>
<tr><TD><b>nasal</b></TD><td>nasal eg:&nbsp; <code> m, n, N</code></td></tr>
<tr><TD><b>stop</b></TD><td>stop eg:&nbsp; <code> p, b, t, d, k, g</code></td></tr>
<tr><TD><b>frc</b></TD><td>fricative eg:&nbsp; <code> f, v, T, D, s, z, S, Z, C, x</code></td></tr>
<tr><TD><b>afr</b></TD><td>affricate eg:&nbsp; <code> tS, dZ</code></td></tr>
<tr><TD><b>pause</b></TD><td></td></tr>
<tr><TD><b>stress</b></TD><td>used for stress symbols, eg: ' , = %</td></tr>
<tr><TD><b>virtual</b></TD><td>Used to represent a class of phonemes.</td></tr>
</table>
</dl>
<dl><dt>Properties:
<dd><table>
<tr><TD width="100"><b>vls</b></TD><td>(C) voiceless eg. <code> p, t, k, f, s</code></TD></tr>
<tr><TD><b>vcd</b></TD><td>(C) voiced eg. <code> b, d, g, v, z</code></td></tr>
<tr><TD><b>sibilant</b></TD><td>(C) eg: <code> s, z, S, Z, tS, dZ</code></td></tr>
<tr><TD><b>palatal</b></TD><td>(C) A palatal or palatalized consonant.</td></tr>
<tr><TD><b>rhotic</b></TD><td>(C) An "r" type consonant.</td></tr>
<tr><TD><b>unstressed</b></TD><td>(V) This vowel is always unstressed, unless explicitly marked otherwise.</td></tr>
<tr><TD><b>nolink</b></TD><td>Prevent any linking from the previous phoneme.</td></tr>
<tr><TD><b>nopause</b></TD><td>Used in a <code>liquid</code> or <code>nasal</code> phoneme to prevent eSpeak inserting a short pause if a word starts with this phoneme and the previous word ends with a vowel.</td></tr>
<tr><TD><b>trill</b></TD><td>(C) Apply trill to the voicing.</td></tr>
</table>
</dl>
<dl><dt>Place of Articulation (C):
<dd><table>
<tr><TD><b>blb &nbsp;</b></TD><td width="100">bi-labial</TD>
<TD><b>lbd &nbsp;</b></TD><td width="110">labio-dental</TD>
<TD><b>dnt &nbsp;</b></TD><td>dental</TD></tr>

<tr><TD><b>alv</b></TD><td>alveolar</td>
<TD><b>rfx</b></TD><td>retroflex</TD>
<TD><b>pla</b></TD><td>palato-alveolar</TD></tr>

<tr><TD><b>pal</b></TD><td>palatal</td>
<TD><b>vel</b></TD><td>velar</TD>
<TD><b>lbv</b></TD><td>labio-velar</TD></tr>

<tr><TD><b>uvl</b></TD><td>uvular</td>
<TD><b>phr</b></TD><td>pharyngeal</TD>
<TD><b>glt</b></TD><td>glottal</TD></tr>

</table>
<p>
<dt><strong>starttype</strong> &lt;phoneme&gt;
<dd>Allocates this phoneme to a group so that conditions such as nextPh(#e) can test for any of a group of phonemes. Pre-defined groups for use for vowels are: #@ #a #e #i #o #u. Additional groups can be defined as phonemes with type "virtual".
<p>
<dt><strong>endtype</strong> &lt;phoneme&gt;
<dd>Allocates this phoneme to a group so that conditions such as prevPh(#e) can test for any of a group of phonemes. Pre-defined groups for use for vowels are: #@ #a #e #i #o #u. Additional groups can be defined as phonemes with type "virtual".
<p>
<dt><strong>lengthmod</strong> &lt;integer&gt;
<dd>(C) Determines how this consonant affects the length of the previous vowel. This value is used as index into the <code>length_mods</code> table in the <code>CalcLengths()</code> function in the eSpeak program.
<p>
<dt><strong>voicingswitch</strong> &lt;phoneme&gt;
<dd>This is used for some languages to change between voiced and unvoiced phonemes.

</dl>
</ul>

<p>&nbsp;<hr>
<h3>Phoneme Instructions</h3>

Phoneme Instructions may be included within conditional statements.
<p>
During the first phase of phoneme interpretation, an instruction which causes a change to a different phoneme will terminate the instructions. During the second phase, FMT() and WAV() instructions will terminate the instructions.
<ul>
<dl>
<dt><strong>length</strong> &lt;length&gt;
<dd>The relative length of the phoneme, typically about 140 for a short vowel and from 200 to 300 for a long vowel or diphthong. A length() instruction is needed for vowels. It is optional for consonants.
<p>
<dt><strong>ipa</strong> &lt;ipa string&gt;
<dd>In many cases, eSpeak makes IPA (International Phonetic Alphabet) phoneme names automatically from eSpeak phoneme names. If this is not correct, then the phoneme definition can include an <b>ipa</b> instruction to specify the correct IPA name. IPA strings may include non-ascii characters. They may also include characters specified by their character codes in the form U+ followed by 4 hexadecimal digits. For example a string: aU+0303 indicates 'a' with a 'combining tilde'.
<p>
<dt><strong>WAV</strong>(&lt;wav file&gt;, &lt;amplitude&gt;)
<dd>&nbsp;&lt;wav file&gt; is a path to a WAV file (22 kHz, 16 bits, mono) within <code>phsource/</code> which will be played to produce the sound. This method is used for unvoiced consonants. &lt;wav file&gt; does not include a .WAV filename extension, although the file to which it refers may or may not have one.<br>
&lt;amplitude&gt; is optional. It is a percentage change to the amplitude of the WAV file. So, <code>WAV(ufric/s, 50)</code> means: play file 'ufric/s.wav' at 50% amplitude.
<p>
<dt><strong>FMT</strong>(&lt;vowel file&gt;, &lt;amplitude&gt;)
<dd>&lt;vowel file&gt; is a path to a file (within <code>phsource/</code>) which defines how to generate the sound (a vowel or voiced consonant) from a sequence of formant values. Vowel files are made using the espeakedit program.<br>
&lt;amplitude&gt; is optional. It is a percentage change to the amplitude of the sound which is synthesized from the FMT() instruction.
<p>
<dt><strong>FMT</strong>(&lt;vowel file&gt;, &lt;amplitude&gt;) <strong>addWav</strong>(&lt;wav file&gt;, &lt;amplitude&gt;)
<dd>For voiced consonants, a FMT() instruction may be followed by an addWav() instruction. addWav() has the same format as a WAV() instruction, but the WAV file is mixed with the sound which is synthesized from the FMT() instruction.
<p>
<dt><strong>VowelStart</strong>(&lt;vowel file&gt;, &lt;length adjust&gt;)
<dd>This is used to modify the start of a vowel when it follows a sonorant consonant (such as [l] or [j]). It replaces the first frame of the &lt;vowel file&gt; which is specified in a FMT() instruction by this &lt;vowel file&gt;, and adjusts the length of the original by a signed value &lt;length adjust&gt;. The VowelStart() instruction may be specified either in the phoneme definition of the vowel, or in the phoneme definition of the sonorant consonant which precedes the vowel. The former takes precedence.
<p>
<dt><strong>VowelEnding</strong>(&lt;vowel file&gt;, &lt;length adjust&gt;)
<dd>This is used to modify the end of a vowel when it is followed by a sonorant consonant (such as [l] or [j]). This &lt;vowel file&gt; is appended to the &lt;vowel file&gt; which is specified in a FMT() instruction, and the length of the original is adjusted by a signed value &lt;length adjust&gt;. The VowelEnding() instruction may be specified either in the phoneme definition of the vowel, or in the phoneme definition of the sonorant consonant which follows the vowel. The former takes precedence.
<p>
<dt><strong>Vowelin</strong> &lt;vowel transition data&gt;
<dd>(C) Specifies the effects of this consonant on the formants of a following vowel. See "vowel transitions", below.
<p>
<dt><strong>Vowelout</strong> &lt;vowel transition data&gt;
<dd>(C) Specifies the effects of this consonant on the formants of a preceding vowel. See "vowel transitions", below.
<p>
<dt><strong>ChangePhoneme(</strong>&lt;phoneme&gt;)
<dd>Change to the specified phoneme.
<p>
<dt><strong>ChangeIfDiminished(</strong>&lt;phoneme&gt;)
<dd>Change to the specified phoneme (such as schwa, @) if this syllable has "diminished" stress.
<p>
<dt><strong>ChangeIfUnstressed(</strong>&lt;phoneme&gt;)
<dd>Change to the specified phoneme if this syllable has "diminished" or "unstressed" stress.
<p>
<dt><strong>ChangeIfNotStressed(</strong>&lt;phoneme&gt;)
<dd>Change to the specified phoneme if this syllable does not have "primary" stress.
<p>
<dt><strong>ChangeIfStressed(</strong>&lt;phoneme&gt;)
<dd>Change to the specified phoneme if this syllable has "primary" stress.
<p>
<dt><strong>IfNextVowelAppend(</strong>&lt;phoneme&gt;)
<dd>If the following phoneme is a vowel then this additional phoneme will be inserted before it.
<p>
<dt><strong>RETURN</strong>
<dd>Ends execution of instructions.
<p>
<dt><strong>CALL</strong> &lt;phoneme table&gt;/&lt;phoneme&gt;
<dd>Executes the instructions of the specified phoneme.

</dl>
</ul>
<p>&nbsp;<hr>
<h3>Conditional Statements</h3>
Phoneme definitions can contain conditional statements such as:
<pre>
IF &lt;condition&gt; THEN
&lt;statements&gt;
ENDIF
</pre>
or more generally:
<pre>
IF &lt;condition&gt; THEN
&lt;statements&gt;
ELIF &lt;condition&gt; THEN
&lt;statements&gt;
...
ELSE
&lt;statements&gt;
ENDIF
</pre>
where the <code>ELSE</code> and multiple <code>ELIF</code> parts are optional.
<p>
Multiple conditions may be joined with <code>AND</code> or <code>OR</code>, but not a mixture of <code>AND</code>s and <code>OR</code>s.
<p>
A condition may be preceded by <code>NOT</code>. For example:
<pre>
IF &lt;condition&gt; AND NOT &lt;condition&gt; THEN
&lt;statements&gt;
ENDIF
</pre>
<p>
<strong>Condition</strong>
Can be:
<ul>
<dl>
<dt>thisPh(&lt;attribute&gt;)
<dd>Test this current phoneme
<p>
<dt>prevPh(&lt;attribute&gt;)
<dd>Test the previous phoneme
<p>
<dt>prevPhW(&lt;attribute&gt;)
<dd>Test the previous phoneme, but only within the same word. Returns false if there is no previous phoneme in the word.
<p>
<dt>prev2PhW(&lt;attribute&gt;)
<dd>Test the phoneme before the previous phoneme, but only within the same word. Returns false if it is not in this word.
<p>
<dt>nextPh(&lt;attribute&gt;)
<dd>Test the following phoneme
<p>
<dt>next2Ph(&lt;attribute&gt;)
<dd>Test the phoneme after the next phoneme.
<p>
<dt>nextPhW(&lt;attribute&gt;)
<dd>Test the next phoneme, but only within the same word. Returns false if there is no following phoneme in the word.
<p>
<dt>next2PhW(&lt;attribute&gt;)
<dd>Test the phoneme after the next phoneme, but only within the same word. Returns false if not found before the word end.
<p>
<dt>next3PhW(&lt;attribute&gt;)
<dd>Test the third phoneme after the current phoneme, but only within the same word. Returns false if not found before the word end.
<p>
<dt>nextVowel(&lt;attribute&gt;)
<dd>Test the next vowel after the current phoneme, but only within the same word. Returns false if there is none.
<p>
<dt>prevVowel(&lt;attribute&gt;)
<dd>Test the previous vowel before the current phoneme, but only within the same word. Returns false if there is none.
<p>
<dt>PreVoicing()
<dd>This is used as part of the instructions for voiced stop consonants (eg. [d] [g]). If true then produce a voiced murmur before the stop.
<p>
<dt>KlattSynth()
<dd>Returns true if the voice is using the Klatt synthesizer rather than the eSpeak synthesizer.
</dl>
</ul>
<strong>Attributes</strong>
<ul>
Note: Additional attributes could be added to eSpeak if needed.
<p>
<dl>
<dt>&lt;phoneme name&gt;
<dd>True if the phoneme has this phoneme name.
<p>
<dt>&lt;phoneme group&gt;
<dd>True if the phoneme has this starttype (or if it has this endtype if it's used in prevPh() ). The pre-defined phoneme groups are #@, #a, #e, #i, #o, #u.
<p>
<dt>isPause
<dd>True if the phoneme is a pause.
<p>
<dt>isPause2
<dd><code>nextPh(isPause2)</code> is used to test whether the next phoneme is not a vowel or liquid consonant within the same word.
<p>
<dt>isVowel
<dt>isNotVowel
<dt>isLiquid
<dt>isNasal
<dt>isVFricative
<dd>These test the phoneme type.
<p>
<dt>isPalatal
<dt>isRhotic
<dd>These test whether the phoneme has this property.
<p>
<dt>isWordStart
<dt>notWordStart
<dd>These test whether this is the first phoneme in a word.
<p>
<dt>isWordEnd
<dd>True if this is the final phoneme in a word.
<p>
<dt>isFirstVowel
<dt>isSecondVowel
<dt>isFinalVowel
<dd>True if this is the First, Second, or Last vowel in a word.
<p>
<dt>isAfterStress
<dd>True if this phoneme is after the stressed vowel in a word.
<p>
<dt>isVoiced
<dd>True if this phoneme is a vowel or a voiced consonant.
<p>
<dt>isDiminished
<dd>True if the syllable stress is "diminished"
<p>
<dt>isUnstressed
<dd>True if the syllable stress is "diminished" or "unstressed"
<p>
<dt>isNotStressed
<dd>True if the syllable stress is not "primary stress".
<p>
<dt>isStressed
<dd>True if the syllable stress is "primary stress".
<p>
<dt>isMaxStress
<dd>True if this is the highest stressed syllable in the word.
<p>
<dt>
<dd>

</dl>
</ul>

<p>&nbsp;<hr>
<h3>Sound Specifications</h3>
There are three ways to produce sounds:
<ul>
<li>Playing a WAV file, by using a WAV() instruction. This is used for unvoiced consonants such as <code> [p] [t] [s]</code>.
<li>Generating a wave from a sequence of formant parameters, by using a FMT() instruction. This is used for vowels and also for sonorants such as <code> [l] [j] [n]</code>.
<li>A mixture of these. A stored WAV file is mixed with a wave generated from formant parameters. Use a FMT() instruction followed by addWav(). This is used for voiced stops and fricatives such as <code> [b] [g] [v] [z]</code>.
</ul>
<p>&nbsp;<hr>
<h3>Vowel Transitions</h3>
These specify how a consonant affects an adjacent vowel. A consonant may cause a transition in the vowel's formants as the mouth changes shape between the consonant and the vowel. The following attributes may be specified. Note that the maximum rate of change of formant frequencies is limited by the speak program.<p>
<ul><dl>
<dt><strong>len=&lt;integer&gt;</strong>
<dd>Nominal length of the transition in ms. If omitted a default value is used.
<dt><strong>rms=&lt;integer&gt;</strong>
<dd>Adjusts the amplitude of the vowel at the end of the transition. If omitted a default value is used.
<dt><strong>f1=&lt;integer&gt;</strong>
<dd>
0: &nbsp; f1 formant frequency unchanged.<br>
1: &nbsp; f1 formant frequency decreases.<br>
2: &nbsp; f1 formant frequency decreases more.
<dt><strong>f2=&lt;freq&gt; &lt;min&gt; &lt;max&gt;</strong>
<dd>
&lt;freq&gt;: &nbsp; The frequency towards which the f2 formant moves (Hz).<br>
&lt;min&gt;: &nbsp; Signed integer (Hz).&nbsp; The minimum f2 frequency change.<br>
&lt;max&gt;: &nbsp; Signed integer (Hz).&nbsp; The maximum f2 frequency change.
<dt><strong>f3=&lt;change&gt; &lt;amplitude&gt;</strong>
<dd>
&lt;change&gt;: &nbsp; Signed integer (Hz).&nbsp; Frequency change of f3, f4, and f5 formants.<br>
&lt;amplitude&gt;: &nbsp; Amplitude of the f3, f4, and f5 formants at the end of the transition. 100 = no change.
<dt><strong>brk</strong>
<dd>Break. Do not merge the synthesized wave of the consonant into the vowel. This will produce a discontinuity in the formants.
<dt><strong>rate</strong>
<dd>Allow a greater maximum rate of change of formant frequencies.
<dt><strong>glstop</strong>
<dd>Indicates a glottal stop.
</dl></ul>
</body>
</html>

docs/phontab.md (+467, -293)

@@ -1,15 +1,39 @@
# Table of contents

* [Phoneme tables](#phoneme-tables)
* [Phoneme files](#phoneme-files)
* [Phoneme definitions](#phoneme-definitions)
* [Phoneme Properties](#phoneme-properties)
* [Phoneme Instructions](#phoneme-instructions)
* [Conditional Statements](#conditional-statements)
* [Sound Specifications](#sound-specifications)
* [Vowel Transitions](#vowel-transitions)

# Phoneme tables
# Phoneme Tables

- [Phoneme files](#phoneme-files)
- [Phoneme Definitions](#phoneme-definitions)
- [Phoneme Properties](#phoneme-properties)
- [Type](#type)
- [Properties](#properties)
- [Place of Articulation](#place-of-articulation)
- [starttype](#starttype)
- [endtype](#endtype)
- [lengthmod](#lengthmod)
- [voicingswitch](#voicingswitch)
- [Phoneme Instructions](#phoneme-instructions)
- [length](#length)
- [ipa](#ipa)
- [WAV](#wav)
- [FMT](#fmt)
- [VowelStart](#vowelstart)
- [VowelEnding](#vowelending)
- [Vowelin](#vowelin)
- [Vowelout](#vowelout)
- [ChangePhoneme](#changephoneme)
- [ChangeIfDiminished](#changeifdiminished)
- [ChangeIfUnstressed](#changeifunstressed)
- [ChangeIfNotStressed](#changeifnotstressed)
- [ChangeIfStressed](#changeifstressed)
- [IfNextVowelAppend](#ifnextvowelappend)
- [RETURN](#return)
- [CALL](#call)
- [Conditional Statements](#conditional-statements)
- [Conditions](#conditions)
- [Attributes](#attributes)
- [Sound Specifications](#sound-specifications)
- [Vowel Transitions](#vowel-transitions)

----------

A phoneme table defines all the phonemes which are used by a language,
together with their properties and the data for their production as
@@ -27,65 +51,71 @@ set.

The source files for the phoneme data are in the "phsource" directory in
the espeakedit download package. "Vowel files", which are referenced in
FMT(), VowelStart(), and VowelEnding() instructions are made using the
`FMT()`, `VowelStart()`, and `VowelEnding()` instructions are made using the
espeakedit program.

## Phoneme files
## Phoneme Files

The phoneme tables are defined in a master phoneme file, named
**phonemes**. This starts with the **base** phoneme table followed by
`phonemes`. This starts with the `base` phoneme table followed by
phoneme tables for other languages and voices. These inherit phonemes
from the **base** table or previously defined tables.
from the `base` table or previously defined tables.

In addition to phoneme definitions, the phoneme file can contain the
following:

* **include** \<filename\>
Includes the text of the specified file at this point. This allows
different phoneme tables to be kept in different text files, for
convenience. \<filename\> is a relative path. The included file can
itself contain **include** statements.
* **phonemetable** \<name\> \<parent\>
Starts a new phoneme table, and ends the previous table.
\<name\> Is the name of this phoneme table. This name is used in
Voice files.
\<parent\> Is the name of a previously defined phoneme table whose
phoneme definitions are inherited by this one. The name **base**
indicates the first (base) phoneme table.
include <filename>

Includes the text of the specified file at this point. This allows
different phoneme tables to be kept in different text files, for
convenience. \<filename\> is a relative path. The included file can
itself contain `include` statements.

phonemetable <name> <parent>

## Phoneme definitions
Starts a new phoneme table, and ends the previous table.

Note: These new Phoneme definitions apply to eSpeak NG version 420 and
later.
\<name\> is the name of this phoneme table. This name is used in Voice files.

\<parent\> is the name of a previously defined phoneme table whose phoneme
definitions are inherited by this one. The name `base` indicates the first
(base) phoneme table.
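
The inheritance rule can be illustrated with a minimal Python sketch. The `PhonemeTable` class and its `lookup` method are hypothetical names chosen for this illustration; they are not eSpeak NG's real data structures. The sketch only assumes that a phoneme missing from a table is taken from its parent chain.

```python
# Hypothetical in-memory model of phoneme tables (not eSpeak NG's own code).
class PhonemeTable:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # a previously defined table, or None for "base"
        self.phonemes = {}        # phoneme name -> definition (any object)

    def lookup(self, phoneme):
        """Search this table first, then each ancestor in turn."""
        table = self
        while table is not None:
            if phoneme in table.phonemes:
                return table.phonemes[phoneme]
            table = table.parent
        raise KeyError(phoneme)


base = PhonemeTable("base")
base.phonemes["t"] = "base definition of t"
base.phonemes["s"] = "base definition of s"

en = PhonemeTable("en", parent=base)
en.phonemes["t"] = "English override of t"

print(en.lookup("t"))  # the definition in "en" takes priority
print(en.lookup("s"))  # inherited from "base"
```

A child table therefore only needs to list the phonemes it adds or changes.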

## Phoneme Definitions

A phoneme table contains a list of phoneme definitions. Each starts with
the keyword **phoneme** and the phoneme name (this is the name used in
the keyword `phoneme` and the phoneme name (this is the name used in
the pronunciation rules in a language's \*\_rules and \*\_list files),
and ends with the keyword **endphoneme**. For example:

```
phoneme aI
vowel
starttype #a endtype #i
length 230
FMT(vowels/ai)
endphoneme

phoneme s
vls alv frc sibilant
voicingswitch z
lengthmod 3
Vowelin f1=0 f2=1700 -300 300 f3=-100 80
Vowelout f1=0 f2=1700 -300 250 f3=-100 80 rms=20

IF nextPh(isPause) THEN
WAV(ufric/s_)
ELIF nextPh(p) OR nextPh(t) OR nextPh(k) THEN
WAV(ufric/s!)
ENDIF
WAV(ufric/s)
endphoneme
```
and ends with the keyword `endphoneme`.

The phoneme mnemonics are based on the scheme by
[Evan Kirshenbaum](http://www.kirshenbaum.net/IPA/ascii-ipa.pdf)
which represents International Phonetic Alphabet symbols using ascii
characters.

For example:

phoneme aI
vowel
starttype #a endtype #i
length 230
FMT(vowels/ai)
endphoneme

phoneme s
vls alv frc sibilant
voicingswitch z
lengthmod 3
Vowelin f1=0 f2=1700 -300 300 f3=-100 80
Vowelout f1=0 f2=1700 -300 250 f3=-100 80 rms=20

IF nextPh(isPause) THEN
WAV(ufric/s_)
ELIF nextPh(p) OR nextPh(t) OR nextPh(k) THEN
WAV(ufric/s!)
ENDIF
WAV(ufric/s)
endphoneme
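
As a rough illustration of the `phoneme` ... `endphoneme` layout, the following Python sketch splits source text into one entry per phoneme. The function name `split_phoneme_definitions` is an assumption made for this example, and the sketch ignores comments, `include`, and `phonemetable` lines.

```python
def split_phoneme_definitions(source):
    """Map each phoneme name to the lines between 'phoneme' and 'endphoneme'."""
    definitions = {}
    name, body = None, []
    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            continue
        words = line.split()
        if words[0] == "phoneme" and len(words) == 2:
            name, body = words[1], []          # start of a new definition
        elif words[0] == "endphoneme" and name is not None:
            definitions[name] = body           # definition is complete
            name = None
        elif name is not None:
            body.append(line)                  # property or instruction line
    return definitions


SAMPLE = """
phoneme aI
  vowel
  starttype #a endtype #i
  length 230
  FMT(vowels/ai)
endphoneme

phoneme s
  vls alv frc sibilant
  WAV(ufric/s)
endphoneme
"""

defs = split_phoneme_definitions(SAMPLE)
print(sorted(defs))
```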

A phoneme definition contains both static properties and executed
instructions. The instructions may contain conditional statements, so
@@ -97,82 +127,101 @@ In the first phase, the instructions may change the phoneme and replace
it by a different phoneme. In the second phase, instructions are used to
produce the sound for the phoneme.

The **import\_phoneme** statement can be used to copy a previously
The `import_phoneme` statement can be used to copy a previously
defined phoneme from a specified phoneme table. For example:

```
phoneme t
import_phoneme base/t[
endphoneme
```
phoneme t
import_phoneme base/t[
endphoneme

means: `phoneme t` in this phoneme table is a copy of
`phoneme t[` from phoneme table "base". A **length**
instruction can be used after **import\_phoneme** to vary the length
from the original.
means: `phoneme t` in this phoneme table is a copy of `phoneme t[` from phoneme
table `base`. A `length` instruction can be used after `import_phoneme` to
vary the length from the original.

## Phoneme Properties

Within the phoneme definition the following lines may occur: ( (V)
indicates only for vowels, (C) only for consonants)

Type. One of these must be present.

Type|Description
-----------|----------------------------------------------
**vowel** |
**liquid** | semi-vowels, such as: `r, l, j, w`
**nasal** | nasal eg: `m, n, N`
**stop** | stop eg: `p, b, t, d, k, g`
**frc** | fricative eg: `f, v, T, D, s, z, S, Z, C, x`
**afr** | affricate eg: `tS, dZ`
**pause** |
**stress** | used for stress symbols, eg: ' , = %
**virtual**| Used to represent a class of phonemes.

Properties:

Property|Description
--------------|--------------------------------------
**vls** | (C) voiceless eg. `p, t, k, f, s`
**vcd** | (C) voiced eg. `b, d, g, v, z`
**sibilant** | (C) eg: `s, z, S, Z, tS, dZ`
**palatal** | (C) A palatal or palatalized consonant.
**rhotic** | (C) An "r" type consonant.
**unstressed**| (V) This vowel is always unstressed, unless explicitly marked otherwise.
**nolink** | Prevent any linking from the previous phoneme.
**nopause** | Used in a `liquid` or `nasal` phoneme to prevent eSpeak inserting a short pause if a word starts with this phoneme and the previous word ends with a vowel.
**trill** | \(C\) Apply trill to the voicing.

Place of Articulation (C):

Articulation|Description
--------|------------------
**blb** | bi-labial
**ldb** | labio-dental
**dnt** | dental
**alv** | alveolar
**rfx** | retroflex
**pla** | palato-alveolar
**pal** | palatal
**vel** | velar
**lbv** | labio-velar
**uvl** | uvular
**phr** | pharyngeal
**glt** | glottal

* **starttype** \<phoneme\>
Allocates this phoneme to a group so that conditions such as nextPh(#e) can test for any of a group of phonemes. Pre-defined groups for use for vowels are: #@ #a #e #i #o #u. Additional groups can be defined as phonemes with type "virtual".

* **endtype** \<phoneme\>
Allocates this phoneme to a group so that conditions such as prevPh(#e) can test for any of a group of phonemes. Pre-defined groups for use for vowels are: #@ #a #e #i #o #u. Additional groups can be defined as phonemes with type "virtual".

* **lengthmod** \<integer\>
\(C\) Determines how this consonant affects the length of the previous vowel.
This value is used as index into the `length_mods` table in the `CalcLengths()` function in the eSpeak program.

* **voicingswitch** \<phoneme\>
This is used for some languages to change between voiced and unvoiced phonemes.
Within the phoneme definition the following lines may occur: (`(V)` indicates
only for vowels, `(C)` only for consonants).

### Type

One of these must be present.

| Type | Description |
|-----------|-------------|
| `vowel` | |
| `liquid` | semi-vowels, such as: `r`, `l`, `j`, `w` |
| `nasal` | nasal e.g.: `m`, `n`, `N` |
| `stop` | stop (plosive) e.g.: `p`, `b`, `t`, `d`, `k`, `g` |
| `frc` | fricative e.g.: `f`, `v`, `T`, `D`, `s`, `z`, `S`, `Z`, `C`, `x` |
| `afr` | affricate e.g.: `tS`, `dZ` |
| `pause` | |
| `stress` | Used for stress symbols, e.g.: `'` `,` `=` `%` |
| `virtual` | Used to represent a class of phonemes. |

### Properties

| Property | Type | Description |
|--------------|------|-------------|
| `vls` | (C) | voiceless e.g. `p`, `t`, `k`, `f`, `s` |
| `vcd` | (C) | voiced e.g. `b`, `d`, `g`, `v`, `z` |
| `sibilant` | (C) | e.g.: `s`, `z`, `S`, `Z`, `tS`, `dZ` |
| `palatal` | (C) | A palatal or palatalized consonant. |
| `rhotic` | (C) | An `r` type consonant. |
| `unstressed` | (V) | This vowel is always unstressed, unless explicitly marked otherwise. |
| `nolink` | | Prevent any linking from the previous phoneme. |
| `nopause` | | Used in a `liquid` or `nasal` phoneme to prevent eSpeak NG inserting a short pause if a word starts with this phoneme and the previous word ends with a vowel. |
| `trill` | (C) | Apply trill to the voicing. |

### Place of Articulation

| Articulation | Type | Description |
|--------------|------|-----------------|
| `blb` | (C) | bilabial |
| `ldb` | (C) | labio-dental |
| `dnt` | (C) | dental |
| `alv` | (C) | alveolar |
| `rfx` | (C) | retroflex |
| `pla` | (C) | palato-alveolar |
| `pal` | (C) | palatal |
| `vel` | (C) | velar |
| `lbv` | (C) | labio-velar |
| `uvl` | (C) | uvular |
| `phr` | (C) | pharyngeal |
| `glt` | (C) | glottal |

### starttype

starttype <phoneme>

Allocates this phoneme to a group so that conditions such as `nextPh(#e)` can
test for any of a group of phonemes. Pre-defined groups for use for vowels are:
`#@` `#a` `#e` `#i` `#o` `#u`. Additional groups can be defined as phonemes
with type `virtual`.

### endtype

endtype <phoneme>

Allocates this phoneme to a group so that conditions such as `prevPh(#e)` can
test for any of a group of phonemes. Pre-defined groups for use for vowels are:
`#@` `#a` `#e` `#i` `#o` `#u`. Additional groups can be defined as phonemes
with type `virtual`.
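
The difference between `starttype` and `endtype` can be sketched in Python as follows. The `PHONEMES` records and the `matches_group` helper are hypothetical; the sketch assumes only what is stated here, namely that a test on a following phoneme looks at its `starttype` and a `prevPh()` test looks at its `endtype`.

```python
# Hypothetical records holding only the two group fields of each phoneme.
PHONEMES = {
    "aI": {"starttype": "#a", "endtype": "#i"},   # diphthong from the example above
    "e":  {"starttype": "#e", "endtype": "#e"},   # illustrative plain vowel
}


def matches_group(phoneme, group, looking_back):
    """Group test as seen from prevPh() (looking_back=True) or nextPh() (False)."""
    record = PHONEMES[phoneme]
    field = "endtype" if looking_back else "starttype"
    return record[field] == group


# A phoneme followed by [aI] sees group #a; a phoneme preceded by [aI] sees #i.
print(matches_group("aI", "#a", looking_back=False))
print(matches_group("aI", "#i", looking_back=True))
```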

### lengthmod

lengthmod <integer>

(C) Determines how this consonant affects the length of the previous vowel.

This value is used as index into the `length_mods` table in the `CalcLengths()`
function in the eSpeak NG program.

### voicingswitch

voicingswitch <phoneme>

This is used for some languages to change between voiced and unvoiced phonemes.

## Phoneme Instructions

@@ -183,212 +232,325 @@ causes a change to a different phoneme will terminate the instructions.
During the second phase, FMT() and WAV() instructions will terminate the
instructions.

* **length** \<length\>
The relative length of the phoneme, typically about 140 for a short vowel and from 200 to 300 for a long vowel or diphthong. A length() instruction is needed for vowels. It is optional for consonants.
### length

length <length>

The relative length of the phoneme, typically about 140 for a short vowel and
from 200 to 300 for a long vowel or diphthong. A `length` instruction is
needed for vowels. It is optional for consonants.

### ipa

ipa <ipa string>

In many cases, eSpeak NG makes IPA (International Phonetic Alphabet) phoneme
names automatically from eSpeak NG phoneme names. If this is not correct, then
the phoneme definition can include an `ipa` instruction to specify the correct
IPA name. IPA strings may include non-ASCII characters. They may also include
characters specified by their character codes in the form `U+` followed by 4
hexadecimal digits. For example a string: `aU+0303` indicates 'a' with a
'combining tilde'.
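
The `U+` notation can be illustrated with a short Python sketch that expands each code into the character it names. The function name `decode_ipa_string` is hypothetical; this shows the notation only and is not eSpeak NG's own parser.

```python
import re


def decode_ipa_string(text):
    """Replace each 'U+' followed by 4 hexadecimal digits with that character."""
    return re.sub(
        r"U\+([0-9A-Fa-f]{4})",
        lambda match: chr(int(match.group(1), 16)),
        text,
    )


decoded = decode_ipa_string("aU+0303")
# Two code points: 'a' and U+0303 (combining tilde).
print([hex(ord(char)) for char in decoded])
```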

### WAV

WAV(<wav file>, <amplitude>)

\<wav file\> is a path to a WAV file (22 kHz, 16 bits, mono) within `phsource/`
which will be played to produce the sound. This method is used for unvoiced
consonants. \<wav file\> does not include a .WAV filename extension, although
the file to which it refers may or may not have one.

\<amplitude\> is optional. It is a percentage change to the amplitude of the
WAV file. So, `WAV(ufric/s, 50)` means: play file 'ufric/s.wav' at 50% amplitude.
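
A minimal Python sketch of the amplitude argument follows. Both helper names are hypothetical, and two points are assumptions rather than documented behaviour: an omitted amplitude is treated as 100 (unchanged), and scaled samples are clamped to the signed 16-bit range.

```python
import re


def parse_wav_instruction(text):
    """Split 'WAV(<wav file>, <amplitude>)' into (path, amplitude percent)."""
    match = re.fullmatch(r"\s*WAV\(\s*([^,\s)]+)\s*(?:,\s*(\d+)\s*)?\)\s*", text)
    if match is None:
        raise ValueError("not a WAV() instruction: " + text)
    path = match.group(1)
    # Assumption: no amplitude given means the file is played unchanged.
    percent = int(match.group(2)) if match.group(2) else 100
    return path, percent


def scale_samples(samples, percent):
    """Scale signed 16-bit samples by a percentage, clamping to 16 bits."""
    return [max(-32768, min(32767, s * percent // 100)) for s in samples]


path, percent = parse_wav_instruction("WAV(ufric/s, 50)")
print(path, percent)
print(scale_samples([1000, -1000], percent))
```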

### FMT

FMT(<vowel file>, <amplitude>)

\<vowel file\> is a path to a file (within `phsource/`) which defines how to
generate the sound (a vowel or voiced consonant) from a sequence of formant
values. Vowel files are made using the espeakedit program.

\<amplitude\> is optional. It is a percentage change to the amplitude of the
sound which is synthesized from the `FMT()` instruction.

FMT(<vowel file>, <amplitude>) addWav(<wav file>, <amplitude>)

For voiced consonants, a `FMT()` instruction may be followed by an `addWav()`
instruction. `addWav()` has the same format as a `WAV()` instruction, but the
WAV file is mixed with the sound which is synthesized from the `FMT()` instruction.

### VowelStart

VowelStart(<vowel file>, <length adjust>)

This is used to modify the start of a vowel when it follows a sonorant consonant
(such as `[l]` or `[j]`). It replaces the first frame of the \<vowel file\> which
is specified in a `FMT()` instruction by this \<vowel file\>, and adjusts the
length of the original by a signed value \<length adjust\>. The `VowelStart()`
instruction may be specified either in the phoneme definition of the vowel, or
in the phoneme definition of the sonorant consonant which precedes the vowel.
The former takes precedence.

### VowelEnding

VowelEnding(<vowel file>, <length adjust>)

This is used to modify the end of a vowel when it is followed by a sonorant
consonant (such as `[l]` or `[j]`). This \<vowel file\> is appended to the
\<vowel file\> which is specified in a `FMT()` instruction, and the
length of the original is adjusted by a signed value \<length adjust\>. The `VowelEnding()`
instruction may be specified either in the phoneme definition of the vowel, or
in the phoneme definition of the sonorant consonant which follows the vowel.
The former takes precedence.

### Vowelin

Vowelin <vowel transition data>

(C) Specifies the effects of this consonant on the formants of a following
vowel. See "vowel transitions", below.

### Vowelout

Vowelout <vowel transition data>

(C) Specifies the effects of this consonant on the formants of a preceding
vowel. See "vowel transitions", below.

### ChangePhoneme

* **ipa** \<ipa string\>
In many cases, eSpeak makes IPA (International Phonetic Alphabet) phoneme names automatically from eSpeak phoneme names. If this is not correct, then the phoneme definition can include an **ipa** instruction to specify the correct IPA name. IPA strings may include non-ascii characters. They may also include characters specified by their character codes in the form U+ followed by 4 hexadecimal digits. For example a string: aU+0303 indicates 'a' with a 'combining tilde'.
ChangePhoneme(<phoneme>)

* **WAV**(\<wav file\>, \<amplitude\>)
\<wav file\> is a path to a WAV file (22 kHz, 16 bits, mono) within `phsource/` which will be played to produce the sound. This method is used for unvoiced consonants. \<wavefile\> does not include a .WAV filename extension, although the file to which it refers may or may not have one.
\<amplitude\> is optional. It is a percentage change to the amplitude of the WAV file. So, `WAV(ufric/s, 50)` means: play file 'ufric/s.wav' at 50% amplitude.
Change to the specified phoneme.

* **FMT**(\<vowel file\>, \<amplitude\>)
\<vowel file\> is a path to a file (within `phsource/`) which defines how to generate the sound (a vowel or voiced consonant) from a sequence of formant values. Vowel files are made using the espeakedit program.
\<amplitude\> is optional. It is a percentage change to the amplitude of the sound which is synthesized from the FMT() instruction.
### ChangeIfDiminished

* **FMT**(\<vowel file\>, \<amplitude\>) **addWav**(\<wav file\>, \<amplitude\>)
For voiced consonants, a FMT() instruction may be followed by an addWav() instruction. addWav() has the same format as a WAV() instruction, but the WAV file is mixed with the sound which is synthesized from the FMT() instruction.
ChangeIfDiminished(<phoneme>)

* **VowelStart**(\<vowel file\>, \<length adjust\>)
This is used to modify the start of a vowel when it follows a sonorant consonant (such as [l] or [j]). It replaces the first frame of the \<vowel file\> which is specified in a FMT() instruction by this \<vowel file\>, and adjusts the length of the original by a signed value \<length adjust\>. The VowelStart() instruction may be specified either in the phoneme definition of the vowel, or in the phoneme definition of the sonorant consonant which precedes the vowel. The former takes precedence.
Change to the specified phoneme (such as schwa, `@`) if this syllable has
"diminished" stress.

* **VowelEnding**(\<vowel file\>, \<length adjust\>)
This is used to modify the end of a vowel when it is followed by a sonorant consonant (such as [l] or [j]). It is appended to the \<vowel file\> which is specified in a FMT() instruction by this \<vowel file\>, and adjusts the length of the original by a signed value \<length adjust\>. The VowelEnding() instruction may be specified either in the phoneme definition of the vowel, or in the phoneme definition of the sonorant consonant which follows the vowel. The former takes precedence.
### ChangeIfUnstressed

* **Vowelin** \<vowel transition data\>
(C) Specifies the effects of this consonant on the formants of a following vowel. See "vowel transitions", below.
ChangeIfUnstressed(<phoneme>)

* **Vowelout** \<vowel transition data\>
(C) Specifies the effects of this consonant on the formants of a preceding vowel. See "vowel transitions", below.
Change to the specified phoneme if this syllable has "diminished" or
"unstressed" stress.

* **ChangePhoneme(**\<phoneme\>)
Change to the specified phoneme.
### ChangeIfNotStressed

* **ChangeIfDiminished(**\<phoneme\>)
Change to the specified phoneme (such as schwa, @) if this syllable has "diminished" stress.
ChangeIfNotStressed(<phoneme>)

* **ChangeIfUnstressed(**\<phoneme\>)
Change to the specified phoneme if this syllable has "diminished" or "unstressed" stress.
Change to the specified phoneme if this syllable does not have "primary" stress.

* **ChangeIfNotStressed(**\<phoneme\>)
Change to the specified phoneme if this syllable does not have "primary" stress.
### ChangeIfStressed

* **ChangeIfStressed(**\<phoneme\>)
Change to the specified phoneme if this syllable has "primary" stress.
ChangeIfStressed(<phoneme>)

* **IfNextVowelAppend(**\<phoneme\>)
If the following phoneme is a vowel then this additional phoneme will be inserted before it.
Change to the specified phoneme if this syllable has "primary" stress.
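
The stress tests behind the `Change...` instructions can be summarised in a small Python sketch. Representing stress levels as strings, the `"secondary"` level, and the `apply_change` helper are assumptions made for illustration; only the conditions themselves come from the descriptions above.

```python
# Each instruction maps to the stress condition under which it changes the phoneme.
CHANGE_RULES = {
    "ChangePhoneme":       lambda stress: True,
    "ChangeIfDiminished":  lambda stress: stress == "diminished",
    "ChangeIfUnstressed":  lambda stress: stress in ("diminished", "unstressed"),
    "ChangeIfNotStressed": lambda stress: stress != "primary",
    "ChangeIfStressed":    lambda stress: stress == "primary",
}


def apply_change(instruction, current, replacement, stress):
    """Return the phoneme that remains after one Change... instruction."""
    return replacement if CHANGE_RULES[instruction](stress) else current


# A vowel that reduces to schwa only when its syllable has diminished stress.
print(apply_change("ChangeIfDiminished", "a", "@", "diminished"))
print(apply_change("ChangeIfDiminished", "a", "@", "unstressed"))
```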

* **RETURN**
Ends execution of the instructions.
### IfNextVowelAppend

* **CALL** \<phoneme table\>/\<phoneme\>
Executes the instructions of the specified phoneme.
IfNextVowelAppend(<phoneme>)

If the following phoneme is a vowel then this additional phoneme will be
inserted before it.

### Conditional Statements
### RETURN

Ends execution of the instructions.

### CALL

CALL <phoneme table>/<phoneme>

Executes the instructions of the specified phoneme.

## Conditional Statements

Phoneme definitions can contain conditional statements such as:

```
<pre> IF <condition> THEN
<statements>
ENDIF
</pre>
```
IF <condition> THEN
<statements>
ENDIF

or more generally:

```
<pre> IF <condition> THEN
<statements>
ELIF <condition> THEN
<statements>
...
ELSE
<statements>
ENDIF
</pre>
```
IF <condition> THEN
<statements>
ELIF <condition> THEN
<statements>
...
ELSE
<statements>
ENDIF

where the `ELSE` and multiple `ELIF` parts are optional.

Multiple conditions may be joined with `AND` or `OR`, but not a mixture of `AND`s and `OR`s.
Multiple conditions may be joined with `AND` or `OR`, but not a mixture of
`AND`s and `OR`s.

A condition may be preceded by `NOT`. For example:

```
<pre> IF <condition> AND NOT <condition> THEN
<statements>
ENDIF
</pre>
```
IF <condition> AND NOT <condition> THEN
<statements>
ENDIF
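
The rule that `AND` and `OR` cannot be mixed can be sketched in Python as follows. Both function names are hypothetical, and the sketch assumes that each individual condition (including any `NOT`) has already been evaluated to a truth value.

```python
def joining_operator(condition_text):
    """Return "AND", "OR", or None; reject a mixture of the two."""
    found = {word for word in condition_text.split() if word in ("AND", "OR")}
    if len(found) > 1:
        raise ValueError("AND and OR cannot be mixed in one condition")
    return found.pop() if found else None


def evaluate(values, operator):
    """Combine already-computed truth values with the single joining operator."""
    if operator == "OR":
        return any(values)
    return all(values)  # "AND", or a single condition when operator is None


operator = joining_operator("nextPh(p) OR nextPh(t) OR nextPh(k)")
# Suppose the next phoneme is [t]: only the second test is true.
print(operator, evaluate([False, True, False], operator))
```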

### Conditions

Conditions can be:

* thisPh(\<attribute\>)
Test this current phoneme

* prevPh(\<attribute\>)
Test the previous phoneme

* prevPhW(\<attribute\>)
Test the previous phoneme, but only within the same word. Returns false if there is no previous phoneme in the word.
* prev2PhW(\<attribute\>)
Test the phoneme before the previous phoneme, but only within the same word. Returns false if it is not in this word.
* nextPh(\<attribute\>)
Test the following phoneme
* next2Ph(\<attribute\>)
Test the phoneme after the next phoneme.
* nextPhW(\<attribute\>)
Test the next phoneme, but only within the same word. Returns false if there is no following phoneme in the word.
* next2PhW(\<attribute\>)
Test the phoneme after the next phoneme, but only within the same word. Returns false if not found before the word end.
* next3PhW(\<attribute\>)
Test the third phoneme after the current phoneme, but only within the same word. Returns false if not found before the word end.
* nextVowel(\<attribute\>)
Test the next vowel after the current phoneme, but only within the same word. Returns false if there is none.
* prevVowel(\<attribute\>)
Test the previous vowel before the current phoneme, but only within the same word. Returns false if there is none.
* PreVoicing()
This is used as part of the instructions for voiced stop consonants (eg. [d] [g]). If true then produce a voiced murmur before the stop.
* KlattSynth()
Returns true if the voice is using the Klatt synthesizer rather than the eSpeak synthesizer.
thisPh(<attribute>)

Test this current phoneme.

prevPh(<attribute>)

Test the previous phoneme.

prevPhW(<attribute>)

Test the previous phoneme, but only within the same word. Returns false if
there is no previous phoneme in the word.

prev2PhW(<attribute>)

Test the phoneme before the previous phoneme, but only within the same word.
Returns false if it is not in this word.

nextPh(<attribute>)

Test the following phoneme.

next2Ph(<attribute>)

Test the phoneme after the next phoneme.

nextPhW(<attribute>)

Test the next phoneme, but only within the same word. Returns false if there
is no following phoneme in the word.

next2PhW(<attribute>)

Test the phoneme after the next phoneme, but only within the same word. Returns
false if not found before the word end.

next3PhW(<attribute>)

Test the third phoneme after the current phoneme, but only within the same word.
Returns false if not found before the word end.

nextVowel(<attribute>)

Test the next vowel after the current phoneme, but only within the same word.
Returns false if there is none.

prevVowel(<attribute>)

Test the previous vowel before the current phoneme, but only within the same
word. Returns false if there is none.

PreVoicing()

This is used as part of the instructions for voiced stop consonants (e.g. `[d]`
and `[g]`). If true then produce a voiced murmur before the stop.

KlattSynth()

Returns true if the voice is using the Klatt synthesizer rather than the eSpeak synthesizer.
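
The difference between the plain and the `W` (same word) position tests can be sketched in Python. The data model (a list of words, each a list of phoneme names, with no pauses between words) and the function names are assumptions for illustration. Returning `None` stands for the case where the real condition returns false because no such phoneme exists.

```python
def flatten(words):
    """Return parallel lists: every phoneme, and the index of its word."""
    phonemes, word_index = [], []
    for number, word in enumerate(words):
        for phoneme in word:
            phonemes.append(phoneme)
            word_index.append(number)
    return phonemes, word_index


def neighbour(words, position, offset, same_word):
    """Phoneme at position+offset, or None if absent or outside the word."""
    phonemes, word_index = flatten(words)
    target = position + offset
    if not 0 <= target < len(phonemes):
        return None
    if same_word and word_index[target] != word_index[position]:
        return None
    return phonemes[target]


def next_ph(words, position):        # models nextPh()
    return neighbour(words, position, 1, same_word=False)


def next_ph_w(words, position):      # models nextPhW()
    return neighbour(words, position, 1, same_word=True)


def prev_ph_w(words, position):      # models prevPhW()
    return neighbour(words, position, -1, same_word=True)


words = [["h", "@", "l", "oU"], ["w", "3:", "l", "d"]]
# Position 3 is the final [oU] of the first word.
print(next_ph(words, 3), next_ph_w(words, 3), prev_ph_w(words, 3))
```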

### Attributes

Note: Additional attributes could be added to eSpeak if needed.
<phoneme name>

True if the phoneme has this phoneme name.

* \<phoneme name\>
True if the phoneme has this phoneme name.
<phoneme group>

True if the phoneme has this `starttype` (or if it has this `endtype` if it is
used in `prevPh()`). The pre-defined phoneme groups are `#@`, `#a`, `#e`, `#i`,
`#o`, `#u`.

isPause

True if the phoneme is a pause.

isPause2

`nextPh(isPause2)` is used to test whether the next phoneme is not a vowel or
liquid consonant within the same word.

isVowel
isNotVowel
isLiquid
isNasal
isVFricative

These test the phoneme type.

isPalatal
isRhotic

These test whether the phoneme has this property.

isWordStart
notWordStart

These test whether this is the first phoneme in a word.

* \<phoneme group\>
True if the phoneme has this starttype (or if it has this endtype if it's used in prevPh() ). The pre-defined phoneme groups are #@, #a, #e, #i, #o, #u.
isWordEnd

* isPause
True if the phoneme is a pause.
True if this is the final phoneme in a word.

* isPause2
`nextPh(isPause2)` is used to test whether the next phoneme is not a vowel or liquid consonant within the same word.
isFirstVowel
isSecondVowel
isFinalVowel

* isVowel
isNotVowel
isLiquid
isNasal
isVFricative
These test the phoneme type.
True if this is the first, second, or last vowel in a word.

* isPalatal
isRhotic
These test whether the phoneme has this property.
isAfterStress

* isWordStart
notWordStart
* These test whether this is the first phoneme in a word.
True if this phoneme is after the stressed vowel in a word.

* isWordEnd
True if this is the final phoneme in a word.
isVoiced

* isFirstVowel
isSecondVowel
isFinalVowel
* True if this is the First, Second, or Last vowel in a word.
True if this phoneme is a vowel or a voiced consonant.

* isAfterStress
True if this phoneme is after the stressed vowel in a word.
isDiminished

* isVoiced
True if this phoneme is a vowel or a voiced consonant.
True if the syllable stress is "diminished".

* isDiminished
True if the syllable stress is "diminished"
isUnstressed

* isUnstressed
True if the syllable stress is "diminished" or "unstressed"
True if the syllable stress is "diminished" or "unstressed".

* isNotStressed
True if the syllable stress is not "primary stress".
isNotStressed

* isStressed
True if the syllable stress is "primary stress".
True if the syllable stress is not "primary stress".

* isMaxStress
True if this is the highest stressed syllable in the word.
isStressed

True if the syllable stress is "primary stress".

isMaxStress

True if this is the highest stressed syllable in the word.

## Sound Specifications

There are three ways to produce sounds:

* Playing a WAV file, by using a WAV() instruction. This is used for unvoiced consonants such as `[p] [t] [s]`.
* Generating a wave from a sequence of formant parameters, by using a FMT() instruction.This is used for vowels and also for sonorants such as `[l] [j] [n]`.
* A mixture of these. A stored WAV file is mixed with a wave generated from formant parameters. Use a FMT() instruction followed by addWav(). This is used for voiced stops and fricatives such as `[b] [g] [v] [z]`.

* Playing a WAV file, by using a `WAV()` instruction. This is used for unvoiced
consonants such as `[p]`, `[t]` and `[s]`.
* Generating a wave from a sequence of formant parameters, by using a `FMT()`
instruction. This is used for vowels and also for sonorants such as `[l]`,
`[j]` and `[n]`.
* A mixture of these. A stored `WAV` file is mixed with a wave generated from
formant parameters. Use a `FMT()` instruction followed by `addWav()`. This is
used for voiced stops and fricatives such as `[b]`, `[g]`, `[v]` and `[z]`.

## Vowel Transitions

@@ -396,37 +558,49 @@ These specify how a consonant affects an adjacent vowel. A consonant may
cause a transition in the vowel's formants as the mouth changes shape
between the consonant and the vowel. The following attributes may be
specified. Note that the maximum rate of change of formant frequencies
is limited by the speak program.
is limited by the program.

len=<integer>

Nominal length of the transition in ms. If omitted, a default value is used.

rms=<integer>

Adjusts the amplitude of the vowel at the end of the transition. If omitted
a default value is used.

f1=<integer>

0: f1 formant frequency unchanged.

1: f1 formant frequency decreases.

2: f1 formant frequency decreases more.

f2=<freq> <min> <max>

\<freq\>: The frequency towards which the f2 formant moves (Hz).

\<min\>: Signed integer (Hz). The minimum f2 frequency change.

* **len=<integer>**
Nominal length of the transition in ms. If omitted, a default value is used.
\<max\>: Signed integer (Hz). The maximum f2 frequency change.

* **rms=<integer>**
Adjusts the amplitude of the vowel at the end of the transition. If omitted a default value is used.
f3=<change> <amplitude>

* **f1=<integer>**
0: f1 formant frequency unchanged.
1: f1 formant frequency decreases.
2: f1 formant frequency decreases more.
\<change\>: Signed integer (Hz). Frequency change of f3, f4, and f5 formants.

* **f2=<freq> <min> <max>**
\<freq\>: The frequency towards which the f2 formant moves (Hz).
\<min\>: Signed integer (Hz). The minimum f2 frequency change.
\<max\>: Signed integer (Hz). The maximum f2 frequency change.
\<amplitude\>: Amplitude of the f3, f4, and f5 formants at the end of the
transition. 100 = no change.

* **f3=<change> <amplitude>**
\<change\>: Signed integer (Hz). Frequency change of f3, f4, and f5 formants.
\<amplitude\>: Amplitude of the f3, f4, and f5 formants at the end of the transition. 100 = no change.
brk

* **brk**
Break. Do not merge the synthesized wave of the consonant into the vowel. This will produce a discontinuity in the formants.
Break. Do not merge the synthesized wave of the consonant into the vowel. This
will produce a discontinuity in the formants.

* **rate**
Allow a greater maximum rate of change of formant frequencies.
rate

* **glstop**
Indicates a glottal stop.
Allow a greater maximum rate of change of formant frequencies.

glstop

Indicates a glottal stop.
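
To show how the attributes fit together on one line, the following Python sketch parses the data that follows `Vowelin` or `Vowelout` into a dictionary. The function name `parse_transition` and the dictionary layout are assumptions for illustration; the sketch does not check value ranges or apply defaults.

```python
FLAGS = {"brk", "rate", "glstop"}


def parse_transition(text):
    """Parse the attributes after 'Vowelin' or 'Vowelout' into a dict."""
    result = {}
    current = None
    for token in text.split():
        if token in FLAGS:
            result[token] = True               # attribute without a value
            current = None
        elif "=" in token:
            current, value = token.split("=", 1)
            result[current] = [int(value)]     # first value of len, rms, f1, f2, f3
        elif current is not None:
            result[current].append(int(token)) # further values, e.g. f2 <min> <max>
        else:
            raise ValueError("unexpected token: " + token)
    return result


# The Vowelout data from the [s] example earlier in this document.
print(parse_transition("f1=0 f2=1700 -300 250 f3=-100 80 rms=20"))
```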
