10 years ago · 78aec60405
--- a/docs/add_language.md
+++ b/docs/add_language.md
@@ -0,0 +1,157 @@
 6. ADDING OR IMPROVING A LANGUAGE {.western}
 ---------------------------------

 Most of the work doesn't need any programming knowledge, just an
 understanding of the language, an awareness of its features, patience
 and attention to detail. Wikipedia is a good source of basic phonetic
 information, eg
 [http://en.wikipedia.org/wiki/Vowel](http://en.wikipedia.org/wiki/Vowel).

 In many cases it should be fairly easy to add a rough implementation of
 a new language, hopefully enough to be intelligible. After that it's a
 gradual process of improvement.

 ### 6.1 Language Code {.western}

 Generally, the language's international [ISO
 639-1](http://en.wikipedia.org/wiki/ISO_639-1) code is used to identify
 the language. It is used in the filenames which contain the language's
 data. In the examples below the code **"fr"** is used as an example.
 Replace this with the code of your language.

 If the language does not have a 2-letter ISO\_639-1 code, then use the
 3-letter ISO\_639-3 code. Language codes may differ from country codes.

 It is possible to have different variants of a language for different
 dialects. For example, the sounds of some phonemes are changed, or some
 of the pronunciation rules differ.

 ### 6.2 Language Files {.western}

 The following files are needed for your language.

 -   -   -   -   

 The **fr\_rules** and **fr\_list** files are compiled to produce the
 file **espeak-data/fr\_dict**, which eSpeak uses when it is speaking.

 ### 6.3 Voice File {.western}

 Each language needs a voice file in **espeak-data/voices** or
 **espeak-data/voices/test**. The filename of the default voice for a
 language should be the same as the language code (eg. "fr" for French).

 Details of the contents of voice files are given in
 [voices.html](http://espeak.sf.net/voices.html).

 The simplest voice file would contain just 2 lines to give the language
 name and language code, eg:

 ~~~~ {.western}
  name french
  language fr
 ~~~~

 This language code specifies which phoneme table and dictionary to use
 (i.e. **phonemetable fr** and **espeak-data/fr\_dict**). If
 needed, these can be overridden by **phonemes** and **dictionary**
 attributes in the voice file. For example you may want to start the
 implementation of a new language by using the phoneme table of an
 existing language.
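
The defaulting rule described above can be sketched in a few lines of Python. This is only an illustration, assuming a voice file made of simple `attribute value` lines; the function names `parse_voice_file` and `resolve_language_data` are hypothetical and are not part of eSpeak.

```python
def parse_voice_file(text):
    """Collect 'attribute value' lines of a voice file into a dict."""
    attrs = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            attrs[parts[0]] = parts[1].strip()
    return attrs


def resolve_language_data(attrs):
    """Apply the defaulting rule: the language code names both the phoneme
    table and the dictionary unless the voice file overrides them."""
    lang = attrs["language"]
    dictionary = attrs.get("dictionary", lang)
    return {
        "phoneme_table": attrs.get("phonemes", lang),
        "dictionary_file": "espeak-data/%s_dict" % dictionary,
    }
```

With the two-line French voice file, both values fall back to `fr`. A voice file that adds a `phonemes` line borrows another language's phoneme table while keeping its own dictionary.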

 ### 6.4 Phoneme Definition File {.western}

 You must first decide on the set of phonemes (vowel and consonant
 sounds) for the language. These should be defined in a phoneme
 definition file **ph\_xxxx**, where "xxxx" is the name of your
 language. A reference to this file is then included at the end of the
 master phoneme file, **phsource/phonemes**, eg:

 ~~~~ {.western}
  phonemetable  fr  base
  include  ph_french
 ~~~~

 This example defines a phoneme table **"fr"** which inherits the
 contents of phoneme table **"base"**. Its contents are found in the file
 **ph\_french**.

 The **base** phoneme table contains definitions of a basic set of
 consonants, and also some "control" phonemes such as stress marks and
 pauses. These are defined in **phsource/phonemes**. The phoneme table
 for a language will inherit these, or alternatively it may inherit the
 phoneme table of another language which in turn inherits the **base**
 phoneme table.

 The phonemes file for the language defines those additional phonemes
 which are not inherited (generally the vowels and diphthongs, plus any
 additional consonants that are needed), or phonemes whose definitions
 differ from the inherited version (eg. the redefinition of a consonant).
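
As a rough model of this inheritance, a phoneme lookup can walk from the language's table up through its parents until a definition is found. The sketch below uses made-up table contents and a hypothetical `lookup_phoneme` helper; it does not reflect how the compiled phoneme data is actually stored.

```python
# Hypothetical tables: each has an optional parent and its own definitions.
TABLES = {
    "base": {"parent": None, "phonemes": {"p": "base p", "t": "base t"}},
    "fr": {"parent": "base", "phonemes": {"a": "fr a", "t": "fr t"}},
}


def lookup_phoneme(tables, table_name, phoneme):
    """Return the nearest definition of `phoneme`, searching the named
    table first and then each inherited table in turn."""
    name = table_name
    while name is not None:
        table = tables[name]
        if phoneme in table["phonemes"]:
            return table["phonemes"][phoneme]
        name = table["parent"]
    raise KeyError(phoneme)
```

Here `t` is redefined by the `fr` table, so the language's own version is used, while `p` is found only in `base` and is inherited unchanged.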

 Details of phonemes files are given in
 [phontab.html](http://espeak.sf.net/phontab.html).

 The **Compile phoneme data** function of the **espeakedit** program
 compiles the phonemes files of all languages to produce the files
 **espeak-data/phontab**, **phonindex**, and **phondata** which are used
 by eSpeak.

 For many languages, the consonant phonemes which are already available
 in eSpeak, together with the available vowel files which can be used to
 define vowel phonemes, will be sufficient, at least for an initial
 implementation.

 ### 6.5 Dictionary Files {.western}

 Once the language's phonemes have been defined, then pronunciation
 dictionary data can be produced in order to translate the language's
 source text into phonemes. This consists of two source files:
 **fr\_rules** (the spelling to phoneme rules) and **fr\_list** (an
 exceptions list, and attributes of certain words). The corresponding
 compiled data file is **espeak-data/fr\_dict** which is produced from
 **fr\_rules** and **fr\_list** sources by the command:

 > `espeak-ng --compile=fr`{.western}

 Alternatively, it can be produced by using the **espeakedit** program.

 Details of the contents of the dictionary files are given in
 [dictionary.html](http://espeak.sf.net/dictionary.html).

 The **fr\_list** file contains:

 -   -   -   -   

 ### 6.6 Program Code {.western}

 The behaviour of the eSpeak program is controlled by various options
 such as:

 -   -   -   -   

 The function SetTranslator() at the start of the source code file
 tr\_languages.cpp recognizes the language code and sets the appropriate
 options. For a new language, you would add its language code and the
 required options in SetTranslator(). However, this may not be necessary
 during testing because most of the options can also be set in the voice
 file in espeak-data/voices (see [Voice
 files](http://espeak.sf.net/voices.html)).

 ### 6.7 Improving a Language {.western}

 Listen carefully to the eSpeak voice. Try to identify what sounds wrong
 and what needs to be improved.

 -   -   -   -   -   

 **If you are interested in working on a language, please contact me so
 that I can set up the initial data and discuss the features of the
 language.**

 For most of the eSpeak voices, I do not speak or understand the
 language, and I do not know how it should sound. I can only make
 improvements as a result of feedback from speakers of that language. If
 you want to help to improve a language, listen carefully and try to
 identify individual errors, either in the spelling-to-phoneme
 translation, the position of stressed syllables within words, or the
 sound of phonemes, or problems with rhythm and vowel lengths.
--- a/docs/analyse.md
+++ b/docs/analyse.md
@@ -0,0 +1,101 @@
 ANALYSIS
 ========

 (Further notes are needed)

 Recordings of spoken words and phrases can be analysed to try and make
 eSpeak match a language more closely. Unlike most other (larger and
 better quality) synthesizers, eSpeak's data is not produced directly
 from recorded sounds. To use an analogy, it's like a drawing or sketch
 compared with a photograph. Or vector graphics compared with a bitmap
 image. It's smaller, less accurate, with less subtlety, but it can
 sometimes show some aspects of the picture more clearly than a more
 accurate image.

 #### Recording Sounds {.western}

 Recordings should be made while speaking slowly, clearly, and firmly and
 loudly (but not shouting). Speak about half a metre from the microphone.
 Try to avoid background noise and hum interference from electrical power
 cables.

 #### Praat {.western}

 I use a modified version of the praat program
 ([www.praat.org](http://www.praat.org)) to view and analyse both sound
 recordings and output from eSpeak. The modification adds a new function
 (`Spectrum->To_eSpeak`{.western}) which analyses a voiced sound and
 produces a file which can be loaded into espeakedit. Details of the
 modification are in the `"praat-mod"`{.western} directory in the
 espeakedit package. The analysis contains a sequence of frames, one per
 cycle at the speech's fundamental frequency. Each frame is a short time
 spectrum, together with praat's estimation of the f1 to f5 formant
 frequencies at the time of that cycle. I also use Praat's
 `New->Record_mono_sound`{.western} function to make sound recordings.

 ### Vowels and Diphthongs {.western}

 #### Analysing a Recording {.western}

 Make a recording, with a male voice, and trim it in Praat to keep just
 the required vowel sound. Then use the new
 `Spectrum->To_eSpeak`{.western} modification (this was named
 `To_Spectrogram2`{.western} in earlier versions) to analyse the sound.
 It produces a file named `"spectrum.dat"`{.western}. Load the
 `"spectrum.dat"`{.western} file into espeakedit. Espeakedit has two Open
 functions, `File->Open`{.western} and `File->Open2`{.western}. They are
 the same, except that they remember different paths. I generally use
 `File->Open2`{.western} for reading the `"spectrum.dat"`{.western} file.
 The data is displayed in espeakedit as a sequence of spectrum frames
 (see [editor.html](editor.html)).

 #### Tone Quality {.western}

 It can be difficult to match the tonal quality of a new vowel to be
 compatible with existing vowel files. This is determined by the relative
 heights and widths of the formant peaks. These vary depending on how the
 recording was made, the microphone, and the strength and tone of the
 voice. Also the positions of the higher peaks (F3 upwards) can vary
 depending on the characteristics of the speaker's voice. Formant peaks
 correspond to resonances within the mouth and throat, and they depend on
 its size and shape. With a female voice, all the formants (F1 upwards)
 are generally shifted to higher frequencies. For these reasons, it's
 best to use a male voice, and to use its analysed spectra only as
 guidance. Rather than construct formant-peaks entirely to match the
 analysed data, instead copy keyframes from a similar existing vowel.
 Then make small adjustments to match the position of the F1, F2, F3
 formant peaks and hopefully produce the required vowel sound.

 #### Using an Existing Vowel File {.western}

 Choose a similar vowel file from `phsource/vowel`{.western} and open it
 in espeakedit. It may be useful to use
 `phsource/vowel/vowelchart`{.western} as a map to show how vowel files
 compare with each other. You can select a keyframe from the vowel file
 and use CTRL-C and CTRL-V to copy the green formant peaks onto a frame
 of the new spectrum sequence. Then adjust the peaks to match the new
 frame. Press F1 to hear the sound of the formant peaks in the selected
 frame. The F0 peak is provided in order to adjust the correct balance of
 low frequencies, below the F1 peak. If the sound is too muffled, or
 conversely, too "thin", try adjusting the amplitude or position of the
 F0 peak.

 #### Length and Amplitude {.western}

 Use an existing vowel file as a guide for how to set the amplitude and
 length of the keyframes. At the right of each keyframe, its length is
 shown in ms and under that is its relative (RMS) amplitude. The second
 keyframe should be marked with a red marker (use CTRL-M to toggle this).
 This divides the vowel into the front-part (with one frame), and the
 rest. Use F2 to play the sound of the new vowel sequence. It will also
 produce a WAV file (the default name is speech.wav) which you can read
 into praat to see whether it has a sensible shape.

 #### Using the New Vowel {.western}

 Make a new directory (eg. vwl\_xx) in phsource for your new vowels. Save
 the spectrum sequence with a name which you have chosen for it. You can
 then edit the phoneme file for your language (eg. phsource/ph\_xxx), and
 change a phoneme to refer to your new vowel file. Then do
 `Data->Compile_Phoneme_Data`{.western} from espeakedit's menubar to
 re-compile the phoneme data.
--- a/docs/commands.md
+++ b/docs/commands.md
@@ -0,0 +1,279 @@
 2.1 INSTALLATION {.western}
 ----------------

 ### 2.1.1 Linux and other Posix systems {.western}

 There are two versions of the command line program. They both have the
 same command parameters (see below).

 1.  2.  

 Place the **espeak-ng** or **speak-ng** executable file in the command
 path, eg in **/usr/local/bin**.

 Place the "**espeak-data**" directory in /usr/share as
 **/usr/share/espeak-data**.\
 Alternatively if it is placed in the user's home directory (i.e.
 **/home/\<user\>/espeak-data**) then that will be used instead.
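
The lookup order described above (the home directory first, then /usr/share) can be sketched as follows. The helper name `find_espeak_data` and its parameters are hypothetical, and the real program may consult other locations as well.

```python
import os


def find_espeak_data(home, system_dir="/usr/share"):
    """Return the first existing espeak-data directory, preferring the
    copy in the user's home directory over the system-wide one."""
    candidates = [
        os.path.join(home, "espeak-data"),
        os.path.join(system_dir, "espeak-data"),
    ]
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None
```

For example, `find_espeak_data(os.path.expanduser("~"))` returns the home-directory copy when it exists, and `None` if neither directory is present.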

 #### Dependencies {.western}

 **espeak-ng** uses the PortAudio sound library (version 18), so you will
 need to have the **libportaudio0** library package installed. It may be
 already, since it's used by other software, such as OpenOffice.org and
 the Audacity sound editor.

 Some Linux distributions (eg. SuSe 10) have version 19 of PortAudio
 which has a slightly different API. The speak program can be compiled to
 use version 19 of PortAudio by copying the file portaudio19.h to
 portaudio.h before compiling.

 The speak program may be compiled without using PortAudio, by removing
 the line

 ~~~~ {.western style="margin-bottom: 0.5cm"}
   #define USE_PORTAUDIO
 ~~~~

 in the file speech.h.

 ### 2.1.2 Windows {.western}

 The installer, **setup\_espeak.exe**, installs the SAPI5 version of
 eSpeak. During installation you need to specify which voices you want to
 appear in SAPI5 voice menus.

 It also installs a command line program **espeak-ng** in the espeak-ng
 program directory.

 2.2 COMMAND OPTIONS {.western}
 -------------------

 ### 2.2.1 Examples {.western}

 To use at the command line, type:\
   **espeak-ng "This is a test"**\
 or\
   **espeak-ng -f \<text file\>**

 Or just type\
   **espeak-ng**\
 followed by text on subsequent lines. Each line is spoken when RETURN
 is pressed.

 Use **espeak-ng -x** to see the corresponding phoneme codes.

 ### 2.2.2 The Command Line Options {.western}

 **espeak-ng [options] ["text words"]**
 :   Text input can be taken either from a file, from a string in the
    command, or from stdin.
 **-f \<text file\>**
 :   Speaks a text file.
 **--stdin**
 :   Takes the text input from stdin.
 If neither -f nor --stdin is given, then the text input is taken from "text words" (a text string within double quotes). \
 If that is not present then text is taken from stdin, but each line is treated as a separate sentence. \
 **-a \<integer\>**
 :   Sets amplitude (volume) in a range of 0 to 200. The default is 100.
 **-p \<integer\>**
 :   Adjusts the pitch in a range of 0 to 99. The default is 50.
 **-s \<integer\>**
 :   Sets the speed in words-per-minute (approximate values for the
    default English voice, others may differ slightly). The default
    value is 175. I generally use a faster speed of 260. The lower limit
    is 80. There is no upper limit, but about 500 is probably a
    practical maximum.
 **-b \<integer\>**
 :   Input text character format.
 :   1   UTF-8. This is the default.
 :   2   The 8-bit character set which corresponds to the language (eg.
    Latin-2 for Polish).
 :   4   16 bit Unicode.
 :   Without this option, eSpeak assumes text is UTF-8, but will
    automatically switch to the 8-bit character set if it finds an
    illegal UTF-8 sequence.
 **-g \<integer\>**
 :   Word gap. This option inserts a pause between words. The value is
    the length of the pause, in units of 10 ms (at the default speed of
    175 wpm).
 **-h** or **--help**
 :   The first line of output gives the eSpeak version number.
 **-k \<integer\>**
 :   Indicate words which begin with capital letters.
 :   1   eSpeak uses a click sound to indicate when a word starts with a
    capital letter, or a double click if the word is all capitals.
 :   2   eSpeak speaks the word "capital" before a word which begins with
    a capital letter.
 :   Other values:   eSpeak increases the pitch for words which begin
    with a capital letter. The greater the value, the greater the
    increase in pitch. Try -k20.
 **-l \<integer\>**
 :   Line-break length, default value 0. If set, then lines which are
    shorter than this are treated as separate clauses and spoken
    separately with a break between them. This can be useful for some
    text files, but bad for others.
 **-m**
 :   Indicates that the text contains SSML (Speech Synthesis Markup
    Language) tags or other XML tags. Those SSML tags which are
    supported are interpreted. Other tags, including HTML, are ignored,
    except that some HTML tags such as \<hr\> \<h2\> and \<li\> ensure a
    break in the speech.
 **-q**
 :   Quiet. No sound is generated. This may be useful with options such
    as -x and --pho.
 **-v \<voice filename\>[+\<variant\>]**
 :   Sets a Voice for the speech, usually to select a language. eg:

 ~~~~ {.western style="margin-left: 1cm; margin-bottom: 0.5cm"}
   espeak-ng -vaf
 ~~~~

 This selects the Afrikaans voice. A modifier after the voice name can be
 used to vary the tone of the voice, eg:

 ~~~~ {.western style="margin-left: 1cm; margin-bottom: 0.5cm"}
   espeak-ng -vaf+3
 ~~~~

 The variants are `+m1 +m2 +m3 +m4 +m5 +m6 +m7`{.western} for male voices
 and `+f1 +f2 +f3 +f4`{.western} which simulate female voices by using
 higher pitches. Other variants include `+croak`{.western} and
 `+whisper`{.western}.

 \<voice filename\> is a file within the `espeak-data/voices`{.western}
 directory.\
 \<variant\> is a file within the `espeak-data/voices/!v`{.western}
 directory.

 Voice files can specify a language, alternative pronunciations or
 phoneme sets, different pitches, tonal qualities, and prosody for the
 voice. See the [voices.html](voices.html) file.

 Voice names which start with **mb-** are for use with Mbrola diphone
 voices, see [mbrola.html](mbrola.html).

 Some languages may need additional dictionary data, see
 [languages.html](languages.html).

 **-w \<wave file\>**

 Writes the speech output to a file in WAV format, rather than speaking
 it.

 **-x**

 The phoneme mnemonics, into which the input text is translated, are
 written to stdout. If a phoneme name contains more than one letter (eg.
 [tS]), the --sep or --tie option can be used to distinguish this from
 separate phonemes.

 **-X**

 As -x, but in addition, details are shown of the pronunciation rule and
 dictionary list lookup. This can be useful to see why a certain
 pronunciation is being produced. Each matching pronunciation rule is
 listed, together with its score, the highest scoring rule being used in
 the translation. "Found:" indicates the word was found in the dictionary
 lookup list, and "Flags:" means the word was found with only properties
 and not a pronunciation. You can see when a word has been retranslated
 after removing a prefix or suffix.

 **-z**

 This option removes the end-of-sentence pause which normally occurs at
 the end of the text.

 **--stdout**

 Writes the speech output to stdout as it is produced, rather than
 speaking it. The data starts with a WAV file header which indicates the
 sample rate and format of the data. The length field is set to zero
 because the length of the data is unknown when the header is produced.
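
A consumer of this stream cannot rely on the length field, so it has to read until end of input. The Python sketch below assumes the conventional 44-byte PCM WAV header layout, which is an assumption and not something stated here; `read_wav_stream` is a hypothetical helper name.

```python
import struct

# Conventional 44-byte PCM WAV header (assumed layout, little-endian).
WAV_HEADER = struct.Struct("<4sI4s4sIHHIIHH4sI")


def read_wav_stream(stream):
    """Read the header, then the samples. A zero length field means the
    length was unknown, so the rest of the stream is taken as data."""
    raw = stream.read(WAV_HEADER.size)
    if len(raw) < WAV_HEADER.size:
        raise ValueError("stream is too short to hold a WAV header")
    (riff, _riff_size, wave, fmt_id, _fmt_size, _audio_format, channels,
     sample_rate, _byte_rate, _block_align, bits, data_id,
     data_size) = WAV_HEADER.unpack(raw)
    if (riff, wave, fmt_id, data_id) != (b"RIFF", b"WAVE", b"fmt ", b"data"):
        raise ValueError("unexpected header layout")
    samples = stream.read() if data_size == 0 else stream.read(data_size)
    info = {"channels": channels, "sample_rate": sample_rate, "bits": bits}
    return info, samples
```

In practice the stream would be the standard output of a command such as `espeak-ng --stdout "This is a test"`, for instance the `stdout` pipe of a `subprocess.Popen` object.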

 **--compile [=\<voice name\>]**

 Compile the pronunciation rule and dictionary lookup data from their
 source files in the current directory. The Voice determines which
 language's files are compiled. For example, if it's an English voice,
 then *en\_rules*, *en\_list*, and *en\_extra* (if present), are compiled
 to replace *en\_dict* in the *espeak-data* directory. If no Voice is
 specified then the default Voice is used.

 **--compile-debug [=\<voice name\>]**

 The same as **--compile**, but source line numbers from the \*\_rules
 file are included. These are included in the rules trace when the **-X**
 option is used.

 **--ipa**

 Writes phonemes to stdout, using the International Phonetic Alphabet
 (IPA).\
 If a phoneme name contains more than one letter (eg. [tS]), the --sep
 or --tie option can be used to distinguish this from separate phonemes.

 **--path [="\<directory path\>"]**

 Specifies the directory which contains the espeak-data directory.

 **--pho**

 When used with an mbrola voice (eg. -v mb-en1), it writes mbrola phoneme
 data (.pho file format) to stdout. This includes the mbrola phoneme
 names with duration and pitch information, in a form which is suitable
 as input to this mbrola voice. The --phonout option can be used to write
 this data to a file.

 **--phonout [="\<filename\>"]**

 If specified, the output from -x, -X, --ipa, and --pho options is
 written to this file, rather than to stdout.

 **--punct [="\<characters\>"]**

 Speaks the names of punctuation characters when they are encountered in
 the text. If \<characters\> are given, then only those listed
 punctuation characters are spoken, eg. `--punct=".,;?"`{.western}

 **--sep [=\<character\>]**

 The character is used to separate individual phonemes in the output
 which is produced by the -x or --ipa options. The default is a space
 character. The character z means use a ZWNJ character (U+200c).

 **--split [=\<minutes\>]**

 Used with **-w**, it starts a new WAV file every `<minutes>`{.western}
 minutes, at the next sentence boundary.

 **--tie [=\<character\>]**

 The character is used within multi-letter phonemes in the output which
 is produced by the -x or --ipa options. The default is the tie
 character  ͡  U+361. The character z means use a ZWJ character (U+200d).
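
To illustrate the difference between the two options, the Python sketch below formats a list of phoneme names, placing a separator between phonemes or a tie inside multi-letter names. It is a simplified model, not eSpeak's output code: the helper names are hypothetical, and joining with nothing when an option is absent is an assumption.

```python
ZWNJ = "\u200c"  # zero-width non-joiner, selected by --sep=z
ZWJ = "\u200d"   # zero-width joiner, selected by --tie=z
TIE = "\u0361"   # default tie character


def option_char(value, default, zero_width):
    """Resolve an option value: empty means the default character,
    'z' means the zero-width character, anything else is used as given."""
    if value is None or value == "":
        return default
    if value == "z":
        return zero_width
    return value


def format_phonemes(phonemes, sep="", tie=""):
    """Join phoneme names, putting `tie` between the letters of each
    multi-letter name and `sep` between separate phonemes."""
    return sep.join(tie.join(name) for name in phonemes)
```

With the phonemes `tS`, `I`, `p`, a space separator gives `tS I p`, so the reader can see that `tS` is a single phoneme and not `t` followed by `S`.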

 **--voices [=\<language code\>]**

 Lists the available voices.\
 If =\<language code\> is present then only those voices which are
 suitable for that language are listed.\
 `--voices=mbrola`{.western} lists the voices which use mbrola diphone
 voices. These are not included in the default `--voices`{.western} list\
 `--voices=variant`{.western} lists the available voice variants (voice
 modifiers).

 ### 2.2.3 The Input Text {.western}

 **HTML Input** 
 :   If the -m option is used to indicate marked-up text, then HTML can
    be spoken directly.
 **Phoneme Input** 
 :   As well as plain text, phoneme mnemonics can be used in the text
    input to **espeak-ng**. They are enclosed within double square
    brackets. Spaces are used to separate words and all stressed
    syllables must be marked explicitly.
 :     eg:  
    `espeak-ng -v en "[[D,Is Iz sVm f@n'EtIk t'Ekst 'InpUt]]"`{.western}
 :   This command will speak: "This is some phonetic text input".
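
Plain text and phoneme input can be mixed in one string. As an illustration of the bracket convention only, the Python sketch below splits such a string into plain-text and phoneme segments; `split_input` is a hypothetical helper and does not mirror the program's own parser.

```python
import re


def split_input(text):
    """Split text into ('text', str) and ('phonemes', [words]) segments,
    where phoneme input is enclosed in double square brackets."""
    segments = []
    pos = 0
    for found in re.finditer(r"\[\[(.*?)\]\]", text):
        if found.start() > pos:
            segments.append(("text", text[pos:found.start()]))
        # Spaces separate the words inside a phoneme string.
        segments.append(("phonemes", found.group(1).split()))
        pos = found.end()
    if pos < len(text):
        segments.append(("text", text[pos:]))
    return segments
```

For the input `Say [[D,Is Iz]] now`, this yields a text segment, a phoneme segment with the two words `D,Is` and `Iz`, and a final text segment.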

--- a/docs/dictionary.md
+++ b/docs/dictionary.md
@@ -0,0 +1,655 @@
 4. TEXT TO PHONEME TRANSLATION {.western}
 ------------------------------

 ### 4.1 Translation Files {.western}

 There is a separate set of pronunciation files for each language, their
 names starting with the language name.

 There are two separate methods for translating words into phonemes:

 -   -   

 These two files are compiled into the file ***\<language\>\_dict***  in
 the espeak-data directory (eg. espeak-data/en\_dict)

 ### 4.2 Phoneme names {.western}

 Each of the language's phonemes is represented by a mnemonic of 1, 2, 3,
 or 4 characters. Together with a number of utility codes (eg. stress
 marks and pauses), these are defined in the phoneme data file (see
 \*spec not yet available\*).

 The utility 'phonemes' are:

 +--------------------------------------+--------------------------------------+
 | **'**                                | primary stress                       |
 +--------------------------------------+--------------------------------------+
 | **,**                                | secondary stress                     |
 +--------------------------------------+--------------------------------------+
 | **%**                                | unstressed syllable                  |
 +--------------------------------------+--------------------------------------+
 | **=   **                             | put the primary stress on the        |
 |                                      | preceding syllable                   |
 +--------------------------------------+--------------------------------------+
 | **\_:**                              | short pause                          |
 +--------------------------------------+--------------------------------------+
 | **\_**                               | a shorter pause                      |
 +--------------------------------------+--------------------------------------+
 | **||**                               | indicates a word boundary within a   |
 |                                      | phoneme string                       |
 +--------------------------------------+--------------------------------------+
 | **|**                                | can be used to separate two adjacent |
 |                                      | characters, to prevent them from     |
 |                                      | being considered as a                |
 |                                      | multi-character phoneme mnemonic     |
 +--------------------------------------+--------------------------------------+

 It is not necessary to specify the stress of every syllable. Stress
 markers are only needed in order to change the effect of the language's
 default stress rule.

 The phonemes which are used to represent a language's sounds are based
 loosely on the Kirshenbaum ascii character representation of the
 International Phonetic Alphabet
 [www.kirshenbaum.net/IPA/ascii-ipa.pdf](http://www.kirshenbaum.net/IPA/ascii-ipa.pdf)

 ### 4.3 Pronunciation Rules {.western}

 The rules in the ***\<language\>\_rules***  file specify the phonemes
 which are used to pronounce each letter, or sequence of letters. Some
 rules only apply when the letter or letters are preceded by, or followed
 by, other specified letters.

 To find the pronunciation of a word, the rules are searched and any
 which match the letters at the current position in the word are given a
 score depending on how many letters are matched. The pronunciation from
 the best
 matching rule is chosen. The pointer into the source word is then
 advanced past those letters which have been matched and the process is
 repeated until all the letters of the word have been processed.
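
The search loop described above can be sketched in Python. This is a deliberately reduced model, assuming rules with no \<pre\> or \<post\> context and a score equal to the number of matched letters; the rule list and the `translate` function are hypothetical examples, not eSpeak data or code.

```python
# Hypothetical rules: (letters to match, phoneme string).
RULES = [("o", "0"), ("oo", "u:"), ("b", "b"), ("k", "k"), ("t", "t")]


def translate(word, rules):
    """Translate a word by repeatedly taking the best-scoring rule at the
    current position and advancing past the letters it matched."""
    pos = 0
    phonemes = []
    while pos < len(word):
        best = None
        for match, output in rules:
            if word.startswith(match, pos):
                score = len(match)  # more matched letters, higher score
                if best is None or score > best[0]:
                    best = (score, match, output)
        if best is None:
            raise ValueError("no rule matches at position %d" % pos)
        phonemes.append(best[2])
        pos += len(best[1])
    return phonemes
```

For the word `boot`, the rule for `oo` outscores the rule for `o` at the second letter, so the result is `b`, `u:`, `t`.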

 #### 4.3.1 Rule Groups {.western}

 The rules are organized in groups, each starting with a ".group" line:

 When matching a word, firstly the 2-letter group for the two letters at
 the current position in the word (if such a group exists) is searched,
 and then the single-letter group. The highest scoring rule in either of
 those two groups is used.
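
The group lookup order can be modelled as a dictionary keyed by one or two letters. The Python sketch below only gathers the candidate rules for a position; the group contents and the `candidate_rules` helper are hypothetical, and scoring is left out.

```python
# Hypothetical groups: group letters -> list of (match, phoneme string).
GROUPS = {
    "o": [("o", "0")],
    "oo": [("oo", "u:")],
    "t": [("t", "t")],
    "l": [("l", "l")],
}


def candidate_rules(groups, word, pos):
    """Collect rules from the 2-letter group at `pos` (if one exists),
    followed by the rules from the single-letter group."""
    two = word[pos:pos + 2]
    one = word[pos:pos + 1]
    rules = []
    if len(two) == 2:
        rules.extend(groups.get(two, []))
    rules.extend(groups.get(one, []))
    return rules
```

At the second letter of `tool` both the `oo` group and the `o` group contribute candidates, and the highest-scoring rule among them would then be chosen.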

 #### 4.3.2 Rules {.western}

 Each rule is on a separate line, and has the syntax:

 eg.

 "oo" is pronounced as [u:], but when also preceded by "b" and followed
 by "k", it is pronounced [U].

 In the case of a single-letter group, the first character of \<match\>
 must be the group letter. In the case of a 2-letter group, the first two
 characters of \<match\> must be the group letters. The second and third
 rules above may be in either .group o or .group oo.

 Alphabetic characters in the \<pre\>, \<match\>, and \<post\> parts must
 be lower case, and matching is case-insensitive. Some upper case letters
 are used in \<pre\> and \<post\> with special meanings.

 #### 4.3.3 Special characters in \<phoneme string\>: {.western}

 +--------------------------------------+--------------------------------------+
 | **\_\^\_\<language code\>   **       | Translate using a different          |
 |                                      | language.                            |
 +--------------------------------------+--------------------------------------+

 #### 4.3.4 Special Characters in both \<pre\> and \<post\>: {.western}

 +--------------------------------------+--------------------------------------+
 | **\_**                               | Beginning or end of a word (or a     |
 |                                      | hyphen).                             |
 +--------------------------------------+--------------------------------------+
 | **-**                                | Hyphen.                              |
 +--------------------------------------+--------------------------------------+
 | **A**                                | Any vowel (the set of vowel          |
 |                                      | characters may be defined for a      |
 |                                      | particular language).                |
 +--------------------------------------+--------------------------------------+
 | **C**                                | Any consonant.                       |
 +--------------------------------------+--------------------------------------+
 | **B H F G Y **                       | These may indicate other sets of     |
 |                                      | characters (defined for a particular |
 |                                      | language).                           |
 +--------------------------------------+--------------------------------------+
 | **L\<nn\>**                          | Any of the sequence of characters    |
 |                                      | defined as a letter group (see 4.3.1 |
 |                                      | above).                              |
 +--------------------------------------+--------------------------------------+
 | **D**                                | Any digit.                           |
 +--------------------------------------+--------------------------------------+
 | **K**                                | Not a vowel (i.e. a consonant or     |
 |                                      | word boundary or non-alphabetic      |
 |                                      | character).                          |
 +--------------------------------------+--------------------------------------+
 | **X**                                | There is no vowel until the word     |
 |                                      | boundary.                            |
 +--------------------------------------+--------------------------------------+
 | **Z**                                | A non-alphabetic character.          |
 +--------------------------------------+--------------------------------------+
 | **%**                                | Doubled (placed before a character   |
 |                                      | in \<pre\> and after it in           |
 |                                      | \<post\>).                           |
 +--------------------------------------+--------------------------------------+
 | **/**                                | The following character is treated   |
 |                                      | literally.                           |
 +--------------------------------------+--------------------------------------+

 The sets of letters indicated by A, B, C, E, F, G may be defined
 differently for each language.

 Examples of rules:

 ~~~~ {.western}
     _)  a         // "a" at the start of a word
         a (CC     // "a" followed by two consonants
         a (C%     // "a" followed by a double consonant (the same letter twice)
         a (/%     // "a" followed by a percent sign
     %C) a         // "a" preceded by a double consonant
 ~~~~
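
To make the \<post\> examples concrete, the Python sketch below checks the letters that follow a match against a small subset of the special characters (`A`, `C`, `%`, `/` and `_`). It is a hypothetical illustration: the vowel set is assumed, `_` is treated only as end of word (not as a hyphen), and the real matcher handles many more cases.

```python
VOWELS = set("aeiou")  # assumed vowel set; each language may define its own


def match_post(pattern, following):
    """Return True if the letters after the match satisfy `pattern`."""
    pos = 0          # position within `following`
    previous = None  # last character consumed, needed by '%'
    i = 0
    while i < len(pattern):
        symbol = pattern[i]
        i += 1
        if symbol == "_":
            if pos != len(following):  # word boundary: nothing may remain
                return False
            continue
        if pos >= len(following):
            return False
        ch = following[pos]
        if symbol == "/":
            # The next pattern character is taken literally.
            if i >= len(pattern) or ch != pattern[i]:
                return False
            i += 1
        elif symbol == "%":
            if ch != previous:  # doubled: must repeat the previous letter
                return False
        elif symbol == "A":
            if ch not in VOWELS:
                return False
        elif symbol == "C":
            if not ch.isalpha() or ch in VOWELS:
                return False
        elif ch != symbol:
            return False
        previous = ch
        pos += 1
    return True
```

Under these assumptions, `match_post("CC", "nt")` and `match_post("C%", "dd")` succeed, while `match_post("C%", "nd")` fails because the second consonant does not repeat the first.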

 #### 4.3.5 Special characters only in \<pre\>: {.western}

 +--------------------------------------+--------------------------------------+
 | **@   **                             | Any syllable.                        |
 +--------------------------------------+--------------------------------------+
 | **&**                                | A syllable which may be stressed     |
 |                                      | (i.e. is not defined as unstressed). |
 +--------------------------------------+--------------------------------------+
 | **V**                                | Matches only if a previous word has  |
 |                                      | indicated that a verb form is        |
 |                                      | expected.                            |
 +--------------------------------------+--------------------------------------+

 eg.

 ~~~~ {.western}
     @@)  bi      // "bi" preceded by at least two syllables
     @@a) bi      // "bi" preceded by at least 2 syllables and following 'a'
 ~~~~

 Note that matching characters in the \<pre\> part do not affect the
 syllable counting.

 #### 4.3.6 Special characters only in \<post\>: {.western}

 +--------------------------------------+--------------------------------------+
 | **@**                                | A vowel follows somewhere in the     |
 |                                      | word.                                |
 +--------------------------------------+--------------------------------------+
 | **+**                                | Force an increase in the score in    |
 |                                      | this rule (may be repeated for more  |
 |                                      | effect).                             |
 +--------------------------------------+--------------------------------------+
 | **S\<number\>  **                    | This number of matching characters   |
 |                                      | are a standard suffix, remove them   |
 |                                      | and retranslate the word.            |
 +--------------------------------------+--------------------------------------+
 | **P\<number\>**                      | This number of matching characters   |
 |                                      | are a standard prefix, remove them   |
 |                                      | and retranslate the word.            |
 +--------------------------------------+--------------------------------------+
 | **Lnn**                              | **nn** is a 2-digit decimal number   |
 |                                      | in the range 01 to 20\               |
 |                                      |  Matches with any of the letter      |
 |                                      | sequences which have been defined    |
 |                                      | for letter group **nn**              |
 +--------------------------------------+--------------------------------------+
 | **N**                                | Only use this rule if the word is    |
 |                                      | not a retranslation after removing a |
 |                                      | suffix.                              |
 +--------------------------------------+--------------------------------------+
 | **\#**                               | (English specific) change the next   |
 |                                      | "e" into a special character "E"     |
 +--------------------------------------+--------------------------------------+
 | **\$noprefix**                       | Only use this rule if the word is    |
 |                                      | not a retranslation after removing a |
 |                                      | prefix.                              |
 +--------------------------------------+--------------------------------------+
 | **\$w\_alt\                          | Only use this rule if the word is    |
 |  \$w\_alt2\                          | found in the \*\_list file with the  |
 |  \$w\_alt3**                         | **\$alt**, **\$alt2** or **\$alt3**  |
 |                                      | attribute respectively.              |
 +--------------------------------------+--------------------------------------+
 | **\$p\_alt\                          | Only use this rule if the part-word, |
 |  \$p\_alt2\                          | up to and including the pre and      |
 |  \$p\_alt3**                         | match parts of this rule, is found   |
 |                                      | in the \*\_list file with the        |
 |                                      | **\$alt**, **\$alt2** or **\$alt3**  |
 |                                      | attribute respectively.              |
 +--------------------------------------+--------------------------------------+

 eg.

 ~~~~ {.western}
   @) ly (_S2   lI      // "ly", at end of a word with at least one other
                        //   syllable, is a suffix pronounced [lI].  Remove
                        //   it and retranslate the word.

   _) un (@P2   %Vn     // "un" at the start of a word is an unstressed
                        //   prefix pronounced [Vn]
   _) un (i     ju:     // ... except in words starting "uni"
   _) un (inP2  ,Vn     // ... but it is for words starting "unin"
 ~~~~

 S and P must be at the end of the \<post\> string.

 S\<number\> may be followed by additional letters (eg. S2ei ). Some of
 these are probably specific to English, but similar functions could be
 made for other languages.

 +--------------------------------------+--------------------------------------+
 | **q**                                | query the \_list file to find stress |
 |                                      | position or other attributes for the |
 |                                      | stem, but don't re-translate the     |
 |                                      | word with the suffix removed.        |
 +--------------------------------------+--------------------------------------+
 | **t**                                | determine the stress pattern of the  |
 |                                      | word **before** adding the suffix    |
 +--------------------------------------+--------------------------------------+
 | **d   **                             | the previous letter may have been    |
 |                                      | doubled when the suffix was added.   |
 +--------------------------------------+--------------------------------------+
 | **e**                                | "e" may have been removed.           |
 +--------------------------------------+--------------------------------------+
 | **i**                                | "y" may have been changed to "i."    |
 +--------------------------------------+--------------------------------------+
 | **v**                                | the suffix means the verb form of    |
 |                                      | pronunciation should be used.        |
 +--------------------------------------+--------------------------------------+
 | **f**                                | the suffix means the next word is    |
 |                                      | likely to be a verb.                 |
 +--------------------------------------+--------------------------------------+
 | **m**                                | after this suffix has been removed,  |
 |                                      | additional suffixes may be removed.  |
 +--------------------------------------+--------------------------------------+
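
 As a rough illustration of how a suffix specification such as `S2ei` breaks
 down, the sketch below splits it into the suffix length and the option
 letters from the table above. The names `SUFFIX_FLAGS`, `parse_suffix_spec`
 and `strip_suffix` are hypothetical and are not part of eSpeak; the real
 handling is done inside eSpeak's translator.

```python
import re

# Option letters and meanings copied from the table above.
# This dictionary and the function names are illustrative assumptions.
SUFFIX_FLAGS = {
    "q": "query the _list file for the stem, but do not retranslate",
    "t": "determine the stress pattern before adding the suffix",
    "d": "the previous letter may have been doubled",
    "e": '"e" may have been removed',
    "i": '"y" may have been changed to "i"',
    "v": "use the verb form of pronunciation",
    "f": "the next word is likely to be a verb",
    "m": "additional suffixes may be removed",
}


def parse_suffix_spec(spec):
    """Split a spec such as 'S2ei' into (length, [option letters])."""
    m = re.fullmatch(r"S(\d+)([a-z]*)", spec)
    if m is None:
        raise ValueError(f"not a suffix spec: {spec!r}")
    flags = list(m.group(2))
    unknown = [f for f in flags if f not in SUFFIX_FLAGS]
    if unknown:
        raise ValueError(f"unknown suffix options: {unknown}")
    return int(m.group(1)), flags


def strip_suffix(word, spec):
    """Return (stem, options) after removing the suffix characters."""
    length, flags = parse_suffix_spec(spec)
    return word[:len(word) - length], flags
```

 For example, `strip_suffix("quickly", "S2")` gives the stem `quick`, which
 would then be retranslated.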

 P\<number\> may be followed by additional letters (eg. P3v ).

 +--------------------------------------+--------------------------------------+
 | **t   **                             | determine the stress pattern of the  |
 |                                      | word **before** adding the prefix    |
 +--------------------------------------+--------------------------------------+
 | **v**                                | the prefix means the verb form of    |
 |                                      | pronunciation should be used.        |
 +--------------------------------------+--------------------------------------+

 ### 4.4 Pronunciation Dictionary List {.western}

 The ***\<language\>\_list***  file contains a list of words whose
 pronunciations are given explicitly, rather than determined by the
 Pronunciation Rules. The ***\<language\>\_extra***  file, if present, is
 also used and its contents are taken as coming after those in
 ***\<language\>\_list***.

 Also the list can be used to specify the stress pattern, or other
 properties, of a word.

 If the Pronunciation rules are applied to a word and indicate a standard
 prefix or suffix, then the word is again looked up in Pronunciation
 Dictionary List after the prefix or suffix has been removed.

 Lines in the dictionary list have the form:

 eg.

 ~~~~ {.western style="margin-bottom: 0.5cm"}
     book      bUk
 ~~~~

 Rather than a full pronunciation, just the stress may be given, to
 change where it would be otherwise placed by the Pronunciation Rules:

 ~~~~ {.western}
     berlin       $2      // stress on second syllable
     absolutely   $3      // stress on third syllable
     for          $u      // an unstressed word
 ~~~~

 #### 4.4.1 Multiple Words {.western}

 A pronunciation may also be specified for a group of words, when these
 appear together. Up to four words may be given, enclosed in brackets.
 This may be used to change the pronunciation or stress pattern when
 these words occur together,

 ~~~~ {.western style="margin-bottom: 0.5cm"}
    (de jure)    deI||dZ'U@rI2   // note || used as a word break in the phoneme string
 ~~~~

 or to run them together, pronounced as a single word

 ~~~~ {.western style="margin-bottom: 0.5cm"}
    (of a)       @v@
 ~~~~

 or to give them a flag when they occur together

 ~~~~ {.western style="margin-bottom: 0.5cm"}
    (such as)    sVtS||a2z   $pause        // precede with a pause
 ~~~~

 Hyphenated words in the ***\<language\>\_list***  file must also be
 enclosed within brackets, because the two parts are considered as
 separate words.
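
 The line formats shown so far (a single word or a bracketed word group,
 an optional phoneme string, optional `$` flags and an optional `//`
 comment) can be read with a small parser. This is only a sketch under
 those assumptions; `parse_list_line` is a hypothetical helper, not
 eSpeak's own dictionary compiler.

```python
def parse_list_line(line):
    """Return (words, phonemes, flags) for one _list line, or None if empty.

    Assumes '//' starts a comment and that tokens starting with '$' are flags.
    """
    text = line.split("//", 1)[0].strip()   # drop the trailing comment
    if not text:
        return None
    if text.startswith("("):
        # Bracketed group of words, e.g. "(de jure)"
        close = text.index(")")
        words = text[1:close].split()
        rest = text[close + 1:].split()
    else:
        first, *rest = text.split()
        words = [first]
    flags = [t for t in rest if t.startswith("$")]
    phonemes = [t for t in rest if not t.startswith("$")]
    return words, (phonemes[0] if phonemes else None), flags
```

 With this sketch, `(such as)    sVtS||a2z   $pause` yields the words
 `such` and `as`, the phoneme string `sVtS||a2z` and the flag `$pause`.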

 #### 4.4.2 Special characters in \<phoneme string\>: {.western}

 +--------------------------------------+--------------------------------------+
 | **\_\^\_\<language code\>   **       | Translate using a different          |
 |                                      | language. See explanation in 4.3.3   |
 |                                      | above.                               |
 +--------------------------------------+--------------------------------------+

 #### 4.4.3 Flags {.western}

 A word (or group of words) may be given one or more flags, either
 instead of, or as well as, the phonetic translation.

 +--------------------------------------+--------------------------------------+
 | \$u                                  | The word is unstressed. In the case  |
 |                                      | of a multi-syllable word, a slight   |
 |                                      | stress is applied according to the   |
 |                                      | default stress rules.                |
 +--------------------------------------+--------------------------------------+
 | \$u1                                 | The word is unstressed, with a       |
 |                                      | slight stress on its 1st syllable.   |
 +--------------------------------------+--------------------------------------+
 | \$u2                                 | The word is unstressed, with a       |
 |                                      | slight stress on its 2nd syllable.   |
 +--------------------------------------+--------------------------------------+
 | \$u3                                 | The word is unstressed, with a       |
 |                                      | slight stress on its 3rd syllable.   |
 +--------------------------------------+--------------------------------------+
 |                                      |                                      |
 +--------------------------------------+--------------------------------------+
 | \$u+ \$u1+ \$u2+ \$u3+               | As above, but the word has full      |
 |                                      | stress if it's at the end of a       |
 |                                      | clause.                              |
 +--------------------------------------+--------------------------------------+
 |                                      |                                      |
 +--------------------------------------+--------------------------------------+
 | \$1                                  | Primary stress on the 1st syllable.  |
 +--------------------------------------+--------------------------------------+
 | \$2                                  | Primary stress on the 2nd syllable.  |
 +--------------------------------------+--------------------------------------+
 | \$3                                  | Primary stress on the 3rd syllable.  |
 +--------------------------------------+--------------------------------------+
 | \$4                                  | Primary stress on the 4th syllable.  |
 +--------------------------------------+--------------------------------------+
 | \$5                                  | Primary stress on the 5th syllable.  |
 +--------------------------------------+--------------------------------------+
 | \$6                                  | Primary stress on the 6th syllable.  |
 +--------------------------------------+--------------------------------------+
 | \$7                                  | Primary stress on the 7th syllable.  |
 +--------------------------------------+--------------------------------------+
 |                                      |                                      |
 +--------------------------------------+--------------------------------------+
 | \$pause                              | Ensure a short pause before this     |
 |                                      | word (eg. for conjunctions such as   |
 |                                      | "and", some prepositions, etc).      |
 +--------------------------------------+--------------------------------------+
 | \$brk                                | Ensure a very short pause before     |
 |                                      | this word, shorter than \$pause (eg. |
 |                                      | for some prepositions, etc).         |
 +--------------------------------------+--------------------------------------+
 | \$only                               | The rule does not apply if a prefix  |
 |                                      | or suffix has already been removed.  |
 +--------------------------------------+--------------------------------------+
 | \$onlys                              | As \$only, except that a standard    |
 |                                      | plural ending is allowed.            |
 +--------------------------------------+--------------------------------------+
 | \$stem                               | The rule only applies if a suffix    |
 |                                      | has already been removed.            |
 +--------------------------------------+--------------------------------------+
 | \$strend                             | Word is fully stressed if it's at    |
 |                                      | the end of a clause.                 |
 +--------------------------------------+--------------------------------------+
 | \$strend2                            | As \$strend, but the word is also    |
 |                                      | stressed if followed only by         |
 |                                      | unstressed word(s).                  |
 +--------------------------------------+--------------------------------------+
 | \$unstressend                        | Word is unstressed if it's at the    |
 |                                      | end of a clause.                     |
 +--------------------------------------+--------------------------------------+
 | \$atend                              | Use this pronunciation if it's at    |
 |                                      | the end of a clause.                 |
 +--------------------------------------+--------------------------------------+
 | \$double                             | Cause a doubling of the initial      |
 |                                      | consonant of the following word      |
 |                                      | (used for Italian).                  |
 +--------------------------------------+--------------------------------------+
 | \$capital                            | Use this pronunciation if the word   |
 |                                      | has initial capital letter (eg.      |
 |                                      | polish v Polish).                    |
 +--------------------------------------+--------------------------------------+
 | \$allcaps                            | Use this pronunciation if the word   |
 |                                      | is all capitals.                     |
 +--------------------------------------+--------------------------------------+
 | \$dot                                | Ignore a . after this word even when |
 |                                      | followed by a capital letter (eg.    |
 |                                      | Mr. Dr. ).                           |
 +--------------------------------------+--------------------------------------+
 | \$hasdot                             | Use this pronunciation if the word   |
 |                                      | is followed by a dot. (This          |
 |                                      | attribute also implies \$dot).       |
 +--------------------------------------+--------------------------------------+
 | \$sentence                           | The rule only applies if the clause  |
 |                                      | includes end-of-sentence (i.e. it is |
 |                                      | not terminated by a comma). For      |
 |                                      | example, "\$atend \$sentence" means  |
 |                                      | that the rule only applies at the    |
 |                                      | end of a sentence.                   |
 +--------------------------------------+--------------------------------------+
 | \$abbrev                             | This has two meanings.\              |
 |                                      |  1. If there is no phoneme string:   |
 |                                      | Speak the word as individual         |
 |                                      | letters, even if it contains a vowel |
 |                                      | (eg. "abc" should be spoken as "a"   |
 |                                      | "b" "c").\                           |
 |                                      |  2. If there is a phoneme string:    |
 |                                      | This word is capitalized because it  |
 |                                      | is an abbreviation and               |
 |                                      | capitalization does not indicate     |
 |                                      | emphasis (if the "emphasize          |
 |                                      | all-caps" is on).                    |
 +--------------------------------------+--------------------------------------+
 |                                      |                                      |
 +--------------------------------------+--------------------------------------+
 | \$accent                             | Used for the pronunciation of a      |
 |                                      | single alphabetic character. The     |
 |                                      | character name is spoken as the      |
 |                                      | base-letter name plus the accent     |
 |                                      | (diacritic) name. eg. It can be used |
 |                                      | to specify that "â" is spoken as "a" |
 |                                      | "circumflex".                        |
 +--------------------------------------+--------------------------------------+
 | \$combine                            | This word is treated as though it is |
 |                                      | combined with the following word     |
 |                                      | with a hyphen. This may be subject   |
 |                                      | to further conditions for certain    |
 |                                      | languages.                           |
 +--------------------------------------+--------------------------------------+
 | \$alt   \$alt2   \$alt3              | These are language specific. Their   |
 |                                      | use should be described in the       |
 |                                      | language's \*\*\_list file           |
 +--------------------------------------+--------------------------------------+
 |                                      |                                      |
 +--------------------------------------+--------------------------------------+
 | \$verb                               | Use this pronunciation if it's a     |
 |                                      | verb.                                |
 +--------------------------------------+--------------------------------------+
 | \$noun                               | Use this pronunciation if it's a     |
 |                                      | noun.                                |
 +--------------------------------------+--------------------------------------+
 | \$past                               | Use this pronunciation if it's past  |
 |                                      | tense.                               |
 +--------------------------------------+--------------------------------------+
 | \$verbf                              | The following word is probably a     |
 |                                      | verb.                                |
 +--------------------------------------+--------------------------------------+
 | \$verbsf                             | The following word is probably a     |
 |                                      | verb if it has an "s" suffix.        |
 +--------------------------------------+--------------------------------------+
 | \$nounf                              | The following word is probably not a |
 |                                      | verb.                                |
 +--------------------------------------+--------------------------------------+
 | \$pastf                              | The following word is probably past  |
 |                                      | tense.                               |
 +--------------------------------------+--------------------------------------+
 | \$verbextend                         | Extend the influence of \$verbf and  |
 |                                      | \$verbsf.                            |
 +--------------------------------------+--------------------------------------+

 The last group are probably English specific, but something similar may
 be useful in other languages. They are a crude attempt to improve the
 accuracy of pairs like ob'ject (verb) v 'object (noun) and read
 (present) v read (past).

 The dictionary list is searched from bottom to top. The first match that
 satisfies any conditions is used (i.e. the one lowest down the list). So
 if we have:

 ~~~~ {.western}
    to    t@               // unstressed version
    to    tu:   $atend     // stressed version
 ~~~~

 then if "to" is at the end of the clause, we get [tu:], if not then we
 get [t@].
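
 The bottom-to-top search can be pictured as a reversed scan that stops at
 the first entry whose conditions hold. The sketch below is a
 simplification: it treats every flag on an entry as a condition that must
 be present in the current context, whereas in practice only some flags
 (such as `$atend`) are conditions. The function name `lookup` and the data
 layout are assumptions made for illustration.

```python
def lookup(entries, word, context):
    """Search entries from the bottom of the list upwards.

    entries: list of (word, phonemes, set_of_condition_flags) in file order.
    context: set of conditions that currently hold, e.g. {"$atend"}.
    """
    for entry_word, phonemes, conditions in reversed(entries):
        # Use the lowest entry whose conditions are all satisfied.
        if entry_word == word and conditions <= context:
            return phonemes
    return None


entries = [
    ("to", "t@", set()),           # unstressed version
    ("to", "tu:", {"$atend"}),     # stressed version
]
```

 Here `lookup(entries, "to", {"$atend"})` returns `tu:`, while
 `lookup(entries, "to", set())` falls through to `t@`.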

 #### 4.4.4 Translating a Word to another Word {.western}

 Rather than specifying the pronunciation of a word by a phoneme string,
 you can specify another "sounds like" word.

 Use the attribute **\$text** eg.

 ~~~~ {.western style="margin-bottom: 0.5cm"}
    cough    coff   $text
 ~~~~

 Alternatively, use the command **\$textmode** on a line by itself to
 turn this on for all subsequent entries in the file, until it's turned
 off by **\$phonememode**. eg.

 ~~~~ {.western}
    $textmode
    cough     coff
    through   threw
    $phonememode
 ~~~~

 This feature cannot be used for the special entries in the **\_list**
 files which start with an underscore, such as numbers.

 Currently "textmode" entries are only recognized for complete words, and
 not for stems from which a prefix or suffix has been removed (eg.
 the word "coughs" would not match the example above).
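
 The effect of `$textmode` and `$phonememode` can be sketched as a simple
 mode switch while reading the file. The helper `read_entries` below is
 hypothetical and ignores flags other than `$text`; it only shows how each
 entry ends up classified as replacement text or as a phoneme string.

```python
def read_entries(lines):
    """Return a list of (word, value, kind), kind is 'text' or 'phonemes'."""
    text_mode = False
    out = []
    for raw in lines:
        tokens = raw.split("//", 1)[0].split()   # ignore comments
        if not tokens:
            continue
        if tokens == ["$textmode"]:
            text_mode = True                      # applies to later entries
            continue
        if tokens == ["$phonememode"]:
            text_mode = False
            continue
        if len(tokens) < 2:
            continue                              # nothing to classify
        word, value = tokens[0], tokens[1]
        is_text = text_mode or "$text" in tokens[2:]
        out.append((word, value, "text" if is_text else "phonemes"))
    return out
```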

 ### 4.5 Conditional Rules {.western}

 Rules in a **\_rules** file and entries in a **\_list** file can be made
 conditional. They apply only to some voices. This can be useful to
 specify different pronunciations for different variants of a language
 (dialects or accents).

 Conditional rules have   **?**   and a condition number at the start of
 the line in the **\_rules** or **\_list** file. This means that the rule
 only applies if that condition number is specified in a **dictrules**
 line in the [voice file](voices.html).

 If the rule starts with   **?!**   then the rule only applies if the
 condition number is **not** specified in the voice file. eg.

 ~~~~ {.western}
   ?3     can't     kant    // only use this if the voice has:  dictrules 3
   ?!3    rather    rA:D3   // only use if the voice doesn't have:  dictrules 3
 ~~~~
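
 The `?` and `?!` prefixes amount to a membership test against the
 condition numbers listed in the voice's **dictrules** line. A minimal
 sketch, assuming the condition numbers have already been read into a set
 (`rule_applies` is a hypothetical name, not an eSpeak function):

```python
import re


def rule_applies(line, dictrules):
    """Return (applies, rest_of_line) for a line that may start with ?N or ?!N.

    dictrules: set of condition numbers from the voice file.
    """
    m = re.match(r"\s*\?(!?)(\d+)\s+(.*)", line)
    if m is None:
        return True, line.strip()          # unconditional line
    negated = m.group(1) == "!"
    present = int(m.group(2)) in dictrules
    applies = (not present) if negated else present
    return applies, m.group(3)
```

 With `dictrules 3` in the voice file, the `?3` entry is kept and the `?!3`
 entry is skipped; without it, the reverse happens.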

 ### 4.6 Numbers and Character Names {.western}

 #### 4.6.1 Letter names {.western}

 The names of individual letters can be given either in the **\_rules**
 or **\_list** file. Sometimes an individual letter is also used as a
 word in the language and its pronunciation as a word differs from its
 letter name. If so, it should be listed in the **\_list** file, preceded
 by an underscore, to give the letter name (as distinct from its
 pronunciation as a word). eg. in English:

 ~~~~ {.western style="margin-bottom: 0.5cm"}
   _a   eI
 ~~~~

 #### 4.6.2 Numbers {.western}

 The operation of the TranslateNumber() function is controlled by the
 language's `langopts.numbers`{.western} option. This constructs spoken
 numbers from fragments according to various options which can be set for
 each language. The number fragments are given in the **\_list** file.

 +--------------------------------------+--------------------------------------+
 | \_0 to \_9                           | The numbers 0 to 9                   |
 +--------------------------------------+--------------------------------------+
 | \_13                                 | etc. Any pronunciations which are    |
 |                                      | needed for specific numbers in the   |
 |                                      | range \_10 to \_99                   |
 +--------------------------------------+--------------------------------------+
 | \_2X  \_3X                           | Twenty, thirty, etc., used to make   |
 |                                      | numbers 10 to 99                     |
 +--------------------------------------+--------------------------------------+
 | \_0C                                 | The word for "hundred"               |
 +--------------------------------------+--------------------------------------+
 | \_1C  \_2C                           | Special pronunciation for one        |
 |                                      | hundred, two hundred, etc., if       |
 |                                      | needed.                              |
 +--------------------------------------+--------------------------------------+
 | \_1C0                                | Special pronunciation (if needed)    |
 |                                      | for 100 exactly                      |
 +--------------------------------------+--------------------------------------+
 | \_0M1                                | The word for "thousand"              |
 +--------------------------------------+--------------------------------------+
 | \_0M2                                | The word for "million"               |
 +--------------------------------------+--------------------------------------+
 | \_0M3                                | The word for 1000000000              |
 +--------------------------------------+--------------------------------------+
 | \_1M1  \_2M1                         | Special pronunciation for one        |
 |                                      | thousand, two thousand, etc, if      |
 |                                      | needed                               |
 +--------------------------------------+--------------------------------------+
 | \_0and                               | Word for "and" when speaking numbers |
 |                                      | (eg. "two hundred and twenty").      |
 +--------------------------------------+--------------------------------------+
 | \_dpt                                | Word spoken for the decimal          |
 |                                      | point/comma                          |
 +--------------------------------------+--------------------------------------+
 | \_dpt2                               | Word spoken (if any) at the end of   |
 |                                      | all the digits after a decimal       |
 |                                      | point.                               |
 +--------------------------------------+--------------------------------------+
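
 To show how the fragments in the table might combine, here is a heavily
 simplified sketch for the numbers 0 to 999. It is not the
 TranslateNumber() algorithm: the real function is driven by
 `langopts.numbers` and handles word order, larger numbers and many
 language-specific cases. The function `number_fragments` and its fixed
 English-like ordering are assumptions made for illustration.

```python
def number_fragments(n, entries, use_and=True):
    """Return the _list keys to speak for 0 <= n <= 999.

    entries: set of keys that are defined in the _list file.
    """
    if not 0 <= n <= 999:
        raise ValueError("this sketch only handles 0..999")
    if n == 100 and "_1C0" in entries:
        return ["_1C0"]                       # special form for exactly 100
    keys = []
    hundreds, rest = divmod(n, 100)
    if hundreds:
        if f"_{hundreds}C" in entries:
            keys.append(f"_{hundreds}C")      # special "N hundred" form
        else:
            keys += [f"_{hundreds}", "_0C"]   # digit + word for "hundred"
        if rest and use_and and "_0and" in entries:
            keys.append("_0and")
    if rest or not hundreds:
        if rest < 10 or f"_{rest}" in entries:
            keys.append(f"_{rest}")           # digit or specific number
        else:
            tens, units = divmod(rest, 10)
            keys.append(f"_{tens}X")
            if units:
                keys.append(f"_{units}")
    return keys
```

 Under these assumptions 220 is assembled from `_2`, `_0C`, `_0and` and
 `_2X`, matching the "two hundred and twenty" example in the table.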

 ### 4.7 Character Substitution {.western}

 Character substitutions can be specified by using a **.replace** section
 at the start of the **\_rules** file. Each line specifies either one or
 two alphabetic characters to be replaced by another one or two
 alphabetic characters. This substitution is done to a word before it is
 translated using the spelling-to-phoneme rules. Only the lower-case
 version of the characters needs to be specified. eg.

     .replace
       ô   ő   // (Hungarian) allow the use of o-circumflex instead of o-double-acute
       û   ű

       cx   ĉ   // (Esperanto) allow "cx" as an alternative to c-circumflex

       ﬁ   fi   // replace a single character ligature by two characters
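
 The substitution step can be sketched as a left-to-right scan over the
 word before it is passed to the spelling-to-phoneme rules. The code below
 assumes the word is already lower-case and that a two-character match is
 preferred over a one-character match; both points are assumptions of this
 sketch, and `apply_replacements` is a hypothetical name.

```python
def apply_replacements(word, table):
    """Replace each one- or two-character key found in table."""
    out = []
    i = 0
    while i < len(word):
        for size in (2, 1):                   # try the longer match first
            chunk = word[i:i + size]
            if len(chunk) == size and chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            out.append(word[i])               # no substitution applies
            i += 1
    return "".join(out)


# The substitutions from the example above.
REPLACE = {"ô": "ő", "û": "ű", "cx": "ĉ", "ﬁ": "fi"}
```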
--- a/docs/editor.md
+++ b/docs/editor.md
@@ -0,0 +1,46 @@
 ESPEAKEDIT PROGRAM {.western}
 ------------------

 The **espeakedit** program is used to prepare phoneme data for the
 eSpeak speech synthesizer.

 It has two main functions:

 -   -   

 ### Installation {.western}

 **espeakedit** needs the following packages:\
 (The package names mentioned here are those from the Ubuntu "Dapper"
 Linux distribution).

 -   -   -   

 In addition, a modified version of **praat**
 ([www.praat.org](www.praat.org)) is used to view and analyse WAV sound
 files. This needs the package **libmotif3** to run and **libmotif-dev**
 to compile.

 ### Quick Guide {.western}

 This will quickly illustrate the main features. Details of the interface
 and key commands are given in [editor\_if.html](editor_if.html)

 For more detailed information on analysing sound recordings and
 preparing phoneme definitions and keyframe data see
 [analyse.html](analyse.html) (to be written).

 #### Compiling Phoneme Data {.western}

 1.  2.  3.  4.  

 #### Keyframe Sequences {.western}

 1.  2.  3.  4.  5.  6.  7.  

 #### Text and Prosody Windows {.western}

 1.  2.  3.  4.  5.  6.  7.  8.  9.  

 The Prosody window can be used to experiment with different phoneme
 lengths and different intonation.
--- a/docs/editor_if.md
+++ b/docs/editor_if.md
@@ -0,0 +1,41 @@
 USER INTERFACE - FORMANT EDITOR {.western}
 -------------------------------

 ### Frame Sequence Display {.western}

 The eSpeak editor can display a number of frame sequences in tabbed
 windows. Each frame can contain a short-time frequency spectrum,
 covering the period of one cycle at the sound's pitch. Frames can also
 show:

 -   -   -   -   -   

 ### Text Tab {.western}

 Enter text in the top left text window. Click the **Translate** button
 to see the phonetic transcription in the text window below. Then click
 the **Speak** button to speak the text and show the results in the
 **Prosody** tab, if that is open.

 If changes are made in the **Prosody** tab, then clicking **Speak** will
 speak the modified prosody while **Translate** will revert to the
 default prosody settings for the text.

 To enter phonetic symbols (Kirschenbaum encoding) in the top left text
 window, enclose them within [[ ]].

 ### Spect Tab {.western}

 The "Spect" tab in the left panel of the eSpeak editor shows information
 about the currently selected frame and sequence.

 -   -   -   -   -   -   

 ### Key Commands {.western}

 -   -   -   -   -   

 USER INTERFACE - PROSODY EDITOR {.western style="margin-left: 1cm"}
 -------------------------------

 -   
--- a/docs/index.md
+++ b/docs/index.md
@@ -0,0 +1,52 @@
 # eSpeak NG - Documentation
 ======================

 ### [Usage](commands.md)

 ### [Languages](languages.md)

 ### [Voice Files](voices.md)

 Voice files specify a language and other characteristics of a voice.

 ### [Mbrola Voices](mbrola.md)

 eSpeak NG can be used as a front-end for Mbrola diphone voices.

 ### [Pronunciation Dictionary](dictionary.md)

 ### [Adding a Language](add_language.md)

 How to add or improve a language.

 ### [Phonemes](phonemes.md)

 The list of phoneme mnemonics for English, for use in the Pronunciation
 Dictionary.

 ### [Phoneme Tables](phontab.md)

 The tables of the phonemes used by each language, with their properties
 and sound production.

 ### [Intonation](intonation.md)

 Different intonation "tunes" may be defined for different languages for
 clauses which end in full-stop, comma, question-mark, and
 exclamation-mark.

 ### [eSpeak NG Library API](speak_lib.h)

 API definition and header file for a shared library version of eSpeak NG.

 ### [Markup tags](ssml.md)

 SSML (Speech Synthesis Markup Language) and HTML tags recognized by
 eSpeak NG.

 ### [The espeakedit program](editor.md)

 GUI software to edit vowel files and to compile the phoneme data for use
 by eSpeak NG. See also [Espeakedit user interface](editor_if.md). 


--- a/docs/intonation.md
+++ b/docs/intonation.md
@@ -0,0 +1,102 @@
 INTONATION {.western}
 ----------

 In eSpeak's standard intonation model, a "tune" is applied to each
 clause depending on its punctuation. Other intonation models may be used
 for some languages, such as tone languages.

 Named tunes are defined in the text file
 `phsource/intonation`{.western}. This file must be compiled for use by
 eSpeak with the espeakedit program, using the menu option
 `Compile -> Compile intonation data`{.western}.

 ### Clauses {.western}

 The tunes which are used for a language can be specified by using a
 `tunes`{.western} statement in a voice file in
 `espeak-data/voices`{.western}. eg:

 `tunes   s1  c1  q1  e1`{.western}

 Its parameters are four tune names which are used for clauses which end
 in:

 1.  full-stop
 2.  comma
 3.  question-mark
 4.  exclamation-mark
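
 As a rough illustration, the four positional names of a `tunes`{.western} statement can be paired with clause-ending punctuation. The sketch below assumes the order full-stop, comma, question-mark, exclamation-mark (the order in which the documentation index lists them); `parse_tunes` is a hypothetical helper and not part of eSpeak.

```python
# Assumed order of the four tune names in a `tunes` statement.
CLAUSE_ENDINGS = (".", ",", "?", "!")


def parse_tunes(line):
    """Map clause-ending punctuation to the tune names in a `tunes` line."""
    fields = line.split()
    if len(fields) != 5 or fields[0] != "tunes":
        raise ValueError("expected: tunes <name> <name> <name> <name>")
    return dict(zip(CLAUSE_ENDINGS, fields[1:]))


tune_for = parse_tunes("tunes   s1  c1  q1  e1")
print(tune_for["?"])  # q1
```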

 A clause consists of the following parts:

 -   -   -   -   

 ### Tune definitions {.western}

 Here is an example tune definition from the file
 `phsource/intonation`{.western}.

 ~~~~ {.western}
 tune s1
 prehead   46 57
 headenv   fall 16
 head       4 80 55 -8 -5
 headextend 0 63 38 13 0
 nucleus  fall 70 18 24 12
 nucleus0 fall 64 8
 endtune
 ~~~~

 It contains:

 **tune** \<tune name\> 
 :   Starts the definition of a tune. The `tune name`{.western} can
    be used in a `tunes`{.western} statement in voice files.
 **endtune** \<tune name\> 
 :   Ends the definition of a tune.
 **prehead** \<start pitch\> \<end pitch\> 
 :   Gives the pitch path for any series of unstressed syllables before
    the first stressed syllable.
 **headenv** \<envelope\> \<height\> 
 :   Gives the pitch envelope which is used for stressed syllables in the
    head (before the nucleus), including `onset`{.western} and
    `headlast`{.western} syllables if these are specified.
    `height`{.western} gives a pitch range for the envelope.
 **head** \<steps\> \<start pitch\> \<end pitch\> \<unstressed start\> \<unstressed end\> 
 :   `start pitch`{.western} and `end pitch`{.western} give a pitch
    path for the stressed syllables of the head. `steps`{.western} is
    the maximum number of stressed syllables for which this applies. If
    there are additional stressed syllables, then the
    `headextend`{.western} statement is used for them.
 :   `unstressed start`{.western} and `unstressed end`{.western} give
    a pitch path for unstressed syllables between two stressed
    syllables. Their values are relative to the pitch of the previous
    stressed syllable. Values are usually negative, meaning that the
    unstressed syllables have lower pitch than the previous stressed
    syllable.
 **headextend** \<percentage list\> 
 :   If the head contains more stressed syllables than is specified by
    `steps`{.western}, then `percentage list`{.western} is used. It
    contains up to 8 numbers which are used repeatedly for the
    additional stressed syllables. A value of 0 corresponds to the lower
    of the `start pitch`{.western} and `end pitch`{.western} values of the
    `head`{.western} statement. 100 corresponds to the higher value.
    Negative values and values greater than 100 are allowed.
 **nucleus** \<envelope\> \<top pitch\> \<bottom pitch\> \<tail start\> \<tail end\> 
 :   This gives the pitch envelope and pitch range of the last stressed
    syllable of the clause. `tail start`{.western} and
    `tail end`{.western} give a pitch path for the unstressed syllables
    which are after the last stressed syllable.
 **nucleus0** \<envelope\> \<top pitch\> \<bottom pitch\> 
 :   This is used instead of `nucleus`{.western} if there are no
    unstressed syllables after the last stressed syllable. In this case,
    the pitch changes of the nucleus and the tail are both included in
    the nucleus.
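
 The `headextend`{.western} percentages can be worked through numerically. The sketch below assumes a simple linear mapping between the lower and higher of the two `head`{.western} pitches, and that every number in the list takes part in the repetition; neither detail is confirmed by the text above, and `headextend_pitches` is a hypothetical name.

```python
def headextend_pitches(start_pitch, end_pitch, percentages, count):
    """Pitch for each additional stressed syllable, cycling through the list."""
    low, high = sorted((start_pitch, end_pitch))
    return [
        low + (high - low) * percentages[i % len(percentages)] / 100
        for i in range(count)
    ]


# head 4 80 55 ... and headextend 0 63 38 13 0, as in the s1 tune
print(headextend_pitches(80, 55, [0, 63, 38, 13, 0], 6))
# [55.0, 70.75, 64.5, 58.25, 55.0, 55.0]
```

 Values below 0 or above 100 simply extrapolate outside the range of the two `head`{.western} pitches.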

 The following attributes may also be included:

 **onset** \<pitch\> \<unstressed start\> \<unstressed end\> 
 :   This specifies the pitch for the first stressed syllable of the
    head. If the `onset`{.western} statement is present, then the
    `head`{.western} statement is used for the stressed syllables after
    the first.
 **headlast** \<pitch\> \<unstressed start\> \<unstressed end\> 
 :   This specifies the pitch for the last stressed syllable of the head
    (i.e. the stressed syllable before the nucleus).
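
 To make the file layout concrete, here is a minimal sketch of reading `tune`{.western} ... `endtune`{.western} blocks into a dictionary. It is only an illustration of the statement format described above, not the parser that espeakedit uses; `parse_tunes_file` is a hypothetical name.

```python
def parse_tunes_file(text):
    """Return {tune name: {statement: [arguments]}} for tune ... endtune blocks."""
    tunes = {}
    current = None
    for raw in text.splitlines():
        fields = raw.split()
        if not fields:
            continue
        keyword, args = fields[0], fields[1:]
        if keyword == "tune":
            current = tunes.setdefault(args[0], {})
        elif keyword == "endtune":
            current = None
        elif current is not None:
            # Numeric arguments become ints; envelope names stay as strings.
            current[keyword] = [int(a) if a.lstrip("-").isdigit() else a for a in args]
    return tunes


SAMPLE = """
tune s1
prehead   46 57
headenv   fall 16
head       4 80 55 -8 -5
headextend 0 63 38 13 0
nucleus  fall 70 18 24 12
nucleus0 fall 64 8
endtune
"""

s1 = parse_tunes_file(SAMPLE)["s1"]
print(s1["head"])     # [4, 80, 55, -8, -5]
print(s1["nucleus"])  # ['fall', 70, 18, 24, 12]
```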

--- a/docs/languages.md
+++ b/docs/languages.md
@@ -0,0 +1,125 @@
 3. LANGUAGES {.western}
 ------------

 **Languages**. The eSpeak speech synthesizer supports several languages;
 however, in many cases these are initial drafts and need more work to
 improve them. Assistance from native speakers is welcome for these, or
 other new languages. Please contact me if you want to help.

 eSpeak does text to speech synthesis for the following languages, some
 better than others. Afrikaans, Albanian, Armenian, Cantonese, Catalan,
 Croatian, Czech, Danish, Dutch, English, Esperanto, Finnish, French,
 German, Greek, Hindi, Hungarian, Icelandic, Indonesian, Italian,
 Kurdish, Latvian, Lojban, Macedonian, Mandarin, Norwegian, Polish,
 Portuguese, Romanian, Russian, Serbian, Slovak, Spanish, Swahili,
 Swedish, Tamil, Turkish, Vietnamese, Welsh.


 #### Help Needed {.western}

 Many of these are just experimental attempts at these languages,
 produced after a quick reading of the corresponding article on
 wikipedia.org. They will need work or advice from native speakers to
 improve them. Please contact me if you want to advise or assist with
 these or other languages.

 The sound of some phonemes may be poorly implemented, particularly [r]
 since I'm English and therefore unable to make a "proper" [r] sound.

 A major factor is the rhythm or cadence. An Italian speaker told me the
 Italian voice improved from "difficult to understand" to "good" by
 changing the relative length of stressed syllables. Identifying
 unstressed function words in the xx\_list file is also important to make
 the speech flow well. See [Adding or Improving a
 Language](add_language.html).

 #### Character sets {.western}

 Languages recognise text either as UTF-8 or alternatively in an 8-bit
 character set which is appropriate for that language. For example, for
 Polish this is Latin2, for Russian it is KOI8-R. This choice can be
 overridden by a line in the voices file to specify an ISO 8859 character
 set, eg. for Russian the line:

 ~~~~ {.western style="margin-bottom: 0.5cm"}
     charset 5
 ~~~~

 will mean that ISO 8859-5 is used as the 8-bit character set rather than
 KOI8-R.
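
 The difference between the two 8-bit character sets can be seen with Python's standard codecs. This is only a sketch of why the choice matters: `charset_codec` is a hypothetical helper that maps the number in a `charset`{.western} line to a Python codec name, and it says nothing about how eSpeak decodes text internally.

```python
def charset_codec(number):
    """Python codec name for the ISO 8859 part named by `charset <number>`."""
    return f"iso8859_{number}"


text = "привет"
koi8 = text.encode("koi8_r")
iso5 = text.encode(charset_codec(5))

print(koi8 != iso5)                           # True: the byte values differ
print(iso5.decode(charset_codec(5)) == text)  # True
print(koi8.decode(charset_codec(5)) == text)  # False: wrong charset gives garbage
```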

 In the case of a language which uses a non-Latin character set (eg.
 Greek or Russian) if the text contains a word with Latin characters then
 that particular word will be pronounced using English pronunciation
 rules and English phonemes. Speaking entirely English text using a Greek
 or Russian voice will sound OK, but each word is spoken separately so it
 won't flow properly.

 Sample texts in various languages can be found at
 [http://\<language\>.wikipedia.org](http://meta.wikimedia.org/wiki/List_of_Wikipedias)
 and [www.gutenberg.org](http://www.gutenberg.org/).

 ### 3.1 Voice Files {.western}

 A number of Voice files are provided in the
 `espeak-data/voices`{.western} directory. You can select one of these
 with the **-v \<voice filename\>** parameter to the espeak-ng command, eg:

 ~~~~ {.western style="margin-bottom: 0.5cm"}
   espeak-ng -vaf
 ~~~~

 to speak using the Afrikaans voice.

 Language voices generally start with the 2 letter [ISO 639-1
 code](http://en.wikipedia.org/wiki/ISO_639-1) for the language. If the
 language does not have an ISO 639-1 code, then the 3 letter [ISO 639-3
 code](http://www.sil.org/iso639-3/codes.asp) can be used.

 For details of the voice files see [Voices](voices.html).

 #### Default Voice {.western}

 ### 3.2 English Voices {.western}

 ### 3.3 Voice Variants {.western}

 To make alternative voices for a language, you can make additional voice
 files in `espeak-data/voices`{.western} which contain commands to change
 various voice and pronunciation attributes. See [voices.html](voices.html).

 Alternatively there are some preset voice variants which can be applied
 to any of the language voices, by appending `+`{.western} and a variant
 name. Their effects are defined by files in
 `espeak-data/voices/!v`{.western}.

 The variants are `+m1 +m2 +m3 +m4 +m5 +m6 +m7`{.western} for male
 voices, `+f1 +f2 +f3 +f4 +f5`{.western} for female voices, and
 `+croak +whisper`{.western} for other effects. For example:

 ~~~~ {.western style="margin-bottom: 0.5cm"}
   espeak-ng -ven+m3
 ~~~~

 The available voice variants can be listed with:

 ~~~~ {.western style="margin-bottom: 0.5cm"}
   espeak-ng --voices=variant
 ~~~~
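
 When driving the command from a script, the voice and variant can be combined before the call. The sketch below only builds the argument list and does not run anything; `espeak_command` is a hypothetical helper, and the variant set is just the one listed above (an installed copy may provide others).

```python
MALE = {f"m{i}" for i in range(1, 8)}      # +m1 .. +m7
FEMALE = {f"f{i}" for i in range(1, 6)}    # +f1 .. +f5
EFFECTS = {"croak", "whisper"}
VARIANTS = MALE | FEMALE | EFFECTS


def espeak_command(text, language, variant=None):
    """Build an argument list such as ['espeak-ng', '-ven+m3', text]."""
    voice = language
    if variant is not None:
        if variant not in VARIANTS:
            raise ValueError(f"unknown variant: {variant}")
        voice = f"{language}+{variant}"
    return ["espeak-ng", f"-v{voice}", text]


print(espeak_command("Hello", "en", "m3"))  # ['espeak-ng', '-ven+m3', 'Hello']
```

 The resulting list can be handed to `subprocess.run` on a system where espeak-ng is installed.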

 ### 3.4 Other Languages {.western}

 The eSpeak speech synthesizer does text to speech for the following
 additional languages.

 ### 3.5 Provisional Languages {.western}

 These languages are only initial naive implementations which have had
 little or no feedback and improvement from native speakers.

 ### 3.6 Mbrola Voices {.western}

 Some additional voices, whose names start with **mb-** (for example
 **mb-en1**) use eSpeak as a front-end to Mbrola diphone voices. eSpeak
 does the spelling-to-phoneme translation and intonation. See
 [mbrola.html](mbrola.html).
--- a/docs/mbrola.md
+++ b/docs/mbrola.md
@@ -0,0 +1,128 @@
 MBROLA VOICES {.western}
 -------------

 The Mbrola project is a collection of diphone voices for speech
 synthesis. They do not include any text-to-phoneme translation, so this
 must be done by another program. The Mbrola voices are cost-free but are
 not open source. They are available from the Mbrola website at:\

 [http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html](http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html)

 eSpeak can be used as a front-end to Mbrola. It provides the
 spelling-to-phoneme translation and intonation, which Mbrola then uses
 to generate speech sound.

 ### Voice Names {.western}

 To use a Mbrola voice, eSpeak needs information to translate from its
 own phonemes to the equivalent Mbrola phonemes. This has been set up for
 only some voices so far.

 The eSpeak voices which use Mbrola are named as:\
   **mb-**xxx

 where xxx is the name of a Mbrola voice (eg. **mb-en1** for the Mbrola
 "**en1**" English voice). These voice files are in eSpeak's directory
 `espeak-data/voices/mbrola`{.western}.

 The installation instructions below use the Mbrola voice "en1" as an
 example. You can use other mbrola voices for which there is an
 equivalent eSpeak voice in `espeak-data/voices/mbrola`{.western}.

 There are some additional eSpeak Mbrola voices which speak English text
 using a Mbrola voice for a different language. These contain the name of
 the Mbrola voice with a suffix **-en**. For example, the voice
 **mb-de4-en** will speak English text with a German accent by using the
 Mbrola **de4** voice.
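
 The naming convention can be summarised with a small sketch. `parse_mbrola_voice` is a hypothetical helper that only splits the name as described above; it does not check that a matching voice file exists.

```python
def parse_mbrola_voice(name):
    """Split an eSpeak Mbrola voice name into its parts."""
    if not name.startswith("mb-"):
        raise ValueError("Mbrola voice names start with 'mb-'")
    rest = name[len("mb-"):]
    en_suffix = rest.endswith("-en")
    if en_suffix:
        rest = rest[: -len("-en")]
    return {"mbrola_voice": rest, "en_suffix": en_suffix}


print(parse_mbrola_voice("mb-en1"))     # {'mbrola_voice': 'en1', 'en_suffix': False}
print(parse_mbrola_voice("mb-de4-en"))  # {'mbrola_voice': 'de4', 'en_suffix': True}
```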

 ### Windows Installation {.western}

 The SAPI5 version of eSpeak uses the mbrola.dll.

 1.  2.  3.  4.  

 ### Linux Installation {.western}

 From eSpeak version 1.44 onwards, eSpeak calls the mbrola program
 directly, rather than passing phoneme data to it using a pipe.

 1.  2.  3.  

 ### Mbrola Voice Files {.western}

 eSpeak's voice files for Mbrola voices are in directory
 `espeak-data/voices/mbrola`{.western}. They contain a line:\
   `mbrola <voice> <translation>`{.western} \
 eg.\
   `mbrola en1 en1_phtrans`{.western}

 -   -   

 They are binary files which are compiled, using espeakedit, from source
 files in `phsource/mbrola`{.western}, see below.

 ### Mbrola Phoneme Translation Data {.western}

 Mbrola phoneme translation files specify translations from eSpeak
 phoneme names to mbrola phoneme names. They are referenced from voice
 files.

 The source files are in `phsource/mbrola`{.western}. These are compiled
 using the `espeakedit`{.western} program
 (`Compile->Compile mbrola phonemes list`{.western}) to produce data
 files in `espeak-data/mbrola_ph`{.western} which are used by eSpeak.

 Each line in the mbrola phoneme translation file contains:

 `<control> <espeak ph1> <espeak ph2> <percent> <mbrola ph1> [<mbrola ph2>] `{.western}

 **\<control\>**

 -   -   -   -   

 **\<espeak ph1\>**\
 The eSpeak phoneme which is to be translated to an mbrola phoneme.

 **\<espeak ph2\>**\
 If this field is not `NULL`{.western}, then the match only occurs if
 this field matches the next phoneme. If control bit 1 is set, then the
 *previous* rather than the *next* phoneme is matched. This field may
 also have the following values:\
 `VWL`{.western}   matches any Vowel phoneme.

 **\<percent\>**\
 If this field is zero then only one mbrola phoneme is used. If this
 field is non-zero, then two mbrola phonemes are used, and this value
 gives the percentage length of the first mbrola phoneme.

 **\<mbrola ph1\>**\
 The mbrola phoneme to which the eSpeak phoneme is translated. This
 field may be `NULL`{.western}.

 **\<mbrola ph2\>**\
 The second mbrola phoneme. This field is only used if the \<percent\>
 field is not zero.

 The list is searched from start to finish, until a match is found.
 Therefore, a line with a more specific match condition should appear
 before a line which matches the same eSpeak phoneme but with a more
 general condition.
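
 The first-match search and the `<percent>`{.western} split can be sketched as follows. This is an illustration under stated assumptions, not eSpeak's implementation: control bits are counted from 0 (so "bit 1" has the value 2), only the one control bit described above is handled, the rules in `RULES` are invented for the example, and the caller supplies the vowel test.

```python
from collections import namedtuple

Rule = namedtuple("Rule", "control ph1 ph2 percent mbrola1 mbrola2")

MATCH_PREVIOUS = 0b10  # control bit 1, counting bits from 0 (an assumption)


def translate(rules, phoneme, prev_ph, next_ph, is_vowel):
    """Return [(mbrola phoneme, percent of length)] from the first matching rule."""
    for rule in rules:
        if rule.ph1 != phoneme:
            continue
        neighbour = prev_ph if rule.control & MATCH_PREVIOUS else next_ph
        if rule.ph2 == "VWL":
            if neighbour is None or not is_vowel(neighbour):
                continue
        elif rule.ph2 != "NULL" and rule.ph2 != neighbour:
            continue
        if rule.percent == 0:
            return [(rule.mbrola1, 100)]
        return [(rule.mbrola1, rule.percent), (rule.mbrola2, 100 - rule.percent)]
    return []


def is_vowel(ph):
    return ph in {"a", "aI", "I"}


# Hypothetical rules: the specific line comes before the general one.
RULES = [
    Rule(0, "t", "VWL", 0, "t_h", None),
    Rule(0, "t", "NULL", 0, "t", None),
    Rule(0, "aI", "NULL", 50, "a", "I"),
]

print(translate(RULES, "t", None, "aI", is_vowel))  # [('t_h', 100)]
print(translate(RULES, "t", "aI", None, is_vowel))  # [('t', 100)]
print(translate(RULES, "aI", "t", None, is_vowel))  # [('a', 50), ('I', 50)]
```

 Swapping the first two rules would make the general `t`{.western} line win every time, which is why the more specific line has to come first.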

 The file `dictsource/dict_phonemes`{.western} lists the eSpeak phonemes
 which are used for each language. Translations for all these should be
 given in the mbrola phoneme translation file. In addition, some phonemes
 which are referenced from phoneme files (eg.
 `phsource/ph_language, phsource/phonemes`{.western}) in lines such as:

 ~~~~ {.western}
   beforenotvowel   l/
   reduceto  a#  0
 ~~~~

 should also be included, even though they don't appear in
 `dictsource/dict_phonemes`{.western}.

 If the language's \*\_list or \*\_rules files include rules to speak
 words "as English", the mbrola phoneme translation file should include
 rules which translate English phonemes into near equivalents, so that
 they can be spoken by the mbrola voice.
--- a/docs/phonemes.md
+++ b/docs/phonemes.md
@@ -0,0 +1,283 @@
 PHONEMES {.western}
 --------

 In general a different set of phonemes can be defined for each language.

 In most cases different languages inherit the same basic set of
 consonants. They can add to these or modify them as needed.

 The phoneme mnemonics are based on the scheme by Kirshenbaum which
 represents International Phonetic Alphabet symbols using ASCII
 characters. See:
 [www.kirshenbaum.net/IPA/ascii-ipa.pdf](http://www.kirshenbaum.net/IPA/ascii-ipa.pdf).

 Phoneme mnemonics can be used directly in the text input to
 **espeak-ng**. They are enclosed within double square brackets. Spaces
 are used to separate words, and all stressed syllables must be marked
 explicitly. eg:\
 `[[D,Is Iz sVm f@n'EtIk t'Ekst 'InpUt]]`{.western}
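
 A script that prepares input may want to find the bracketed phoneme spans. The sketch below only illustrates the `[[ ]]`{.western} delimiter convention; it is not eSpeak's own parser, and `split_phoneme_input` is a hypothetical name.

```python
import re

PHONEME_SPAN = re.compile(r"\[\[(.*?)\]\]")


def split_phoneme_input(text):
    """Split text into ('text', ...) and ('phonemes', ...) segments."""
    segments = []
    pos = 0
    for match in PHONEME_SPAN.finditer(text):
        if match.start() > pos:
            segments.append(("text", text[pos:match.start()]))
        segments.append(("phonemes", match.group(1)))
        pos = match.end()
    if pos < len(text):
        segments.append(("text", text[pos:]))
    return segments


print(split_phoneme_input("Say [[f@n'EtIk]] now"))
# [('text', 'Say '), ('phonemes', "f@n'EtIk"), ('text', ' now')]
```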

 ### English Consonants {.western}

 `[p]`{.western}

 `[b]`{.western}

 `[t]`{.western}

 `[d]`{.western}

 `[tS]`{.western}

 **ch**urch

 `[dZ]`{.western}

 **j**udge

 `[k]`{.western}

 `[g]`{.western}

 `[f]`{.western}

 `[v]`{.western}

 `[T]`{.western}

 **th**in

 `[D]`{.western}

 **th**is

 `[s]`{.western}

 `[z]`{.western}

 `[S]`{.western}

 **sh**op

 `[Z]`{.western}

 plea**s**ure

 `[h]`{.western}

 `[m]`{.western}

 `[n]`{.western}

 `[N]`{.western}

 si**ng**

 `[l]`{.western}

 `[r]`{.western}

 **r**ed (Omitted if not immediately followed by a vowel).

 `[j]`{.western}

 **y**es

 `[w]`{.western}

 **Some Additional Consonants**

 \

 `[C]`{.western}

 German i**ch**

 `[x]`{.western}

 German bu**ch**

 `[l^]`{.western}

 Italian **gl**i

 `[n^]`{.western}

 Spanish **ñ**

 ### English Vowels {.western}

 These are the phonemes which are used by the English spelling-to-phoneme
 translations (en\_rules and en\_list). In some varieties of English
 different phonemes may have the same sound, but they are kept separate
 because they may differ in another variety.

 In rhotic accents, such as General American, the phonemes
 `[3:], [A@], [e@], [i@], [O@], [U@]`{.western} include the "r" sound.

 `[@]`{.western}

 alph**a**

 schwa

 `[3]`{.western}

 bett**er**

 rhotic schwa. In British English this is the same as `[@]`{.western},
 but it includes 'r' colouring in American and other rhotic accents. In
 these cases a separate `[r]`{.western} should not be included unless it
 is followed immediately by another vowel.

 `[3:]`{.western}

 n**ur**se

 `[@L]`{.western}

 simp**le**

 `[@2]`{.western}

 the

 Used only for "the".

 `[@5]`{.western}

 to

 Used only for "to".

 `[a]`{.western}

 tr**a**p

 `[aa]`{.western}

 b**a**th

 This is `[a]`{.western} in some accents, `[A:]`{.western} in others.

 `[a#]`{.western}

 **a**bout

 This may be `[@]`{.western} or may be a more open schwa.

 `[A:]`{.western}

 p**al**m

 `[A@]`{.western}

 st**ar**t

 `[E]`{.western}

 dr**e**ss

 `[e@]`{.western}

 squ**are**

 `[I]`{.western}

 k**i**t

 `[I2]`{.western}

 **i**ntend

 As `[I]`{.western}, but also indicates an unstressed syllable.

 `[i]`{.western}

 happ**y**

 An unstressed "i" sound at the end of a word.

 `[i:]`{.western}

 fl**ee**ce

 `[i@]`{.western}

 n**ear**

 `[0]`{.western}

 l**o**t

 `[V]`{.western}

 str**u**t

 `[u:]`{.western}

 g**oo**se

 `[U]`{.western}

 f**oo**t

 `[U@]`{.western}

 c**ure**

 `[O:]`{.western}

 th**ou**ght

 `[O@]`{.western}

 n**or**th

 `[o@]`{.western}

 f**or**ce

 `[aI]`{.western}

 pr**i**ce

 `[eI]`{.western}

 f**a**ce

 `[OI]`{.western}

 ch**oi**ce

 `[aU]`{.western}

 m**ou**th

 `[oU]`{.western}

 g**oa**t

 `[aI@]`{.western}

 sc**ie**nce

 `[aU@]`{.western}

 h**our**

 ### Some Additional Vowels {.western}

 Other languages will have their own vowel definitions, eg:

 +--------------------------------------+--------------------------------------+
 | `[e]`{.western}                      | German **eh**, French **é**          |
 +--------------------------------------+--------------------------------------+
 | `[o]`{.western}                      | German **oo**, French **o**          |
 +--------------------------------------+--------------------------------------+
 | `[y]`{.western}                      | German **ü**, French **u**           |
 +--------------------------------------+--------------------------------------+
 | `[Y]`{.western}                      | German **ö**, French **oe**          |
 +--------------------------------------+--------------------------------------+

 `[:]`{.western} can be used to lengthen a vowel, eg `[e:]`{.western}.
--- a/docs/phontab.md
+++ b/docs/phontab.md
@@ -0,0 +1,174 @@
 PHONEME TABLES {.western}
 --------------

 A phoneme table defines all the phonemes which are used by a language,
 together with their properties and the data for their production as
 sounds.

 Generally each language has its own phoneme table, although additional
 phoneme tables can be used for different voices within the language.
 These alternatives are referenced from Voice files.

 A phoneme table does not need to define all the phonemes used by a
 language. It can inherit the phonemes from a previously defined phoneme
 table. For example, a phoneme table may redefine (or add) some of the
 vowels that it uses, but inherit most of its consonants from a standard
 set.

 The source files for the phoneme data are in the "phsource" directory in
 the espeakedit download package. "Vowel files", which are referenced in
 FMT(), VowelStart(), and VowelEnding() instructions are made using the
 espeakedit program.

 ### Phoneme files {.western}

 The phoneme tables are defined in a master phoneme file, named
 **phonemes**. This starts with the **base** phoneme table followed by
 phoneme tables for other languages and voices. These inherit phonemes
 from the **base** table or previously defined tables.

 In addition to phoneme definitions, the phoneme file can contain the
 following:

 **include** \<filename\> 
 :   Includes the text of the specified file at this point. This allows
    different phoneme tables to be kept in different text files, for
    convenience. \<filename\> is a relative path. The included file can
    itself contain **include** statements.
 **phonemetable** \<name\> \<parent\> 
 :   Starts a new phoneme table, and ends the previous table.\
     \<name\> is the name of this phoneme table. This name is used in
    Voice files.\
     \<parent\> is the name of a previously defined phoneme table whose
    phoneme definitions are inherited by this one. The name **base**
    indicates the first (base) phoneme table.
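
 Inheritance between phoneme tables can be pictured as a lookup that walks from a table to its parent. The sketch below is only a model of that idea; the class, the table name `fr` and the placeholder definitions are hypothetical and do not reflect the compiled data format.

```python
class PhonemeTable:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent      # a previously defined PhonemeTable, or None
        self.phonemes = {}        # phoneme name -> definition

    def lookup(self, phoneme):
        """Find a definition here, else in the nearest ancestor that has one."""
        table = self
        while table is not None:
            if phoneme in table.phonemes:
                return table.name, table.phonemes[phoneme]
            table = table.parent
        raise KeyError(phoneme)


base = PhonemeTable("base")
base.phonemes.update({"s": "base s", "a": "base a"})

fr = PhonemeTable("fr", parent=base)   # like: phonemetable fr base
fr.phonemes["a"] = "French a"          # redefines a vowel

print(fr.lookup("a"))  # ('fr', 'French a')
print(fr.lookup("s"))  # ('base', 'base s')
```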

 ### Phoneme definitions {.western}

 Note: These new Phoneme definitions apply to eSpeak version 1.42.20 and
 later.

 A phoneme table contains a list of phoneme definitions. Each starts with
 the keyword **phoneme** and the phoneme name (this is the name used in
 the pronunciation rules in a language's \*\_rules and \*\_list files),
 and ends with the keyword **endphoneme**. For example:

 ~~~~ {.western}
  phoneme aI
    vowel
    starttype #a endtype #i
    length 230
    FMT(vowels/ai)
  endphoneme

  phoneme s
    vls alv frc sibilant
    voicingswitch z
    lengthmod 3
    Vowelin  f1=0  f2=1700 -300 300  f3=-100 80
    Vowelout f1=0  f2=1700 -300 250  f3=-100 80  rms=20

    IF nextPh(isPause) THEN
      WAV(ufric/s_)
    ELIF nextPh(p) OR nextPh(t) OR nextPh(k) THEN
      WAV(ufric/s!)
    ENDIF
    WAV(ufric/s)
  endphoneme
 ~~~~

 A phoneme definition contains both static properties and executed
 instructions. The instructions may contain conditional statements, so
 that the effect of the phoneme may be different depending on adjacent
 phonemes, whether the syllable is stressed, etc.

 The instructions of a phoneme are interpreted in two different phases.
 In the first phase, the instructions may change the phoneme and replace
 it by a different phoneme. In the second phase, instructions are used to
 produce the sound for the phoneme.

 The **import\_phoneme** statement can be used to copy a previously
 defined phoneme from a specified phoneme table. For example:

 ~~~~ {.western}
  phoneme t
    import_phoneme base/t[
  endphoneme 
 ~~~~

 means: `phoneme t`{.western} in this phoneme table is a copy of
 `phoneme t[`{.western} from phoneme table "base". A **length**
 instruction can be used after **import\_phoneme** to vary the length
 from the original.

 ### Phoneme Properties {.western}

 Within the phoneme definition the following lines may occur. (V)
 indicates only for vowels, (C) only for consonants.

 ### Phoneme Instructions {.western}

 Phoneme Instructions may be included within conditional statements.

 During the first phase of phoneme interpretation, an instruction which
 causes a change to a different phoneme will terminate the instructions.
 During the second phase, FMT() and WAV() instructions will terminate the
 instructions.
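
 The terminating behaviour of `WAV()`{.western} can be modelled with early returns. The sketch below mirrors the `IF`{.western}/`ELIF`{.western} chain of the `phoneme s`{.western} example from the Phoneme definitions section; `sound_for_s` and its arguments are hypothetical, and the real interpreter works on compiled instructions rather than Python.

```python
def sound_for_s(next_phoneme, next_is_pause=False):
    """Mirror the IF/ELIF chain of `phoneme s`; the first WAV() reached wins."""
    if next_is_pause:
        return "ufric/s_"
    elif next_phoneme in ("p", "t", "k"):
        return "ufric/s!"
    # No condition matched, so execution falls through to the final WAV().
    return "ufric/s"


print(sound_for_s(None, next_is_pause=True))  # ufric/s_
print(sound_for_s("t"))                       # ufric/s!
print(sound_for_s("a"))                       # ufric/s
```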

 ### Conditional Statements {.western}

 Phoneme definitions can contain conditional statements such as:

 ~~~~ {.western}
  IF <condition> THEN
    <statements>
  ENDIF
 ~~~~

 or more generally:

 ~~~~ {.western}
  IF <condition> THEN
    <statements>
  ELIF <condition> THEN
    <statements>
  ...
  ELSE
    <statements>
  ENDIF
 ~~~~

 where the `ELSE`{.western} and multiple `ELIF`{.western} parts are
 optional.

 Multiple conditions may be joined with `AND`{.western} or
 `OR`{.western}, but not a mixture of `AND`{.western}s and
 `OR`{.western}s.

 A condition may be preceded by `NOT`{.western}. For example:

 ~~~~ {.western}
  IF <condition> AND NOT <condition> THEN
    <statements>
  ENDIF
 ~~~~
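
 Because `AND`{.western} and `OR`{.western} cannot be mixed, a condition line reduces to a single "all" or "any" test over its terms, each optionally negated. The sketch below models that; `evaluate` is a hypothetical helper, not part of the phoneme compiler.

```python
def evaluate(terms, joiner="AND"):
    """terms: list of (negated, value) pairs joined by a single AND or OR."""
    if joiner not in ("AND", "OR"):
        raise ValueError("conditions are joined by AND or by OR, not a mixture")
    values = [(not value) if negated else value for negated, value in terms]
    return all(values) if joiner == "AND" else any(values)


# IF <condition> AND NOT <condition> THEN, with the first true and the second false
print(evaluate([(False, True), (True, False)], "AND"))  # True
```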

 **Condition** Can be:

 **Attributes**

 ### Sound Specifications {.western}

 There are three ways to produce sounds:

 -   -   -   

 ### Vowel Transitions {.western}

 These specify how a consonant affects an adjacent vowel. A consonant may
 cause a transition in the vowel's formants as the mouth changes shape
 between the consonant and the vowel. The following attributes may be
 specified. Note that the maximum rate of change of formant frequencies
 is limited by the speak program.


--- a/docs/ssml.md
+++ b/docs/ssml.md
@@ -0,0 +1,64 @@
 TEXT MARKUP {.western}
 -----------

 ### SSML: Speech Synthesis Markup Language {.western}

 The following markup tags and attributes are recognised:

 **\<speak\>**

 -   -   

 **\<voice\>**

 -   -   -   -   -   

 **\<prosody\>**

 -   -   -   -   

 **\<say-as\>**

 -   -   -   -   -   

 **\<mark\>** name

 **\<s\>**

 -   

 **\<p\>**

 -   

 **\<sub\>** alias

 **\<tts:style\>**

 -   -   

 **\<audio\>** src

 **\<emphasis\>**

 -   

 **\<break\>**

 -   -   

 ### HTML {.western}

 eSpeak can speak HTML text directly, or text containing both SSML and
 HTML markup.\
 Any unrecognised tags are ignored.

 The following tags cause a sentence break.\
 **\<br\>   \<dd\>   \<li\>   \<img\>   \<td\>  **

 The following tags cause a paragraph break.\
 **\<h1\>   \<h2\>   \<h3\>   \<h4\>   \<hr\>  **

 Text between the following tags is ignored.\
 **\<script\>   ...   \</script\>  \
 \<style\>   ...   \</style\>  **
--- a/docs/voices.md
+++ b/docs/voices.md
@@ -0,0 +1,311 @@
 5. VOICES {.western}
 ---------

 ### 5.1 Voice Files {.western}

 A Voice file specifies a language (and possibly a language variant or
 dialect) together with various attributes that affect the
 characteristics of the voice quality and how the language is spoken.

 Voice files are placed in the `espeak-data/voices`{.western} directory,
 or within subdirectories in there.

 The available voice files can be listed by:

 ~~~~ {.western}
   espeak-ng --voices
 or
   espeak-ng --voices=<language>
 ~~~~

 also

 ~~~~ {.western style="margin-bottom: 0.5cm"}
   espeak-ng --voices=variant
 ~~~~

 Lists voice variants which can be applied to eSpeak voices.

 ~~~~ {.western style="margin-bottom: 0.5cm"}
   espeak-ng --voices=mbrola
 ~~~~

 Lists the Mbrola voices.

 ### 5.2 Contents of Voice Files {.western}

 The **language** attribute is mandatory. All the other attributes are
 optional.

 #### Identification Attributes {.western}

 **name  \<name\>**

 A name given to this voice.

 **language  \<language code\> [\<priority\>]**

 This attribute should appear before the other attributes which are
 listed below.

 It selects the default behaviour and characteristics for the language,
 and sets default values for "phonemes", "dictionary" and other
 attributes. The \<language code\> should be a two-letter ISO 639-1
 language code. One or more language variant codes may be appended,
 separated by hyphens. (eg. en-uk-north).

 The optional \<priority\> value gives the preference of this voice
 compared with others for the specified language. A low value indicates a
 more preferred voice. The default value is 5.

 More than one **language** line may be present. A voice may be selected
 for other related languages (variants which have the same initial 2
 letter language code as the specified language), but it will be less
 preferred for these. Different language variants may be specified by
 additional **language** lines in order to indicate that this is a
 preferred voice for them also. Eg.

 ~~~~ {.western}
   language en-uk-north
   language en
 ~~~~

 indicates that this voice is for the "en-uk-north" dialect, but it is
 also a main choice when a general "en" language is specified. Without
 the second **language** line, it would be disfavoured for "en" for being
 a more specialised voice.
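
 The `language`{.western} lines of a voice file can be collected with a few lines of code. This is a sketch only: `parse_language_lines` is a hypothetical helper, the voice name in the sample is invented, and the way eSpeak actually scores voices against a requested language is not modelled here.

```python
DEFAULT_PRIORITY = 5


def parse_language_lines(voice_file_text):
    """Return [(language code, priority)] for each `language` line."""
    languages = []
    for line in voice_file_text.splitlines():
        fields = line.split("//")[0].split()   # drop // comments
        if fields and fields[0] == "language":
            priority = int(fields[2]) if len(fields) > 2 else DEFAULT_PRIORITY
            languages.append((fields[1], priority))
    return languages


voice = """
name english-north
language en-uk-north
language en 2
"""
print(parse_language_lines(voice))  # [('en-uk-north', 5), ('en', 2)]
```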

 **gender  \<gender\> [\<age\>]**

 This attribute is only a label for use in voice selection. It doesn't
 change the sound of the voice.

 \<gender\> may be male, female, or unknown.\
 \<age\> is optional and gives an age in years.

 **pitch  \<base\> \<range\>**

 Two integer values. The first gives a base pitch to the voice (value in
 Hz). The second controls the range of pitches used by the voice. Setting
 it equal to the base pitch will give a monotone. The default values are
 82 118.

 **formant  \<number\> \<frequency\> \<strength\> \<width\>
 \<freq\_add\>**

 Systematically adjusts the frequency, strength, and width of the
 resonance peaks of the voice. Values are percentages of the default
 values. Changing these affects the tone/quality of the voice.

 **freq\_add** Adds a constant value (in Hz) to the frequency of the
 formant peak. The value may be negative.

 -   -   -   -   

 **echo  \<delay\> \<amplitude\>**

 Parameter 1 gives the delay in ms (0 to 250 ms).\
 Parameter 2 gives the echo amplitude (0 to 100).\
 Adding some echo can give a clearer or more interesting sound,
 especially when listening through a domestic stereo sound system, rather
 than small computer speakers.

 **tone**

 Controls the tone of the sound.\
 **tone** is followed by up to 4 pairs of \<frequency\> \<amplitude\>
 which define a frequency response graph. Frequency is in Hz and
 amplitude is in the range 0 to 255. The default is:

 `tone  600 170  1200 135  2000 110`{.western}

 This means that from frequency 0Hz to 600Hz the amplitude is 170. From
 600Hz to 1200Hz the amplitude decreases from 170 to 135, then decreases
 to 110 at 2000Hz and remains at 110 at higher frequencies. This
 adjustment applies only to voiced sounds such as vowels and sonorant
 consonants (such as [n] and [l]). Unvoiced sounds such as [s] are
 unaffected.
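
 The frequency response described by a **tone** statement can be evaluated point by point. The sketch below assumes straight-line interpolation between the listed points, which matches the description above but is not confirmed as the exact curve eSpeak uses; `tone_amplitude` is a hypothetical name.

```python
def tone_amplitude(freq_hz, points):
    """Amplitude at freq_hz for a list of (frequency, amplitude) pairs."""
    first_freq, first_amp = points[0]
    if freq_hz <= first_freq:
        return float(first_amp)          # flat from 0 Hz up to the first point
    for (f0, a0), (f1, a1) in zip(points, points[1:]):
        if freq_hz <= f1:
            return a0 + (a1 - a0) * (freq_hz - f0) / (f1 - f0)
    return float(points[-1][1])          # flat above the last point


DEFAULT_TONE = [(600, 170), (1200, 135), (2000, 110)]

print(tone_amplitude(300, DEFAULT_TONE))   # 170.0
print(tone_amplitude(900, DEFAULT_TONE))   # 152.5
print(tone_amplitude(1600, DEFAULT_TONE))  # 122.5
print(tone_amplitude(5000, DEFAULT_TONE))  # 110.0
```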

 This **tone** statement can also appear in
 `espeak-data/config`{.western}, in which case it applies to all voices
 which don't have their own **tone** statement.

 **flutter  \<value\>**

 Default value: 2.\
 Adds pitch fluctuations to give a wavering or older-sounding voice. A
 large value (eg. 20) makes the voice sound "croaky".

 **roughness  \<value\>**

 Default value: 2. Range 0 - 7\
 Reduces the amplitude of alternate waveform cycles in order to make the
 voice sound creaky.

 **voicing  \<value\>**

 Default value: 100.\
 Adjusts the strength of formant-synthesized sounds (vowels and sonorant
 consonants).

 **consonants  \<value\> \<value\>**

 Default values: 100, 100.\
 Adjusts the strength of noise sounds which are used in consonants. The
 first value is the strength of unvoiced consonants such as "s" and "t".
 The second value is the strength of the noise component of voiced
 consonants such as "z" and "d".

 **breath  \<up to 8 integer values\>**

 Default values: 0.\
 Adds noise which corresponds to the formant frequency peaks. The values
 give the strength of noise for each formant peak (formants 1 to 8).

 Use together with a low or zero value of the **voicing** attribute to
 make a "whisper". For example:

 `breath   75 75 60 40 15 10`{.western}\
 `breathw  150 150 200 200 400 400`{.western}\
 `voicing  18`{.western}\
 `flutter  20`{.western}\
 `formant  0 100 0 100   // remove formant 0`{.western}

 **breathw  \<up to 8 integer values\>**

 These values give bandwidths of the noise peaks of the **breath**
 attribute. If **breathw** values are not given, then suitable default
 values will be used.

 **speed  \<value\>**

 Default value: 100.\
 Adjusts the speaking speed by a percentage of the default rate. This
 can be used if a language voice seems faster or slower than other
 voices.
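
 For example, a voice which seems too fast could be slowed slightly. The
 value here is only illustrative:

 ~~~~ {.western}
 speed  90    // speak at 90% of the default rate
 ~~~~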

 **phonemes  \<name\>**

 Specifies which set of phonemes to use from those contained in the
 phontab, phonindex, and phondata data files. This is a **phonemetable**
 name as given in the "phoneme" source file.

 This parameter is usually not needed as it is set by default to the
 first two letters of the "language" parameter. However, different voices
 of the same language can use different phoneme sets, to give different
 accents.
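
 A minimal sketch of a voice file which selects its own phoneme set is
 shown below. Both the voice name and the phonemetable name are
 hypothetical, and the phonemetable would have to exist in the compiled
 phoneme data:

 ~~~~ {.western}
 name      en-accent-sketch    // hypothetical voice name
 language  en
 phonemes  en-accent           // hypothetical phonemetable name, replaces the default "en"
 ~~~~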

 **dictionary  \<name\>**

 Specifies which pair of dictionary files to use, eg. "en" indicates
 that *espeak-data/en\_dict* should be used to translate from words to
 phonemes. This parameter is usually not needed as it is set by default
 to the first two letters of the "language" parameter.

 **dictrules  \<list of rule numbers\>**

 Gives a list of conditional dictionary rules which are applied for this
 voice. Rule numbers are in the range 0 to 31 and are specific to a
 language dictionary. They apply to rules in the language's **\_rules**
 dictionary file and also its **\_list** exceptions list. See
 [dictionary.html](dictionary.html).
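
 For example, the following line would enable two conditional rules. The
 rule numbers are illustrative only, since their meaning depends on the
 language's dictionary files:

 ~~~~ {.western}
 dictrules  3 6    // apply conditional rules 3 and 6 for this voice
 ~~~~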

 **replace  \<flags\> \<phoneme\> \<replacement phoneme\>**

 Replace a phoneme by another whenever it occurs.

 \<replacement phoneme\> may be NULL.

 Flags: bit 0: replacement only occurs on the final phoneme of a word.\
 Flags: bit 1: replacement doesn't occur in stressed syllables.\
 eg.

 ~~~~ {.western}
      replace  0  h  NULL      // drops h's
      replace  0  V  U         // replaces vowel in 'strut' by that in 'foot'
                               // as occurs in northern British English
      replace  3  N  n         // change 'fishing' to 'fishin' etc.
                               // (only the last phoneme of a word, only in unstressed syllables)
 ~~~~

 The phoneme mnemonics can be defined for each language, but some are
 listed in [phonemes.html](phonemes.html).

 **stressLength  \<8 integer values\>**

 Eight integer parameters. These control the relative lengths of the
 vowels in stressed and unstressed syllables.

 **stressAdd  \<8 integer values\>**

 Eight integer parameters. These are added to the voice's corresponding
 stressLength values. They are used in the voice variant files in
 `espeak-data/voices/!v`{.western} to give some variety. Negative values
 may be used.
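
 As a sketch of how the two attributes combine, suppose a language voice
 file and a variant file contain the lines below. All values are chosen
 only for illustration:

 ~~~~ {.western}
 stressLength  150 140 180 180 0 0 200 200    // in the language voice file
 stressAdd     10 10 0 0 0 0 -20 -20          // in the variant file
 ~~~~

 With these values the resulting lengths would be 160 150 180 180 0 0
 180 180, because each stressAdd value is added to the stressLength
 value in the same position.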

 **stressAmp  \<8 integer values\>**

 Eight integer parameters. These control the relative amplitudes of the
 vowels in stressed and unstressed syllables (see stressLength above).
 The general default values are: 16, 16, 20, 20, 20, 24, 24, 22, although
 these defaults may be different for particular languages.

 **intonation  \<param1\>**

 **charset  \<param1\>**

 The ISO 8859 character set number (not all are implemented).

 **dictmin  \<value\>**

 Used for some languages to detect if additional language data is
 installed. If the size of the compiled dictionary data for the language
 (the file `espeak-data/*_dict`{.western}) is less than this size then a
 warning is given.
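
 For example (the threshold is an illustrative value, not one taken from
 a supplied voice file):

 ~~~~ {.western}
 dictmin  20000    // warn if the compiled *_dict file is smaller than this size
 ~~~~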

 **alphabet2  \<alphabet\> \<language\>**

 Used to specify a language to be used to speak words which are written
 in a non-native alphabet. eg:

 ~~~~ {.western style="margin-bottom: 0.5cm"}
 alphabet2 cyr ru
 ~~~~

 Alphabet names include: latin, cyr (Cyrillic), ar (Arabic). The default
 language for the Latin alphabet is English.

 **dictdialect  \<dialect\>**

 Words can be marked in the \*\_list or \*\_rules file to be spoken using
 a foreign voice. This **dictdialect** attribute can be used to specify
 which dialect of the foreign language should be used, instead of the
 default dialect. The currently available dialects are:\
 **en-us** (US English)\
 **es-la** (Latin American Spanish).\
 eg.

 ~~~~ {.western style="margin-bottom: 0.5cm"}
 dictdialect en-us
 ~~~~

 This means that any words or rules which are marked with \_\^\_EN will
 be spoken with the US English voice instead of the default UK English
 voice.

 Additional attributes are available to set various internal options
 which control how language is processed. These would normally be set in
 the program code rather than in a voice file.

 A number of voice files are provided in the
 `espeak-data/voices`{.western} directory. You can select one of these
 with the **-v \<voice filename\>** parameter to the speak command.

 **default**

 This voice is used if none is specified in the speak command. You can
 copy your preferred voice to "default" so you can use the speak command
 without the need to specify a voice.

 For a list of voices provided for English and other languages see
 [Languages](languages.html).