9 years ago · 91779563dd
--- a/docs/add_language.md
+++ b/docs/add_language.md
 # Adding or Improving a Language
 - [Language Code](#language-code)
 - [Language Files](#language-files)
  - [Language](#language)
  - [Accent](#accent)
  - [Language Family](#language-family)
 - [Language Files](#language-files)
 - [Voice File](#voice-file)
 - [Phoneme Definition File](#phoneme-definition-file)
 - [Dictionary Files](#dictionary-files)
 list of valid tags originates from various standards and has been combined
 into the
 [IANA Language Subtag Registry](http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry).
 Additional private-use tags for other accents and dialects are defined in the
 [bcp47-extensions](https://raw.githubusercontent.com/espeak-ng/bcp47-data/master/bcp47-extensions)
 file of the [bcp47-data](https://github.com/rhdunn/bcp47-data) project.
 ### Language
 *  `de` (German) -- The [ISO 639-1](https://en.wikipedia.org/wiki/ISO_639-1)
   2-letter language code for the language.
   __NOTE:__ BCP 47 uses ISO 639-1 codes for languages that are allocated
   2-letter codes (e.g. using `en` instead of `eng`).
 *  `yue` (Cantonese) -- The [ISO 639-3](https://en.wikipedia.org/wiki/ISO_639-3)
   3-letter language code for the language.
 *  `ta-Arab` (Tamil written in the Arabic alphabet) -- The
   [ISO 15924](https://en.wikipedia.org/wiki/ISO_15924) 4-letter script code.
 __NOTE:__ The language tags listed in the IANA Language Subtag Registry should
 be used instead of those from the standards they were inherited from. For
 example, ISO 639-3 duplicates languages found in ISO 639-1, but BCP 47 always
 uses the ISO 639-1 form when available. That is, ISO 639-3 `eng` is never used
 for English in BCP 47.
 __NOTE:__ Where the script is the primary script for the language, the script
 tag should be omitted.
   __NOTE:__ Where the script is the primary script for the language, the script
   tag should be omitted.
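
The script-omission rule in the note above can be sketched as follows. This is only an illustration: `PRIMARY_SCRIPT` and `drop_primary_script` are hypothetical names, and the table is a small hand-written subset. The `Suppress-Script` field of the IANA Language Subtag Registry is where the primary script of a language is actually recorded.

```python
# Illustrative subset only (assumption); consult the registry's
# Suppress-Script field for the real data.
PRIMARY_SCRIPT = {"de": "Latn", "en": "Latn", "ta": "Taml"}


def drop_primary_script(tag):
    """Remove the script subtag when it is the language's primary script."""
    parts = tag.split("-")
    # A script subtag, when present, is the 4-letter subtag right after
    # the language subtag.
    if len(parts) >= 2 and len(parts[1]) == 4 and parts[1].isalpha():
        if PRIMARY_SCRIPT.get(parts[0].lower()) == parts[1].title():
            del parts[1]
    return "-".join(parts)
```

With this table, `ta-Taml` would be reduced to `ta`, while `ta-Arab` is left unchanged because Arabic is not the primary script for Tamil.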
 ### Accent
   language tags for accents that cannot be described using the available
   BCP 47 language tags.
 __NOTE:__ If the accent you are trying to describe cannot be specified using
 the above system, raise an issue in the
 [bcp47-data](https://github.com/rhdunn/bcp47-data) project and a private use
 tag will be defined for that accent.
   __NOTE:__ If the accent you are trying to describe cannot be specified using
   the above system, raise an issue in the
   [bcp47-data](https://github.com/rhdunn/bcp47-data) project and a private use
   tag will be defined for that accent.
 ### Language Family
 The following files are needed for your language.
  * `espeak-data/voices/fr`. The voice file. This gives the language name and
    may set some options.
  * `espeak-data/voices/roa/fr`. The voice file. This gives the language name
    and may set some options.
  * `phsource/ph_french`. The phoneme definition file. This contains phoneme
    definitions for the vowels and consonants which the language uses. Usually
    it will contain mostly vowels. Most consonants will be inherited from the
    attributes such as "unstressed" and "pause" to some common words.
 The `fr_rules` and `fr_list` files are compiled to produce the
 file `espeak-data/fr_dict`, which eSpeak uses when it is speaking.
 `espeak-data/fr_dict` file, which eSpeak uses when it is speaking.
 ## Voice File
 Each language needs a voice file in `espeak-data/voices` or
 `espeak-data/voices/test`. The filename of the default voice for a
 language should be the same as the language code (eg. "fr" for French).
 Each language needs a voice file in `espeak-data/voices` grouped by the
 [language family](#language-family). The filename of the default voice for a
 language should be the same as the language code (e.g. `fr` for French).
 Details of the contents of voice files are given in [Voices](voices.md).
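
As a rough sketch of the layout described above, the location of a voice file can be derived from the language family and the language code. The helper name `voice_file_path` and its `data_dir` parameter are assumptions made for this example; they are not part of eSpeak NG.

```python
from pathlib import PurePosixPath


def voice_file_path(family, language, data_dir="espeak-data"):
    """Build the path of a voice file grouped by its language family."""
    return PurePosixPath(data_dir) / "voices" / family / language


# French belongs to the Romance ("roa") language family.
print(voice_file_path("roa", "fr"))  # espeak-data/voices/roa/fr
```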
--- a/docs/voices.md
+++ b/docs/voices.md
 characteristics of the voice quality and how the language is spoken.
 Voice files are located in the `espeak-data/voices` directory, and are
 grouped by the language family of the language being specified in the
 voice files.
 grouped by the [ISO 639-5](https://en.wikipedia.org/wiki/ISO_639-5)
 language family of the language being specified in the voice files.
 See also Wiktionary's
 [List of language families](https://en.wiktionary.org/wiki/Wiktionary:List_of_families)
 for more details.
 The `default` voice is used if none is specified in the speak command. You
 can copy your preferred voice to "default" so you can use the speak command
 and sets default values for "phonemes", "dictionary" and other
 attributes.
 The \<language code\> is a
 [BCP 47](https://en.wikipedia.org/wiki/IETF_language_tag) language tag.
 When this is not enough to identify an accent, the
 [bcp47-data](https://github.com/rhdunn/bcp47-data) accents file describes
 the private use tags used by eSpeak NG. For example:
 The \<language code\> is a valid
 [BCP 47](https://en.wikipedia.org/wiki/IETF_language_tag) language tag. The
 list of valid tags originates from various standards and has been combined
 into the
 [IANA Language Subtag Registry](http://www.iana.org/assignments/language-subtag-registry/language-subtag-registry).
 For example:
 *  `de` (German) -- The [ISO 639-1](https://en.wikipedia.org/wiki/ISO_639-1)
   2-letter language code for the language.
   __NOTE:__ BCP 47 uses ISO 639-1 codes for languages that are allocated
   2-letter codes (e.g. using `en` instead of `eng`).
 *  `yue` (Cantonese) -- The [ISO 639-3](https://en.wikipedia.org/wiki/ISO_639-3)
   3-letter language code for the language.
 *  `ta-Arab` (Tamil written in the Arabic alphabet) -- The
   [ISO 15924](https://en.wikipedia.org/wiki/ISO_15924) 4-letter script code.
   __NOTE:__ Where the script is the primary script for the language, the script
   tag should be omitted.
 *  `es-419` (Spanish (Latin America)) -- The
   [UN M.49](https://en.wikipedia.org/wiki/UN_M.49) 3-digit region code.
 *  `fr-CA` (French (Canada)) -- Using the
   [ISO 3166-1](https://en.wikipedia.org/wiki/ISO_3166-1) 2-letter region code.
 *  `en-GB-scotland` (English (Scotland)) -- This is using the BCP 47 variant
   tags.
 *  `en-GB-x-rp` (English (Received Pronunciation)) -- This is using the
   [bcp47-extensions](https://raw.githubusercontent.com/espeak-ng/bcp47-data/master/bcp47-extensions)
   language tags for accents that cannot be described using the available
   BCP 47 language tags.
 *  `en` -- English
 *  `en-GB-scotland` -- English with a Scottish accent
 *  `en-GB-x-rp` -- English with a Received Pronunciation accent
 *  `es-419` -- Spanish with a Latin American accent
 *  `fr-CA` -- French with a Canadian accent
   __NOTE:__ If the accent you are trying to describe cannot be specified using
   the above system, raise an issue in the
   [bcp47-data](https://github.com/rhdunn/bcp47-data) project and a private use
   tag will be defined for that accent.
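
The examples above differ only in which kinds of subtag follow the language code. A minimal sketch of how such a tag could be split up by the shape of each subtag is given below. The function `split_tag` is hypothetical, and the rules are deliberately simplified: it does not validate subtags against the registry, and it does not handle extended language subtags or extensions other than private use.

```python
def split_tag(tag):
    """Classify the subtags of a BCP 47 tag by their shape (simplified)."""
    parts = tag.split("-")
    result = {
        "language": parts[0].lower(),
        "script": None,
        "region": None,
        "variants": [],
        "private_use": [],
    }
    rest = parts[1:]
    for i, part in enumerate(rest):
        if part.lower() == "x":
            # Everything after the "x" singleton is private use.
            result["private_use"] = rest[i + 1:]
            break
        if len(part) == 4 and part.isalpha():
            result["script"] = part.title()      # e.g. Arab
        elif (len(part) == 2 and part.isalpha()) or (
            len(part) == 3 and part.isdigit()
        ):
            result["region"] = part.upper()      # e.g. CA or 419
        else:
            result["variants"].append(part.lower())  # e.g. scotland
    return result
```

For instance, `split_tag("en-GB-x-rp")` would report `en` as the language, `GB` as the region and `rp` as a private use subtag.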
 The optional \<priority\> value gives the preference of this voice
 compared with others for the specified language. A low value indicates a
 more preferred voice. The default value is 5.
 additional `language` lines in order to indicate that this is a
 preferred voice for them also. E.g.
 	language en-uk-north
 	language en-GB-x-gbclan
 	language en
 indicates that this voice is for the "en-uk-north" dialect, but it is
 also a main choice when a general "en" language is specified. Without
 the second `language` line, it would be disfavoured for "en" for being
 indicates that this voice is for the `en-GB-x-gbclan` dialect, but it is
 also a main choice when a general `en` language is specified. Without
 the second `language` line, it would be disfavoured for `en` for being
 a more specialised voice.
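
To illustrate how the `language` lines and the optional priority interact, here is a simplified sketch. It only considers exact tag matches and picks the voice with the lowest priority value. The names `parse_language_lines` and `pick_voice` are hypothetical, and the actual voice selection in eSpeak NG is not described here and may weigh other attributes as well.

```python
DEFAULT_PRIORITY = 5  # default value stated in the text above


def parse_language_lines(text):
    """Return (tag, priority) pairs for each `language` line in a voice file."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "language":
            priority = int(fields[2]) if len(fields) >= 3 else DEFAULT_PRIORITY
            entries.append((fields[1], priority))
    return entries


def pick_voice(voices, requested):
    """Pick the voice listing `requested` with the lowest priority value.

    `voices` maps a voice name to the text of its voice file. Only exact
    tag matches are considered (a simplifying assumption).
    """
    best = None
    for name, text in voices.items():
        for tag, priority in parse_language_lines(text):
            if tag == requested and (best is None or priority < best[0]):
                best = (priority, name)
    return best[1] if best else None
```

A voice containing both `language en-GB-x-gbclan` and `language en` would then be a candidate for either tag, and would be preferred for `en` over another voice that declares `language en 7`.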
 ### gender