eSpeak NG is an open source speech synthesizer that supports more than hundred languages and accents.

add_language.html 8.2KB

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<HTML>
<HEAD>
<META HTTP-EQUIV="CONTENT-TYPE" CONTENT="text/html; charset=utf-8">
<TITLE>eSpeak: Adding a Language</TITLE>
</HEAD>
<BODY LANG="en-GB" DIR="LTR">
<A href="docindex.html">Back</A>
<HR>
<H2>6. ADDING OR IMPROVING A LANGUAGE</H2>
<HR>
Most of the work doesn't need any programming knowledge, just an understanding of the language, an
awareness of its features, patience and attention to detail. Wikipedia is a good source of basic phonetic information, eg.
<a href="http://en.wikipedia.org/wiki/Vowel">http://en.wikipedia.org/wiki/Vowel</a>
<P>
In many cases it should be fairly easy to add a rough implementation of a new language, hopefully
enough to be intelligible.<p>
After that it's a gradual process of improvement to:
<ul>
<li>Make the spelling-to-phoneme translation rules more accurate, including the position of stressed
syllables within words. Some languages are easier than others. I expect most are easier than English.
<p><li>Improve the sounds of the phonemes. A phoneme may need to sound different depending on adjacent sounds, or on whether it's at the start or the end of a word, between vowels, etc. This may consist of making small adjustments to vowel and diphthong quality,
or adjusting the strength of consonants. Bigger changes may involve recording new or replacement consonant
sounds, or even writing program code to implement new types of sounds.
<p><li>Some common words should be added to the dictionary (the *_list file for the language) with an "unstressed" attribute (eg. in English, words such as "the", "is", "had", "my", "she", "of", "in", "some"), or should be preceded
by a short pause (such as "and", "but", "which"), or have other attributes, in order to make the speech flow better.
<p><li>Improve the rhythm of the speech by adjusting the relative lengths of vowels in different contexts, eg. in stressed or unstressed syllables,
or depending on the following phonemes. This is important for making the speech sound good for the language.
<p><li>Identify or implement new functions in the program to improve the speech, or to deal with
characteristics of the language which are not currently implemented, for example a different intonation module.
</ul>
<b><em>If you are interested in working on a language, please contact me so that I can set up the initial data and discuss the features of the language.</em></b>
<p>
For most of the eSpeak voices, I do not speak or understand the language, and I do not know how it should sound. I can only make improvements as a result of feedback from speakers of that language. If you want to help to improve a language, listen carefully and try to identify individual errors, whether in the spelling-to-phoneme translation, the position of stressed syllables within words, the sound of phonemes, or the rhythm and vowel lengths.
<HR>
<H3>6.1 Language Code</H3>
<P>Generally, the language's international <a href="http://en.wikipedia.org/wiki/ISO_639-1">ISO 639-1 code</a> is used to
identify the language. It is used in the names of the files which
contain the language's data. In the examples below the code &quot;<B>en</B>&quot;
(English) is used. Replace this with the code of your
language.<p>
It is possible to have different variants of a language, for example where the sounds of some phonemes are changed,
or where some of the pronunciation rules differ.
<HR>
<H3>6.2 Phoneme File</H3>
<P>You must first decide on the set of phonemes to be used for the
language. These should be listed and defined in a phonemes file such as
<B>ph_english</B>. A reference to this file is then included at the end of
the <B>phonemes</B> file (the master phoneme file), eg:</P>
<PRE>phonemetable en base
include ph_english</PRE><P>
This example defines a phoneme table &quot;<B>en</B>&quot; which inherits
the contents of phoneme table &quot;<B>base</B>&quot;. Its contents are
found in the file <B>ph_english</B>.</P>
<P>The <B>base</B> phoneme table contains definitions of a basic set of
consonants, and also some &quot;control&quot; phonemes such as stress marks and
pauses. The phoneme table for a language will generally inherit this,
or alternatively it may inherit the phoneme table of another language
which in turn inherits the <B>base</B> phoneme table.</P>
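<P>As a rough sketch of the second case, a table for a hypothetical language code &quot;<B>xx</B>&quot; that inherits from the English table rather than directly from <B>base</B> might be declared as follows. The code <B>xx</B> and the file name <B>ph_xx</B> are placeholders for illustration, not files supplied with eSpeak:</P>
<PRE>phonemetable xx en
include ph_xx</PRE>
<P>Because the <B>en</B> table itself inherits <B>base</B>, the new table would start with the English phonemes as well as the basic consonants and control phonemes.</P>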
<P>The phonemes file for the language defines those additional
phonemes which are not inherited (generally the vowels and diphthongs, plus any additional
consonants), or phonemes whose definitions differ from the
inherited version (eg. the redefinition of a consonant).</P>
<P>Details of the contents of phonemes files are given in
<A href="phontab.html">phontab.html</A>.</P>
The <B>Compile phoneme data</B> function of the <B>espeakedit</B>
program compiles the phonemes files to produce the files
<B>espeak-data/phontab</B>, <B>phonindex</B>, and <B>phondata</B>.<P>
For information on how to analyse recorded sounds of the language and to
prepare the corresponding phoneme data, see <a href="editor_if.html">espeakedit</a> and <a href="analyse.html">analysis</a>.<p>
For an initial draft, a language will often be able to use vowels and
consonants which have already been set up for another language.
<HR>
<H3>6.3 Dictionary Files</H3>
<P STYLE="font-weight: medium">Once the language's phonemes have been
defined, pronunciation dictionary data can be produced in order
to translate the language's source text into phonemes. This consists
of two source files: <B>en_rules</B> (the spelling-to-phoneme rules) and
<B>en_list</B> (an exceptions list, and attributes of certain words). The corresponding compiled data
file is <B>espeak-data/en_dict</B>, which is produced from the <B>en_rules</B>
and <B>en_list</B> sources by the command: <B>speak&nbsp; --compile=en</B>.</P>
<P STYLE="font-weight: medium">Details of the contents of the
dictionary files are given in <A href="dictionary.html">dictionary.html</A>.</P>
<P STYLE="font-weight: medium">The <B>en_list</B> file contains not
only pronunciation exceptions, but also gives attributes to specific
words. Most notable of these are:</P>
<P STYLE="font-weight: medium"><B>$u </B>Some common words should be
marked as &quot;unstressed&quot; in order to make the speech flow better.
These words generally include articles (eg. a, the, this, that),
auxiliary verbs (eg. is, have, will, can, may), pronouns and
possessive adjectives (eg. he, his), some common prepositions (eg.
of, to, in), some common conjunctions (eg. and, or, if), and some
common adverbs and adjectives (eg. any, already).</P>
<P><B>$pause </B>Some words should be marked to have a short pause
before them, in order to produce natural pauses in long sentences.
These include conjunctions (eg. and, or, but, however, which) and perhaps
some prepositions.</P>
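<P STYLE="font-weight: medium">As an illustration only, entries using these attributes might look like the lines below. This assumes the layout described in <A href="dictionary.html">dictionary.html</A>, with the word first and its attributes after it, and that an entry may give attributes without a pronunciation. Check that file for the exact syntax before relying on it.</P>
<PRE>the     $u
and     $u $pause
but     $pause</PRE>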
<HR>
<H3>6.4 Voice File</H3>
<P STYLE="font-weight: medium">Each language should have one or more
voice files in <B>espeak-data/voices</B>. The filename of the default voice
for a language should be the same as the language code.</P>
<P STYLE="font-weight: medium">Details of the contents of voice files
are given in <A href="voices.html">voices.html</A>.</P>
<P STYLE="font-weight: medium">The simplest voice file would contain
just a single line to give the language code, eg:</P>
<PRE STYLE="margin-bottom: 0.5cm">language en</PRE><P STYLE="font-weight: medium">
This language code specifies the phoneme table (i.e. <b>phonemetable en</b>) and the
dictionary (i.e. <B>espeak-data/en_dict</B>) to be used. If needed, these can be
overridden by <B>phonemes</B> and <B>dictionary</B> attributes in the
voice file.</P>
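<P STYLE="font-weight: medium">For illustration, a voice file that writes these defaults out explicitly might look like the sketch below. It assumes that each attribute takes the name of the phoneme table or dictionary as its value, in the same way that <B>language</B> takes the language code. See <A href="voices.html">voices.html</A> for the definitive syntax.</P>
<PRE STYLE="margin-bottom: 0.5cm">language en
phonemes en
dictionary en</PRE>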
<HR>
<H3>6.5 Program Code</H3>
<P STYLE="font-weight: medium">The behaviour of the speak program is
controlled by various options (eg. whether words are stressed on the first,
last, or penultimate syllable). The function <B>SetTranslator()</B> at the
start of the <B>tr_languages.cpp</B> file recognizes the language
code and sets the appropriate set of options.</P>
<P STYLE="font-weight: medium">For a new language, you would add its
language code and the required options in <B>SetTranslator()</B>. However, this
may not be necessary during testing because most of the options can also be
set from the voice file in
<B>espeak-data/voices</B>.</P>
</BODY>
</HTML>