<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>espeakedit: Mbrola Voices</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body>
<A href="docindex.html">Back</A>
<hr>
<h2>MBROLA VOICES</h2>
<hr>
The Mbrola project is a collection of diphone voices for speech synthesis. The voices do not include any text-to-phoneme translation, so this must be done by another program. The Mbrola voices are free of charge but are not open source. They are available from the Mbrola website at:<br>
<a href="http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html">http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html</a>
<p>
eSpeak can be used as a front-end to Mbrola. It provides the spelling-to-phoneme translation and intonation, which Mbrola then uses to generate the speech sound.
<h3>Voice Names</h3>
To use an Mbrola voice, eSpeak needs information to translate from its own phonemes to the equivalent Mbrola phonemes. This has been set up for only some voices so far.
<p>
The eSpeak voices which use Mbrola are named:<br>
&nbsp; <b>mb-</b>xxx
<p>
where xxx is the name of an Mbrola voice (e.g. <b>mb-en1</b> for the Mbrola "<b>en1</b>" English voice). These voice files are in eSpeak's directory <code>espeak-data/voices/mbrola</code>.
<p>
The installation instructions below use the Mbrola voice "en1" as an example. You can use other Mbrola voices for which there is an equivalent eSpeak voice in <code>espeak-data/voices/mbrola</code>.
<p>
There are some additional eSpeak Mbrola voices which speak English text using an Mbrola voice for a different language. Their names consist of the Mbrola voice name followed by the suffix <b>-en</b>. For example, the voice <b>mb-de4-en</b> will speak English text with a German accent by using the Mbrola <b>de4</b> voice.
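<p>
The naming scheme above can be sketched in a few lines of Python. This is only an illustration: <code>parse_mbrola_voice</code> is a hypothetical helper, not part of eSpeak, and it assumes that <b>-en</b> is the only language suffix in use, since that is the only one described here.
<pre>
```python
def parse_mbrola_voice(espeak_voice):
    """Split an eSpeak Mbrola voice name into (mbrola_voice, text_language).

    text_language is None when the name carries no language suffix.
    Hypothetical helper for illustration; not part of eSpeak.
    """
    prefix = "mb-"
    if not espeak_voice.startswith(prefix):
        raise ValueError("not an eSpeak Mbrola voice name: " + espeak_voice)
    rest = espeak_voice[len(prefix):]
    if not rest:
        raise ValueError("missing Mbrola voice name after 'mb-'")
    # Assumption: '-en' is the only suffix, as described in the text above.
    suffix = "-en"
    if rest.endswith(suffix):
        return rest[:-len(suffix)], "en"
    return rest, None


print(parse_mbrola_voice("mb-en1"))
print(parse_mbrola_voice("mb-de4-en"))
```
</pre>
The first call reports the Mbrola voice <b>en1</b> with no language suffix; the second reports the Mbrola voice <b>de4</b> speaking English text.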
<h3>Windows Installation</h3>
The SAPI5 version of eSpeak uses mbrola.dll.
<ol>
<li>Install eSpeak. Include the voice <b>mb-en1</b> in the
list of voices during the eSpeak installation.
<p>
<li>Install the PC/Windows version of Mbrola (MbrolaTools35.exe) from:
<a href="http://www.tcts.fpms.ac.be/synthesis/mbrola/bin/pcwin/MbrolaTools35.exe">http://www.tcts.fpms.ac.be/synthesis/mbrola/bin/pcwin/MbrolaTools35.exe</a>.
<p>
<li>Get the <b>en1</b> voice from:
<a href="http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html">http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html</a>.
Unpack the archive, and copy the "<b>en1</b>" data file (not the whole "en1"
directory) into
<code>C:/Program Files/eSpeak/espeak-data/mbrola</code>.
<p>
<li>Use the voice <b>espeak-MB-EN1</b> from the list of SAPI5 voices.
</ol>
<h3>Linux Installation</h3>
I don't think there's a Linux shared library version of Mbrola (equivalent to mbrola.dll), so eSpeak has to pipe phoneme data to the command-line Mbrola program.
<ol>
<li>To install the Linux Mbrola binary, download:
<a href="http://www.tcts.fpms.ac.be/synthesis/mbrola/bin/pclinux/mbr301h.zip">http://www.tcts.fpms.ac.be/synthesis/mbrola/bin/pclinux/mbr301h.zip</a>.
Unpack the archive, then copy the file <code>mbrola-linux-i386</code> to somewhere in your executable path, renaming it to <code>mbrola</code> (e.g. <code>/usr/bin/mbrola</code>).
<p>
<li>Get the <b>en1</b> voice from:
<a href="http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html">http://www.tcts.fpms.ac.be/synthesis/mbrola/mbrcopybin.html</a>.
Unpack the archive, and copy the "<b>en1</b>" data file (not the whole "en1"
directory) somewhere convenient (e.g. <code>/usr/share/mbrola/en1</code>).
<p>
<li>If you use the eSpeak voice "<b>mb-en1</b>", eSpeak will write
Mbrola phoneme data to its stdout. You can pipe this into Mbrola:
<p>
<code>espeak -v mb-en1 -f textfile | mbrola -e /usr/share/mbrola/en1 - test.wav</code>
<p>
This puts the Mbrola speech output into the WAV file <code>test.wav</code>. Alternatively, you can pipe the output from Mbrola through aplay:
<p>
<code>espeak -v mb-en1 -f textfile | mbrola -e /usr/share/mbrola/en1 - - | aplay -r16000 -fS16</code>
<p>
The <code>-e</code> option prevents Mbrola from stopping if it finds a combination
of phonemes which it doesn't recognise.
<p>
Some Mbrola voices (de5, de6) use a sample rate of 22050 Hz. For these, pass <code>-r22050</code> to aplay rather than <code>-r16000</code>.
</ol>
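<p>
The playback pipeline from the last step, including the choice of sample rate, could be assembled by a small script along these lines. This is a sketch under assumptions: <code>build_pipeline</code> and <code>RATE_22050_VOICES</code> are hypothetical names, the voice data directory defaults to the <code>/usr/share/mbrola</code> example used above, and every voice other than de5 and de6 is assumed to use 16000 Hz. It only builds the command line as a string; it does not run espeak, mbrola or aplay.
<pre>
```python
import shlex

# Voices listed above as using 22050 Hz. Assumption: all others use 16000 Hz.
RATE_22050_VOICES = {"de5", "de6"}


def build_pipeline(mbrola_voice, text_file, voice_dir="/usr/share/mbrola"):
    """Return the espeak | mbrola | aplay command line as a single string.

    Hypothetical helper for illustration; it uses only the options that
    appear in the commands shown in this document.
    """
    rate = 22050 if mbrola_voice in RATE_22050_VOICES else 16000
    stages = [
        # eSpeak writes Mbrola phoneme data to stdout.
        ["espeak", "-v", "mb-" + mbrola_voice, "-f", text_file],
        # Mbrola reads phonemes from stdin ("-") and writes audio to stdout ("-").
        ["mbrola", "-e", voice_dir + "/" + mbrola_voice, "-", "-"],
        # aplay plays the 16-bit samples at the voice's sample rate.
        ["aplay", "-r" + str(rate), "-fS16"],
    ]
    return " | ".join(shlex.join(stage) for stage in stages)


print(build_pipeline("en1", "textfile"))
print(build_pipeline("de5", "textfile"))
```
</pre>
For <b>en1</b> this reproduces the aplay command shown above; for <b>de5</b> the only differences are the voice name and <code>-r22050</code>. <code>shlex.join</code> needs Python 3.8 or later and quotes any argument, such as a file name with spaces, that the shell would otherwise split.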
<h3>Mbrola Voice Files</h3>
eSpeak's voice files for Mbrola voices are in the directory <code>espeak-data/voices/mbrola</code>. Each contains a line:<br>
&nbsp; <code>mbrola &lt;voice&gt; &lt;translation&gt;</code>
<br>
e.g.<br>
&nbsp; <code>mbrola en1 en1_phtrans</code>
<ul>
<li><b>&lt;voice&gt;</b> is the name of the Mbrola voice.
<p>
<li><b>&lt;translation&gt;</b> is a translation file to convert between eSpeak phonemes and the equivalent Mbrola phonemes. These are kept in:
<code>espeak-data/mbrola_ph</code>
</ul>
The translation files are binary files which are compiled, using espeakedit, from source files in <code>phsource/mbrola</code>. Details to be defined.
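<p>
As an illustration of the <code>mbrola &lt;voice&gt; &lt;translation&gt;</code> line, the sketch below picks that line out of a voice file's text and derives where the translation file would be found. <code>find_mbrola_line</code> is a hypothetical helper, not eSpeak's own parser, and the <code>name</code> line in the sample text is an assumption added only to show that other lines are skipped.
<pre>
```python
def find_mbrola_line(voice_file_text):
    """Return (voice, translation) from the first 'mbrola' line, or None.

    Hypothetical helper for illustration; not eSpeak's own parser.
    """
    for line in voice_file_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "mbrola":
            return fields[1], fields[2]
    return None


# Sample text: the 'mbrola' line is the example from this page; the 'name'
# line is an assumption included only to show that other lines are ignored.
sample = "name mb-en1\nmbrola en1 en1_phtrans\n"

voice, translation = find_mbrola_line(sample)
translation_path = "espeak-data/mbrola_ph/" + translation
print(voice, translation_path)
```
</pre>
With the sample text this yields the voice <b>en1</b> and the path <code>espeak-data/mbrola_ph/en1_phtrans</code>.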
</body>
</html>