Incorporate the commands.html documentation in espeak-ng.1.ronn

master
Reece H. Dunn 9 years ago
parent
commit
e7ae2fde18
3 changed files with 33 additions and 232 deletions
  1. docs/commands.html: 0 additions, 227 deletions
  2. docs/index.md: 0 additions, 1 deletion
  3. src/espeak-ng.1.ronn: 33 additions, 4 deletions

+ 0
- 227
docs/commands.html

@@ -1,227 +0,0 @@
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>

<head>
<title>eSpeak Speech Synthesizer</title>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body>
<A href="index.html">Back</A>
<hr>
<h2>2.1 INSTALLATION</h2>
<hr>
<h3>2.1.1 Linux and other Posix systems</h3>
There are two versions of the command line program. They both have the same command parameters (see below).
<ol>
<li><strong>espeak-ng</strong> uses speech engine in the <strong>libespeak-ng</strong> shared library. The libespeak-ng library must first be installed.
<p>
<li><strong>speak-ng</strong> is a stand-alone version which includes its own copy of the speech engine.
</ol>
Place the <strong>espeak-ng</strong> or <strong>speak-ng</strong> executable file in the command path, eg in <strong>/usr/local/bin</strong>
<p>
Place the "<strong>espeak-data</strong>" directory in /usr/share as <strong>/usr/share/espeak-data</strong>.<br>
Alternatively if it is placed in the user's home directory (i.e. <strong>/home/&lt;user&gt;/espeak-data</strong>)
then that will be used instead.
<p>
<h4>Dependencies</h4>
<strong>espeak-ng</strong> uses the PortAudio sound library (version 18), so you will need to have the <strong>libportaudio0</strong> library package installed. It may be already, since it's used by other software, such as OpenOffice.org and the Audacity sound editor.<p>
Some Linux distrubitions (eg. SuSe 10) have version 19 of PortAudio which has a slightly different API. The speak program can be compiled to use version 19 of PortAudio by copying the file portaudio19.h to portaudio.h before compiling.<p>
The speak program may be compiled without using PortAudio, by removing the line<pre> #define USE_PORTAUDIO
</pre>in the file speech.h.
<p>&nbsp;<hr>

<h3>2.1.2 Windows</h3>
The installer: <strong>setup_espeak.exe</strong> installs the SAPI5 version of eSpeak.
During installation you need to specify which voices you want to appear in SAPI5 voice menus.
<p>
It also installs a command line program <strong>espeak-ng</strong> in the espeak-ng program directory.

<p>&nbsp;<hr>
<h2>2.2 COMMAND OPTIONS</h2>
<hr>
<h3>2.2.1 Examples</h3>
To use at the command line, type:<br>
&nbsp; <strong>espeak-ng "This is a test"</strong><br>
or<br>
&nbsp; <strong>espeak-ng -f &lt;text file&gt;</strong>
<p>
Or just type<br>
&nbsp; <strong>espeak-ng</strong><br>
followed by text on subsequent lines. Each line is spoken when
RETURN is pressed.
<p>
Use <strong>espeak-ng -x</strong> to see the corresponding phoneme codes.
<p>&nbsp;<hr>
<h3>2.2.2 The Command Line Options</h3>
<dl>
<dt>
<strong>espeak-ng [options] ["text words"]</strong><br>
<dd>Text input can be taken either from a file, from a string in the command, or from stdin.
<p>
<dt>
<strong>-f &lt;text file&gt;</strong><br>
<dd>Speaks a text file.
<p>
<dt>
<strong> --stdin</strong><br>
<dd>Takes the text input from stdin.
<p>
<dt>
If neither -f nor --stdin is given, then the text input is taken from "text words" (a text string within double quotes). <br>If that is not present then text is taken from stdin, but each line is treated as a separate sentence.
<p>
<dt>
<strong>-a &lt;integer&gt;</strong><br>
<dd>Sets amplitude (volume) in a range of 0 to 200. The default is 100.
<p>
<dt>
<strong>-p &lt;integer&gt;</strong><br>
<dd>Adjusts the pitch in a range of 0 to 99. The default is 50.
<p>
<dt>
<strong>-s &lt;integer&gt;</strong><br>
<dd>Sets the speed in words-per-minute (approximate values for the default English voice, others may differ slightly). The default value is 175. I generally use a faster speed
of 260. The lower limit is 80. There is no upper limit, but about 500 is probably a practical maximum.
<p>
<dt>
<strong>-b &lt;integer&gt;</strong><br>
<dd>Input text character format.<p>
1 &nbsp; UTF-8. This is the default.<p>
2 &nbsp; The 8-bit character set which corresponds to the language (eg. Latin-2 for Polish).<p>
4 &nbsp; 16 bit Unicode.<p>
Without this option, eSpeak assumes text is UTF-8, but will automatically switch to the 8-bit character set if it finds an illegal UTF-8 sequence.
<p>
<dt>
<strong>-g &lt;integer&gt;</strong><br>
<dd>Word gap. This option inserts a pause between words. The value is the length of the pause, in units of 10 mS (at the default speed of 170 wpm).
<p>
<dt>
<strong>-h </strong> or <strong> --help</strong><br>
<dd>The first line of output gives the eSpeak version number.
<p>
<dt>
<strong>-k &lt;integer&gt;</strong><br>
<dd>Indicate words which begin with capital letters.<p>
1 &nbsp; eSpeak uses a click sound to indicate when a word starts with a capital letter, or double click if word is all capitals.<p>
2 &nbsp; eSpeak speaks the word "capital" before a word which begins with a capital letter.<p>
Other values: &nbsp; eSpeak increases the pitch for words which begin with a capital letter. The greater the value, the greater the increase in pitch. Try -k20.
<p>
<dt>
<strong>-l &lt;integer&gt;</strong><br>
<dd>Line-break length, default value 0. If set, then lines which are shorter
than this are treated as separate clauses and spoken separately with a
break between them. This can be useful for some text files, but bad for
others.
<p>
<dt>
<strong>-m</strong><br>
<dd>Indicates that the text contains SSML (Speech Synthesis Markup Language) tags or other XML tags. Those SSML tags which are supported are interpreted. Other tags, including HTML, are ignored, except that some HTML tags such as &lt;hr&gt; &lt;h2&gt; and &lt;li&gt; ensure a break in the speech.
<p>
<dt>
<strong>-q</strong><br><dd>
Quiet. No sound is generated. This may be useful with options such as -x and --pho.
<p>
<dt>
<strong>-v &lt;voice filename&gt;[+&lt;variant&gt;]</strong><br>
<dd>Sets a Voice for the speech, usually to select a language. eg:
<pre> espeak-ng -vaf</pre>
To use the Afrikaans voice. A modifier after the voice name can be used to vary the tone of the voice, eg:
<pre> espeak-ng -vaf+3</pre>
The variants are <code> +m1 +m2 +m3 +m4 +m5 +m6 +m7</code> for male voices and <code> +f1 +f2 +f3 +f4 </code> which simulate female voices by using higher pitches. Other variants include <code>+croak</code> and <code>+whisper</code>.
<p>
&lt;voice filename&gt; is a file within the <code>espeak-data/voices</code> directory.<br>
&lt;variant&gt; is a file within the <code>espeak-data/voices/!v</code> directory.<p>
Voice files can specify a language, alternative pronunciations or phoneme sets, different pitches, tonal qualities, and prosody for the voice.
See the <a href="voices.html">voices.html</a> file.<p>
Voice names which start with <b>mb-</b> are for use with Mbrola diphone voices, see <a href="mbrola.html">mbrola.html</a><p>
Some languages may need additional dictionary data, see <a href="languages.html">languages.html</a>
<p>
<dt>
<strong>-w &lt;wave file&gt;</strong><br>
<dd>Writes the speech output to a file in WAV format, rather than speaking it.
<p>
<dt>
<strong>-x</strong><br>
<dd>The phoneme mnemonics, into which the input text is translated, are written to stdout.
If a phoneme name contains more than one letter (eg. [tS]), the --sep or --tie option can be used to distinguish
this from separate phonemes.
<p>
<dt>
<strong>-X</strong><br>
<dd>As -x, but in addition, details are shown of the pronunciation rule and dictionary list lookup. This can be useful to see why a certain pronunciation is being produced. Each matching pronunciation rule is listed, together with its score, the highest scoring rule being used in the translation. "Found:" indicates the word was found in the dictionary lookup list, and "Flags:" means the word was found with only properties and not a pronunciation. You can see when a word has been retranslated after removing a prefix or suffix.
<p>
<dt>
<strong>-z</strong><br>
<dd>The option removes the end-of-sentence pause which normally occurs at the end of the text.
<p>
<dt>
<strong>--stdout</strong><br>
<dd>Writes the speech output to stdout as it is produced, rather than speaking it. The data starts with a WAV file header which indicates the sample rate and format of the data. The length field is set to zero because the length of the data is unknown when the header is produced.
<p>
<dt><strong>--compile [=&lt;voice name&gt;]</strong><br>
<dd>
Compile the pronunciation rule and dictionary lookup data from their source files in the current directory. The Voice determines which language's files are compiled. For example, if it's an English voice, then <em>en_rules</em>, <em>en_list</em>, and <em>en_extra</em> (if present), are compiled to replace <em>en_dict</em> in the <em>speak-data</em> directory. If no Voice is specified then the default Voice is used.
<p>
<dt><strong>--compile-debug [=&lt;voice name&gt;]</strong><br>
<dd>
The same as <strong>--compile</strong>, but source line numbers from the *_rules file are included. These are included in the rules trace when the <strong>-X</strong> option is used.
<p>
<dt><strong>--ipa</strong><br>
<dd>
Writes phonemes to stdout, using the International Phonetic Alphabet (IPA).<br>
If a phoneme name contains more than one letter (eg. [tS]), the --sep or --tie option can be used to distinguish
this from separate phonemes.
<p>
<dt><strong>--path [="&lt;directory path&gt;"]</strong><br>
<dd>
Specifies the directory which contains the espeak-data directory.
<p>
<dt><strong>--pho</strong><br>
<dd>
When used with an mbrola voice (eg. -v mb-en1), it writes mbrola phoneme data (.pho file format) to stdout. This includes the mbrola phoneme names with duration and pitch information, in a form which is suitable as input to this mbrola voice. The --phonout option can be used to write this data to a file.
<p>
<dt><strong>--phonout [="&lt;filename&gt;"]</strong><br>
<dd>
If specified, the output from -x, -X, --ipa, and --pho options is written to this file, rather than to stdout.
<p>
<dt><strong>--punct [="&lt;characters&gt;"]</strong><br>
<dd>
Speaks the names of punctuation characters when they are encountered in the text. If &lt;characters&gt; are given, then only those listed punctuation characters are spoken, eg. <code> --punct=".,;?"</code>
<p>
<dt><strong>--sep [=&lt;character&gt;]</strong><br>
<dd>
The character is used to separate individual phonemes in the output which is produced by the -x or --ipa options. The default is a space character. The character z means use a ZWNJ character (U+200c).
<p>
<dt><strong>--split [=&lt;minutes&gt;]</strong><br>
<dd>
Used with <strong>-w</strong>, it starts a new WAV file every <code>&lt;minutes&gt;</code> minutes, at the next sentence boundary.
<p>
<dt><strong>--tie [=&lt;character&gt;]</strong><br>
<dd>
The character is used within multi-letter phonemes in the output which is produced by the -x or --ipa options. The default is the tie character&nbsp; &#x361; &nbsp;U+361. The character z means use a ZWJ character (U+200d).
<p>
<dt>
<strong>--voices [=&lt;language code&gt;]</strong><br>
<dd>Lists the available voices.<br>
If =&lt;language code&gt; is present then only those voices which are suitable for that language are listed.<br>
<code>--voices=mbrola</code> lists the voices which use mbrola diphone voices. These are not included in the default <code>--voices</code> list<br>
<code>--voices=variant</code> lists the available voice variants (voice modifiers).<br>

</dl>
<p>&nbsp;<hr>
<h3>2.2.3 The Input Text</h3>
<dl>
<dt><b>HTML Input</b>
<dd>
If the -m option is used to indicate marked-up text, then HTML can be spoken directly.
<p>
<dt><b>Phoneme Input</b>
<dd>
As well as plain text, phoneme mnemonics can be used in the text input to <strong>espeak-ng</strong>. They are enclosed within double square brackets. Spaces are used to separate words and all stressed syllables must be marked explicitly.<p>
&nbsp; eg: &nbsp; <code> espeak-ng -v en "[[D,Is Iz sVm f@n'EtIk t'Ekst 'InpUt]]" </code><p>
This command will speak: "This is some phonetic text input".
</dl>

<hr>
<a href="http://sourceforge.net"><img src="http://sflogo.sourceforge.net/sflogo.php?group_id=159649&amp;type=2" width="125" height="37" border="0" alt="SourceForge.net Logo" /></a>

</body>
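
The option ranges documented in the removed commands.html (amplitude 0 to 200, pitch 0 to 99, speed with a lower limit of 80) can be checked before a command line is assembled. The following is a minimal sketch, not part of this commit: the helper name `build_espeak_args` and its parameters are hypothetical, and only flags that the removed page documents (`-a`, `-p`, `-s`, `-v`, `-w`) are used. Options are passed only when explicitly set, so the sketch makes no claim about default values.

```python
from typing import List, Optional


def build_espeak_args(
    text: str,
    amplitude: Optional[int] = None,
    pitch: Optional[int] = None,
    speed: Optional[int] = None,
    voice: Optional[str] = None,
    wav_path: Optional[str] = None,
) -> List[str]:
    """Assemble an espeak-ng argument list (hypothetical helper).

    The ranges are taken from the removed commands.html page.
    """
    args = ["espeak-ng"]
    if amplitude is not None:
        if not 0 <= amplitude <= 200:
            raise ValueError("amplitude (-a) must be between 0 and 200")
        args += ["-a", str(amplitude)]
    if pitch is not None:
        if not 0 <= pitch <= 99:
            raise ValueError("pitch (-p) must be between 0 and 99")
        args += ["-p", str(pitch)]
    if speed is not None:
        if speed < 80:
            raise ValueError("speed (-s) has a lower limit of 80")
        args += ["-s", str(speed)]
    if voice is not None:
        # A variant may follow the voice name, e.g. "af+3".
        args += ["-v", voice]
    if wav_path is not None:
        # Write a WAV file instead of speaking the text.
        args += ["-w", wav_path]
    args.append(text)
    return args
```

The returned list could be handed to `subprocess.run` on a system where espeak-ng is installed; the sketch itself only builds and validates the arguments.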

+ 0
- 1
docs/index.md

@@ -2,7 +2,6 @@

- [Features](#features)
- [History](#history)
- [Installation](commands.html)
- [Languages](languages.html)
- [Documents](docindex.html)
- [License](../COPYING)

+ 33
- 4
src/espeak-ng.1.ronn

@@ -11,7 +11,7 @@ languages.

## OPTIONS

-* `-h`:
+* `-h`, `--help`:
Show summary of options.

* `--version`:
@@ -51,11 +51,16 @@ words are provided then text is spoken from stdin a line at a time.
Speed in words per minute, default is 160.

* `-v <voice name>`:
-Use voice file of this name from espeak-data/voices.
+Use voice file of this name from espeak-data/voices. A variant can be
+specified using <voice>+<variant>, such as af+m3.

* `-w <wave file name>`:
Write output to this WAV file, rather than speaking it directly.

+* `--split=<minutes>`:
+Used with `-w` to split the audio output into &lt;minutes&gt; recorded
+chunks.

* `-b`:
Input text encoding, 1=UTF8, 2=8 bit, 4=16 bit.

@@ -81,9 +86,9 @@ words are provided then text is spoken from stdin a line at a time.

* `--compile=voicename`:
Compile the pronunciation rules and dictionary in the current directory.
-=&lt;voice name&lt; is optional and specifies which language.
+=&lt;voicename&lt; is optional and specifies which language is compiled.

-* `--compile=debug`:
+* `--compile-debug=voicename`:
Compile the pronunciation rules and dictionary in the current directory as
above, but include line numbers, that get shown when -X is used.

@@ -91,6 +96,9 @@ words are provided then text is spoken from stdin a line at a time.
Write phonemes to stdout using International Phonetic Alphabet. --ipa=1 Use
ties, --ipa=2 Use ZWJ, --ipa=3 Separate with _.

+* `--tie=<character>`:
+The character to use to join multi-letter phonemes in -x and --ipa output.

* `--path=<path>`:
Specifies the directory containing the espeak-data directory.

@@ -104,6 +112,9 @@ words are provided then text is spoken from stdin a line at a time.
Speak the names of punctuation characters during speaking. If
=&lt;characters&gt; is omitted, all punctuation is spoken.

+* `--sep=<character>`:
+The character to separate phonemes from the -x and --ipa output.

* `--voices[=<language code>]`:
Lists the available voices. If =&lt;language code&gt; is present then only
those voices which are suitable for that language are listed.
@@ -111,6 +122,24 @@ words are provided then text is spoken from stdin a line at a time.
* `--voices=<directory>`:
Lists the voices in the specified subdirectory.

+## EXAMPLES
+
+* `espeak-ng "This is a test"`:
+Speak the sentence "This is a test" using the default English voice.
+
+* `espeak-ng -f hello.txt`:
+Speak the contents of hello.txt using the default English voice.
+
+* `cat hello.txt | espeak-ng`:
+Speak the contents of hello.txt using the default English voice.
+
+* `espeak-ng -x hello`:
+Speak the word "hello" using the default English voice, and print the
+phonemes that were spoken.
+
+* `espeak-ng -ven-us "[[h@'loU]]"`:
+Speak the phonemes "h@'loU" using the American English voice.

## AUTHOR

eSpeak NG is maintained by Reece H. Dunn <[email protected]>. It is based on
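
The removed commands.html notes that `--stdout` output starts with a WAV header whose length field is zero, because the length is unknown when the header is written. A consumer of that stream therefore cannot rely on the length field. The sketch below is not part of this commit: it assumes the canonical 44-byte PCM WAV header layout (the source does not confirm the exact layout espeak-ng emits), and the names `WavInfo` and `parse_wav_header` are hypothetical.

```python
import struct
from typing import NamedTuple, Optional


class WavInfo(NamedTuple):
    channels: int
    sample_rate: int
    bits_per_sample: int
    data_bytes: Optional[int]  # None when the header gives a zero length


def parse_wav_header(header: bytes) -> WavInfo:
    """Parse a canonical 44-byte PCM WAV header (assumed layout)."""
    if len(header) < 44:
        raise ValueError("need at least 44 bytes of header")
    # RIFF chunk descriptor: "RIFF", chunk size, "WAVE".
    riff, _riff_size, wave = struct.unpack_from("<4sI4s", header, 0)
    # "fmt " sub-chunk with the PCM format fields.
    (fmt_id, _fmt_size, _audio_format, channels, sample_rate,
     _byte_rate, _block_align, bits) = struct.unpack_from("<4sIHHIIHH", header, 12)
    # "data" sub-chunk id and its length field.
    data_id, data_size = struct.unpack_from("<4sI", header, 36)
    if (riff, wave, fmt_id, data_id) != (b"RIFF", b"WAVE", b"fmt ", b"data"):
        raise ValueError("not a canonical PCM WAV header")
    # A zero length means "unknown": read samples until end of stream.
    return WavInfo(channels, sample_rate, bits, data_size or None)
```

When `data_bytes` is `None`, the caller would read audio samples until the stream ends rather than counting bytes.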
