SSML and HTML Support
SSML (Speech Synthesis Markup Language)
SSML consists of XML-like tags, for example:
Did you mean the <emphasis level="strong"><prosody pitch="75">green</prosody></emphasis> beans?
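The example sentence can also be assembled programmatically, so that tags are always properly nested and closed. The following is a minimal sketch using Python's standard xml.etree.ElementTree module. The function name build_example is hypothetical, and wrapping the text in a speak element is a choice made for this sketch, not something eSpeak is stated to require.

```python
import xml.etree.ElementTree as ET


def build_example():
    """Build the 'green beans' sentence as a well-formed SSML string."""
    # Root element; wrapping in <speak> is an assumption of this sketch.
    speak = ET.Element("speak")
    speak.text = "Did you mean the "

    # <emphasis level="strong"> containing <prosody pitch="75">green</prosody>
    emphasis = ET.SubElement(speak, "emphasis", level="strong")
    prosody = ET.SubElement(emphasis, "prosody", pitch="75")
    prosody.text = "green"

    # Text that follows the closing </emphasis> tag.
    emphasis.tail = " beans?"

    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(speak, encoding="unicode")


ssml = build_example()
print(ssml)
```

The resulting string can then be passed to the synthesizer in whatever way your setup supports. With the command-line program, markup interpretation is normally switched on with the -m option, but check the help text of your installed version.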
The following markup tags and attributes are recognised:
speak
- xml:base (the value is just passed back as a parameter with the UriCallback() function)
- xml:lang
voice
- xml:lang
- name
- age
- variant
- gender
prosody
- rate (x-slow, slow, medium, fast, x-fast or a percentage such as 125%)
- volume (silent, x-soft, soft, medium, loud, x-loud, +1dB or -1dB)
- pitch (a number, for example “75”)
- range (default, x-low, low, medium, high, x-high)
say-as
- interpret-as="characters"
- interpret-as="characters" format="glyphs"
- interpret-as="tts:key"
- interpret-as="tts:char"
- interpret-as="tts:digits"
mark
s
p
sub
tts:style
- field="punctuation" mode=none,all,some
- field="capital_letters" mode=no,spelling,icon,pitch
audio
emphasis
- level (none, reduced, moderate, strong or x-strong)
break
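As a quick check that a piece of markup uses only the tags listed above, the tag names can be compared against that list before the text is sent to the synthesizer. The sketch below is an illustration only and is not part of eSpeak. The names RECOGNISED_TAGS, TagCollector and unlisted_tags are hypothetical. It uses Python's html.parser rather than an XML parser, because the tts:style prefix is not bound to a namespace in plain snippets and a strict XML parser would reject it.

```python
from html.parser import HTMLParser

# Tag names copied from the list above.
RECOGNISED_TAGS = {
    "speak", "voice", "prosody", "say-as", "mark", "s", "p", "sub",
    "tts:style", "audio", "emphasis", "break",
}


class TagCollector(HTMLParser):
    """Record the name of every opening (or self-closing) tag seen."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names before calling this method.
        # Self-closing tags such as <break/> also arrive here by default.
        self.tags.append(tag)


def unlisted_tags(markup):
    """Return a sorted list of tag names that are not in RECOGNISED_TAGS."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return sorted({tag for tag in collector.tags if tag not in RECOGNISED_TAGS})


sample = (
    '<speak><prosody rate="slow">Hello</prosody><break/>'
    '<say-as interpret-as="characters">abc</say-as>'
    '<blink>world</blink></speak>'
)
print(unlisted_tags(sample))
```

This only checks tag names. It does not validate attributes or their values, and it says nothing about how eSpeak itself treats a tag that is missing from the list.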
HTML
eSpeak can speak HTML text directly, or text containing both SSML and HTML markup.
Any unrecognised tags are ignored.
The following tags cause a sentence break:
The following tags cause a paragraph break:
Text between the following tags is ignored:
References
SSML
- Speech Synthesis Markup Language (SSML) Version 1.0. W3C Recommendation, 7 September 2004. W3C.
- Speech Synthesis Markup Language (SSML) Version 1.1. W3C Recommendation, 7 September 2010. W3C.
- SSML 1.0 say-as attribute values. W3C NOTE, 26 May 2005. W3C.
HTML
- HTML 5.2. W3C Recommendation, 14 December 2017. W3C.
- HTML Living Standard. Continually updated. WHATWG.