Russian: Added several new words, including medical and linguistic terms and the names of some government organizations of the Russian Federation.
Rewrite cmn_rules. A vowel is spoken as Mandarin only when it is accompanied by a tone number; otherwise it is treated as English. This lets English words be translated more accurately.
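The tone-number rule above can be illustrated with a small Python sketch. This is not the cmn_rules syntax or the actual espeak logic; the function name `classify_token`, the assumption that the tone number is a trailing digit, and the tone range 1-5 are all hypothetical, chosen only to show the idea.

```python
import re

# Assumption: a Mandarin token is a run of Latin letters followed by
# exactly one tone digit in the range 1-5. The real rules may differ.
PINYIN_WITH_TONE = re.compile(r"[A-Za-z]+[1-5]")


def classify_token(token: str) -> str:
    """Return 'mandarin' for tone-numbered tokens, otherwise 'english'."""
    if PINYIN_WITH_TONE.fullmatch(token):
        return "mandarin"
    return "english"
```

Under these assumptions, `classify_token("ma3")` is treated as Mandarin, while `classify_token("ma")` falls back to English.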
cmn: search for dictionary matches instead of translating characters.
cmn (Mandarin Chinese) has been broken since 4825905.
This fix makes Mandarin behave more like Cantonese. Instead of
translating characters, we search for dictionary matches.
The behaviour of normal vs Chao tones should be investigated further.
It looks like pinyin written in Latin characters still uses Chao tones,
whereas the characters in cmn_list and cmn_listx do not.
See #1044 for discussion. See also #1028 and #1163.
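As a rough illustration of searching for dictionary matches instead of translating one character at a time, here is a minimal Python sketch. It uses greedy longest-match lookup, which is an assumption on my part and not necessarily how espeak performs the search; the `segment` function and the toy lexicon are hypothetical and are not taken from cmn_list.

```python
def segment(text, lexicon):
    """Split text into (chunk, reading) pairs using greedy longest match.

    Characters with no lexicon entry are emitted alone with reading None.
    """
    # Longest key bounds how far ahead we need to look (1 if lexicon is empty).
    max_len = max((len(key) for key in lexicon), default=1)
    result = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for n in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + n]
            if chunk in lexicon:
                result.append((chunk, lexicon[chunk]))
                i += n
                break
        else:
            # No entry matched at this position: pass the character through.
            result.append((text[i], None))
            i += 1
    return result


# Toy lexicon for illustration only.
toy_lexicon = {"你好": "ni3 hao3", "你": "ni3", "好": "hao3"}
```

With this toy lexicon, `segment("你好", toy_lexicon)` returns the single multi-character entry rather than two per-character readings.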
This item causes `冉冉` to be pronounced three times rather than twice,
and it is unnecessary because this character has no alternative
pronunciation.
Signed-off-by: Icenowy Zheng <[email protected]>
Update zh_listx and zhy_listx to capture the corrections made by the Ekho project. Also add readings for characters in the Enclosed Ideographic Supplement and CJK Compatibility Ideographs blocks of Unicode, which appear in some texts.
The espeak Subversion and development releases do not provide the
full dictionary data for the Russian and Chinese languages due to
their size. The supplemental data is instead provided at
http://espeak.sourceforge.net/data/ for the user to download and
use to build the improved language files themselves.
These files are included in the dictsource/extra directory for
completeness and so they can be properly versioned/tracked over
time.
They are not included in the dictsource directory as this would
cause them to be used to build the dictionary files, so the results
would differ from the dictionaries built by espeak.