Add Pashto language support based on Urdu language files
This commit adds support for the Pashto language (ps) to espeak-ng. The implementation is based on the Urdu language files and includes the ps_rules, ps_list, ps_emoji, and ps_extra files. It also updates Makefile.am to include Pashto in the dictionary targets and build rules.
fix: add Pashto data file.
refactor: use enhanced rules for stress.
fix: add missing configs.
Add Pashto phoneme support and improve voice files
fix: add Pashto phonemes test.
fix: restore original phonemes.
fix: remove redundant ps_dict from Makefile.am.
fix: use correct phonemes with ipa and stress rules.
feat: translate all en_emoji to Pashto.
fix: add Pashto dict entry in Makefile.am
feat: enhance ps_rules with example pairs and words.
mbrola: Fix Portuguese and French voice priorities
- We want to use br* voices in priority for Brazilian Portuguese, but not
for general Portuguese.
- We want to use ca* and Belgian voices for Canadian and Belgian French,
but rather not for general French.
voice fast is related to speed.fast_settings. Others were errors in
LoadVoice() parsing or commented out code.
Note that some voices are from the original espeak and have comments
that might be outdated.
See commit message for 23a4d88f.
This commit fixes cmn and yue.
CalcPitches_Tone() now accepts cmn for translator_name.
SelectTranslator() now has a case for yue instead of zhy.
Option "language <name>" already causes SelectTranslator(<name>) to be
called. Having two options to do almost the same thing is unnecessary
and confusing.
In the long term, all options from SelectTranslator() should have a
switch case in LoadVoice() so they are user configurable (see #218). If
needed, a new option (maybe called "LoadOptions") could be added to load
an existing voice or language file.
Changes language configuration files for: hak, cmn, yue, ltg, ms, mb-ma1.
No changes to users.
The six modified files all had spurious characters, apparently
introduced as a result of files with the UTF-8 BOM marker, U+FEFF, which
is conventionally used at the start of text files to indicate a UTF-8
file and is invisible under normal circumstances (e.g. when the file is
opened as a text file).
None of these files are recognized by espeak-ng on Linux systems because
the 'language variant' line is seen by espeak-ng as starting with a new
character.
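A minimal sketch of why a leading BOM hides the 'language variant' line
from a naive prefix check. This is Python for illustration only, not the
espeak-ng parser; the file content and the first_line() helper are
hypothetical.

```python
import codecs


def first_line(raw: bytes, encoding: str) -> str:
    """Decode raw file bytes and return the first line without its line ending."""
    return raw.decode(encoding).splitlines()[0]


# Hypothetical file content: a UTF-8 BOM followed by the 'language variant' line.
raw = codecs.BOM_UTF8 + b"language variant\nname example\n"

plain = first_line(raw, "utf-8")         # the BOM survives as U+FEFF
stripped = first_line(raw, "utf-8-sig")  # the codec consumes the BOM

print(plain.startswith("language"))      # False
print(stripped.startswith("language"))   # True
```

Decoding with "utf-8-sig" is one way for a reader to tolerate the BOM;
removing the BOM from the files avoids the problem altogether.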
'gustave' is an uncorrupted file; it correctly starts with the BOM in
UTF-8 (three bytes). Even though the file is correct, espeak-ng does
not read it (this may be a separate bug!).
'marcelo' somehow got the BOM character replaced by a literal '?'.
Notice that 'git diff' on these changes will, indeed, show the removed
character in 'gustave' as a literal '?'. Notice also that the character
in question, the BOM, is actually the Unicode 'zero width no-break
space', so it is pretty much invisible.
The remaining files seem to have suffered major corruption, possibly as
a result of dostounix-style conversions. The line endings, normally <lf>
on Unix or <cr><lf> on Windows, had been converted to <cr><lf><lf> and
the BOM had been replaced by a <tab> character.
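The kinds of damage described above could be undone along these lines. This
is a sketch in Python, not the tool used for this commit; repair() is a
hypothetical helper. It assumes that a leading <tab> or '?' is always a
stand-in for a lost BOM, which must be checked per file before use.

```python
import codecs


def repair(raw: bytes) -> bytes:
    """Return raw without a leading BOM (or its stand-in) and with <lf> endings."""
    # Assumption: a leading tab or '?' replaced the BOM and carries no meaning.
    for prefix in (codecs.BOM_UTF8, b"\t", b"?"):
        if raw.startswith(prefix):
            raw = raw[len(prefix):]
            break
    # Collapse the doubly converted <cr><lf><lf> first, then ordinary <cr><lf>.
    return raw.replace(b"\r\n\n", b"\n").replace(b"\r\n", b"\n")


damaged = b"\tlanguage variant\r\n\nname example\r\n\n"
print(repair(damaged))  # b'language variant\nname example\n'
```

Replacing <cr><lf><lf> before <cr><lf> matters: done in the other order,
each damaged ending would leave an extra blank line behind.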
Signed-off-by: John Bowler <[email protected]>