Add Pashto language support based on Urdu language files
This commit adds support for the Pashto language (ps) to espeak-ng. The implementation is based on the Urdu language files and includes the ps_rules, ps_list, ps_emoji, and ps_extra files. It also updates Makefile.am to include Pashto in the dictionary targets and build rules.
fix: add Pashto data file.
refactor: use enhanced rules for stress.
fix: add missing configs.
Add Pashto phoneme support and improve voice files
fix: add Pashto phonemes test.
fix: restore original phonemes.
fix: remove redundant ps_dict from Makefile.am.
fix: use correct phonemes with IPA and stress rules.
feat: translate all en_emoji to Pashto.
fix: add Pashto dict entry in Makefile.am
feat: enhance ps_rules with example pairs and words.
This fixes the intonation of questions in Latin American Spanish and removes the phoneme replacement (T to s), because it is already included in the es_la phonemes. It also removes the line "language es 6", because when a voice in the dialect of Spain automatically changes to the dialect of Latin America, it does not return to the dialect of Spain.
- fixed/removed non-working rules in be_list
- added stress to the words in be_list
- fixed multi-thousand transcription
- removed non-working rules in be_rules
- added rules for palatalization and phoneme lengthening
- fixed dropping of [a] at the end of words
- fixed the message "Full dictionary is not installed for"
- added configuration in tr_languages.c
- fixed/added phonemes for `Q`, `ts`, `ts;`, `dz`, `dz.`, `;` etc
Rewrite cmn_rules. A vowel is now spoken as Mandarin only when it is accompanied by a tone number; otherwise, it is treated as English. This makes the translation of English words more accurate.
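The decision described above can be illustrated with a minimal Python sketch. It is not how cmn_rules is written (espeak-ng rules use their own rule syntax and work at the letter-group level). The token-level split, the function names, and the assumption that tone numbers are the digits 1 to 5 are all hypothetical choices made for this illustration.

```python
import re

# Assumption: a pinyin syllable is a run of letters followed by one tone digit 1-5.
PINYIN_WITH_TONE = re.compile(r"^[A-Za-zü]+[1-5]$")


def classify(token: str) -> str:
    """Return 'cmn' for a token that carries a tone number, otherwise 'en'."""
    return "cmn" if PINYIN_WITH_TONE.match(token) else "en"


def tag_tokens(text: str) -> list[tuple[str, str]]:
    """Split on whitespace and tag each token with the language it would be read as."""
    return [(tok, classify(tok)) for tok in text.split()]
```

With this sketch, `tag_tokens("ni3 hao3 hello")` tags the first two tokens as `cmn` and `hello` as `en`, while a bare `ma` without a tone number is tagged `en`.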
This commit implements support for [Totontepec Mixe](https://en.wikipedia.org/wiki/Totontepec_Mixe). The Espeak rules are based on the phonological inventory, orthographic mappings, and phonetic processes described in the "Esbozo fonológico" (phonological outline/sketch) chapter of Verónica Guzmán Guzmán's 2012 master's thesis in Indo American Linguistics awarded by the [Centro de Investigaciones y Estudios Superiores en Antropología Social](https://ciesas.edu.mx/) and *Vocabulario Mixe de Totontepec* (Totontepec Mixe vocabulary), compiled by Alvin Schoenhals and Louise C. Schoenhals and published by the Summer Institute of Linguistics in 1965.
This commit was developed as part of a project for [Computational Linguistics](https://jnw.domains.swarthmore.edu/ling073/syllabus.php) at [Swarthmore College](https://swarthmore.edu). We feel that this language is suitable for merging with "testing" status, but further verification and improvements by native speakers would be very helpful.
co-authored-by: Elizabeth Resendiz <[email protected]>
The Elvish languages have the same general stress rule as Latin [1]:
stress falls on the penultimate syllable if that is heavy (contains a
long vowel, a diphthong, or a vowel followed by two or more consonants),
otherwise on the antepenultimate syllable. For Latin, espeak-ng
implements this by declaring “penultimate syllable” as the general
stress rule in espeak-ng-data/, and then adding rules in dictsource/
that match light syllables and move the primary stress to the previous
syllable, i.e. the antepenultimate one. We use the same basic principle
for the Elvish languages here (but with the terms “heavy” and “light”
rather than “weak” and “strong”, which the Latin files use).
Note that this doesn’t fully implement the stress rules yet: we have no
concept of diphthongs, long vowels aren’t really properly handled yet,
and we also still count ⟨ch⟩, ⟨dh⟩, ⟨th⟩ as two consonants each rather
than one, as they should be. This will be improved separately (I prefer
doing this in small incremental steps).
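
As a rough illustration of the stress rule described above, here is a minimal Python sketch. It is not the espeak-ng rule syntax and not the code in this commit. The function name, the vowel sets, and the choice to treat accented vowels as long are assumptions made for the example. Like the commit, it has no concept of diphthongs and counts ⟨ch⟩, ⟨dh⟩, ⟨th⟩ as two consonants.

```python
# Assumption: plain and diaeresis vowels are short, accented vowels are long.
SHORT = set("aeiouyäëö")
LONG = set("áéíóúýâêîôûŷ")
VOWELS = SHORT | LONG


def stressed_syllable(word: str) -> int:
    """Return the 0-based index of the stressed syllable, counted from the start.

    Every vowel letter is treated as its own syllable nucleus (no diphthongs).
    """
    w = word.lower()
    nuclei = [i for i, ch in enumerate(w) if ch in VOWELS]
    n = len(nuclei)
    if n == 0:
        raise ValueError("word has no vowel")
    if n <= 2:
        return 0  # one or two syllables: stress the first one
    penult, last = nuclei[-2], nuclei[-1]
    consonants_between = last - penult - 1
    # Heavy penult: long vowel, or followed by two or more consonants.
    heavy = w[penult] in LONG or consonants_between >= 2
    return n - 2 if heavy else n - 3


print(stressed_syllable("Isildur"))  # 1: penult "sil" is followed by "ld"
print(stressed_syllable("Legolas"))  # 0: light penult, stress moves back
```

Because digraphs are counted as two consonants, a word such as "Denethor" comes out with penultimate stress in this sketch, which is the limitation noted above.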
[1]: https://menegroth.github.io/stress-in-sindarin.html
This prepares the languages of Quenya and Sindarin, setting up their
infrastructure without declaring a lot of rules yet – just enough for
“Eä” (a Quenya word, but I can’t think of a similarly simple one for
Sindarin). Phonemes are inherited from Esperanto for now.
cmn now handles all Latin characters as English.
The old behaviour of assuming that Latin characters are pinyin can be
achieved with the new language cmn-Latn-pinyin.
See commit message for 23a4d88f.
This commit fixes cmn and yue.
CalcPitches_Tone() now accepts cmn for translator_name.
SelectTranslator() now has a case for yue instead of zhy.
Option "language <name>" already causes SelectTranslator(<name>) to be
called. Having two options to do almost the same thing is unnecessary
and confusing.
In the long term, all options from SelectTranslator() should have a
switch case in LoadVoice() so they are user configurable (see #218). If
needed, a new option (maybe called "LoadOptions") could be added to load
an existing voice or language file.
Changes language configuration files for: hak, cmn, yue, ltg, ms, mb-ma1.
No changes for users.
Code cleanup: remove param2 from langopts and rename keyword option in language files.
- param2[] is only used to set a second value for LOPT_BRACKET_PAUSE. It is simpler
to have two values in param[] instead. This simplifies the codebase.
- Instead of setting "option bracket X Y" in language files, use
keywords "brackets X" and "bracketsAnnounced Y" instead to follow the
naming convention of other keywords.
- Add missing documentation to docs/voices.md.
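
The keyword rename above can be sketched as a small migration helper in Python. This helper is hypothetical and is not part of espeak-ng. It only assumes the mapping stated in this message ("option bracket X Y" becomes "brackets X" plus "bracketsAnnounced Y"), and the function names are made up for the example.

```python
def migrate_line(line: str) -> list[str]:
    """Rewrite one language-file line; unrelated lines are returned unchanged."""
    parts = line.split()
    if len(parts) == 4 and parts[0] == "option" and parts[1] == "bracket":
        # Old form: option bracket X Y
        return [f"brackets {parts[2]}", f"bracketsAnnounced {parts[3]}"]
    return [line]


def migrate(text: str) -> str:
    """Apply migrate_line to every line of a language file's contents."""
    out: list[str] = []
    for line in text.splitlines():
        out.extend(migrate_line(line))
    return "\n".join(out)
```

For a file containing the line `option bracket 0 4` (made-up values), the helper emits `brackets 0` followed by `bracketsAnnounced 4` and leaves every other line as it was.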