The Elvish languages have the same general stress rule as Latin [1]:
stress falls on the penultimate syllable if that is heavy (contains a
long vowel, a diphthong, or a vowel followed by two or more consonants),
otherwise on the antepenultimate syllable. For Latin, espeak-ng
implements this by declaring “penultimate syllable” as the general
stress rule in espeak-ng-data/, and then adding rules in dictsource/
that match light syllables and move the primary stress to the previous
syllable, i.e. the antepenultimate one. We use the same basic principle
for the Elvish languages here (but using the terms “heavy” and “light”
rather than “weak” and “strong” as in the Latin files).
Note that this doesn’t fully implement the stress rules yet: we have no
concept of diphthongs, long vowels aren’t handled properly, and the
digraphs ⟨ch⟩, ⟨dh⟩, ⟨th⟩ are still each counted as two consonants rather
than as one. This will be improved separately (I prefer doing this in
small incremental steps).
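For illustration only, the general rule described at the top of this message can be sketched in C as follows. This is not how espeak-ng implements it (that is done with dictsource/ rules, as explained above); the Syllable struct and both function names are hypothetical, and the handling of words with one or two syllables is an assumption that the message itself does not spell out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical syllable description; not an espeak-ng type. */
typedef struct {
    bool long_vowel;       /* the nucleus is a long vowel */
    bool diphthong;        /* the nucleus is a diphthong */
    int consonants_after;  /* consonants between this vowel and the next one */
} Syllable;

/* A syllable is heavy if it has a long vowel, a diphthong, or a vowel
 * followed by two or more consonants. */
static bool is_heavy(const Syllable *s)
{
    return s->long_vowel || s->diphthong || s->consonants_after >= 2;
}

/* Returns the zero-based index of the stressed syllable, or -1 for an
 * empty word. */
int stressed_syllable(const Syllable *syl, size_t count)
{
    if (count == 0)
        return -1;
    /* Assumption: words of one or two syllables stress the first one. */
    if (count <= 2)
        return 0;
    /* Heavy penultimate syllable: stress stays there. */
    if (is_heavy(&syl[count - 2]))
        return (int)count - 2;
    /* Light penultimate syllable: stress moves to the antepenultimate. */
    return (int)count - 3;
}
```

In this sketch a digraph such as ⟨th⟩ would have to be counted as a single consonant by whatever code fills in consonants_after.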
[1]: https://menegroth.github.io/stress-in-sindarin.html
This prepares the languages of Quenya and Sindarin, setting up their
infrastructure without declaring a lot of rules yet – just enough for
“Eä” (a Quenya word, but I can’t think of a similarly simple one for
Sindarin). Phonemes are inherited from Esperanto for now.
Revert "Move most stress rule definitions from tr_languages.c to language files"
This reverts commit 0f55204522.
The reverted commit breaks the use of voice files like mb-de5-en. A
reworked version could also use named values for the stress rules
(e.g. 'first-syllable').
Move most stress rule definitions from tr_languages.c to language files
The stressRule keyword in voices.c handles setting langopts->stress_rule.
tr_languages.c still contains the settings for some languages that don't have a language file yet.
Contributes to:
Issue #218, https://github.com/espeak-ng/espeak-ng/issues/218
Move handling of SetLetterVowel() to language files
Contributes to:
Issue #218, https://github.com/espeak-ng/espeak-ng/issues/218
Changes:
Language files have a new keyword, letterVowel. Its value can be a Latin letter from a-z or a hex value (used by Bulgarian).
Errors in parsing the values are only reported; nothing else is done about them.
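As a rough sketch of the value handling described above (not the actual code in voices.c), a parser for such a value could look like this in C. The function name parse_letter_value, the error messages, and the choice to treat a lone a-z character as a letter before trying hex are all assumptions made for the example.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns the character code for a letterVowel
 * value, or -1 if the token cannot be parsed. Errors are only reported
 * on stderr; the caller decides what to do next. */
int parse_letter_value(const char *token)
{
    if (token == NULL || token[0] == '\0') {
        fprintf(stderr, "letterVowel: missing value\n");
        return -1;
    }

    /* A single Latin letter a-z is used as-is. */
    if (token[0] >= 'a' && token[0] <= 'z' && token[1] == '\0')
        return token[0];

    /* Anything else is tried as a hex number; strtol with base 16
     * accepts an optional "0x" prefix. */
    char *end = NULL;
    errno = 0;
    long value = strtol(token, &end, 16);
    if (errno != 0 || end == token || *end != '\0' ||
        value < 0 || value > 0x10FFFF) {
        fprintf(stderr, "letterVowel: cannot parse '%s'\n", token);
        return -1;
    }
    return (int)value;
}
```

Checking for a single letter first matters because a-f are also valid hex digits; with this order, "a" always means the letter.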
About testing:
I haven't noticed any differences in the output audio with letterVowel set or unset in any tested language. The code seems to work and the correct bits seem to be set, but I don't know how to confirm this from the final audio.
TODO:
1. Write better documentation in docs/voices.md
2. The code uses new_translator. Should it use translator instead?