code cleanup: start moving translateWord3() to a new source file.
The file will be organized to have one callable function only. This
should make code structure simpler.
Existing code will be changed to use function parameters instead of
global variables.
Possible problems include too many dependencies on numbers.c.
cmn: search for dictionary matches instead of translating characters.
cmn (Mandarin Chinese) has been broken since 4825905.
This fix makes Mandarin behave more like Cantonese. Instead of
translating characters, we search for dictionary matches.
The functionality of normal vs Chao tones should be investigated more.
It looks like Latin characters used as pinyin still use Chao tones,
whereas the characters in cmn_list and cmn_listx do not.
See #1044 for discussion. See also #1028 and #1163.
./src/espeak-ng -v ar --path=$PWD 5777
would access the vowel_stress array out of bounds, because GetVowelStress,
in the phcode == phonSYLLABIC case, only increments count if stress is 0.
This code seems fishy, and is not consistent with the loop in
SetWordStress. I made it similar to the other case above, thus fixing the
out-of-bounds access.
This modifies the stress in the tn test; I don't know whether this is
expected.
utf8_in2, when working in backward mode, assumes that it is given the
address of the last byte of the previous character (see all other
calls to utf8_in2). Otherwise, utf8_in2 returns the size of the current
multibyte character instead of that of the previous multibyte character.
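To illustrate the convention, here is a minimal sketch of a backward
UTF-8 length scan. utf8_prev_len is a hypothetical helper written for
this note, not the actual utf8_in2 implementation, and it assumes
well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT espeak-ng's utf8_in2.
 * `last` must point at the LAST byte of the character to measure and
 * `start` at the beginning of the buffer. Returns the byte length of
 * that character, assuming well-formed UTF-8. */
static size_t utf8_prev_len(const char *start, const char *last)
{
    const char *p = last;

    assert(last >= start);
    /* Continuation bytes look like 10xxxxxx; walk back over them
     * until the lead byte (or the buffer start) is reached. */
    while (p > start && ((unsigned char)*p & 0xC0) == 0x80)
        p--;
    return (size_t)(last - p) + 1;
}
```

With the text "a\xC3\xA9z" and a cursor on 'z' (index 3), passing
cursor - 1 measures the two-byte character before the cursor, whereas
passing the cursor itself measures 'z', i.e. the current character.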
This first reverts "Fix number_buf buffer overflow"
(commit ada93e2db0)
This for loop is apparently actually expected to skip over NUL
characters.
Fixes #1302
Instead, this limits number processing to 32 digits, as break_numbers does
not support more and would produce bogus results with further digits.
Also fix the signedness of break_numbers so that the 32nd bit
actually works.
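The signedness issue can be sketched as follows, assuming break_numbers
is used as a bitmask with one bit per digit position. MAX_NUMBER_DIGITS,
has_break_at and clamp_digit_count are hypothetical names used only for
this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed limit: one mask bit per digit position. */
#define MAX_NUMBER_DIGITS 32

/* Hypothetical helper: is the bit for digit position `pos` set? */
static bool has_break_at(uint32_t break_mask, unsigned pos)
{
    assert(pos < MAX_NUMBER_DIGITS);
    /* Shifting an unsigned value is well defined for pos == 31,
     * unlike (1 << 31) on a 32-bit signed int. */
    return ((break_mask >> pos) & 1u) != 0;
}

/* Hypothetical helper: clamp a digit count to what the mask covers. */
static unsigned clamp_digit_count(unsigned n_digits)
{
    return n_digits > MAX_NUMBER_DIGITS ? MAX_NUMBER_DIGITS : n_digits;
}
```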
TranslateWord2 passes translator2 as tr to TranslateWord which may call
TranslateWord3, SpeakIndividualLetters, TranslateLetter, which was calling
SetTranslator2 again, thus freeing the very tr being used. Make the
latter use another translator.
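The idea can be sketched with two independent slots, so that replacing
the inner one never frees the object the outer caller still holds.
SketchTranslator, outer_slot, inner_slot and replace_slot are
hypothetical stand-ins, not the espeak-ng types or functions.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a translator object. */
typedef struct {
    char lang[8];
} SketchTranslator;

static SketchTranslator *outer_slot; /* held by the outer call chain */
static SketchTranslator *inner_slot; /* used by the nested letter lookup */

/* Free whatever the slot held and install a fresh object for `lang`. */
static SketchTranslator *replace_slot(SketchTranslator **slot, const char *lang)
{
    free(*slot);
    *slot = calloc(1, sizeof **slot);
    if (*slot != NULL)
        snprintf((*slot)->lang, sizeof((*slot)->lang), "%s", lang);
    return *slot;
}
```

If the nested call reused outer_slot instead, the free() above would
release the object that the outer caller is still dereferencing.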
When there are language switches, when we rewind to the start of the
phoneme list we have to reset the phoneme table back.
This avoids some branching that depends on undefined values, caught by
valgrind in the case e.g. of an emoji substitution that contains a
language switch.
Ref #874
And make sure that the unfilled entries are NULL to catch any spurious
reference.
Also fix some missing phoneme table switch according to language
changes.
And fill the last phlist prepause and newword fields, otherwise they are
detected as undefined:
==483407== Conditional jump or move depends on uninitialised value(s)
==483407== at 0x488E6AB: Generate (synthesize.c:1228)
==483407== by 0x488FD94: SpeakNextClause (synthesize.c:1587)
==483407== by 0x4887F56: Synthesize (speech.c:457)
==483407== by 0x488884C: sync_espeak_Synth (speech.c:570)
==483407== by 0x487B270: espeak_Synth (espeak_api.c:90)
==483407== by 0x10ACA0: main (espeak-ng.c:691)
==483407== Uninitialised value was created by a client request
==483407== at 0x4884893: MakePhonemeList (phonemelist.c:155)
==483407== by 0x4895712: TranslateClause (translate.c:2682)
==483407== by 0x488FCCF: SpeakNextClause (synthesize.c:1569)
==483407== by 0x4887F56: Synthesize (speech.c:457)
==483407== by 0x488884C: sync_espeak_Synth (speech.c:570)
==483407== by 0x487B270: espeak_Synth (espeak_api.c:90)
==483407== by 0x10ACA0: main (espeak-ng.c:691)
==483407==
==483407== Conditional jump or move depends on uninitialised value(s)
==483407== at 0x488E622: Generate (synthesize.c:1211)
==483407== by 0x488FD94: SpeakNextClause (synthesize.c:1587)
==483407== by 0x4887F56: Synthesize (speech.c:457)
==483407== by 0x488884C: sync_espeak_Synth (speech.c:570)
==483407== by 0x487B270: espeak_Synth (espeak_api.c:90)
==483407== by 0x10ACA0: main (espeak-ng.c:691)
==483407== Uninitialised value was created by a client request
==483407== at 0x4884893: MakePhonemeList (phonemelist.c:155)
==483407== by 0x4895712: TranslateClause (translate.c:2682)
==483407== by 0x488FCCF: SpeakNextClause (synthesize.c:1569)
==483407== by 0x4887F56: Synthesize (speech.c:457)
==483407== by 0x488884C: sync_espeak_Synth (speech.c:570)
==483407== by 0x487B270: espeak_Synth (espeak_api.c:90)
==483407== by 0x10ACA0: main (espeak-ng.c:691)
==483407==
This changes the ssml.test output, but with no audible difference,
so this is probably a real fix for it.
These are global arrays reused several times. When using them, msan and
valgrind thus believe they are always initialized, which reduces their
ability to detect uninitialized values. We can, however, explicitly tell
them when the arrays are reused and should thus be considered uninitialized.
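A minimal sketch of such an annotation, using Valgrind's
VALGRIND_MAKE_MEM_UNDEFINED client request when the header is available
and a no-op otherwise. SKETCH_MARK_UNDEFINED, scratch and begin_reuse
are hypothetical names for this illustration; MSan has a comparable
poisoning interface that is not shown here.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__has_include)
#  if __has_include(<valgrind/memcheck.h>)
#    include <valgrind/memcheck.h>
#    define SKETCH_MARK_UNDEFINED(addr, len) \
         ((void)VALGRIND_MAKE_MEM_UNDEFINED((addr), (len)))
#  endif
#endif
#ifndef SKETCH_MARK_UNDEFINED
/* Fallback when the Valgrind header is not installed: do nothing. */
#  define SKETCH_MARK_UNDEFINED(addr, len) ((void)(addr), (void)(len))
#endif

#define SCRATCH_LEN 16
static int scratch[SCRATCH_LEN]; /* stand-in for a reused global array */

/* Start a new use of the array: tell the checker that the old contents
 * are stale, then initialize only the first n_valid entries. */
static size_t begin_reuse(size_t n_valid)
{
    assert(n_valid <= SCRATCH_LEN);
    SKETCH_MARK_UNDEFINED(scratch, sizeof scratch);
    for (size_t i = 0; i < n_valid; i++)
        scratch[i] = 0;
    return n_valid;
}
```

Under Valgrind, a later read of scratch[n_valid] or beyond is then
reported as uninitialized instead of silently returning stale data.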
When pollint() returns 100.0, multiplying by 2.55 doesn't actually seem to
yield 255 on i386. Multiplying by 255 and dividing by 100, however,
does (probably because 2.55 is not exactly representable in binary,
whereas floating-point computations on small integer values whose result
is an integer are exact).
Fixes #1151
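A minimal sketch of the integer-friendly scaling. percent_to_byte is a
hypothetical helper, not the function that was changed, and the [0, 100]
input range is an assumption.

```c
#include <assert.h>

/* Hypothetical helper: map a percentage in [0, 100] to [0, 255]. */
static int percent_to_byte(double percent)
{
    assert(percent >= 0.0 && percent <= 100.0);
    /* 2.55 has no exact binary representation, so percent * 2.55 may
     * land just below the intended integer and truncate one too low.
     * 255 and 100 are exact, so 100.0 * 255 / 100 is exactly 255. */
    return (int)(percent * 255 / 100);
}
```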