./src/espeak-ng -v ar --path=$PWD 5777
would access out of the vowel_stress array, because GetVowelStress, in the
phcode == phonSYLLABIC case, only increments count if stress is 0. This
code seems fishy, and is not coherent with the loop in SetWordStress. I
made it similar to the other case above, thus fixing the out-of-bound
access.
This is modifiying the stress in the tn test, I don't know whether this
is expected.
utf8_in2, when working in backward mode, is assuming that we are giving
the address of the last byte of the previous character (see all other
calls to utf8_in2). Otherwise, utf8_in2 returns the size of the current
multibyte character instead of that of the previous multibyte character.
This first reverts "Fix number_buf buffer overflow"
(commit ada93e2db0)
This for loop is apparently actually expected to to skip over NUL
characters.
Fixes #1302
Instead, this limits number processing to 32 digits, as break_numbers does
not support more and would provide bogus result with further digits.
Also fix the signedness of break_numbers so that the 32th bit
actually effectively works.
TranslateWord2 passes translator2 as tr to TranslateWord which may call
TranslateWord3, SpeakIndividualLetters, TranslateLetter, which was calling
SetTranslator2 again, thus freeing the very tr being used. Make that
latter use another translator.
When there are language switches, when we rewind to the start of the
phoneme list we have to reset the phoneme table back.
This avoids some branching that depends on undefined values, caught by
valgrind in the case e.g. of an emoji substitution that contains a
language switch.
Ref #874
And make sure that the unfilled entries are NULL to catch any spurious
reference.
Also fix some missing phoneme table switch according to language
changes.
And fill the last phlist prepause and newword fields, otherwise they are
detected as undefined:
==483407== Conditional jump or move depends on uninitialised value(s)
==483407== at 0x488E6AB: Generate (synthesize.c:1228)
==483407== by 0x488FD94: SpeakNextClause (synthesize.c:1587)
==483407== by 0x4887F56: Synthesize (speech.c:457)
==483407== by 0x488884C: sync_espeak_Synth (speech.c:570)
==483407== by 0x487B270: espeak_Synth (espeak_api.c:90)
==483407== by 0x10ACA0: main (espeak-ng.c:691)
==483407== Uninitialised value was created by a client request
==483407== at 0x4884893: MakePhonemeList (phonemelist.c:155)
==483407== by 0x4895712: TranslateClause (translate.c:2682)
==483407== by 0x488FCCF: SpeakNextClause (synthesize.c:1569)
==483407== by 0x4887F56: Synthesize (speech.c:457)
==483407== by 0x488884C: sync_espeak_Synth (speech.c:570)
==483407== by 0x487B270: espeak_Synth (espeak_api.c:90)
==483407== by 0x10ACA0: main (espeak-ng.c:691)
==483407==
==483407== Conditional jump or move depends on uninitialised value(s)
==483407== at 0x488E622: Generate (synthesize.c:1211)
==483407== by 0x488FD94: SpeakNextClause (synthesize.c:1587)
==483407== by 0x4887F56: Synthesize (speech.c:457)
==483407== by 0x488884C: sync_espeak_Synth (speech.c:570)
==483407== by 0x487B270: espeak_Synth (espeak_api.c:90)
==483407== by 0x10ACA0: main (espeak-ng.c:691)
==483407== Uninitialised value was created by a client request
==483407== at 0x4884893: MakePhonemeList (phonemelist.c:155)
==483407== by 0x4895712: TranslateClause (translate.c:2682)
==483407== by 0x488FCCF: SpeakNextClause (synthesize.c:1569)
==483407== by 0x4887F56: Synthesize (speech.c:457)
==483407== by 0x488884C: sync_espeak_Synth (speech.c:570)
==483407== by 0x487B270: espeak_Synth (espeak_api.c:90)
==483407== by 0x10ACA0: main (espeak-ng.c:691)
==483407==
This is changing the ssml.test output, but with no audible difference,
so this is probably a real fix for it.
These are global arrays reused several times. When using them msan and
valgrind thus believe they are always initialized, which reduces their
capacity to detect uninitialized values. We can however explicitly tell them
when they are reused, and thus to be considered as uninitialized.
When pollint() returns 100.0, multiplying by 2.55 doesn't actually seem to
be getting 255 on i386. Multiplying by 255 and dividing by 100, however,
does (probably because float computation with small integer values are
guaranteed to have integer results).
Fixes #1151
CheckThousandsGroup: Avoid reading uninitialized data
For the case when word is smaller than 4 characters, we should not look at
the 3rd or 4th character before checking the previous ones, otherwise
we'd at best read uninitialized data, at worse non-existing data.
Some rules test against character not being of a certain type. That may
match with the \0 end-of-text marker, and thus actually step over
it and let MatchRule continue with uninitialized data after it, leading
to potential random behavior.
This commits fixes it by making sure that we don't read past that \0.
This seems to be changing the pronunciation of "capitals" from k'apIt@Lz to
k'apIt,alz, I don't know why, I guess the rule for it was actually
bogus?
MatchRule: Prevent non-eating special characters from eating characters
Special characters such as N, S1, etc. are not actually eating
characters. Their treatment should thus *not* update pre_ptr and post_ptr,
otherwise those would underflow/overflow, e.g. in the case
@) s (_NS1 [z]
this would overflow. This for instance noticeable with the memory sanitizer:
ESPEAK_DATA_PATH=$PWD ./src/espeak-ng -qX "capitals"
Translate 'capitals'
1 c [k]
1 a [a]
1 p [p]
1 i [I]
1 t [t]
1 a [a]
1 l [l]
20 l (C [l]
==2837201==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x7f7f4422744b in utf8_in2 /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:281:2
#1 0x7f7f442281bc in utf8_in /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:332:9
#2 0x7f7f440e0d31 in MatchRule /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/dictionary.c:1767:21
#3 0x7f7f440d937f in TranslateRules /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/dictionary.c:2320:6
#4 0x7f7f44230e5f in TranslateWord3 /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:733:15
#5 0x7f7f44229844 in TranslateWord /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:1100:14
#6 0x7f7f44256e50 in TranslateWord2 /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:1361:11
#7 0x7f7f4424d6cc in TranslateClause /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:2623:17
#8 0x7f7f44213359 in SpeakNextClause /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/synthesize.c:1569:2
#9 0x7f7f441a9f56 in Synthesize /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/speech.c:457:2
#10 0x7f7f441a9023 in sync_espeak_Synth /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/speech.c:570:29
#11 0x7f7f441ad59f in espeak_ng_Synthesize /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/speech.c:678:10
#12 0x7f7f4410b3f4 in espeak_Synth /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/espeak_api.c:90:32
#13 0x4a8be3 in main /home/samy/brl/speech/espeak-ng-git/src/espeak-ng.c:691:3
#14 0x7f7f43a2e7fc in __libc_start_main csu/../csu/libc-start.c:332:16
#15 0x421449 in _start (/home/samy/ens/projet/1/speech/espeak-ng-git/src/.libs/espeak-ng+0x421449)
Uninitialized value was created by an allocation of 'sbuf' in the stack frame of function 'TranslateClause'
#0 0x7f7f4423a1f0 in TranslateClause /home/samy/brl/speech/espeak-ng-git/src/libespeak-ng/translate.c:1941
While trying to match _NS1, MatchRule is overflowing the buffer.
It happens that this had not usually posed problem because rules usually
have these non-eating special characters last in the rule and thus it wasn't
mattering that post_ptr is pointing outside valid text.