
LookupDict2: Fix searching entries longer than 128

This is a fix for https://github.com/nvaccess/nvda/issues/7740.

With the addition of emoji support, dictionary entries can now be
longer than 128 bytes. This fix makes sure the entry-length byte is
interpreted as an unsigned value, so long entries are not treated
as having a negative offset.
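
A minimal, hedged C sketch (the 144-byte entry below is made up for
illustration; it is not the actual *_list layout) of how a length byte
of 128 or more turns into a negative offset when plain char is signed,
and how masking with 0xff recovers the intended value:

#include <stdio.h>

int main(void)
{
    /* Hypothetical entry whose first byte stores its total length:
     * 0x90 (144), which does not fit in a signed byte. */
    char entry[144] = { (char)0x90 };
    const char *p = entry;

    int old_offset = p[0];        /* plain char is signed on e.g. x86: -112 */
    int new_offset = p[0] & 0xff; /* masked to an unsigned byte: 144 */

    printf("old: %d  new: %d\n", old_offset, new_offset);
    return 0;
}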

Treating the offset as a signed byte (as the previous code did)
could cause the hash chain search to loop indefinitely on certain
input, such as the Tamil characters in the NVDA issue noted above,
which are now covered by a test case in translate.test.
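
To make the failure mode concrete, here is a simplified sketch of the
chain walk; the helper name and entry layout are illustrative, not the
real LookupDict2 (which also matches the word at each entry). The walk
stops only at a zero length byte, so a negative step can revisit the
same bytes indefinitely, whereas the masked step always moves forward.

/* Walk length-prefixed entries until a terminating zero byte.
 * Illustrative only; not the actual espeak-ng dictionary format. */
const char *skip_entries(const char *p)
{
    while (*p != 0) {
        /* Old code: p += p[0]; with a signed char an entry longer than
         * 127 bytes produces a negative step and the terminating zero
         * may never be reached. The mask keeps the step positive. */
        p += p[0] & 0xff;
    }
    return p;
}
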
Reece H. Dunn 7 years ago
commit 566e904b33
2 changed files with 4 additions and 1 deletion:
  1. src/libespeak-ng/dictionary.c (+1, -1)
  2. tests/translate.test (+3, -0)

src/libespeak-ng/dictionary.c

@@ -2607,7 +2607,7 @@ static const char *LookupDict2(Translator *tr, const char *word, const char *wor
// This corresponds to the last matching entry in the *_list file.

while (*p != 0) {
-	next = p + p[0];
+	next = p + (p[0] & 0xff);

if (((p[1] & 0x7f) != wlen) || (memcmp(word, &p[2], wlen & 0x3f) != 0)) {
// bit 6 of wlen indicates whether the word has been compressed; so we need to match on this also.

tests/translate.test

@@ -18,6 +18,9 @@ test_phonemes en " h@l'oU" "hello"
# correct word stress
test_phonemes en " s'VmTIN Imp'o@t@nt" "something important"

+# bug: https://github.com/nvaccess/nvda/issues/7740
+test_phonemes ta " 'il." "ள்"
+
# bug: https://github.com/nvaccess/nvda/issues/7805
test_phonemes hi " r'UcI" "रुचि"
test_phonemes hi " dUk'a:n" "दुकान"
