Update zh_listx and zhy_listx to capture the corrections made by the Ekho project. Also add readings for characters in the Enclosed Ideographic Supplement and CJK Compatibility Ideographs blocks of Unicode, which appear in some texts.
With bash, echo "a\nb" will not interpret \n, while with dash, echo will
interpret \n. bash's echo would need -e, but dash does not know that
option and just prints it.
We can, however, just put a literal newline in the script; both bash and
dash will understand it.
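A minimal sketch of the portable forms, for illustration only. The function
names are hypothetical, and the printf variant is an alternative that this
change does not itself use:

```shell
#!/bin/sh
# Two ways to emit "a" and "b" on separate lines without relying on how
# echo treats "\n". Both behave the same under bash and dash.

# 1. A literal newline inside the quoted string.
two_lines_literal() {
    echo "a
b"
}

# 2. printf, whose format-string escapes are specified by POSIX.
two_lines_printf() {
    printf 'a\nb\n'
}
```

The literal newline keeps the script free of any echo option, which is why
it works unchanged in both shells.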
The six modified files all had spurious characters, apparently introduced
because the files originally carried the UTF-8 BOM (byte order mark),
U+FEFF, which is conventionally placed at the start of a text file to
indicate UTF-8 and is invisible under normal circumstances (e.g. when the
file is opened as a text file).
None of these files are recognized by espeak-ng on Linux systems because
the 'language variant' line is seen by espeak-ng as starting with an
unexpected character.
'gustave' is an uncorrupted file: it correctly starts with the BOM encoded
in UTF-8 (three bytes). However, even though the file is correct, espeak-ng
does not read it (this may be a separate bug!).
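One way to check for the three-byte UTF-8 BOM (EF BB BF) from a shell is
sketched below. The function name has_utf8_bom is hypothetical, and the
sketch assumes a head that accepts -c:

```shell
#!/bin/sh
# has_utf8_bom FILE: succeed if FILE begins with the bytes EF BB BF,
# which is U+FEFF encoded in UTF-8.
has_utf8_bom() {
    # od prints the first three bytes as hex; tr removes od's padding.
    [ "$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')" = "efbbbf" ]
}
```

For example, "has_utf8_bom FILE && echo BOM" reports a file that still
starts with the BOM.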
'marcelo' somehow got the BOM character replaced by a literal '?'. Notice
that 'git diff' on these changes will, indeed, show the removed character
in 'gustave' as a literal '?'. Notice also that the character in question,
the BOM, is actually the Unicode 'zero width no-break space', so it is
normally invisible.
The remaining files seem to have suffered major corruption, possibly as a
result of DOS-to-Unix style conversions. The line endings, normally <lf>
on Unix or <cr><lf> on Windows, had been converted to <cr><lf><lf> and
the BOM had been replaced by a <tab> character.
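The <cr><lf><lf> damage could be undone with a filter along the following
lines. This is only a sketch under stated assumptions, not necessarily how
the files were actually repaired. The name fix_line_endings is hypothetical,
and the leading <tab> is deliberately left alone:

```shell
#!/bin/sh
# fix_line_endings: read text on stdin and write it with plain <lf> endings.
# A trailing <cr> is stripped from every line, and the single empty line
# that follows a <cr>-terminated line (the extra <lf>) is dropped.
fix_line_endings() {
    awk '
    {
        had_cr = sub(/\r$/, "")   # 1 if this line ended in <cr>
        if (skip && $0 == "" && !had_cr) { skip = 0; next }
        skip = had_cr             # the next bare empty line is the extra <lf>
        print
    }'
}
```

Genuinely empty lines survive, because in the damaged files they still end
in <cr>. Plain <lf> input passes through unchanged.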
Signed-off-by: John Bowler <[email protected]>