docs: add details about number flags to the documentation.
It's clearly intended to be enabled by default:
- it's defined as the default behaviour in translate.h (NUM_DEFAULT)
- tr_languages.c sets many default values related to number processing
that have no meaning unless langopts.numbers == 1.
It is also a more sensible default since most languages will want to
have number processing on. This makes adding new languages easier
because adding an entry to tr_languages.c is unnecessary.
A negative side effect is that languages with only partial number
definitions might experience bugs when reading undefined numbers. This
is a bug and should be fixed.
This will have the side effect of enabling number processing for
languages that currently have it disabled. However, there shouldn't be
any.
Here's a way to check affected languages:
for voice in $(ESPEAK_DATA_PATH=`pwd` LD_LIBRARY_PATH=src:${LD_LIBRARY_PATH} \
    src/espeak-ng --voices | grep -v Languages | awk '{print $2}'); do
  OUTPUT=$(ESPEAK_DATA_PATH=`pwd` LD_LIBRARY_PATH=src:${LD_LIBRARY_PATH} \
    src/espeak-ng -qx -v $voice "1 - 2 - 3 - 12 - 123") && echo "$voice: $OUTPUT"
done
These voices clearly benefit from enabling numbers (they already have
number rules in *_list):
ba, cmn (zh), hak, haw, ja, kok, nb, nci
Some languages are missing some definitions (like _12) in their _list
files, which causes the program to skip some numbers.
Numbering needs to be turned off explicitly for:
jbo, mi, my, piqd, py, qu, quc, th, uz
Languages with no number rules at all:
chr, cv, he, nog, tk, ug
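
The default-then-override behaviour described above can be sketched in a few lines of C. This is only an illustration, not the espeak-ng code: `LangOpts`, `select_langopts` and the `incomplete_number_rules` table are hypothetical names, and `NUM_DEFAULT` is assumed to be `1`, based on the "langopts.numbers == 1" remark in this message. The real change sets the default in `NewTranslator()` and overrides it in the `SelectTranslator()` switch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: stand-in for the NUM_DEFAULT flag in translate.h. */
#define NUM_DEFAULT 1

/* Hypothetical, reduced version of the language options structure. */
typedef struct {
    int numbers; /* 0 = number processing off, non-zero = on */
} LangOpts;

/* Languages listed in this commit as needing numbers turned off. */
static const char *const incomplete_number_rules[] = {
    "jbo", "mi", "my", "piqd", "py", "qu", "quc", "th", "uz"
};

/* Start from the default (numbers on), then switch numbers off for
 * languages whose _list definitions are not complete yet. */
LangOpts select_langopts(const char *name)
{
    LangOpts opts;
    size_t i;
    size_t count = sizeof incomplete_number_rules / sizeof incomplete_number_rules[0];

    assert(name != NULL);
    opts.numbers = NUM_DEFAULT;

    for (i = 0; i < count; i++) {
        if (strcmp(name, incomplete_number_rules[i]) == 0) {
            opts.numbers = 0;
            break;
        }
    }
    return opts;
}
```

With this shape, a newly added language gets number processing without any per-language entry, and only the exceptions have to be listed.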
| @@ -12,8 +12,10 @@ language: | |||
| These controls how numbers are pronounced. | |||
| -If `numbers` is set to `0` (the default value), numbers will not be pronounced. | |||
| -Setting it to `1` will enable number pronunciation using the dictionary rules. | |||
| +If `numbers` is set to `0`, numbers will not be pronounced. | |||
| +Setting it to `1` (the default value) will enable number pronunciation using the dictionary rules. | |||
| For more control over number pronunciation, see the flags in `translate.h`. | |||
| tr->langopts.max_digits | |||
| @@ -2,6 +2,7 @@ | |||
| * Copyright (C) 2005 to 2015 by Jonathan Duddington | |||
| * email: [email protected] | |||
| * Copyright (C) 2015-2016, 2020 Reece H. Dunn | |||
| + * Copyright (C) 2021 Juho Hiltunen | |||
| * | |||
| * This program is free software; you can redistribute it and/or modify | |||
| * it under the terms of the GNU General Public License as published by | |||
| @@ -305,6 +306,7 @@ static Translator *NewTranslator(void) | |||
| tr->langopts.min_roman = 2; | |||
| tr->langopts.thousands_sep = ','; | |||
| tr->langopts.decimal_sep = '.'; | |||
| +tr->langopts.numbers = NUM_DEFAULT; | |||
| tr->langopts.break_numbers = BREAK_THOUSANDS; | |||
| tr->langopts.max_digits = 14; | |||
| @@ -477,6 +479,18 @@ Translator *SelectTranslator(const char *name) | |||
| switch (name2) | |||
| { | |||
| +case L('m', 'i'): | |||
| +case L('m', 'y'): | |||
| +case L4('p', 'i', 'q', 'd'): // piqd | |||
| +case L('p', 'y'): | |||
| +case L('q', 'u'): | |||
| +case L3('q', 'u', 'c'): | |||
| +case L('t', 'h'): | |||
| +case L('u', 'z'): | |||
| +{ | |||
| +tr->langopts.numbers = 0; // disable numbers until the definition are complete in _list file | |||
| +} | |||
| +break; | |||
| case L('a', 'f'): | |||
| { | |||
| static const short stress_lengths_af[8] = { 170, 140, 220, 220, 0, 0, 250, 270 }; | |||
| @@ -1065,6 +1079,7 @@ Translator *SelectTranslator(const char *name) | |||
| tr->langopts.param[LOPT_CAPS_IN_WORD] = 1; // capitals indicate stressed syllables | |||
| SetLetterVowel(tr, 'y'); | |||
| tr->langopts.max_lengthmod = 368; | |||
| +tr->langopts.numbers = 0; // disable numbers until the definition are complete in _list file | |||
| } | |||
| break; | |||
| case L('k', 'a'): // Georgian | |||