Reece H. Dunn
088de546e9
Use the generic BSD-2-Clause license text in COPYING.BSD2.
6 years ago
Reece H. Dunn
1c60fb7f62
Don't use STRESSPOSN_1L for thousands_sep in the Slovak/Czech language setup.
6 years ago
Reece H. Dunn
c6ac526847
When printing phonemes, don't add a space at the start of a sentence or clause.
6 years ago
Reece H. Dunn
65186c07df
Preserve the sourceix property of a deleted phonSWITCH phoneme.
6 years ago
Reece H. Dunn
cf6d14783c
Preserve the sourceix property of a deleted phoneme for replaced phonemes.
6 years ago
Reece H. Dunn
8e13f7147c
Add constants for use with PHONEME_LIST.newword.
6 years ago
Valdis Vitolins
d0e806b600
ur: improvements for Urdu by Ejaz Shah
6 years ago
Reece H. Dunn
910f4c2a72
Add ISO 15924 script codes to the remaining language pronunciation tests.
6 years ago
Reece H. Dunn
475bfdcb66
Use the ISO 15924 4-letter script names consistently in the tests.
6 years ago
Valdis Vitolins
4a7118dba1
Fix issue #530: Broken replacement from Cyrillic to Latin for Lingua Franca Nova
6 years ago
Valdis Vitolins
5439b89db8
Issue #521: add spelling tests for more languages
Sample sentences for languages are taken from:
- Afrikaans https://www.omniglot.com/writing/afrikaans.htm
- Albanian https://www.omniglot.com/writing/albanian.htm
- Amharic https://www.bbc.com/amharic
- Ancient Greek: http://titus.uni-frankfurt.de/unicode/samples/grbeisp.htm
- Aragonese https://www.omniglot.com/writing/aragonese.php
- Armenian: https://elinux.org/UTF8_Sampler
- Assamese https://www.omniglot.com/writing/assamese.htm
- Azerbaijani https://www.omniglot.com/writing/azeri.htm
- Basque https://www.omniglot.com/writing/basque.htm
- Bengali https://www.bbc.com/bengali/news
- Dutch https://www.omniglot.com/writing/dutch.htm
- Greenlandic: https://www.omniglot.com/writing/greenlandic.htm
- Guarani: https://www.omniglot.com/writing/guarani.htm
- Gujarati: http://mylanguages.org/gujarati_reading.php
- Haitian Creole: https://www.omniglot.com/writing/haitiancreole.htm
- Interlingua: https://www.omniglot.com/writing/interlingua.htm
- Kannada: https://www.omniglot.com/language/phrases/kannada.php
- Kyrgyz: https://ru.wikipedia.org/wiki/%D0%9A%D0%B8%D1%80%D0%B3%D0%B8%D0%B7%D1%81%D0%BA%D0%B0%D1%8F_%D0%BF%D0%B8%D1%81%D1%8C%D0%BC%D0%B5%D0%BD%D0%BD%D0%BE%D1%81%D1%82%D1%8C
- Konkani (Devanagari) https://r12a.github.io/scripts/devanagari/
- Kurdish https://www.omniglot.com/writing/kurdish.htm
- Lingua Franca Nova https://www.omniglot.com/writing/lfn.htm
- Lojban: https://www.omniglot.com/writing/lojban.htm
- Malay https://www.omniglot.com/writing/malay.htm
- Maltese https://www.omniglot.com/writing/maltese.htm
- Marathi https://www.bbc.com/marathi
- Māori https://www.omniglot.com/writing/maori.htm
- Nahuatl https://www.gutenberg.org/files/12219/12219-h/12219-h.htm
- Oriya https://www.omniglot.com/writing/oriya.htm
- Oromo https://www.omniglot.com/writing/oromo.htm
- Papiamento https://www.omniglot.com/writing/papiamento.php
- Punjabi https://pa.wikipedia.org/wiki/%E0%A8%AD%E0%A8%BE%E0%A8%B0%E0%A8%A4_%E0%A8%A6%E0%A8%BE_%E0%A8%B0%E0%A8%BE%E0%A8%B8%E0%A8%BC%E0%A8%9F%E0%A8%B0%E0%A8%AA%E0%A8%A4%E0%A9%80
- Setswana https://www.omniglot.com/writing/tswana.php
- Sindhi https://en.wikipedia.org/wiki/Sindhi_language
- Sinhala https://www.bbc.com/sinhala
- Tamil http://kermitproject.org/utf8.html
- Tatar https://www.omniglot.com/writing/tatar.htm
- Telugu http://kermitproject.org/utf8.html
- Vietnamese https://www.omniglot.com/writing/vietnamese.htm
6 years ago
Valdis Vitolins
d3f2a753f3
Fix issue #527 — spelling differs for Russian with or without extended dictionary
6 years ago
Reece H. Dunn
86bbc257b0
Support matching any length strings in the replacement rules.
6 years ago
Reece H. Dunn
98e9122dfc
FindReplacementChars: Pass in the source buffer (next characters) instead of next_in.
6 years ago
Reece H. Dunn
4fbcda9c2a
FindReplacementChars: Use an nc (next character) variable.
6 years ago
Reece H. Dunn
cacc212d4b
FindReplacementChars: Rename uc to fc.
6 years ago
Reece H. Dunn
a9d4bdd7f7
Make ignore_next into ignore_next_n to support ignoring multiple next characters.
6 years ago
Reece H. Dunn
3518fbf3ff
mk: Support additional romanizations (ISO 9, BGN/PCGN, Cadastre, and MJMS/SSO).
6 years ago
Reece H. Dunn
0b64b04baa
mk: Remove the Latin script groups -- these are handled by replacement characters.
6 years ago
Reece H. Dunn
27454f56f4
mk: Don't map đ and ć to Serbian ђ and ћ (use Macedonian ѓ and ќ instead).
6 years ago
Reece H. Dunn
db3ae0eaec
mk: Reformat the Latin to Cyrillic romanization support table.
6 years ago
Reece H. Dunn
252f5772ae
Simplify printing the replace message.
6 years ago
Reece H. Dunn
6a7e31e24e
Merge remote-tracking branch 'Christianlm/master'
6 years ago
Reece H. Dunn
55c64036e0
Use UTF-8 strings in replace rules, instead of a packed UTF-16 pair.
6 years ago
Reece H. Dunn
0e91fcbc04
Don't use pw when reading the replacement data.
6 years ago
Reece H. Dunn
424f705525
Revert the new (broken) replacement rule logic.
The replacement tests for bs, hr, and sr are no longer marked as
broken as they work using the old code. The mk tests keep the
broken annotation, as they don't work in the old code either.
This reverts commit 801a8d197c.
This reverts commit 64d5701e5e.
This reverts commit 3b51ebf617.
This reverts commit 1fd235d2c0.
This reverts commit 9f0667de86.
6 years ago
chrislm
5303b6b570
IT: added some rules for pronominal verbs and for suffix *filia*
6 years ago
Reece H. Dunn
bae92dab38
ja: Add tests for replacing Katakana (Kana) with Hiragana (Hira).
6 years ago
Reece H. Dunn
9660df7743
mk: Add tests for replacing Latin with Cyrillic.
6 years ago
Reece H. Dunn
bfb624824e
Move the additional English replacement rule test to language-replace.test.
6 years ago
Reece H. Dunn
672c07b3a9
Reorganize the language pronunciation tests.
6 years ago
Reece H. Dunn
93e23a47c8
issue #521: add spelling tests for all languages
Tests include pangrams from http://clagnut.com/blog/2380/ .
Based on a patch by Valdis Vitolins.
6 years ago
Reece H. Dunn
28ed50c423
Add support for skipping broken tests, and make the bs, hr, and sr tests use the correct output.
6 years ago
Reece H. Dunn
d03d09a207
Merge remote-tracking branch 'jaacoppi/french'
6 years ago
Reece H. Dunn
cd54fae72d
Don't run the bs, hr, and sr replacement tests as these are unreliable.
6 years ago
Reece H. Dunn
d83f5e654d
sl: remove the Cyrillic to Latin replacements.
These were added in espeak 1.47.11a, but Slovenian uses Latin
like Polish and Czech.
The earliest form of Slovene (the Freising Manuscripts, dated between
972 and 1039) is written in Latin.
Some Cyrillic and Cyrillic-like letters were used in addition to
Latin in the Dajnko and Metelko alphabets between 1824 and 1838,
but these alphabets did not catch on.
The modern form is based on the Serbo-Croatian Gaj's Latin alphabet.
This is the alphabet represented in the espeak `replace_cyrillic_latin`
table. Slovenian does not use the Cyrillic variant of this alphabet.
6 years ago
Reece H. Dunn
9de0a55405
bs: add replacement rule tests for Cyrillic
6 years ago
Reece H. Dunn
c1c184816a
hr: add replacement rule tests for Cyrillic
6 years ago
Reece H. Dunn
2e5179842f
sr: add replacement rule tests for Cyrillic
6 years ago
Reece H. Dunn
b07448cf30
en: add replacement rule tests
6 years ago
Reece H. Dunn
bba7069cb3
issue #520: Use .replace rules in the language rule files for Cyrillic to Latin
6 years ago
Reece H. Dunn
530686f2fd
language-pronunciation.test: Support an optional message/description.
6 years ago
Juho Hiltunen
fe60e0e9b3
fr: change pronunciation of some words ending in -us.
6 years ago
Reece H. Dunn
32ab396ea1
Don't declare the Arabic letter strings using array syntax.
6 years ago
Reece H. Dunn
bfe184c641
Make PrepareLetters actually set the letter bits, and rename to SetLetterBitsUTF8.
6 years ago
Reece H. Dunn
7f42bd39b4
Move PrepareLetters next to SetLetterBitsRange.
6 years ago
Reece H. Dunn
22ee347234
Use script name prefixes in the Set[Script]Letters functions for group bit lists.
6 years ago
Reece H. Dunn
36c6727e90
issue #518: Add the 'е є ї' letters to the Y group in Translate_Russian, not SetCyrillicLetters.
6 years ago
Reece H. Dunn
17dac5ea53
Add a comment to SetCyrillicLetters describing which languages it applies to.
6 years ago
Reece H. Dunn
e7e59f99d3
Fix the warnings in PrepareLetters.
6 years ago