Issue #757: add initial support for Thai language

master
Valdis Vitolins 5 years ago
parent
commit
24eb5b3cde
8 changed files with 323 additions and 2 deletions
  1. CHANGELOG.md (+1 -0)
  2. Makefile.am (+4 -0)
  3. dictsource/shn_list (+1 -1)
  4. dictsource/th_list (+12 -0)
  5. dictsource/th_rules (+298 -0)
  6. docs/languages.md (+2 -1)
  7. espeak-ng-data/lang/tai/th (+3 -0)
  8. phsource/phonemes (+2 -0)

CHANGELOG.md (+1 -0)

* ltg (Latgalian) -- Valdis Vitolins
* nog (Nogai) -- boracasli98, Valdis Vitolins
* qu (Quechua) -- Valdis Vitolins
* th (Thai) -- Valdis Vitolins
* tk (Turkmen) -- boracasli98, Valdis Vitolins
* ug (Uyghur) -- boracasli98, Valdis Vitolins
* uk (Ukrainian) -- Valdis Vitolins

Makefile.am (+4 -0)

espeak-ng-data/sw_dict \
espeak-ng-data/ta_dict \
espeak-ng-data/te_dict \
espeak-ng-data/th_dict \
espeak-ng-data/tk_dict \
espeak-ng-data/tn_dict \
espeak-ng-data/tr_dict \
te: espeak-ng-data/te_dict
espeak-ng-data/te_dict: dictsource/te_list dictsource/te_rules dictsource/te_extra dictsource/te_emoji


th: espeak-ng-data/th_dict
espeak-ng-data/th_dict: dictsource/th_list dictsource/th_rules dictsource/th_extra

tk: espeak-ng-data/tk_dict
espeak-ng-data/tk_dict: dictsource/tk_list dictsource/tk_listx dictsource/tk_rules dictsource/tk_extra



dictsource/shn_list (+1 -1)

_7XXXX tS;et4||mWn2
_8XXXX pEt2||mWn2
_9XXXX kaw3||mWn2
- _1XXXXX nEN3||sEn // ၼိုင်ႈသႅၼ်
+ _1XXXXX nEN3||sEn // ၼိုင်ႈသႅၼ်

dictsource/th_list (+12 -0)

// numbers
_0 s3un
_1 n5ueng
_2 s5ong
_3 s3am
_4 s5i
_5 h3a
_6 h3ok
_7 ch3et
_8 p3aet
_9 k5ao


dictsource/th_rules (+298 -0)

// Thai pronunciation rules

.replace
// Alternative full stop
๏ .

// Numbers
๐ 0
๑ 1
๒ 2
๓ 3
๔ 4
๕ 5
๖ 6
๗ 7
๘ 8
๙ 9
๑๐ 10


// Letter groups
// Look for phoneme tones at phonemes/ph_shan file

// Consonants
.L10 ก ข ฃ ค ฅ ฆ ง จ ฉ ช ซ ฌ ญ ฎ ฏ ฐ ฑ ฒ ณ ด ต ถ ท ธ น บ ป ผ ฝ พ ฟ ภ ม ย ร ฤ ล ว ศ ษ ส ห ฬ อ ฮ ะ า ำ

// Vowels (note that some of them are zero width characters with negative offset)
.L20 ิ ี ึ ื ุ เ แ โ ใ ไ ๅ

.group ก
ก k3 // default pronunciation
ก (L10 k3a // add vowel, if next is consonant
ก (L20 k3 // but not add, if next is vowel

.group ข
ข kh55
ข (L10 kh55a

.group ฃ
ฃ kh55
ฃ (L10 kh55a

.group ค
ค kh2
ค (L10 kh2a

.group ฅ
ฅ kh2
ฅ (L10 kh2a

.group ฆ
ฆ kh2
ฆ (L10 kh2a

.group ง
ง ng2
ง (L10 ng2a

.group จ
จ ch3
จ (L10 ch3a

.group ฉ
ฉ ch55
ฉ (L10 ch55a

.group ช
ช ch2
ช (L10 ch2a

.group ซ
ซ s2
ซ (L10 s2a

.group ฌ
ฌ ch2
ฌ (L10 ch2a

.group ญ
ญ j2
ญ (L10 j2a

.group ฎ
ฎ d3
ฎ (L10 d3a

.group ฏ
ฏ t3
ฏ (L10 t3a

.group ฐ
ฐ th55
ฐ (L10 th55a

.group ฑ
ฑ th
ฑ (L10 tha

.group ฒ
ฒ th2
ฒ (L10 th2a

.group ณ
ณ n2
ณ (L10 n2a

.group ด
ด d3
ด (L10 d3a

.group ต
ต t3
ต (L10 t3a

.group ถ
ถ th55
ถ (L10 th55a

.group ท
ท th2
ท (L10 th2a

.group ธ
ธ th2
ธ (L10 th2a

.group น
น n2
น (L10 n2a

.group บ
บ b3
บ (L10 b3a

.group ป
ป p3
ป (L10 p3a

.group ผ
ผ ph55
ผ (L10 ph55a

.group ฝ
ฝ f55
ฝ (L10 f55a

.group พ
พ ph2
พ (L10 ph2a

.group ฟ
ฟ f2
ฟ (L10 f2a

.group ภ
ภ ph2
ภ (L10 ph2a

.group ม
ม m2
ม (L10 m2a

.group ย
ย j2
ย (L10 j2a

.group ร
ร r2
ร (L10 r2a

.group ฤ
ฤ r
ฤ (L10 ra

.group ล
ล l2
ล (L10 l2a

.group ว
ว w2
ว (L10 w2a

.group ศ
ศ s55
ศ (L10 s55a

.group ษ
ษ s55
ษ (L10 s55a

.group ส
ส s55
ส (L10 s55a

.group ห
ห h55
ห (L10 h55a

.group ฬ
ฬ l2
ฬ (L10 l2a

.group อ
อ ?
อ (L10 ?a

.group ฮ
ฮ h2
ฮ (L10 h2a

.group ะ
ะ a

.group ะ
ะ s
ะ s

.group า
า s
า s

.group ำ
ำ s
ำ s

.group ิ
ิ i

.group ึ
ึ ue

.group ุ
ุ u

.group เ
เ e

.group แ
แ ae

.group โ
โ o

.group ใ
ใ ai

.group ไ
ไ ai

.group ๅ
ๅ u

.group ๆ
ๆ m

.group ี
ี s

.group ั
ั m
ั m

.group ู
ู u:

.group ื
ื ue:

.group // all other characters


// Switch to English for Latin characters
a _^_en
b _^_en
c _^_en
d _^_en
e _^_en
f _^_en
g _^_en
h _^_en
i _^_en
j _^_en
k _^_en
l _^_en
m _^_en
n _^_en
o _^_en
p _^_en
q _^_en
r _^_en
s _^_en
t _^_en
u _^_en
v _^_en
w _^_en
x _^_en
y _^_en
z _^_en
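
The `.replace` section and the `ก` group above can be read as two small steps: Thai digits are first rewritten to ASCII digits, then a consonant picks its phoneme string from the character that follows it. The Python sketch below only illustrates that reading. It is not how espeak-ng compiles or evaluates rules, and the names `normalize_digits` and `ko_kai_phonemes` are hypothetical helpers invented for this example.

```python
# Illustrative sketch only: it mirrors a few lines of dictsource/th_rules
# and does not reproduce espeak-ng's actual rule engine.

# .replace section: each Thai digit maps to its ASCII counterpart.
THAI_DIGITS = str.maketrans("๐๑๒๓๔๕๖๗๘๙", "0123456789")

# .L10 as listed in th_rules (consonants plus ะ า ำ).
L10 = set("กขฃคฅฆงจฉชซฌญฎฏฐฑฒณดตถทธนบปผฝพฟภมยรฤลวศษสหฬอฮะาำ")

# .L20 as listed in th_rules; escapes are used because several of these
# are combining marks that are hard to read inside a string literal.
L20 = set("\u0e34\u0e35\u0e36\u0e37\u0e38\u0e40\u0e41\u0e42\u0e43\u0e44\u0e45")


def normalize_digits(text: str) -> str:
    """Rewrite Thai digits as ASCII digits, like the .replace section."""
    return text.translate(THAI_DIGITS)


def ko_kai_phonemes(next_char: str) -> str:
    """Pick the phoneme string for ก from the character that follows it."""
    if next_char in L20:
        return "k3"   # next is a vowel sign: no inherent vowel
    if next_char in L10:
        return "k3a"  # next is in L10: add the inherent vowel
    return "k3"       # default pronunciation
```

With these definitions, `normalize_digits("๑๒๓")` gives `"123"`, and `ko_kai_phonemes("น")` gives `"k3a"` because `น` is in `L10`.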

docs/languages.md (+2 -1)

[private-use extensions](https://raw.githubusercontent.com/espeak-ng/bcp47-data/master/bcp47-extensions)
have been used.


- The 119 supported languages and accents are:
+ The 120 supported languages and accents are:


| Family Code | Identifier | Language Family | Language | Accent/Dialect |
|-------------|-------------------|-----------------------|-----------------------------|------------------------|
| `bnt` | `sw` | Bantu | Swahili | |
| `gmq` | `sv` | North Germanic | Swedish | |
| `dra` | `ta` | Dravidian | Tamil | |
| `tai` | `th` | Tai | Thai | |
| `trk` | `tk` | Turkic | Turkmen<sup>\[1\]</sup> | |
| `trk` | `tt` | Turkic | Tatar | |
| `dra` | `te` | Dravidian | Telugu | |

espeak-ng-data/lang/tai/th (+3 -0)

name Thai
language th
status testing

phsource/phonemes (+2 -0)

phonemetable sl pl
include ph_slovenian


phonemetable th shn

phonemetable cs sk
include ph_czech


