chrislm
|
5d8bb74169
|
IT: new improvements tested in April 2017
Reduced length to 160 for unstressed syllables
Added some exceptions to the Italian dictionaries
|
8 years ago |
Reece H. Dunn
|
d50f3f2fa5
|
tokenizer.c: Support word tokens.
|
8 years ago |
Reece H. Dunn
|
d093513b65
|
tokenizer.c: Add an options parameter to the tokenizer_reset API.
|
8 years ago |
Reece H. Dunn
|
c41ac642fa
|
tokenizer.c: Tokenise Zp codepoints as paragraphs.
|
8 years ago |
Reece H. Dunn
|
fc7a4e6701
|
tokenizer.c: Recognise U+000C [FORM FEED (FF)] as a newline codepoint.
|
8 years ago |
Reece H. Dunn
|
d2d718d700
|
tokenizer.c: Tokenise line separator codepoints as newline tokens.
|
8 years ago |
Reece H. Dunn
|
bf45e7ce36
|
tokenizer.c: Recognise U+0085 [NEW LINE (NEL)] as a newline codepoint.
|
8 years ago |
Reece H. Dunn
|
df6ca7a22c
|
tokenizer.c: Support whitespace tokens.
|
8 years ago |
Reece H. Dunn
|
539edac795
|
tokenizer.c: Create a codepoint_type helper function to classify codepoints for the tokenizer.
|
8 years ago |
Reece H. Dunn
|
8f0dae6a38
|
tokenizer.c: Support Windows newlines.
|
8 years ago |
Reece H. Dunn
|
b897ff5aa8
|
encoding.c: Support calling peekc past the end of the buffer. This makes calling peekc easier.
|
8 years ago |
Reece H. Dunn
|
3f692f498b
|
encoding.c: Implement a peekc API.
|
8 years ago |
Reece H. Dunn
|
1c8ed9c190
|
tokenizer.c: Support Mac newlines.
|
8 years ago |
Reece H. Dunn
|
7602c9ac18
|
tokenizer.c: Support Linux newlines.
|
8 years ago |
Reece H. Dunn
|
bce44316bb
|
Create a basic tokenizer API using a structure that mirrors the TtsTokenizer interface in the tts-dev-studio project.
|
8 years ago |
Reece H. Dunn
|
3cc53d98f4
|
Add ucd.h to tokenizer.c to provide the definition of the ucd_category identifier for the emscripten build.
|
8 years ago |
Reece H. Dunn
|
61d668c0cb
|
ucd-tools: Inverted_Terminal_Punctuation eSpeakNG extended property support; use in clause_type_from_codepoint.
|
8 years ago |
Reece H. Dunn
|
5c6bc0e556
|
Armenian emphasis mark (U+055B) is used for interjections, so treat it as an exclamation mark.
|
8 years ago |
Reece H. Dunn
|
bc13173ac4
|
ucd-tools: Punctuation_In_Word eSpeakNG extended property support; use in clause_type_from_codepoint.
|
8 years ago |
Reece H. Dunn
|
1131d0924b
|
ucd-tools: Optional_Space_After eSpeakNG extended property support; use in clause_type_from_codepoint.
|
8 years ago |
Reece H. Dunn
|
b932f3c493
|
ucd-tools: Extended_Dash eSpeakNG extended property support; use in clause_type_from_codepoint.
|
8 years ago |
Reece H. Dunn
|
3100ca9d1b
|
Use ucd_properties to implement clause_type_from_codepoint for supported types.
|
8 years ago |
Reece H. Dunn
|
be86091088
|
Use #defines for the ESPEAKNG_PROPERTY_ constants, so they can be used in things like switch expressions.
|
8 years ago |
Reece H. Dunn
|
d18d98b92c
|
Use #defines for the UCD_PROPERTY_ constants, so they can be used in things like switch expressions.
|
8 years ago |
Reece H. Dunn
|
1c4ce3dcd3
|
tokenizer.c: Create and use a clause_type_from_codepoint function, with tests.
|
8 years ago |
Reece H. Dunn
|
92f703d98b
|
Use defines instead of hard-coded numbers for more clause logic.
|
8 years ago |
Reece H. Dunn
|
8749891069
|
Better specify the CLAUSE_ flags returned by ReadClause.
|
8 years ago |
Reece H. Dunn
|
2e375362c4
|
ucd-tools: Paragraph_Separator eSpeakNG extended property support.
|
8 years ago |
Reece H. Dunn
|
7fc4f5ece2
|
ucd-tools: Ellipsis eSpeakNG extended property support.
|
8 years ago |
Reece H. Dunn
|
8d8c8b3b56
|
ucd-tools: Semi_Colon eSpeakNG extended property support.
|
8 years ago |
Reece H. Dunn
|
9869ee051e
|
ucd-tools: Colon eSpeakNG extended property support.
|
8 years ago |
Reece H. Dunn
|
9ef03b8ac8
|
ucd-tools: Comma eSpeakNG extended property support.
|
8 years ago |
Reece H. Dunn
|
5017153d62
|
ucd-tools: Exclamation_Mark eSpeakNG extended property support.
|
8 years ago |
Reece H. Dunn
|
31d66fddb1
|
ucd-tools: Question_Mark eSpeakNG extended property support.
|
8 years ago |
Reece H. Dunn
|
07cd2b12e1
|
ucd-tools: Full_Stop eSpeakNG extended property support.
|
8 years ago |
Reece H. Dunn
|
9bbb16a74c
|
Add ucd-tools proplist.c to the build.
|
8 years ago |
Reece H. Dunn
|
ff018e33df
|
Support compiling ucd.h with a strict C11 compiler.
|
8 years ago |
Reece H. Dunn
|
691457e98d
|
Add Prepended_Concatenation_Mark support from PropList.txt.
|
8 years ago |
Reece H. Dunn
|
429b8f3629
|
Add Pattern_Syntax support from PropList.txt.
|
8 years ago |
Reece H. Dunn
|
d91a249c14
|
Add Pattern_White_Space support from PropList.txt.
|
8 years ago |
Reece H. Dunn
|
eecd7984c5
|
Add Variation_Selector support from PropList.txt.
|
8 years ago |
Reece H. Dunn
|
f7f228bc0c
|
Add Sentence_Terminal support from PropList.txt.
|
8 years ago |
Reece H. Dunn
|
628dfb8887
|
Add Other_ID_Continue support from PropList.txt.
|
8 years ago |
Reece H. Dunn
|
90a3615c48
|
Add Other_ID_Start support from PropList.txt.
|
8 years ago |
Reece H. Dunn
|
a3f811aac1
|
Add Logical_Order_Exception support from PropList.txt.
|
8 years ago |
Reece H. Dunn
|
66b2404ce3
|
Add Soft_Dotted support from PropList.txt.
|
8 years ago |
Reece H. Dunn
|
423aa813aa
|
Add Deprecated support from PropList.txt.
|
8 years ago |
Reece H. Dunn
|
e18564899c
|
Add Other_Default_Ignorable_Code_Point support from PropList.txt.
|
8 years ago |
Reece H. Dunn
|
2459a4fa8f
|
Add Unified_Ideograph support from PropList.txt.
|
8 years ago |
Reece H. Dunn
|
4ce8b61180
|
Extend ucd_property to 64 bits to allow all properties to be specified.
|
8 years ago |