680 Commits (39b8768a791bb168e7027b10402560eb9bcda3c5)

Author SHA1 Message Date
  Reece H. Dunn 85801fc1e3 Remove the now unused dictdialect functionality from the code. 8 years ago
  Reece H. Dunn 2f8f125c68 Remove voice/language support for alphabet2. 8 years ago
  Reece H. Dunn dd90d3812d tokenizer.c: Support general symbol tokens. 8 years ago
  Reece H. Dunn 786575c6ed tokenizer.c: Support general punctuation tokens. 8 years ago
  Reece H. Dunn 0705844bf8 tokenizer.c: Move general category classification that does not override property behaviour to the end, for generic classification. 8 years ago
  Reece H. Dunn 683579f403 Make the tokenizer.h API public. 8 years ago
  Reece H. Dunn 9af96da469 Make the encoding.h API public. 8 years ago
  Reece H. Dunn 55bfbb4754 tokenizer.c: Support ellipsis tokens. 8 years ago
  Reece H. Dunn b847df63b5 tokenizer.c: Support semicolon tokens. 8 years ago
  Reece H. Dunn af7e8fc5a3 tokenizer.c: Support colon tokens. 8 years ago
  Reece H. Dunn 7560070dcd tokenizer.c: Support comma tokens. 8 years ago
  Reece H. Dunn c9199cfacb tokenizer.c: Support exclamation mark tokens. 8 years ago
  Reece H. Dunn 128ceaff6a tokenizer.c: Support question mark tokens. 8 years ago
  Reece H. Dunn 8f62e18324 tokenizer.c: Support full stop tokens. 8 years ago
  chrislm 5d8bb74169 IT: new improvements tested on april 2017 8 years ago
  Reece H. Dunn d50f3f2fa5 tokenizer.c: Support word tokens. 8 years ago
  Reece H. Dunn d093513b65 tokenizer.c: Add an options parameter to the tokenizer_reset API. 8 years ago
  Reece H. Dunn c41ac642fa tokenizer.c: Tokenise Zp codepoints as paragraphs. 8 years ago
  Reece H. Dunn fc7a4e6701 tokenizer.c: Recognise U+000C [FORM FEED (FF)] as a newline codepoint. 8 years ago
  Reece H. Dunn d2d718d700 tokenizer.c: Tokenize line separator codepoints as newline tokens. 8 years ago
  Reece H. Dunn bf45e7ce36 tokenizer.c: Recognise U+0085 [NEW LINE (NEL)] as a newline codepoint. 8 years ago
  Reece H. Dunn df6ca7a22c tokenizer.c: Support whitespace tokens. 8 years ago
  Reece H. Dunn 539edac795 tokenizer.c: Create a codepoint_type helper function to classify codepoints for the tokenizer. 8 years ago
  Reece H. Dunn 8f0dae6a38 tokenizer.c: Support windows newlines. 8 years ago
  Reece H. Dunn b897ff5aa8 encoding.c: Support calling peekc past the end of the buffer. This makes calling peekc easier. 8 years ago
  Reece H. Dunn 3f692f498b encoding.c: Implement a peekc API. 8 years ago
  Reece H. Dunn 1c8ed9c190 tokenizer.c: Support mac newlines. 8 years ago
  Reece H. Dunn 7602c9ac18 tokenizer.c: Support linux newlines. 8 years ago
  Reece H. Dunn bce44316bb Create a basic tokenizer API using a structure that mirrors the TtsTokenizer interface in the tts-dev-studio project. 8 years ago
  Reece H. Dunn 3cc53d98f4 Add ucd.h to tokenizer.c to provide the definition of the ucd_category identifier for the emscripten build. 8 years ago
  Reece H. Dunn 61d668c0cb ucd-tools: Inverted_Terminal_Punctuation eSpeakNG extended property support; use in clause_type_from_codepoint. 8 years ago
  Reece H. Dunn 5c6bc0e556 Armenian emphasis mark (U+055B) is used for interjections, so treat it as an exclamation mark. 8 years ago
  Reece H. Dunn bc13173ac4 ucd-tools: Punctuation_In_Word eSpeakNG extended property support; use in clause_type_from_codepoint. 8 years ago
  Reece H. Dunn 1131d0924b ucd-tools: Optional_Space_After eSpeakNG extended property support; use in clause_type_from_codepoint. 8 years ago
  Reece H. Dunn b932f3c493 ucd-tools: Extended_Dash eSpeakNG extended property support; use in clause_type_from_codepoint. 8 years ago
  Reece H. Dunn 3100ca9d1b Use ucd_properties to implement clause_type_from_codepoint for supported types. 8 years ago
  Reece H. Dunn 1c4ce3dcd3 tokenizer.c: create and use a clause_type_from_codepoint function, with tests. 8 years ago
  Reece H. Dunn 92f703d98b Use defines instead of hard-coded numbers for more clause logic. 8 years ago
  Reece H. Dunn 8749891069 Better specify the CLAUSE_ flags returned by ReadClause. 8 years ago
  Reece H. Dunn e4e1e4db0a TranslateWord: remove the unused add_plural_suffix variable. 8 years ago
  Reece H. Dunn 62d4aff9a9 Remove the now unused option_multibyte variable. 8 years ago
  Reece H. Dunn ec8a7b810f Use the text decoder object at the top-level Synthesize/espeak_TextToPhonemes call, not in TranslateClause. 8 years ago
  Reece H. Dunn b3e0fbc8ed encoding.c: Create a text_decoder_decode_string_multibyte helper to work with the espeakCHARS_* flags. 8 years ago
  Reece H. Dunn 9dabf64680 encoding.c: Support determining the string length for length < 0. 8 years ago
  Reece H. Dunn b5ed1f28a5 encoding.c: Don't crash if NULL is passed as the string to the decode APIs. 8 years ago
  Reece H. Dunn d167d5649b encoding.c: Implement support for the auto-detected character set (utf-8 + codepoint-encoding). 8 years ago
  Reece H. Dunn be480c12de Make TranslateClause return 'const void *' to preserve constness. 8 years ago
  Reece H. Dunn 6451917bde encoding.c: Fix text_decoder_get_buffer at EOF. 8 years ago
  Reece H. Dunn 7c16ac543c Use the text decoder API in readclause.c. 8 years ago
  Reece H. Dunn 8933185de4 Remove the unused f_in argument to the Read/Translate/SpeakNextClause functions. 8 years ago