43 Commits (c41ac642fadd766f4410b5e7da4b6dcfd377c74d)

Author SHA1 Message Date
  Reece H. Dunn c41ac642fa tokenizer.c: Tokenise Zp codepoints as paragraphs. 8 years ago
  Reece H. Dunn f3ea6f68f3 tokenizer.c: Tokenise U+000B [VERTICAL TAB (VT)] as whitespace, not as newlines. 8 years ago
  Reece H. Dunn fc7a4e6701 tokenizer.c: Recognise U+000C [FORM FEED (FF)] as a newline codepoint. 8 years ago
  Reece H. Dunn d2d718d700 tokenizer.c: Tokenize line separator codepoints as newline tokens. 8 years ago
  Reece H. Dunn bf45e7ce36 tokenizer.c: Recognise U+0085 [NEW LINE (NEL)] as a newline codepoint. 8 years ago
  Reece H. Dunn df6ca7a22c tokenizer.c: Support whitespace tokens. 8 years ago
  Reece H. Dunn 8f0dae6a38 tokenizer.c: Support windows newlines. 8 years ago
  Reece H. Dunn b897ff5aa8 encoding.c: Support calling peekc past the end of the buffer. This makes calling peekc easier. 8 years ago
  Reece H. Dunn 3f692f498b encoding.c: Implement a peekc API. 8 years ago
  Reece H. Dunn 1c8ed9c190 tokenizer.c: Support mac newlines. 8 years ago
  Reece H. Dunn 7602c9ac18 tokenizer.c: Support linux newlines. 8 years ago
  Reece H. Dunn bce44316bb Create a basic tokenizer API using a structure that mirrors the TtsTokenizer interface in the tts-dev-studio project. 8 years ago
  Reece H. Dunn 5c6bc0e556 Armenian emphasis mark (U+055B) is used for interjections, so treat it as an exclamation mark. 8 years ago
  Reece H. Dunn 1c4ce3dcd3 tokenizer.c: create and use a clause_type_from_codepoint function, with tests. 8 years ago
  Reece H. Dunn 691457e98d Add Prepended_Concatenation_Mark support from PropList.txt. 8 years ago
  Reece H. Dunn 4ce8b61180 Extend ucd_property to 64-bits to allow all properties to be specified. 8 years ago
  Reece H. Dunn a9aabc6242 Add tests for the PropList API. 8 years ago
  Reece H. Dunn 9dabf64680 encoding.c: Support determining the string length for length < 0. 8 years ago
  Reece H. Dunn b5ed1f28a5 encoding.c: Don't crash if NULL is passed as the string to the decode APIs. 8 years ago
  Reece H. Dunn d167d5649b encoding.c: Implement support for the auto-detected character set (utf-8 + codepoint-encoding). 8 years ago
  Reece H. Dunn 6a0b5e4ae1 encoding.c: Support using wchar_t strings with the text decoder API. 8 years ago
  Reece H. Dunn b74f756f00 encoding.c: Support the ISO-10646-UCS-2 encoding. 8 years ago
  Reece H. Dunn fa5d31a8af encoding.c: Support the UTF-8 encoding. 8 years ago
  Reece H. Dunn 2499610433 encoding.c: Support the ISCII encoding. 8 years ago
  Reece H. Dunn 39f3ea54cf encoding.c: Support the KOI8-R encoding. 8 years ago
  Reece H. Dunn b8a1006dd8 encoding.c: Support the ISO 8859-16 encoding. 8 years ago
  Reece H. Dunn 166e815723 encoding.c: Support the ISO 8859-15 encoding. 8 years ago
  Reece H. Dunn 91e054ec7c encoding.c: Fix the ISO 8859 encoding names with date suffixes. 8 years ago
  Reece H. Dunn 0235c42652 encoding.c: Support the ISO 8859-14 encoding. 8 years ago
  Reece H. Dunn 24faceab57 encoding.c: Support the ISO 8859-13 encoding. 8 years ago
  Reece H. Dunn 495c0aed20 encoding.c: Support the ISO 8859-11 encoding. 8 years ago
  Reece H. Dunn 84f20f8bb8 encoding.c: Support the ISO 8859-10 encoding. 8 years ago
  Reece H. Dunn 0421f127e8 encoding.c: Support the ISO 8859-9 encoding. 8 years ago
  Reece H. Dunn 7da585e25e encoding.c: Support the ISO 8859-8 encoding. 8 years ago
  Reece H. Dunn 56c0b38785 encoding.c: Support the ISO 8859-7 encoding. 8 years ago
  Reece H. Dunn 9e4638ff25 encoding.c: Support the ISO 8859-6 encoding. 8 years ago
  Reece H. Dunn 51295d9d1b encoding.c: Support the ISO 8859-5 encoding. 8 years ago
  Reece H. Dunn b5589fc5ee encoding.c: Support the ISO 8859-4 encoding. 8 years ago
  Reece H. Dunn a93b0f3d64 encoding.c: Support the ISO 8859-3 encoding. 8 years ago
  Reece H. Dunn 0a0e84a322 encoding.c: Support the ISO 8859-2 encoding. 8 years ago
  Reece H. Dunn 26bec1eedf encoding.c: Support the ISO 8859-1 encoding. 8 years ago
  Reece H. Dunn 0590da5da7 encoding.c: Create a string decoding API; support US-ASCII decoding. 8 years ago
  Reece H. Dunn da7eaa7b9c encoding.c: Create a text decoder API based on the usage in readclause.c. 8 years ago
  Reece H. Dunn 887b1c837f encoding.c: Don't crash when passing a NULL string to LookupMnem. 8 years ago
  Reece H. Dunn 26f4eb4f8f encoding.c: Support US-ASCII encoding names. 8 years ago
  Reece H. Dunn b47363b7d3 Create an espeak_ng_EncodingFromName API. 8 years ago
  Reece H. Dunn ac082c9400 Add tests for the remaining is* APIs. 8 years ago
  Reece H. Dunn c9f2940373 isblank: don't include <noBreak> characters, and add tests for this API. 8 years ago
  Reece H. Dunn 5f9dc111cf Add tests for the isdigit and isxdigit ctype APIs. 8 years ago
  Reece H. Dunn bd71fed013 ctype: return true in isupper/islower if there is a simple case mapping present 8 years ago