|  Reece H. Dunn | b847df63b5 | tokenizer.c: Support semicolon tokens. | 8 years ago | 
				
					
						|  Reece H. Dunn | af7e8fc5a3 | tokenizer.c: Support colon tokens. | 8 years ago | 
				
					
						|  Reece H. Dunn | 7560070dcd | tokenizer.c: Support comma tokens. | 8 years ago | 
				
					
						|  Reece H. Dunn | c9199cfacb | tokenizer.c: Support exclamation mark tokens. | 8 years ago | 
				
					
						|  Reece H. Dunn | 128ceaff6a | tokenizer.c: Support question mark tokens. | 8 years ago | 
				
					
						|  Reece H. Dunn | 8f62e18324 | tokenizer.c: Support full stop tokens. | 8 years ago | 
				
					
						|  Reece H. Dunn | d50f3f2fa5 | tokenizer.c: Support word tokens. | 8 years ago | 
				
					
						|  Reece H. Dunn | a902f451d8 | tests/tokenizer.test: Support printing the tokens from a provided file, making it easy to investigate tokenizer issues. | 8 years ago | 
				
					
						|  Reece H. Dunn | d093513b65 | tokenizer.c: Add an options parameter to the tokenizer_reset API. | 8 years ago | 
				
					
						|  Reece H. Dunn | c41ac642fa | tokenizer.c: Tokenise Zp codepoints as paragraphs. | 8 years ago | 
				
					
						|  Reece H. Dunn | f3ea6f68f3 | tokenizer.c: Tokenise U+000B [VERTICAL TAB (VT)] as whitespace, not as newlines. | 8 years ago | 
				
					
						|  Reece H. Dunn | fc7a4e6701 | tokenizer.c: Recognise U+000C [FORM FEED (FF)] as a newline codepoint. | 8 years ago | 
				
					
						|  Reece H. Dunn | d2d718d700 | tokenizer.c: Tokenize line separator codepoints as newline tokens. | 8 years ago | 
				
					
						|  Reece H. Dunn | bf45e7ce36 | tokenizer.c: Recognise U+0085 [NEW LINE (NEL)] as a newline codepoint. | 8 years ago | 
				
					
						|  Reece H. Dunn | df6ca7a22c | tokenizer.c: Support whitespace tokens. | 8 years ago | 
				
					
						|  Reece H. Dunn | 8f0dae6a38 | tokenizer.c: Support windows newlines. | 8 years ago | 
				
					
						|  Reece H. Dunn | b897ff5aa8 | encoding.c: Support calling peekc past the end of the buffer. This makes calling peekc easier. | 8 years ago | 
				
					
						|  Reece H. Dunn | 3f692f498b | encoding.c: Implement a peekc API. | 8 years ago | 
				
					
						|  Reece H. Dunn | 1c8ed9c190 | tokenizer.c: Support mac newlines. | 8 years ago | 
				
					
						|  Reece H. Dunn | 7602c9ac18 | tokenizer.c: Support linux newlines. | 8 years ago | 
				
					
						|  Reece H. Dunn | bce44316bb | Create a basic tokenizer API using a structure that mirrors the TtsTokenizer interface in the tts-dev-studio project. | 8 years ago | 
				
					
						|  Reece H. Dunn | 5c6bc0e556 | Armenian emphasis mark (U+055B) is used for interjections, so treat it as an exclamation mark. | 8 years ago | 
				
					
						|  Reece H. Dunn | 1c4ce3dcd3 | tokenizer.c: create and use a clause_type_from_codepoint function, with tests. | 8 years ago | 
				
					
						|  Reece H. Dunn | 691457e98d | Add Prepended_Concatenation_Mark support from PropList.txt. | 8 years ago | 
				
					
						|  Reece H. Dunn | 4ce8b61180 | Extend ucd_property to 64-bits to allow all properties to be specified. | 8 years ago | 
				
					
						|  Reece H. Dunn | a9aabc6242 | Add tests for the PropList API. | 8 years ago | 
				
					
						|  Reece H. Dunn | 9dabf64680 | encoding.c: Support determining the string length for length < 0. | 8 years ago | 
				
					
						|  Reece H. Dunn | b5ed1f28a5 | encoding.c: Don't crash if NULL is passed as the string to the decode APIs. | 8 years ago | 
				
					
						|  Reece H. Dunn | d167d5649b | encoding.c: Implement support for the auto-detected character set (utf-8 + codepoint-encoding). | 8 years ago | 
				
					
						|  Reece H. Dunn | 6a0b5e4ae1 | encoding.c: Support using wchar_t strings with the text decoder API. | 8 years ago | 
				
					
						|  Reece H. Dunn | b74f756f00 | encoding.c: Support the ISO-10646-UCS-2 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | fa5d31a8af | encoding.c: Support the UTF-8 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | 2499610433 | encoding.c: Support the ISCII encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | 39f3ea54cf | encoding.c: Support the KOI8-R encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | b8a1006dd8 | encoding.c: Support the ISO 8859-16 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | 166e815723 | encoding.c: Support the ISO 8859-15 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | 91e054ec7c | encoding.c: Fix the ISO 8859 encoding names with date suffixes. | 8 years ago | 
				
					
						|  Reece H. Dunn | 0235c42652 | encoding.c: Support the ISO 8859-14 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | 24faceab57 | encoding.c: Support the ISO 8859-13 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | 495c0aed20 | encoding.c: Support the ISO 8859-11 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | 84f20f8bb8 | encoding.c: Support the ISO 8859-10 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | 0421f127e8 | encoding.c: Support the ISO 8859-9 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | 7da585e25e | encoding.c: Support the ISO 8859-8 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | 56c0b38785 | encoding.c: Support the ISO 8859-7 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | 9e4638ff25 | encoding.c: Support the ISO 8859-6 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | 51295d9d1b | encoding.c: Support the ISO 8859-5 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | b5589fc5ee | encoding.c: Support the ISO 8859-4 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | a93b0f3d64 | encoding.c: Support the ISO 8859-3 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | 0a0e84a322 | encoding.c: Support the ISO 8859-2 encoding. | 8 years ago | 
				
					
						|  Reece H. Dunn | 26bec1eedf | encoding.c: Support the ISO 8859-1 encoding. | 8 years ago |
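
The newline and whitespace commits above (Linux, Mac, and Windows newlines; FF, NEL, and line separator as newlines; VT as whitespace; Zp as paragraphs) can be summarised in a small sketch. It is an illustration of the behaviour the commit messages describe, not the code in tokenizer.c: the `sketch_` names are hypothetical, and treating TAB and SPACE as whitespace is an assumption, since the log only says that whitespace tokens are supported. The look-ahead for CR LF stands in for what the log attributes to the peekc API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical token classes; the names are illustrative only. */
typedef enum {
	SKETCH_OTHER,
	SKETCH_WHITESPACE,
	SKETCH_NEWLINE,
	SKETCH_PARAGRAPH
} sketch_class;

/* Classify a single codepoint following the commit messages. */
static sketch_class sketch_classify(uint32_t c)
{
	switch (c) {
	case 0x000A: /* LINE FEED (Linux newline) */
	case 0x000C: /* FORM FEED (FF) */
	case 0x000D: /* CARRIAGE RETURN (Mac newline, or first half of CR LF) */
	case 0x0085: /* NEXT LINE (NEL) */
	case 0x2028: /* LINE SEPARATOR, the only Zl codepoint */
		return SKETCH_NEWLINE;
	case 0x2029: /* PARAGRAPH SEPARATOR, the only Zp codepoint */
		return SKETCH_PARAGRAPH;
	case 0x000B: /* VERTICAL TAB (VT): whitespace, not a newline */
	case 0x0009: /* TAB: assumed whitespace */
	case 0x0020: /* SPACE: assumed whitespace */
		return SKETCH_WHITESPACE;
	default:
		return SKETCH_OTHER;
	}
}

/*
 * Number of codepoints consumed by a newline starting at s[i]:
 * 2 for CR LF (Windows), 1 for any other newline, 0 if s[i] is
 * not a newline or i is past the end of the buffer.
 */
static size_t sketch_newline_length(const uint32_t *s, size_t n, size_t i)
{
	assert(s != NULL);
	if (i >= n || sketch_classify(s[i]) != SKETCH_NEWLINE)
		return 0;
	/* Peek one codepoint ahead so that CR LF forms a single newline. */
	if (s[i] == 0x000D && i + 1 < n && s[i + 1] == 0x000A)
		return 2;
	return 1;
}
```

A caller would advance its position by the returned length and emit one newline token, so a Windows line ending yields a single token rather than two.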