| - [Data Files](#data-files) | - [Data Files](#data-files) | ||||
| - [Unicode Character Database](#unicode-character-database) | - [Unicode Character Database](#unicode-character-database) | ||||
| - [ConScript Unicode Registry](#conscript-unicode-registry) | - [ConScript Unicode Registry](#conscript-unicode-registry) | ||||
| - [C Library](#c-library) | |||||
| - [Querying Properties](#querying-properties) | |||||
| - [Case Conversion](#case-conversion) | |||||
| - [wctype Compatibility](#wctype-compatibility) | |||||
| - [Library](#library) | |||||
| - [Build Dependencies](#build-dependencies) | - [Build Dependencies](#build-dependencies) | ||||
| - [Debian](#debian) | - [Debian](#debian) | ||||
| - [Building](#building) | - [Building](#building) | ||||
| This data is located in the `data/csur` directory in a form compatible with the | This data is located in the `data/csur` directory in a form compatible with the | ||||
| Unicode Character Data files. | Unicode Character Data files. | ||||
| ## C Library | |||||
| ## Library | |||||
| The C library provides several different facilities that make use of the UCD | |||||
| data. It provides a compact and efficient representation of the different data | |||||
| tables. | |||||
| The `ucd-tools` project provides a C library with a C++ binding. This library | |||||
| supports querying Unicode information about codepoints, using a compact and | |||||
| efficient representation of the different data tables. | |||||
| Detailed documentation is provided in the `src/include/ucd/ucd.h` file in the | |||||
| Doxygen documentation format. | |||||
| A ctype-compatible API is also provided, allowing programs to use that API on | |||||
| systems that don't provide wide-character case conversion and ctype | |||||
| implementations. | |||||
| ### Querying Properties | |||||
| The library exposes the following properties from the UCD data files: | |||||
| | C API | C++ API | Data | Description | | |||||
| |-----------------------|------------------------|-------------|-------------| | |||||
| | `ucd_lookup_category` | `ucd::lookup_category` | UnicodeData | A [General Category Value](http://www.unicode.org/reports/tr44/#General_Category_Values). | | |||||
| | `ucd_lookup_script` | `ucd::lookup_script` | Script | An [ISO 15924](http://www.unicode.org/iso15924/iso15924-codes.html) script code. | | |||||
| | `ucd_properties` | `ucd::properties` | PropList | The code point properties from the PropList Unicode data file. | | |||||
| ### Case Conversion | |||||
| The following character conversion functions are provided: | |||||
| | C API | C++ API | Description | | |||||
| |---------------|----------------|-------------| | |||||
| | `ucd_tolower` | `ucd::tolower` | convert letters to lower case | | |||||
| | `ucd_totitle` | `ucd::totitle` | convert letters to title case (UCD extension) | | |||||
| | `ucd_toupper` | `ucd::toupper` | convert letters to upper case | | |||||
| __NOTE:__ These functions use the simple case mapping algorithm. That is, they | |||||
| only ever map to a single character. This is to provide a compatible signature | |||||
| to the standard C `wctype.h` APIs. | |||||
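One consequence of the single-character mapping described in the note above is that a string can be converted in place without changing its length. The following is a minimal sketch of that idea using only the standard `wctype.h` functions. The names `case_map_fn` and `convert_case_in_place` are hypothetical and are not part of `ucd-tools`. Whether `ucd_toupper` can be passed directly depends on its parameter and return types matching `wint_t` on your platform, which is an assumption to check against `ucd.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Signature shared by the standard towlower/towupper functions.
 * (Hypothetical typedef, used only for this illustration.) */
typedef wint_t (*case_map_fn)(wint_t);

/* Convert a NUL-terminated wide string in place.
 *
 * A simple case mapping yields exactly one character per input
 * character, so the string length never changes and no reallocation
 * is needed. Returns the number of characters visited. */
size_t convert_case_in_place(wchar_t *text, case_map_fn map)
{
    assert(text != NULL && map != NULL);

    size_t n = 0;
    for (; text[n] != L'\0'; ++n)
        text[n] = (wchar_t)map((wint_t)text[n]);
    return n;
}
```

Calling `convert_case_in_place(buffer, towupper)` upper-cases `buffer` using the platform's own mapping. A full (multi-character) case mapping could not be applied this way, because the result may be longer than the input.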
| ### wctype Compatibility | |||||
| To facilitate working on platforms that don't have a usable wide-character | |||||
| ctype library, or to provide more consistent behaviour, the `ucd-tools` | |||||
| C library provides a set of APIs that are compatible with `wctype.h`. | |||||
| The following character classification functions are provided: | |||||
| | C API | C++ API | | |||||
| |----------------|-----------------| | |||||
| | `ucd_isalnum` | `ucd::isalnum` | | |||||
| | `ucd_isalpha` | `ucd::isalpha` | | |||||
| | `ucd_isblank` | `ucd::isblank` | | |||||
| | `ucd_iscntrl` | `ucd::iscntrl` | | |||||
| | `ucd_isdigit` | `ucd::isdigit` | | |||||
| | `ucd_isgraph` | `ucd::isgraph` | | |||||
| | `ucd_islower` | `ucd::islower` | | |||||
| | `ucd_isprint` | `ucd::isprint` | | |||||
| | `ucd_ispunct` | `ucd::ispunct` | | |||||
| | `ucd_isspace` | `ucd::isspace` | | |||||
| | `ucd_isupper` | `ucd::isupper` | | |||||
| | `ucd_isxdigit` | `ucd::isxdigit` | | |||||
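Because these functions mirror the `wctype.h` classification functions, code written against a predicate of that shape can in principle switch between the two. The sketch below shows the pattern with the standard functions only. `classify_fn` and `count_matching` are hypothetical names, not part of `ucd-tools`. Passing a `ucd_is*` function instead assumes its parameter type is compatible with `wint_t`, which should be confirmed in `ucd.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>
#include <wctype.h>

/* Signature shared by iswalpha, iswdigit, iswspace, and friends.
 * (Hypothetical typedef, used only for this illustration.) */
typedef int (*classify_fn)(wint_t);

/* Count the characters in a NUL-terminated wide string for which the
 * given classification predicate returns non-zero. */
size_t count_matching(const wchar_t *text, classify_fn is_match)
{
    assert(text != NULL && is_match != NULL);

    size_t count = 0;
    for (; *text != L'\0'; ++text) {
        if (is_match((wint_t)*text))
            ++count;
    }
    return count;
}
```

For example, `count_matching(L"abc 123", iswdigit)` counts the digits in the string using the platform's classification tables.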
| Detailed documentation is provided in the [src/include/ucd/ucd.h](src/include/ucd/ucd.h) file | |||||
| using the Doxygen documentation format. | |||||
| ## Build Dependencies | ## Build Dependencies | ||||