# Trivial UTF-8 Manual

###### [in package TRIVIAL-UTF-8]

## Introduction

Trivial UTF-8 is a small library for doing UTF-8-based input and output on a Lisp implementation that already supports Unicode, meaning that `char-code` and `code-char` deal with Unicode character codes.

The rationale for the existence of this library is that while Unicode-enabled implementations usually do provide some kind of interface for dealing with character encodings, these interfaces are typically neither flexible nor uniform across implementations.

The [Babel][babel] library solves a similar problem while understanding more encodings. Trivial UTF-8 was written before Babel existed, but for new projects you might be better off going with Babel. The one advantage of Trivial UTF-8 is that it doesn't depend on any other libraries.

[babel]: https://common-lisp.net/project/babel/

## Links and Systems

Here is the [official repository][trivial-utf-8-repo] and the [HTML documentation][trivial-utf-8-doc] for the latest version.

[trivial-utf-8-repo]: https://gitlab.common-lisp.net/trivial-utf-8/trivial-utf-8
[trivial-utf-8-doc]: http://melisgl.github.io/mgl-pax-world/trivial-utf-8-manual.html

- [system] "trivial-utf-8"
    - Description: A small library for doing UTF-8-based input and output.
    - Licence: ZLIB
    - Author: Marijn Haverbeke
    - Maintainer: Gábor Melis
    - Homepage:
    - Bug tracker:
    - Source control: [GIT](https://gitlab.common-lisp.net/trivial-utf-8/trivial-utf-8.git)
    - Depends on: mgl-pax-bootstrap

## Reference

- [function] utf-8-byte-length string

    Calculate the number of bytes needed to encode `string`.

- [function] string-to-utf-8-bytes string &key null-terminate

    Convert `string` into an array of unsigned bytes containing its UTF-8 representation. If `null-terminate`, add an extra 0 byte at the end.

- [function] utf-8-group-size byte

    Determine the number of bytes that are part of the character whose encoding starts with `byte`. May signal `utf-8-decoding-error`.

- [function] utf-8-bytes-to-string bytes &key (start 0) (end (length bytes))

    Convert the subsequence of `bytes` between `start` and `end`, an array containing UTF-8 encoded characters, to a string. The element type of `bytes` may be anything as long as it can be coerced into an `(unsigned-byte 8)` array. May signal `utf-8-decoding-error`.

- [function] read-utf-8-string input &key null-terminated stop-at-eof (char-length -1) (byte-length -1)

    Read UTF-8 encoded data from `input`, a byte stream, and construct a string with the characters found. When `null-terminated` is given, stop reading at a null character. If `stop-at-eof`, then stop at end-of-file without signaling an error. The `char-length` and `byte-length` parameters can be used to specify the maximum number of characters or bytes to read, where -1 means no limit. May signal `utf-8-decoding-error`.

- [function] write-utf-8-bytes string byte-stream &key null-terminate

    Write `string` to `byte-stream`, encoding it as UTF-8. If `null-terminate`, write an extra 0 byte at the end.

- [condition] utf-8-decoding-error simple-error
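
The functions listed in the reference can be combined roughly as in the following sketch. It is an illustration rather than part of the library's documentation. It assumes that Quicklisp is installed for loading the system and that UIOP (shipped with ASDF 3) is available for locating a temporary directory. The variable names and the scratch file name `trivial-utf-8-demo.bin` are made up for this example.

```lisp
;; Assumption: Quicklisp is installed. (asdf:load-system "trivial-utf-8")
;; should work as well if the system is already visible to ASDF.
(ql:quickload "trivial-utf-8")

;; The five-character string h, U+00E9, l, l, o. It is built with code-char
;; so the example does not depend on how this source text is encoded.
;; U+00E9 occupies two bytes in UTF-8, so the encoding is six bytes long.
(defparameter *text*
  (coerce (list #\h (code-char #xE9) #\l #\l #\o) 'string))

;; Size of the encoding, computed without building the byte array.
(defparameter *size* (trivial-utf-8:utf-8-byte-length *text*))

;; Encode, without and with a trailing 0 byte.
(defparameter *bytes* (trivial-utf-8:string-to-utf-8-bytes *text*))
(defparameter *c-bytes*
  (trivial-utf-8:string-to-utf-8-bytes *text* :null-terminate t))

;; The first byte of an encoded character tells how many bytes it spans.
;; Index 1 holds the first byte of U+00E9.
(defparameter *group* (trivial-utf-8:utf-8-group-size (aref *bytes* 1)))

;; Decode the whole array, or only the bytes from start up to end.
(defparameter *round-trip* (trivial-utf-8:utf-8-bytes-to-string *bytes*))
(defparameter *slice*
  (trivial-utf-8:utf-8-bytes-to-string *bytes* :start 1 :end 3))

;; #xFF can never start a UTF-8 character, so the condition is handled here.
(defparameter *bad*
  (handler-case (trivial-utf-8:utf-8-group-size #xFF)
    (trivial-utf-8:utf-8-decoding-error () :invalid)))

;; Stream round trip through a scratch file (hypothetical file name).
(defparameter *path*
  (merge-pathnames "trivial-utf-8-demo.bin" (uiop:temporary-directory)))

(with-open-file (out *path* :direction :output
                            :if-exists :supersede
                            :element-type '(unsigned-byte 8))
  (trivial-utf-8:write-utf-8-bytes *text* out))

(defparameter *from-file*
  (with-open-file (in *path* :element-type '(unsigned-byte 8))
    (trivial-utf-8:read-utf-8-string in :stop-at-eof t)))

;; Remove the scratch file again.
(delete-file *path*)
```

Both streams are opened with element type `(unsigned-byte 8)` because `write-utf-8-bytes` and `read-utf-8-string` operate on byte streams. Passing `:stop-at-eof t` lets the read end quietly at end-of-file; alternatively, `:byte-length` or `:char-length` can bound the read when the size of the data is known in advance.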