Sane OCaml String API
This library is a set of APIs defined with module types, and a set of modules and functors implementing one or more of those interfaces.
The APIs define what a character and a string of characters should be.
Module Types (APIs)
- BASIC_CHARACTER: characters of any length.
- NATIVE_CONVERSIONS: functions to transform from/to native OCaml strings.
- BASIC_STRING: immutable strings of (potentially abstract) characters:
    - contains a functor to provide a thread-agnostic output function: sig val output: ... end.
- UNSAFELY_MUTABLE: mutability of some string implementations (“unsafe” meaning that they break immutability invariants/assumptions).
- MINIMALISTIC_MUTABLE_STRING: abstract mutable string used as argument of the
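As a rough illustration of describing characters and strings through module types, the sketch below declares two deliberately tiny signatures and implements them with native OCaml types. The names CHARACTER_SKETCH, STRING_SKETCH, Char_sketch and String_sketch, and every function in them, are hypothetical; they are far smaller than the library's actual interfaces and are not copied from them.

```ocaml
(* Hypothetical, simplified signatures; NOT the library's actual APIs. *)
module type CHARACTER_SKETCH = sig
  type t
  val of_native_char : char -> t option
  val to_native_string : t -> string
  val compare : t -> t -> int
end

module type STRING_SKETCH = sig
  type character
  type t
  val empty : t
  val of_character_list : character list -> t
  val to_character_list : t -> character list
  val length : t -> int
  val get : t -> index:int -> character option
  val concat : ?sep:t -> t list -> t
end

(* One possible implementation of the character signature with [char]. *)
module Char_sketch : CHARACTER_SKETCH with type t = char = struct
  type t = char
  let of_native_char c = Some c
  let to_native_string c = String.make 1 c
  let compare = Char.compare
end

(* One possible implementation of the string signature with [string]. *)
module String_sketch :
  STRING_SKETCH with type character = char and type t = string = struct
  type character = char
  type t = string
  let empty = ""
  let of_character_list cs = String.concat "" (List.map (String.make 1) cs)
  let to_character_list s = List.init (String.length s) (String.get s)
  let length = String.length
  (* Out-of-bounds accesses return [None] instead of raising. *)
  let get s ~index =
    if index >= 0 && index < String.length s then Some s.[index] else None
  let concat ?(sep = "") ts = String.concat sep ts
end
```

Code written against STRING_SKETCH alone would work unchanged with any other implementation of the signature, which is the point of specifying the API as module types.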
Native OCaml Characters
Native_character module implements
Native OCaml Strings
Native_string module implements
UNSAFELY_MUTABLE with OCaml's
string type (and hence
Lists Of Arbitrary Characters
List_of is a functor producing a BASIC_STRING, i.e., it creates a string data structure made of a list of characters.
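The idea of a functor that turns a character module into a list-backed string can be sketched as follows. This is a minimal illustration under stated assumptions: CHARACTER_SKETCH, List_of_sketch, Char_sketch and all their functions are hypothetical names, not the library's List_of or its signatures.

```ocaml
(* Hypothetical minimal character signature; NOT the library's own. *)
module type CHARACTER_SKETCH = sig
  type t
  val to_native_string : t -> string
end

(* Given any character module, build a string type represented as a
   list of its characters. *)
module List_of_sketch (Chr : CHARACTER_SKETCH) = struct
  type character = Chr.t
  type t = character list

  let empty : t = []
  let of_character_list (cs : character list) : t = cs
  let length (s : t) = List.length s

  (* [List.nth_opt] raises on negative indices, so guard against them. *)
  let get (s : t) ~index =
    if index < 0 then None else List.nth_opt s index

  (* Join the strings, inserting [sep] between consecutive elements. *)
  let concat ?(sep = empty) (ss : t list) : t =
    match ss with
    | [] -> []
    | first :: rest ->
      List.fold_left (fun acc s -> acc @ sep @ s) first rest

  let to_native_string (s : t) =
    String.concat "" (List.map Chr.to_native_string s)
end

(* Example instantiation with native characters. *)
module Char_sketch = struct
  type t = char
  let to_native_string c = String.make 1 c
end

module S = List_of_sketch (Char_sketch)
```

A list representation makes the string immutable by construction, at the cost of linear-time indexing.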
Build From Basic Mutable Data-structures
Of_mutable uses an implementation of
MINIMALISTIC_MUTABLE_STRING to build a
Integer UTF-8 Characters
The Int_utf8_character module implements characters as OCaml integers (int) representing UTF-8 characters (we force the handling of not more than 31 bits, even if RFC 3629 restricts them to end at U+10FFFF, cf. also Wikipedia). Note that the whitespace test recognizes only ASCII whitespace (useful while writing parsers, for example).
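To make the 31-bit remark concrete, here is a hedged sketch of serializing an integer code point with the original 1-to-6-byte UTF-8 layout, which covers values up to 31 bits, together with an ASCII-only whitespace test. The function names utf8_encode_sketch and is_ascii_whitespace_sketch are hypothetical, and the set of whitespace characters (space, tab, line feed, carriage return) is an assumption; neither is taken from the library.

```ocaml
(* Encode an integer code point using the 1-to-6-byte UTF-8 layout.
   Returns [None] for negative values or values needing more than 31 bits.
   Hypothetical helper, not the library's implementation. *)
let utf8_encode_sketch (code : int) : string option =
  (* Pick the number of continuation bytes and the leading-byte marker. *)
  let layout =
    if code < 0 then None
    else if code < 0x80 then Some (0, 0x00)
    else if code < 0x800 then Some (1, 0xC0)
    else if code < 0x10000 then Some (2, 0xE0)
    else if code < 0x200000 then Some (3, 0xF0)
    else if code < 0x4000000 then Some (4, 0xF8)
    else if code lsr 26 < 0x20 then Some (5, 0xFC) (* i.e. code < 2^31 *)
    else None
  in
  match layout with
  | None -> None
  | Some (continuations, marker) ->
    let buf = Buffer.create (continuations + 1) in
    (* Leading byte: marker bits plus the highest payload bits. *)
    Buffer.add_char buf
      (Char.chr (marker lor (code lsr (6 * continuations))));
    (* Continuation bytes: 10xxxxxx, six payload bits each. *)
    for i = continuations - 1 downto 0 do
      Buffer.add_char buf
        (Char.chr (0x80 lor ((code lsr (6 * i)) land 0x3F)))
    done;
    Some (Buffer.contents buf)

(* ASCII-only whitespace test; the exact character set is an assumption. *)
let is_ascii_whitespace_sketch (code : int) : bool =
  match code with
  | 0x20 | 0x09 | 0x0A | 0x0D -> true
  | _ -> false
```

With these definitions, U+00A0 (no-break space) is not treated as whitespace, which keeps a parser's notion of "blank" limited to plain ASCII. The sketch does not reject surrogates or values above U+10FFFF, in line with the wider 31-bit range described above.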
Examples, Tests, and Benchmarks
See the file sosa_test.ml for usage examples. The library is tested with:
- native strings and characters,
- lists of native characters,
- lists of integers representing UTF-8 characters,
- arrays of integers representing UTF-8 characters,
- bigarrays of 8-bit integers.
The tests are a self-compiling “Shell-then-OCaml-script” which
depends on the Nonstd, and the OCaml
and you may add the basic benchmarks to the process with: