This is an OCaml library to parse and generate the YAML file format. It is intended to be interoperable with the Ezjsonm JSON handling library, if the simple common subset of Yaml is used. Anchors and other advanced Yaml features are not implemented in the JSON compatibility layer.
The Yaml module docs are browseable online.
Install the library via opam install yaml, and then execute a toplevel via utop. You can also build and execute the toplevel locally by running dune utop.
# #require "yaml" ;;
# Yaml.of_string "foo";;
- : Yaml.value Yaml.res = Result.Ok (`String "foo")
# Yaml.of_string "- foo";;
- : Yaml.value Yaml.res = Result.Ok (`A [`String "foo"])
# Yaml.to_string (`O ["foo1", `String "bar1"; "foo2", `Float 1.0]);;
- : string Yaml.res = Result.Ok "foo1: bar1\nfoo2: 1\n"
# #require "yaml.unix" ;;
# Yaml_unix.to_file Fpath.(v "my.yml") (`String "bar") ;;
- : (unit, [ `Msg of string ]) result = Result.Ok ()
# Yaml_unix.of_file Fpath.(v "my.yml");;
- : (Yaml.value, [ `Msg of string ]) result = Result.Ok (`String "bar")
# Yaml_unix.of_file_exn Fpath.(v "my.yml");;
- : Yaml.value = `String "bar"
The library tries to conform to the YAML 1.1 spec and correctly interpret scalar string values into Yaml null, bool or float values.
Consider null values:
# Yaml.of_string_exn "null"
- : Yaml.value = `Null
# Yaml.of_string_exn ""
- : Yaml.value = `Null
# Yaml.of_string_exn "~"
- : Yaml.value = `Null
And bool values:
# Yaml.of_string_exn "true"
- : Yaml.value = `Bool true
# Yaml.of_string_exn "n"
- : Yaml.value = `Bool false
# Yaml.of_string_exn "yes"
- : Yaml.value = `Bool true
And float values:
# Yaml.of_string_exn "6.8523015e+5"
- : Yaml.value = `Float 685230.15
# Yaml.of_string_exn "685.230_15e+03"
- : Yaml.value = `Float 685230.15
# Yaml.of_string_exn "685_230.15"
- : Yaml.value = `Float 685230.15
# Yaml.of_string_exn "-.inf"
- : Yaml.value = `Float (neg_infinity)
# Yaml.of_string_exn "NaN"
- : Yaml.value = `Float nan
Note that YAML base60 ('sexagesimal') parsing is not yet supported, so this will show up as a string for now:
# Yaml.of_string_exn "190:20:30.15"
- : Yaml.value = `String "190:20:30.15"
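Until sexagesimal parsing is supported, a caller that needs the numeric value can convert the returned string by hand. The following is a minimal sketch using only the standard library; float_of_sexagesimal is a hypothetical helper, not part of this library, and it is more permissive than the YAML 1.1 grammar because it delegates each component to float_of_string.

```ocaml
(* Hypothetical helper: convert a base60 string such as "190:20:30.15"
   into a float. Returns None when there is no ':' separator or when a
   component is not a number. *)
let float_of_sexagesimal (s : string) : float option =
  let len = String.length s in
  let negative = len > 0 && s.[0] = '-' in
  let body =
    if len > 0 && (s.[0] = '-' || s.[0] = '+') then String.sub s 1 (len - 1)
    else s
  in
  match String.split_on_char ':' body with
  | [] | [ _ ] -> None
  | parts -> (
      (* Each component is worth 60 times the one to its right. *)
      match
        List.fold_left (fun acc p -> (acc *. 60.) +. float_of_string p) 0. parts
      with
      | v -> Some (if negative then -.v else v)
      | exception Failure _ -> None)
```

It could be applied to the `String payload returned by Yaml.of_string_exn when a field is known to hold a sexagesimal number.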
Integers are represented internally as floats (for JSON compatibility), but are printed back out without a trailing decimal point when the value is a whole number.
# Yaml.of_string_exn "1"
- : Yaml.value = `Float 1.
# Yaml.of_string_exn "1" |> Yaml.to_string
- : string Yaml.res = Result.Ok "1\n"
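Because every number comes back as a `Float, code that expects an integer has to check the value itself. A minimal sketch, assuming OCaml 4.08 or later for Float.is_integer; int_of_value is a hypothetical helper, not a function provided by the library.

```ocaml
(* Requires the yaml package. *)

(* Hypothetical helper: recover an int from a parsed Yaml value, but only
   when the float has no fractional part. *)
let int_of_value : Yaml.value -> int option = function
  | `Float f when Float.is_integer f -> Some (int_of_float f)
  | _ -> None
```

Strings, non-integral floats and the special values nan and infinity all yield None.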
ocaml-yaml is based around a binding to the C libyaml library to do the majority of the low-level parsing and serialisation, with a higher-level OCaml module that provides a simple interface for the majority of common uses.
We use the following major OCaml tools and libraries:
- build: dune is the build tool used.
- ffi: ctypes is the library to interface with the C FFI exposed by libyaml.
- preprocessor: ppx_sexp_conv generates s-expression serialisers and deserialisers for the types exposed by the library, shipped in a yaml-sexp package.
- tests: alcotest specifies conventional unit tests, and crowbar is used to drive property-based fuzz-testing of the library.
The following layers are present to make the high-level library work, contained within the following directories in the repository:
- vendor/ contains the C sources for libyaml, with some minor modifications to the header files to make them easier to use with Ctypes.
- types/ has OCaml definitions for the C types defined in yaml.h.
- ffi/ has OCaml definitions for the C functions defined in yaml.h.
- lib/ contains the high-level OCaml interface for Yaml manipulation, using the FFI definitions above.
- lib_sexp/ contains the reexported types with s-expression converters also included.
- unix/ contains OS-specific bindings with file-handling.
- tests/ has unit tests for the library functionality.
- fuzz/ contains exploratory fuzz testing that randomises inputs to find bugs.
- config/ has configuration tests to set the C compilation flags.
C library: A copy of the libyaml C library is included in vendor/ to eliminate the need
for a third-party dependency. The C code is built directly into a yaml.a static library,
and linked in with the OCaml bindings.
Bindings to C types: We then need to generate OCaml type definitions that correspond to the C header
definitions in libyaml. This is all done without writing a single line of C code,
via the stub generation support in ocaml-ctypes.
We define an OCaml library that describes the C enumerations or structs that we need a
corresponding definition for (see yaml_bindings_types.ml).
This code is also exported in the yaml.bindings.types ocamlfind library.
These binding descriptions are then compiled into an executable (see ffi_types_stubgen.ml).
When run, this calls the C compiler and generates a compatible OCaml module with the results
of probing the C library and statically determining values for (e.g.) struct offsets or macros.
The resulting OCaml library is exported in the yaml.types ocamlfind library.
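As an illustration of what such a binding description looks like, here is a minimal sketch in the style Ctypes expects. It is not a copy of yaml_bindings_types.ml: the module name Types and the OCaml value names are assumptions, while YAML_UTF8_ENCODING and the yaml_mark_s struct with its index, line and column fields come from the libyaml header.

```ocaml
(* Requires the ctypes package. The description is a functor so that it can
   be used once to generate the C probing program and once more against the
   module that the probe produces. *)
module Types (F : Ctypes.TYPE) = struct
  open F

  (* An enum value whose integer is read from yaml.h at build time. *)
  let utf8_encoding = constant "YAML_UTF8_ENCODING" int64_t

  (* struct yaml_mark_s: field offsets are determined by the C compiler
     rather than being written down by hand. *)
  type mark

  let mark : mark Ctypes.structure typ = structure "yaml_mark_s"
  let index = field mark "index" size_t
  let line = field mark "line" size_t
  let column = field mark "column" size_t
  let () = seal mark
end
```

The functor only describes the layout; nothing touches the C library until it is applied to a concrete implementation of Ctypes.TYPE.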
Bindings to C functions: Once we have the C type definitions bound into OCaml, we then need to
bind the corresponding C library functions that use them. We take exactly the same approach as we
did for probing types earlier, but define OCaml descriptions of the functions
that we want to bind instead (see yaml_bindings.ml).
The ffi_stubgen executable then takes these descriptions and
generates two source code files: an OCaml module containing the typed function calls,
and the corresponding C bindings that link those typed function calls to the C library.
Again, this is all done automatically via Ctypes functions, and we never had to write
any manual C code. As an additional layer of safety, mistakes when writing the Ctypes
bindings will also result in a compile-time error, since the generated C code will fail
to compile with the C header files for the yaml library. The resulting OCaml functions
are exported in the yaml.ffi ocamlfind library.
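A function description follows the same functor pattern. The sketch below is illustrative rather than taken from yaml_bindings.ml: the module name Functions and the OCaml value names are assumptions, and it assumes a ctypes release that provides the Ctypes.FOREIGN signature. The two C functions named are part of the libyaml API.

```ocaml
(* Requires the ctypes package. *)
module Functions (F : Ctypes.FOREIGN) = struct
  open Ctypes
  open F

  (* const char *yaml_get_version_string(void); *)
  let get_version_string =
    foreign "yaml_get_version_string" (void @-> returning string)

  (* void yaml_get_version(int *major, int *minor, int *patch); *)
  let get_version =
    foreign "yaml_get_version"
      (ptr int @-> ptr int @-> ptr int @-> returning void)
end
```

If an argument type were declared wrongly here, the generated C stub would no longer match the prototype in the header, which is how the compile-time check described above comes about.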
OCaml API: Finally, we define the OCaml API that uses the low-level FFI to expose
a well-typed OCaml interface. We adopt a convention of using the standard result
type to return explicit errors instead of raising OCaml exceptions. We also
define some polymorphic variant types to represent various configuration options
(such as the printing style of different Yaml values).
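In practice the result convention means callers pattern match on Ok and Error rather than installing exception handlers. A minimal sketch, where describe is a hypothetical function written only for this example:

```ocaml
(* Requires the yaml package. *)

(* Hypothetical example: summarise a Yaml document, reporting parse
   failures through the `Msg error carried by Yaml.res. *)
let describe (s : string) : string =
  match Yaml.of_string s with
  | Ok (`String v) -> "string: " ^ v
  | Ok (`A items) -> Printf.sprintf "sequence of %d items" (List.length items)
  | Ok _ -> "some other value"
  | Error (`Msg m) -> "parse error: " ^ m
```

The _exn variants shown earlier, such as Yaml.of_string_exn, remain available when raising an exception is more convenient.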
Since the most common use of Yaml is for relatively simple key-value stores, the OCaml API by default exposes polymorphic variant types that are completely compatible with the Ezjsonm library, meaning that you can print JSON or Yaml back and forth very easily. However, if you do need the advanced Yaml functions like anchors and aliases, then there are definitions that expose them too.
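Because the value types coincide, a parsed Yaml document can be handed directly to Ezjsonm with no conversion step. A minimal sketch, assuming the ezjsonm package is installed alongside yaml and OCaml 4.08 or later for Result.map; yaml_to_json is a hypothetical name.

```ocaml
(* Requires the yaml and ezjsonm packages. *)

(* Hypothetical helper: parse a Yaml string and print it as minified JSON.
   No conversion is needed because Yaml.value and Ezjsonm.value are the
   same polymorphic variant type. *)
let yaml_to_json (s : string) : string Yaml.res =
  Result.map
    (fun (v : Yaml.value) -> Ezjsonm.value_to_string ~minify:true v)
    (Yaml.of_string s)
```

The reverse direction works the same way, by passing a value obtained from Ezjsonm to Yaml.to_string.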
Testing: There are two test suites included with the repository. The first is a conventional unit test infrastructure that uses the Alcotest framework from MirageOS. The second is a property-based fuzz testing framework via Crowbar, which tries to find unexpected issues by exploring the library with randomised inputs that are guided by the control flow of the execution.
Docs: Documentation can be locally generated by running make doc, and looking
in _build/default/_doc/index.html with a web browser. The URL for online docs
is listed below.
- Discussion: Post on https://discuss.ocaml.org/ with the yaml tag under the Ecosystem category.
- Bugs: https://github.com/avsm/ocaml-yaml/issues
- Docs: http://anil-code.recoil.org/ocaml-yaml
Contributions are very welcome. Please see the overall TODO list below, or get in touch with any particular comments you might have.
- Warnings: handle the unsigned char yaml_char_t in the Ctypes bindings.
- Warnings: const needs to be specified in the Ctypes binding.
- Send upstream PR for forked header file (due to removal of anonymous structs).