Chronicle Wire is a wire format abstraction library. Its purpose is to address the following concerns in a consistent manner:
- Application configuration (using YAML).
- Data serialization (YAML, binary YAML, JSON, raw binary data, CSV).
- Accessing off-heap memory in a thread-safe manner (bind to shared off-heap memory).
- High-performance data exchange using binary formats. Only include as much metadata as you need.
Chronicle Wire uses Chronicle Bytes for bytes manipulation, and Chronicle Core for low level JVM access.
Often you want to use these interchangeably.
- Configuration includes aliased type information. This supports easy extension by adding new classes/versions, and cross-platform using type aliasing.
- By supporting types, a configuration file can bootstrap itself. You control how the configuration file is decoded. See engine.yaml.
- To send the configuration of a server to a client, or from a client to a server.
- To store the configuration of a data store in its header.
- In configuration, to be able to create any object or component.
- Save a configuration after you have changed it.
- To be able to share data in memory between processes in a thread-safe manner.
Chronicle Wire supports a separation of describing what data you want to store and retrieve, and how it should be rendered/parsed. Chronicle Wire handles a variety of formatting options, for a wide range of formats.
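As a rough illustration of this separation, the sketch below uses only the JDK rather than Chronicle Wire itself. The names `FieldSink`, `TextSink`, and `RawSink` are hypothetical and are not part of the library. The point is only that one description of the data (`describe`) can be rendered in more than one way.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class WhatVersusHow {
    /** Hypothetical sink: callers say WHAT to write, implementations decide HOW. */
    interface FieldSink {
        void text(String name, String value);
        void int64(String name, long value);
    }

    /** Renders fields as "name: value" lines. */
    static final class TextSink implements FieldSink {
        final StringBuilder out = new StringBuilder();
        public void text(String name, String value) {
            out.append(name).append(": ").append(value).append('\n');
        }
        public void int64(String name, long value) {
            out.append(name).append(": ").append(value).append('\n');
        }
    }

    /** Drops field names and writes only the values in binary. */
    static final class RawSink implements FieldSink {
        final ByteBuffer out = ByteBuffer.allocate(64);
        public void text(String name, String value) {
            byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
            out.put((byte) utf8.length); // sketch only: assumes fewer than 128 bytes
            out.put(utf8);
        }
        public void int64(String name, long value) {
            out.putLong(value);
        }
    }

    /** The single description of the data, independent of the format. */
    static void describe(FieldSink sink) {
        sink.text("message", "Hello World");
        sink.int64("number", 1234567890L);
    }

    public static void main(String[] args) {
        TextSink text = new TextSink();
        RawSink raw = new RawSink();
        describe(text);
        describe(raw);
        System.out.print(text.out);
        System.out.println("raw bytes written: " + raw.out.position());
    }
}
```

The text rendering keeps the field names for readability, while the raw rendering spends 20 bytes on the same two values.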
A key aim of Chronicle Wire is to support schema changes. It should make reasonable attempts to handle:
- optional fields
- fields in a different order
- fields that the consumer doesn’t expect; optionally parsing them, or ignoring them
- more or less data than expected; in field-less formats
- reading a different type to the one written
- updating fixed-length fields, automatically where possible using a bound data structure.
Chronicle Wire will also be efficient where any, or all, of the following points are true:
- fields are in the order expected
- fields are the type expected
- field names/numbers are not used
- self-describing types are not needed
- random access of data values is supported.
Chronicle Wire is designed to make it easy to convert from one wire format to another. For example, you can use fixed-width binary data in memory for performance, and variable-width or text over the network. Different TCP connections could use different formats.
Chronicle Wire also supports hybrid wire formats. For example, you can have one format embedded in another.
The text formats include:
- YAML - subset of mapping structures included
- JSON - superset to support serialization
- CSV - superset to support serialization
- XML - planned
- FIX - proposed
Options include:
- field names (for example, JSON), or field numbers (for example, FIX)
- optional fields with default values that can be dropped
- zero-copy access to fields - planned
- thread-safe operations in text - planned
To support wire format discovery, the first byte should be in the ASCII range; adding an ASCII whitespace if needed.
The binary formats include:
- binary YAML
- delta compressing Binary YAML. This is a Chronicle Wire Enterprise feature
- typed data without fields
- raw untyped fieldless data
- BSON (Binary JSON) - planned
Options for Binary format:
- field names or field numbers
- variable width
- optional fields with a default value can be dropped
- fixed width data with zero copy support
- thread-safe operations
Note: Chronicle Wire supports debug/transparent combinations like self-describing data with zero copy support.
To support wire format discovery, the first bytes should have the top bit set.
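The two discovery conventions (a first byte in the ASCII range for text formats, the top bit set for binary formats) can be checked with a single comparison. The following is a minimal JDK-only sketch; `WireFormatSniffer` and `Kind` are hypothetical names, not Chronicle Wire API, and the check only applies to formats that follow these conventions.

```java
import java.nio.charset.StandardCharsets;

final class WireFormatSniffer {
    enum Kind { TEXT, BINARY, UNKNOWN }

    /** Classifies a message by its first byte only. */
    static Kind sniff(byte[] data) {
        if (data.length == 0) {
            return Kind.UNKNOWN;
        }
        int first = data[0] & 0xFF;          // treat the byte as unsigned
        return first >= 0x80 ? Kind.BINARY   // top bit set
                             : Kind.TEXT;    // ASCII range
    }

    public static void main(String[] args) {
        byte[] text = "message: Hello World\n".getBytes(StandardCharsets.US_ASCII);
        // 0xC7 has the top bit set; the remaining bytes are illustrative
        byte[] binary = {(byte) 0xC7, 'm', 'e', 's', 's', 'a', 'g', 'e'};
        System.out.println(sniff(text));
        System.out.println(sniff(binary));
    }
}
```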
First you need to have a buffer to write to. This can be a byte[], a ByteBuffer, off-heap memory, or even an address and length that you have obtained from some other library.
// Bytes which wraps a ByteBuffer which is resized as needed.
Bytes<ByteBuffer> bytes = Bytes.elasticByteBuffer();
Now you can choose which format you are using. As the wire formats are themselves unbuffered, you can use them with the same buffer, but in general using one wire format is easier.
Wire wire = new TextWire(bytes);
// or
WireType wireType = WireType.TEXT;
Wire wireB = wireType.apply(bytes);
// or
Bytes<ByteBuffer> bytes2 = Bytes.elasticByteBuffer();
Wire wire2 = new BinaryWire(bytes2);
// or
Bytes<ByteBuffer> bytes3 = Bytes.elasticByteBuffer();
Wire wire3 = new RawWire(bytes3);
So now you can write to the wire with a simple document.
wire.write(() -> "message").text("Hello World")
.write(() -> "number").int64(1234567890L)
.write(() -> "code").asEnum(TimeUnit.SECONDS)
.write(() -> "price").float64(10.50);
System.out.println(bytes);
prints
message: Hello World
number: 1234567890
code: SECONDS
price: 10.5
// the same code as for text wire
wire2.write(() -> "message").text("Hello World")
.write(() -> "number").int64(1234567890L)
.write(() -> "code").asEnum(TimeUnit.SECONDS)
.write(() -> "price").float64(10.50);
System.out.println(bytes2.toHexString());
prints
00000000 C7 6D 65 73 73 61 67 65 EB 48 65 6C 6C 6F 20 57 ·message ·Hello W
00000010 6F 72 6C 64 C6 6E 75 6D 62 65 72 A3 D2 02 96 49 orld·num ber····I
00000020 C4 63 6F 64 65 E7 53 45 43 4F 4E 44 53 C5 70 72 ·code·SE CONDS·pr
00000030 69 63 65 90 00 00 28 41 ice···(A
Using RawWire strips away all the metadata to reduce the size of the message, and improve speed.
The downside is that we cannot easily see what the message contains.
// the same code as for text wire
wire3.write(() -> "message").text("Hello World")
.write(() -> "number").int64(1234567890L)
.write(() -> "code").asEnum(TimeUnit.SECONDS)
.write(() -> "price").float64(10.50);
System.out.println(bytes3.toHexString());
prints in RawWire:
00000000 0B 48 65 6C 6C 6F 20 57 6F 72 6C 64 D2 02 96 49 ·Hello W orld···I
00000010 00 00 00 00 07 53 45 43 4F 4E 44 53 00 00 00 00 ·····SEC ONDS····
00000020 00 00 25 40 ··%@
For more examples see Examples Chapter1
Chronicle Wire allows (and encourages) objects to be re-used in order to reduce allocation rates.
When a marshallable object is re-used or initialised by the framework, it is first reset by way of Wires.reset(). In the case of most DTOs with simple scalar values, this will not cause any issues. However, more complicated objects with object instance fields may experience undesired behaviour.
In order to reset a marshallable object, the process is as follows:
1. create a new instance of the object to be reset
2. copy all fields from the new instance to the existing instance
3. the existing instance is now considered 'reset' back to default values
The object created in step 1 is cached for performance reasons, meaning that both the new and existing instance of the marshallable object could have a reference to the same object.
While this will not be a problem for primitive or immutable values (for example, int, Long, String), a mutable field such as ByteBuffer will cause problems. Consider the following case:
private static final class BufferContainer {
    private ByteBuffer b = ByteBuffer.allocate(16);
}

@Test
public void shouldDemonstrateMutableFieldIssue2() {
    // create 2 instances of a marshallable POJO
    final BufferContainer c1 = new BufferContainer();
    final BufferContainer c2 = new BufferContainer();
    // reset both instances - this will set each container's
    // b field to a 'default' value
    Wires.reset(c1);
    Wires.reset(c2);
    // write to the buffer in c1
    c1.b.putInt(42);
    // inspect the buffer in both c1 and c2
    System.out.println(c1.b.position());
    System.out.println(c2.b.position());
    System.out.println(c1.b == c2.b);
}
The output of the test above is:
4
4
true
showing that the field b of each container object is now referencing the same ByteBuffer instance.
In order to work around this, if necessary, the marshallable class should implement ResetOverride:
private static final class BufferContainer implements ResetOverride {
    private ByteBuffer b = ByteBuffer.allocate(16);

    @Override
    public void onReset() {
        // or acquire from a pool if allocation should
        // be kept to a minimum
        b = ByteBuffer.allocate(16);
    }
}
While serialized data can be updated by replacing a whole record, this might not be the most efficient option, nor thread-safe.
Chronicle Wire offers the ability to bind a reference to a fixed value of a field, and perform atomic operations on that field; for example, volatile read/write, and compare-and-swap.
// field to cache the location and object used to reference a field.
private LongValue counter = null;
// find the field and bind an appropriate wrapper for the wire format.
wire.read(COUNTER).int64(counter, x -> counter = x);
// thread safe across processes on the same machine.
long id = counter.getAndAdd(1);
Other types are supported; for example, 32-bit integer values, and an array of 64-bit integer values.
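To make the idea of binding to a fixed location more concrete, here is a JDK-only sketch that performs atomic operations on an 8-byte slot inside a direct ByteBuffer. It is an illustration of the concept, not how Chronicle Wire implements LongValue: `BoundCounterSketch` is a hypothetical name, and the code assumes Java 9 or later for `VarHandle` and an 8-byte-aligned offset.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class BoundCounterSketch {
    // View of a ByteBuffer as 64-bit values, addressed by byte offset
    private static final VarHandle LONG_VIEW =
            MethodHandles.byteBufferViewVarHandle(long[].class, ByteOrder.nativeOrder());

    private final ByteBuffer memory;
    private final int offset; // assumed to be 8-byte aligned

    BoundCounterSketch(ByteBuffer memory, int offset) {
        this.memory = memory;
        this.offset = offset;
    }

    /** Atomically adds delta and returns the previous value. */
    long getAndAdd(long delta) {
        return (long) LONG_VIEW.getAndAdd(memory, offset, delta);
    }

    /** Volatile read of the current value. */
    long getVolatile() {
        return (long) LONG_VIEW.getVolatile(memory, offset);
    }

    /** Compare-and-swap on the bound slot. */
    boolean compareAndSet(long expected, long value) {
        return LONG_VIEW.compareAndSet(memory, offset, expected, value);
    }

    public static void main(String[] args) {
        ByteBuffer offHeap = ByteBuffer.allocateDirect(64); // zero-filled
        BoundCounterSketch counter = new BoundCounterSketch(offHeap, 8);
        long first = counter.getAndAdd(1);
        long second = counter.getAndAdd(1);
        System.out.println(first + " " + second + " " + counter.getVolatile());
    }
}
```

Because the counter lives at a fixed offset, the record around it never needs to be rewritten to update the value.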
Chronicle Wire is built on top of the Bytes library, however Bytes, in turn, can wrap:
- ByteBuffer - heap and direct
- byte[] - using ByteBuffer
- raw memory addresses.
This feature allows Chronicle Wire to de-serialize, manipulate, and serialize an instance class of an unknown type.
If the type is unknown at runtime, a proxy is created, assuming that the required type is an interface.
When the tuple is serialized, it will be given the same type as when it was deserialized, even if that class is not available.
Methods following our getter/setter convention will be treated as getters and setters.
This feature is needed for a service that stores and passes on data, for classes it might not have in its class path.
Note: This is not garbage collection free, but if the volume is low, this may be easier to work with.
Note: This only works when the expected type is not a class.
@Test
public void unknownType() throws NoSuchFieldException {
    Marshallable marshallable = Wires.tupleFor(Marshallable.class, "UnknownType");
    marshallable.setField("one", 1);
    marshallable.setField("two", 2.2);
    marshallable.setField("three", "three");
    String toString = marshallable.toString();
    assertEquals("!UnknownType {\n" +
            " one: !int 1,\n" +
            " two: 2.2,\n" +
            " three: three\n" +
            "}\n", toString);
    Object o = Marshallable.fromString(toString);
    assertEquals(toString, o.toString());
}
@Test
public void unknownType2() {
    String text = "!FourValues {\n" +
            " string: Hello,\n" +
            " num: 123,\n" +
            " big: 1e6,\n" +
            " also: extra\n" +
            "}\n";
    ThreeValues tv = Marshallable.fromString(ThreeValues.class, text);
    assertEquals(text, tv.toString());
    assertEquals("Hello", tv.string());
    tv.string("Hello World");
    assertEquals("Hello World", tv.string());
    assertEquals(123, tv.num());
    tv.num(1234);
    assertEquals(1234, tv.num());
    assertEquals(1e6, tv.big(), 0.0);
    tv.big(0.128);
    assertEquals(0.128, tv.big(), 0.0);
    assertEquals("!FourValues {\n" +
            " string: Hello World,\n" +
            " num: !int 1234,\n" +
            " big: 0.128,\n" +
            " also: extra\n" +
            "}\n", tv.toString());
}
interface ThreeValues {
    ThreeValues string(String s);
    String string();
    ThreeValues num(int n);
    int num();
    ThreeValues big(double d);
    double big();
}
@Test
public void testUnknownClass() {
    Wire wire2 = new TextWire(Bytes.elasticHeapByteBuffer(256));
    MRTListener writer2 = wire2.methodWriter(MRTListener.class);
    String text = "top: !UnknownClass {\n" +
            " one: 1,\n" +
            " two: 2.2,\n" +
            " three: words\n" +
            "}\n" +
            "---\n" +
            "top: {\n" +
            " one: 11,\n" +
            " two: 22.2,\n" +
            " three: many words\n" +
            "}\n" +
            "---\n";
    Wire wire = TextWire.from(text);
    MethodReader reader = wire.methodReader(writer2);
    assertTrue(reader.readOne());
    assertTrue(reader.readOne());
    assertFalse(reader.readOne());
    assertEquals(text, wire2.toString());
}
To support filtering, you need to make sure the first of multiple arguments can be used to filter the method call. If you have only one argument, you may need to add an additional argument to support efficient filtering.
This feature calls an implementation of MethodFilterOnFirstArg to see if the rest of the method call should be parsed. For example, today you have:
interface MyInterface {
    void method(ExpensiveDto dto);
}
This can be migrated to:
interface MyInterface extends MethodFilterOnFirstArg<String> {
    @Deprecated
    void method(ExpensiveDto dto);

    void method2(String filter, ExpensiveDto dto);
}
where the implementation can look like this:
// a class implements an interface; it cannot extend one
class MyInterfaceImpl implements MyInterface {
    public void method(ExpensiveDto dto) {
        // something
    }

    public void method2(String filter, ExpensiveDto dto) {
        method(dto);
    }

    public boolean ignoreMethodBasedOnFirstArg(String methodName, String filter) {
        return someConditionOn(methodName, filter);
    }
}
For an example, see net.openhft.chronicle.wire.MethodFilterOnFirstArgTest
Chronicle Wire can be used for:
- file headers
- TCP connection headers; where the optimal wire format that is actually used can be negotiated
- message/excerpt contents
- Chronicle Queue version 4.x and later
- the API for marshalling generated data types.
Simple Binary Encoding (SBE) is designed to be a more efficient replacement for FIX. It is not limited to FIX protocols, and can be easily extended by updating an XML schema. It is simple, binary, and it supports C++ and Java.
XML, when it first started, did not use XML for its own schema files, and it is not insignificant that SBE does not use SBE for its schema either. This is because SBE is not trying to be human readable. It uses XML which, though standard, is not designed to be human readable either. Chronicle considers it a limitation that SBE does not naturally lend itself to a human-readable form.
The encoding that SBE uses is similar to Chronicle Wire's binary format, with field numbers and fixed-width types.
SBE assumes the field types, which can be more compact than Chronicle Wire’s most similar option; though not as compact as others.
SBE has support for schema changes provided that the type of a field doesn’t change.
Message Pack is a packed binary wire format which also supports JSON for human readability and compatibility. It has many similarities to the binary (and JSON) formats of this library. Chronicle Wire is designed to be human readable first, based on YAML, and has a range of options to make it more efficient. The most extreme being fixed position binary.
Message Pack has support for embedded binary, whereas Chronicle Wire has support for comments and hints, to improve rendering for human consumption.
The documentation looks well thought out, and it is worth emulating.
Feature | Wire Text | Wire Binary | Protobuf | Cap’n Proto | SBE | FlatBuffers |
---|---|---|---|---|---|---|
Schema evolution | yes | yes | yes | yes | caveats | yes |
Zero-copy | yes | yes | no | yes | yes | yes |
Random-access reads | yes | yes | no | yes | no | yes |
Random-access writes | yes | yes | no | ? | no | ? |
Safe against malicious input | yes | yes | yes | yes | yes | opt-in / upfront |
Reflection / generic algorithms | yes | yes | yes | yes | yes | yes |
Initialization order | any | any | any | any | preorder | bottom-up |
Unknown field retention | yes | yes | yes | yes | no | no |
Object-capability RPC system | yes | yes | no | yes | no | no |
Schema language | no | no | custom | custom | XML | custom |
Usable as mutable state | yes | yes | yes | no | no | no |
Padding takes space on wire? | optional | optional | no | optional | yes | yes |
Unset fields take space on wire? | optional | optional | no | yes | yes | no |
Pointers take space on wire? | no | no | no | yes | no | yes |
C++ | planned | planned | yes | yes (C++11)* | yes | yes |
Java | Java 8 | Java 8 | yes | yes* | yes | yes |
C# | yes | yes | yes | yes* | yes | yes* |
Go | no | no | yes | yes | no | yes* |
Other languages | no | no | 6+ | others* | no | no |
Authors' preferred use case | distributed computing | financial / trading | distributed computing | platforms / sandboxing | financial / trading | games |
Note: The Binary YAML format can be automatically converted to YAML without any knowledge of the schema, because the messages are self-describing.
Note: You can parse all the expected fields (if any) and then parse any remaining fields. As YAML supports object field names (or keys), these could be strings or even objects as keys and values.
Note: It is not clear what padding that does not take up space on the wire means.
See https://capnproto.org/news/2014-06-17-capnproto-flatbuffers-sbe.html for a comparison to other encoders.
Wire optionally supports:
- field name changes
- field order changes
- capturing or ignoring unexpected fields
- setting of fields to the default, if not available
- raw messages can be longer or shorter than expected
The more flexibility, the larger the overhead in terms of CPU and memory. Chronicle Wire allows you to dynamically pick the optimal configuration, and convert between these options.
Chronicle Wire supports zero-copy random access to fields, and direct-copy from in-memory to the network. It also supports translation from one wire format to another. For example, switching between fixed-length data and variable-length data.
You can access a random field in memory. For example, in a 2 TB file, only the data relating to your read or write is paged in and pulled into the CPU cache.
format | access style |
---|---|
fixed-length binary | random access without parsing first |
variable-length binary | random access with partial parsing allowing you to skip large portions |
fixed-length text | random access with parsing |
variable-length text | no random access |
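The fixed-length binary case can be sketched with plain JDK classes: when every record has the same size, the position of any field is a simple calculation, so nothing before it has to be parsed. The record layout and the name `FixedLengthRecords` are assumptions made for this example, not a Chronicle Wire layout.

```java
import java.nio.ByteBuffer;

final class FixedLengthRecords {
    // Hypothetical layout: 8-byte id followed by 8-byte price
    static final int PRICE_OFFSET = 8;
    static final int RECORD_SIZE = 16;

    /** Writes record number 'index' using absolute positions. */
    static void write(ByteBuffer buf, int index, long id, double price) {
        int base = index * RECORD_SIZE;
        buf.putLong(base, id);
        buf.putDouble(base + PRICE_OFFSET, price);
    }

    /** Reads one field of one record without touching any other bytes. */
    static double readPrice(ByteBuffer buf, int index) {
        return buf.getDouble(index * RECORD_SIZE + PRICE_OFFSET);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(3 * RECORD_SIZE);
        write(buf, 0, 100L, 1.25);
        write(buf, 1, 101L, 2.50);
        write(buf, 2, 102L, 10.50);
        // Jump straight to the third record's price
        System.out.println(readPrice(buf, 2));
    }
}
```

With variable-length data the same read would first need to skip over the preceding records, which is why the table distinguishes the two.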
Chronicle Wire references are relative to the start of the data contained, to allow loading in an arbitrary point in memory.
Chronicle Wire has built-in tiers of bounds checks to prevent accidental reads or writes that corrupt the data. It is not complete enough for a security review.
Chronicle Wire supports generic reading and writing of an arbitrary stream. This can be used in combination with predetermined fields.
For example, you can read the fields you know about, and ask it to provide the fields that you do not.
You can also give generic field names like keys to a map as YAML does.
Chronicle Wire can handle unknown information like lengths, by using padding. It will go back and fill in any data that it was not aware of when it was writing the data. For example, when it writes an object, it does not know how long it is going to be, so it adds padding at the start. Once the object has been written, it goes back and overwrites the length. It can also handle situations where the length was more than needed; this is known as packing.
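The pad-then-back-fill technique described above can be sketched with a plain ByteBuffer. This is a simplified illustration with a fixed 4-byte length prefix; `LengthBackfill` is a hypothetical name, and packing (shrinking an over-sized length field) is not shown.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class LengthBackfill {
    /** Writes a 4-byte length placeholder, then the body, then fills in the length. */
    static void writeLengthPrefixed(ByteBuffer out, String body) {
        int lengthPos = out.position();
        out.putInt(0);                                   // padding: length not yet known
        int start = out.position();
        out.put(body.getBytes(StandardCharsets.UTF_8));  // a real writer would stream fields here
        out.putInt(lengthPos, out.position() - start);   // go back and overwrite the length
    }

    /** Reads the length first, so the body can be consumed or skipped as a unit. */
    static String readLengthPrefixed(ByteBuffer in) {
        int length = in.getInt();
        byte[] body = new byte[length];
        in.get(body);
        return new String(body, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(64);
        writeLengthPrefixed(buf, "Hello World");
        buf.flip();
        System.out.println(readLengthPrefixed(buf));
    }
}
```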
Chronicle Wire can read data that it did not expect, interspersed with data it did expect. Rather than specify the expected field name, a StringBuilder is provided.
Note: There are times when you want to skip/copy an entire field or message, without reading any more of it. This is also supported.
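The pattern of reading a field name into a StringBuilder and then deciding what to do with it can be illustrated without the library. The sketch below parses simple `name: value` lines, handles the fields it knows in any order, and retains the rest. `UnknownFieldReader` and its line format are assumptions for this example; it is not a YAML parser and does not use the Chronicle Wire API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class UnknownFieldReader {
    String message;
    long number;
    // Fields the reader did not expect, kept in arrival order
    final Map<String, String> unexpected = new LinkedHashMap<>();

    static UnknownFieldReader parse(String text) {
        UnknownFieldReader result = new UnknownFieldReader();
        StringBuilder name = new StringBuilder(); // reused for every field name
        for (String line : text.split("\n")) {
            int colon = line.indexOf(':');
            if (colon < 0) {
                continue; // not a field line
            }
            name.setLength(0);
            name.append(line, 0, colon);
            String key = name.toString().trim();
            String value = line.substring(colon + 1).trim();
            switch (key) {
                case "message":
                    result.message = value;
                    break;
                case "number":
                    result.number = Long.parseLong(value);
                    break;
                default:
                    result.unexpected.put(key, value); // capture rather than fail
            }
        }
        return result;
    }

    public static void main(String[] args) {
        UnknownFieldReader r = parse("number: 7\nalso: extra\nmessage: Hello World\n");
        System.out.println(r.message + " " + r.number + " " + r.unexpected);
    }
}
```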
Chronicle Wire supports references based on name, number, or UUID. This is useful when including a reference to an object that the reader should look up by other means.
A common case is if you have a proxy to a remote object, and you want to pass or return this in an RPC call.
Chronicle Wire’s schema is not externalised from the code. However, it is planned to use YAML in a format that it can parse.
Chronicle Wire supports storing an application’s internal state. This will not allow it to grow or shrink. You can’t free any of it without copying the pieces that you need, and discarding the original copy.
The Chronicle Wire format that is chosen determines if there is any padding on the wire. If you copy the in-memory data directly, its format does not change.
If you want to drop padding, you can copy the message to a wire format without padding. You can decide whether the original padding is to be preserved or not, if turned back into a format with padding.
We could look at supporting Cap’n Proto's zero-byte removal compression.
Chronicle Wire supports messages with, and without, optional fields, and provides automatic means of removing them. Chronicle Wire does not support automatically adding them back in, because information has been lost.
Chronicle Wire does not have pointers, but it does have content-lengths which are a useful hint for random access and robustness; but these are optional.
Chronicle Wire supports Java 8. Future versions may support Java 9, C++, and C#.