Cleaned up Specification a bit
Kevin Krone authored and Kevin Krone committed Jun 1, 2023
1 parent ce7889b commit fb6c9cd
Showing 1 changed file with 23 additions and 44 deletions.
67 changes: 23 additions & 44 deletions content/docs/specification.md
@@ -2,7 +2,7 @@
title: "Specification"
weight: 4
# bookFlatSection: false
# bookToc: true
bookToc: false
# bookHidden: false
# bookCollapseSection: false
# bookComments: false
@@ -12,50 +12,29 @@ math: true

# Specification

SCALE defines encodings for the most elementary types. Encodings for more complex types are obtained by concatenating the encodings of their constituents – that is, the simple types that form these respective compound types. Encodings for variable-length types have length data prepended.
SCALE defines encodings for native Rust types, and constructs encodings for composite types, such as structs, by concatenating the encodings of their constituents – that is, the elementary types that form these respective complex types. Additionally, some variable-length types are encoded with their length prefixed. In this way, the encoding of any type can be simplified to the concatenation of encodings of less complex types.

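The concatenation rule described above can be sketched in plain Rust without any codec crate. This is a minimal illustration, not the reference implementation: `MyStruct` mirrors the struct row of the table, while `compact_len_small`, `encode_str`, and `encode_my_struct` are hypothetical helper names introduced here. The length prefix only covers the single-byte compact mode (lengths below 64), which is enough for the short strings used in the examples.

```rust
/// Hypothetical struct mirroring the `MyStruct` example in the table.
struct MyStruct {
    id: u8,
    is_val: bool,
    msg: String,
}

/// Compact length prefix, single-byte mode only (assumption: len < 64).
/// The value sits in the upper six bits; the two low bits `00` mark the mode.
fn compact_len_small(len: usize) -> u8 {
    assert!(len < 64, "this sketch only covers single-byte mode");
    (len as u8) << 2
}

/// A string is encoded like `Vec<u8>`: compact byte length, then the UTF-8 bytes.
fn encode_str(s: &str, out: &mut Vec<u8>) {
    out.push(compact_len_small(s.len()));
    out.extend_from_slice(s.as_bytes());
}

/// A struct is the concatenation of its fields' encodings, in declaration order.
fn encode_my_struct(v: &MyStruct) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&v.id.to_le_bytes()); // fixed-width little-endian integer
    out.push(v.is_val as u8);                   // bool: 0x00 or 0x01
    encode_str(&v.msg, &mut out);               // length-prefixed string
    out
}
```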
This table offers a concise overview of the SCALE codec. For more detailed, hands-on explanations, please refer to the [encode section]({{< ref "/docs/encode" >}}). For the formal specification, please refer to the [Polkadot specification](https://spec.polkadot.network/id-cryptography-encoding#sect-scale-codec). Both the intermediary hexadecimal representation and the final byte-array SCALE encoding are provided to enhance readability.
This table offers a concise overview of the SCALE codec with examples. For more detailed, hands-on explanations, please refer to the [encode section]({{< ref "/docs/encode" >}}). For the formal specification, please refer to the relevant page of the [Polkadot specification](https://spec.polkadot.network/id-cryptography-encoding#sect-scale-codec).

| Data type | Encoding Description | SCALE decoded value | SCALE encoded value |
| -- | -- | -- | -- |
| Unit | Encoded as an empty byte array. | `()` | `` |
| | | | `[]` |
| Boolean | Encoded using the least significant bit of a single byte. | `true` | `0x01` |
| | | | `[01]` |
| | | `false`| `0x00` |
| | | | `[00]` |
| Integer | By default integers are encoded using a fixed-width little-endian format. | `69i8` | `0x45`|
| | | | `[45]` |
| | | `69i16`| `0x4500`|
| | | | `[00, 45]` |
| | | `42u16`| `0x2a00`|
| | | | `[00, 2a]` |
| | Non-negative integers $n$ also have a compact encoding. There are four modes.| | | |
| | Single-byte mode: Upper six bits are the LE encoding of the value. For $0 \leq n \leq 2^6 - 1$. |`0u8` | `0x0`|
| | | | `[00]` |
| | Two-byte mode: Upper six bits and the following byte is the LE encoding of the value. For $2^6 \leq n \leq 2^{14} - 1$. |`69u8` | `0x1501`|
| | | | `[01, 15]` |
| | Four-byte mode: Upper six bits and the following three bytes are the LE encoding of the value. For $2^{14} \leq n \leq 2^{30} - 1$. |`65535u32` | `0xfeff0300`|
| | | | `[00, 03, ff, fe]` |
| | Big-integer mode: The upper six bits are the number of bytes following, plus four. The value is contained, LE encoded, in the bytes following. The final (most significant) byte must be non-zero. For $2^{30} \leq n \leq 2^{536} - 1$. |`65535u32` | `0xfeff0300`|
| | | | `[00, 03, ff, fe]` |
| Vector | Encoded by concatening the encodings of its items and prepending with the compactly encoded length of the vector. |`vec![1u8, 2u8, 4u8]` | `0x0c010204`|
| | | | `[04, 02, 01, 0c]`|
| String | Encoded as `Vec<u8>` with UTF-8 characters. | `"SCALE♡"` | `0x205343414c45e299a1` |
| | | | `[a1, 99, e2, 45, 4c, 41, 43, 53, 20]`|
| Tuple, Struct, Array | All three types are encoded by concatenating the encodings of their respective elements consecutively. |`(1u16, true, "OK")` | `0x010001084f4b`|
| | | | `[4b, 4f, 08, 01, 00, 01]`|
| | | `MyStruct{id: 1u16, is_val: true, msg: "OK"}`| `0x010001084f4b` |
| | | | `[4b, 4f, 08, 01, 00, 01]`|
| | |`[64u16, 512u16]` | `0x40000002`|
| | | | `[02, 00, 00, 40]`|
| Result | Results are encoded by prepending the encoded inner value with `0x00` if the operation was successful and `0x01` if the operation was unsuccessful. |`Ok(42)` | `0x002a`|
| | | | `[2a, 00]` |
| | |`Err(false)` | `0x0100`|
| | | | `[00, 01]`|
| Option | Options are encoded by prepending the inner encoded value of `Some` with `0x01` and encoding `None` as `0x00`. |`Some(69u8)` | `0x0100`|
| | | | `[00, 01]`|
| | |`None` | `0x00`|
| | | | `[00]`|
| Enum | Enums are encoded by prepending the relevant u8 index, followed by the value if present. | `Example::Second(8u16)` | `0x010800` |
| | | | `[00, 08, 01]`|
| Unit | Encoded as an empty byte array. | `()` | `[]` |
| Boolean | Encoded using the least significant bit of a single byte. | `true` | `[01]` |
| | | `false`| `[00]` |
| Integer | By default integers are encoded using a fixed-width little-endian format. | `69i8` | `[45]` |
| | | `69u32`| `[45, 00, 00, 00]`|
| | Unsigned integers $n$ also have a compact encoding. There are four modes. | | |
| | Single-byte mode: Upper six bits are the LE encoding of the value. For $0 \leq n \leq 2^6 - 1$. |`0u8` | `[00]` |
| | Two-byte mode: Upper six bits and the following byte are the LE encoding of the value. For $2^6 \leq n \leq 2^{14} - 1$. |`69u8` | `[15, 01]` |
| | Four-byte mode: Upper six bits and the following three bytes are the LE encoding of the value. For $2^{14} \leq n \leq 2^{30} - 1$. |`65535u32` | `[fe, ff, 03, 00]` |
| | Big-integer mode: The upper six bits are the number of bytes following, plus four. The value is contained, LE encoded, in the bytes following. The final (most significant) byte must be non-zero. For $2^{30} \leq n \leq 2^{536} - 1$. |`1073741824u64` | `[03, 00, 00, 00, 40]` |
| Vector | Encoded by concatenating the encodings of its items and prefixing with the compactly encoded length of the vector. |`vec![1u8, 2u8, 4u8]` | `[0c, 01, 02, 04]` |
| String | Encoded as `Vec<u8>` with UTF-8 characters. | `"SCALE♡"` | `[20, 53, 43, 41, 4c, 45, e2, 99, a1]` |
| Tuple, Struct, Array | Encoded by concatenating the encodings of their respective elements consecutively. |`(1u8, true, "OK")` | `[01, 01, 08, 4f, 4b]` |
| | | `MyStruct{id: 1u8, is_val: true, msg: "OK"}`| `[01, 01, 08, 4f, 4b]` |
| | |`[64u16, 512u16]` | `[40, 00, 00, 02]` |
| Result | Encoded by prefixing the encoded inner value with `0x00` if the operation was successful, and `0x01` if the operation was unsuccessful. |`Ok::<u32, ()>(42u32)` | `[00, 2a, 00, 00, 00]` |
| | |`Err::<u32, ()>(())` | `[01]` |
| Option | Encoded by prefixing the inner encoded value of `Some` with `0x01` and encoding `None` as `0x00`. |`Some(69u8)` | `[01, 45]` |
| | | `None::<u8>` | `[00]` |
| Enum | Encoded by the `u8`-index of the respective variant, followed by the encoded value if it is present. | `Example::Second(8u16)` | `[01, 08, 00]`|

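The four compact modes listed in the table can be sketched as a single function. This is an illustrative sketch under stated assumptions, not the codec's actual implementation: `compact_encode` is a hypothetical name, and it takes a `u128`, so it covers only part of the big-integer range (the table allows values up to $2^{536} - 1$). The two low bits of the first byte select the mode: `00`, `01`, `10`, or `11`.

```rust
/// Compact-encode an unsigned integer (sketch; limited to values that fit in u128).
fn compact_encode(n: u128) -> Vec<u8> {
    if n < (1 << 6) {
        // Single-byte mode: value in the upper six bits, mode bits 00.
        vec![(n as u8) << 2]
    } else if n < (1 << 14) {
        // Two-byte mode: value shifted left by two, mode bits 01, little-endian.
        (((n as u16) << 2) | 0b01).to_le_bytes().to_vec()
    } else if n < (1 << 30) {
        // Four-byte mode: value shifted left by two, mode bits 10, little-endian.
        (((n as u32) << 2) | 0b10).to_le_bytes().to_vec()
    } else {
        // Big-integer mode: the upper six bits of the first byte hold
        // (number of following bytes - 4), mode bits 11; the value follows
        // in little-endian order with no trailing zero byte.
        let bytes = n.to_le_bytes();
        let len = 16 - (n.leading_zeros() as usize) / 8; // significant bytes, at least 4 here
        let mut out = vec![(((len - 4) as u8) << 2) | 0b11];
        out.extend_from_slice(&bytes[..len]);
        out
    }
}
```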