Ripped directly from @jtgrassie 's monero-binary-rpc repo. It's a very helpful little document and I think it deserves a place in the main repo.
Portable Storage Format
Background
Monero makes use of a set of helper classes from a small library named epee. Part of this library implements a networking protocol called Levin, which internally uses a storage format called Portable Storage. This format, like the rest of the epee library, is undocumented, or rather relies on the code itself to serve as the documentation. Unfortunately, whilst the rest of the library is fairly straightforward to decipher, Portable Storage is less so. Hence this document.
Preliminaries
String and integer encoding
varint
Varints are used to pack integers in a portable and space-optimized way. The lowest 2 bits store the number of bytes required, which leaves 6 bits for the value in a 1-byte varint, so the largest integer that can be packed into 1 byte is 63.
Lowest 2 bits | Encoded size | Value range |
---|---|---|
b00 | 1 byte | 0 to 63 |
b01 | 2 bytes | 64 to 16383 |
b10 | 4 bytes | 16384 to 1073741823 |
b11 | 8 bytes | 1073741824 to 4611686018427387903 |
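The table above can be sketched in C as follows. This is a minimal illustration, not the epee implementation: the function names `varint_pack` and `varint_unpack` are hypothetical, and the sketch assumes the value is shifted left by 2 bits, OR'ed with the size marker, and written in little-endian byte order (the order Monero uses for integers).

```c
#include <stddef.h>
#include <stdint.h>

/* Packs v into out (which must hold at least 8 bytes).
 * Returns the number of bytes written, or 0 if v does not fit in 62 bits.
 * Assumption: little-endian layout with the size marker in the low 2 bits. */
size_t varint_pack(uint64_t v, uint8_t *out)
{
    uint64_t mark;
    size_t len;

    if (v <= 63u)                          { mark = 0; len = 1; }
    else if (v <= 16383u)                  { mark = 1; len = 2; }
    else if (v <= 1073741823u)             { mark = 2; len = 4; }
    else if (v <= 4611686018427387903ull)  { mark = 3; len = 8; }
    else return 0;

    uint64_t raw = (v << 2) | mark;
    for (size_t i = 0; i < len; i++)
        out[i] = (uint8_t)(raw >> (8 * i));
    return len;
}

/* Unpacks a varint from in (avail bytes readable) into *v.
 * Returns the number of bytes consumed, or 0 if the buffer is too short. */
size_t varint_unpack(const uint8_t *in, size_t avail, uint64_t *v)
{
    if (avail == 0)
        return 0;

    /* Marker 0, 1, 2, 3 maps to 1, 2, 4, 8 bytes. */
    size_t len = (size_t)1 << (in[0] & 3);
    if (avail < len)
        return 0;

    uint64_t raw = 0;
    for (size_t i = 0; i < len; i++)
        raw |= (uint64_t)in[i] << (8 * i);

    *v = raw >> 2;  /* drop the size marker */
    return len;
}
```

Under these assumptions, 63 packs to the single byte `0xfc` and 64 packs to the two bytes `0x01 0x01`.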
string
These are simply length (varint) prefixed char strings.
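A rough sketch of reading such a string is shown below. The names `string_read` and `str_read_varint` are hypothetical, and the varint reader assumes the little-endian layout used by Monero. It is repeated in compact form only so the example compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Compact varint reader (assumes little-endian, size marker in low 2 bits). */
static size_t str_read_varint(const uint8_t *in, size_t avail, uint64_t *v)
{
    if (avail == 0)
        return 0;
    size_t len = (size_t)1 << (in[0] & 3);
    if (avail < len)
        return 0;
    uint64_t raw = 0;
    for (size_t i = 0; i < len; i++)
        raw |= (uint64_t)in[i] << (8 * i);
    *v = raw >> 2;
    return len;
}

/* Reads a varint-prefixed string from in into dst and NUL-terminates it.
 * Returns the number of input bytes consumed, or 0 if the input is
 * truncated or dst is too small. */
size_t string_read(const uint8_t *in, size_t avail, char *dst, size_t dst_size)
{
    uint64_t n;
    size_t used = str_read_varint(in, avail, &n);

    if (used == 0 || n > avail - used || n >= dst_size)
        return 0;

    memcpy(dst, in + used, (size_t)n);
    dst[n] = '\0';
    return used + (size_t)n;
}
```

The NUL terminator is a convenience for printable text only. Strings in this format may carry arbitrary bytes, so a real reader would keep the length alongside the data.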
Packet format
Header
A packet starts with a header:
Header | Type | Value |
---|---|---|
Signature | 8 bytes | 0x0111010101010201 |
Version | byte | 0x01 |
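As a small sketch, a reader might validate the header like this before parsing anything else. The names `PS_SIGNATURE`, `PS_VERSION`, `PS_HEADER_SIZE` and `header_is_valid` are hypothetical, and the sketch assumes the signature bytes appear on the wire in the order they are written in the table.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: wire order matches the hex value shown in the table. */
static const uint8_t PS_SIGNATURE[8] = {
    0x01, 0x11, 0x01, 0x01, 0x01, 0x01, 0x02, 0x01
};
#define PS_VERSION     0x01
#define PS_HEADER_SIZE 9   /* 8 signature bytes + 1 version byte */

/* Returns 1 if buf starts with a valid header, 0 otherwise. */
int header_is_valid(const uint8_t *buf, size_t len)
{
    if (len < PS_HEADER_SIZE)
        return 0;
    if (memcmp(buf, PS_SIGNATURE, sizeof PS_SIGNATURE) != 0)
        return 0;
    return buf[8] == PS_VERSION;
}
```

On success, the root section would begin at offset `PS_HEADER_SIZE`.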
Section
Next we have a root object (or section as the library calls it). This is a map of name-value pairs called entries. It starts with a count:
Section | Type |
---|---|
Entry count | varint |
The count is followed by the section's name-value entries, stored sequentially:
Entry
Entry | Type |
---|---|
Name | string [1] |
Type | byte |
Count [2] | varint |
Value(s) | (type-dependent data) |
[1] Note that the string used for the entry name is not prefixed with a varint; it is prefixed with a single byte that specifies the length of the name. This means an entry name cannot be more than 255 chars, which seems a reasonable restriction.
[2] Note that this is only present if the entry type has the array flag (see below).
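To illustrate the start of an entry, the sketch below reads the name (with its single-byte length prefix) and the type byte. The struct `entry_head` and the function `entry_head_read` are hypothetical names. The optional count and the value(s) are left to the caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct entry_head {
    char    name[256];  /* at most 255 chars plus a NUL terminator */
    uint8_t type;       /* raw type byte, possibly with the array flag set */
};

/* Reads the entry name and type byte from in (avail bytes readable).
 * Returns the number of bytes consumed, or 0 if the input is truncated. */
size_t entry_head_read(const uint8_t *in, size_t avail, struct entry_head *e)
{
    if (avail < 1)
        return 0;

    size_t name_len = in[0];            /* single-byte length, not a varint */
    if (avail < 1 + name_len + 1)
        return 0;

    memcpy(e->name, in + 1, name_len);
    e->name[name_len] = '\0';
    e->type = in[1 + name_len];
    return 1 + name_len + 1;
}
```

For example, the bytes `06 68 65 69 67 68 74 05` would yield the name `height` and the type byte 5.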
Entry types
The types defined are:
#define SERIALIZE_TYPE_INT64 1
#define SERIALIZE_TYPE_INT32 2
#define SERIALIZE_TYPE_INT16 3
#define SERIALIZE_TYPE_INT8 4
#define SERIALIZE_TYPE_UINT64 5
#define SERIALIZE_TYPE_UINT32 6
#define SERIALIZE_TYPE_UINT16 7
#define SERIALIZE_TYPE_UINT8 8
#define SERIALIZE_TYPE_DUOBLE 9
#define SERIALIZE_TYPE_STRING 10
#define SERIALIZE_TYPE_BOOL 11
#define SERIALIZE_TYPE_OBJECT 12
#define SERIALIZE_TYPE_ARRAY 13
The entry type can be bitwise OR'ed with a flag:
#define SERIALIZE_FLAG_ARRAY 0x80
This signals there are multiple values for the entry. When we are dealing with an array, the next value is a varint specifying the array length followed by the array item values. For example:
name, type, count, value_1, value_2, ..., value_n
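Separating the array flag from the base type is a simple mask operation. A minimal sketch, with the hypothetical helper names `entry_is_array` and `entry_base_type`:

```c
#include <stdint.h>

#ifndef SERIALIZE_FLAG_ARRAY
#define SERIALIZE_FLAG_ARRAY 0x80
#endif

/* Returns 1 if the type byte carries the array flag, 0 otherwise. */
int entry_is_array(uint8_t type_byte)
{
    return (type_byte & SERIALIZE_FLAG_ARRAY) != 0;
}

/* Returns the type byte with the array flag cleared. */
uint8_t entry_base_type(uint8_t type_byte)
{
    return (uint8_t)(type_byte & 0x7f);
}
```

A type byte of `0x8a`, for instance, decodes as an array whose base type is 10 (`SERIALIZE_TYPE_STRING`), so a count varint follows before the values.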
Entry values
It's important to understand that entry values can be encoded any way in which an implementation chooses. For example, the integers can be in either big or little endian byte order.
Entry values which are objects (i.e. SERIALIZE_TYPE_OBJECT) are stored as sections.
Note, I have not yet seen the type SERIALIZE_TYPE_ARRAY in use. My assumption is this would be used for untyped arrays and so subsequent entries could be of any type.
Monero specifics
Entry values
Strings
These are prefixed with a varint to specify the string length.
Integers
These are stored in little-endian byte order.
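Reading such integers byte by byte keeps the code independent of the host's own endianness. A short sketch, with the hypothetical helper names `load_le32` and `load_le64`:

```c
#include <stdint.h>

/* Loads a 32-bit unsigned integer stored in little-endian byte order. */
uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Loads a 64-bit unsigned integer stored in little-endian byte order. */
uint64_t load_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}
```

The bytes `78 56 34 12`, for example, load as `0x12345678`.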
Hashes, Keys, Blobs
These are stored as strings (SERIALIZE_TYPE_STRING).
STL containers (vector, list)
These can be arrays of standard integer types, strings, or SERIALIZE_TYPE_OBJECTs for structs.