raft: Don't keep full json objects in memory if no longer needed.
Raft log entries (and the raft database snapshot) contain json objects
of the data.  A follower receives append requests with data that gets
parsed and added to the raft log.  The leader receives execution
requests, parses data out of them and adds it to the log.  In both
cases, ovsdb-server later reads the log with ovsdb_storage_read(),
constructs a transaction and updates the database.  On followers these
json objects are, in the common case, never used again.  The leader may
use them to send append requests or snapshot installation requests to
followers.  However, all these operations (except for
ovsdb_storage_read()) just serialize the json in order to send it over
the network.

Json objects are significantly larger than their serialized string
representation.  For example, the snapshot of the database from one of
the ovn-heater scale tests takes 270 MB as a string, but 1.6 GB as
a json object, out of the total 3.8 GB consumed by the ovsdb-server
process.
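
As a rough illustration of why the in-memory form is larger, the sketch
below compares a toy tagged-union tree with its text form for an array of
single-digit integers.  The 'toy_node' layout is hypothetical and is not
the real 'struct json'; allocator overhead is ignored, and the actual
ratio depends on the data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tagged-union node, loosely modelled on a JSON tree. */
enum toy_type { TOY_INTEGER, TOY_STRING, TOY_ARRAY };

struct toy_node {
    enum toy_type type;
    union {
        long long integer;
        char *string;
        struct {
            struct toy_node **elems;
            size_t n;
        } array;
    } u;
};

/* Bytes held by an array of 'n' single-digit integers in tree form:
 * one array node, 'n' element pointers and 'n' integer nodes. */
static size_t
tree_bytes(size_t n)
{
    return sizeof(struct toy_node)
           + n * sizeof(struct toy_node *)
           + n * sizeof(struct toy_node);
}

/* Bytes of the same array as text, e.g. "[1,2,3]":
 * 'n' digits, 'n' - 1 commas and two brackets. */
static size_t
text_bytes(size_t n)
{
    return n ? 2 * n + 1 : 2;
}
```

For 1000 elements, text_bytes() gives 2001 bytes, while tree_bytes() is at
least several times larger on any common ABI.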

ovsdb_storage_read() for a given raft entry happens only once in its
lifetime, so after this call we can serialize the json object, store
the string representation and free the actual json object that ovsdb
will never need again.  This can save a lot of memory and can also
save serialization time, because each raft entry is serialized only
once for append requests and snapshot installation requests, instead
of every time such a request needs to be sent.
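
The read-once, serialize-once idea can be sketched in isolation.  The
following is a minimal, self-contained model and not the OVS code: the
'toy_json' and 'toy_entry' types and every 'toy_*' function are
hypothetical stand-ins, the "json" is just an array of ints, and
allocation failures are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a parsed JSON value: an array of ints. */
struct toy_json {
    size_t n;
    int *elems;
};

/* Toy stand-in for a raft entry holding both representations. */
struct toy_entry {
    struct toy_json *parsed;   /* Large in-memory form; may be stolen. */
    char *serialized;          /* Compact string; created lazily, once. */
    unsigned n_serializations; /* How many times we really serialized. */
};

static struct toy_json *
toy_json_create(const int *elems, size_t n)
{
    struct toy_json *j = malloc(sizeof *j);
    j->n = n;
    j->elems = malloc(n * sizeof *j->elems);
    memcpy(j->elems, elems, n * sizeof *j->elems);
    return j;
}

static void
toy_json_destroy(struct toy_json *j)
{
    if (j) {
        free(j->elems);
        free(j);
    }
}

/* Serializes as "[1,2,3]".  Each int needs at most 11 characters plus a
 * comma; brackets and the terminating NUL take 3 more bytes. */
static char *
toy_json_to_string(const struct toy_json *j)
{
    size_t cap = j->n * 12 + 3;
    char *s = malloc(cap);
    size_t len = 0;

    s[len++] = '[';
    for (size_t i = 0; i < j->n; i++) {
        len += snprintf(s + len, cap - len, "%s%d",
                        i ? "," : "", j->elems[i]);
    }
    s[len++] = ']';
    s[len] = '\0';
    return s;
}

/* Returns the string form, serializing only on the first call. */
static const char *
toy_entry_get_serialized(struct toy_entry *e)
{
    if (!e->serialized && e->parsed) {
        e->serialized = toy_json_to_string(e->parsed);
        e->n_serializations++;
    }
    return e->serialized;
}

/* Hands the parsed form to the caller, who now owns it.  The string form
 * is created first, so the entry can still be sent to other servers.
 * Later calls return NULL. */
static struct toy_json *
toy_entry_steal_parsed(struct toy_entry *e)
{
    toy_entry_get_serialized(e);

    struct toy_json *j = e->parsed;
    e->parsed = NULL;
    return j;
}

static void
toy_entry_uninit(struct toy_entry *e)
{
    toy_json_destroy(e->parsed);
    free(e->serialized);
}
```

After toy_entry_steal_parsed() the entry keeps only the compact string,
and any number of later toy_entry_get_serialized() calls reuse it, which
is the behaviour the paragraph above describes for raft entries.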

JSON_SERIALIZED_OBJECT can be used in order to seamlessly integrate
pre-serialized data into raft_header and similar json objects.
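
To illustrate what "seamlessly integrate" means here, the sketch below
models a serializer that copies an already-serialized member verbatim
into the enclosing object instead of walking a tree.  All 'toy_*' names
are hypothetical, this is not the OVS json API, and member names are
assumed to need no escaping.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum toy_type { TOY_INTEGER, TOY_SERIALIZED };

struct toy_value {
    enum toy_type type;
    union {
        long long integer;
        const char *serialized;  /* Already valid JSON text. */
    } u;
};

struct toy_member {
    const char *name;
    struct toy_value value;
};

/* Appends 's' to 'buf' if it fits; always advances '*len' by strlen(s)
 * so that the caller can detect truncation. */
static void
toy_append(char *buf, size_t size, size_t *len, const char *s)
{
    size_t n = strlen(s);

    if (*len + n < size) {
        memcpy(buf + *len, s, n + 1);
    }
    *len += n;
}

/* Writes {"name":value,...} into 'buf'.  Returns the length the full
 * output needs; a result >= 'size' means the output was truncated. */
static size_t
toy_object_to_string(const struct toy_member *m, size_t n,
                     char *buf, size_t size)
{
    size_t len = 0;
    char num[32];

    if (size) {
        buf[0] = '\0';
    }
    toy_append(buf, size, &len, "{");
    for (size_t i = 0; i < n; i++) {
        toy_append(buf, size, &len, i ? ",\"" : "\"");
        toy_append(buf, size, &len, m[i].name);
        toy_append(buf, size, &len, "\":");
        if (m[i].value.type == TOY_SERIALIZED) {
            /* Pre-serialized data: copied as is, no tree walk. */
            toy_append(buf, size, &len, m[i].value.u.serialized);
        } else {
            snprintf(num, sizeof num, "%lld", m[i].value.u.integer);
            toy_append(buf, size, &len, num);
        }
    }
    toy_append(buf, size, &len, "}");
    return len;
}
```

A header with an integer "prev_term" member and a pre-serialized
"prev_data" member then serializes in one pass, with the snapshot text
embedded unchanged.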

One major special case is the creation of a database snapshot.
A snapshot installation request received over the network will be
parsed and read by ovsdb-server just like any other raft log entry.
However, snapshots created locally with raft_store_snapshot() will
never be read back, because they reflect the current state of the
database and hence are already applied.  For this case we can free the
json object right after writing the snapshot to disk.

Tests performed with ovn-heater on a 60-node density-light scenario,
where the on-disk database grows up to 97 MB, show that the average
memory consumption of ovsdb-server Southbound DB processes decreased
by 58% (from 602 MB to 256 MB per process) and the peak memory
consumption decreased by 40% (from 1288 MB to 771 MB).

A test with 120 nodes on the density-heavy scenario with a 270 MB
on-disk database shows a 1.5 GB decrease in memory consumption, as
expected.  Also, the total CPU time consumed by the Southbound DB
process dropped from 296 to 256 minutes, and the number of unreasonably
long poll intervals dropped from 2896 to 1934.

Acked-by: Dumitru Ceara <[email protected]>
Acked-by: Han Zhou <[email protected]>
Signed-off-by: Ilya Maximets <[email protected]>
igsilya committed Aug 31, 2021
1 parent b0bca6f commit 0de8829
Showing 6 changed files with 160 additions and 64 deletions.
11 changes: 7 additions & 4 deletions ovsdb/ovsdb-tool.c
@@ -919,7 +919,8 @@ print_raft_header(const struct raft_header *h,
         if (!uuid_is_zero(&h->snap.eid)) {
             printf(" prev_eid: %04x\n", uuid_prefix(&h->snap.eid, 4));
         }
-        print_data("prev_", h->snap.data, schemap, names);
+        print_data("prev_", raft_entry_get_parsed_data(&h->snap),
+                   schemap, names);
     }
 }

@@ -973,11 +974,13 @@ raft_header_to_standalone_log(const struct raft_header *h,
                               struct ovsdb_log *db_log_data)
 {
     if (h->snap_index) {
-        if (!h->snap.data || json_array(h->snap.data)->n != 2) {
+        const struct json *data = raft_entry_get_parsed_data(&h->snap);
+
+        if (!data || json_array(data)->n != 2) {
             ovs_fatal(0, "Incorrect raft header data array length");
         }
 
-        struct json_array *pa = json_array(h->snap.data);
+        struct json_array *pa = json_array(data);
         struct json *schema_json = pa->elems[0];
         struct ovsdb_error *error = NULL;
 
@@ -1373,7 +1376,7 @@ do_check_cluster(struct ovs_cmdl_context *ctx)
                 }
                 struct raft_entry *e = &s->entries[log_idx];
                 e->term = r->term;
-                e->data = r->entry.data;
+                raft_entry_set_parsed_data_nocopy(e, r->entry.data);
                 e->eid = r->entry.eid;
                 e->servers = r->entry.servers;
                 break;
95 changes: 83 additions & 12 deletions ovsdb/raft-private.c
@@ -18,11 +18,14 @@
 
 #include "raft-private.h"
 
+#include "coverage.h"
 #include "openvswitch/dynamic-string.h"
 #include "ovsdb-error.h"
 #include "ovsdb-parser.h"
 #include "socket-util.h"
 #include "sset.h"
 
+COVERAGE_DEFINE(raft_entry_serialize);
+
 /* Addresses of Raft servers. */
 
@@ -281,7 +284,8 @@ void
 raft_entry_clone(struct raft_entry *dst, const struct raft_entry *src)
 {
     dst->term = src->term;
-    dst->data = json_nullable_clone(src->data);
+    dst->data.full_json = json_nullable_clone(src->data.full_json);
+    dst->data.serialized = json_nullable_clone(src->data.serialized);
     dst->eid = src->eid;
     dst->servers = json_nullable_clone(src->servers);
     dst->election_timer = src->election_timer;
@@ -291,7 +295,8 @@ void
 raft_entry_uninit(struct raft_entry *e)
 {
     if (e) {
-        json_destroy(e->data);
+        json_destroy(e->data.full_json);
+        json_destroy(e->data.serialized);
         json_destroy(e->servers);
     }
 }
@@ -301,8 +306,9 @@ raft_entry_to_json(const struct raft_entry *e)
 {
     struct json *json = json_object_create();
     raft_put_uint64(json, "term", e->term);
-    if (e->data) {
-        json_object_put(json, "data", json_clone(e->data));
+    if (raft_entry_has_data(e)) {
+        json_object_put(json, "data",
+                        json_clone(raft_entry_get_serialized_data(e)));
         json_object_put_format(json, "eid", UUID_FMT, UUID_ARGS(&e->eid));
     }
     if (e->servers) {
@@ -323,9 +329,10 @@ raft_entry_from_json(struct json *json, struct raft_entry *e)
     struct ovsdb_parser p;
     ovsdb_parser_init(&p, json, "raft log entry");
     e->term = raft_parse_required_uint64(&p, "term");
-    e->data = json_nullable_clone(
+    raft_entry_set_parsed_data(e,
         ovsdb_parser_member(&p, "data", OP_OBJECT | OP_ARRAY | OP_OPTIONAL));
-    e->eid = e->data ? raft_parse_required_uuid(&p, "eid") : UUID_ZERO;
+    e->eid = raft_entry_has_data(e)
+             ? raft_parse_required_uuid(&p, "eid") : UUID_ZERO;
     e->servers = json_nullable_clone(
         ovsdb_parser_member(&p, "servers", OP_OBJECT | OP_OPTIONAL));
     if (e->servers) {
@@ -344,9 +351,72 @@ bool
 raft_entry_equals(const struct raft_entry *a, const struct raft_entry *b)
 {
     return (a->term == b->term
-            && json_equal(a->data, b->data)
             && uuid_equals(&a->eid, &b->eid)
-            && json_equal(a->servers, b->servers));
+            && json_equal(a->servers, b->servers)
+            && json_equal(raft_entry_get_parsed_data(a),
+                          raft_entry_get_parsed_data(b)));
 }
+
+bool
+raft_entry_has_data(const struct raft_entry *e)
+{
+    return e->data.full_json || e->data.serialized;
+}
+
+static void
+raft_entry_data_serialize(struct raft_entry *e)
+{
+    if (!raft_entry_has_data(e) || e->data.serialized) {
+        return;
+    }
+    COVERAGE_INC(raft_entry_serialize);
+    e->data.serialized = json_serialized_object_create(e->data.full_json);
+}
+
+void
+raft_entry_set_parsed_data_nocopy(struct raft_entry *e, struct json *json)
+{
+    ovs_assert(!json || json->type != JSON_SERIALIZED_OBJECT);
+    e->data.full_json = json;
+    e->data.serialized = NULL;
+}
+
+void
+raft_entry_set_parsed_data(struct raft_entry *e, const struct json *json)
+{
+    raft_entry_set_parsed_data_nocopy(e, json_nullable_clone(json));
+}
+
+/* Returns a pointer to the fully parsed json object of the data.
+ * Caller takes the ownership of the result.
+ *
+ * Entry will no longer contain a fully parsed json object.
+ * Subsequent calls for the same raft entry will return NULL. */
+struct json * OVS_WARN_UNUSED_RESULT
+raft_entry_steal_parsed_data(struct raft_entry *e)
+{
+    /* Ensure that serialized version exists. */
+    raft_entry_data_serialize(e);
+
+    struct json *json = e->data.full_json;
+    e->data.full_json = NULL;
+
+    return json;
+}
+
+/* Returns a pointer to the fully parsed json object of the data, if any. */
+const struct json *
+raft_entry_get_parsed_data(const struct raft_entry *e)
+{
+    return e->data.full_json;
+}
+
+/* Returns a pointer to the JSON_SERIALIZED_OBJECT of the data. */
+const struct json *
+raft_entry_get_serialized_data(const struct raft_entry *e)
+{
+    raft_entry_data_serialize(CONST_CAST(struct raft_entry *, e));
+    return e->data.serialized;
+}
 
 void
@@ -402,8 +472,8 @@ raft_header_from_json__(struct raft_header *h, struct ovsdb_parser *p)
      * present, all of them must be. */
     h->snap_index = raft_parse_optional_uint64(p, "prev_index");
     if (h->snap_index) {
-        h->snap.data = json_nullable_clone(
-            ovsdb_parser_member(p, "prev_data", OP_ANY));
+        raft_entry_set_parsed_data(
+            &h->snap, ovsdb_parser_member(p, "prev_data", OP_ANY));
         h->snap.eid = raft_parse_required_uuid(p, "prev_eid");
         h->snap.term = raft_parse_required_uint64(p, "prev_term");
         h->snap.election_timer = raft_parse_optional_uint64(
@@ -455,8 +525,9 @@ raft_header_to_json(const struct raft_header *h)
     if (h->snap_index) {
         raft_put_uint64(json, "prev_index", h->snap_index);
         raft_put_uint64(json, "prev_term", h->snap.term);
-        if (h->snap.data) {
-            json_object_put(json, "prev_data", json_clone(h->snap.data));
+        if (raft_entry_has_data(&h->snap)) {
+            json_object_put(json, "prev_data",
+                json_clone(raft_entry_get_serialized_data(&h->snap)));
         }
         json_object_put_format(json, "prev_eid",
                                UUID_FMT, UUID_ARGS(&h->snap.eid));
12 changes: 11 additions & 1 deletion ovsdb/raft-private.h
@@ -118,7 +118,10 @@ void raft_servers_format(const struct hmap *servers, struct ds *ds);
  * entry. */
 struct raft_entry {
     uint64_t term;
-    struct json *data;
+    struct {
+        struct json *full_json;   /* Fully parsed JSON object. */
+        struct json *serialized;  /* JSON_SERIALIZED_OBJECT version of data. */
+    } data;
     struct uuid eid;
     struct json *servers;
     uint64_t election_timer;
@@ -130,6 +133,13 @@ struct json *raft_entry_to_json(const struct raft_entry *);
 struct ovsdb_error *raft_entry_from_json(struct json *, struct raft_entry *)
     OVS_WARN_UNUSED_RESULT;
 bool raft_entry_equals(const struct raft_entry *, const struct raft_entry *);
+bool raft_entry_has_data(const struct raft_entry *);
+void raft_entry_set_parsed_data(struct raft_entry *, const struct json *);
+void raft_entry_set_parsed_data_nocopy(struct raft_entry *, struct json *);
+struct json *raft_entry_steal_parsed_data(struct raft_entry *)
+    OVS_WARN_UNUSED_RESULT;
+const struct json *raft_entry_get_parsed_data(const struct raft_entry *);
+const struct json *raft_entry_get_serialized_data(const struct raft_entry *);
 
 /* On disk data serialization and deserialization. */
 