[c++ client] AUTO_FLUSH_BACKGROUND optimizations
Optimizations after initial performance testing of the
Kudu C++ client library with AUTO_FLUSH_BACKGROUND flush mode support.

The most important tuning is the default flush watermark
for the mutation buffer.  Changing it from 80% to 50% gave a nearly 30%
performance boost in throughput for scenarios where a client pushes
data to the server as fast as it can, using workloads of 8M rows like
(int64, int32, string, string, int) where strings are about 32 bytes
long on average.  Each thread ran its own single-session KuduClient,
where each session was running in AUTO_FLUSH_BACKGROUND flush mode.
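
To make the watermark concrete, here is a minimal sketch of how a percentage watermark maps to a byte threshold for the default 7M buffer. The helper names `WatermarkBytes` and `ShouldStartFlush` are hypothetical and not part of the Kudu client, and the `>=` comparison is an assumption about when a flush would begin.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not a Kudu API): converts a flush watermark given in
// percent into a byte threshold for a buffer of the given size.
int64_t WatermarkBytes(int64_t buffer_bytes_limit, int watermark_pct) {
  assert(buffer_bytes_limit >= 0);
  assert(watermark_pct >= 0 && watermark_pct <= 100);
  return buffer_bytes_limit * watermark_pct / 100;  // integer division, rounds down
}

// Hypothetical predicate: true once the buffered bytes reach the watermark.
// Using >= here is an assumption made for this sketch.
bool ShouldStartFlush(int64_t bytes_used, int64_t buffer_bytes_limit,
                      int watermark_pct) {
  return bytes_used >= WatermarkBytes(buffer_bytes_limit, watermark_pct);
}
```

With a 7 * 1024 * 1024 byte buffer, the 50% watermark corresponds to 3670016 bytes and the 80% watermark to 5872025 bytes, so under this sketch a 50% session would start flushing roughly 2.1 MiB earlier.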

1-thread insertion (8M rows per thread)
  80% watermark:
    total  : 35229.7 ms
    per row: 0.00440372 ms

  50% watermark:
    total  : 22562.8 ms
    per row: 0.00282035 ms

2-thread insertion (4M rows per thread)
  80% watermark:
    total  : 19683.6 ms
    per row: 0.00246046 ms

  50% watermark:
    total  : 12931.8 ms
    per row: 0.00161647 ms

4-thread insertion (2M rows per thread)
  80% watermark:
    total  : 11941.9 ms
    per row: 0.00149274 ms

  50% watermark:
    total  : 7724.68 ms
    per row: 0.000965585 ms
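
The totals above can be turned into throughput and relative-change figures with a few lines of arithmetic. The sketch below assumes that 8M means 8,000,000 rows in total, which is consistent with the per-row times listed; the function names are hypothetical.

```cpp
#include <cassert>

// Rows inserted per second for a run that took total_ms milliseconds.
double RowsPerSecond(double rows, double total_ms) {
  assert(total_ms > 0.0);
  return rows / (total_ms / 1000.0);
}

// Fraction of wall-clock time saved when going from old_ms to new_ms.
double TimeReduction(double old_ms, double new_ms) {
  assert(old_ms > 0.0);
  return (old_ms - new_ms) / old_ms;
}

// Relative throughput gain for the same amount of work.
double ThroughputGain(double old_ms, double new_ms) {
  assert(new_ms > 0.0);
  return old_ms / new_ms - 1.0;
}
```

For example, `TimeReduction(35229.7, 22562.8)` is about 0.36 and `ThroughputGain(35229.7, 22562.8)` is about 0.56 for the 1-thread run. Across the three runs listed, the time reduction works out to roughly 34 to 36% and the throughput gain to roughly 52 to 56%.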

Other related session parameters:
  mutation buffer size:       7M       (default)
  maximum number of batchers: 2        (default)
  time-based flush interval:  1 second (default)

The tests were run on a machine with dual Intel(R) Xeon(R) CPU E5-2680 v3
@ 2.50GHz (12 cores per CPU) and 98GiB of memory.

This is a follow-up for 93be131.

Change-Id: I1f0aa6d02c51bb063498709e8570e8c7214a31a0
Reviewed-on: http://gerrit.cloudera.org:8080/4308
Tested-by: Kudu Jenkins
Tested-by: Alexey Serbin <[email protected]>
Reviewed-by: David Ribeiro Alves <[email protected]>
alexeyserbin committed Sep 16, 2016
1 parent af7f9c8 commit 7959cd4
Showing 5 changed files with 10 additions and 13 deletions.
4 changes: 0 additions & 4 deletions src/kudu/client/batcher.cc
@@ -510,10 +510,6 @@ void Batcher::FlushAsync(KuduStatusCallback* cb) {
   FlushBuffersIfReady();
 }

-int64_t Batcher::GetOperationSizeInBuffer(KuduWriteOperation* write_op) {
-  return write_op->SizeInBuffer();
-}
-
 Status Batcher::Add(KuduWriteOperation* write_op) {
   // As soon as we get the op, start looking up where it belongs,
   // so that when the user calls Flush, we are ready to go.
4 changes: 3 additions & 1 deletion src/kudu/client/batcher.h
@@ -124,7 +124,9 @@ class Batcher : public RefCountedThreadSafe<Batcher> {
   }

   // Compute in-buffer size for the given write operation.
-  static int64_t GetOperationSizeInBuffer(KuduWriteOperation* write_op);
+  static int64_t GetOperationSizeInBuffer(KuduWriteOperation* write_op) {
+    return write_op->SizeInBuffer();
+  }

  private:
   friend class RefCountedThreadSafe<Batcher>;
2 changes: 1 addition & 1 deletion src/kudu/client/client.h
@@ -1244,7 +1244,7 @@ class KUDU_EXPORT KuduSession : public sp::enable_shared_from_this<KuduSession>
   /// when running in AUTO_FLUSH_BACKGROUND mode: once the specified threshold
   /// is reached, the session starts sending the accumulated write operations
   /// to the appropriate tablet servers. By default, the buffer flush watermark
-  /// is to to 80%.
+  /// is to to 50%.
   ///
   /// @note This setting is applicable only for AUTO_FLUSH_BACKGROUND sessions.
   /// I.e., calling this method in other flush modes is safe, but
8 changes: 4 additions & 4 deletions src/kudu/client/session-internal.cc
@@ -53,7 +53,7 @@ KuduSession::Data::Data(shared_ptr<KuduClient> client,
       batchers_num_(0),
       batchers_num_limit_(2),
       buffer_bytes_limit_(7 * 1024 * 1024),
-      buffer_watermark_pct_(80),
+      buffer_watermark_pct_(50),
       buffer_bytes_used_(0),
       buffer_pre_flush_enabled_(true) {
 }
@@ -323,10 +323,10 @@ Status KuduSession::Data::ApplyWriteOp(
     sp::weak_ptr<KuduSession> weak_session,
     KuduWriteOperation* write_op) {

-  if (!write_op) {
+  if (PREDICT_FALSE(!write_op)) {
     return Status::InvalidArgument("NULL operation");
   }
-  if (!write_op->row().IsKeySet()) {
+  if (PREDICT_FALSE(!write_op->row().IsKeySet())) {
     Status status = Status::IllegalState(
         "Key not specified", write_op->ToString());
     error_collector_->AddError(
@@ -358,7 +358,7 @@ Status KuduSession::Data::ApplyWriteOp(
   // A sanity check: before trying to validate against any of run-time metrics,
   // verify that the single operation can fit into an empty buffer
   // given the restriction on the buffer size.
-  if (required_size > max_size) {
+  if (PREDICT_FALSE(required_size > max_size)) {
     Status s = Status::Incomplete(strings::Substitute(
         "buffer size limit is too small to fit operation: "
         "required $0, size limit $1",
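
The hunks above wrap the error checks in `PREDICT_FALSE`, which hints to the compiler that the condition is rarely true so the common path stays straight-line. As a rough illustration only, the sketch below defines a stand-in macro on top of the GCC/Clang builtin `__builtin_expect`; the macro name `SKETCH_PREDICT_FALSE`, the fallback branch, and the `CheckOperation` function are assumptions for this example and are not copied from Kudu.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a branch-prediction hint macro. On GCC and Clang
// it tells the compiler that x is expected to be false; elsewhere it is a no-op.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_PREDICT_FALSE(x) (__builtin_expect(!!(x), 0))
#else
#define SKETCH_PREDICT_FALSE(x) (x)
#endif

// Hypothetical validation routine: both error paths are marked as unlikely,
// mirroring the shape of the checks in the diff above.
std::string CheckOperation(const int* op_size, int max_size) {
  assert(max_size >= 0);
  if (SKETCH_PREDICT_FALSE(op_size == nullptr)) {
    return "NULL operation";
  }
  if (SKETCH_PREDICT_FALSE(*op_size > max_size)) {
    return "buffer size limit is too small to fit operation";
  }
  return "OK";
}
```

The hint changes only code layout and never the result of the condition, so the function behaves identically with or without it.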
5 changes: 2 additions & 3 deletions src/kudu/client/write_op.cc
@@ -77,9 +77,8 @@ int64_t KuduWriteOperation::SizeInBuffer() const {
       size += schema->column(i).type_info()->size();
       if (schema->column(i).type_info()->physical_type() == BINARY) {
         ContiguousRow row(schema, row_.row_data_);
-        Slice bin;
-        memcpy(&bin, row.cell_ptr(i), sizeof(bin));
-        size += bin.size();
+        const Slice* bin = reinterpret_cast<const Slice*>(row.cell_ptr(i));
+        size += bin->size();
       }
     }
   }
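
The write_op.cc change replaces a `memcpy` of the slice header with a read through a typed pointer. The sketch below contrasts the two approaches using a hypothetical `MiniSlice` struct as a stand-in for the real `Slice` class; the function names are illustrative. The in-place variant assumes the cell address is suitably aligned and really holds such an object, which the copying variant does not require.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical minimal stand-in for a slice: a pointer plus a length.
struct MiniSlice {
  const uint8_t* data;
  size_t size;
};

// Copying variant: duplicates the slice header into a local before reading it.
// Works for any cell address, including unaligned ones.
size_t SliceSizeByCopy(const uint8_t* cell_ptr) {
  MiniSlice s;
  std::memcpy(&s, cell_ptr, sizeof(s));
  return s.size;
}

// In-place variant: reads the header through a typed pointer without a copy.
// Assumes cell_ptr is suitably aligned and points at a real MiniSlice object.
size_t SliceSizeInPlace(const uint8_t* cell_ptr) {
  assert(reinterpret_cast<uintptr_t>(cell_ptr) % alignof(MiniSlice) == 0);
  const MiniSlice* s = reinterpret_cast<const MiniSlice*>(cell_ptr);
  return s->size;
}
```

Both functions return the same size when given the address of a valid `MiniSlice`; the in-place variant simply skips the header copy, which may matter on a path that runs once per binary column per row.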
