Deflake ClientTest.TestServerTooBusyRetry
ClientTest.TestServerTooBusyRetry is a mess of a test. In TSAN mode,
there are fewer rows inserted, so scans require fewer round trips to
complete, but at the same time threads start more slowly, so the number
of scans in flight at once will tend to be lower. This causes the test to
occasionally fail to cause a service queue overflow, as it is intended
to do. Eventually, the test fails because TSAN has an upper bound on the
number of threads that can be created in the lifetime of a single TSAN
process, and the test slowly creates scan threads.

This patch attempts to address the problem by raising the scan batch
latency in TSAN mode. With this patch, I saw 0 failures in 1000 runs.
Without it, I got tired of waiting for 850/1000 to finish after 15
minutes.
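
As a rough illustration of why a higher per-batch latency helps (not part
of this patch), here is a back-of-the-envelope model in the spirit of
Little's law: the number of overlapping scans is about the scan duration
divided by the interval between thread starts. The function name, its
parameters, and the numbers in the note below are all hypothetical;
nothing here is Kudu code or a measurement from the test.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical model, not Kudu code. Estimates how many scans overlap when
// a new scan thread starts every `thread_start_interval_ms` and each scan
// needs `round_trips` batches, each delayed by `batch_latency_ms`.
int64_t EstimateInFlightScans(int64_t round_trips,
                              int64_t batch_latency_ms,
                              int64_t thread_start_interval_ms,
                              int64_t max_threads) {
  // Degenerate case: all threads start at once, so every scan overlaps.
  if (thread_start_interval_ms <= 0) return max_threads;
  // Ignore all scan cost other than the injected latency.
  int64_t scan_duration_ms = round_trips * batch_latency_ms;
  // Concurrency ~= arrival rate * time in system.
  int64_t overlapping = scan_duration_ms / thread_start_interval_ms;
  // At least one scan is running; never more than the thread cap.
  return std::min(max_threads, std::max<int64_t>(1, overlapping));
}
```

With made-up inputs of 5 round trips and one thread start every 20 ms,
the model gives about 2 overlapping scans at 10 ms per batch and about 25
at 100 ms per batch, which is the direction this patch pushes the TSAN
build.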

This is a quick fix. In the future someone should consider a more
serious rewrite of this test.

Change-Id: Id4d2ee077e9d107fb475c399af5690084bdeef49
Reviewed-on: http://gerrit.cloudera.org:8080/13200
Reviewed-by: Adar Dembo <[email protected]>
Tested-by: Kudu Jenkins
wdberkeley committed Apr 30, 2019
1 parent 762ae0f commit d4f16fc
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions src/kudu/client/client-test.cc
@@ -5251,7 +5251,11 @@ TEST_F(ClientTest, TestServerTooBusyRetry) {

   // Introduce latency in each scan to increase the likelihood of
   // ERROR_SERVER_TOO_BUSY.
+#ifdef THREAD_SANITIZER
+  FLAGS_scanner_inject_latency_on_each_batch_ms = 100;
+#else
   FLAGS_scanner_inject_latency_on_each_batch_ms = 10;
+#endif

   // Reduce the service queue length of each tablet server in order to increase
   // the likelihood of ERROR_SERVER_TOO_BUSY.