forked from apache/pulsar
Commit
[Java Client] Optimize batch flush scheduling (apache#14185)
* Java client: only schedule batch when there are pending messages
* Update variable name; handle reconnect case
* Remove code duplication
* Fix method name
* Improve method name; add tests; fix bugs raised by tests
* Remove comment that is now incorrect
* Chain batch flush task when task triggers delivery
* Prevent early batch flush by storing lastBatchSendNanos
* Refactor getPendingQueueSize
* Remove unintentionally committed comments
* Remove premature optimization to cancel and reschedule batchFlushTask
* Ensure buffered batchMessageContainer messages can fail due to send timeout
* Fix comment to match new design
* Fix batch message counting log line
* Guard against null batchMessageContainer
* Fix send timeout logic to fire when batching is disabled

Master Issue: apache#11100

### Motivation

As observed in apache#11100, the Java client producer consumes CPU even when it is idle. This is especially noticeable when using many producers. Instead, we can schedule the batch timer only when the producer has messages to deliver, or when the timer has just fired and delivered messages. If there are concerns about this optimization, I will need to take some time to complete benchmarks.

### Modifications

* Skip message batch delivery if the producer is not in the `Ready` state. As a consequence, ensure that messages pending in the `batchMessageContainer` are still eligible to fail for `sendTimeout`.
* If there is no current batch flush task, schedule a single flush task when a message is added to a batch message container (assuming the message does not also trigger the delivery of the batch and the producer is in the `Ready` state).
* Schedule another batch flush task if and only if the batch flush task itself was responsible for sending messages.
* Keep track of `lastBatchSendNanoTime`, and only flush a batch message container if the `BatchingMaxPublishDelayMicros` time has passed since the last send time.
* Note that the timer task is only ever updated within a `synchronized (this)` block, so updates to it see a consistent view. There is one remaining race: an existing batch timer might get canceled while it is starting. This is of little consequence, since it results in either no delivery or a small batch.

### Verifying this change

There are existing tests that cover batch message delivery. Specifically, the `BatchMessageTest` in the `pulsar-broker` module covers these changes.

### Does this pull request potentially affect one of the following parts:

This is a backwards compatible change.

### Documentation

- [x] `no-need-doc`
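
The scheduling rules listed under Modifications can be illustrated with a small model. This is only a sketch under stated assumptions, not the code from this commit: the class `LazyBatchFlusher` and all of its methods are hypothetical, time is passed in explicitly instead of using a real timer, and `sendTimeout` handling is left out. Two behaviours are assumptions on my part rather than something the commit message spells out: when the timer fires before the delay has elapsed since the last send, the sketch re-arms it for the remaining time, and when the producer returns to `Ready` with buffered messages, the sketch arms a flush task.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the scheduling rules; NOT the Pulsar producer code.
// Time is supplied by the caller so the behaviour is deterministic.
class LazyBatchFlusher {
    private final long maxPublishDelayNanos;
    private final int maxMessagesPerBatch;
    private final List<String> batch = new ArrayList<>();          // stands in for batchMessageContainer
    private final List<List<String>> sentBatches = new ArrayList<>();
    private boolean ready = true;                                   // stands in for the Ready state
    private long lastBatchSendNanoTime;
    private Long flushDueNanos = null;                              // null: no flush task is scheduled

    LazyBatchFlusher(long maxPublishDelayNanos, int maxMessagesPerBatch, long nowNanos) {
        this.maxPublishDelayNanos = maxPublishDelayNanos;
        this.maxMessagesPerBatch = maxMessagesPerBatch;
        this.lastBatchSendNanoTime = nowNanos;
    }

    synchronized void send(String message, long nowNanos) {
        batch.add(message);
        if (!ready) {
            return; // delivery is skipped while not Ready; the message stays buffered
        }
        if (batch.size() >= maxMessagesPerBatch) {
            flushBatch(nowNanos); // a full batch is delivered directly; the timer is left alone
        } else if (flushDueNanos == null) {
            flushDueNanos = nowNanos + maxPublishDelayNanos; // first pending message arms the timer
        }
    }

    // Called when the (simulated) flush task fires.
    synchronized void onFlushTimer(long nowNanos) {
        flushDueNanos = null;
        if (!ready || batch.isEmpty()) {
            return; // nothing was sent, so no follow-up task is scheduled
        }
        long sinceLastSend = nowNanos - lastBatchSendNanoTime;
        if (sinceLastSend < maxPublishDelayNanos) {
            // Assumption: a batch went out recently, so wait out the remaining delay.
            flushDueNanos = nowNanos + (maxPublishDelayNanos - sinceLastSend);
            return;
        }
        flushBatch(nowNanos);
        flushDueNanos = nowNanos + maxPublishDelayNanos; // chain: the task itself sent messages
    }

    // Assumption: on returning to Ready, buffered messages get a flush task.
    synchronized void setReady(boolean ready, long nowNanos) {
        this.ready = ready;
        if (ready && !batch.isEmpty() && flushDueNanos == null) {
            flushDueNanos = nowNanos + maxPublishDelayNanos;
        }
    }

    private void flushBatch(long nowNanos) {
        sentBatches.add(new ArrayList<>(batch));
        batch.clear();
        lastBatchSendNanoTime = nowNanos;
    }

    synchronized boolean isFlushScheduled() {
        return flushDueNanos != null;
    }

    synchronized long flushDueNanos() {
        if (flushDueNanos == null) {
            throw new IllegalStateException("no flush task scheduled");
        }
        return flushDueNanos;
    }

    synchronized int sentBatchCount() {
        return sentBatches.size();
    }
}
```

In this model an idle producer never has a flush task pending: the task exists only between the first buffered message and the first timer firing that finds nothing to send, which is the CPU saving the Motivation section describes.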
1 parent 0fe921f, commit b21e548
Showing 2 changed files with 148 additions and 45 deletions.