forked from apache/kudu
Commit
[perf] KUDU-3140 Heuristics to disable predicate evaluation for Bloom filter

Column predicate evaluation can be expensive, and ineffective column predicates can waste CPU. TPCH Q9 exhibits a significant regression of 50-96% on enabling Bloom filter predicates. See KUDU-3140 for details.

Excerpt from TPCH run exhibiting regression: https://gist.github.com/bbhavsar/943cf8ebbab63f598353efef8f87db32
TPCH Q9 specific info: https://gist.github.com/bbhavsar/811ccbe0cd144090f82bdabcd801f827

This change adds a simple heuristic taken from the HDFS scanner in Impala: every 16 blocks, it checks whether a predicate has rejected less than 10% of the rows scanned and, if so, disables the predicate. To match the equivalent number of rows in Kudu, the check is made every 128 blocks by default.

Stats collection and enforcement are enabled only for disableable predicate types, which for now means Bloom filter predicates.

With the Bloom filter predicate type, false positives are expected, so the client is expected to do further filtering to remove them. Kudu makes the decision to disable the predicate independently and doesn't inform the client in this change, which is okay for Bloom filter given the rationale above. Client API docs have been updated accordingly.

Added a tablet-level metric to track disabled column predicates.

Tests with PS6:
- TPCH no longer reports a regression with Q9. Across multiple runs, the deltas are +1.95%, -24.67%, +2.67%, -17.09%, -14.59% with a std dev of 17% - 38%, so the result is reported neither as an improvement nor as a regression. https://gist.github.com/bbhavsar/0a773359b9225f014d353759a535c5be
- Improvements with other queries reported before this change remain intact.

Change-Id: I10197800a01a1b34c7821ac879caf8d272cab8dd
Reviewed-on: http://gerrit.cloudera.org:8080/16036
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <[email protected]>
Reviewed-by: Alexey Serbin <[email protected]>
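The heuristic described in the commit message can be illustrated with a small sketch. This is not the Kudu implementation (which is C++); the class, field, and function names below are hypothetical, and the sketch assumes the rejection ratio is computed over all rows scanned so far rather than per window, which the commit message does not specify. Only the 128-block check interval and the 10% threshold come from the message.

```python
from dataclasses import dataclass


@dataclass
class PredicateStats:
    """Running counters for one disableable predicate (hypothetical names)."""

    check_interval_blocks: int = 128  # default interval from the commit message
    min_reject_percent: int = 10      # disable if fewer than 10% of rows rejected
    blocks_seen: int = 0
    rows_scanned: int = 0
    rows_rejected: int = 0
    enabled: bool = True

    def record_block(self, rows_in_block: int, rows_rejected: int) -> None:
        """Account for one evaluated block and apply the check when due."""
        if not self.enabled:
            return  # a disabled predicate is no longer evaluated or tracked
        self.blocks_seen += 1
        self.rows_scanned += rows_in_block
        self.rows_rejected += rows_rejected
        # Only re-evaluate effectiveness once per interval.
        if self.blocks_seen % self.check_interval_blocks != 0:
            return
        # Integer arithmetic avoids floating-point edge cases at the threshold.
        # Assumption: the ratio is cumulative over the scan so far.
        if self.rows_rejected * 100 < self.rows_scanned * self.min_reject_percent:
            self.enabled = False


def simulate(reject_percent: int, blocks: int = 256,
             rows_per_block: int = 1000) -> PredicateStats:
    """Feed uniform blocks where `reject_percent` of the rows are rejected."""
    stats = PredicateStats()
    for _ in range(blocks):
        stats.record_block(rows_per_block, rows_per_block * reject_percent // 100)
    return stats
```

Under these assumptions, a predicate rejecting 5% of rows is disabled at the first check (block 128), while one rejecting 50% stays enabled for the whole scan. Because a Bloom filter predicate may pass false positives anyway, dropping it changes only how much filtering is left to the client, not the correctness of the final result.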
Showing 17 changed files with 768 additions and 75 deletions.