forked from apache/spark
[SPARK-41413][SQL] Avoid shuffle in Storage-Partitioned Join when partition keys mismatch, but join expressions are compatible

### What changes were proposed in this pull request?

This enhances Storage Partitioned Join by handling mismatched partition keys from the two sides of the join and skipping the shuffle in certain cases.

### Why are the changes needed?

Currently in Storage Partitioned Join, when the partition transform expressions match but the partition keys don't, we still fall back to shuffle. This is not necessary, since we can find the common set of partition keys and propagate it to the scan nodes. On each scan node, the missing partition keys can be filled with empty partitions.

The above scenario is pretty common for `MERGE INTO` queries, as the changing data to be merged into the base table often needs to be applied to new partitions. The current implementation causes these queries to trigger a shuffle and thus become expensive.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added a few new tests in `KeyGroupedPartitioningSuite`.

Closes apache#38950 from sunchao/SPARK-41413.

Authored-by: Chao Sun <[email protected]>
Signed-off-by: Chao Sun <[email protected]>
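
The idea described in the commit message (find the common set of partition keys, then fill missing keys with empty partitions) can be illustrated with a small conceptual sketch. This is not Spark's implementation: it is plain Python, it assumes the "common set" is the union of the keys reported by both sides, and the names `align_partitions` and `partitioned_inner_join` are hypothetical helpers introduced only for illustration.

```python
from typing import Dict, Hashable, List, Tuple

Rows = List[dict]


def align_partitions(
    left: Dict[Hashable, Rows], right: Dict[Hashable, Rows]
) -> List[Tuple[Hashable, Rows, Rows]]:
    """Pair partitions from both sides by key, filling gaps with empty partitions."""
    # Assumption: the "common set" is the union of the keys seen on either side.
    common_keys = sorted(set(left) | set(right))
    # A side that lacks a key contributes an empty partition for that key.
    return [(k, left.get(k, []), right.get(k, [])) for k in common_keys]


def partitioned_inner_join(
    left: Dict[Hashable, Rows], right: Dict[Hashable, Rows], join_col: str
) -> Rows:
    """Join partition-by-partition; no rows move between partitions (no shuffle)."""
    out: Rows = []
    for _, left_rows, right_rows in align_partitions(left, right):
        for l in left_rows:
            for r in right_rows:
                if l[join_col] == r[join_col]:
                    # Prefix right-side columns to avoid name clashes.
                    out.append({**l, **{f"r_{k}": v for k, v in r.items()}})
    return out


# Base table has partitions 1 and 2; the incoming changes touch 2 and a new partition 3.
base = {1: [{"id": 1, "v": "a"}], 2: [{"id": 2, "v": "b"}]}
delta = {2: [{"id": 2, "v": "B"}], 3: [{"id": 3, "v": "C"}]}

aligned = align_partitions(base, delta)
joined = partitioned_inner_join(base, delta, "id")
```

Because both sides end up with the same ordered list of partition keys, each pair of partitions can be joined locally. The keys present on only one side (1 and 3 in this toy data) are simply matched against an empty partition.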
Showing 12 changed files with 310 additions and 36 deletions.