[SPARK-32299] [SQL] Decide SMJ Join Orientation adaptively #29097
mayurdb wants to merge 8 commits into apache:master
This reverts commit 89664b4.
Can one of the admins verify this patch?
That also depends on the data values, right? Not always faster. |
c21 left a comment:
I have a similar concern to @gatorsmile's. I think this also depends on the run-time cardinality of the data.
E.g., suppose the left side is smaller than the right side, but every row on the left side has the same join key while every row on the right side is unique. We should buffer the right side here even though the right side is larger, because if we buffer the left side, we essentially need to read the entire left side into the buffer.
In addition, this PR swaps the left and right sides based on total size. At run time, however, each task/partition can have a different amount of data on the left and right sides. I think simply swapping the left and right sides here might cause some tasks to regress while others improve.
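The cardinality concern can be illustrated with a toy model. The sketch below is not Spark code: `peak_buffered_rows` is a hypothetical helper that assumes a simplified sort-merge join in which, for each streamed key, all rows of the other side with that key are held in the buffer at once.

```python
from collections import Counter


def peak_buffered_rows(streamed_keys, buffered_keys):
    """Largest number of buffered-side rows held at once in a simplified SMJ.

    Hypothetical model: for each key on the streamed side, every row of the
    buffered side with that key sits in the buffer at the same time.
    """
    buffered_counts = Counter(buffered_keys)
    # Only keys that appear on the streamed side ever have matches buffered.
    return max((buffered_counts[k] for k in set(streamed_keys)), default=0)


left = [7] * 1000           # smaller side, every row has the same key
right = list(range(5000))   # larger side, every key is unique

keep = peak_buffered_rows(streamed_keys=left, buffered_keys=right)
swapped = peak_buffered_rows(streamed_keys=right, buffered_keys=left)
print(keep, swapped)  # prints "1 1000"
```

In this model, keeping the original orientation buffers a single row, whereas swapping on total size alone buffers the whole left side.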
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
What changes were proposed in this pull request?
Change the SortMergeJoin orientation at runtime using adaptive query execution.
Why are the changes needed?
For a SortMergeJoin (SMJ) that is an equi-join, the left and right sides of the join are determined by the order in which the user wrote them. In SMJ, the left side of the join is streamed and the right side is buffered (only the rows matching the current key). Because of this, B SMJ A would perform better than A SMJ B if sizeOf(B) > sizeOf(A), since the smaller side is then the one being buffered.
With adaptive query execution, once both ShuffleQueryStages feeding the join have completed, and if neither of them is smaller than the broadcast threshold (so the join will not be converted to a BroadcastHashJoin), the join orientation can be changed at runtime.
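The decision described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `choose_smj_orientation` and its return labels are hypothetical, and the default threshold is assumed to mirror the 10 MB default of `spark.sql.autoBroadcastJoinThreshold`.

```python
def choose_smj_orientation(left_bytes, right_bytes,
                           broadcast_threshold=10 * 1024 * 1024):
    """Pick a join strategy from completed shuffle-stage sizes (hypothetical).

    Returns "broadcast" if either side is below the threshold, "swap" if the
    right (buffered) side is larger than the left (streamed) side, and "keep"
    otherwise.
    """
    if min(left_bytes, right_bytes) < broadcast_threshold:
        # One side is small enough that a BroadcastHashJoin would be chosen.
        return "broadcast"
    if right_bytes > left_bytes:
        # Stream the larger side and buffer the smaller one.
        return "swap"
    return "keep"


MB = 1024 * 1024
print(choose_smj_orientation(5 * MB, 1024 * MB))     # broadcast
print(choose_smj_orientation(200 * MB, 1024 * MB))   # swap
print(choose_smj_orientation(1024 * MB, 200 * MB))   # keep
```

The rule uses total sizes only, so it does not account for per-partition differences or key skew.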
Does this PR introduce any user-facing change?
No
How was this patch tested?