-
Notifications
You must be signed in to change notification settings - Fork 2.1k
Make swap_hash_join public API #10702
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||
|---|---|---|---|---|
|
|
@@ -157,7 +157,9 @@ fn swap_join_projection( | |||
| } | ||||
|
|
||||
| /// This function swaps the inputs of the given join operator. | ||||
| fn swap_hash_join( | ||||
| /// This function is public so other downstream projects can use it | ||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Without a clear comment, it might be wrongly moved out of public APIs easily.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree but the entire idea of |
||||
| /// to construct `HashJoinExec` with right side as the build side. | ||||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Suggested change
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think there is a benefit in leaving the documentation that explains the rationale for why this is public |
||||
| pub fn swap_hash_join( | ||||
|
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Spark supports build right in HashJoin operator internally. In planning stage, it decides which side is build side. However, DataFusion HashJoin planning takes alternative approach. DataFusion HashJoin operator only supports build left. In optimization stage, once DataFusion finds build right is better, it will call this function to swap inputs for HashJoin to achieve build right effect. So, to make Comet to use DataFusion HashJoin to support Spark build right HashJoin, we only need this function to be public so Comet can directly use it.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Given this input takes So maybe we could update the code so this function is
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. +1
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sounds good. 👍
Member
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hmm, some functions used in Only I can do is move these functions to joins/utils.rs in It makes more APIs public actually, including I think it is not so worth making more public APIs just for Seems the current approach is better? WDYT? |
||||
| hash_join: &HashJoinExec, | ||||
| partition_mode: PartitionMode, | ||||
| ) -> Result<Arc<dyn ExecutionPlan>> { | ||||
|
|
||||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.