API-break: Support SubqueryAlias and remove Alias in Projection
#4333
#4293 is part of the work for this PR. It doesn't include
Title changed from "SubqueryAlias and remove Projection-Alias" to "SubqueryAlias and remove Alias in Projection"
Thanks @jackwener, I will review it later.
I will review this, but it may take a few days -- there are a bunch of other PRs in the queue before this one.
Force-pushed from 0ea7de4 to d18301b.
Follow-up: we need to remove the alias in projection in
alamb
left a comment
Thank you @jackwener -- I think this design is much clearer.
I reviewed the plan changes carefully, and they looked reasonable to me.
The only thing I am concerned about is the regression in supporting limit pushdown through a subquery. Otherwise, I think this PR could be merged.
For anyone else reviewing this PR, I found the whitespace-blind diff very helpful: https://github.com/apache/arrow-datafusion/pull/4333/files?w=1
.unwrap()
.to_string();
assert!(formatted.contains("ParquetExec: limit=Some(10)"));
// TODO: limit_push_down support SubqueryAlias
We should perhaps track this with a ticket -- it seems like it is a regression not to push limits into the subquery
I added a new issue, #4381, to track this.
Yes, I'll follow up on this soon. I didn't do it in this PR because I didn't want to mix too many features into one big PR. Limit pushdown is not the only rule affected: other rules also need to support SubqueryAlias, and I want to handle them together, with unit tests, so the change is easy to review.
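To make the limit pushdown concern concrete, here is a minimal sketch of why a limit can pass through an alias node. It is a toy model under stated assumptions: the `Plan` enum and `push_down_limit` function are hypothetical names invented for illustration, not DataFusion's `LogicalPlan` or its actual optimizer rule.

```rust
// Toy model only: these types are hypothetical and are NOT DataFusion's
// `LogicalPlan` or its real limit push-down rule.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Reads a table, optionally stopping after `fetch` rows.
    TableScan { table: String, fetch: Option<usize> },
    /// Renames the relation produced by `input`; rows are unchanged.
    SubqueryAlias { alias: String, input: Box<Plan> },
    /// Keeps at most `fetch` rows of `input`.
    Limit { fetch: usize, input: Box<Plan> },
}

/// Pushes a row limit toward the scan. An alias only renames a relation,
/// so the limit can pass through `SubqueryAlias` unchanged.
fn push_down_limit(plan: Plan, limit: Option<usize>) -> Plan {
    match plan {
        Plan::Limit { fetch, input } => {
            // Combine with any outer limit. The Limit node itself is kept,
            // so the plan stays correct even if a scan ignores `fetch`.
            let tightest = limit.map_or(fetch, |outer| outer.min(fetch));
            Plan::Limit {
                fetch: tightest,
                input: Box::new(push_down_limit(*input, Some(tightest))),
            }
        }
        Plan::SubqueryAlias { alias, input } => Plan::SubqueryAlias {
            alias,
            input: Box::new(push_down_limit(*input, limit)),
        },
        Plan::TableScan { table, fetch } => {
            // Keep the smaller of an existing fetch and the pushed limit.
            let fetch = match (fetch, limit) {
                (Some(a), Some(b)) => Some(a.min(b)),
                (a, b) => a.or(b),
            };
            Plan::TableScan { table, fetch }
        }
    }
}
```

In this sketch, a `Limit` of 10 over `SubqueryAlias` over a scan ends with `fetch: Some(10)` on the scan, which mirrors the `limit=Some(10)` that the test quoted above looks for.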
Force-pushed from d18301b to d3f7554.
Force-pushed from d3f7554 to 48b49b2.
" SubqueryAlias: d [a:Int64, b:Utf8]",
" SubqueryAlias: _data2 [a:Int64, b:Utf8]",
Future ticket #4383
" Left Join: t3.t1_int = t2.t2_int [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N, t2_id:UInt32;N, t2_name:Utf8;N, t2_int:UInt32;N, t2_id:UInt32;N, t2_name:Utf8;N, t2_int:UInt32;N]",
" Filter: t3.t1_id < UInt32(100) [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N, t2_id:UInt32;N, t2_name:Utf8;N, t2_int:UInt32;N]",
" SubqueryAlias: t3 [t1_id:UInt32;N, t1_name:Utf8;N, t1_int:UInt32;N, t2_id:UInt32;N, t2_name:Utf8;N, t2_int:UInt32;N]",
The regression will be resolved in #4384; you can see that it has been fixed in that PR.
I merged this PR with the latest master branch locally to ensure it has no logical conflicts. Thanks again @jackwener
Benchmark runs are scheduled for baseline = da54fa5 and contender = ad3df7d. ad3df7d is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Which issue does this PR close?
closes #3927
closes #2212
closes #4291
Rationale for this change
Remove alias in Projection, and replace it with SubqueryAlias. Some discussion in #4232.
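As a rough illustration of the design, the sketch below models a plan tree in which the projection node carries no alias and relation naming is handled by a separate wrapper node. The `Plan` enum and the `with_alias` and `render` helpers are hypothetical names invented for this example; they are not DataFusion's real types or API.

```rust
// Toy model only: these types are hypothetical and are NOT DataFusion's
// actual `LogicalPlan` definitions.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Reads a named table.
    TableScan { table: String },
    /// Computes output expressions; carries no alias of its own.
    Projection { exprs: Vec<String>, input: Box<Plan> },
    /// Gives the output of `input` a new relation name.
    SubqueryAlias { alias: String, input: Box<Plan> },
}

/// Wraps any plan in an alias node instead of storing the alias on the plan.
fn with_alias(input: Plan, alias: &str) -> Plan {
    Plan::SubqueryAlias {
        alias: alias.to_string(),
        input: Box::new(input),
    }
}

/// Renders the plan as indented lines, one node per line.
fn render(plan: &Plan, depth: usize, out: &mut Vec<String>) {
    let pad = "  ".repeat(depth);
    match plan {
        Plan::TableScan { table } => {
            out.push(format!("{}TableScan: {}", pad, table));
        }
        Plan::Projection { exprs, input } => {
            out.push(format!("{}Projection: {}", pad, exprs.join(", ")));
            render(input, depth + 1, out);
        }
        Plan::SubqueryAlias { alias, input } => {
            out.push(format!("{}SubqueryAlias: {}", pad, alias));
            render(input, depth + 1, out);
        }
    }
}
```

With this shape, aliasing works the same way for every kind of child plan, and a rule that matches on `Projection` never has to account for an optional alias.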
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?
Projection struct change