Consolidate Filter::remove_aliases into Expr::unalias_nested
#11001
    Expr::Alias(alias) => alias.expr.unalias_nested(),
    _ => self,
}
pub fn unalias_nested(self) -> Transformed<Expr> {
this is an API change (it returns Transformed rather than just Expr) but I think that makes more sense
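To make the call-site impact of that signature change concrete, here is a minimal self-contained sketch. The `Expr` and `Transformed` types below are simplified stand-ins written for this illustration, not DataFusion's actual definitions; the only thing borrowed from the diff is the idea that the result carries the rewritten value in `data` alongside a flag saying whether anything changed.

```rust
// Simplified stand-ins for illustration; not DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Alias(Box<Expr>, String),
}

/// Wraps a rewritten value together with a flag recording whether it changed.
#[derive(Debug, PartialEq)]
struct Transformed<T> {
    data: T,
    transformed: bool,
}

impl Expr {
    /// Strips every alias stacked on top of this expression and reports
    /// whether at least one alias was removed.
    fn unalias_nested(self) -> Transformed<Expr> {
        let mut expr = self;
        let mut transformed = false;
        loop {
            match expr {
                // Peel one alias layer and keep going.
                Expr::Alias(inner, _name) => {
                    expr = *inner;
                    transformed = true;
                }
                // Anything else is the final, alias-free expression.
                other => {
                    return Transformed { data: other, transformed };
                }
            }
        }
    }
}
```

In this sketch a caller that only wants the expression reads `.data`, while a caller that wants to know whether a rewrite happened can check `.transformed` instead of comparing the input and output.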
datafusion/expr/src/expr.rs
Outdated
Expr::Alias(Alias { expr, .. }) => {
    // directly recurse into the alias expression
    // as we just modified the tree structure
    Ok(Transformed::yes(expr.unalias_nested().data))
Do we need to directly recurse into the aliased expression? Why not just return Transformed::yes(*expr) (which implies TreeNodeRecursion::Continue)?
That is what my first version did -- and it turns out that when Transformed::yes(*expr) is returned, the recursion doesn't traverse into the *expr (so nested aliases like (x as foo) as bar are only rewritten to (x as bar)).
I tried to convince myself this was a bug in tree node, but I think the current behavior makes sense: the newly transformed node wasn't (strictly speaking) part of the original tree, so automatically traversing down into it may not make sense.
You are right, this is because of how transform_down() works when you replace the current alias with its child, which is also an alias.
In the case of (x as foo) as bar, when we return Transformed::yes(*expr), ... as bar is replaced by x as foo (so x as foo becomes the current node), and then recursion continues into x (the current node's children).
So currently with the expr.unalias_nested() call and the TreeNodeRecursion::Continue there is actually a double recursion.
I think transform_up() should be ok, but then you can't skip on subqueries.
So I think we have 2 options here:
- Use `transform_down()`, but then you should return `Ok(Transformed::new(expr.unalias_nested(), true, TreeNodeRecursion::Jump))` to avoid double recursion.
- Or use `transform_down_up()`, and in `f_down` handle the subqueries and in `f_up` handle the alias by returning `Ok(Transformed::yes(*expr))`.
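The top-down behavior described here, and the first option, can be sketched with a small toy model. Everything below (`Expr`, `Recursion`, `transform_down`, `strip_one`, `strip_all_then_jump`) is hypothetical code written for this illustration and only imitates the semantics discussed in this thread; it is not DataFusion's TreeNode implementation.

```rust
// Toy model for illustration only; these are not DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Alias(Box<Expr>, String),
    Not(Box<Expr>),
}

/// What the traversal should do after the callback returns.
#[derive(Debug, Clone, Copy)]
enum Recursion {
    Continue, // descend into the children of the returned node
    Jump,     // do not descend into the returned node
}

fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

fn alias(expr: Expr, name: &str) -> Expr {
    Expr::Alias(Box::new(expr), name.to_string())
}

impl Expr {
    /// Rebuilds this node with `f` applied to each direct child.
    fn map_children(self, f: &dyn Fn(Expr) -> Expr) -> Expr {
        match self {
            Expr::Alias(e, name) => Expr::Alias(Box::new(f(*e)), name),
            Expr::Not(e) => Expr::Not(Box::new(f(*e))),
            leaf => leaf,
        }
    }
}

/// Toy top-down rewrite: apply `f` to the node, then (unless `f` asked to
/// jump) descend into the children of whatever `f` returned.
fn transform_down(expr: Expr, f: &dyn Fn(Expr) -> (Expr, Recursion)) -> Expr {
    let (new_expr, recursion) = f(expr);
    match recursion {
        Recursion::Jump => new_expr,
        Recursion::Continue => {
            new_expr.map_children(&|child: Expr| transform_down(child, f))
        }
    }
}

/// Naive callback: replace an alias by its child and let the traversal continue.
/// The replacement itself is never passed to the callback again.
fn strip_one(expr: Expr) -> (Expr, Recursion) {
    match expr {
        Expr::Alias(inner, _) => (*inner, Recursion::Continue),
        other => (other, Recursion::Continue),
    }
}

/// Option 1: rewrite the aliased subtree by hand, then jump so the traversal
/// does not walk the same subtree a second time.
fn strip_all_then_jump(expr: Expr) -> (Expr, Recursion) {
    match expr {
        Expr::Alias(inner, _) => (unalias_nested(*inner), Recursion::Jump),
        other => (other, Recursion::Continue),
    }
}

fn unalias_nested(expr: Expr) -> Expr {
    transform_down(expr, &strip_all_then_jump)
}
```

In this toy, `transform_down(alias(alias(col("x"), "foo"), "bar"), &strip_one)` leaves `x as foo` behind, because the callback never sees the node it just returned, whereas `unalias_nested` on the same input yields the bare column.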
I've updated my comment above.
I updated the code to use transform_down_up -- I think it looks nice now
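For readers following along, the down-up approach can be sketched roughly as follows. This is a toy model under stated assumptions: `Expr`, its single `Subquery` variant (standing in for the subquery expressions), and this `transform_down_up` are hypothetical and written only for this illustration. In particular, the choice to still run `f_up` on a node whose children were skipped is an assumption of the toy, not a statement about DataFusion's behavior.

```rust
// Toy model for illustration only; these are not DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Alias(Box<Expr>, String),
    Not(Box<Expr>),
    /// Stand-in for a subquery expression; its contents must stay untouched.
    Subquery(Box<Expr>),
}

impl Expr {
    /// Rebuilds this node with `f` applied to each direct child.
    fn map_children(self, f: &dyn Fn(Expr) -> Expr) -> Expr {
        match self {
            Expr::Alias(e, name) => Expr::Alias(Box::new(f(*e)), name),
            Expr::Not(e) => Expr::Not(Box::new(f(*e))),
            Expr::Subquery(e) => Expr::Subquery(Box::new(f(*e))),
            leaf => leaf,
        }
    }
}

/// Toy down-up traversal. `f_down` is asked on the way down whether to
/// descend into a node's children; `f_up` rewrites each node on the way
/// back up, after its children have been processed (or skipped).
fn transform_down_up(
    expr: Expr,
    f_down: &dyn Fn(&Expr) -> bool,
    f_up: &dyn Fn(Expr) -> Expr,
) -> Expr {
    let expr = if f_down(&expr) {
        expr.map_children(&|child: Expr| transform_down_up(child, f_down, f_up))
    } else {
        expr
    };
    f_up(expr)
}

fn unalias_nested(expr: Expr) -> Expr {
    transform_down_up(
        expr,
        // f_down: do not descend into subqueries.
        &|e: &Expr| !matches!(e, Expr::Subquery(_)),
        // f_up: the child is already alias-free here, so unwrapping a
        // single alias level is enough and no manual recursion is needed.
        &|e: Expr| match e {
            Expr::Alias(inner, _) => *inner,
            other => other,
        },
    )
}
```

Because the alias is removed on the way up, the inner alias of `(x as foo) as bar` is already gone by the time the outer one is visited, so one unwrap per node suffices and the subquery check stays in the downward pass.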
@comphead I wonder if you have time to re-review this PR?
// f_down: skip subqueries. Check in f_down to avoid recursing into them
let recursion = if matches!(
    expr,
    Expr::Exists { .. } | Expr::ScalarSubquery(_) | Expr::InSubquery(_)
thanks @alamb
I'm not sure I read this line correctly. I just checked some queries with DuckDB, and it doesn't allow setting an alias for Exists or In subqueries:
D with t as (select 1 a) select 1 where exists (select * from t) as x;
Error: Parser Error: syntax error at or near "as"
LINE 1: ...elect 1 where exists (select * from t) as x;
^
D select 1 where 1 in (1, 2) as x;
Error: Parser Error: syntax error at or near "as"
LINE 1: select 1 where 1 in (1, 2) as x;
Yeah, I don't think aliases can be set for such plans via SQL -- I think they are (or were) sometimes set programmatically (e.g. via some other optimizer pass or rewrite)
comphead
left a comment
lgtm, thanks @alamb. There is one comment, though I'm not sure how relevant it is.
Thanks again @peter-toth and @comphead
…che#11001)
* Consolidate Expr::unalias_nested
* fix doc test
* Clarify unreachable code
* Use transform_down_up to handle recursion
Which issue does this PR close?
Follow on to #10835
Rationale for this change
As part of #10835, @peter-toth noted that the nested unaliasing code in `FilterExec` probably belonged in `Expr` to make it more discoverable: https://github.com/apache/datafusion/pull/10835/files#r1644664005
It turns out that when I looked into doing so, there was already an existing implementation that could be used.
What changes are included in this PR?
Remove `Filter::remove_aliases` and fold its implementation into `Expr::unalias_nested`
Are these changes tested?
They are covered by existing tests / CI
Are there any user-facing changes?