-
Notifications
You must be signed in to change notification settings - Fork 1.9k
fix: Dialect requires derived table alias #12994
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: Dialect requires derived table alias #12994
Conversation
* fix: Add Dialect option for requiring table aliases * feat: Add CustomDialectBuilder for requires_table_alias * docs: Spelling * refactor: rename requires_derived_table_alias * refactor: rename requires_derived_table_alias
goldmedal
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @peasee for working on this. I left some suggestions here.
| expected: | ||
| // top projection sort gets derived into a subquery | ||
| // for MySQL, this subquery needs an alias | ||
| "SELECT `j1_min` FROM (SELECT min(`ta`.`j1_id`) AS `j1_min`, min(`ta`.`j1_id`) FROM `j1` AS `ta` ORDER BY min(`ta`.`j1_id`) ASC) AS `derived_sort` LIMIT 10", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found the test only covers the derived_sort case but some cases aren't covered, such as derived_projection, derived_limit,...
Could you add more tests for them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added another test case for a more complex SQL that results in 2 nested derives for derived_sort and derived_distinct in the same query.
I'm not too sure what SQL to use to trigger derived_limit or derived_projection though... any ideas that I could include?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For derived_projection, I found the SQL can trigger:
select j1_id from (select 1 as j1_id)
-> SELECT `j1_id` FROM (SELECT 1 AS `j1_id`) AS `derived_projection`However, I guess it's not a valid SQL for MySQL, right? I tried it in DataFusion and DuckDB, they accept it but Postgres doesn't allow it. Maybe, it could be a DataFusion dialect to MySQL dialect case.
For derived_limit, a similar case can trigger it:
select * from (select * from j1 limit 10)
-> SELECT * FROM (SELECT * FROM `j1` LIMIT 10) AS `derived_limit`There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is valid for MySQL, I just tested it out in my CLI. I've added both as new cases, thanks!
|
By the way, I tried to pull your branch and run the tests on my laptop but the tests always fail. After merging with the latest main branch, the tests are passed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a good improvement for the derived table. 👍 It makes sense to me.
I noticed some databases don't support this kind of unnamed subquery, see #12896 (comment)
This PR would be helpful.
|
Maybe @phillipleblanc and @sgrebnov want to take a look at this PR. |
|
Looks good to me, thanks @goldmedal and @peasee |
|
@goldmedal perhaps you would like to merge this PR as a way to verify your permissions are configured correctly? |
Sure, I plan to merge this PR if no more comments tomorrow. |
Let's hold off on merging this for now. I think I've introduced a regression that I'd like to test more first before merging. |
Okay, looks like my PR didn't introduce the regression but there's a somewhat related bug currently on This gets rewritten in the Which is invalid, because the subquery is un-aliased. I tested it in DuckDB to confirm: Let's merge my PR, and I can raise a new issue for this other bug @goldmedal ? |
It's valid for DataFusion Generally,
How about or If it's the first one, I think it's an issue because it isn't valid. We can file an issue for it. |
My bad, I should've clarified better. It is the first one with |
I see. I think it's an issue but we can fix it in the follow-up PR. Let's merge this PR first. |
|
|
Thanks @peasee and @phillipleblanc @alamb for reviewing. 👍 |
Which issue does this PR close?
Closes #12993.
Rationale for this change
Fixes a bug preventing derived logical plans from running in MySQL due to a requirement that every derived table must have an alias:
What changes are included in this PR?
Includes a dialect update to specify whether derived tables require aliases. For dialects that do, set a static alias. I've simply selected the name of the operation prefixed with
derived_, likederived_sort, etc.Are these changes tested?
Yes
Are there any user-facing changes?
No