Skip to content

Conversation

@WinkerDu
Copy link
Contributor

@WinkerDu WinkerDu commented Mar 28, 2022

Which issue does this PR close?

Closes #1818.

Rationale for this change

when union different some constants, UNION ALL schemas are expected to be the same would raise, like

❯ select 1,2
union all
select 3,4;
Plan("UNION ALL schemas are expected to be the same")

This pr fix it.

What changes are included in this PR?

  1. except for expression Alias and Column in Projection under Union, other expressions should be aliased with column index like column0
  2. change field names match check to field type compatible check.

Are there any user-facing changes?

No.

@github-actions github-actions bot added datafusion sql SQL Planner labels Mar 28, 2022
@WinkerDu WinkerDu mentioned this pull request Mar 28, 2022
@WinkerDu
Copy link
Contributor Author

cc @xudong963 @alamb PTAL

Copy link
Member

@Ted-Jiang Ted-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @WinkerDu ❤️! would you mind fix the conflicts?

@WinkerDu
Copy link
Contributor Author

Thanks @WinkerDu ❤️! would you mind fix the conflicts?

OK, ASAP

/// Check to see if df field data types in df schema matches with input df schema
pub fn matches_arrow_schema_with_type(&self, df_schema: &DFSchema) -> bool {
self.fields
.iter()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in postgresql union support type cast like below

postgres=# select '1.2' as a union select 2.3 as b union select 3 as b ;
  a
-----
 2.3
   3
 1.2
(3 rows)

postgres=# \gdesc
 Column |  Type
--------+---------
 a      | numeric
(1 row)

Maybe we can open a ticket support this ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Ted-Jiang Agree with that.
Maybe we can solve Alias issue first, than open a new issue to fully discuss data type compatible cast?

@WinkerDu
Copy link
Contributor Author

cc @xudong963 @Ted-Jiang
PTAL, thanks.

.iter()
.zip(arrow_schema.fields().iter())
.all(|(l_field, r_field)| {
can_cast_types(r_field.data_type(), l_field.data_type())
Copy link
Contributor

@Dandandan Dandandan Mar 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this would return a Result<()> we could return a more specific error of the type mismatch here (and pass it to the user).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Dandandan OK, I'll fix it.

@WinkerDu
Copy link
Contributor Author

WinkerDu commented Apr 2, 2022

@Dandandan PTAL, thanks

@WinkerDu
Copy link
Contributor Author

WinkerDu commented Apr 2, 2022

cc @alamb @yjshen @xudong963
PTAL, thanks

Copy link
Member

@yjshen yjshen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little bit conservative on the general auto alias columns behavior. But since this work limits such aliases to only the union case, I think it's cool to go.

Thanks again @WinkerDu!

@yjshen yjshen merged commit 536210d into apache:master Apr 2, 2022
@alamb
Copy link
Contributor

alamb commented Apr 2, 2022

Thank you for sticking with this @WinkerDu -- 👍 and thank you @Ted-Jiang for the assist

@WinkerDu WinkerDu deleted the master-fix-union branch April 3, 2022 18:58
@xudong963
Copy link
Member

@WinkerDu Thank you! Sorry for not checking in time. I had some personal affairs recently.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sql SQL Planner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

A corner bug in union

6 participants