Contributor

@HaoYang670 HaoYang670 commented Nov 30, 2022

Signed-off-by: remzi 13716567376yh@gmail.com

Which issue does this PR close?

Closes #4431 .

Rationale for this change

What changes are included in this PR?

Are these changes tested?

Are there any user-facing changes?

Yes, rename CrossJoinExec::try_new to CrossJoinExec::new and update the return type to Self.
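To illustrate the user-facing change, here is a minimal sketch of an infallible constructor. It does not use the real DataFusion or Arrow types: `Field`, `Schema`, `ExecutionPlan`, and `MockPlan` are simplified stand-ins assumed for this example, and the body only mirrors the field concatenation shown in the diff under review.

```rust
use std::sync::Arc;

// Hypothetical stand-ins for illustration; NOT the real Arrow/DataFusion types.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
}

#[derive(Debug, Clone, PartialEq)]
struct Schema {
    fields: Vec<Field>,
}

trait ExecutionPlan {
    fn schema(&self) -> Arc<Schema>;
}

// Hypothetical leaf plan that simply reports a fixed schema.
struct MockPlan {
    schema: Arc<Schema>,
}

impl ExecutionPlan for MockPlan {
    fn schema(&self) -> Arc<Schema> {
        Arc::clone(&self.schema)
    }
}

#[allow(dead_code)]
struct CrossJoinExec {
    left: Arc<dyn ExecutionPlan>,
    right: Arc<dyn ExecutionPlan>,
    schema: Arc<Schema>,
}

impl CrossJoinExec {
    /// Infallible constructor: with no schema check left to fail,
    /// it returns `Self` directly instead of a `Result`.
    fn new(left: Arc<dyn ExecutionPlan>, right: Arc<dyn ExecutionPlan>) -> Self {
        let all_columns: Vec<Field> = {
            let left_schema = left.schema();
            let right_schema = right.schema();
            // Left fields first, then right fields; nothing is dropped.
            left_schema
                .fields
                .iter()
                .chain(right_schema.fields.iter())
                .cloned()
                .collect()
        };
        let schema = Arc::new(Schema { fields: all_columns });
        CrossJoinExec { left, right, schema }
    }
}
```

With a constructor of this shape, call sites no longer need `?` or `unwrap()` when building the join node.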

Signed-off-by: remzi <13716567376yh@gmail.com>
@github-actions github-actions bot added the core Core DataFusion crate label Nov 30, 2022
Signed-off-by: remzi <13716567376yh@gmail.com>
Comment on lines +67 to 75
        let all_columns = {
            let left_schema = left.schema();
            let right_schema = right.schema();
            let left_fields = left_schema.fields().iter();
            let right_fields = right_schema.fields().iter();
            left_fields.chain(right_fields).cloned().collect()
        };

        let schema = Arc::new(Schema::new(all_columns));
Member

Could you use Schema::merge here instead?

Member

You could do something like this. This would preserve schema metadata as well.

        let input_schemas = vec![left.schema().as_ref().clone(), right.schema().as_ref().clone()];
        let schema = Arc::new(Schema::try_merge(input_schemas)?);

Contributor Author

Sorry, I'm afraid try_merge doesn't fit here.
What we want is to concatenate the schemas, not merge them. A table can be cross joined with itself, or the two tables can have columns with the same name, and in those cases merging doesn't work.
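The difference can be sketched with a simplified model. The `Field` type and the two helper functions below are hypothetical stand-ins, not Arrow APIs, and `merge_fields` only approximates name-based merging: fields with the same name are unified, and conflicting types are an error.

```rust
// Hypothetical stand-in for illustration; NOT the real Arrow `Field`.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
}

/// Concatenation: keep every field from both sides, in order,
/// even when names repeat. This is what a cross join output needs.
fn concat_fields(left: &[Field], right: &[Field]) -> Vec<Field> {
    left.iter().chain(right.iter()).cloned().collect()
}

/// Simplified name-based merge: a field from `right` whose name already
/// exists is collapsed into the existing one, or rejected on a type conflict.
fn merge_fields(left: &[Field], right: &[Field]) -> Result<Vec<Field>, String> {
    let mut merged: Vec<Field> = left.to_vec();
    for f in right {
        // Look up the type of an existing field with the same name, if any.
        let existing_type = merged
            .iter()
            .find(|m| m.name == f.name)
            .map(|m| m.data_type.clone());
        match existing_type {
            Some(t) if t != f.data_type => {
                return Err(format!("conflicting types for column '{}'", f.name));
            }
            Some(_) => {} // same name and type: no new column is added
            None => merged.push(f.clone()),
        }
    }
    Ok(merged)
}
```

Under this model, a self cross join of a one-column table yields two columns when concatenated but only one when merged, and two same-named columns with different types cannot be merged at all.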

Contributor Author

For the metadata, we should also do concatenation, instead of merging.

Contributor Author

@andygrove, what's your suggestion?

Contributor

@alamb alamb left a comment

This PR makes sense to me -- thank you @HaoYang670

Contributor

alamb commented Dec 9, 2022

As I think all comments have been addressed, I am merging it in

@alamb alamb merged commit 03a6a9f into apache:master Dec 9, 2022
@HaoYang670 HaoYang670 deleted the 4413_remove_schema_checking_cross_join branch December 9, 2022 12:14

Development

Successfully merging this pull request may close these issues.

Remove the schema checking from CrossJoinExec::try_new

3 participants