Add `Projection::try_new` and `Projection::try_new_with_schema` #2900

andygrove · 2022-07-13T16:39:57Z

Which issue does this PR close?

Part of #2907

Rationale for this change

I hit the following error when running a query. It was caused by common_subexpr_eliminate creating an invalid projection, and it was hard to track down.

thread 'q11' panicked at 'index out of bounds: the len is 12 but the index is 12'

Rather than create the Projection struct directly, it is better to call a try_new constructor that can:

Perform some basic validation checks
Build the schema if none is provided to reduce boilerplate code for producing projection schemas

Using try_new gave a clearer error:

FAILURE: Plan("Projection has mismatch between number of expressions (12) and number of fields in schema (42)")
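The idea of a validating constructor can be sketched in isolation. The example below is a minimal, self-contained illustration and not the code from this PR: `Schema`, `PlanError`, and the string-based expressions are hypothetical stand-ins, and the toy constructors omit the input plan and any other arguments the real constructors may take. It only shows the pattern of deriving a schema when none is given and rejecting a mismatch between the number of expressions and the number of schema fields.

```rust
/// Hypothetical stand-in for a planning error (not DataFusion's error type).
#[derive(Debug, PartialEq)]
enum PlanError {
    Plan(String),
}

/// Toy schema: an ordered list of field names.
#[derive(Debug, Clone, PartialEq)]
struct Schema {
    fields: Vec<String>,
}

/// Toy projection: each expression is represented by its output name.
#[derive(Debug, PartialEq)]
struct Projection {
    expr: Vec<String>,
    schema: Schema,
}

impl Projection {
    /// Builds the schema from the expressions, then delegates to the
    /// validating constructor so both paths share the same checks.
    fn try_new(expr: Vec<String>) -> Result<Self, PlanError> {
        let schema = Schema { fields: expr.clone() };
        Self::try_new_with_schema(expr, schema)
    }

    /// Accepts a caller-provided schema and rejects it if the number of
    /// fields does not match the number of expressions.
    fn try_new_with_schema(expr: Vec<String>, schema: Schema) -> Result<Self, PlanError> {
        if expr.len() != schema.fields.len() {
            return Err(PlanError::Plan(format!(
                "Projection has mismatch between number of expressions ({}) and number of fields in schema ({})",
                expr.len(),
                schema.fields.len()
            )));
        }
        Ok(Self { expr, schema })
    }
}

fn main() {
    // Schema derived from the expressions: always consistent.
    let ok = Projection::try_new(vec!["a".to_string(), "b".to_string()]);
    println!("{:?}", ok);

    // One expression but two schema fields: rejected with a descriptive error.
    let bad = Projection::try_new_with_schema(
        vec!["a".to_string()],
        Schema { fields: vec!["a".to_string(), "b".to_string()] },
    );
    println!("{:?}", bad);
}
```

Routing `try_new` through `try_new_with_schema` keeps the validation in one place, so a mismatch surfaces as an error at construction time rather than as a panic later on.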

What changes are included in this PR?

Introduce the Projection::try_new and Projection::try_new_with_schema constructors and use them instead of creating structs directly

Are there any user-facing changes?

No, this is not a breaking change.

andygrove · 2022-07-14T15:02:42Z

datafusion/optimizer/src/common_subexpr_eliminate.rs

-    let mut schema = DFSchema::new_with_metadata(fields, HashMap::new())?;
-    schema.merge(input.schema());


This is the fix for #2907. This code is incorrect and is producing a projection output schema containing all the fields from the input schema. The code now just relies on the logic in Projection::try_new to produce a valid schema based on the projection expressions.

No, it wasn't the fix.

This reverts commit 28f07a0.

andygrove · 2022-07-14T15:31:26Z

@jdye64 This is the first step in tackling the index out-of-bounds issue. Could you review?

jdye64 · 2022-07-14T15:33:00Z

Sure thing, I'm cherry-picking a few things to try on my local setup. If that all goes as planned, I will review the actual changes.

jdye64

LGTM, it will be handy to have this check and clearer errors.

alamb

Looks like a nice improvement to me (and we now have a test!)

To be clear, this doesn't fix any underlying bugs (yet); it is just refactoring the code to make it easier to test, right? (Which I think is great ❤️)

👍

alamb · 2022-07-14T16:53:17Z

datafusion/expr/src/logical_plan/plan.rs

    }

+    #[test]
+    fn projection_expr_schema_mismatch() -> Result<()> {


ursabot · 2022-07-14T17:11:17Z

Benchmark runs are scheduled for baseline = 9401d6d and contender = 034678b. 034678b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

andygrove added 2 commits July 13, 2022 10:07

Projection::new

3708996

try_new and validation

2246957

github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules labels Jul 13, 2022

andygrove added 3 commits July 13, 2022 10:48

better variable name

6720cd1

split into try_new and try_new_with_schema

e41ab4b

fmt

5e813b2

andygrove changed the title ~~MINOR: Add Projection::try_new~~ MINOR: Add Projection::try_new and Projection::try_new_with_schema Jul 13, 2022

This was referenced Jul 14, 2022

Optimization rule CommonSubexprEliminate creates invalid projections #2907

Closed

Optimizer should have option to skip failing rules #2909

Merged

andygrove added 2 commits July 14, 2022 08:35

upmerge

8ceab06

fix invalid projection in common_subexpr_eliminate

28f07a0

andygrove changed the title ~~MINOR: Add Projection::try_new and Projection::try_new_with_schema~~ Add Projection::try_new and Projection::try_new_with_schema and fix invalid projection in common_subexpr_eliminate Jul 14, 2022

andygrove commented Jul 14, 2022

Revert "fix invalid projection in common_subexpr_eliminate"

616ae60

This reverts commit 28f07a0.

andygrove changed the title ~~Add Projection::try_new and Projection::try_new_with_schema and fix invalid projection in common_subexpr_eliminate~~ Add Projection::try_new and Projection::try_new_with_schema Jul 14, 2022

fix merge conflict

dcf0be3

andygrove requested a review from alamb July 14, 2022 15:26

jdye64 approved these changes Jul 14, 2022

alamb approved these changes Jul 14, 2022

andygrove merged commit 034678b into apache:master Jul 14, 2022

andygrove deleted the projection-new branch July 14, 2022 17:01
