Skip to content

Conversation

@alamb
Copy link
Contributor

@alamb alamb commented Jun 14, 2021

Closes #560

Rationale for this change

Logic is incorrect. The only way we can tell for sure that a predicate of col != literal is always false, is if both the min and max are literal (aka there is a single value in the column)

What changes are included in this PR?

  1. Fix up 2568323 / Not equal predicate in physical_planning pruning #544
  2. Add a test with expected results


let p = PruningPredicate::try_new(&expr, schema).unwrap();
let result = p.prune(&statistics).unwrap();
let expected = vec![true, true, true, true, true];
Copy link
Contributor Author

@alamb alamb Jun 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On master this test fails thusly:

---- physical_optimizer::pruning::tests::prune_not_eq_data stdout ----
thread 'physical_optimizer::pruning::tests::prune_not_eq_data' panicked at 'assertion failed: `(left == right)`
  left: `[false, true, true, false, true, true]`,
 right: `[true, true, true, false, true, true]`', datafusion/src/physical_optimizer/pruning.rs:1218:9

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(aka it incorrectly prunes out the range of A -> Z)

@alamb alamb marked this pull request as draft June 14, 2021 18:08
@alamb alamb marked this pull request as ready for review June 14, 2021 18:23
@alamb
Copy link
Contributor Author

alamb commented Jun 14, 2021

fyi @jgoday

@alamb alamb changed the title Revert pruning on not equal predicate Fix pruning on not equal predicate Jun 14, 2021
Copy link
Contributor

@Dandandan Dandandan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great that this was caught by InfluxDB tests, seems correct now to me.

@jgoday
Copy link
Contributor

jgoday commented Jun 14, 2021

fyi @jgoday

@alamb, Sorry for introducing the bug, prune_not_eq_data test is much clearer to me now. Thank you!

@alamb
Copy link
Contributor Author

alamb commented Jun 14, 2021

@alamb, Sorry for introducing the bug, prune_not_eq_data test is much clearer to me now. Thank you!

No worries @jgoday -- both @Dandandan and I missed it on the review as well! This is tricky stuff

@alamb alamb merged commit e3e7e29 into apache:master Jun 14, 2021
@alamb alamb deleted the alamb/prune_bug branch June 14, 2021 21:16
@houqp houqp added bug Something isn't working datafusion labels Jul 30, 2021
unkloud pushed a commit to unkloud/datafusion that referenced this pull request Mar 23, 2025
* remove unsafe uses of from_raw_parts from crc32 and murmur hash functions

* remove unused hash_utils.rs
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Pruning on != predicate results in incorrect results

4 participants