Skip to content

Conversation

@ahmedabu98
Copy link
Contributor

@ahmedabu98 ahmedabu98 commented May 3, 2025

Part of #34789

Allows users to pass a SQL expression to filter files and rows when scanning. For example: "colA" = 'SUCCESS' AND "date" < '2025-05-06'

Uses Apache Calcite to parse SQL expressions (see doc reference: https://calcite.apache.org/docs/reference.html)

@ahmedabu98 ahmedabu98 marked this pull request as ready for review May 6, 2025 10:31
@github-actions
Copy link
Contributor

github-actions bot commented May 6, 2025

Assigning reviewers. If you would like to opt out of this review, comment assign to next reviewer:

R: @robertwb for label java.
R: @kennknowles for label website.

Available commands:

  • stop reviewer notifications - opt out of the automated review tooling
  • remind me after tests pass - tag the comment author after tests pass
  • waiting on author - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)

The PR bot will only process comments in the main thread (not review comments).

@ahmedabu98 ahmedabu98 changed the title [IcebergIO] support scan filtering [IcebergIO] Support filter pushdown during reads May 6, 2025
@derrickaw
Copy link
Collaborator

Hi @ahmedabu98, are you still trying to fix the tests or is this truly ready for review again? Thanks!

@ahmedabu98
Copy link
Contributor Author

Test failures are irrelevant, this is ready for a review

@derrickaw
Copy link
Collaborator

Hi @robertwb and @kennknowles - please review when you get a chance. Thanks!

Copy link
Contributor

@chamikaramj chamikaramj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

* Utilities that convert between a SQL filter expression and an Iceberg {@link Expression}. Uses
* Apache Calcite semantics.
*
* <p>Note: Only supports top-level fields (i.e. cannot reference nested fields).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's make sure we clearly fail for unsupported queries.

call,
schema);
case NOT_EQ:
return convertFieldAndLiteral(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's try to do 100% test coverage for this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried to do that in FilterUtilsTest. Let me know if anything is missing

Copy link
Contributor

@chamikaramj chamikaramj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks!

@ahmedabu98
Copy link
Contributor Author

Failing test is unrelated -- merging now

@ahmedabu98 ahmedabu98 merged commit 5c28966 into apache:master May 13, 2025
23 of 24 checks passed
@barunkumaracharya
Copy link

Any ETA when this change will be released officially ?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants