-
Notifications
You must be signed in to change notification settings - Fork 3k
Add delete_files metadata table #4243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| .withPartition(TestHelpers.Row.of(0)) | ||
| .withRecordCount(1) | ||
| .build(); | ||
| static final DataFile FILE_PARTITION_1 = DataFiles.builder(SPEC) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just some minor cleanup to unify some of the test tables used in "TestMetadataTableScans", I had added these sometime ago and noticed they are actually the same as the more commonly used FILES_A, FILES_B, etc.
aokolnychyi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the approach is correct but I'd love to share more logic, if possible.
| caseSensitive) | ||
| .project(rowFilter); | ||
|
|
||
| ManifestEvaluator manifestEval = ManifestEvaluator.forPartitionFilter( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not related to this PR but this logic may not be correct. We should create an issue and follow up. We always pass the current spec. Instead, we should use the spec to which our particular manifest belongs to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition, we should support predicate pushdown in ALL_DATA_FILES.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not related to this PR but this logic may not be correct. We should create an issue and follow up. We always pass the current spec. Instead, we should use the spec to which our particular manifest belongs to.
Made the issue: #4271
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition, we should support predicate pushdown in ALL_DATA_FILES.
Yea this is on my list, I was thinking after this base class it will make this much easier.
4857613 to
3fbc71f
Compare
aokolnychyi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change looks almost ready. I had some final nits.
I'll take a look at tests with fresh eyes tomorrow.
core/src/test/java/org/apache/iceberg/TestMetadataTableScans.java
Outdated
Show resolved
Hide resolved
...2/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java
Outdated
Show resolved
Hide resolved
cc6b575 to
6ddb266
Compare
|
Thanks for taking a look @aokolnychyi , changed to address comments |
aokolnychyi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
core/src/test/java/org/apache/iceberg/hadoop/TestTableSerialization.java
Outdated
Show resolved
Hide resolved
core/src/test/java/org/apache/iceberg/TestMetadataTableScans.java
Outdated
Show resolved
Hide resolved
...2/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java
Outdated
Show resolved
Hide resolved
...2/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java
Outdated
Show resolved
Hide resolved
|
Thanks @aokolnychyi for the detailed review |
(cherry picked from commit ad1e634)
Closes #4139
It was decided there to make it into a separate table.
This change refactors some of the DataFilesTable logic (like schema, partition pruning) into a base class and re-uses it for DeleteFilesTable.