feat(reader): null struct default values in create_column #1847

Merged

liurenjie1024 merged 2 commits into apache:main on Nov 13, 2025
Conversation
Fixes `TestSparkReaderDeletes.testPosDeletesOnParquetFileWithMultipleRowGroups` in Iceberg Java 1.10 with DataFusion Comet.

## Which issue does this PR close?

- Partially address apache#1749.

## What changes are included in this PR?

- While `RecordBatchTransformer` does not have exhaustive nested type support yet, this adds logic to `create_column` for the specific scenario of schema evolution with a new struct column that uses the default NULL value.
- If the column has a default value other than NULL defined, it falls into the existing match arm and is reported as unsupported.

## Are these changes tested?

New test reflecting what happens with Iceberg Java 1.10's `TestSparkReaderDeletes.testPosDeletesOnParquetFileWithMultipleRowGroups`. The test name is misleading: positional deletes alone would just be a delete vector and be schema agnostic, but [the test includes a schema change with binary and struct types, so default NULL values are needed](https://github.com/apache/iceberg/blob/53c046efda5d6c6ac67caf7de29849ab7ac6d406/data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java#L65).
liurenjie1024 (Contributor) approved these changes on Nov 13, 2025, leaving a comment:

> Thanks @mbutrovich for this fix!
chenzl25 pushed a commit to risingwavelabs/iceberg-rust that referenced this pull request on Feb 26, 2026:

Fixes `TestSparkReaderDeletes.testPosDeletesOnParquetFileWithMultipleRowGroups` in Iceberg Java 1.10 with DataFusion Comet.

## Which issue does this PR close?

- Partially address apache#1749.

## What changes are included in this PR?

- While `RecordBatchTransformer` does not have exhaustive nested type support yet, this adds logic to `create_column` in the specific scenario for a schema evolution with a new struct column that uses the default NULL value.
- If the column has a default value other than NULL defined, it will fall into the existing match arm and say it is unsupported.

## Are these changes tested?

New test to reflect what happens with Iceberg Java 1.10's `TestSparkReaderDeletes.testPosDeletesOnParquetFileWithMultipleRowGroups`. The test is misleading, since I figured testing positional deletes would just be a delete vector and be schema agnostic, but [it includes schema change with binary and struct types so we need default NULL values](https://github.com/apache/iceberg/blob/53c046efda5d6c6ac67caf7de29849ab7ac6d406/data/src/test/java/org/apache/iceberg/data/DeleteReadTests.java#L65).

(cherry picked from commit 12c4c21)
chenzl25 added a commit to risingwavelabs/iceberg-rust that referenced this pull request on Feb 26, 2026:

… (#127)

* feat(reader): null struct default values in create_column (apache#1847) (cherry picked from commit 12c4c21)
* fix tests
* fmt
* fix ci
* fix ci
* fix ci
* clippy
* fix ci
* fix ci
* fix ci
* fix ci
* fix

Co-authored-by: Matt Butrovich <mbutrovich@users.noreply.github.com>
chenzl25 added a commit to risingwavelabs/iceberg-rust that referenced this pull request on Mar 3, 2026.
blackmwk pushed a commit that referenced this pull request on Mar 10, 2026:

## Which issue does this PR close?

Similar to #1847

- Closes #.

## What changes are included in this PR?

- `RecordBatchTransformer` does not support the timestamp type. This PR adds logic to `create_column` for the specific scenario of schema evolution with a new timestamp column.

## Are these changes tested?

Two unit tests, `test_create_timestamp_microsecond_with_timezone_array_repeated` and `test_create_timestamp_microsecond_array_repeated`, are added.
gbrgr pushed a commit to RelationalAI/iceberg-rust that referenced this pull request on Mar 10, 2026.
big-mac-slice pushed a commit to perpetualsystems/iceberg-rust that referenced this pull request on Apr 2, 2026.