feat(core): Add support for replace in incremental scan #8
Conversation
vustef
left a comment
Just a couple clarifications
/// 1. Files to compact: Vec<String> of existing file names that are being compacted
/// 2. Target file: String name of the new compacted file
///
/// Example: `Replace(vec!["file-a.parquet", "file-b.parquet"], "file-a-b-compacted.parquet")`
Is this how Iceberg engines do it too? How do they retarget positional delete files to file-a-b-compacted.parquet?
What Spark essentially does is that file-a-b-compacted.parquet will contain the records of file-a + file-b minus the positional deletes (and equality deletes). However, the existing delete files of file-a and file-b remain in place.
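A minimal sketch of that compaction semantics (the function and data shapes here are illustrative, not this crate's API): the compacted file holds the surviving records of all source files, with positions covered by positional delete files filtered out.

```rust
// Hypothetical sketch: build a compacted file's contents from source
// files and their positional deletes. Names are illustrative only.
fn compact(
    files: &[(&str, Vec<i64>)],     // (source file name, its records)
    pos_deletes: &[(&str, usize)],  // (source file name, deleted position)
) -> Vec<i64> {
    let mut out = Vec::new();
    for (name, records) in files {
        for (pos, rec) in records.iter().enumerate() {
            // Skip any position covered by a positional delete for this file.
            let deleted = pos_deletes.iter().any(|(f, p)| f == name && *p == pos);
            if !deleted {
                out.push(*rec);
            }
        }
    }
    out
}

fn main() {
    // file-a holds [1, 2, 3]; position 1 of file-a is deleted;
    // file-b holds [4, 5] with no deletes.
    let compacted = compact(
        &[("file-a.parquet", vec![1, 2, 3]), ("file-b.parquet", vec![4, 5])],
        &[("file-a.parquet", 1)],
    );
    // The compacted file contains the survivors of both inputs.
    assert_eq!(compacted, vec![1, 3, 4, 5]);
    println!("{:?}", compacted);
}
```

Note the delete files themselves are not rewritten here, matching the comment above: they stay in place and simply no longer apply to the new file.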
// Snapshot 6: Delete position 2 from file-ab-compact (record "4" deleted)
// Net result:
// - Additions: compacted records with position 1 filtered from file-a (1,3,5,10,11,12)
// - Deletions: All positions from file-a (0-4) and file-b (0-2) because these files
This would report a duplicate delete for record "2", right?
No, not really. The delete for the compacted file is filtered out, and only appends for that file are reported. Even if it weren't, it depends on what we mean by a duplicated delete: since the file path is also reported, the file path for one of the records "2" would be file-a, while the other would be file-ab-compacted.
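To illustrate the second point, a tiny sketch (file names and the key shape are illustrative): change records are keyed by (file path, record), so two deletes for the same record value are still distinguishable when they reference different files.

```rust
fn main() {
    // Hypothetical delete records as (file_path, record) pairs.
    let deletes = [
        ("file-a.parquet", "2"),
        ("file-ab-compacted.parquet", "2"),
    ];
    // Same record value...
    assert_eq!(deletes[0].1, deletes[1].1);
    // ...but different file paths, so the pairs are not duplicates.
    assert_ne!(deletes[0], deletes[1]);
    println!("distinct deletes: {:?}", deletes);
}
```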
Closes RAI-44110
Adds support for `replace` operations in snapshot histories for incremental scans. Even though replace operations logically keep the data the same, we still report file additions and deletions, since the physical layout changes and the files to which the rows belong change. This is necessary for incremental-scan users who want to base change tracking on file identifiers.
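The reporting rule described above can be sketched as follows (the `Operation` and `FileChange` types are hypothetical stand-ins, not this crate's API): a replace surfaces its source files as deletions and the compacted target as an addition, even though the logical data is unchanged.

```rust
// Hypothetical sketch: a replace op mapped to file-level changes.
enum Operation {
    // (files being compacted, name of the new compacted file)
    Replace(Vec<String>, String),
}

struct FileChange {
    added: Vec<String>,
    deleted: Vec<String>,
}

fn changes(op: &Operation) -> FileChange {
    match op {
        // The physical layout changed, so sources are reported as
        // deletions and the compacted target as an addition.
        Operation::Replace(sources, target) => FileChange {
            added: vec![target.clone()],
            deleted: sources.clone(),
        },
    }
}

fn main() {
    let op = Operation::Replace(
        vec!["file-a.parquet".into(), "file-b.parquet".into()],
        "file-a-b-compacted.parquet".into(),
    );
    let ch = changes(&op);
    assert_eq!(ch.added, vec!["file-a-b-compacted.parquet".to_string()]);
    assert_eq!(ch.deleted.len(), 2);
    println!("added={:?} deleted={:?}", ch.added, ch.deleted);
}
```

A consumer tracking changes by file identifier can then treat the rows of the deleted files as removed and the rows of the added file as new, even though the net logical content is identical.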