Conversation

@cccs-jc
Contributor

@cccs-jc cccs-jc commented Nov 3, 2023

Closes #8902

@singhpk234 I have fixed issue #8902. Could you have a look at it?

```diff
 for (ManifestFile manifest : manifestFiles) {
   manifestIndexes.add(Pair.of(manifest, currentFileIndex));
-  currentFileIndex += manifest.addedFilesCount() + manifest.existingFilesCount();
+  currentFileIndex += manifest.addedFilesCount();
```
Contributor

Can we please add a unit test for this?

Contributor Author

I would, but I don't know how.

Contributor

Ya, we need to think of a case where this would actually result in incorrect results. I am a bit surprised that there is no existing case for testing this code path. When I was adding the rate-limit code, I just refactored this part into a common function.

Contributor Author

Ya, none of the snapshots have an existing file count value.

When I look at production tables, it's actually pretty rare to see that. I don't know what causes the existing file count to be set.

Contributor
@singhpk234 singhpk234 Nov 14, 2023

@cccs-jc IMHO we should not remove this line until we have coverage for it. Can you please revert this change?

Contributor

I think it is relatively easy to reason about this part because it is used only in one path.

Existing entries are entries for data files that are already part of a manifest. They appear, for example, when you rewrite the metadata because you use a lot of fast-appends: with a fast append, a new manifest is written and added to the manifest list. At some point you want to combine these small manifests into a bigger one, to reduce the number of calls to the object store and to make the planning part faster.

We want to skip when only the metadata has been rewritten, so I think this change is correct.
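To make that concrete, here is a minimal sketch (not code from this PR) of how such manifests arise with the core Iceberg API, assuming an existing `Table table` and `DataFile` handles `fileA`/`fileB`/`fileC`:

```java
// Lower the merge threshold so small manifests get combined sooner
// (commit.manifest.min-count-to-merge; the default is 100).
table.updateProperties()
    .set(TableProperties.MANIFEST_MIN_MERGE_COUNT, "3")
    .commit();

// Each fast append writes its own small manifest.
table.newFastAppend().appendFile(fileA).commit();
table.newFastAppend().appendFile(fileB).commit();

// A regular append merges the small manifests once the threshold is reached;
// the merged manifest then carries fileA and fileB as EXISTING entries.
table.newAppend().appendFile(fileC).commit();
```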

Contributor Author

Exactly. Here I show how existingFilesCount is only set when doing rewrites:

#8980 (comment)

However, the micro-batch streaming reader skips over rewrites, so it will never happen that existingFilesCount is set in the above code.

Contributor

It can happen. For example, it is easy to reproduce in TestTransaction::testTransactionRecommit, where a new data file is committed and merged into an existing manifest:

[screenshot: manifest entries from TestTransaction::testTransactionRecommit]

Now I'm questioning skipManifests: is it valid to rely on the counts of manifests? But I have to dig deeper into this code, since I'm not too familiar with it.

Contributor Author

@singhpk234 I'll put the `+ manifest.existingFilesCount()` back.

Now that I know how to create manifests with existing file counts, I will add this setting to the test suite (see the sketch below):

`commit.manifest.min-count-to-merge=3`

This way the test will include some manifest entries with existing file counts. I ran the tests again both with and without the `+ manifest.existingFilesCount()`, and all the tests still pass. I think that makes sense, because skipManifests is an optimization.
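For reference, a hedged sketch of what the verification could look like (this is not the PR's actual test; `table` and AssertJ's `assertThat` are assumed to be in scope):

```java
// After a few fast appends followed by a regular append with
// commit.manifest.min-count-to-merge=3, the merged manifest should carry
// EXISTING entries; check that before exercising the streaming read.
boolean hasExistingEntries =
    table.currentSnapshot().allManifests(table.io()).stream()
        .anyMatch(m -> m.existingFilesCount() != null && m.existingFilesCount() > 0);
assertThat(hasExistingEntries).isTrue();
```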

Have a look and let me know what you think.

@cccs-jc cccs-jc requested review from nastra and singhpk234 November 14, 2023 19:10
Contributor
@singhpk234 singhpk234 left a comment

@cccs-jc I would recommend making the changes one Spark version at a time and then creating back-port PRs. I am not sure what is preferred, but checking one version at a time helps with review and focus.


```java
      shouldContinueReading = false;
      break;
    }
    // we found the next available snapshot, continue from there.
```
Contributor

Is this comment required?

Contributor Author

It makes it explicit that we skipped some snapshots and are continuing from nextValid.


```java
Snapshot nextSnapshot = SnapshotUtil.snapshotAfter(table, curSnapshot.snapshotId());
// skip over rewrite and delete snapshots
while (!shouldProcess(nextSnapshot)) {
```
Contributor

Does shouldProcess handle null?

Contributor Author

I think SnapshotUtil.snapshotAfter never returns null.
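For what it's worth, my reading of core (hedged, worth double-checking) is that snapshotAfter fails loudly rather than returning null, so the loop never sees one:

```java
// SnapshotUtil.snapshotAfter resolves the child of the given snapshot on the
// current branch and throws if none exists, rather than returning null, so
// shouldProcess never receives a null snapshot here.
Snapshot nextSnapshot = SnapshotUtil.snapshotAfter(table, curSnapshot.snapshotId());
while (!shouldProcess(nextSnapshot)) {
  nextSnapshot = SnapshotUtil.snapshotAfter(table, nextSnapshot.snapshotId());
}
```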

@cccs-jc
Contributor Author

cccs-jc commented Nov 22, 2023

> @cccs-jc I would recommend making the changes one Spark version at a time and then creating back-port PRs. I am not sure what is preferred, but checking one version at a time helps with review and focus.

I removed the 3.4 version. The only version in the commit is 3.5.

@cccs-jc cccs-jc requested a review from singhpk234 November 22, 2023 20:59
@cccs-jc
Contributor Author

cccs-jc commented Nov 23, 2023

@singhpk234 As you recommended, I removed the 3.4 implementation and kept only the 3.5 version.

However, now the test cases for 3.4 are failing. Any idea how to fix this? Should I just put back the 3.4 implementation?

@cccs-jc
Contributor Author

cccs-jc commented Nov 28, 2023

> @singhpk234 As you recommended, I removed the 3.4 implementation and kept only the 3.5 version.
>
> However, now the test cases for 3.4 are failing. Any idea how to fix this? Should I just put back the 3.4 implementation?

What do you think, @singhpk234?

@singhpk234
Contributor

@cccs-jc I mean let's have the changes for 3.5, with their tests, only in 3.5; we can then backport the change with its tests to the lower Spark versions, 3.4 and 3.3. The 3.4 test failures are expected, right? We don't have the SparkMicroBatchStream changes for 3.4 in this PR.

Also, I would request reverting the change in core for Microbatch.java if we don't have coverage for it, as I am unsure when it would fail (maybe some legacy handling).

Apologies for being late in getting back on this.

@cccs-jc
Contributor Author

cccs-jc commented Dec 7, 2023

> @cccs-jc I mean let's have the changes for 3.5, with their tests, only in 3.5; we can then backport the change with its tests to the lower Spark versions, 3.4 and 3.3. The 3.4 test failures are expected, right? We don't have the SparkMicroBatchStream changes for 3.4 in this PR.
>
> Also, I would request reverting the change in core for Microbatch.java if we don't have coverage for it, as I am unsure when it would fail (maybe some legacy handling).
>
> Apologies for being late in getting back on this.

Keeping the `+ existingFilesCount()` in SparkMicrobatch.java makes no sense to me.

What is the purpose of adding that to currentFileIndex?

The way I understand it, currentFileIndex is a position over the added files, so we want to count only the added files (addedFilesCount()). Those are the files that a streaming job should consume.

Can you explain the purpose of using existingFilesCount here?
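For reference, this is the shape of the indexing loop after the change (restating the diff above; the surrounding declarations are assumed):

```java
// Each manifest is paired with the index of its first added file; since the
// stream only ever reads ADDED entries, only addedFilesCount() advances it.
List<Pair<ManifestFile, Integer>> manifestIndexes = Lists.newArrayList();
int currentFileIndex = 0;
for (ManifestFile manifest : manifestFiles) {
  manifestIndexes.add(Pair.of(manifest, currentFileIndex));
  currentFileIndex += manifest.addedFilesCount();
}
```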

@singhpk234
Contributor

> Can you explain the purpose of using existingFilesCount here?

I am not fully aware of the logic here. Logically I totally agree with you that it makes no sense to keep it, but what I am skeptical about is whether there is some handling for backward compatibility (I'm not sure about that either). You also suggested that you were not able to populate this field (referring to your comment here: #8980 (comment)). So my rationale was to add this change only once we have some coverage for it, or better, to take this change out of the current PR, involve more folks in the discussion, and let this PR go in as the skip-overwrite-commits functionality only. Please let me know your thoughts.

@cccs-jc
Contributor Author

cccs-jc commented Dec 13, 2023

So I did more digging. On our production tables I searched for all manifests which have existing_data_files_count > 0 and added_data_files_count > 0, and I found none. This leads me to believe that a commit will either be an append with added_data_files_count or a rewrite with existing_data_files_count.

This query returns no results:

```sql
select distinct added_snapshot_id
from catalog1.schema1.table1.manifests
where existing_data_files_count > 0
  and added_data_files_count > 0
```

I can search for manifests which have existing_data_files_count > 0 and join those results to the snapshots.

```sql
select *
from catalog1.schema1.table1.snapshots
where snapshot_id in (
    select distinct added_snapshot_id
    from catalog1.schema1.table1.manifests
    where existing_data_files_count > 0
)
```

Manifests with the snapshot_id they belong to:
[screenshot omitted]

Their corresponding snapshots are all rewrite snapshots:
[screenshot omitted]

When streaming, we skip over rewrite snapshots, so we will never encounter a manifest with existing_data_files_count > 0.

So the `+ existingFilesCount()` in this code does nothing.
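A hedged sketch of why (the exact predicate is assumed from the "skip over rewrite and delete snapshots" comment in the diff, not copied from the PR):

```java
// Snapshots are filtered by operation before any manifest indexing happens:
// REPLACE (rewrite) and DELETE snapshots are skipped entirely, and per the
// query results above those are the snapshots whose manifests carry
// existing_data_files_count > 0.
private boolean shouldProcess(Snapshot snapshot) {
  String op = snapshot.operation();
  return !DataOperations.REPLACE.equals(op) && !DataOperations.DELETE.equals(op);
}
```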

@Fokko Fokko added this to the Iceberg 1.5.0 milestone Jan 4, 2024
Contributor
@Fokko Fokko left a comment

This makes sense to me, thanks @cccs-jc for working on this.

@singhpk234 Do you have any further concerns? I'm also very skeptical of counting the added files, and I think we might want to remove that piece of logic (in a separate PR).

@singhpk234
Contributor

> I'm also very skeptical of counting the added files, and I think we might want to remove that piece of logic (in a separate PR).

+1 on this @Fokko. Other than this, no further concerns from my end!


Development

Successfully merging this pull request may close these issues.

Iceberg streaming streaming-skip-overwrite-snapshots SparkMicroBatchStream only skips over one file per trigger
