KAFKA-13098: Fix NoSuchFileException during snapshot recovery #11071

Merged
cmccabe merged 1 commit into apache:trunk from jsancio:kafka-13098-error-in-snapshot-recovery
Jul 16, 2021

@jsancio
Member

@jsancio jsancio commented Jul 16, 2021

Java's FileTreeIterator throws a NoSuchFileException when visiting
@metadata-0/partition.metadata.tmp. This is most likely due to the
fact that the Log type asynchronously creates and deletes that file.
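
To illustrate the race described above, here is a minimal Java sketch of a directory scan that tolerates entries disappearing mid-scan. It is an assumption-laden illustration, not necessarily the approach this patch takes: the class name `SnapshotScan`, the helper `listFiles`, and the suffix argument are all hypothetical. It lists the directory with `Files.newDirectoryStream` and skips any entry whose attributes can no longer be read because another thread deleted it.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Kafka.
final class SnapshotScan {
    private SnapshotScan() {}

    // Returns the regular files directly under `dir` whose names end with
    // `suffix`, sorted by path. Entries deleted during the scan are skipped.
    static List<Path> listFiles(Path dir, String suffix) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                try {
                    // Reading attributes fails with NoSuchFileException if the
                    // entry was removed after the directory listing saw it.
                    BasicFileAttributes attrs =
                        Files.readAttributes(entry, BasicFileAttributes.class);
                    if (attrs.isRegularFile()
                            && entry.getFileName().toString().endsWith(suffix)) {
                        result.add(entry);
                    }
                } catch (NoSuchFileException e) {
                    // A short-lived file (for example a *.tmp file) vanished;
                    // it cannot be a file we care about, so ignore it.
                }
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

A caller might invoke `SnapshotScan.listFiles(logDir, ".checkpoint")`, where both `logDir` and the `.checkpoint` suffix are placeholders for this example. Catching only `NoSuchFileException` keeps other I/O failures visible to the caller.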

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@jsancio
Member Author

jsancio commented Jul 16, 2021

This commit needs to get cherry picked to 2.8 and 3.0.

Contributor

@cmccabe cmccabe left a comment

LGTM

@cmccabe cmccabe merged commit 8e48212 into apache:trunk Jul 16, 2021
@cmccabe
Contributor

cmccabe commented Jul 16, 2021

I cherry-picked this to 3.0, but the commit does not apply to 2.8. Since we never created snapshot files in 2.8, are you sure it needs to be present there?

cmccabe pushed a commit that referenced this pull request Jul 16, 2021
Fix a bug where if a snapshot file is deleted while we're running snapshot recovery,
a NoSuchFileException will be thrown and snapshot recovery will fail.

Reviewers: Colin P. McCabe <cmccabe@apache.org>
@cmccabe
Contributor

cmccabe commented Jul 16, 2021

Possibly related: #11056

@jsancio jsancio deleted the kafka-13098-error-in-snapshot-recovery branch July 16, 2021 20:10
@jsancio
Member Author

jsancio commented Jul 16, 2021

> I cherry-picked this to 3.0, but the commit does not apply to 2.8. Since we never created snapshot files in 2.8, are you sure it needs to be present there?

@cmccabe My understanding is that @jolshan and @junrao are planning to cherry pick #11056 to 2.8 because it is a performance regression in 2.8.

Even though there are no snapshots in the metadata directory in 2.8, this code exists and gets executed in 2.8:
https://github.com/apache/kafka/blob/2.8.0/core/src/main/scala/kafka/raft/KafkaMetadataLog.scala#L336-L360

We can either proactively cherry-pick this change now or wait until #11056 gets cherry-picked to 2.8.

@jolshan
Member

jolshan commented Jul 16, 2021

Since I've confirmed that we don't use partition.metadata files for the metadata topic in 2.8, I've removed this version from the ticket. Thanks @jsancio for this patch.

xdgrulez pushed a commit to xdgrulez/kafka that referenced this pull request Dec 22, 2021
…#11071)
