Skip to content

KAFKA-14238; KRaft metadata log should not delete segment past the latest snapshot#12655

Merged
hachikuji merged 2 commits intoapache:trunkfrom
jsancio:kafka-14238-kraft-segment-delete
Sep 17, 2022
Merged

KAFKA-14238; KRaft metadata log should not delete segment past the latest snapshot#12655
hachikuji merged 2 commits intoapache:trunkfrom
jsancio:kafka-14238-kraft-segment-delete

Conversation

@jsancio
Copy link
Copy Markdown
Member

@jsancio jsancio commented Sep 16, 2022

Disable segment deletion based on size and time by setting the KRaft metadata log's RetentionMsProp and RetentionBytesProp to -1. This will cause UnifiedLog.deleteRetentionMsBreachedSegments and UnifiedLog.deleteRetentionSizeBreachedSegments to short circuit instead of deleting segments.

Without this changes the included test would fail. This happens because deleteRetentionMsBreachedSegments is able to delete past the logStartOffset. Deleting past the logStartOffset would violate the invariant that if the logStartOffset is greater than 0 then there is a snapshot with an end offset greater than or equal to the log start offset.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@jsancio jsancio added core Kafka Broker 3.3 labels Sep 16, 2022
@jsancio jsancio requested a review from hachikuji September 16, 2022 22:27
@jsancio jsancio force-pushed the kafka-14238-kraft-segment-delete branch from e7694f9 to 298436b Compare September 16, 2022 22:33
@jsancio jsancio changed the title Kafka 14238 kraft segment delete KAFKA-4238; KRaft metadata log should not delete segment past the latest snapshot Sep 16, 2022
@jsancio jsancio changed the title KAFKA-4238; KRaft metadata log should not delete segment past the latest snapshot KAFKA-14238; KRaft metadata log should not delete segment past the latest snapshot Sep 16, 2022
Comment on lines +565 to +566
props.setProperty(LogConfig.RetentionMsProp, "-1")
props.setProperty(LogConfig.RetentionBytesProp, "-1")
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The long term solution is to also implement this feature documented in KIP-630: https://issues.apache.org/jira/browse/KAFKA-14241

Comment on lines +916 to +920
val latestSnapshotOffset = log.latestSnapshotId().get.offset
assertTrue(
latestSnapshotOffset >= log.startOffset,
s"latest snapshot offset ($latestSnapshotOffset) must be >= log start offset (${log.startOffset})"
)
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without this change this check fails with:

> Task :core:test FAILED
kafka.raft.KafkaMetadataLogTest.testSegmentLessThanLatestSnapshot() failed, log available in /home/jsancio/work/kafka/core/build/reports/testOutput/kafka.raft.KafkaMetadataLogTest.testSegmentLessThanLatestSnapshot().test.stdoutKafkaMetadataLogTest > testSegmentNotDeleteWithoutSnapshot() FAILED
    org.opentest4j.AssertionFailedError: latest snapshot offset (1440) must be >= log start offset (20010) ==> expected: <true> but was: <false>
        at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
        at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
        at org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
        at org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
        at org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:210)
        at kafka.raft.KafkaMetadataLogTest.testSegmentLessThanLatestSnapshot(KafkaMetadataLogTest.scala:921)

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM.

throw new InvalidConfigurationException(
s"Cannot set $MetadataLogSegmentBytesProp below ${config.logSegmentMinBytes}: ${config.logSegmentBytes}"
)
} else if (defaultLogConfig.retentionMs >= 0) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really else if? Or should it be just if?

Copy link
Copy Markdown
Member

@ijuma ijuma Sep 17, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the behavior is the same since we always throw an exception in each block, but it seems a bit odd to use else if for unrelated conditions. Feel free to ignore though. :)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is the same. I learned to do this because it lowers the cyclomatic complexity. It looks like the static analyzer understand else if but not throw when computing that value.

@jsancio
Copy link
Copy Markdown
Member Author

jsancio commented Sep 17, 2022

Rerunning build. One of the configuration didn't finish.

Copy link
Copy Markdown
Member

@showuon showuon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice workaround! Thanks for the fix!

@showuon
Copy link
Copy Markdown
Member

showuon commented Sep 17, 2022

Rerunning build. One of the configuration didn't finish.

I believe this PR will fix the issue: #12639 . I think we should merge it soon to fix our PR gating and trunk/3.3 build status!

@hachikuji hachikuji merged commit b166ac4 into apache:trunk Sep 17, 2022
hachikuji pushed a commit that referenced this pull request Sep 17, 2022
…test snapshot (#12655)

Disable segment deletion based on size and time by setting the KRaft metadata log's `RetentionMsProp` and `RetentionBytesProp` to `-1`. This will cause `UnifiedLog.deleteRetentionMsBreachedSegments` and `UnifiedLog.deleteRetentionSizeBreachedSegments` to short circuit instead of deleting segments.

Without this changes the included test would fail. This happens because `deleteRetentionMsBreachedSegments` is able to delete past the `logStartOffset`. Deleting past the `logStartOffset` would violate the invariant that if the `logStartOffset` is greater than 0 then there is a snapshot with an end offset greater than or equal to the log start offset.

Reviewers: Luke Chen <showuon@gmail.com>, Jason Gustafson <jason@confluent.io>
@jsancio jsancio deleted the kafka-14238-kraft-segment-delete branch September 19, 2022 16:00
ijuma added a commit to confluentinc/kafka that referenced this pull request Sep 21, 2022
…eptember 2022)

`Jenkinsfile` was the only conflict and we ignore the changes since
they are not relevant to the Confluent build.

* apache-github/3.3: (61 commits)
  KAFKA-14214: Introduce read-write lock to StandardAuthorizer for consistent ACL reads. (apache#12628)
  KAFKA-14243: Temporarily disable unsafe downgrade (apache#12664)
  KAFKA-14240; Validate kraft snapshot state on startup (apache#12653)
  KAFKA-14233: disable testReloadUpdatedFilesWithoutConfigChange first to fix the build (apache#12658)
  KAFKA-14238;  KRaft metadata log should not delete segment past the latest snapshot (apache#12655)
  KAFKA-14156: Built-in partitioner may create suboptimal batches (apache#12570)
  MINOR: Adds KRaft versions of most streams system tests (apache#12458)
  MINOR; Add missing li end tag (apache#12640)
  MINOR: Mention that kraft is production ready in upgrade notes (apache#12635)
  MINOR: Add upgrade note regarding the Strictly Uniform Sticky Partitioner (KIP-794)  (apache#12630)
  KAFKA-14222; KRaft's memory pool should always allocate a buffer (apache#12625)
  KAFKA-14208; Do not raise wakeup in consumer during asynchronous offset commits (apache#12626)
  KAFKA-14196; Do not continue fetching partitions awaiting auto-commit prior to revocation (apache#12603)
  KAFKA-14215; Ensure forwarded requests are applied to broker request quota (apache#12624)
  MINOR; Remove end html tag from upgrade (apache#12605)
  Remove the html end tag from upgrade.html
  KAFKA-14205; Document how to replace the disk for the KRaft Controller (apache#12597)
  KAFKA-14203 Disable snapshot generation on broker after metadata errors (apache#12596)
  KAFKA-14216: Remove ZK reference from org.apache.kafka.server.quota.ClientQuotaCallback javadoc (apache#12617)
  KAFKA-14217: app-reset-tool.html should not show --zookeeper flag that no longer exists (apache#12618)
  ...
guozhangwang pushed a commit to guozhangwang/kafka that referenced this pull request Jan 25, 2023
…test snapshot (apache#12655)

Disable segment deletion based on size and time by setting the KRaft metadata log's `RetentionMsProp` and `RetentionBytesProp` to `-1`. This will cause `UnifiedLog.deleteRetentionMsBreachedSegments` and `UnifiedLog.deleteRetentionSizeBreachedSegments` to short circuit instead of deleting segments.

Without this changes the included test would fail. This happens because `deleteRetentionMsBreachedSegments` is able to delete past the `logStartOffset`. Deleting past the `logStartOffset` would violate the invariant that if the `logStartOffset` is greater than 0 then there is a snapshot with an end offset greater than or equal to the log start offset.

Reviewers: Luke Chen <showuon@gmail.com>, Jason Gustafson <jason@confluent.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Kafka Broker

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants