Skip to content

KAFKA-8002: Replica reassignment to new log dir may not complete if f…#6346

Merged
hachikuji merged 3 commits intoapache:trunkfrom
bob-barrett:KAFKA-8002
Mar 2, 2019
Merged

KAFKA-8002: Replica reassignment to new log dir may not complete if f…#6346
hachikuji merged 3 commits intoapache:trunkfrom
bob-barrett:KAFKA-8002

Conversation

@bob-barrett
Copy link
Copy Markdown
Contributor

@bob-barrett bob-barrett commented Feb 28, 2019

…uture and current replicas segment files have different base offsets

This patch fixes a bug in log dir reassignment where Partition.maybeReplaceCurrentWithFutureReplica would compare the entire LogEndOffsetMetadata of each replica to determine whether the reassignment has completed. If the active segments of both replicas have different base segments (for example, if the current replica had previously been cleaned and the future replica rolled segments at different points), the reassignment will never complete. The fix is to compare only the LogEndOffsetMetadata.messageOffset for each replica. Tested with a unit test that simulates the compacted current replica case.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

…uture and current replicas segment files have different base offsets

This patch fixes a bug in log dir reassignment where Partition.maybeReplaceCurrentWithFutureReplica would compare the entire LogEndOffsetMetadata of each replica to determine whether the reassignment has completed. If the active segments of both replicas have different base segments (for example, if the current replica had previously been cleaned and the future replica rolled segments at different points), the reassignment will never complete. The fix is to compare only the LogEndOffsetMetadata.messageOffset for each replica. Tested with a unit test that simulates the compacted current replica case.
@bob-barrett
Copy link
Copy Markdown
Contributor Author

retest this please

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, looks good overall. Left a minor suggestion.

futureLocalReplica match {
case Some(futureReplica) =>
if (replica.logEndOffset == futureReplica.logEndOffset) {
if (replica.logEndOffset.messageOffset == futureReplica.logEndOffset.messageOffset) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing I was discussing with @apovzner is whether we should consider renaming logEndOffset. The bug probably would never have occurred if we had a better name. Maybe we should have two methods:

def logEndOffset: Long
def logEndOffsetMetadata: LogOffsetMetadata

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the suggestion of having logEndOffsetMetadata method. Although, need to be careful not to miss any calls that would need to be renamed to
logEndOffsetMetadata (and get to the same type of bug we found here). I guess one way to make sure this does not happen is to first change the name of logEndOffset to
logEndOffsetMetadata, make sure to compile, and only after that add
logEndOffset that returns Long.

}

@Test
// Verify that replacement works when the replicas have the same log end offset but different base offsets in the
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Move this comment above annotation?

@apovzner
Copy link
Copy Markdown
Contributor

Thanks for the PR! Looks good -- I don't have anything to add to Jason's comments.

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the fix!

@hachikuji hachikuji merged commit 57604c2 into apache:trunk Mar 2, 2019
hachikuji pushed a commit that referenced this pull request Mar 2, 2019
…nt segment base offset (#6346)

This patch fixes a bug in log dir reassignment where Partition.maybeReplaceCurrentWithFutureReplica would compare the entire LogEndOffsetMetadata of each replica to determine whether the reassignment has completed. If the active segments of both replicas have different base segments (for example, if the current replica had previously been cleaned and the future replica rolled segments at different points), the reassignment will never complete. The fix is to compare only the LogEndOffsetMetadata.messageOffset for each replica. Tested with a unit test that simulates the compacted current replica case.

Reviewers: Anna Povzner <anna@confluent.io>, Jason Gustafson <jason@confluent.io>
@bob-barrett bob-barrett deleted the KAFKA-8002 branch March 4, 2019 07:46
@bob-barrett bob-barrett restored the KAFKA-8002 branch March 4, 2019 07:47
@bob-barrett bob-barrett deleted the KAFKA-8002 branch March 4, 2019 07:56
@bob-barrett bob-barrett restored the KAFKA-8002 branch March 4, 2019 07:57
@bob-barrett bob-barrett deleted the KAFKA-8002 branch March 4, 2019 07:57
xiowu0 pushed a commit to linkedin/kafka that referenced this pull request Mar 12, 2019
… has different segment base offset (apache#6346)

This patch fixes a bug in log dir reassignment where Partition.maybeReplaceCurrentWithFutureReplica would compare the entire LogEndOffsetMetadata of each replica to determine whether the reassignment has completed. If the active segments of both replicas have different base segments (for example, if the current replica had previously been cleaned and the future replica rolled segments at different points), the reassignment will never complete. The fix is to compare only the LogEndOffsetMetadata.messageOffset for each replica. Tested with a unit test that simulates the compacted current replica case.
backported to 2.0-li
Reviewers: Anna Povzner <anna@confluent.io>, Jason Gustafson <jason@confluent.io>
jarekr pushed a commit to confluentinc/kafka that referenced this pull request Apr 18, 2019
* apache/trunk:
  KAFKA-7880:Naming worker thread by task id (apache#6275)
  improve some logging statements (apache#6078)
  KAFKA-7312: Change broker port used in testMinimumRequestTimeouts and testForceClose
  KAFKA-7997: Use automatic RPC generation in SaslAuthenticate
  KAFKA-8002; Log dir reassignment stalls if future replica has different segment base offset (apache#6346)
  KAFKA-3522: Add TimestampedKeyValueStore builder/runtime classes (apache#6152)
  HOTFIX: add igore import to streams_upgrade_test
  MINOR: ConsumerNetworkClient does not need to send the remaining requests when the node is not ready (apache#6264)
  KAFKA-7922: Return authorized operations in describe consumer group responses (KIP-430 Part-1)
  KAFKA-7918: Inline generic parameters Pt. III: in-memory window store (apache#6328)
  KAFKA-8012; Ensure partitionStates have not been removed before truncating. (apache#6333)
  KAFKA-8011: Fix for race condition causing concurrent modification exception (apache#6338)
  KAFKA-7912: Support concurrent access in InMemoryKeyValueStore (apache#6336)
  MINOR: Skip quota check when replica is in sync (apache#6344)
  HOTFIX: Change header back to http instead of https to path license header test (apache#6347)
  MINOR: fix release.py script (apache#6317)
  MINOR: Remove types from caching stores (apache#6331)
  MINOR: Improve logging for alter log dirs (apache#6302)
  MINOR: state.cleanup.delay.ms default is 600,000 ms (10 minutes). (apache#6345)
  MINOR: disable Streams system test for broker upgrade/downgrade (apache#6341)
pengxiaolong pushed a commit to pengxiaolong/kafka that referenced this pull request Jun 14, 2019
…nt segment base offset (apache#6346)

This patch fixes a bug in log dir reassignment where Partition.maybeReplaceCurrentWithFutureReplica would compare the entire LogEndOffsetMetadata of each replica to determine whether the reassignment has completed. If the active segments of both replicas have different base segments (for example, if the current replica had previously been cleaned and the future replica rolled segments at different points), the reassignment will never complete. The fix is to compare only the LogEndOffsetMetadata.messageOffset for each replica. Tested with a unit test that simulates the compacted current replica case.

Reviewers: Anna Povzner <anna@confluent.io>, Jason Gustafson <jason@confluent.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants