KAFKA-14140: Ensure an offline or in-controlled-shutdown replica is not eligible to join ISR in ZK mode by jolshan · Pull Request #12487 · apache/kafka

jolshan · 2022-08-05T22:15:49Z

As part of KIP-841 we prevent shutting-down brokers from being added to the ISR (which also makes them ineligible to become leader).

We want to do the same in 3.3 for ZK to protect against edge cases and not have to do a version bump in future versions.

See b6cb295 for the equivalent change in KRaft mode.

Committer Checklist (excluded from commit message)

Verify design and implementation
Verify test coverage and CI build status
Verify documentation (including upgrade notes)

dajac · 2022-08-06T08:15:28Z

Thanks for the PR. The title of the PR is misleading here. There is no "fenced" state in ZK, and the patch only prevents offline replicas from joining the ISR, not the shutting-down ones. Is that intentional?

I can review it next week.

junrao

@jolshan : Thanks for the PR. Looks good overall. Great test! Just a few minor comments.

junrao · 2022-08-06T16:20:09Z

    metadataCache match {
      // In KRaft mode, only replicas which are not fenced nor in controlled shutdown are
-      // allowed to join the ISR. This does not apply to ZK mode.
+      // allowed to join the ISR. In ZK mode, we just ensure the broker is alive and not shutting down.


In ControllerChannelManager.sendUpdateMetadataRequests(), it seems that we include shutting-down brokers in liveBrokers. So, we won't know whether a remote broker is shutting down.

Yeah -- this was something I was looking into. I'm not sure if there is a way to also exclude shutting down brokers here. Is that also included in the metadata? I can take a look as well.

I don't think it's possible with ZK controller on the leader-side, but having the controller-side check is probably sufficient.
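To make the leader-side limitation concrete, here is a minimal sketch of the eligibility check as discussed in this thread. It is a simplified model, not the code in Partition.scala: `ZkView`, `KRaftView`, and `isEligible` are hypothetical names, and the only assumption taken from the discussion is that the ZK metadata cache sees shutting-down brokers as live.

```scala
// Simplified model of the leader-side check. All names here are illustrative
// and are not Kafka's real classes.
object IsrEligibilitySketch {
  sealed trait CacheView

  // ZK mode: the cache only tracks live brokers, and that set still
  // contains brokers that are in controlled shutdown.
  final case class ZkView(liveBrokers: Set[Int]) extends CacheView

  // KRaft mode: the cache also knows which brokers are fenced or shutting down.
  final case class KRaftView(
    registered: Set[Int],
    fenced: Set[Int],
    inControlledShutdown: Set[Int]
  ) extends CacheView

  def isEligible(view: CacheView, brokerId: Int): Boolean = view match {
    case KRaftView(registered, fenced, shuttingDown) =>
      registered.contains(brokerId) &&
        !fenced.contains(brokerId) &&
        !shuttingDown.contains(brokerId)
    case ZkView(live) =>
      // Imperfect by design: a shutting-down broker still passes this check.
      live.contains(brokerId)
  }
}
```

In this model, `isEligible(ZkView(Set(1, 2)), 2)` is true even if broker 2 has started a controlled shutdown, which is exactly the gap the controller-side check has to cover.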

This basically means that the leader will keep retrying to add the shutting-down broker back to the ISR until that broker is removed from the metadata cache. It is worth noting that, during this time, other replicas cannot be added back to the ISR either, because the controller rejects any ISR expansion containing at least one ineligible replica. This is why we added the in-controlled-shutdown state in KRaft: it allows the leader to filter such replicas out as soon as possible.

This may be acceptable here. Otherwise, we would have to propagate the shutting-down brokers via the UpdateMetadataRequest. What do others think?
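The all-or-nothing rejection described above can be sketched as follows. This is a hedged illustration only: `validateIsr`, `Accepted`, and `IneligibleReplica` are hypothetical names, and the real controller logic lives in KafkaController.scala and may differ in detail.

```scala
// Simplified model of the controller-side check. Names are illustrative
// and are not the real KafkaController API.
object ControllerCheckSketch {
  sealed trait Result
  case object Accepted extends Result
  final case class IneligibleReplica(brokers: Set[Int]) extends Result

  // liveOrShuttingDown: every broker currently registered with the controller.
  // shuttingDown: the subset that has requested a controlled shutdown.
  def validateIsr(
    proposedIsr: Set[Int],
    liveOrShuttingDown: Set[Int],
    shuttingDown: Set[Int]
  ): Result = {
    val eligible = liveOrShuttingDown -- shuttingDown
    val ineligible = proposedIsr -- eligible
    // A single ineligible replica rejects the whole proposal.
    if (ineligible.isEmpty) Accepted else IneligibleReplica(ineligible)
  }
}
```

With brokers 1, 2, and 3 registered and broker 3 shutting down, a proposed ISR of {1, 2, 3} is rejected as a whole, so broker 2 is not added either. A ZK-mode leader cannot narrow the proposal to {1, 2} on its own, because its metadata cache does not tell it that broker 3 is shutting down.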

Ok makes sense. Should I change the comment to reflect that this will not block shutting down brokers here, but will be blocked controller side?

I think for at least this PR (which we want to get into 3.3) we should hold off on protocol changes.

I think our main defense is on the follower side. We prevent the LeaderAndIsr from starting up the fetcher if ReplicaManager is shutting down. Seems ok if this check on the leader side is imperfect.
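The follower-side defense mentioned here can be modeled roughly like this. It is a sketch under stated assumptions, not the real ReplicaManager: `ReplicaManagerModel`, `beginShutdown`, and `makeFollower` are hypothetical names, and partitions are represented as plain strings.

```scala
import java.util.concurrent.atomic.AtomicBoolean
import scala.collection.mutable

// Simplified model of the follower-side guard. Names are illustrative
// and are not the real ReplicaManager API.
object FollowerGuardSketch {
  final class ReplicaManagerModel {
    private val shuttingDown = new AtomicBoolean(false)
    private val fetchers = mutable.Set.empty[String]

    def beginShutdown(): Unit = shuttingDown.set(true)

    // Called when a LeaderAndIsr request makes this broker a follower.
    // Returns true if a fetcher was started for the partition.
    def makeFollower(topicPartition: String): Boolean =
      if (shuttingDown.get) false
      else { fetchers += topicPartition; true }

    def activeFetchers: Set[String] = fetchers.toSet
  }
}
```

A broker that never starts a fetcher cannot catch up to the leader, so the leader has no reason to propose it for the ISR. That is why an imperfect leader-side check is tolerable in this design.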

nit: can we move the comments down to the respective cases?

That makes sense. We can say that we ensure the replica is online, but that it could be in controlled shutdown.

Ah my comment is slightly different in the latest commit. Let me know if I should change it

splett2 · 2022-08-08T06:36:26Z

  }

+  @Test
+  def testShutdownBrokerNotAddedToIsr(): Unit = {


I wonder if we can just test this case in one of the variants in testAlterPartitionErrors

Perhaps? Do you think the current way isn't readable?

jolshan · 2022-08-08T16:02:22Z

@dajac I can remove the reference to fenced. As for shutting-down brokers -- they are prevented in the Kafka controller code (otherwise I'd use liveOrShuttingDownBrokerIds), but not in the metadata cache code, as I understand it.

I can try to modify the metadata cache if possible.

dajac

@jolshan Thanks for the PR. I left a few comments for consideration.

dajac · 2022-08-09T08:46:03Z

    metadataCache match {
      // In KRaft mode, only replicas which are not fenced nor in controlled shutdown are
-      // allowed to join the ISR. This does not apply to ZK mode.
+      // allowed to join the ISR. In ZK mode, we just ensure the broker is alive and not shutting down.


This basically means that the leader will keep retrying to add the shutting-down broker back to the ISR until that broker is removed from the metadata cache. It is worth noting that, during this time, other replicas cannot be added back to the ISR either, because the controller rejects any ISR expansion containing at least one ineligible replica. This is why we added the in-controlled-shutdown state in KRaft: it allows the leader to filter such replicas out as soon as possible.

This may be acceptable here. Otherwise, we would have to propagate the shutting-down brokers via the UpdateMetadataRequest. What do others think?

dajac · 2022-08-09T17:12:02Z

@jolshan Could you also rebase the PR? There are some conflicts.

junrao

@jolshan : Thanks for the updated PR. LGTM

jolshan · 2022-08-09T22:40:01Z

Failed tests passed locally:

Build / JDK 8 and Scala 2.12 / kafka.admin.DeleteOffsetsConsumerGroupCommandIntegrationTest.testDeleteOffsetsNonExistingGroup()
Build / JDK 8 and Scala 2.12 / kafka.server.KRaftClusterTest.testCreateClusterAndPerformReassignment()
Build / JDK 11 and Scala 2.13 / kafka.log.LogCleanerIntegrationTest.testMarksPartitionsAsOfflineAndPopulatesUncleanableMetrics()

This test also failed, but I confirmed that it fails on trunk as well:
Build / JDK 17 and Scala 2.13 / org.apache.kafka.connect.integration.ConnectorRestartApiIntegrationTest.testMultiWorkerRestartOnlyConnector

dajac

LGTM. Thanks for the PR, @jolshan!

…ot eligible to join ISR in ZK mode (#12487)

This patch prevents offline or in-controller-shutdown replicas from being added back to the ISR and therefore to become leaders in ZK mode. This is an extra line of defense to ensure that it never happens. This is a continuation of the work done in KIP-841.

Reviewers: David Mao <dmao@confluent.io>, Jason Gustafson <jason@confluent.io>, Jun Rao <jun@confluent.io>, David Jacot <djacot@confluent.io>

dajac · 2022-08-10T08:33:22Z

Merged to trunk and 3.3.

Prevent adding offline broker to isr

b673ddf

splett2 reviewed Aug 6, 2022

Comment thread core/src/test/scala/unit/kafka/controller/ControllerIntegrationTest.scala Outdated

splett2 reviewed Aug 6, 2022

Comment thread core/src/main/scala/kafka/controller/KafkaController.scala Outdated

junrao reviewed Aug 6, 2022

splett2 reviewed Aug 8, 2022

jolshan changed the title ~~KAFKA-14140: Ensure a fenced or in-controlled-shutdown replica is not eligible to join ISR in ZK mode~~ KAFKA-14140: Ensure an offline or in-controlled-shutdown replica is not eligible to join ISR in ZK mode Aug 8, 2022

Make code neater and clean up test

fe67940

dajac reviewed Aug 9, 2022

Fixes from review

39e512f

jolshan force-pushed the kafka-14140 branch from 06bd5e0 to 39e512f Compare August 9, 2022 17:20

hachikuji reviewed Aug 9, 2022

Comment thread core/src/main/scala/kafka/cluster/Partition.scala Outdated

jolshan added 2 commits August 9, 2022 10:55

Merge branch 'trunk' of github.com:apache/kafka into kafka-14140

f9ed1f2

fix

888e659

junrao approved these changes Aug 9, 2022

dajac approved these changes Aug 10, 2022

dajac merged commit 163d00b into apache:trunk Aug 10, 2022
