KAFKA-14140: Ensure an offline or in-controlled-shutdown replica is not eligible to join ISR in ZK mode #12487
Thanks for the PR. The title of the PR is misleading here. There is no "fenced" state in ZK, and the patch only prevents offline replicas from joining the ISR, not the shutting-down ones. Is that intentional? I can review it next week.
  metadataCache match {
    // In KRaft mode, only replicas which are not fenced nor in controlled shutdown are
-   // allowed to join the ISR. This does not apply to ZK mode.
+   // allowed to join the ISR. In ZK mode, we just ensure the broker is alive and not shutting down.
In ControllerChannelManager.sendUpdateMetadataRequests(), it seems that we include shutting-down brokers in liveBrokers. So the leader won't know whether a remote broker is shutting down.
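To make the gap described above concrete, here is a minimal sketch of a leader-side eligibility check in both modes. The types `MetadataView`, `KRaftView`, `ZkView` and the object `IsrEligibility` are hypothetical stand-ins invented for illustration, not Kafka's actual metadata cache classes; the sketch only assumes what the comments in this thread state, namely that a ZK-mode leader sees shutting-down brokers as alive.

```scala
// Hypothetical, simplified stand-ins for the metadata caches; not Kafka's real classes.
sealed trait MetadataView
final case class KRaftView(fenced: Set[Int], shuttingDown: Set[Int]) extends MetadataView
final case class ZkView(aliveBrokers: Set[Int]) extends MetadataView

object IsrEligibility {
  // Leader-side check: may `replicaId` be proposed for the ISR?
  def isEligible(view: MetadataView, replicaId: Int): Boolean = view match {
    // KRaft: the leader can see both the fenced and the controlled-shutdown state.
    case KRaftView(fenced, shuttingDown) =>
      !fenced.contains(replicaId) && !shuttingDown.contains(replicaId)
    // ZK: the leader only knows which brokers are alive. A shutting-down broker
    // is still listed as alive, so it passes this check.
    case ZkView(alive) =>
      alive.contains(replicaId)
  }
}
```

Under these assumptions, `IsrEligibility.isEligible(ZkView(Set(1, 2)), 2)` is true even if broker 2 is in controlled shutdown, which is why the leader-side check alone is imperfect in ZK mode.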
Yeah -- this was something I was looking into. I'm not sure if there is a way to also exclude shutting down brokers here. Is that also included in the metadata? I can take a look as well.
I don't think it's possible with the ZK controller on the leader side, but having the controller-side check is probably sufficient.
This basically means that the leader will keep retrying to add the shutting-down broker back to the ISR until that broker is removed from the metadata cache. It is worth noting that, during this time, other replicas cannot be added back to the ISR either, because the controller rejects any ISR expansion containing at least one ineligible replica. This is why we added the in-controlled-shutdown state in KRaft. It allows the leader to filter such replicas out as soon as possible.
This may be acceptable here. Otherwise, we would have to propagate the shutting-down brokers via the UpdateMetadataRequest. What do others think?
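The all-or-nothing rejection described above can be sketched as follows. This is an illustration under stated assumptions, not the controller's actual code: `IsrExpansionCheck`, `validate`, and its parameters are hypothetical names, and the sketch assumes the controller tracks the set of shutting-down brokers separately from the set of registered brokers.

```scala
object IsrExpansionCheck {
  // Hypothetical controller-side validation: returns Left(ineligible replicas)
  // when the whole proposal is rejected, Right(proposed ISR) when it is accepted.
  def validate(
    proposedIsr: Set[Int],
    liveOrShuttingDown: Set[Int],
    shuttingDown: Set[Int]
  ): Either[Set[Int], Set[Int]] = {
    // Only brokers that are registered and not shutting down are eligible.
    val eligible = liveOrShuttingDown -- shuttingDown
    val ineligible = proposedIsr -- eligible
    // A single ineligible replica is enough to reject the entire expansion.
    if (ineligible.nonEmpty) Left(ineligible) else Right(proposedIsr)
  }
}
```

For example, with brokers 1, 2 and 3 registered and broker 3 shutting down, a proposal of `Set(1, 2, 3)` is rejected as a whole, so replica 2 is not added either. A proposal of `Set(1, 2)` would be accepted, but a ZK-mode leader cannot tell that it should leave broker 3 out.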
OK, makes sense. Should I change the comment to say that shutting-down brokers are not blocked here, but will be blocked on the controller side?
I think for at least this PR (which we want to get into 3.3) we should hold off on protocol changes.
I think our main defense is on the follower side. We prevent the LeaderAndIsr from starting up the fetcher if ReplicaManager is shutting down. Seems ok if this check on the leader side is imperfect.
nit: can we move the comments down to the respective cases?
That makes sense. We can say that we ensure that the replica is online but that it could be in controlled shutdown.
Ah, my comment is slightly different in the latest commit. Let me know if I should change it.
  }

  @Test
  def testShutdownBrokerNotAddedToIsr(): Unit = {
I wonder if we can just test this case in one of the variants in testAlterPartitionErrors
Perhaps? Do you think the current way isn't readable?
@dajac I can remove the reference to fenced. As for shutting-down brokers -- they are rejected in the Kafka controller code (otherwise I'd use liveOrShuttingDownBrokerIds), but not in the metadata cache code, as I understand it. I can try to modify the metadata cache if possible.
@jolshan Could you also rebase the PR? There are some conflicts.
…ot eligible to join ISR in ZK mode (#12487)

This patch prevents offline or in-controlled-shutdown replicas from being added back to the ISR, and therefore from becoming leaders, in ZK mode. This is an extra line of defense to ensure that it never happens. This is a continuation of the work done in KIP-841.

Reviewers: David Mao <dmao@confluent.io>, Jason Gustafson <jason@confluent.io>, Jun Rao <jun@confluent.io>, David Jacot <djacot@confluent.io>
Merged to trunk and 3.3.
…(10 August 2022)

Trivial conflict in gradle/dependencies.gradle due to the newer Netty version in confluentinc/kafka.

* apache-github/trunk:
  MINOR: Upgrade gradle to 7.5.1 and bump other build/test dependencies (apache#12495)
  KAFKA-14140: Ensure an offline or in-controlled-shutdown replica is not eligible to join ISR in ZK mode (apache#12487)
  KAFKA-14114: Add Metadata Error Related Metrics
  MINOR: BrokerMetadataSnapshotter must avoid exceeding batch size (apache#12486)
  MINOR: Upgrade mockito test dependencies (apache#12460)
  KAFKA-14144:; Compare AlterPartition LeaderAndIsr before fencing partition epoch (apache#12489)
  KAFKA-14134: Replace EasyMock with Mockito for WorkerConnectorTest (apache#12472)
  MINOR: Update scala version in bin scripts to 2.13.8 (apache#12477)
  KAFKA-14104; Add CRC validation when iterating over Metadata Log Records (apache#12457)
  MINOR: add :server-common test dependency to :storage (apache#12488)
  KAFKA-14107: Upgrade Jetty version for CVE fixes (apache#12440)
  KAFKA-14124: improve quorum controller fault handling (apache#12447)
* apache-github/trunk: (447 commits)
  KAFKA-13959: Controller should unfence Broker with busy metadata log (apache#12274)
  KAFKA-10199: Expose read only task from state updater (apache#12497)
  KAFKA-14154; Return NOT_CONTROLLER from AlterPartition if leader is ahead of controller (apache#12506)
  KAFKA-13986; Brokers should include node.id in fetches to metadata quorum (apache#12498)
  KAFKA-14163; Retry compilation after zinc compile cache error (apache#12507)
  Remove duplicate common.message.* from clients:test jar file (apache#12407)
  KAFKA-13060: Replace EasyMock and PowerMock with Mockito in WorkerGroupMemberTest.java (apache#12484)
  Fix the rate window size calculation for edge cases (apache#12184)
  MINOR: Upgrade gradle to 7.5.1 and bump other build/test dependencies (apache#12495)
  KAFKA-14140: Ensure an offline or in-controlled-shutdown replica is not eligible to join ISR in ZK mode (apache#12487)
  KAFKA-14114: Add Metadata Error Related Metrics
  MINOR: BrokerMetadataSnapshotter must avoid exceeding batch size (apache#12486)
  MINOR: Upgrade mockito test dependencies (apache#12460)
  KAFKA-14144:; Compare AlterPartition LeaderAndIsr before fencing partition epoch (apache#12489)
  KAFKA-14134: Replace EasyMock with Mockito for WorkerConnectorTest (apache#12472)
  MINOR: Update scala version in bin scripts to 2.13.8 (apache#12477)
  KAFKA-14104; Add CRC validation when iterating over Metadata Log Records (apache#12457)
  MINOR: add :server-common test dependency to :storage (apache#12488)
  KAFKA-14107: Upgrade Jetty version for CVE fixes (apache#12440)
  KAFKA-14124: improve quorum controller fault handling (apache#12447)
  ...
As part of KIP-841, we prevent shutting-down brokers from being added to the ISR (which also makes them ineligible to become leader).
We want to do the same for ZK in 3.3, both to protect against edge cases and to avoid a version bump in future versions.
See this PR for the equivalent change for KRaft mode: b6cb295