Skip to content

KAFKA-14140: Ensure an offline or in-controlled-shutdown replica is not eligible to join ISR in ZK mode#12487

Merged
dajac merged 5 commits intoapache:trunkfrom
jolshan:kafka-14140
Aug 10, 2022
Merged

KAFKA-14140: Ensure an offline or in-controlled-shutdown replica is not eligible to join ISR in ZK mode#12487
dajac merged 5 commits intoapache:trunkfrom
jolshan:kafka-14140

Conversation

@jolshan
Copy link
Copy Markdown
Member

@jolshan jolshan commented Aug 5, 2022

As part of KIP-841 we prevent shutting down brokers from being added to ISR (and therefore ineligible to become leader).

We want to do the same in 3.3 for ZK to protect against edge cases and not have to do a version bump in future versions.

See this PR to see the equivalent change for KRaft mode: b6cb295

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

Comment thread core/src/test/scala/unit/kafka/controller/ControllerIntegrationTest.scala Outdated
Comment thread core/src/main/scala/kafka/controller/KafkaController.scala Outdated
@dajac
Copy link
Copy Markdown
Member

dajac commented Aug 6, 2022

Thanks for the PR. The title of the PR is misleading here. There is no « fenced » state in ZK and the patch only prevent offline replicas to not join the ISR, not the shutting down ones. Is it intentional?

I can review it next week.

Copy link
Copy Markdown
Contributor

@junrao junrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jolshan : Thanks for the PR. Looks good overall. Great test! Just a few minor comments.

metadataCache match {
// In KRaft mode, only replicas which are not fenced nor in controlled shutdown are
// allowed to join the ISR. This does not apply to ZK mode.
// allowed to join the ISR. In ZK mode, we just ensure the broker is alive and not shutting down.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In ControllerChannelManager.sendUpdateMetadataRequests(), it seems that we include shutting down broker in liveBrokers. So, we won't know whether a remote broker is shutting down.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah -- this was something I was looking into. I'm not sure if there is a way to also exclude shutting down brokers here. Is that also included in the metadata? I can take a look as well.

Copy link
Copy Markdown
Contributor

@splett2 splett2 Aug 8, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's possible with ZK controller on the leader-side, but having the controller-side check is probably sufficient.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This basically means that the leader will retry adding back the shutting-down broker to the ISR until the shutting-down broker is removed from the metadata cache. It is worth noting that, during this time, other replicas cannot be added back to the ISR. The controller rejects any ISR expansion containing at least one ineligible replica. This is why we added that in-controller-shutdown state in KRaft. It allows the leader to filter them out as soon.

This may be acceptable here. Otherwise, we would have to propagate the shutting-down brokers via the UpdateMetadataRequest. What do others think?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok makes sense. Should I change the comment to reflect that this will not block shutting down brokers here, but will be blocked controller side?

I think for at least this PR (which we want to get into 3.3) we should hold off on protocol changes.

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji Aug 9, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think our main defense is on the follower side. We prevent the LeaderAndIsr from starting up the fetcher if ReplicaManager is shutting down. Seems ok if this check on the leader side is imperfect.

nit: can we move the comments down to the respective cases?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. We can say that we ensure that the replica is online but that it could be in controller shutdown.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah my comment is slightly different in the latest commit. Let me know if I should change it

Comment thread core/src/test/scala/unit/kafka/controller/ControllerIntegrationTest.scala Outdated
}

@Test
def testShutdownBrokerNotAddedToIsr(): Unit = {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we can just test this case in one of the variants in testAlterPartitionErrors

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps? Do you think the current way isn't readable?

@jolshan
Copy link
Copy Markdown
Member Author

jolshan commented Aug 8, 2022

@dajac I can remove the reference to fenced. As for shutting down brokers -- they are prevented in the kafka controller code(otherwise I'd use liveOrShuttingDownBrokerIds), but not the metadata cache code as I understand.

I can try to modify the metadata cache if possible.

@jolshan jolshan changed the title KAFKA-14140: Ensure a fenced or in-controlled-shutdown replica is not eligible to join ISR in ZK mode KAFKA-14140: Ensure an offline or in-controlled-shutdown replica is not eligible to join ISR in ZK mode Aug 8, 2022
Copy link
Copy Markdown
Member

@dajac dajac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jolshan Thanks for the PR. I left a few comments for consideration.

metadataCache match {
// In KRaft mode, only replicas which are not fenced nor in controlled shutdown are
// allowed to join the ISR. This does not apply to ZK mode.
// allowed to join the ISR. In ZK mode, we just ensure the broker is alive and not shutting down.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This basically means that the leader will retry adding back the shutting-down broker to the ISR until the shutting-down broker is removed from the metadata cache. It is worth noting that, during this time, other replicas cannot be added back to the ISR. The controller rejects any ISR expansion containing at least one ineligible replica. This is why we added that in-controller-shutdown state in KRaft. It allows the leader to filter them out as soon.

This may be acceptable here. Otherwise, we would have to propagate the shutting-down brokers via the UpdateMetadataRequest. What do others think?

Comment thread core/src/main/scala/kafka/cluster/Partition.scala
Comment thread core/src/main/scala/kafka/controller/KafkaController.scala Outdated
Comment thread core/src/test/scala/unit/kafka/cluster/PartitionTest.scala Outdated
Comment thread core/src/test/scala/unit/kafka/controller/ControllerIntegrationTest.scala Outdated
Comment thread core/src/test/scala/unit/kafka/controller/ControllerIntegrationTest.scala Outdated
Comment thread core/src/test/scala/unit/kafka/controller/ControllerIntegrationTest.scala Outdated
@dajac
Copy link
Copy Markdown
Member

dajac commented Aug 9, 2022

@jolshan Could you also rebase the PR? There are some conflicts.

Comment thread core/src/main/scala/kafka/cluster/Partition.scala Outdated
Copy link
Copy Markdown
Contributor

@junrao junrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jolshan : Thanks for the updated PR. LGTM

Copy link
Copy Markdown
Member

@dajac dajac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for the PR, @jolshan!

@dajac dajac merged commit 163d00b into apache:trunk Aug 10, 2022
dajac pushed a commit that referenced this pull request Aug 10, 2022
…ot eligible to join ISR in ZK mode (#12487)

This patch prevents offline or in-controller-shutdown replicas from being added back to the ISR and therefore to become leaders in ZK mode. This is an extra line of defense to ensure that it never happens. This is a continuation of the work done in KIP-841.

Reviewers: David Mao <dmao@confluent.io>, Jason Gustafson <jason@confluent.io>, Jun Rao <jun@confluent.io>, David Jacot <djacot@confluent.io>
@dajac
Copy link
Copy Markdown
Member

dajac commented Aug 10, 2022

Merged to trunk and 3.3.

ijuma added a commit to confluentinc/kafka that referenced this pull request Aug 10, 2022
…(10 August 2022)

Trivial conflict in gradle/dependencies.gradle due to the newer Netty
version in confluentinc/kafka.

* apache-github/trunk:
MINOR: Upgrade gradle to 7.5.1 and bump other build/test dependencies
(apache#12495)
KAFKA-14140: Ensure an offline or in-controlled-shutdown replica is
not eligible to join ISR in ZK mode (apache#12487)
  KAFKA-14114: Add Metadata Error Related Metrics
MINOR: BrokerMetadataSnapshotter must avoid exceeding batch size
(apache#12486)
  MINOR: Upgrade mockito test dependencies (apache#12460)
KAFKA-14144:; Compare AlterPartition LeaderAndIsr before fencing
partition epoch (apache#12489)
KAFKA-14134: Replace EasyMock with Mockito for WorkerConnectorTest
(apache#12472)
  MINOR: Update scala version in bin scripts to 2.13.8 (apache#12477)
KAFKA-14104; Add CRC validation when iterating over Metadata Log
Records (apache#12457)
  MINOR: add :server-common test dependency to :storage (apache#12488)
  KAFKA-14107: Upgrade Jetty version for CVE fixes (apache#12440)
  KAFKA-14124: improve quorum controller fault handling (apache#12447)
ijuma added a commit to franz1981/kafka that referenced this pull request Aug 12, 2022
* apache-github/trunk: (447 commits)
  KAFKA-13959: Controller should unfence Broker with busy metadata log (apache#12274)
  KAFKA-10199: Expose read only task from state updater (apache#12497)
  KAFKA-14154; Return NOT_CONTROLLER from AlterPartition if leader is ahead of controller (apache#12506)
  KAFKA-13986; Brokers should include node.id in fetches to metadata quorum (apache#12498)
  KAFKA-14163; Retry compilation after zinc compile cache error (apache#12507)
  Remove duplicate common.message.* from clients:test jar file (apache#12407)
  KAFKA-13060: Replace EasyMock and PowerMock with Mockito in WorkerGroupMemberTest.java (apache#12484)
  Fix the rate window size calculation for edge cases (apache#12184)
  MINOR: Upgrade gradle to 7.5.1 and bump other build/test dependencies (apache#12495)
  KAFKA-14140: Ensure an offline or in-controlled-shutdown replica is not eligible to join ISR in ZK mode (apache#12487)
  KAFKA-14114: Add Metadata Error Related Metrics
  MINOR: BrokerMetadataSnapshotter must avoid exceeding batch size (apache#12486)
  MINOR: Upgrade mockito test dependencies (apache#12460)
  KAFKA-14144:; Compare AlterPartition LeaderAndIsr before fencing partition epoch (apache#12489)
  KAFKA-14134: Replace EasyMock with Mockito for WorkerConnectorTest (apache#12472)
  MINOR: Update scala version in bin scripts to 2.13.8 (apache#12477)
  KAFKA-14104; Add CRC validation when iterating over Metadata Log Records (apache#12457)
  MINOR: add :server-common test dependency to :storage (apache#12488)
  KAFKA-14107: Upgrade Jetty version for CVE fixes (apache#12440)
  KAFKA-14124: improve quorum controller fault handling (apache#12447)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants