Skip to content

KAFKA-13888: Addition of Information in DescribeQuorumResponse#12508

Merged
hachikuji merged 17 commits intoapache:trunkfrom
niket-goel:kafka-13888-describe-quorum-fields
Aug 19, 2022
Merged

KAFKA-13888: Addition of Information in DescribeQuorumResponse#12508
hachikuji merged 17 commits intoapache:trunkfrom
niket-goel:kafka-13888-describe-quorum-fields

Conversation

@niket-goel
Copy link
Copy Markdown
Contributor

@niket-goel niket-goel commented Aug 12, 2022

This commit adds in the implementation for two new fields:

  • LastFetchTimestamp
  • LastCaughtUpTimestamp

as they are described in KIP-836

This commit adds in the implementation for two new fields:
* LastFetchTimestamp
* LastCaughtUpTimestamp
as they are described in KIP-836
convertToReplicaStates(leaderState.getVoterEndOffsets()),
convertToReplicaStates(leaderState.getObserverStates(currentTimeMs))
convertToReplicaStates(leaderState.getVoterEndOffsets(),
leaderState.getVoterLastFetchTimes(),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be a bit simpler to push creation of the ReplicaState object into LeaderState:

class LeaderState {
  ...

  List<ReplicaState> voterStates(long currentTimeMs);
  List<ReplicaState> observerStates(long currentTimeMs);
}

Then we wouldn't need all the boilerplate methods to create all the different Map variations. Also, then the sanity checks in convertToReplicaStates become unnecessary.

Comment thread raft/src/main/java/org/apache/kafka/raft/LeaderState.java Outdated
state.updateFetchTimestamp(fetchTimestamp);
// Update the Last CaughtUp Time
if (logOffsetMetadata.offset >= leaderLogEndOffset) {
state.updateLastCaughtUpTimestamp(fetchTimestamp);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do these updates before the call to updateEndOffset, which may raise an exception (same thing for updateFetchTimestamp which is an existing issue). I think it would be better to update state only after we have confirmed the update is valid.

Comment thread core/src/test/scala/integration/kafka/server/KRaftClusterTest.scala Outdated
val quorumInfo = quorumState.quorumInfo.get()

assertEquals(0, quorumInfo.observers.size)
assertEquals(3, quorumInfo.voters.size)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more straightforward way to assert the voter/observer sets is in DescribeQuorumRequestTest:

      val voterData = partitionData.currentVoters.asScala
      assertEquals(cluster.controllerIds().asScala, voterData.map(_.replicaId).toSet);

      val observerData = partitionData.observers.asScala
      assertEquals(cluster.brokerIds().asScala, observerData.map(_.replicaId).toSet);

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That check already exists in the DescribeQuorumRequestTest. The count validation here is just a sanity check.

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji Aug 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are checking the size here and all of the individual replicaIds below. For example:

assertTrue(-1 < observer.replicaId && 4 > observer.replicaId,
            s"Observer ID ${observer.replicaId} was not within expected range.")

I am just offering a more concise and complete way to do that.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may have missed my comment above. The nice thing about asserting the set directly is that it ensures that all expected voters/observers are present in the result.

* Some code refactoring
* Fixing error handling when updating LeaderState::ReplicaState
@niket-goel niket-goel marked this pull request as ready for review August 16, 2022 18:12
Comment thread raft/src/main/java/org/apache/kafka/raft/LeaderState.java Outdated
Comment thread raft/src/main/java/org/apache/kafka/raft/LeaderState.java Outdated
Comment thread raft/src/main/java/org/apache/kafka/raft/LeaderState.java Outdated
Comment thread raft/src/main/java/org/apache/kafka/raft/LeaderState.java Outdated
val quorumInfo = quorumState.quorumInfo.get()

assertEquals(0, quorumInfo.observers.size)
assertEquals(3, quorumInfo.voters.size)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may have missed my comment above. The nice thing about asserting the set directly is that it ensures that all expected voters/observers are present in the result.

Comment thread raft/src/main/java/org/apache/kafka/raft/LeaderState.java Outdated
Comment thread raft/src/main/java/org/apache/kafka/raft/LeaderState.java Outdated
assertEquals(4, quorumInfo.observers.size)
assertEquals(3, quorumInfo.voters.size)
assertEquals(cluster.controllers.asScala.keySet, quorumInfo.voters.asScala.map(_.replicaId).toSet)
assertTrue(2999 < quorumInfo.leaderId && 3003 > quorumInfo.leaderId,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make this assertion a little more generic by asserting that quorumInfo.leaderId is contained in cluster.controllers.asScala.keySet.

private boolean updateEndOffset(
ReplicaState state,
LogOffsetMetadata endOffsetMetadata,
boolean verifyUpdate
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this? Seems like false is the only value we ever pass.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wanted to retain the verification step in the method in case we want to use it elsewhere in the future.

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, just a few more small comments.

Comment thread raft/src/test/java/org/apache/kafka/raft/LeaderStateTest.java Outdated
Comment thread raft/src/test/java/org/apache/kafka/raft/LeaderStateTest.java Outdated
Comment thread raft/src/test/java/org/apache/kafka/raft/LeaderStateTest.java Outdated
Comment thread raft/src/test/java/org/apache/kafka/raft/LeaderStateTest.java Outdated
Comment thread raft/src/main/java/org/apache/kafka/raft/LeaderState.java Outdated
Comment thread raft/src/main/java/org/apache/kafka/raft/LeaderState.java Outdated
@niket-goel
Copy link
Copy Markdown
Contributor Author

Thanks for the comments @hachikuji . Have addressed them and pushed another version. :)

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just one more nit to fix.

assertNotEquals(OptionalLong.empty(), observer.lastCaughtUpTimeMs)
}
} catch {
case t: Throwable => throw t
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: if we're just re-throwing, we don't need to catch

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, LGTM. We'll have to be a bit careful since #12469 is also ready to merge. We'll see whoever gets the clean build first.

@niket-goel niket-goel force-pushed the kafka-13888-describe-quorum-fields branch from 38a0af7 to d1936ab Compare August 19, 2022 18:48
Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, the updates LGTM as well. Hopefully we've nailed the cause of the test flakiness.

@hachikuji hachikuji merged commit c7f0519 into apache:trunk Aug 19, 2022
hachikuji pushed a commit that referenced this pull request Aug 19, 2022
…tamp` for DescribeQuorumResponse [KIP-836] (#12508)

This commit implements the newly added fields `LastFetchTimestamp` and `LastCaughtUpTimestamp` for KIP-836: https://cwiki.apache.org/confluence/display/KAFKA/KIP-836:+Addition+of+Information+in+DescribeQuorumResponse+about+Voter+Lag.

Reviewers: Jason Gustafson <jason@confluent.io>
@niket-goel niket-goel mentioned this pull request Aug 31, 2022
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants