KAFKA-15585: Add DescribeTopics API server side support #14612

Merged: mumrah merged 30 commits into apache:trunk from CalvinLiu7947:ELR-ak-Describe-topics-api on Jan 24, 2024

@CalvinLiu7947 (Contributor) commented Oct 23, 2023

Introduce the DescribeTopics API and the server-side handling code.
https://issues.apache.org/jira/browse/KAFKA-15585

Comment thread core/src/main/scala/kafka/server/KafkaApis.scala Outdated
Comment thread clients/src/main/java/org/apache/kafka/common/protocol/ApiKeys.java Outdated
Comment thread clients/src/main/java/org/apache/kafka/common/protocol/Errors.java Outdated
Comment thread clients/src/main/java/org/apache/kafka/common/requests/DescribeTopicsRequest.java Outdated
Comment thread clients/src/main/resources/common/message/DescribeTopicsResponse.json Outdated
Comment thread core/src/main/scala/kafka/server/KafkaApis.scala Outdated
Comment thread clients/src/main/resources/common/message/DescribeTopicsRequest.json Outdated
Comment thread core/src/main/scala/kafka/server/KafkaApis.scala Outdated
@CalvinLiu7947 force-pushed the ELR-ak-Describe-topics-api branch from 637d45a to 35f9763 on November 15, 2023 05:35

@CalvinLiu7947 (Contributor Author) commented Nov 18, 2023

  • Updated the API schema to include the Cursor, which is needed in both the request and the response.
  • Removed the RequestLimitReached error.
  • Switched to an ordered map in the TopicsImage.

@mumrah (Member) left a comment

Thanks @CalvinConfluent!

Left some comments inline

Comment thread core/src/main/scala/kafka/server/KafkaApis.scala Outdated
Comment thread clients/src/main/resources/common/message/DescribeTopicPartitionsRequest.json Outdated
Comment thread clients/src/main/resources/common/message/DescribeTopicPartitionsResponse.json Outdated
Comment thread core/src/main/scala/kafka/server/KafkaApis.scala Outdated
Comment thread core/src/main/scala/kafka/server/metadata/KRaftMetadataCache.scala Outdated

@mumrah (Member) commented Nov 20, 2023

@CalvinConfluent btw, have you updated the KIP to reflect the two RPC schemas you've corrected here?

@CalvinLiu7947 (Contributor Author) commented

@mumrah Thanks for the review, KIP updated.

Comment thread clients/src/main/resources/common/message/DescribeTopicPartitionsResponse.json Outdated
Comment thread core/src/main/scala/kafka/server/KafkaApis.scala Outdated
val cursor = describeTopicPartitionsRequest.cursor()
val fetchAllTopics = topics.isEmpty
if (fetchAllTopics) {
  kRaftMetadataCache.getAllTopics().foreach(topic => topics.append(topic))

Contributor

If we copy and sort all the topic names anyway, do we need to change the underlying data structure to NavigableMap? We could just use this list to traverse topic info and it will be in order.

Contributor Author

In the fetch-all path, no additional sort is required. I did not see a good way to convert a Java list to a Scala mutable list, so I copied it.
I use a mutable list for two reasons:

  1. It is easier to filter out the topics that sort alphabetically ahead of the cursor topic.
  2. In the fetch-all case, I think we should still include the cursor topic in the response if it does not exist. A mutable list makes that easier.
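
To make the cursor filtering discussed above concrete, here is a minimal Java sketch. It is not the code in KafkaApis.scala: the class `CursorTopicFilter`, the method `topicsFromCursor`, and the use of a `NavigableSet` of topic names are all assumptions made for illustration. It drops every topic that sorts ahead of the cursor topic and keeps the cursor topic in the result even when it no longer exists, so the caller can report an error for it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

class CursorTopicFilter {
    /**
     * Hypothetical helper: returns the topic names to describe, in sorted
     * order, starting at the cursor topic. A null cursor means "start from
     * the first topic".
     */
    static List<String> topicsFromCursor(NavigableSet<String> sortedTopics, String cursorTopic) {
        if (cursorTopic == null) {
            return new ArrayList<>(sortedTopics);
        }
        List<String> result = new ArrayList<>();
        // Keep the cursor topic even if it does not exist (assumed behavior,
        // following reason 2 in the comment above).
        if (!sortedTopics.contains(cursorTopic)) {
            result.add(cursorTopic);
        }
        // tailSet(..., true) skips everything that sorts ahead of the cursor.
        result.addAll(sortedTopics.tailSet(cursorTopic, true));
        return result;
    }

    public static void main(String[] args) {
        NavigableSet<String> topics = new TreeSet<>(List.of("alpha", "bravo", "delta"));
        System.out.println(topicsFromCursor(topics, "bravo"));   // [bravo, delta]
        System.out.println(topicsFromCursor(topics, "charlie")); // [charlie, delta]
    }
}
```

With an ordered set the filter is a single `tailSet` call; with an unsorted list of names, the same result would need a sort or a full scan first, which is the trade-off raised in this thread.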

Contributor Author @CalvinLiu7947, Nov 21, 2023

But if you ask whether it is worth the effort to create the full set of underlying structures to get an ordered list when we could just sort the topic list, I am not sure.

@CalvinLiu7947 (Contributor Author) commented

As discussed offline, we will focus on the pagination behavior.
The performance optimization is tracked in https://issues.apache.org/jira/browse/KAFKA-15873

@mumrah (Member) left a comment

Thanks for the updates @CalvinConfluent! Looks like there are some conflicts with trunk.

I think we should add an integration "request" test for the new RPC. See ApiVersionsRequestTest for a basic example. We can also do this as a follow-up.

@CalvinLiu7947 force-pushed the ELR-ak-Describe-topics-api branch from aa1c51b to b2bdf53 on November 28, 2023 21:07

@CalvinLiu7947 (Contributor Author) commented

@mumrah Thanks for the review. The integration tests will be introduced in the client side change.

val result = new ListBuffer[DescribeTopicPartitionsResponsePartition]()
val endIndex = upperIndex.min(topic.partitions().size())
for (partitionId <- startIndex until endIndex) {
  val partition = topic.partitions().get(partitionId)

Contributor

What if the partition doesn't exist?

Contributor Author

Do you mean the partition IDs in the topic are not consecutive? I just realized that is possible.

Contributor Author

Actually, it is not possible: the partition index starts at 0 and increments by 1.
In what case, then, would the partition not exist?

Contributor

The data structure leaves a possibility (due to a bug or a change elsewhere) to have arbitrary numbers. It would be good not to crash if the current assumptions are violated.

Contributor Author

Sure, updated.
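
The defensive handling requested in this thread can be sketched in Java as follows. This is an illustration under stated assumptions, not the change that was actually pushed: the class `SafePartitionRange` and the method `collect` are hypothetical, and a `String` value stands in for the real partition registration object. The point is only that an ID missing from the map is skipped instead of being dereferenced.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SafePartitionRange {
    /**
     * Hypothetical helper: collects the partitions whose IDs fall in
     * [startIndex, endIndex), skipping IDs that are absent from the map.
     */
    static List<String> collect(Map<Integer, String> partitions, int startIndex, int endIndex) {
        List<String> result = new ArrayList<>();
        for (int id = startIndex; id < endIndex; id++) {
            String partition = partitions.get(id);
            if (partition == null) {
                // The "IDs start at 0 and increment by 1" assumption was
                // violated; skip the gap rather than fail on a null value.
                continue;
            }
            result.add(id + ":" + partition);
        }
        return result;
    }

    public static void main(String[] args) {
        // Partition 2 is deliberately missing.
        Map<Integer, String> partitions = Map.of(0, "p0", 1, "p1", 3, "p3");
        System.out.println(collect(partitions, 0, 4)); // [0:p0, 1:p1, 3:p3]
    }
}
```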

if (!partitionResponse.isDefined) {
  val error = try {
    Topic.validate(topicName)
    Errors.UNKNOWN_TOPIC_OR_PARTITION

Contributor

Yeah, but the error is kind of unexpected -- if the user didn't specify a topic in the first place, why would it get an error about a topic that doesn't exist?

@mumrah (Member) left a comment

Thanks for the updates @CalvinConfluent, I like the new iterator approach. I left just one comment on that inline.

I also like that you wrote the new request handler in Java. I think that's a first 😄

.setEligibleLeaderReplicas(Replicas.toList(partition.elr))
.setLastKnownElr(Replicas.toList(partition.lastKnownElr)))
// The partition id may not be consecutive.
val partitions = topic.partitions().keySet().stream().sorted().iterator()

Contributor

This has O(N*logN) runtime complexity and O(N) space complexity. We could do O(N) complexity and not have an extra copy if we just iterate over all partitions and filter the ones that fit into the required range (one of your previous implementations had this).

Contributor Author

I am not sure I follow. The partition IDs can be arbitrary, as in the unit-test cases, and I don't have a simple O(n) solution with no extra space off the top of my head. Quickselect might do the trick, but the Java standard library does not provide it.
Instead, I use a tree set to maintain the K smallest partition IDs larger than the start index. This is better than the original full sort.
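
The tree-set approach described above can be sketched as a bounded selection. This is a minimal Java illustration, not the code in the PR: `SmallestPartitionIds`, `smallestFrom`, and the `limit` parameter are hypothetical names, and treating the start index as inclusive is an assumption. The set never holds more than `limit` IDs, so the cost is O(N log K) time and O(K) extra space, compared with O(N log N) time and O(N) space for sorting every ID.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.TreeSet;

class SmallestPartitionIds {
    /**
     * Hypothetical helper: returns, in ascending order, the `limit` smallest
     * partition IDs that are not below startIndex.
     */
    static List<Integer> smallestFrom(Collection<Integer> partitionIds, int startIndex, int limit) {
        TreeSet<Integer> kept = new TreeSet<>();
        if (limit <= 0) {
            return new ArrayList<>();
        }
        for (int id : partitionIds) {
            if (id < startIndex) {
                continue; // already returned on an earlier page
            }
            if (kept.size() < limit) {
                kept.add(id);
            } else if (id < kept.last() && kept.add(id)) {
                // A smaller ID arrived: evict the current largest so the
                // set stays at `limit` elements.
                kept.pollLast();
            }
        }
        return new ArrayList<>(kept);
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(7, 2, 9, 4, 5, 1);
        System.out.println(smallestFrom(ids, 2, 3)); // [2, 4, 5]
    }
}
```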

.setEligibleLeaderReplicas(Replicas.toList(partition.elr))
.setLastKnownElr(Replicas.toList(partition.lastKnownElr)))
}
val partitions = topic.partitions().keySet()

Contributor

Looks like here we just need to remember the size? Or maybe calculate the nextIndex directly here?

Contributor Author

Done.

val maybeLeader = getAliveEndpoint(image, partition.leader, listenerName)
maybeLeader match {
  case None =>
    val error = if (!image.cluster().brokers.containsKey(partition.leader)) {

Contributor

I guess we need to see what the client does with the error code.

Comment thread core/src/main/scala/kafka/server/metadata/KRaftMetadataCache.scala Outdated
@CalvinLiu7947 requested a review from mumrah on January 18, 2024 22:40

@CalvinLiu7947 (Contributor Author) commented

Verified the following tests locally:

  • testDescribeUnderReplicatedPartitionsWhenReassignmentIsInProgress (also fails in another PR: https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka/detail/trunk/2588/tests/)
  • testDescribeQuorumReplicationSuccessful

This PR is mostly new code that runs on its own code path, so in theory it should not affect other unit tests. I am rerunning the integration tests after merging the latest master.

@mumrah (Member) left a comment

Thanks for all the work on this @CalvinConfluent. LGTM

Please double check that the failing tests on Jenkins look okay locally.

@CalvinLiu7947 (Contributor Author) commented

@mumrah Thanks! I have verified that the failing tests pass locally.

@mumrah merged commit 7e5ef9b into apache:trunk on Jan 24, 2024
yyu1993 pushed a commit to yyu1993/kafka that referenced this pull request Feb 15, 2024
…14612)

This patch implements the new DescribeTopicPartitions RPC as defined in KIP-966 (ELR). Additionally, this patch adds a broker config "max.request.partition.size.limit" which limits the number of partitions returned by the new RPC.

Reviewers: Artem Livshits <alivshits@confluent.io>, Jason Gustafson <jason@confluent.io>, David Arthur <mumrah@gmail.com>
clolov pushed a commit to clolov/kafka that referenced this pull request Apr 5, 2024
Phuc-Hong-Tran pushed a commit to Phuc-Hong-Tran/kafka that referenced this pull request Jun 6, 2024
Demogorgon314 pushed a commit to Demogorgon314/kop that referenced this pull request Apr 15, 2026
…mnative#942)

Main changes:
- Adapt to the new `AddPartitionsToTxnRequest` from apache/kafka#13231 (KIP-890)
- Support the new `DescribeTopicPartitions` request from apache/kafka#14612 (KIP-966), which is required by some admin APIs

Other changes:
- apache/kafka#13760 retries when `deleteRecords` returns a retriable error, so change the error code to `INVALID_REQUEST`