Skip to content

KIP-222 - Add Consumer Group operations to Admin API#4454

Closed
jeqo wants to merge 70 commits intoapache:trunkfrom
jeqo:feature/admin-client-describe-consumer-group
Closed

KIP-222 - Add Consumer Group operations to Admin API#4454
jeqo wants to merge 70 commits intoapache:trunkfrom
jeqo:feature/admin-client-describe-consumer-group

Conversation

@jeqo
Copy link
Copy Markdown
Contributor

@jeqo jeqo commented Jan 21, 2018

@hachikuji
Copy link
Copy Markdown
Contributor

I did not take a deep look yet, but there are some problems that will probably require changes to the KIP. The group management protocol handles more than just consumer groups. Connect also uses it for its own group management. Connect groups do not have topic partition assignments, so the current MemberAssignment object does not work generally. I see two options:

  1. Get rid of the generic group APIs and only support consumer groups for now. In other words, if the protocol type is not "consumer," we just raise an exception. This would be a bummer, but it's probably the only way to make your current API work.

  2. Change the API to accommodate different group types. This will mean making the MemberAssignment generic so that it does not depend on the consumer assignment.

Additionally, I think there are a few other points to consider:

  • The describe API has the ability to fetch the assignment currently, but not the consumer's subscription. I think we should consider exposing this.
  • Users typically expect "simple" consumer groups which do not use group management to be included when listing consumer groups. In this case, the protocol type will be empty.
  • No fault of your's, but concurrently, there was a KIP to add support for deleting groups. We should consider adding an AdminClient API to support this (it could come later if needed, but there shouldn't be anything too tricky from an API perspective, so we may as well do it here.

@hachikuji
Copy link
Copy Markdown
Contributor

Just to clarify, if we go with the first option that I suggested above, we probably don't need a revote on the KIP. I doubt anyone will object if you just send an email to the discussion thread with the changes. Specifically, we should do the following:

  1. Drop the describeGroups API.
  2. Change the name of listGroups to listConsumerGroups.
  3. Change listGroupOffsets to listConsumerGroupOffsets.

Otherwise, the logic stays the same. We would just need to filter groups which do not have the "consumer" protocol type. Note that I do think it's important to account for simple consumer groups as well. These can be identified as groups with an empty protocol type.

@jeqo
Copy link
Copy Markdown
Contributor Author

jeqo commented Jan 28, 2018

@hachikuji thanks for your feedback!

Putting Group Management (connect and consumer groups) together as part of the changes to the API was one of the important points discussed in this KIP.

As I understand so far: connect group details are different from consumer group details, so we will need to define different APIs for both instead of trying to make them generic as your second option propose.

Following what your first option proposed we can add:

  1. Drop describeGroups API (and change it to describeConsumerGroups?)
  2. Change listGroups to listConsumerGroups
  3. Change listGroupOffsets to listConsumerGroupOffsets.
  4. add listConnectGroups?
  5. I don't know how Connect Groups use Group Management API in detail but if they have offset management also, we can add listConnectGroupOffsets and describeConnectGroups API

Simple Consumer Groups will be added to listConsumerGroupOffsets filtering protocol type = "consumer" or empty right? I can propose it as part of the KIP discussion to check if it make sense to add them as part of the same response.

And about adding DeleteGroups API I would be happy to, but I think we would need to hold until #4479 is merged right?

@hachikuji
Copy link
Copy Markdown
Contributor

@jeqo I'd probably suggest sticking with consumer groups initially if we want to have any hope of getting this into the release. So if we have 1-3 in your list, we can leave general group APIs for future work. The nice thing about sticking with consumer groups is that it lets us expose the inner details from a consumer perspective. Here a couple concrete suggestions:

  1. Change GroupListing to ConsumerGroupListing. If we do that, then we don't need to expose "protocol type" at all. Maybe we can replace it with a simple flag which indicates whether it is a simple group or not.
  2. Change GroupDescription to ConsumerGroupDescription. We can do the same thing with the "protocol type." We should also expose the group "protocol," which for consumers, just means the assignment strategy.

As far as DeleteGroups, we should probably only consider it if we manage to get 1-3 merged.

Thoughts?

@jeqo
Copy link
Copy Markdown
Contributor Author

jeqo commented Jan 30, 2018

@hachikuji I have applied changes mentioned above. Let me know WDYT, and if I should take this back to the KIP discussion to gather some feedback. I could propose to move Connect Group Management and Delete Consumer Group to a next KIP.

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeqo Thanks for the updates. From an API perspective, this is looking good. However, the request logic needs a bit more work. I would suggest having a look at kafka.admin.AdminClient to understand the logic behind these requests.

I don't think it's reasonable to get this patch in the next day, so let's plan on getting this into the next feature release. With a little more time, we can also consider the delete API and non-consumer groups (potentially in a separate KIP).

private final String groupId;
private final boolean isSimpleConsumerGroup;
private final List<MemberDescription> members;
private final String protocol;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can use the consumer-specific term "partitionAssignor"?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

}
final long now = time.milliseconds();
runnable.call(new Call("describeConsumerGroups", calcDeadlineMs(now, options.timeoutMs()),
new ControllerNodeProvider()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DescribeGroup API has to be sent to the group coordinator, which is potentially a different node for each group. You use the FindCoordinator API in order to lookup the coordinator for a given group. The logic should be something like this:

  1. For each group in the request, send a FindCoordinator request to any node in the cluster.
  2. Group the results by coordinator id.
  3. Send DescribeGroups to each coordinator from 2.

Ideally, we should also handle retries correctly. It could happen that the coordinator moves to another node by the time we send DescribeGroups. In this case, the error code will be NOT_COORDINATOR. We should handle this by looking up the coordinator again.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I have pushed draft following this description.

}

@Override
public ListConsumerGroupsResult listConsumerGroups(ListConsumerGroupsOptions options) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the consumer groups are not aggregated in the same way that topic metadata is. To get all the groups in the cluster, you have to send the ListGroups request to all nodes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is the same as DescribeConsumerGroup API. If changes are fine, I will updated this an listConsumerGroupOffsets

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic is different for ListGroups. We have to send a separate request to every broker in the cluster and then aggregate the results.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hachikuji trying to test this I find out that the result from this operation has to change (similar to how the Scala API is doing now) to respond with Map<Node, KafkaFuture<List<ConsumerGroupListing>>> instead, to list all responses from nodes. Does this make sense? I can update the KIP regarding this.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd vote for doing the flattening internally than expose the node information in the results, as this is supposed to be internal implementation details that is better not leaking out. The Scala API returning Map[Node, List[GroupOverview]] was not a well designed one in my hind-sight.

}

@Override
public ListConsumerGroupOffsetsResult listConsumerGroupOffsets(final String groupId, final ListConsumerGroupOffsetsOptions options) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like DescribeGroups, we need to find the coordinator for the group to send the OffsetFetch request to.

@asfgit
Copy link
Copy Markdown

asfgit commented Feb 6, 2018

Refer to this link for build results (access rights to CI server needed):
https://builds.apache.org/job/kafka-pr-test-coverage/349/

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the review delay. I will look in more depth once the 1.1 release has been finalized. In the meantime, I noticed there were no test cases. It would be a good idea to add both unit and integration tests for the new functionality.

@guozhangwang
Copy link
Copy Markdown
Contributor

Hmm. Why does there have to be a futures map for listConsumerGroups in the first place? listConsumerGroups just returns a future which contains the list of all the consumer groups in the cluster, right? I don't see why a map needs to be involved. On the other hand, if you want to learn more about a specific consumer group, you use describeConsumerGroup, which does know the groups it is getting information about ahead of time. This is analogous to listTopics versus describeTopics. listTopics doesn't have a map. describeTopics does.

Just to clarify, we will not expose the map inside the future to user; the map will be maintained internally only during the second round-trip as we need to check if we have received the responses from all found broker nodes, but as I suggested we should flatten it before exposing to the users. So the returned type of listConusmer will be KafkaFuture<Collection<ConsumerGroupListing>>, while the returned type of describeConsumer will be Map<String, KafkaFuture<ConsumerGroupDescription>> (note for the latter, it is because users already specifically give the list of consumer group ids, like the deleteRecords api)

hachikuji and others added 3 commits April 9, 2018 15:39
…uests (apache#4842)

This patch fixes an edge case in producer shutdown which prevents `close()` from completing due to a pending request which will never be sent due to shutdown initiation. I have added a test case which reproduces the scenario.

Reviewers: Apurva Mehta <apurva@confluent.io>, Ismael Juma <ismael@juma.me.uk>
Added consumer only workload to Trogdor. The topics must already be pre-populated. The spec lets the user request topic pattern and range of partitions to assign to [startPartition, endPartition].

Reviewers: Colin P. Mccabe <cmccabe@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
…e#4845)

The toString() for ConfigResource was using { } instead of ( ) which is inconsistent with the existing toStrings in the code, while toString for Resource was using a mix of ( and }.
Copy link
Copy Markdown
Contributor

@guozhangwang guozhangwang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jeqo I left some more comments on the return types of other Result classes.

Also the jenkins failures seem consistent, need to fix them before merging as well.

this.futures = futures;
}

public KafkaFuture<Map<String, KafkaFuture<Void>>> values() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can just have one function between values and groups here. I'd suggest we use

public Map<TopicPartition, KafkaFuture<Void>> deletedGroups()

return futures;
}

public KafkaFuture<Collection<String>> names() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this function used?

I'd suggest we only keep one function, i.e.

public Map<TopicPartition, KafkaFuture< ConsumerGroupDescription >> DescribeConsumerGroupsResult#values()

});
}

public KafkaFuture<Collection<KafkaFuture<Void>>> all() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all() function, its returned type should be KafkaFuture<Void>; ditto for other two Results as well.

/**
* Return a future which yields a map of topic partitions to OffsetAndMetadata objects.
*/
public KafkaFuture<Map<TopicPartition, OffsetAndMetadata>> partitionsToOffsetAndMetadata() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest only keep partitionsToOffsetAndMetadata here.

rajinisivaram and others added 8 commits April 10, 2018 20:42
…che#4848)

Move creation of quota callback instance out of KafkaConfig constructor to QuotaFactory.instantiate to avoid creating a callback instance for every KafkaConfig since we create temporary KafkaConfigs during dynamic config updates.

Reviewers: Jun Rao <junrao@gmail.com>
@guozhangwang
Copy link
Copy Markdown
Contributor

retest this please

@guozhangwang
Copy link
Copy Markdown
Contributor

@jeqo could you rebase your PR?

@guozhangwang
Copy link
Copy Markdown
Contributor

@jeqo could you rebase your PR?

NVM, I've done the rebase, and will merge i once the tests passed locally.

@guozhangwang
Copy link
Copy Markdown
Contributor

Unit test passed locally, merging now

@jeqo
Copy link
Copy Markdown
Contributor Author

jeqo commented Apr 11, 2018

@guozhangwang thanks!! :)

koqizhao pushed a commit to koqizhao/kafka that referenced this pull request Apr 12, 2018
KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-222+-+Add+Consumer+Group+operations+to+Admin+API

Author: Jorge Quilcate Otoya <quilcate.jorge@gmail.com>
Author: Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>
Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Guozhang Wang <wangguoz@gmail.com>

Closes apache#4454 from jeqo/feature/admin-client-describe-consumer-group
koqizhao pushed a commit to koqizhao/kafka that referenced this pull request Apr 12, 2018
KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-222+-+Add+Consumer+Group+operations+to+Admin+API

Author: Jorge Quilcate Otoya <quilcate.jorge@gmail.com>
Author: Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>
Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Guozhang Wang <wangguoz@gmail.com>

Closes apache#4454 from jeqo/feature/admin-client-describe-consumer-group
koqizhao pushed a commit to koqizhao/kafka that referenced this pull request Apr 12, 2018
KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-222+-+Add+Consumer+Group+operations+to+Admin+API

Author: Jorge Quilcate Otoya <quilcate.jorge@gmail.com>
Author: Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>
Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Guozhang Wang <wangguoz@gmail.com>

Closes apache#4454 from jeqo/feature/admin-client-describe-consumer-group
ying-zheng pushed a commit to ying-zheng/kafka that referenced this pull request Jul 6, 2018
KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-222+-+Add+Consumer+Group+operations+to+Admin+API

Author: Jorge Quilcate Otoya <quilcate.jorge@gmail.com>
Author: Jorge Esteban Quilcate Otoya <quilcate.jorge@gmail.com>
Author: Guozhang Wang <wangguoz@gmail.com>

Reviewers: Colin Patrick McCabe <colin@cmccabe.xyz>, Guozhang Wang <wangguoz@gmail.com>

Closes apache#4454 from jeqo/feature/admin-client-describe-consumer-group
@jeqo jeqo deleted the feature/admin-client-describe-consumer-group branch August 8, 2020 09:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.