KAFKA-2490: support new consumer in ConsumerGroupCommand #299

SinghAsDev wants to merge 6 commits into apache:trunk
Conversation
Hi @SinghAsDev, thanks for the patch! I thought the plan was to add support for the new consumer behind a flag, as we did for other tools and for the new producer. The idea is that the new code is not fully tested yet, and we want to give people the option to revert to the old behavior if something goes wrong. Does that make sense?
@gwenshap got it. Will make appropriate changes.
Force-pushed 73ad826 to dc2ba4c
@gwenshap could you take a look now?
Hi, this PR appears to update only the 'describe' functionality of ConsumerGroupCommand. If I'm not mistaken, other functionality, such as 'list', will still not work with the new consumer. Is this intentional?
@datalorax 'list' does not use the consumer to get the list of consumer groups; it instead reads consumer groups from ZK.
@SinghAsDev since the new consumer does not register consumer groups in ZK, does that mean the feature will no longer work? Do we have a workaround, or should we print a warning? @hachikuji - any suggestions?
@gwenshap @SinghAsDev Probably we should await the outcome of KAFKA-2017 before addressing this for the new consumer. If group information ends up in ZooKeeper, then it should be straightforward. However, it seems fairly likely that only high-level information (such as group members) would end up in ZooKeeper in that case. We'd probably need the DescribeGroup API to get subscription/assignment information. I'll add a JIRA for DescribeGroup this afternoon.
A general comment: we should probably also remove ZkUtils.ConsumersPath and any of its references unless KAFKA-2017 reuses it. The current plan is that although we will probably also use "/consumers" as the root, the inner nested path will be changed.
@hachikuji it would certainly be useful to have some method to discover the active set of consumer groups and their details. Could you let me know the ID of any JIRA you create, please? Thanks
We will have to revisit this after KAFKA-2017, as suggested by @hachikuji. I will comment on KAFKA-2017. Based on what we decide there, we might need the DescribeGroup API. @hachikuji I should be able to help out with that.
@datalorax @SinghAsDev Sorry for the delay on this. I was a little more optimistic before that KAFKA-2017 would be committed prior to 0.9, but it's now looking like it will be pushed. So I created KIP-40, which proposes to modify the group metadata request to support this use case. With that proposal, fetching all the consumer groups in the cluster would involve sending group metadata requests to all brokers. It's a little annoying, but I don't see any other options at the moment.
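The fan-out approach described above — query every broker, then merge the per-broker listings — can be sketched generically. This is a minimal illustration of the pattern, not actual Kafka code: the broker names and the per-broker lookup function are stand-ins for real metadata requests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Function;

public class GroupFanOut {
    // Merges per-broker group listings into one cluster-wide set.
    // Each broker only knows the groups it coordinates, so a full
    // listing requires asking all brokers and deduplicating.
    static Set<String> listAllGroups(List<String> brokers,
                                     Function<String, List<String>> groupsOnBroker) {
        Set<String> all = new TreeSet<>();
        for (String broker : brokers) {
            all.addAll(groupsOnBroker.apply(broker));
        }
        return all;
    }

    public static void main(String[] args) {
        // Stubbed responses: "etl" is coordinated by broker-1 but a
        // stale listing also appears on broker-0; merging dedupes it.
        Map<String, List<String>> stub = Map.of(
                "broker-0", List.of("console-group", "etl"),
                "broker-1", List.of("etl", "audit"));
        System.out.println(listAllGroups(new ArrayList<>(stub.keySet()), stub::get));
        // prints "[audit, console-group, etl]"
    }
}
```

The annoyance hachikuji mentions is visible even in the sketch: the cost scales with broker count, and any broker that is down leaves its groups out of the merged view.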
@hachikuji that should work for now. Is my understanding correct that we will have to revisit this after KAFKA-2017 gets in?
@SinghAsDev We shouldn't need to wait for KAFKA-2017, but KAFKA-2687. I'm working on a patch for it now under the assumption that there won't be any issues getting the KIP through.
@SinghAsDev An update on the dependency tickets: @hachikuji and I are likely to have both KAFKA-2017 and KAFKA-2687 merged soon (maybe tonight or tomorrow), and we would love to have this ticket rebased, reviewed, and merged as soon as possible once those two are done, since our target release date is next week. If you do not have enough time to rebase it, please let me know so that we can help with the rest of the work. Thanks for being so patient :)
@guozhangwang thanks for the heads up. I will be ready :).
@SinghAsDev Seems like the patch for ListGroups/DescribeGroup (#388) is not going to change significantly, so you could probably get started if you want.
Sure, will start on it tonight.
@hachikuji there are quite a few merge issues when I pull in #388; I will wait for those to get resolved.
@SinghAsDev Apologies, should finish rebasing shortly.
#388 has been merged.
Force-pushed 7465f16 to 691d6af
nit: "WARNING: Group deletion only works for old ZK-based consumer groups, and one has to use it carefully to only delete groups that are not active"
Looks good to me overall. One question: have you tried the group command with the new consumer locally, and if yes, what is the average latency for list/describe groups for a group of, say, 20 members?
…a few review comments.
@guozhangwang so for a group of 20 consumers, list takes ~1.131 s of wall time, and describe takes ~1.5 s for 1 topic and 1 partition.
@SinghAsDev Is that the full time it takes to run the command, or is that the actual request latency?
@hachikuji that is for running the complete command.
@SinghAsDev Cool. I would expect the requests themselves to be fast.
Force-pushed bbd94ef to 3ddcd91
Maybe we should use listAllConsumerGroupsFlattened.
@SinghAsDev As discussed earlier, it might be a little cleaner to modify describeConsumerGroup something like the following:

    case class ConsumerSummary(
      memberId: String,
      clientId: String,
      clientHost: String,
      assignment: List[TopicPartition])

    def describeConsumerGroup(groupId: String): List[ConsumerSummary]

Then you could do any transformation in ConsumerGroupCommand.
LGTM. I'll merge; we can clean up the interface a bit in a follow-up.
@gwenshap @hachikuji my apologies for getting to this a bit late; I have addressed your concern in #442. Thanks!
* refactor(inkless): run append completer directly

  Instead of running this as a separate thread, run it directly on the same thread, as it is not an IO task. Reduces thread allocation.

* refactor(inkless): use fixed thread pools for upload/cache (apache#299)

  Cached thread pools are unbounded. As concurrent uploads wait for commits to complete, the number of uploads can grow without limit, affecting CPU utilization and increasing latencies. To bound uploads to the expected commit latencies, this PR proposes a fixed thread pool (8 by default), allowing up to 8 files to be uploaded concurrently; these uploads have a P99 of 400 ms on S3. As sequential commits can have a P99 of 50 ms on PG, having 8 commits queued would already account for doubling the upload time. This is an initial estimate, and the values and defaults could change as this setup is tested, but having configurable thread pools should benefit us in the long term.
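The cached-vs-fixed distinction above can be demonstrated with plain java.util.concurrent. This is an illustrative sketch, not the Kafka/inkless code: the class and method names are made up, and the pool size and task count are arbitrary. It measures the peak number of tasks running at once, which a fixed pool caps at the pool size while a cached pool would let grow with demand.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class FixedPoolSketch {
    // Submits `tasks` short sleep jobs to a fixed pool of `poolSize`
    // threads and reports the peak number running concurrently.
    // Excess submissions queue inside the pool instead of spawning
    // new threads, so the peak never exceeds poolSize.
    static int maxConcurrent(int poolSize, int tasks) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(tasks);
        for (int i = 0; i < tasks; i++) {
            pool.submit(() -> {
                int now = active.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);  // record high-water mark
                try {
                    Thread.sleep(20);  // stand-in for an upload
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                active.decrementAndGet();
                done.countDown();
            });
        }
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        pool.shutdown();
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency = " + maxConcurrent(8, 32));
    }
}
```

Swapping `Executors.newFixedThreadPool(poolSize)` for `Executors.newCachedThreadPool()` removes the cap: every queued task gets its own thread, which is exactly the unbounded growth the commit message describes.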