KAFKA-7126: Reduce number of rebalance for large consumer group after a topic is created #5408
jonlee2 wants to merge 1 commit into apache:trunk from jonlee2:KAFKA-7126
@lindong28 @hachikuji @tedyu Can you take a look? Thanks.
Raised INFRA-16796 for Jenkins
Add javadoc for nowMs parameter
lindong28
left a comment
LGTM overall. Left one minor comment.
Maybe we can improve the comment a bit more to clarify why we need to request metadata here. The important point is that, when a topic is created, the metadata update in any consumer can trigger a rebalance. In order for the topic creation to trigger only one rebalance in the entire consumer group, we need each consumer to refresh metadata before it re-joins the group.
Currently the comment suggests that if we refresh metadata before the consumer re-joins the group, the consumer can avoid a rebalance caused by a later metadata refresh. That seems a bit confusing.
How about something like this:
For a consumer group that uses pattern-based subscription, after a topic is created, any consumer that discovers the topic after a metadata refresh can trigger a rebalance across the entire consumer group. Multiple rebalances can be triggered after one topic creation if consumers refresh metadata at vastly different times. We can significantly reduce the number of rebalances caused by a single topic creation by asking each consumer to refresh metadata before re-joining the group, as long as the refresh backoff time has passed.
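To make the suggested wording concrete, here is a minimal sketch of the check it describes. It is not the code in this patch: `MetadataSketch`, `RejoinSketch`, `timeToAllowUpdate`, and `maybeRequestRefreshBeforeRejoin` are hypothetical names used only for illustration, and the backoff is assumed to be measured from the last refresh attempt.

```java
/** Hypothetical stand-in for the client's metadata bookkeeping; not the Kafka class. */
final class MetadataSketch {
    private final long refreshBackoffMs;
    private final long lastRefreshMs;
    private boolean updateRequested;

    MetadataSketch(long refreshBackoffMs, long lastRefreshMs) {
        this.refreshBackoffMs = refreshBackoffMs;
        this.lastRefreshMs = lastRefreshMs;
    }

    /** Remaining backoff in ms before another refresh is allowed; 0 if allowed now. */
    synchronized long timeToAllowUpdate(long nowMs) {
        return Math.max(lastRefreshMs + refreshBackoffMs - nowMs, 0);
    }

    /** Marks the metadata as needing a refresh. */
    synchronized void requestUpdate() {
        updateRequested = true;
    }

    synchronized boolean updateRequested() {
        return updateRequested;
    }
}

final class RejoinSketch {
    /**
     * Assumed to be called just before a consumer re-joins the group.
     * Requests a refresh only for pattern subscriptions, and only once
     * the refresh backoff has elapsed. Returns true if a refresh was requested.
     */
    static boolean maybeRequestRefreshBeforeRejoin(MetadataSketch metadata,
                                                   boolean hasPatternSubscription,
                                                   long nowMs) {
        if (hasPatternSubscription && metadata.timeToAllowUpdate(nowMs) == 0) {
            metadata.requestUpdate();
            return true;
        }
        return false;
    }
}
```

The point of gating on the backoff is that a consumer that refreshed very recently does not hammer the brokers just because a rebalance started.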
Updated the comment as suggested.
LGTM. @junrao @ijuma @hachikuji Would you have time to review this patch?
hachikuji
left a comment
Thanks for the patch. I think the approach makes sense. Can you write a test case so that we do not regress this behavior in the future?
Thanks much for your review @hachikuji. Yeah it is better to have a test.
…ps after a topic is created
@hachikuji @lindong28 Thanks for the review. I added a unit test. Can you take a look?
lindong28
left a comment
LGTM. Left one minor comment. I will just make the change and merge to trunk.
 * @return remaining time in ms till updating the cluster info
 */
public synchronized long timeToNextUpdate(long nowMs) {
nit: can you remove the extra space here?
… a topic is created

This patch forces metadata update for consumers with pattern subscription at the beginning of rebalance (retry.backoff.ms is respected). This is to prevent such consumers from detecting subscription changes (e.g., new topic creation) independently and triggering multiple unnecessary rebalances. KAFKA-7126 contains detailed scenarios and rationale.

Author: Jon Lee <jonlee@linkedin.com>
Reviewers: Jason Gustafson <jason@confluent.io>, Ted Yu <yuzhihong@gmail.com>, Dong Lin <lindong28@gmail.com>

Closes #5408 from jonlee2/KAFKA-7126

(cherry picked from commit a932520)
Signed-off-by: Dong Lin <lindong28@gmail.com>
The PR has been merged to 2.0 branch as well.
This patch forces metadata update for consumers with pattern subscription at the beginning of rebalance (retry.backoff.ms is respected). This is to prevent such consumers from detecting subscription changes (e.g., new topic creation) independently and triggering multiple unnecessary rebalances. KAFKA-7126 contains detailed scenarios and rationale.
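As a rough illustration of why the forced refresh helps, the toy model below counts rebalances after a single topic creation. It is a simplification under stated assumptions, not a description of the coordinator: each consumer has one scheduled metadata refresh time, each distinct discovery time triggers one group-wide rebalance, and the backoff is assumed to have elapsed for every consumer. `RebalanceCountSketch` and `countRebalances` are hypothetical names.

```java
import java.util.Arrays;

final class RebalanceCountSketch {
    /**
     * Counts rebalances in a toy model of one topic creation.
     *
     * @param refreshTimesMs  when each consumer would next refresh metadata on its own
     * @param refreshOnRejoin true if every consumer refreshes metadata when a rebalance starts
     */
    static int countRebalances(long[] refreshTimesMs, boolean refreshOnRejoin) {
        long[] sorted = refreshTimesMs.clone();
        Arrays.sort(sorted);
        int rebalances = 0;
        for (int i = 0; i < sorted.length; i++) {
            // Consumers that discover the topic at the same instant share one rebalance.
            if (i > 0 && sorted[i] == sorted[i - 1]) {
                continue;
            }
            rebalances++;
            // With the forced refresh, the first rebalance makes every consumer
            // see the new topic, so no later discovery remains.
            if (refreshOnRejoin) {
                break;
            }
        }
        return rebalances;
    }
}
```

In this model, four consumers refreshing at 1000, 2000, 2000, and 5000 ms go through three rebalances without the forced refresh and one with it.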