KAFKA-15170: CooperativeStickyAssignor may fail to adjust assignment (#13965)
ableegoldman merged 8 commits into apache:trunk
Conversation
Please note another bug in this implementation: the function assignRackAwareRoundRobin of ConstrainedAssignmentBuilder has two logical errors:
1. a consumer should stay in unfilledMembersWithExactlyMinQuotaPartitions only if currentNumMembersWithOverMinQuotaPartitions < expectedNumMembersWithOverMinQuotaPartitions
2. the check assignment.get(consumer).size() + 1 == maxQuota. This error will cause verifyUnfilledMembers to fail and throw an exception, which triggers a new rebalance that returns the same assignment result, so the group may be stuck forever.

I suggest

@rreddy-22 I see you are familiar with this implementation, please help me check this, thanks!
guozhangwang left a comment:
ping @rajinisivaram @ableegoldman to take a look since you have the most context about sticky assignor with rack info.
This PR is being marked as stale since it has not had any activity in 90 days. If you would like to keep this PR alive, please ask a committer for review. If the PR has merge conflicts, please update it with the latest from trunk (or the appropriate release branch). If this PR is no longer valid or desired, please feel free to close it. If no activity occurs in the next 30 days, it will be automatically closed.
@flashmouse can you rebase this PR please?
|
Updated to the latest commit now @ableegoldman
ableegoldman left a comment:
LGTM, left a few suggestions to improve readability and hopefully prevent any more bugs like this from slipping in in the future.
Nice find by the way! Thanks for the fix
if (assignmentCount >= minQuota) {
    unfilledMembersWithUnderMinQuotaPartitions.remove(consumer);
-   if (assignmentCount < maxQuota)
+   if (assignmentCount < maxQuota && (currentNumMembersWithOverMinQuotaPartitions < expectedNumMembersWithOverMinQuotaPartitions)) {
This fix makes sense, good find. But let's add a comment because obviously this is difficult to understand if the bug slipped in.
Something like:
// Only add this consumer if the current num members at maxQuota is less than the expected number
// since a consumer at minQuota can only be considered unfilled if it's possible to add another partition,
// which would bump it to maxQuota and exceed the expectedNumMembersWithOverMinQuotaPartitions
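To make the quota arithmetic behind this condition concrete, here is a simplified, hypothetical model of the bookkeeping (the method and class names are illustrative and mirror the fields above; this is not the actual Kafka code). With P partitions and C consumers, each consumer ends up with either minQuota = floor(P / C) or maxQuota = ceil(P / C) partitions, and only a limited number of consumers may reach maxQuota.

```java
// Minimal sketch of the constrained-assignment quota check (hypothetical names).
public class QuotaSketch {
    // Decide whether a consumer can be given one more partition.
    // A consumer below minQuota can always take more; a consumer at or above
    // minQuota is only "unfilled" if bumping it to maxQuota would not exceed
    // the expected number of maxQuota members.
    static boolean canTakeOneMore(int assignmentCount,
                                  int minQuota, int maxQuota,
                                  int currentNumMembersWithOverMinQuota,
                                  int expectedNumMembersWithOverMinQuota) {
        if (assignmentCount < minQuota) {
            return true; // still under the minimum quota
        }
        return assignmentCount < maxQuota
            && currentNumMembersWithOverMinQuota < expectedNumMembersWithOverMinQuota;
    }
}
```

For example, with 7 partitions and 3 consumers, minQuota is 2, maxQuota is 3, and exactly one consumer is expected at maxQuota; once one consumer holds 3 partitions, the remaining consumers at 2 partitions are no longer unfilled.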
private final Set<TopicPartition> partitionsWithMultiplePreviousOwners;
private final Set<TopicPartition> allRevokedPartitions;
private final Map<TopicPartition, String> mayRevokedPartitions;
nit: this name is a bit confusing, how about maybeRevokedPartitions?
Modified, please check again.
maybeRevokedPartitions sounds comprehensive yet short; it is indeed a nice choice of words.
More options for those who do not prefer 'maybe' could be possiblyRevokedPartitions or perhapsRevokedPartitions.
More options for those who do not prefer 'maybe' could be possiblyRevokedPartitions or perhapsRevokedPartitions.
I actually considered both of those as well -- ultimately went with "maybe" since the variables in this class are all way too long already, but I'm happy with any of these choices 🙂
ableegoldman left a comment:
One more small thing but after that I'm happy to merge it
// Keep track of the partitions being migrated from one consumer to another during assignment
// so the cooperative assignor can adjust the assignment
- protected Map<TopicPartition, String> partitionsTransferringOwnership = new HashMap<>();
+ public Map<TopicPartition, String> partitionsTransferringOwnership = new HashMap<>();
Instead of making this public can you actually make it private and just add a getter method that returns it? We don't want any class fields being modifiable from the outside
That makes sense, but setting it to private would require many modifications in org.apache.kafka.clients.consumer.internals.AbstractStickyAssignorTest. I think you mean protected is enough? Changed to protected now.
    return partitionMovements.isSticky();
}

public Map<TopicPartition, String> getPartitionsTransferringOwnership() {
Forgot to mention, it's not a big deal, but typically we don't include the "get" in getter names in the Kafka clients. Just a naming convention, i.e. this would be just partitionsTransferringOwnership()
I know, please check again
Test failures are unrelated, merging to trunk
Merged to trunk and cherry-picked back to 3.7
This PR fixes two kinds of bugs in the new(ish) rack-aware part of the sticky assignment algorithm:
First, when reassigning "owned partitions" to their previous owners, we now have to take the rack placement into account and might not immediately assign a previously-owned partition to its old consumer during this phase. There is a small chance this partition will be assigned to its previous owner during a later stage of the assignment, but if it's not then by definition it has been "revoked" and must be removed from the assignment during the adjustment phase of the CooperativeStickyAssignor according to the cooperative protocol. We need to make sure any partitions removed in this way end up in the "partitionsTransferringOwnership".
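A minimal sketch of this adjustment bookkeeping might look as follows. This is illustrative only, not the actual CooperativeStickyAssignor code: partitions and consumers are plain Strings here rather than the real TopicPartition type, and the method name is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of recording partitions that change hands between generations, so the
// cooperative adjustment phase knows which ones the old owner must revoke first.
public class OwnershipTransferSketch {
    // Compare each partition's previous owner with its new owner; any partition
    // that moved to a different consumer is recorded as transferring ownership.
    static Map<String, String> computeTransfers(Map<String, String> previousOwners,
                                                Map<String, String> newOwners) {
        Map<String, String> transferring = new HashMap<>();
        for (Map.Entry<String, String> e : newOwners.entrySet()) {
            String previous = previousOwners.get(e.getKey());
            if (previous != null && !previous.equals(e.getValue())) {
                transferring.put(e.getKey(), e.getValue());
            }
        }
        return transferring;
    }
}
```

Under the cooperative protocol, each partition in this map is withheld from the new owner's assignment for one rebalance so the previous owner can revoke it cleanly.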
Second, the sticky algorithm works in part by keeping track of how many consumers are still "unfilled" when they are at the "minQuota", meaning we may need to assign one more partition to get to the expected number of consumers at the "maxQuota". During the rack-aware round-robin assignment phase, we were not properly clearing the set of unfilled minQuota consumers once we reached the expected number of "maxQuota" consumers (since by definition that means no more minQuota consumers need to or can be given any more partitions, as that would bump them up to maxQuota and exceed the expected count). This bug would result in the entire assignment being failed due to a correctness check at the end which verifies that the "unfilled members" set is empty before returning the assignment. An IllegalStateException would be thrown, failing the rebalance and sending the group into an endless rebalancing loop until/unless it was lucky enough to produce a new assignment that didn't hit this bug.
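The failure mode above can be sketched as follows. This is a simplified, hypothetical stand-in for the verifyUnfilledMembers check, not the real implementation; the names are illustrative.

```java
import java.util.Set;

// Sketch of the end-of-assignment sanity check and the fix described above.
public class UnfilledCheckSketch {
    // Once the expected number of maxQuota consumers is reached, any consumers
    // still sitting at exactly minQuota are in fact filled and must be cleared
    // from the unfilled set; otherwise the final verification rejects the
    // entire assignment.
    static void verifyUnfilledMembersEmpty(Set<String> unfilledAtMinQuota,
                                           int currentNumAtMaxQuota,
                                           int expectedNumAtMaxQuota) {
        if (currentNumAtMaxQuota == expectedNumAtMaxQuota) {
            unfilledAtMinQuota.clear(); // the fix: these members can take no more partitions
        }
        if (!unfilledAtMinQuota.isEmpty()) {
            // Without the fix this throws, failing the rebalance; the retry can
            // produce the same assignment and loop forever.
            throw new IllegalStateException("Unfilled members remain: " + unfilledAtMinQuota);
        }
    }
}
```

With the clearing step in place, a consumer stuck at minQuota no longer trips the check once the maxQuota count is already satisfied.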