KAFKA-7194; Fix buffer underflow if onJoinComplete is retried after failure #5417
rajinisivaram merged 2 commits into apache:trunk
kkonstantine left a comment
Thanks @hachikuji! Makes perfect sense.
LGTM!
    // if there are any metadata changes affecting any of the consumed partitions (whether or not this
    // instance is subscribed to the topics).
    this.metadata.setTopics(subscriptions.groupSubscription());
    if (!client.ensureFreshMetadata(Long.MAX_VALUE)) throw new TimeoutException();
@hachikuji any brief comment about why this is not needed any more?
There were a couple reasons. Primarily it seemed like an unnecessary optimization which added some complicated failures to think about. We have already updated the internal assignment state at this point, but if this call raises an exception, then the user may see this updated state prior to having their onPartitionsAssigned callback invoked. I was also happy to get rid of one of the final cases of indefinite blocking in the consumer.
rajinisivaram left a comment
@hachikuji Thanks for the PR, LGTM. Merging to trunk and 2.0.
…ailure (#5417)
An untimely wakeup can cause ConsumerCoordinator.onJoinComplete to throw a WakeupException before completion. On the next poll(), it will be retried, but this leads to an underflow error because the buffer containing the assignment data will already have been advanced. The solution is to duplicate the buffer passed to onJoinComplete.
Reviewers: Konstantine Karantasis <konstantine@confluent.io>, Rajini Sivaram <rajinisivaram@googlemail.com>
An untimely wakeup can cause ConsumerCoordinator.onJoinComplete to throw a WakeupException before completion. On the next poll(), it will be retried, but this leads to an underflow error because the buffer containing the assignment data will already have been advanced. The solution is to duplicate the buffer passed to onJoinComplete.
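The failure mode described above can be reproduced with plain java.nio, independent of Kafka. The following is a minimal sketch, not the code from this PR: the class BufferRetryDemo, the helpers readPartitions and sampleAssignment, and the toy wire format (a count followed by that many ints) are all hypothetical stand-ins for the real assignment deserialization. It only illustrates why a second read of an already-advanced ByteBuffer underflows, and why handing each attempt a duplicate() avoids that.

```java
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

public class BufferRetryDemo {

    // Hypothetical stand-in for deserializing an assignment:
    // reads a count, then that many partition ids. Advances the buffer's position.
    public static int[] readPartitions(ByteBuffer buffer) {
        int count = buffer.getInt();
        int[] partitions = new int[count];
        for (int i = 0; i < count; i++) {
            partitions[i] = buffer.getInt();
        }
        return partitions;
    }

    // Builds a toy assignment buffer holding two partition ids (0 and 1).
    public static ByteBuffer sampleAssignment() {
        ByteBuffer buffer = ByteBuffer.allocate(12);
        buffer.putInt(2).putInt(0).putInt(1);
        buffer.flip(); // position = 0, limit = 12: ready for reading
        return buffer;
    }

    // Simulates a first attempt followed by a retry on the same assignment data.
    // Returns true if the retry hits a BufferUnderflowException.
    public static boolean retryUnderflows(boolean duplicatePerAttempt) {
        ByteBuffer assignment = sampleAssignment();
        try {
            // First attempt: imagine an exception interrupts the caller after this read.
            readPartitions(duplicatePerAttempt ? assignment.duplicate() : assignment);
            // Retry: with a shared buffer, the position is already at the limit.
            readPartitions(duplicatePerAttempt ? assignment.duplicate() : assignment);
            return false;
        } catch (BufferUnderflowException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("shared buffer underflows on retry: " + retryUnderflows(false));
        System.out.println("duplicated buffer underflows on retry: " + retryUnderflows(true));
    }
}
```

ByteBuffer.duplicate() shares the underlying bytes but gives the new buffer its own position and limit, so each attempt starts reading from the same offset without copying the assignment data.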