KAFKA-13064: Make ListConsumerGroupOffsetsHandler unmap for COORDINATOR_NOT_AVAILABLE error#11026
dajac merged 6 commits into apache:trunk
@dajac, please take a look. Thanks.
dajac left a comment
Left a few comments. Thanks for the PR.
    if (groupsToUnmap.isEmpty() && groupsToRetry.isEmpty()) {
        return new ApiResult<>(
            completed,
We could get rid of completed and use Collections.singletonMap(groupId, groupOffsetsListing), no?
No, we can't do that because completed can be an empty map here. Collections.singletonMap(groupId, groupOffsetsListing) would never be empty. Thanks.
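To make the point of this reply concrete, here is a minimal sketch. It assumes a simplified stand-in for the handler's result map; the class `SingletonMapSketch`, its method names, and the use of `String` for the offsets listing are hypothetical and not taken from the patch.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class SingletonMapSketch {
    // Hypothetical stand-in for how the handler fills `completed`:
    // the group is only added when its request succeeded, so the map may stay empty.
    static Map<String, String> completed(String groupId, String listing, boolean succeeded) {
        Map<String, String> completed = new HashMap<>();
        if (succeeded) {
            completed.put(groupId, listing);
        }
        return completed;
    }

    // The suggested alternative: this map always holds exactly one entry,
    // so it cannot express "nothing completed".
    static Map<String, String> alwaysSingleton(String groupId, String listing) {
        return Collections.singletonMap(groupId, listing);
    }
}
```

With `succeeded == false` the first method returns an empty map, which the singleton variant has no way to represent.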
@showuon I think that there is a case that we don't handle correctly.
Imagine that GROUP_AUTHORIZATION_FAILED is returned as a partition error. In this case, we ignore it in handlePartitionError and therefore don't add the failed group to failed. I think that we should also handle all the group level errors in handlePartitionError.
The second thing is that if there is a group failure, we should not add the group to completed at L131. Otherwise, this will complete the group future with an empty list.
Could you check this out and add a test for it?
Good suggestion! Will do it tomorrow (my time). Thanks.
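The two points raised in this thread can be sketched roughly as follows. This is an illustration under stated assumptions, not the code that was merged: `PartitionErrorSketch`, its reduced `Err` enum, and the `Result` holder are hypothetical, and the real handler works with Kafka's own error and key types.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PartitionErrorSketch {
    // Hypothetical, reduced error set used only for this illustration.
    enum Err { NONE, UNKNOWN_TOPIC_OR_PARTITION, GROUP_AUTHORIZATION_FAILED }

    static final class Result {
        final Map<String, List<String>> completed = new HashMap<>();
        final Map<String, Err> failed = new HashMap<>();
    }

    // partitionErrors maps a partition name to the error returned for it.
    static Result handle(String groupId, Map<String, Err> partitionErrors) {
        Result result = new Result();
        List<String> listing = new ArrayList<>();
        for (Map.Entry<String, Err> entry : partitionErrors.entrySet()) {
            switch (entry.getValue()) {
                case NONE:
                    listing.add(entry.getKey());
                    break;
                case GROUP_AUTHORIZATION_FAILED:
                    // Group-level error reported on a partition: fail the whole group
                    // instead of silently ignoring it.
                    result.failed.put(groupId, entry.getValue());
                    break;
                default:
                    // Purely partition-level error: skip this partition only.
                    break;
            }
        }
        // Only complete the group when no group-level failure was recorded;
        // otherwise the group future would complete with an empty or partial listing.
        if (!result.failed.containsKey(groupId)) {
            result.completed.put(groupId, listing);
        }
        return result;
    }
}
```

A test for this case would pass a response in which one partition carries `GROUP_AUTHORIZATION_FAILED` and check that the group appears in `failed` and not in `completed`.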
    switch (error) {
        case COORDINATOR_LOAD_IN_PROGRESS:
            // If the coordinator is in the middle of loading, then we just need to retry
            log.debug("`{}` request for group {} failed because the coordinator " +
Could we also update the log messages here and below to follow what you did in handleGroupError?
Oh, sorry, I forgot the partitionError section. Will do.
    final String unexpectedErrorMsg =
        String.format("`OffsetFetch` request for group id %s failed due to error %s", groupId.idValue, error);
    log.error(unexpectedErrorMsg);
    failed.put(groupId, error.exception(unexpectedErrorMsg));
Could we also stop providing the error message here, as we did for the others?
…ll group level errors
Failed tests are unrelated, thanks.
Failures are not related:
…OR_NOT_AVAILABLE error (#11026)
This patch improves the error handling in `ListConsumerGroupOffsetsHandler` and ensures that `COORDINATOR_NOT_AVAILABLE` is unmapped in order to look up the coordinator again.
Reviewers: David Jacot <djacot@confluent.io>
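As a rough illustration of the behavior the commit message describes, the sketch below classifies a group-level error into one of three actions. It covers only the errors named in this conversation; the class `GroupErrorActionSketch` and the `Err` and `Action` enums are hypothetical names introduced here, and the real handler covers more cases.

```java
final class GroupErrorActionSketch {
    // Hypothetical, reduced error set limited to errors mentioned in this PR.
    enum Err { COORDINATOR_LOAD_IN_PROGRESS, COORDINATOR_NOT_AVAILABLE, GROUP_AUTHORIZATION_FAILED }

    // RETRY: send the request to the same coordinator again.
    // UNMAP: forget the cached coordinator so it is looked up again.
    // FAIL:  complete the group's future exceptionally.
    enum Action { RETRY, UNMAP, FAIL }

    static Action classify(Err error) {
        switch (error) {
            case COORDINATOR_LOAD_IN_PROGRESS:
                // The coordinator is still loading, so a plain retry is enough.
                return Action.RETRY;
            case COORDINATOR_NOT_AVAILABLE:
                // The behavior this PR adds: unmap and rediscover the coordinator.
                return Action.UNMAP;
            default:
                // In this sketch, anything else is treated as fatal for the group.
                return Action.FAIL;
        }
    }
}
```

Treating every remaining error as fatal is a simplification made for the sketch, not a statement about the merged handler.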
Merged to trunk and 3.0.
@showuon Thanks for the patches. Could you update the description of this PR and the others to ensure that the description reflects the changes?
@dajac, all checked and updated. Thank you very much for patiently reviewing all these PRs! After these updates, we are more confident in these new handlers. :)
Make ListConsumerGroupOffsetsHandler unmap for COORDINATOR_NOT_AVAILABLE error
This is the old handle response logic. FYR: