KAFKA-13062: Make DeleteConsumerGroupsHandler unmap for COORDINATOR_NOT_AVAILABLE error #11021
dajac merged 4 commits into apache:trunk
@dajac, please take a look. Thanks.
    }
    return new ApiResult<>(completed, failed, unmapped);

    if (groupsToUnmap.isEmpty() && groupsToRetry.isEmpty()) {
It seems incorrect to do this here. We were able to do so in the others because they expected only one group at a time. This one is different: the driver will retry any group that is neither completed nor failed. It seems to me that we could keep the existing code, no?
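To illustrate the point about the driver, here is a minimal sketch of the bookkeeping described above. It is not the Kafka driver code: the class `RetrySketch`, its `Result` holder, and `keysToRetry` are hypothetical names, and the assumption is that any requested group missing from the completed, failed, and unmapped sets is simply sent again.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

class RetrySketch {
    // Hypothetical stand-in for the per-response result returned by a handler.
    static final class Result {
        final Set<String> completed = new HashSet<>();
        final Set<String> failed = new HashSet<>();
        final Set<String> unmapped = new HashSet<>();
    }

    // Assumption: groups absent from all three sets are retried by the driver,
    // so the handler does not need to track them explicitly.
    static Set<String> keysToRetry(Set<String> requested, Result result) {
        Set<String> retry = new TreeSet<>(requested);
        retry.removeAll(result.completed);
        retry.removeAll(result.failed);
        retry.removeAll(result.unmapped);
        return retry;
    }

    public static void main(String[] args) {
        Result result = new Result();
        result.completed.add("group-a");
        result.unmapped.add("group-b");
        Set<String> requested = new HashSet<>(Arrays.asList("group-a", "group-b", "group-c"));
        System.out.println(keysToRetry(requested, result));
    }
}
```

With several groups in one request, a handler that only reports what it knows leaves the remaining groups to be retried, which is why an early return based on empty unmap and retry lists is not needed here.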
    case INVALID_GROUP_ID:
    case NON_EMPTY_GROUP:
    case GROUP_ID_NOT_FOUND:
        log.error("Received non retriable failure for group {} in `{}` response", groupId,
I would also try to make the logs uniform and would use debug everywhere except for unexpected errors.
    case INVALID_GROUP_ID:
    case NON_EMPTY_GROUP:
    case GROUP_ID_NOT_FOUND:
        log.debug("`DeleteConsumerGroups` request for group id {} failed due to error {}", groupId, error);
nit: We should use groupId.idValue here and in the others.
    case COORDINATOR_LOAD_IN_PROGRESS:
    case COORDINATOR_NOT_AVAILABLE:
        // If the coordinator is in the middle of loading, then we just need to retry
        log.debug("`DeleteConsumerGroups` request for group {} failed because the coordinator " +

        unmapped.add(groupId);
        // If the coordinator is unavailable or there was a coordinator change, then we unmap
        // the key so that we retry the `FindCoordinator` request
        log.debug("`DeleteConsumerGroups` request for group {} returned error {}. " +
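The classification these hunks move toward can be sketched as follows. This is an illustration rather than the handler's actual code: the enums `ErrorCode` and `Action` and the method `classify` are hypothetical, `NOT_COORDINATOR` is assumed to be the error behind "there was a coordinator change", and treating unexpected errors as failures is likewise an assumption.

```java
class DeleteGroupsErrorSketch {
    // Hypothetical subset of error codes relevant to this PR.
    enum ErrorCode {
        NONE, INVALID_GROUP_ID, NON_EMPTY_GROUP, GROUP_ID_NOT_FOUND,
        COORDINATOR_LOAD_IN_PROGRESS, COORDINATOR_NOT_AVAILABLE, NOT_COORDINATOR, UNKNOWN
    }

    // Hypothetical outcome for a single group in the response.
    enum Action { COMPLETE, FAIL, RETRY, UNMAP }

    static Action classify(ErrorCode error) {
        switch (error) {
            case NONE:
                return Action.COMPLETE;
            case INVALID_GROUP_ID:
            case NON_EMPTY_GROUP:
            case GROUP_ID_NOT_FOUND:
                // Non-retriable: the group is reported as failed.
                return Action.FAIL;
            case COORDINATOR_LOAD_IN_PROGRESS:
                // The coordinator is still loading: resend to the same node.
                return Action.RETRY;
            case COORDINATOR_NOT_AVAILABLE:
            case NOT_COORDINATOR:
                // Unmap the key so the coordinator is looked up again.
                return Action.UNMAP;
            default:
                // Assumption: unexpected errors fail the group.
                return Action.FAIL;
        }
    }

    public static void main(String[] args) {
        for (ErrorCode error : ErrorCode.values()) {
            System.out.println(error + " -> " + classify(error));
        }
    }
}
```

The only behavioral difference from the old grouping is that `COORDINATOR_NOT_AVAILABLE` moves from the retry branch to the unmap branch.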
    final DeletableGroupResultCollection errorResponse1 = new DeletableGroupResultCollection();
    errorResponse1.add(new DeletableGroupResult()
        .setGroupId("groupId")
        .setErrorCode(Errors.COORDINATOR_NOT_AVAILABLE.code())
    );
    env.kafkaClient().prepareResponse(new DeleteGroupsResponse(
        new DeleteGroupsResponseData()
            .setResults(errorResponse1)));
This section tests that "retriable" errors are retried. Before this change, COORDINATOR_NOT_AVAILABLE was considered a retriable error. After this PR, it is treated as an unmapped error, so it is moved later in the test to verify that, on receiving the error, we look up the coordinator again and then resend the request.
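The request sequence the reordered test expects can be sketched with a toy simulation. This does not use the Kafka test utilities: `UnmapFlowSketch`, its `ErrorCode` enum, and `run` are hypothetical, and the driver is reduced to a single group with a lookup step and a delete step.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class UnmapFlowSketch {
    // Hypothetical subset of error codes used by the simulation.
    enum ErrorCode { NONE, COORDINATOR_LOAD_IN_PROGRESS, COORDINATOR_NOT_AVAILABLE }

    // Returns the requests a simplified driver would send for one group,
    // given the queued DeleteGroups responses (an empty queue counts as success).
    static List<String> run(Deque<ErrorCode> responses) {
        List<String> sent = new ArrayList<>();
        boolean mapped = false;
        while (true) {
            if (!mapped) {
                sent.add("FindCoordinator");
                mapped = true;
            }
            sent.add("DeleteGroups");
            ErrorCode error = responses.poll();
            if (error == null || error == ErrorCode.NONE) {
                return sent;
            }
            if (error == ErrorCode.COORDINATOR_NOT_AVAILABLE) {
                // Unmapped: the next iteration looks up the coordinator again.
                mapped = false;
            }
            // COORDINATOR_LOAD_IN_PROGRESS: stay mapped and resend.
        }
    }

    public static void main(String[] args) {
        Deque<ErrorCode> responses = new ArrayDeque<>(Arrays.asList(
            ErrorCode.COORDINATOR_LOAD_IN_PROGRESS,
            ErrorCode.COORDINATOR_NOT_AVAILABLE,
            ErrorCode.NONE));
        System.out.println(run(responses));
    }
}
```

In this sketch a retriable error costs one extra `DeleteGroups` request, while an unmapped error costs a `FindCoordinator` request followed by another `DeleteGroups` request, which is why the test needs an extra prepared coordinator lookup response for the moved case.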
Failures are not related:
…OT_AVAILABLE error (#11021) This patch improves the error handling in `DeleteConsumerGroupsHandler` and ensures that `COORDINATOR_NOT_AVAILABLE` is unmapped in order to look up the coordinator again. Reviewers: David Jacot <djacot@confluent.io>
Merged to trunk and to 3.0. cc @kkonstantine
Make DeleteConsumerGroupsHandler unmap for COORDINATOR_NOT_AVAILABLE error
Old handleResponse logic:
Committer Checklist (excluded from commit message)