Skip to content

KAFKA-14690: Add topic IDs to the OffsetCommit API version 9#13240

Closed
Hangleton wants to merge 76 commits intoapache:trunkfrom
Hangleton:offset-commit-api-version-9
Closed

KAFKA-14690: Add topic IDs to the OffsetCommit API version 9#13240
Hangleton wants to merge 76 commits intoapache:trunkfrom
Hangleton:offset-commit-api-version-9

Conversation

@Hangleton
Copy link
Copy Markdown

@Hangleton Hangleton commented Feb 13, 2023

KAFKA-14690: OffsetCommit API Version 9

This change introduces topic ids in OffsetCommit requests and responses. The approach chosen in the PR is to support references to topics by either their ID or name (but not both at the same time).

Topic IDs are not surfaced to the Kafka consumer APIs.

On the consumer-side, topic IDs and names are resolved using the client-side cached metadata. On brokers, the metadata cache is used. If a topic ID cannot be resolved on the broker, the corresponding topic will be assigned an UNKNOWN_TOPIC_ID in the OffsetCommit response.

This PR is written as part of KIP-848: The Next Generation of the Consumer Rebalance Protocol.

Out of scope

  • The admin client continues to use OffsetCommit version 8, without topic ids.
  • Support of topic ids for offset commits in the new component CommitRequestManager will be added in KAFKA-14777.

Request path on the consumer

KAFKA-14690 drawio (1)

Request path on the broker

Requests version >= 9 use topic ids exclusively. Brokers resolve the topic name in the request handler based on cached metadata. The request is then propagated to the group coordinator which constructs the response with topic names as of time of this PR. The resolution of topic name back to topic ID is performed when the response from the group coordinator is merged with the partial response for topic with error (if any).

KAFKA-14690 drawio

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

Comment thread core/src/main/scala/kafka/server/KafkaApis.scala Outdated
Comment thread clients/src/main/java/org/apache/kafka/common/requests/OffsetCommitRequest.java Outdated
Comment thread clients/src/main/java/org/apache/kafka/common/requests/OffsetCommitResponse.java Outdated
Comment thread clients/src/main/java/org/apache/kafka/common/requests/OffsetCommitRequest.java Outdated
@Hangleton Hangleton force-pushed the offset-commit-api-version-9 branch 2 times, most recently from d09095b to 568cbe5 Compare February 13, 2023 17:00
@Hangleton
Copy link
Copy Markdown
Author

Hello David (@dajac), still working on this but opening a draft if you wish to start reviewing at your convenience.

@dajac dajac self-requested a review February 13, 2023 17:50
@dajac dajac added the KIP-848 The Next Generation of the Consumer Rebalance Protocol label Feb 13, 2023
@dajac
Copy link
Copy Markdown
Member

dajac commented Feb 13, 2023

@Hangleton Thanks. I will take a look later this week.

@Hangleton Hangleton force-pushed the offset-commit-api-version-9 branch 6 times, most recently from 75b997e to 3b634e9 Compare February 14, 2023 13:35
Comment thread core/src/main/scala/kafka/server/KafkaApis.scala Outdated
Comment thread core/src/main/scala/kafka/server/KafkaApis.scala Outdated
Comment thread core/src/main/scala/kafka/server/KafkaApis.scala Outdated
@Hangleton
Copy link
Copy Markdown
Author

Thanks for the review, David. I am working on adding unit tests for OffsetCommitResponse and the server-side handling of the request/response, and fix the bugs you have identified.

@Hangleton Hangleton force-pushed the offset-commit-api-version-9 branch 13 times, most recently from 00b0e10 to f4f7dcd Compare February 20, 2023 15:00
@Hangleton Hangleton force-pushed the offset-commit-api-version-9 branch from d2a813a to 892dd25 Compare May 24, 2023 09:46
@Hangleton Hangleton force-pushed the offset-commit-api-version-9 branch from 1496b9e to 0b0e649 Compare May 24, 2023 13:35
@Hangleton Hangleton requested a review from dajac May 25, 2023 07:58
@Hangleton
Copy link
Copy Markdown
Author

Hi David (@dajac), thanks for the review and apologies for the delayed reply. Thanks to Christo's help, I believe most of your comments have been addressed. I have one question regarding the behaviour of the offset commit consumer API that you identified here. Thanks!

@Hangleton
Copy link
Copy Markdown
Author

Hello David (@dajac), I was discussing this with Christo today as part of his work on the OffsetFetch API. Would you like this PR on OffsetCommit to be split to make the review easier and reduce risks?

Comment on lines +1381 to +1386
if (topicName == null) {
// Could only happen if the broker replied with an ID which was not in the request and
// unknown by this client. This would be a bug.
log.warn("Ignoring invalid topic ID found in OffsetCommit response: " + topic.topicId());
continue;
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a second thought, I wonder if we should complete the future with an exception here. Being defensive would help us to catch bugs early one. What do you think?

Copy link
Copy Markdown
Member

@dajac dajac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Hangleton I am back one this one. I looked at the files in clients package and left a few minor comments.

* - {@link Errors#INVALID_COMMIT_OFFSET_SIZE}
* - {@link Errors#TOPIC_AUTHORIZATION_FAILED}
* - {@link Errors#GROUP_AUTHORIZATION_FAILED}
* - {@link Errors#STALE_MEMBER_EPOCH}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove this one for now as it is not implemented yet?

Comment on lines +32 to +33
// Version 9 adds TopicId field and can return STALE_MEMBER_EPOCH, UNKNOWN_MEMBER_ID
// and UNKNOWN_TOPIC_ID errors (KIP-848).
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove STALE_MEMBER_EPOCH and UNKNOWN_MEMBER_ID for now?

assertFalse(coordinator.commitOffsetsSync(offsets, time.timer(timeoutMs)));
} else {
AtomicBoolean callbackInvoked = new AtomicBoolean();
coordinator.commitOffsetsAsync(offsets, (inputOffsets, exception) -> {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding is that we don't retry when commitOffsetsAsync is used. Is it correct? If it is, it may be better to split the test in two. It is really misleading otherwise.

}
}

private Map<TopicPartition, OffsetAndMetadata> testRetryCommitWithUnknownTopicIdSetup() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It may be better to name this one prepare.....

}
}

private Map<TopicPartition, OffsetAndMetadata> testRetryCommitWithUnknownTopicIdSetup() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: It may be better to name this one prepare.....

Comment on lines +2840 to +2843
Map<TopicIdPartition, Long> byTopicNameOffsets = new HashMap<>();
byTopicNameOffsets.put(ti1p, 100L);
byTopicNameOffsets.put(ti2p, 200L);
byTopicNameOffsets.put(unknownTopicIdPartition, 300L);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this used anywhere?

Comment on lines +2945 to +2952
// The following offset commit response defines a topic incorrectly. The coordinator ignores the topic,
// and the group authorization failure is therefore not propagated.
client.prepareResponse(offsetCommitResponse(topicName, Uuid.ZERO_UUID, Errors.GROUP_AUTHORIZATION_FAILED));
assertTrue(coordinator.commitOffsetsSync(offsets, time.timer(Long.MAX_VALUE)));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that the method should return false if any mismatched topic id. If I commit foo-topic-id and bar-topic-id, the method should not succeed if we don't get a response for any of them, right?

Comment on lines +3007 to +3011
// The following offset commit responses defines a topic incorrectly. The coordinator ignores the topic,
// and the group authorization failure is therefore not propagated.
asserter.accept(
offsetCommitResponse(topicName, Uuid.ZERO_UUID, Errors.GROUP_AUTHORIZATION_FAILED),
null);
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This case is not correct as well in my opinion. The caller should get an exception in this case.

return topic;
}

public static final class NameAndId {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: TopicNameAndId?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a bit weird to have this class defined here but I cannot think of a better place for now. Thoughts?

).build(version);
)))
),
true
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this seems to be misaligned.

@dajac dajac added the stale Stale PRs label Sep 5, 2023
@dajac
Copy link
Copy Markdown
Member

dajac commented Sep 29, 2023

Closing this PR for now as the topic id work will be done later. We can re-open it when we resume the work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

KIP-848 The Next Generation of the Consumer Rebalance Protocol stale Stale PRs

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants