Skip to content

KAFKA-9630; Replace OffsetsForLeaderEpoch request/response with automated protocol#9547

Merged
dajac merged 9 commits intoapache:trunkfrom
dajac:KAFKA-9630
Nov 19, 2020
Merged

KAFKA-9630; Replace OffsetsForLeaderEpoch request/response with automated protocol#9547
dajac merged 9 commits intoapache:trunkfrom
dajac:KAFKA-9630

Conversation

@dajac
Copy link
Copy Markdown
Member

@dajac dajac commented Nov 3, 2020

This PR migrates the OffsetsForLeaderEpoch request/response to the automated protocol. It also refactors the OffsetsForLeaderEpochClient to use directly the internal structs generated by the automated protocol. It relies on the existing tests.

It seems that we could also refactor the broker (api layer, fetcher) to directly use the internal structs of the generated protocol but, as it is more involved, I propose to address it in a separate PR.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

Comment thread clients/src/main/resources/common/message/OffsetForLeaderEpochRequest.json Outdated
@dajac dajac changed the title KAFKA-9630; Migrate OffsetsForLeaderEpochRequest/Response to the auto-generated protocol KAFKA-9630; Replace OffsetsForLeaderEpoch request/response with automated protocol Nov 9, 2020
@dajac dajac marked this pull request as ready for review November 9, 2020 19:53
Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just a couple comments.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder if it might be simpler to initialize partitionsToRetry from the request key set.

Copy link
Copy Markdown
Member Author

@dajac dajac Nov 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The downside of relying on partitionsToRetry is that we can't differentiate the partitions to be retried due to an error from the missing partitions in the response. It may not be that important but we can't preserve the above log otherwise.

logger().warn("Missing partition {} from response, retrying.", topicPartition);

I suppose that this log entry is there for a reason so I went with preserving it. What do you think?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong opinion. It's an unlikely case anyway, so I'm not sure it calls for special treatment.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're right. Let me remove it.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have similar logic in OffsetsForLeaderEpochClient.prepareRequest. Wonder if we should push it to the Builder?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good question. Actually, I was hoping to push this one out of the builder while removing PartitionData in profit of using the internal data structure of the auto-generated protocol. I'd like to do this in a follow-up PR.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds fine. Do you have a jira?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

@hachikuji hachikuji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Copy Markdown
Member

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this great patch. +1

some trivial comments are left. please take a look :)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure why this kind of method still exist in each request. It is not used anymore.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. As we have kept them in other requests/responses, I will keep it for now. We should consider removing all of them.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these are still used in RequestResponseTest. There's probably a lot of cruft like this that we can start cleaning up though.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth using data rather than responses() to avoid extra conversion?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hope we can get rid of those conversion in the future :)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will address this in a follow-up PR.

gardnervickers added a commit to gardnervickers/kafka-1 that referenced this pull request Nov 18, 2020
OffsetForLeaderEpoch and Produce are not yet generated RPCs, but will be once apache#9401 and apache#9547 are merged.
@dajac
Copy link
Copy Markdown
Member Author

dajac commented Nov 18, 2020

@chia7712 @hachikuji Thanks for your comments. I have addressed them. Could you have another quick look?

@hachikuji
Copy link
Copy Markdown
Contributor

@dajac The updates LGTM.

@chia7712
Copy link
Copy Markdown
Member

@dajac +1 to nice updating. sorry that my commit causes conflicting files to you.

@dajac
Copy link
Copy Markdown
Member Author

dajac commented Nov 19, 2020

Failed test seems unrelated to this PR. Merging to trunk.

@dajac dajac merged commit 51c833e into apache:trunk Nov 19, 2020
@dajac dajac deleted the KAFKA-9630 branch November 19, 2020 10:41
dajac added a commit that referenced this pull request Dec 3, 2020
This patch follows up #9547. It refactors KafkaApis, ReplicaManager and Partition to use `OffsetForLeaderEpochResponseData.EpochEndOffset` instead of `EpochEndOffset`. In the mean time, it removes `OffsetsForLeaderEpochRequest#epochsByTopicPartition` and `OffsetsForLeaderEpochResponse#responses` and replaces their usages to use the automated protocol directly. Finally, it removes old constructors in `OffsetsForLeaderEpochResponse`. The patch relies on existing tests.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>
dajac added a commit that referenced this pull request Dec 17, 2020
…automated protocol (#9689)

This patch follows up #9547. It refactors AbstractFetcherThread and its descendants to use `OffsetForLeaderEpochRequestData.OffsetForLeaderPartition` instead of `OffsetsForLeaderEpochRequest.PartitionData`. The patch relies on existing tests.

Reviewers: Chia-Ping Tsai <chia7712@gmail.com>, Jason Gustafson <jason@confluent.io>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants