KAFKA-14367; Add TxnOffsetCommit to the new GroupCoordinator interface#12901

Merged: dajac merged 6 commits into apache:trunk from dajac:KAFKA-14367-txn-commit-offset on Jan 13, 2023
Conversation

dajac (Member) commented Nov 23, 2022

This patch adds TxnOffsetCommit to the new GroupCoordinator interface and updates KafkaApis to use it.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

dajac added the label KIP-848 (The Next Generation of the Consumer Rebalance Protocol) on Nov 23, 2022
jolshan (Member) commented Dec 21, 2022

reminder to rebase here :)

jolshan (Member) commented Dec 27, 2022

Also, a minor nit in the title: TnxOffsetCommit -> TxnOffsetCommit.

dajac changed the title from "KAFKA-14367; Add TnxOffsetCommit to the new GroupCoordinator interface" to "KAFKA-14367; Add TxnOffsetCommit to the new GroupCoordinator interface" on Jan 9, 2023
dajac force-pushed the KAFKA-14367-txn-commit-offset branch from 10503c9 to a8d5423 on January 9, 2023 16:07
dajac requested a review from jolshan on January 9, 2023 16:08
dajac (Member, Author) commented Jan 9, 2023

@jolshan I have updated the PR.

```scala
sendResponse(responseBuilder.build())
CompletableFuture.completedFuture[Unit](())
} else {
  val txnOffsetCommitRequestData = new TxnOffsetCommitRequestData()
```
Member:
Are we only replacing the topics here? There's a lot of code to move the majority of one request data into a new one.

Member Author:
Right. It is basically a copy of the original request but with the authorized/existing topics.

Contributor:
I agree, it doesn't seem ideal. What's the reason for not reusing the original request data and applying setTopics()?

Member Author:
We cannot mutate the original request: if an error results in calling txnOffsetCommitRequest.getErrorResponse(exception), the error response would only include the authorized topics and not all of the original ones.

I agree that it is not ideal. I considered a few alternatives:

  1. We could use .duplicate().setTopics(...), but duplicate also copies the original topics, which is not necessary;
  2. We could change the GroupCoordinator interface to take individual arguments, but I thought that keeping the interface consistent is better.

In the end, I went with the manual copy.
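The copy-instead-of-mutate trade-off above can be sketched as follows. This is an illustrative Java sketch with simplified stand-in types, not the actual generated TxnOffsetCommitRequestData class:

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical, simplified stand-in for TxnOffsetCommitRequestData: the real
// class is generated from the Kafka protocol JSON and has many more fields.
class RequestData {
    private final List<String> topics;
    RequestData(List<String> topics) { this.topics = topics; }
    List<String> topics() { return topics; }

    // Copy the request, keeping only the topics the caller is authorized to
    // commit to. The original instance is left untouched so that a later
    // getErrorResponse(...) can still enumerate every requested topic.
    RequestData withAuthorizedTopics(Set<String> authorized) {
        return new RequestData(topics.stream()
            .filter(authorized::contains)
            .collect(Collectors.toList()));
    }
}

public class CopyVsMutate {
    public static void main(String[] args) {
        RequestData original = new RequestData(List.of("foo", "bar", "secret"));
        RequestData filtered = original.withAuthorizedTopics(Set.of("foo", "bar"));
        // The filtered copy goes to the coordinator...
        System.out.println(filtered.topics());   // [foo, bar]
        // ...while the original still covers all topics for error responses.
        System.out.println(original.topics());   // [foo, bar, secret]
    }
}
```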

```scala
partitions += tp -> new OffsetAndMetadata(
  offset = partition.committedOffset,
  leaderEpoch = partition.committedLeaderEpoch match {
    case RecordBatch.NO_PARTITION_LEADER_EPOCH => Optional.empty[Integer]
```
jolshan (Member) commented Jan 10, 2023:
Did we previously not handle -1 correctly here? In KafkaApis (line 2544) we just keep -1.

Member Author:
It was handled here in the previous code.

Member:
Oh didn't see this. Thanks for sharing 😅 I like that both are handled in the same place now.
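The sentinel handling under discussion boils down to mapping -1 to an empty Optional. A minimal Java sketch, with the constant hardcoded here to stay self-contained (in Kafka it is RecordBatch.NO_PARTITION_LEADER_EPOCH):

```java
import java.util.Optional;

public class LeaderEpochConversion {
    // Kafka uses -1 as the wire-level sentinel for "no leader epoch"
    // (RecordBatch.NO_PARTITION_LEADER_EPOCH); hardcoded for this sketch.
    static final int NO_PARTITION_LEADER_EPOCH = -1;

    // Map the sentinel to Optional.empty() so downstream code never has to
    // special-case -1 again.
    static Optional<Integer> toOptional(int committedLeaderEpoch) {
        return committedLeaderEpoch == NO_PARTITION_LEADER_EPOCH
            ? Optional.empty()
            : Optional.of(committedLeaderEpoch);
    }

    public static void main(String[] args) {
        System.out.println(toOptional(-1));  // Optional.empty
        System.out.println(toOptional(5));   // Optional[5]
    }
}
```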

```diff
  def testJoinGroup(version: Short): Unit = {
    val groupCoordinator = mock(classOf[GroupCoordinator])
-   val adapter = new GroupCoordinatorAdapter(groupCoordinator)
+   val adapter = new GroupCoordinatorAdapter(groupCoordinator, Time.SYSTEM)
```
Member:
Curious if we wanted to mock time here. It doesn't seem to make a big difference one way or the other for most of these, since they don't use a time component anyway, but using MockTime could be good practice. (Looks like we do use it for the test added 😄)

Member Author:
Why? I usually use MockTime only when I need it. Otherwise, I default to Time.SYSTEM.

Member:
Maybe I'm just recalling some recent tests where I used system time because it was already there, and it led to flakiness. I suppose here, though, it is easy to switch later, so maybe it's not a problem.

Member Author:
Yeah. In this case, it does not matter because the test does not depend on time at all.
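For context on the flakiness point: MockTime avoids it because the test advances the clock explicitly, so timestamps are deterministic. A minimal sketch of the idea (the real class is org.apache.kafka.common.utils.MockTime; this is a simplified stand-in):

```java
// Simplified stand-in for Kafka's MockTime: a clock the test advances
// explicitly instead of reading the wall clock.
public class MockClockSketch {
    static class MockTime {
        private long nowMs;
        MockTime(long startMs) { this.nowMs = startMs; }
        long milliseconds() { return nowMs; }
        // Advances the clock instantly; no real waiting, so tests that check
        // elapsed time cannot flake on a slow CI machine.
        void sleep(long ms) { nowMs += ms; }
    }

    public static void main(String[] args) {
        MockTime time = new MockTime(0L);
        time.sleep(1000L);
        System.out.println(time.milliseconds());  // 1000
    }
}
```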

```scala
  requestLocal.bufferSupplier
).handle[Unit] { (response, exception) =>
  if (exception != null) {
    sendResponse(txnOffsetCommitRequest.getErrorResponse(exception))
```
Member:
Did we want to still send the existing responseBuilder response at all, or mark the whole request as failed?

Member:
Ah -- I misunderstood this error. This is how we handle errors that would have previously been thrown back to KafkaApis.

Member:
I guess my confusion is when we see this line vs. in the handleError above.

Member Author:
Yeah, the idea is to catch unexpected errors here; otherwise, the future would return a response. Note that handleError handles errors thrown within the handle block itself. It is basically a safety net.

Member:
Ah -- so we typically don't expect handleError to see anything since this handle should get most of the errors.

Member Author:
That's right. handleError is more for unexpected errors.
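The division of labor described above, with handle converting coordinator failures into error responses, follows the standard CompletableFuture.handle pattern. A sketch with illustrative names, not the actual KafkaApis code:

```java
import java.util.concurrent.CompletableFuture;

// Illustrative sketch: the coordinator returns a CompletableFuture, and
// handle(...) turns an exceptional completion into an error response instead
// of letting the exception propagate. Names are hypothetical stand-ins.
public class HandlePatternSketch {
    static String buildErrorResponse(Throwable t) {
        return "error:" + t.getMessage();
    }

    static CompletableFuture<String> sendResponse(CompletableFuture<String> result) {
        return result.handle((response, exception) -> {
            if (exception != null) {
                // Failure from the coordinator: convert it to an error
                // response rather than surfacing the exception.
                return buildErrorResponse(exception);
            }
            return response;
        });
    }

    public static void main(String[] args) {
        CompletableFuture<String> ok = CompletableFuture.completedFuture("committed");
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(new IllegalStateException("NOT_COORDINATOR"));

        System.out.println(sendResponse(ok).join());      // committed
        System.out.println(sendResponse(failed).join());  // error:NOT_COORDINATOR
    }
}
```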

```scala
}

@Test
def testHandleTxnOffsetCommitRequestTopicsAndPartitionsValidation(): Unit = {
```
Member:
This test is super thorough! Thanks for adding

jolshan (Member) left a comment:

I've left a few comments. Looks pretty good so far.

dajac (Member, Author) commented Jan 12, 2023

@jolshan @jeffkbkim Updated the PR.

```scala
topic = new TxnOffsetCommitResponseData.TxnOffsetCommitResponseTopic().setName(tp.topic)
byTopics.put(tp.topic, topic)
response.topics.add(topic)
val topic = byTopics.get(tp.topic) match {
```
Member:
This seems like it could use the getOrElseUpdate method, but I'm not sure if we typically have side effects like adding to the response. Not a huge deal either way.

Member Author:
Yeah, I didn’t use it because of the side effect.

Contributor:
The downside is a double lookup (one for the get and another for the insert).
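In Java terms, the single-lookup alternative being weighed here is Map.computeIfAbsent, with the side effect placed inside the mapping function. A sketch with simplified stand-in types, not the actual TxnOffsetCommit response classes:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class GroupByTopicSketch {
    // Group (topic, partition) pairs by topic with a single map lookup per
    // pair. The side effect -- recording the order topics are first seen,
    // standing in for response.topics.add(topic) -- lives inside the mapping
    // function, which is the part reviewers were wary of.
    static Map<String, List<Integer>> group(List<String[]> pairs, List<String> responseOrder) {
        Map<String, List<Integer>> byTopic = new HashMap<>();
        for (String[] tp : pairs) {
            byTopic.computeIfAbsent(tp[0], topic -> {
                responseOrder.add(topic);          // runs only on first sight
                return new ArrayList<>();
            }).add(Integer.parseInt(tp[1]));
        }
        return byTopic;
    }

    public static void main(String[] args) {
        List<String> order = new ArrayList<>();
        Map<String, List<Integer>> byTopic = group(
            List.of(new String[]{"foo", "0"}, new String[]{"foo", "1"}, new String[]{"bar", "0"}),
            order);
        System.out.println(order);               // [foo, bar]
        System.out.println(byTopic.get("foo"));  // [0, 1]
    }
}
```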

jolshan (Member) left a comment:
Thanks David. Failures look unrelated.

jeffkbkim (Contributor) left a comment:
LGTM, left a small comment.


dajac (Member, Author) commented Jan 13, 2023

Failed tests are not related.

dajac merged commit a2926ed into apache:trunk on Jan 13, 2023
dajac deleted the KAFKA-14367-txn-commit-offset branch on January 13, 2023 08:55
ijuma added a commit to confluentinc/kafka that referenced this pull request Jan 17, 2023
…master

* apache-github/trunk: (23 commits)
  MINOR: Include the inner exception stack trace when re-throwing an exception (apache#12229)
  MINOR: Fix docs to state that sendfile implemented in `TransferableRecords` instead of `MessageSet` (apache#13109)
  Update ProducerConfig.java (apache#13115)
  KAFKA-14618; Fix off by one error in snapshot id (apache#13108)
  KAFKA-13709 (follow-up): Avoid mention of 'exactly-once delivery' or 'delivery guarantees' in Connect (apache#13106)
  KAFKA-14367; Add `TxnOffsetCommit` to the new `GroupCoordinator` interface (apache#12901)
  KAFKA-14568: Move FetchDataInfo and related to storage module (apache#13085)
  KAFKA-14612: Make sure to write a new topics ConfigRecords to metadata log iff the topic is created (apache#13104)
  KAFKA-14601: Improve exception handling in KafkaEventQueue apache#13089
  KAFKA-14367; Add `OffsetCommit` to the new `GroupCoordinator` interface (apache#12886)
  KAFKA-14530: Check state updater more often (apache#13017)
  KAFKA-14304 Use boolean for ZK migrating brokers in RPC/record (apache#13103)
  KAFKA-14003 Kafka Streams JUnit4 to JUnit5 migration part 2 (apache#12301)
  KAFKA-14607: Move Scheduler/KafkaScheduler to server-common (apache#13092)
  KAFKA-14367; Add `OffsetFetch` to the new `GroupCoordinator` interface (apache#12870)
  KAFKA-14557; Lock metadata log dir (apache#13058)
  MINOR: Implement toString method for TopicAssignment and PartitionAssignment (apache#13101)
  KAFKA-12558: Do not prematurely mutate internal partition state in Mirror Maker 2 (apache#11818)
  KAFKA-14540: Fix DataOutputStreamWritable#writeByteBuffer (apache#13032)
  KAFKA-14600: Reduce flakiness in ProducerIdExpirationTest (apache#13087)
  ...
guozhangwang pushed a commit to guozhangwang/kafka that referenced this pull request Jan 25, 2023
…rface (apache#12901)

This patch adds `TxnOffsetCommit` to the new `GroupCoordinator` interface and updates `KafkaApis` to use it.

Reviewers: Jeff Kim <jeff.kim@confluent.io>, Justine Olshan <jolshan@confluent.io>
Labels: KIP-848 The Next Generation of the Consumer Rebalance Protocol

4 participants