KAFKA-7737; Use single path in producer for initializing the producerId by hachikuji · Pull Request #7920 · apache/kafka

hachikuji · 2020-01-09T17:31:35Z

Previously the idempotent producer and transactional producer used separate logic when initializing the producerId. This patch consolidates the two paths. We also do some cleanup in TransactionManagerTest to eliminate brittle expectations on Sender.

Committer Checklist (excluded from commit message)

Verify design and implementation
Verify test coverage and CI build status
Verify documentation (including upgrade notes)

hachikuji · 2020-01-15T01:23:00Z

retest this please

viktorsomogyi

It looks good as far as I can tell. I saw that you had 2 test failures, but most likely those were flaky tests. I reran the client and core tests for this patch on my laptop and they didn't fail, but let's rerun them in the hope of a green build.

viktorsomogyi · 2020-01-15T12:21:22Z

Just sanity checking: do I understand correctly that this became part of resetProducerIdIfNeeded, so the first time this code path runs, the producer id gets placed in the transactional request queue? Is this also the reason for removing `maybeWaitForProducerId`?

Yes, that's right.

viktorsomogyi · 2020-01-15T16:26:29Z

Would you consider using a long timeout instead and move this to TestUtils? Perhaps we need this somewhere else too.

It's a bit difficult to restructure these test cases with a timeout. Most of the logic is basically "run the sender until X happens, then verify Y." With a timeout, we cannot control the stopping condition.
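The "run the sender until X happens, then verify Y" pattern from the reply above can be sketched roughly as follows. This is only an illustration, not the helper from the patch: `FakeSender`, `SenderTestSupport`, `runUntil`, and the iteration cap are all hypothetical names and values chosen for the example.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the real Sender: each runOnce() call is one loop iteration.
class FakeSender {
    private int iterations = 0;

    void runOnce() {
        iterations++;
    }

    int iterations() {
        return iterations;
    }
}

// Hypothetical test helper, not part of Kafka's TestUtils.
class SenderTestSupport {
    // Illustrative cap so a condition that never holds fails the test instead of spinning forever.
    static final int MAX_ITERATIONS = 100;

    // Drives the sender one iteration at a time and stops as soon as the condition holds.
    static void runUntil(FakeSender sender, BooleanSupplier condition) {
        for (int i = 0; i < MAX_ITERATIONS; i++) {
            if (condition.getAsBoolean()) {
                return;
            }
            sender.runOnce();
        }
        if (!condition.getAsBoolean()) {
            throw new AssertionError("Condition not met after " + MAX_ITERATIONS + " iterations");
        }
    }
}
```

Because the stopping point is the condition itself rather than elapsed wall-clock time, the test decides exactly where the sender stops before it verifies Y.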

viktorsomogyi · 2020-01-15T16:30:43Z

Ah you reran them in the meanwhile :)
Now only one failed:
kafka.api.ConsumerBounceTest.testRollingBrokerRestartsWithSmallerMaxGroupSizeConfigDisruptsBigGroup for which we have a submitted issue (KAFKA-7965)

guozhangwang

Just had one qq about the PR. I will review the rest of the PR once that's cleared; maybe I missed something big here, so I need to clarify.

guozhangwang · 2020-01-15T21:16:09Z

The only other caller of InitProducerIdRequest.Builder is in TransactionManager#initializeTransactions, which is only called in producer#initTransactions. For the idempotent producer, how would that initPid request be sent? Did I miss anything?

We also build an InitProducerIdRequest inside TransactionManager.resetProducerIdIfNeeded in this patch.

Ack, thanks!

guozhangwang · 2020-01-15T22:00:53Z

Ack, thanks!

guozhangwang · 2020-01-15T22:17:24Z

qq: is the second condition necessary given the first one? If we are not transactional, then right now the only place to transition to INITIALIZING is line 496 below. I'm actually fine with leaving it here to be less bug-prone, but just want to clarify my understanding.

I added it to ensure that we don't enqueue multiple InitProducerId requests.

guozhangwang · 2020-01-15T22:22:43Z

qq: could you add a comment explaining why we need to call client.poll before exiting under this condition?

I don't have a great explanation for it, to be honest. I guess we are trying to ensure that any inflight requests return before we shut down. Perhaps we should just crash the thread and let the client get cleaned up? How about I open a separate issue for this?

guozhangwang · 2020-01-15T22:25:58Z

We could let this return a boolean indicating whether a request was enqueued, and then in Sender we could do:

if (transactionManager.resetProducerIdIfNeeded() && maybeSendAndPollTransactionalRequest()) ..

Also nit: add a comment on top indicating that we will send out a new initPid request if the producer is not transactional.

I think we want to call maybeSendAndPollTransactionalRequest in this loop regardless of whether resetProducerIdIfNeeded returns true. Ack on the comment.
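A rough sketch of the loop shape described in the reply above, where the transactional request queue is polled on every iteration whether or not a reset was needed. The classes `SketchTransactionManager` and `SketchSender`, and the use of plain strings as requests, are assumptions made for illustration; only the two method names come from the discussion.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical, heavily simplified stand-in for TransactionManager.
class SketchTransactionManager {
    private final Queue<String> pendingRequests = new ArrayDeque<>();
    private boolean resetNeeded = false;

    void markResetNeeded() {
        resetNeeded = true;
    }

    void enqueue(String requestName) {
        pendingRequests.add(requestName);
    }

    // Enqueues an InitProducerId request only when a reset is actually pending.
    void resetProducerIdIfNeeded() {
        if (resetNeeded) {
            resetNeeded = false;
            pendingRequests.add("InitProducerId");
        }
    }

    // Returns the next queued request name, or null when the queue is empty.
    String nextRequest() {
        return pendingRequests.poll();
    }
}

// Hypothetical, heavily simplified stand-in for Sender.
class SketchSender {
    private final SketchTransactionManager transactionManager;
    private final List<String> sent = new ArrayList<>();

    SketchSender(SketchTransactionManager transactionManager) {
        this.transactionManager = transactionManager;
    }

    // One loop iteration: the queue is polled even if no reset was needed,
    // so requests enqueued elsewhere are still sent.
    void runOnce() {
        transactionManager.resetProducerIdIfNeeded();
        maybeSendAndPollTransactionalRequest();
    }

    private boolean maybeSendAndPollTransactionalRequest() {
        String next = transactionManager.nextRequest();
        if (next == null) {
            return false;
        }
        sent.add(next);
        return true;
    }

    List<String> sent() {
        return sent;
    }
}
```

Chaining the two calls with `&&` would skip the poll whenever no reset is pending, which is the case this shape avoids.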

guozhangwang · 2020-01-15T22:29:34Z

qq: is that a piggy-backed fix, or is it necessary for the refactoring?

More of an optimization than a fix I guess, but it simplified one of the tests in TransactionManagerTest.

guozhangwang · 2020-01-15T22:30:59Z

How about letting the state transition to UNINITIALIZED inside resetProducerId (since it will only execute successfully if not transactional)? By doing this we would still have a single transition path to INITIALIZING.

Yeah, it's a good thought. I don't recall if I previously considered this, but let me take a look and see if it works.
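The idea in this thread can be sketched as a small state machine: a reset moves the state back to UNINITIALIZED, and INITIALIZING is entered from one place only, which also guards against enqueuing more than one InitProducerId request. The class below is a hypothetical simplification, not the actual TransactionManager; the method names `initializeIfNeeded` and `onInitProducerIdResponse`, and the reduced set of states, are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simplification of the producerId state handling.
class ProducerIdStateSketch {
    enum State { UNINITIALIZED, INITIALIZING, READY }

    private final boolean transactional;
    private final List<String> enqueued = new ArrayList<>();
    private State state = State.UNINITIALIZED;

    ProducerIdStateSketch(boolean transactional) {
        this.transactional = transactional;
    }

    State state() {
        return state;
    }

    List<String> enqueued() {
        return enqueued;
    }

    // A reset only succeeds for a non-transactional (idempotent) producer.
    void resetProducerId() {
        if (transactional) {
            throw new IllegalStateException("Cannot reset producerId for a transactional producer");
        }
        state = State.UNINITIALIZED;
    }

    // The single path into INITIALIZING. The state check means a second call
    // while a request is pending does not enqueue another InitProducerId request.
    boolean initializeIfNeeded() {
        if (state != State.UNINITIALIZED) {
            return false;
        }
        state = State.INITIALIZING;
        enqueued.add("InitProducerId");
        return true;
    }

    void onInitProducerIdResponse() {
        if (state != State.INITIALIZING) {
            throw new IllegalStateException("Unexpected InitProducerId response in state " + state);
        }
        state = State.READY;
    }
}
```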

guozhangwang · 2020-01-15T22:59:49Z

qq: Why is this necessary?

Note that we black out the node for 10ms above. Previously we were relying on the backoff logic in maybeWaitForProducerId for the node to be ready again. Now the test needs time to be advanced externally, since we cannot rely on client.poll advancing it.
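To illustrate why the clock has to be advanced by the test itself, here is a minimal sketch with a manually driven clock. `ManualTime` and `BlackoutNode` are hypothetical stand-ins written for this example and are not Kafka's mock classes; the 10ms value mirrors the blackout mentioned above.

```java
// Hypothetical manual clock: time only moves when the test calls sleep().
class ManualTime {
    private long nowMs = 0L;

    long milliseconds() {
        return nowMs;
    }

    void sleep(long durationMs) {
        nowMs += durationMs;
    }
}

// Hypothetical node that refuses connections until its blackout period has elapsed.
class BlackoutNode {
    private final ManualTime time;
    private long readyAtMs = 0L;

    BlackoutNode(ManualTime time) {
        this.time = time;
    }

    void blackout(long durationMs) {
        readyAtMs = time.milliseconds() + durationMs;
    }

    boolean isReady() {
        return time.milliseconds() >= readyAtMs;
    }
}
```

With this setup, a test that blacks out the node for 10ms and then polls without calling `sleep(10)` would see the node stay unavailable indefinitely, because nothing else moves the clock.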

hachikuji · 2020-01-16T02:04:49Z

Reducing visibility here to make sure that test cases are forced through the proper state transitions.

hachikuji · 2020-01-18T02:26:28Z

retest this please

hachikuji · 2020-01-18T18:44:53Z

retest this please

guozhangwang

LGTM!

bob-barrett

LGTM

bob-barrett · 2020-01-22T10:11:32Z

Just FYI, for KIP-360 I'm doing this check for both idempotent and transactional, since it triggers an epoch bump instead of a producerId reset. I'll just pull this call out to a shared code path, the rest of this method shouldn't need to change.

Sounds good. We can rename the method as well.

hachikuji · 2020-01-23T05:12:32Z

retest this please

Conflicts and/or compiler errors due to the fact that we temporarily reverted the commit that removes Scala 2.11 support:
* Exit.scala: replace SAMs with anonymous inner classes.
* MiniKdc.scala: take upstream changes.

# By A. Sophie Blee-Goldman (1) and others
# Via Jason Gustafson
* apache-github/trunk:
  KAFKA-9254; Overridden topic configs are reset after dynamic default change (apache#7870)
  MINOR: MiniKdc JVM shutdown hook fix (apache#7946)
  KAFKA-9152; Improve Sensor Retrieval (apache#7928)
  Correct exception message in DistributedHerder (apache#7995)
  KAFKA-7317: Use collections subscription for main consumer to reduce metadata (apache#7969)
  KAFKA-9181; Maintain clean separation between local and group subscriptions in consumer's SubscriptionState (apache#7941)
  KAFKA-7737; Use single path in producer for initializing the producerId (apache#7920)

# Conflicts:
# core/src/test/scala/kafka/security/minikdc/MiniKdc.scala

hachikuji force-pushed the KAFKA-7737 branch from 3715f77 to 618769c on January 9, 2020 21:16

viktorsomogyi approved these changes Jan 15, 2020

guozhangwang reviewed Jan 15, 2020

hachikuji force-pushed the KAFKA-7737 branch from 618769c to 7b7cb39 on January 16, 2020 02:03

hachikuji commented Jan 16, 2020

guozhangwang approved these changes Jan 19, 2020

bob-barrett approved these changes Jan 22, 2020

hachikuji added 3 commits January 22, 2020 16:14

KAFKA-7737; Use single path in producer for initializing the producerId

7241171

A second call to initTransactions should fail

94df312

ProducerId reset transitions to UNINITIALIZED

3e45a5a

hachikuji force-pushed the KAFKA-7737 branch from 7b7cb39 to 3e45a5a on January 23, 2020 00:20

hachikuji merged commit df13fc9 into apache:trunk Jan 23, 2020
