KAFKA-7737; Use single path in producer for initializing the producerId #7920
hachikuji merged 3 commits into apache:trunk
retest this please
viktorsomogyi left a comment
It looks good as far as I can tell. I saw that you had 2 test failures, but most likely those were flaky tests. I reran the client and core tests for this patch on my laptop and they didn't fail, but let's rerun them in the hope of a green build.
Just sanity checking: do I understand correctly that this became part of resetProducerIdIfNeeded, so that the first time this code path runs, the request for the producer id gets placed in the transactional request queue? Is this also the reason for removing `maybeWaitForProducerId`?
Yes, that's right.
Would you consider using a long timeout instead and moving this to TestUtils? Perhaps we need this somewhere else too.
It's a bit difficult to restructure these test cases with a timeout. Most of the logic is basically "run the sender until X happens, then verify Y." With a timeout, we cannot control the stopping condition.
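A minimal sketch of the "run the sender until X happens, then verify Y" pattern described above. `RunUntil`, `runUntil`, and `maxIterations` are hypothetical names chosen for illustration, not the helper used in this patch. The sketch assumes a bounded iteration count rather than a wall-clock timeout, so the loop stops at the exact iteration where the condition first holds.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical test helper; a stand-in, not Kafka's actual test code. */
final class RunUntil {
    /**
     * Runs one iteration of {@code step} at a time until {@code condition} holds.
     * Returns the number of iterations that were executed.
     */
    static int runUntil(Runnable step, BooleanSupplier condition, int maxIterations) {
        for (int i = 0; i < maxIterations; i++) {
            if (condition.getAsBoolean()) {
                return i; // stop immediately, before running another iteration
            }
            step.run();
        }
        if (condition.getAsBoolean()) {
            return maxIterations;
        }
        throw new AssertionError("Condition not met after " + maxIterations + " iterations");
    }
}
```

Because the caller chooses the stopping condition, the state that the test verifies afterwards is the state at the moment the condition became true, which a timeout-based wait cannot guarantee.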
Ah, you reran them in the meantime :)
guozhangwang left a comment
Just had one qq about the PR; I will review the rest of the PR once that's cleared up. Maybe I missed something big here, so I need to clarify.
The only other caller of InitProducerIdRequest.Builder is in TransactionManager#initializeTransactions, which is only called in producer#initTransactions. For the idempotent producer, how would that initPid request be sent? Did I miss anything?
We also build an InitProducerIdRequest inside TransactionManager.resetProducerIdIfNeeded in this patch.
qq: is the second condition necessary given the first one? If we are not transactional, then right now the only place to transition to INITIALIZING is line 496 below. I'm actually fine with leaving it here to be less bug-prone, but I just want to clarify my understanding.
I added it to ensure that we don't enqueue multiple InitProducerId requests.
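To illustrate the point about not enqueuing multiple InitProducerId requests, here is a minimal sketch. Everything in it is an assumption made for illustration: `InitGuardSketch`, its fields, and the exact form of the condition are hypothetical and are not copied from TransactionManager.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the producer-side state; not Kafka's TransactionManager. */
final class InitGuardSketch {
    enum State { UNINITIALIZED, INITIALIZING, READY }

    private final boolean transactional;
    private State state = State.UNINITIALIZED;
    private boolean hasProducerId = false;
    final List<String> pendingRequests = new ArrayList<>();

    InitGuardSketch(boolean transactional) {
        this.transactional = transactional;
    }

    /** Called on every pass of the (assumed) sender loop. */
    void resetProducerIdIfNeeded() {
        // Without the state check, a second pass before the response arrives
        // would still see hasProducerId == false and enqueue a duplicate request.
        if (!transactional && !hasProducerId && state != State.INITIALIZING) {
            state = State.INITIALIZING;
            pendingRequests.add("InitProducerId");
        }
    }

    /** Simulates a successful response to the enqueued request. */
    void onInitProducerIdResponse() {
        hasProducerId = true;
        state = State.READY;
    }
}
```

Calling `resetProducerIdIfNeeded()` repeatedly on a non-transactional instance leaves exactly one entry in `pendingRequests` until the response is handled.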
qq: could you add a comment explaining why we need to call client.poll before exiting under this condition?
I don't have a great explanation for it, to be honest. I guess we are trying to ensure that any inflight requests return before we shut down. Perhaps we should just crash the thread and let the client get cleaned up? How about I open a separate issue for this?
We could have this return a boolean indicating whether a request was enqueued, and then in Sender we could do:
if (transactionManager.resetProducerIdIfNeeded() && maybeSendAndPollTransactionalRequest()) ..
Also nit: add a comment on top indicating that we will send out a new initPid request if the producer is not transactional.
I think we want to call maybeSendAndPollTransactionalRequest in this loop regardless of whether resetProducerIdIfNeeded returns true. Ack on the comment.
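The difference between the two call shapes can be sketched as follows. This is only an illustration under stated assumptions: `SenderLoopSketch` and its counters are hypothetical, and the sketch assumes that requests may already be pending from an earlier pass, which is one possible reason to call the send step unconditionally.

```java
/** Hypothetical model of one sender-loop pass; not Kafka's Sender. */
final class SenderLoopSketch {
    int pending; // requests waiting to be sent (assumed to include ones queued earlier)
    int sent;    // requests handed to the network so far

    /** Enqueues a request when initialization is needed; reports whether it did. */
    boolean resetProducerIdIfNeeded(boolean needsInit) {
        if (needsInit) {
            pending++;
            return true;
        }
        return false;
    }

    /** Sends at most one pending request; reports whether anything was sent. */
    boolean maybeSendAndPollTransactionalRequest() {
        if (pending == 0) {
            return false;
        }
        pending--;
        sent++;
        return true;
    }

    /** Short-circuit form: the send step is skipped whenever nothing new was enqueued. */
    boolean runOnceShortCircuit(boolean needsInit) {
        return resetProducerIdIfNeeded(needsInit) && maybeSendAndPollTransactionalRequest();
    }

    /** Unconditional form: the send step runs on every pass. */
    boolean runOnceUnconditional(boolean needsInit) {
        resetProducerIdIfNeeded(needsInit);
        return maybeSendAndPollTransactionalRequest();
    }
}
```

With one request already pending and no new initialization needed, the short-circuit form sends nothing on that pass, while the unconditional form sends the pending request.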
qq: is that a piggy-backed fix, or is it necessary for the refactoring?
More of an optimization than a fix I guess, but it simplified one of the tests in TransactionManagerTest.
How about letting the state transition to UNINITIALIZED inside resetProducerId (since it will only execute successfully if not transactional)? That way we would still have a single transition path to INITIALIZING.
Yeah, it's a good thought. I don't recall if I previously considered this, but let me take a look and see if it works.
qq: Why is this necessary?
Note that we black out the node for 10ms above. Previously we were relying on the backoff logic in maybeWaitForProducerId for the node to be ready again. Now the test needs time to be advanced externally since we cannot rely on client.poll advancing it.
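A small sketch of why the clock has to be advanced by the test itself. `BlackoutSketch`, `MockClock`, and `MockNode` are hypothetical stand-ins written for this illustration; they are not Kafka's mock classes, and the 10ms figure is taken from the comment above.

```java
/** Hypothetical model of a blacked-out node and a manually advanced clock. */
final class BlackoutSketch {
    /** A clock that only moves when the test tells it to. */
    static final class MockClock {
        private long nowMs = 0;

        long milliseconds() {
            return nowMs;
        }

        void sleep(long ms) {
            nowMs += ms; // no real waiting; just advance the simulated time
        }
    }

    /** A node that refuses requests until its blackout period has elapsed. */
    static final class MockNode {
        private long blackedOutUntilMs = 0;

        void blackout(MockClock clock, long durationMs) {
            blackedOutUntilMs = clock.milliseconds() + durationMs;
        }

        boolean isReady(MockClock clock) {
            return clock.milliseconds() >= blackedOutUntilMs;
        }
    }
}
```

In this model, nothing other than an explicit `sleep` call moves time forward, so a test that blacks out a node for 10ms has to call `clock.sleep(10)` before the node reports ready again.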
Reducing visibility here to make sure that test cases are forced through the proper state transitions.
retest this please
Just FYI, for KIP-360 I'm doing this check for both idempotent and transactional, since it triggers an epoch bump instead of a producerId reset. I'll just pull this call out to a shared code path, the rest of this method shouldn't need to change.
Sounds good. We can rename the method as well.
retest this please
Conflicts and/or compiler errors due to the fact that we temporarily reverted the commit that removes Scala 2.11 support:
* Exit.scala: replace SAMs with anonymous inner classes.
* MiniKdc.scala: take upstream changes.
# By A. Sophie Blee-Goldman (1) and others
# Via Jason Gustafson
* apache-github/trunk:
  KAFKA-9254; Overridden topic configs are reset after dynamic default change (apache#7870)
  MINOR: MiniKdc JVM shutdown hook fix (apache#7946)
  KAFKA-9152; Improve Sensor Retrieval (apache#7928)
  Correct exception message in DistributedHerder (apache#7995)
  KAFKA-7317: Use collections subscription for main consumer to reduce metadata (apache#7969)
  KAFKA-9181; Maintain clean separation between local and group subscriptions in consumer's SubscriptionState (apache#7941)
  KAFKA-7737; Use single path in producer for initializing the producerId (apache#7920)
# Conflicts:
# core/src/test/scala/kafka/security/minikdc/MiniKdc.scala
Previously the idempotent producer and transactional producer used separate logic when initializing the producerId. This patch consolidates the two paths. We also do some cleanup in TransactionManagerTest to eliminate brittle expectations on Sender.