KAFKA-9832: fix attempt to commit non-running tasks#8580
KAFKA-9832: fix attempt to commit non-running tasks#8580vvcephei merged 2 commits intoapache:trunkfrom vvcephei:kafka-9832-fix-corrupted-commit
Conversation
|
Hey @mjsax , do you have time for a quick review? This bug seems to have been introduced by https://github.com/apache/kafka/pull/8440/files#r407722022 , which attempts to commit all non-corrupted tasks. Some of these tasks may not be running. The Task implementations will throw an exception if we attempt to We could make the task more permissive, so that it would ignore the commit to a task that is not in a committable state. I opted instead to filter out only the tasks in committable states, though. I was concerned that if we make prepareCommit more permissive, we might just complicate the rest of the commit lifecycle, because then the rest of it would also have to be permissive, etc. Thanks for the very nice test in your prior PR; it was easy to extend it to cover this case and also to add the regression test. WDYT? |
mjsax
left a comment
There was a problem hiding this comment.
LGTM. I agree that filtering non-running task is the better approach.
|
The failure was the known-problematic GlobalKTableEOSIntegrationTest |
|
Thanks for the review @mjsax ! |
…/master` * apache-github/trunk: (45 commits) MINOR: Fix broken JMX link in docs by adding missing starting double quote (apache#8587) KAFKA-9652: Fix throttle metric in RequestChannel and request log due to KIP-219 (apache#8567) KAFKA-9922: Update demo instructions in examples README (apache#8559) KAFKA-9830: Implement AutoCloseable in ErrorReporter and subclasses (apache#8442) KAFKA-9875: Make integration tests more resilient (apache#8578) KAFKA-9932: Don't load configs from ZK when the log has already been loaded (apache#8582) KAFKA-9925: decorate pseudo-topics with app id (apache#8574) KAFKA-9832: fix attempt to commit non-running tasks (apache#8580) KAFKA-9127: don't create StreamThreads for global-only topology (apache#8540) MINOR: add support for kafka 2.4 and 2.5 to downgrade test KAFKA-9176: Retry on getting local stores from KafkaStreams (apache#8568) KAFKA-9823: Follow-up, check state for handling commit error response (apache#8548) KAFKA-6145: KIP-441: Add TaskAssignor class config (apache#8541) MINOR: Fix partition numbering from 0 to P-1 instead of P in docs (apache#8572) KAFKA-9921: disable caching on stores configured to retain duplicates (apache#8564) Minor: remove redundant check in auto preferred leader election (apache#8566) MINOR: Update the link to the Raft paper in docs (apache#8560) MINOR: Fix typos in config properties in MM2 test (apache#8561) MINOR: Improve producer test BufferPoolTest#testCloseNotifyWaiters. (apache#7982) MINOR: document how to escape json parameters to ducktape tests (apache#8546) ...
There was a minor conflict in gradle.properties because the default Scala version changed upstream to Scala 2.13. I kept the upstream change. Related to this, I have updated Jenkinsfile to compile and validate with Scala 2.12 in a separate stage so that we ensure we maintain compatibility. Unlike Apache Kafka, we only run the tests with the default Scala version, which is now 2.13. * apache-github/trunk: (45 commits) MINOR: Fix broken JMX link in docs by adding missing starting double quote (apache#8587) KAFKA-9652: Fix throttle metric in RequestChannel and request log due to KIP-219 (apache#8567) KAFKA-9922: Update demo instructions in examples README (apache#8559) KAFKA-9830: Implement AutoCloseable in ErrorReporter and subclasses (apache#8442) KAFKA-9875: Make integration tests more resilient (apache#8578) KAFKA-9932: Don't load configs from ZK when the log has already been loaded (apache#8582) KAFKA-9925: decorate pseudo-topics with app id (apache#8574) KAFKA-9832: fix attempt to commit non-running tasks (apache#8580) KAFKA-9127: don't create StreamThreads for global-only topology (apache#8540) MINOR: add support for kafka 2.4 and 2.5 to downgrade test KAFKA-9176: Retry on getting local stores from KafkaStreams (apache#8568) KAFKA-9823: Follow-up, check state for handling commit error response (apache#8548) KAFKA-6145: KIP-441: Add TaskAssignor class config (apache#8541) MINOR: Fix partition numbering from 0 to P-1 instead of P in docs (apache#8572) KAFKA-9921: disable caching on stores configured to retain duplicates (apache#8564) Minor: remove redundant check in auto preferred leader election (apache#8566) MINOR: Update the link to the Raft paper in docs (apache#8560) MINOR: Fix typos in config properties in MM2 test (apache#8561) MINOR: Improve producer test BufferPoolTest#testCloseNotifyWaiters. (apache#7982) MINOR: document how to escape json parameters to ducktape tests (apache#8546) ...
Fixes an attempt to commit potentially non-running tasks while recovering from task corruption.
Committer Checklist (excluded from commit message)