KAFKA-17507; WriteTxnMarkers API must not return until markers are written and materialized in group coordinator's cache by dajac · Pull Request #18168 · apache/kafka

dajac · 2024-12-13T13:25:55Z

We have observed the following error in some clusters:

Uncaught exception in scheduled task 'handleTxnCompletion-902667' exception.message:Trying to complete a transactional offset commit for producerId *** and groupId *** even though the offset commit record itself hasn't been appended to the log.

When a transaction is completed, the transaction coordinator sends a WriteTxnMarkers request to all the partitions involved in the transaction to write the markers to them. When the broker receives it, it writes the markers and, if markers are written to the __consumer_offsets partitions, it informs the group coordinator that it can materialize the pending transactional offsets in its main cache. The group coordinator has done this asynchronously since Apache Kafka 2.0 (see this patch).
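The flow described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: `OffsetCacheSketch` and all of its methods are hypothetical names chosen for illustration, not Kafka's actual classes, and the real coordinator tracks far more state.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the group coordinator's offset caches (not Kafka code).
final class OffsetCacheSketch {
    // Offsets visible to consumers, keyed by "topic-partition".
    private final Map<String, Long> mainCache = new HashMap<>();
    // Offsets committed inside a still-open transaction, keyed by producerId.
    private final Map<Long, Map<String, Long>> pending = new HashMap<>();

    // A transactional offset commit is first recorded as pending.
    void addPendingOffset(long producerId, String topicPartition, long offset) {
        pending.computeIfAbsent(producerId, id -> new HashMap<>())
               .put(topicPartition, offset);
    }

    // Invoked after the marker has been written to __consumer_offsets:
    // a commit marker materializes the pending offsets, an abort marker drops them.
    void completeTransaction(long producerId, boolean commit) {
        Map<String, Long> offsets = pending.remove(producerId);
        if (commit && offsets != null) {
            mainCache.putAll(offsets);
        }
    }

    // Returns null when no committed offset exists for the partition.
    Long committedOffset(String topicPartition) {
        return mainCache.get(topicPartition);
    }

    public static void main(String[] args) {
        OffsetCacheSketch cache = new OffsetCacheSketch();
        cache.addPendingOffset(7L, "topic-0", 42L);
        System.out.println(cache.committedOffset("topic-0")); // prints "null"
        cache.completeTransaction(7L, true);
        System.out.println(cache.committedOffset("topic-0")); // prints "42"
    }
}
```

In this model the pending offset only becomes visible once `completeTransaction` runs, which is the step the group coordinator performs asynchronously.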

The above error happens when the asynchronous operation is executed by the scheduler and finds pending transactional offsets that have not been written yet. How come?

There is actually an issue in the steps described above. The group coordinator does not wait for the asynchronous operation to complete before returning to the API layer. Hence the WriteTxnMarkers response may be sent back to the transaction coordinator before the async operation has actually completed, and the next transactional produce may also start before the operation has completed. This could explain why the group coordinator has pending transactional offsets that are not written yet.
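The ordering problem can be reproduced deterministically with a hand-driven task queue. The following is a minimal sketch, not the code changed in this PR: `TxnMarkerRaceSketch`, its methods, and the event strings are all hypothetical, and a plain queue stands in for the coordinator's scheduler. It contrasts returning right after scheduling with returning a `CompletableFuture` that the API layer waits on before responding.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.CompletableFuture;

// Hypothetical illustration of the response/materialization ordering (not Kafka code).
final class TxnMarkerRaceSketch {
    // Stand-in for the scheduler: tasks run only when explicitly drained.
    final Queue<Runnable> scheduler = new ArrayDeque<>();
    final List<String> events = new ArrayList<>();

    // Fire-and-forget shape: schedule the materialization and return immediately.
    void scheduleCompletion() {
        scheduler.add(() -> events.add("materialized"));
    }

    // Waiting shape: return a future that completes once the task has run.
    CompletableFuture<Void> scheduleCompletionWithFuture() {
        CompletableFuture<Void> done = new CompletableFuture<>();
        scheduler.add(() -> {
            events.add("materialized");
            done.complete(null);
        });
        return done;
    }

    void runScheduledTasks() {
        Runnable task;
        while ((task = scheduler.poll()) != null) {
            task.run();
        }
    }

    static List<String> withoutWaiting() {
        TxnMarkerRaceSketch c = new TxnMarkerRaceSketch();
        c.scheduleCompletion();
        c.events.add("response sent"); // API layer responds before the task runs
        c.runScheduledTasks();
        return c.events;
    }

    static List<String> withWaiting() {
        TxnMarkerRaceSketch c = new TxnMarkerRaceSketch();
        // The response is sent only as a continuation of the completed task.
        c.scheduleCompletionWithFuture().thenRun(() -> c.events.add("response sent"));
        c.runScheduledTasks();
        return c.events;
    }

    public static void main(String[] args) {
        System.out.println(withoutWaiting()); // prints "[response sent, materialized]"
        System.out.println(withWaiting());    // prints "[materialized, response sent]"
    }
}
```

In the first ordering, anything triggered by the response (such as the next transaction) can observe a cache that has not been updated yet; in the second, the response cannot overtake the materialization.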

There is a similar issue when the transaction is aborted. However, on this path we have no check that verifies whether all the pending transactional offsets have been written, so we don't see any errors in our logs. Due to the same race condition, it is possible to remove the wrong pending transactional offsets.
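One plausible interleaving for the abort path is sketched below. It is an assumption-laden illustration rather than a trace from a real broker: `AbortRaceSketch` and its methods are hypothetical, pending offsets are keyed only by producer id, and a hand-driven queue again stands in for the scheduler. The deferred cleanup for an aborted transaction runs after the same producer has already added offsets for its next transaction, so it silently drops them.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical illustration of a late abort cleanup removing newer offsets (not Kafka code).
final class AbortRaceSketch {
    final Map<Long, Map<String, Long>> pending = new HashMap<>();
    final Map<String, Long> mainCache = new HashMap<>();
    final Queue<Runnable> scheduler = new ArrayDeque<>();

    void addPendingOffset(long producerId, String topicPartition, long offset) {
        pending.computeIfAbsent(producerId, id -> new HashMap<>())
               .put(topicPartition, offset);
    }

    // Deferred completion: acts on whatever is pending when the task finally runs.
    void scheduleCompletion(long producerId, boolean commit) {
        scheduler.add(() -> {
            Map<String, Long> offsets = pending.remove(producerId);
            if (commit && offsets != null) {
                mainCache.putAll(offsets);
            }
        });
    }

    void runScheduledTasks() {
        Runnable task;
        while ((task = scheduler.poll()) != null) {
            task.run();
        }
    }

    static Long lostOffsetDemo() {
        AbortRaceSketch c = new AbortRaceSketch();
        c.addPendingOffset(7L, "topic-0", 10L); // transaction 1
        c.scheduleCompletion(7L, false);        // abort marker written, cleanup deferred
        // The response has already gone out, so transaction 2 starts before cleanup runs.
        c.addPendingOffset(7L, "topic-0", 20L);
        c.runScheduledTasks();                  // late cleanup removes transaction 2's offset
        c.scheduleCompletion(7L, true);         // transaction 2 commits
        c.runScheduledTasks();
        return c.mainCache.get("topic-0");      // null: the committed offset was lost
    }

    public static void main(String[] args) {
        System.out.println(lostOffsetDemo()); // prints "null"
    }
}
```

No exception is raised anywhere in this sequence, which matches the observation that the abort path leaves no trace in the logs.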

PS: The new group coordinator is not impacted by this bug.

Committer Checklist (excluded from commit message)

Verify design and implementation
Verify test coverage and CI build status
Verify documentation (including upgrade notes)

mumrah · 2024-12-13T14:18:49Z

@dajac #17881 adds a "triage" label to PRs from non-committers. It turns out this also affects committers if their membership visibility in the ASF GitHub org is not public. I added instructions for setting your membership visibility to public: https://github.com/apache/kafka/blob/trunk/.github/workflows/README.md#pr-triage

jolshan

Thanks for the fix!

…itten and materialized in group coordinator's cache (#18168) (#18206)
Reviewers: Justine Olshan <jolshan@confluent.io>

KAFKA-17507; WriteTxnMarkers API must not return until markers are wr…

6bab7bb

…itten and materialized in group coordinator's cache

dajac requested a review from jolshan December 13, 2024 13:25

github-actions bot added the triage (PRs from the community) and core (Kafka Broker) labels Dec 13, 2024

dajac removed the triage (PRs from the community) label Dec 13, 2024

jolshan approved these changes Dec 13, 2024

jolshan merged commit 450c10d into apache:trunk Dec 13, 2024

dajac deleted the KAFKA-17507 branch December 16, 2024 09:21
