KAFKA-19999 Transaction coordinator livelock caused by invalid producer epoch #21176
chia7712 merged 10 commits into apache:trunk
Conversation
Test the following system test IDs:

- kafkatest.tests.core.transactions_upgrade_test.TransactionsUpgradeTest.test_transactions_upgrade.from_kafka_version=3.3.2.metadata_quorum=ISOLATED_KRAFT.group_protocol=None
- kafkatest.tests.core.transactions_upgrade_test.TransactionsUpgradeTest.test_transactions_upgrade.from_kafka_version=4.1.1.metadata_quorum=ISOLATED_KRAFT.group_protocol=None
- kafkatest.tests.core.transactions_upgrade_test.TransactionsUpgradeTest.test_transactions_upgrade.from_kafka_version=3.4.1.metadata_quorum=ISOLATED_KRAFT.group_protocol=None
- kafkatest.tests.core.transactions_upgrade_test.TransactionsUpgradeTest.test_transactions_upgrade.from_kafka_version=3.5.2.metadata_quorum=ISOLATED_KRAFT.group_protocol=None
- kafkatest.tests.core.transactions_upgrade_test.TransactionsUpgradeTest.test_transactions_upgrade.from_kafka_version=3.6.2.metadata_quorum=ISOLATED_KRAFT.group_protocol=None
- kafkatest.tests.core.transactions_upgrade_test.TransactionsUpgradeTest.test_transactions_upgrade.from_kafka_version=3.7.2.metadata_quorum=ISOLATED_KRAFT.group_protocol=None
- kafkatest.tests.core.transactions_upgrade_test.TransactionsUpgradeTest.test_transactions_upgrade.from_kafka_version=3.9.1.metadata_quorum=ISOLATED_KRAFT.group_protocol=None
- kafkatest.tests.core.transactions_upgrade_test.TransactionsUpgradeTest.test_transactions_upgrade.from_kafka_version=3.8.1.metadata_quorum=ISOLATED_KRAFT.group_protocol=None
- kafkatest.tests.core.transactions_upgrade_test.TransactionsUpgradeTest.test_transactions_upgrade.from_kafka_version=4.0.1.metadata_quorum=ISOLATED_KRAFT.group_protocol=None
```scala
Errors.NOT_LEADER_OR_FOLLOWER
case error =>
  error
if (exception.isInstanceOf[IdempotentTransactionMarkerException])
```
Thanks for the review @chia7712, I have addressed all comments.
```java
log.debug("Received duplicate transaction marker for producer {} with epoch {} " +
    "but transaction is no longer ongoing, treating as idempotent success",
    producerId, producerEpoch);
throw new IdempotentTransactionMarkerException();
```
Hmm -- I'm not sure how I feel about throwing this error here. Why did we choose to throw an error?
Ideally we would avoid throwing an error, and avoid control flow that checks for errors, in a case where there is no actual error.
I agree that using exception for control is an anti-pattern, especially for success scenarios.
However, the write path for the offset topic appears to lack built-in idempotency logic, so avoiding the exception approach might require more refactoring. Regardless, I agree it is worth the effort to implement this without relying on exceptions.
If we simply return early from ProducerAppendInfo#checkProducerEpoch, we still need to prevent duplicate data from being written to the log. The validation happens before the actual append operation, so just returning allows the code to continue and write the duplicate marker anyway.
The general produce path can handle duplicate writes more easily through conditional logic after validation. However, the write path for the offset topic appears to lack built-in idempotency logic. Therefore, throwing an exception provides a clean way to immediately break the append flow once we detect an idempotent retry. The exception propagates up through the call stack, preventing the duplicate write regardless of which code path is handling the marker. The caller can then catch this specific exception type and treat it as a successful operation.
While using exceptions for control flow is not ideal, it's a pragmatic solution given the current code structure.
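The throw/catch pattern being debated can be sketched as a hypothetical, self-contained toy (the class and method names below are illustrative only; the real Kafka append path and its callers are far more involved):

```java
// Toy sketch of exception-as-control-flow for an idempotent marker retry.
// All names here are hypothetical stand-ins, not Kafka's actual internals.
public class MarkerControlFlow {
    // Sentinel exception: not a real failure, just a way to unwind the append stack.
    static class IdempotentTransactionMarkerException extends RuntimeException { }

    // Stand-in for validation + append: throws the sentinel when the marker is a
    // retry for a transaction that already completed, so no duplicate is written.
    static String appendMarker(boolean alreadyCompleted) {
        if (alreadyCompleted) {
            throw new IdempotentTransactionMarkerException();
        }
        return "APPENDED";
    }

    // Stand-in for the caller, which maps the sentinel back to a success
    // response instead of surfacing an error to the client.
    static String handleMarker(boolean alreadyCompleted) {
        try {
            return appendMarker(alreadyCompleted);
        } catch (IdempotentTransactionMarkerException e) {
            return "NONE"; // treat the retry as an idempotent success
        }
    }

    public static void main(String[] args) {
        System.out.println(handleMarker(false)); // APPENDED
        System.out.println(handleMarker(true));  // NONE
    }
}
```

The trade-off discussed above is visible here: the sentinel cleanly breaks the append flow from deep in the stack, but every caller must know to catch it and translate it into success.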
To avoid using exceptions for control flow, an alternative approach is to simply suppress the InvalidProducerEpochException when the "safe" conditions are met (i.e., the transaction is completed, the protocol is v2, and the producer epochs are identical).
While this results in appending a duplicate marker to the log, it should be acceptable, as this scenario is rare and the record size is negligible. Furthermore, both the broker and clients can handle such duplicate markers gracefully.
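A minimal sketch of that suppression check, assuming the three conditions map to a transaction version, the marker/current epochs, and an empty `currentTxnFirstOffset` (the names and signature here are illustrative, not Kafka's actual code):

```java
import java.util.OptionalLong;

// Hypothetical standalone sketch of the "safe to treat as idempotent retry" check;
// the real logic lives in ProducerStateManager/ProducerAppendInfo and differs in detail.
public class IdempotentMarkerCheck {
    /**
     * Returns true when a same-epoch end-transaction marker can be treated as a
     * successful idempotent retry instead of raising InvalidProducerEpochException.
     */
    static boolean isIdempotentMarkerRetry(short transactionVersion,
                                           short markerEpoch,
                                           short currentEpoch,
                                           OptionalLong currentTxnFirstOffset) {
        return transactionVersion >= 2            // transaction protocol v2 or later
            && markerEpoch == currentEpoch        // same epoch: a retry, not a stale producer
            && currentTxnFirstOffset.isEmpty();   // transaction already completed
    }

    public static void main(String[] args) {
        // Retried COMMIT marker after the transaction completed: idempotent success.
        System.out.println(isIdempotentMarkerRetry((short) 2, (short) 5, (short) 5, OptionalLong.empty()));
        // Transaction still ongoing: keep rejecting with the epoch error.
        System.out.println(isIdempotentMarkerRetry((short) 2, (short) 5, (short) 5, OptionalLong.of(42L)));
    }
}
```

When all three conditions hold, the caller can skip the exception entirely and let the (harmless) duplicate marker be appended.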
I think it is ok to write a duplicate marker as long as we have the log lock and can't write anything else (i.e., no transaction will start).
> While this results in appending a duplicate marker to the log, it should be acceptable as this scenario is rare and the record size is negligible. Furthermore, both the broker and clients can handle such duplicate markers gracefully
Agreed. Suppressing the exception and allowing the duplicate marker is simpler and more pragmatic. The duplicate marker is harmless and this scenario is rare enough that the tradeoff is worth it for cleaner code.
Awesome, the code is also looking a lot cleaner. 👍
```java
    this(error, baseOffset, logAppendTime, logStartOffset, recordErrors, errorMessage, new ProduceResponseData.LeaderIdAndEpoch());
}

public PartitionResponse(Throwable exception, long baseOffset, long logAppendTime, long logStartOffset, List<RecordError> recordErrors, String errorMessage) {
```
What was the rationale for the changes in this file?
If it is for passing the error back up to KafkaApis, it is also not ideal, as the handling for this exception starts leaking into the client code.
Thanks for the review @chia7712 and @jolshan, I test
```java
assertEquals(OptionalLong.of(99L), stateManager.firstUndecidedOffset());

short markerEpoch = (short) (epoch + 1);
appendEndTxnMarker(stateManager, producerId, markerEpoch, ControlRecordType.COMMIT, 100, transactionVersion);
```
small question -- why do we do this append in two different ways in the test? It seems like the behavior is mostly the same in this helper vs the code at 1106. It's just the code at 1106 doesn't do three of the steps at the end:
```java
private void appendEndTxnMarker(ProducerStateManager stateManager,
                                long producerId,
                                short producerEpoch,
                                ControlRecordType controlType,
                                long offset,
                                int coordinatorEpoch,
                                long timestamp,
                                short transactionVersion) {
    ProducerAppendInfo producerAppendInfo = stateManager.prepareUpdate(producerId, AppendOrigin.COORDINATOR, time.milliseconds());
    EndTransactionMarker endTxnMarker = new EndTransactionMarker(controlType, coordinatorEpoch);
    Optional<CompletedTxn> completedTxn = producerAppendInfo.appendEndTxnMarker(endTxnMarker, producerEpoch, offset, timestamp, transactionVersion);
    stateManager.update(producerAppendInfo);
    completedTxn.ifPresent(stateManager::completeTxn);
    stateManager.updateMapEndOffset(offset + 1);
}
```
Thanks for pointing out! Updated to use the helper method consistently for both appends. This ensures the full cleanup steps are executed and makes the test clearer.
…er epoch (#21176)

In Transaction Version 2, strict epoch validation (`markerEpoch > currentEpoch`) causes hanging transactions in two scenarios:

1. **Coordinator recovery**: When reloading PREPARE_COMMIT/ABORT from the transaction log, retried markers are rejected with `InvalidProducerEpochException` because they use the same epoch
2. **Network retry**: When the marker write succeeds but the response is lost, coordinator retries are rejected for the same reason

Both cases leave transactions permanently hanging in PREPARE state, causing clients to fail with `CONCURRENT_TRANSACTIONS`.

Detect idempotent marker retries in `ProducerStateManager.checkProducerEpoch()` by checking three conditions:

1. Transaction Version ≥ 2
2. markerEpoch == currentEpoch (same epoch)
3. currentTxnFirstOffset is empty (transaction already completed)

When all conditions are met, treat the marker as a successful idempotent retry instead of throwing an error.

Reviewers: Justine Olshan <jolshan@confluent.io>, TaiJuWu <tjwu1217@gmail.com>, Chia-Ping Tsai <chia7712@gmail.com>