[Bug] [txn] TC recover fail in recoverTracker.handleCommittingAndAbortingTransaction() #18923

@TakaHiR07

Description

Search before asking

  • I searched in the issues and found nothing similar.

Version

Pulsar 2.9.4 and master

Minimal reproduce step

  1. Produce messages within a transaction.
  2. Restart the broker. During TC recovery it runs transactionLog.replayAsync().
  3. When TC recovery finishes, a large number of ERROR logs like the following appear:
12:22:06.137 [transaction_coordinator_TransactionCoordinatorID(id=13)thread_factory-89-1] ERROR org.apache.pulsar.broker.TransactionMetadataStoreService - End transaction fail! TxnId : (13,2310145), TxnAction : 0
org.apache.pulsar.transaction.coordinator.exceptions.CoordinatorException$CoordinatorNotFoundException: Transaction coordinator with id 13 not found!
        at org.apache.pulsar.broker.TransactionMetadataStoreService.getTxnMeta(TransactionMetadataStoreService.java:322) ~[org.apache.pulsar-pulsar-broker-2.9.4.jar:2.9.4]
        at org.apache.pulsar.broker.TransactionMetadataStoreService.endTransaction(TransactionMetadataStoreService.java:370) ~[org.apache.pulsar-pulsar-broker-2.9.4.jar:2.9.4]
        at org.apache.pulsar.broker.TransactionMetadataStoreService.endTransaction(TransactionMetadataStoreService.java:349) ~[org.apache.pulsar-pulsar-broker-2.9.4.jar:2.9.4]
        at org.apache.pulsar.broker.transaction.recover.TransactionRecoverTrackerImpl.lambda$handleCommittingAndAbortingTransaction$0(TransactionRecoverTrackerImpl.java:126) ~[org.apache.pulsar-pulsar-broker-2.9.4.jar:2.9.4]
        at java.lang.Iterable.forEach(Iterable.java:75) ~[?:?]
        at org.apache.pulsar.broker.transaction.recover.TransactionRecoverTrackerImpl.handleCommittingAndAbortingTransaction(TransactionRecoverTrackerImpl.java:125) ~[org.apache.pulsar-pulsar-broker-2.9.4.jar:2.9.4]
        at org.apache.pulsar.transaction.coordinator.impl.MLTransactionMetadataStore$1.replayComplete(MLTransactionMetadataStore.java:127) ~[org.apache.pulsar-pulsar-transaction-coordinator-2.9.4.jar:2.9.4]
        at org.apache.pulsar.transaction.coordinator.impl.MLTransactionLogImpl$TransactionLogReplayer.start(MLTransactionLogImpl.java:243) ~[org.apache.pulsar-pulsar-transaction-coordinator-2.9.4.jar:2.9.4]
        at org.apache.pulsar.transaction.coordinator.impl.MLTransactionLogImpl.replayAsync(MLTransactionLogImpl.java:119) ~[org.apache.pulsar-pulsar-transaction-coordinator-2.9.4.jar:2.9.4]
        at org.apache.pulsar.transaction.coordinator.impl.MLTransactionMetadataStore.lambda$init$0(MLTransactionMetadataStore.java:113) ~[org.apache.pulsar-pulsar-transaction-coordinator-2.9.4.jar:2.9.4]

What did you expect to see?

TC recovery should not print a large number of ERROR logs.

What did you see instead?

The ERROR logs were introduced by #13957. recoverTracker.handleCommittingAndAbortingTransaction() needs to get the TC store in order to call endTransaction(), but the store is not put into the stores map until the future completes. As a result, endTransaction() fails with CoordinatorNotFoundException during recovery, and since that is not a RetryableException, handleCommittingAndAbortingTransaction() does not actually finish its work.
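
The ordering problem described in this report (the recover tracker calls endTransaction() before the store has been put into the stores map) can be sketched in isolation as follows. This is a simplified model, not Pulsar code: the class TcRecoverOrderingSketch, the String-valued store, and both method names are hypothetical, and the second method only shows one possible ordering that avoids the failed lookup, not necessarily the fix the project adopted.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified model of the ordering issue; not the real Pulsar classes.
public class TcRecoverOrderingSketch {

    // Stand-in for the "stores" map: TC id -> store (a String here for brevity).
    static final Map<Long, String> stores = new ConcurrentHashMap<>();

    // Stand-in for endTransaction(): it can only succeed if the store is registered.
    static boolean endTransaction(long tcId) {
        return stores.containsKey(tcId);
    }

    // Reported ordering: the recovery callback runs before the open future
    // completes, so the store is not yet in the map when it is looked up.
    static boolean recoverThenRegister(long tcId) {
        CompletableFuture<String> openFuture = new CompletableFuture<>();
        openFuture.thenAccept(store -> stores.put(tcId, store));

        // Replay finishes and the recover tracker tries to end transactions now.
        boolean ended = endTransaction(tcId);

        // Only afterwards does the future complete and register the store.
        openFuture.complete("store-" + tcId);
        return ended;
    }

    // Alternative ordering (an assumption): register the store first, then
    // let the recovery step end the pending transactions.
    static boolean registerThenRecover(long tcId) {
        CompletableFuture<String> openFuture = new CompletableFuture<>();
        CompletableFuture<Boolean> result = openFuture.thenApply(store -> {
            stores.put(tcId, store);
            return endTransaction(tcId);
        });
        openFuture.complete("store-" + tcId);
        return result.join();
    }

    public static void main(String[] args) {
        System.out.println("recover before register succeeded: " + recoverThenRegister(13L));
        System.out.println("register before recover succeeded: " + registerThenRecover(14L));
    }
}
```

In the first method the lookup fails because nothing has been registered for that TC id yet, which corresponds to the "Transaction coordinator with id 13 not found!" error in the log above.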

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!

Metadata

Labels

type/bug