Geo-replication failed when enabling partitioned topic auto-creation #6840

@codelipenghui

Describe the bug
Geo-replication fails when partitioned topic auto-creation is enabled.

To Reproduce
Steps to reproduce the behavior:

  1. Start a geo Pulsar cluster
  2. Enable partitioned topic auto-creation (allowAutoTopicCreationType=partitioned)
  3. Create a namespace with 2 clusters
  4. Start a producer that publishes messages to a topic that does not exist yet (do not create it with pulsar-admin)
  5. The following errors appear in the logs of the local cluster:
16:19:02.142 [pulsar-io-22-24] ERROR org.apache.pulsar.client.impl.ProducerImpl - [persistent://geo/default/pub-test-partition-2] [pulsar.repl.pulsar-cluster-a] Failed to create producer: persistent://geo/default/pub-test-partition-2 Failed to start replicator pulsar-cluster-a
16:19:02.142 [pulsar-io-22-24] WARN  org.apache.pulsar.client.impl.ConnectionHandler - [persistent://geo/default/pub-test-partition-2] [pulsar.repl.pulsar-cluster-a] Could not get connection to broker: persistent://geo/default/pub-test-partition-2 Failed to start replicator pulsar-cluster-a -- Will try again in 3.13 s
16:19:02.355 [pulsar-timer-42-1] INFO  org.apache.pulsar.client.impl.ConnectionHandler - [persistent://geo/default/pub-test-partition-1] [pulsar.repl.pulsar-cluster-a] Reconnecting after connection was closed
16:19:02.357 [pulsar-io-22-8] INFO  org.apache.pulsar.client.impl.ProducerImpl - [persistent://geo/default/pub-test-partition-1] [pulsar.repl.pulsar-cluster-a] Creating producer on cnx [id: 0xbed80cce, L:/192.168.1.102:54848 - R:/192.168.1.102:6660]
16:19:02.360 [pulsar-io-22-18] WARN  org.apache.pulsar.client.impl.ClientCnx - [id: 0xbed80cce, L:/192.168.1.102:54848 - R:/192.168.1.102:6660] Received error from server: persistent://geo/default/pub-test-partition-1 Failed to start replicator pulsar-cluster-a
16:19:02.360 [pulsar-io-22-18] ERROR org.apache.pulsar.client.impl.ProducerImpl - [persistent://geo/default/pub-test-partition-1] [pulsar.repl.pulsar-cluster-a] Failed to create producer: persistent://geo/default/pub-test-partition-1 Failed to start replicator pulsar-cluster-a
16:19:02.360 [pulsar-io-22-18] WARN  org.apache.pulsar.client.impl.ConnectionHandler - [persistent://geo/default/pub-test-partition-1] [pulsar.repl.pulsar-cluster-a] Could not get connection to broker: persistent://geo/default/pub-test-partition-1 Failed to start replicator pulsar-cluster-a -- Will try again in 2.919 s
16:18:59.839 [pulsar-ordered-OrderedExecutor-1-0] WARN  org.apache.pulsar.broker.service.BrokerService - Replication or dedup check failed. Removing topic from topics list persistent://geo/default/pub-test-partition-0, java.util.concurrent.CompletionException: org.apache.pulsar.broker.service.BrokerServiceException$NamingException: persistent://geo/default/pub-test-partition-0 Failed to start replicator pulsar-cluster-a
16:18:59.839 [pulsar-ordered-OrderedExecutor-1-0] ERROR org.apache.pulsar.broker.service.ServerCnx - [/192.168.1.102:54867] Failed to create topic persistent://geo/default/pub-test-partition-0
java.util.concurrent.CompletionException: org.apache.pulsar.broker.service.BrokerServiceException$NamingException: persistent://geo/default/pub-test-partition-0 Failed to start replicator pulsar-cluster-a
	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.biRelay(CompletableFuture.java:1298) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1321) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.allOf(CompletableFuture.java:2238) ~[?:1.8.0_241]
	at org.apache.pulsar.common.util.FutureUtil.waitForAll(FutureUtil.java:37) ~[org.apache.pulsar-pulsar-common-2.5.1.jar:2.5.1]
	at org.apache.pulsar.broker.service.persistent.PersistentTopic.checkReplication(PersistentTopic.java:1103) ~[org.apache.pulsar-pulsar-broker-2.5.1.jar:2.5.1]
	at org.apache.pulsar.broker.service.BrokerService$2.openLedgerComplete(BrokerService.java:892) ~[org.apache.pulsar-pulsar-broker-2.5.1.jar:2.5.1]
	at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl.lambda$asyncOpen$7(ManagedLedgerFactoryImpl.java:341) ~[org.apache.pulsar-managed-ledger-2.5.1.jar:2.5.1]
	at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:670) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.uniAcceptStage(CompletableFuture.java:683) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.thenAccept(CompletableFuture.java:2010) ~[?:1.8.0_241]
	at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl.asyncOpen(ManagedLedgerFactoryImpl.java:340) ~[org.apache.pulsar-managed-ledger-2.5.1.jar:2.5.1]
	at org.apache.pulsar.broker.service.BrokerService.lambda$createPersistentTopic$25(BrokerService.java:885) ~[org.apache.pulsar-pulsar-broker-2.5.1.jar:2.5.1]
	at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:670) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:646) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) ~[?:1.8.0_241]
	at org.apache.pulsar.broker.service.BrokerService.lambda$getManagedLedgerConfig$34(BrokerService.java:1036) ~[org.apache.pulsar-pulsar-broker-2.5.1.jar:2.5.1]
	at org.apache.bookkeeper.mledger.util.SafeRun$2.safeRun(SafeRun.java:49) [org.apache.pulsar-managed-ledger-2.5.1.jar:2.5.1]
	at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_241]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_241]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.45.Final.jar:4.1.45.Final]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_241]
Caused by: org.apache.pulsar.broker.service.BrokerServiceException$NamingException: persistent://geo/default/pub-test-partition-0 Failed to start replicator pulsar-cluster-a
	at org.apache.pulsar.broker.service.persistent.PersistentTopic.startReplicator(PersistentTopic.java:1180) ~[org.apache.pulsar-pulsar-broker-2.5.1.jar:2.5.1]
	at org.apache.pulsar.broker.service.persistent.PersistentTopic.checkReplication(PersistentTopic.java:1086) ~[org.apache.pulsar-pulsar-broker-2.5.1.jar:2.5.1]
	... 18 more
16:19:00.081 [pulsar-io-22-3] INFO  org.apache.pulsar.broker.service.ServerCnx - [/192.168.1.102:54857][persistent://geo/default/pub-test-partition-1] Creating producer. producerId=3
16:19:00.081 [pulsar-ordered-OrderedExecutor-2-0] INFO  org.apache.pulsar.broker.service.AbstractTopic - Disabling publish throttling for persistent://geo/default/pub-test-partition-1
16:19:00.082 [pulsar-ordered-OrderedExecutor-2-0] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://geo/default/pub-test-partition-1] There are no replicated subscriptions on the topic
16:19:00.082 [pulsar-ordered-OrderedExecutor-2-0] INFO  org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://geo/default/pub-test-partition-1] Starting replicator to remote: pulsar-cluster-a
16:19:00.082 [pulsar-ordered-OrderedExecutor-2-0] ERROR org.apache.pulsar.broker.service.persistent.PersistentTopic - [persistent://geo/default/pub-test-partition-1] Replicator startup failed due to partitioned-topic pulsar-cluster-a
16:19:00.082 [pulsar-ordered-OrderedExecutor-2-0] WARN  org.apache.pulsar.broker.service.BrokerService - Replication or dedup check failed. Removing topic from topics list persistent://geo/default/pub-test-partition-1, java.util.concurrent.CompletionException: org.apache.pulsar.broker.service.BrokerServiceException$NamingException: persistent://geo/default/pub-test-partition-1 Failed to start replicator pulsar-cluster-a
16:19:00.082 [pulsar-ordered-OrderedExecutor-2-0] ERROR org.apache.pulsar.broker.service.ServerCnx - [/192.168.1.102:54857] Failed to create topic persistent://geo/default/pub-test-partition-1
java.util.concurrent.CompletionException: org.apache.pulsar.broker.service.BrokerServiceException$NamingException: persistent://geo/default/pub-test-partition-1 Failed to start replicator pulsar-cluster-a
	at java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.biRelay(CompletableFuture.java:1298) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.andTree(CompletableFuture.java:1321) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.allOf(CompletableFuture.java:2238) ~[?:1.8.0_241]
	at org.apache.pulsar.common.util.FutureUtil.waitForAll(FutureUtil.java:37) ~[org.apache.pulsar-pulsar-common-2.5.1.jar:2.5.1]
	at org.apache.pulsar.broker.service.persistent.PersistentTopic.checkReplication(PersistentTopic.java:1103) ~[org.apache.pulsar-pulsar-broker-2.5.1.jar:2.5.1]
	at org.apache.pulsar.broker.service.BrokerService$2.openLedgerComplete(BrokerService.java:892) ~[org.apache.pulsar-pulsar-broker-2.5.1.jar:2.5.1]
	at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl.lambda$asyncOpen$7(ManagedLedgerFactoryImpl.java:341) ~[org.apache.pulsar-managed-ledger-2.5.1.jar:2.5.1]
	at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:670) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.uniAcceptStage(CompletableFuture.java:683) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.thenAccept(CompletableFuture.java:2010) ~[?:1.8.0_241]
	at org.apache.bookkeeper.mledger.impl.ManagedLedgerFactoryImpl.asyncOpen(ManagedLedgerFactoryImpl.java:340) ~[org.apache.pulsar-managed-ledger-2.5.1.jar:2.5.1]
	at org.apache.pulsar.broker.service.BrokerService.lambda$createPersistentTopic$25(BrokerService.java:885) ~[org.apache.pulsar-pulsar-broker-2.5.1.jar:2.5.1]
	at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:670) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:646) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:488) ~[?:1.8.0_241]
	at java.util.concurrent.CompletableFuture.complete(CompletableFuture.java:1975) ~[?:1.8.0_241]
	at org.apache.pulsar.broker.service.BrokerService.lambda$getManagedLedgerConfig$34(BrokerService.java:1036) ~[org.apache.pulsar-pulsar-broker-2.5.1.jar:2.5.1]
	at org.apache.bookkeeper.mledger.util.SafeRun$2.safeRun(SafeRun.java:49) [org.apache.pulsar-managed-ledger-2.5.1.jar:2.5.1]
	at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [org.apache.bookkeeper-bookkeeper-common-4.10.0.jar:4.10.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_241]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_241]
	at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.45.Final.jar:4.1.45.Final]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_241]
Caused by: org.apache.pulsar.broker.service.BrokerServiceException$NamingException: persistent://geo/default/pub-test-partition-1 Failed to start replicator pulsar-cluster-a
	at org.apache.pulsar.broker.service.persistent.PersistentTopic.startReplicator(PersistentTopic.java:1180) ~[org.apache.pulsar-pulsar-broker-2.5.1.jar:2.5.1]
	at org.apache.pulsar.broker.service.persistent.PersistentTopic.checkReplication(PersistentTopic.java:1086) ~[org.apache.pulsar-pulsar-broker-2.5.1.jar:2.5.1]
	... 18 more
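
The topic names in the logs above (pub-test-partition-0, pub-test-partition-1, pub-test-partition-2) follow Pulsar's naming convention for the internal topics behind a partitioned topic: the base topic name plus a -partition-N suffix. As a minimal sketch, assuming the producer's base topic was persistent://geo/default/pub-test with at least three partitions (both inferred from the logs, not stated in the report), the helper below derives those names and recovers the partition index from one. PartitionNames is a hypothetical class for illustration and is not part of Pulsar.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Pulsar class: it only mirrors the
// "<base>-partition-<index>" naming visible in the logs.
class PartitionNames {
    static final String PARTITION_SUFFIX = "-partition-";

    // Builds the internal topic names for a partitioned topic.
    public static List<String> partitionTopics(String baseTopic, int numPartitions) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < numPartitions; i++) {
            names.add(baseTopic + PARTITION_SUFFIX + i);
        }
        return names;
    }

    // Returns the partition index encoded in a topic name,
    // or -1 if the name does not end in "-partition-<number>".
    public static int partitionIndex(String topic) {
        int pos = topic.lastIndexOf(PARTITION_SUFFIX);
        if (pos < 0) {
            return -1;
        }
        try {
            return Integer.parseInt(topic.substring(pos + PARTITION_SUFFIX.length()));
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        // Base topic and partition count are assumptions inferred from the logs.
        for (String t : partitionTopics("persistent://geo/default/pub-test", 3)) {
            System.out.println(t + " -> partition " + partitionIndex(t));
        }
    }
}
```

Each name produced this way is an ordinary persistent topic from the broker's point of view, which is why the replicator failure is reported once per partition.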

Expected behavior
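Geo-replication should keep working when partitioned topic auto-creation is enabled: the replicator should start and the producer should be created without errors.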

Additional context
Pulsar version 2.5.1

Labels

release/2.5.2, type/bug
