Skip to content

Conversation

@Technoboy-
Copy link
Contributor

Motivation

When broker do monitorBacklogQuota, there may throw NPE.

2022-06-24T18:14:38,360-0400 [pulsar-backlog-quota-checker-29-1] ERROR org.apache.pulsar.broker.service.BacklogQuotaManager - [persistent://enmv/epnmccs-yang/ECLIPSE-epnmccs-yang-ms-east-partition-1] Error resetting cursor for slowest consumer [ENMV_CCS_SUBSCRIPTION]
java.lang.NullPointerException: null
 at org.apache.pulsar.broker.service.BacklogQuotaManager.dropBacklogForTimeLimit(BacklogQuotaManager.java:247) 
 at org.apache.pulsar.broker.service.BacklogQuotaManager.handleExceededBacklogQuota(BacklogQuotaManager.java:127) 
 at org.apache.pulsar.broker.service.BrokerService.lambda$monitorBacklogQuota$78(BrokerService.java:1721) 
 at java.util.Optional.ifPresent(Optional.java:183) ~[?:?]
 at org.apache.pulsar.broker.service.BrokerService.lambda$forEachTopic$77(BrokerService.java:1705) 
 at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap$Section.forEach(ConcurrentOpenHashMap.java:544)
 at org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap.forEach(ConcurrentOpenHashMap.java:272) 
 at org.apache.pulsar.broker.service.BrokerService.forEachTopic(BrokerService.java:1703) 
 at org.apache.pulsar.broker.service.BrokerService.monitorBacklogQuota(BrokerService.java:1714) 
 

Below is the related code :

try {
for (;;) {
ManagedCursor slowestConsumer = mLedger.getSlowestConsumer();
Position oldestPosition = slowestConsumer.getMarkDeletedPosition();
ManagedLedgerInfo.LedgerInfo ledgerInfo = mLedger.getLedgerInfo(oldestPosition.getLedgerId()).get();
// Timestamp only > 0 if ledger has been closed
if (ledgerInfo.getTimestamp() > 0
&& currentMillis - ledgerInfo.getTimestamp() > quota.getLimitTime()) {
// skip whole ledger for the slowest cursor
PositionImpl nextPosition = mLedger.getNextValidPosition(
PositionImpl.get(ledgerInfo.getLedgerId(), ledgerInfo.getEntries() - 1));
if (!nextPosition.equals(oldestPosition)) {
slowestConsumer.resetCursor(nextPosition);
continue;
}
}
break;

The root cause is that the slowest consumer ledgerId may be trimmed, so we need to check ledgerInfo.

Documentation

  • doc-not-needed
    (Please explain why)

@Technoboy- Technoboy- self-assigned this Jun 27, 2022
@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jun 27, 2022
@Technoboy- Technoboy- marked this pull request as ready for review June 27, 2022 06:34
Copy link
Contributor

@codelipenghui codelipenghui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it able to add a test?

Maybe try to mock the managed ledger and let the mLedger.getLedgerInfo method return null.

@codelipenghui codelipenghui added this to the 2.11.0 milestone Jun 27, 2022
@Technoboy- Technoboy- changed the title Fix NPE when drop backlog for time limit. [fix][broker] Fix NPE when drop backlog for time limit. Jun 27, 2022
@Technoboy-
Copy link
Contributor Author

Is it able to add a test?

Maybe try to mock the managed ledger and let the mLedger.getLedgerInfo method return null.

Really. it's hard to add a test. because there existed a race condition with PersistentTopic#slowestReaderTimeBasedBacklogQuotaCheck, see #15663.
When broker do monitorBacklogQuota , it will first execute PersistentTopic#slowestReaderTimeBasedBacklogQuotaCheck, it will check the mLedger.getLedgerInfo. then to BacklogQuotaManager#dropBacklogForTimeLimit. we need to mock after PersistentTopic#slowestReaderTimeBasedBacklogQuotaCheck.

It's the same issue with #15663, so using the same resolution for this issue.

@codelipenghui codelipenghui merged commit d24d827 into apache:master Jun 28, 2022
codelipenghui pushed a commit that referenced this pull request Jun 28, 2022
mattisonchao pushed a commit that referenced this pull request Jul 2, 2022
@mattisonchao mattisonchao added the cherry-picked/branch-2.9 Archived: 2.9 is end of life label Jul 2, 2022
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jul 4, 2022
(cherry picked from commit d24d827)
(cherry picked from commit a51196e)
BewareMyPower pushed a commit that referenced this pull request Jul 29, 2022
@BewareMyPower BewareMyPower added the cherry-picked/branch-2.8 Archived: 2.8 is end of life label Jul 29, 2022
@Technoboy- Technoboy- deleted the fix-backlog-quota-manager-npe branch August 10, 2022 05:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/broker cherry-picked/branch-2.8 Archived: 2.8 is end of life cherry-picked/branch-2.9 Archived: 2.9 is end of life cherry-picked/branch-2.10 doc-not-needed Your PR changes do not impact docs release/2.8.4 release/2.9.4 release/2.10.2 type/bug The PR fixed a bug or issue reported a bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants