[Fix][broker] Fix NPE when ledger id not found in OpReadEntry
#15837
@mattisonchao: Thanks for your contribution. For this PR, do we need to update docs?
/pulsarbot run-failure-checks
managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java
After discussing with @Technoboy-, we may find why the Precondition
Cases
pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java Lines 1618 to 1659 in 5455f4d
After line 1643, we may have empty ledgers.
And because the
gaoran10
left a comment
LGTM
Maybe we could add a unit test to reproduce this problem, for example, reading a position greater than the max ledger id that exists in
@gaoran10 Got it, I will do that.
Done, PTAL.
(cherry picked from commit 7a3ad61)
Motivation
In the current implementation, there is more than one potential NPE related to the ManagedLedgerImpl#startReadOperationOnLedger method.
1. Unboxing NPE
pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java
Lines 2230 to 2244 in b13d15c
According to this code, when the ledger id is null, we fail the OpReadEntry. However, there is no direct return after the failure, so an NPE is thrown on line 2238 when we try to unbox null.
2. Null OpReadEntry context
According to the code above, when we fail the OpReadEntry, we pass a null value as the context (line 2235). An NPE is then thrown when the dispatcher gets the callback.
pulsar/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentDispatcherSingleActiveConsumer.java
Lines 478 to 484 in 0975cdc
3. OpReadEntry#create
pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java
Lines 48 to 63 in 9376128
When we create an OpReadEntry and then call ManagedCursor#startReadOperationOnLedger, assume the ledger id is null. At this point we fail this OpReadEntry, but its other parameters are not yet initialized. When we then call OpReadEntry#readEntriesFailed, the cursor is null and an NPE is thrown.
pulsar/managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/OpReadEntry.java
Lines 90 to 94 in 9376128
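To make the unboxing case (item 1) concrete, here is a minimal, self-contained sketch. It is not the Pulsar code: the class name, the method names, and the use of a NavigableMap with ceilingKey as the source of the nullable ledger id are assumptions made for illustration. It only shows how reporting a failure without returning leads to unboxing null.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for the ledger lookup; none of these names come from Pulsar.
class UnboxingNpeSketch {

    // Mirrors the problematic shape: the failure is reported, but execution continues.
    static long nextLedgerIdBuggy(NavigableMap<Long, String> ledgers,
                                  long requestedLedgerId,
                                  Runnable onFailure) {
        // ceilingKey returns null when no key is >= requestedLedgerId (assumed lookup).
        Long ledgerId = ledgers.ceilingKey(requestedLedgerId);
        if (ledgerId == null) {
            onFailure.run(); // failure callback fires, but there is no return here
        }
        return ledgerId; // auto-unboxing a null Long throws NullPointerException
    }

    // Helper that reports whether the buggy lookup throws an NPE for a given id.
    static boolean throwsNpe(NavigableMap<Long, String> ledgers, long requestedLedgerId) {
        try {
            nextLedgerIdBuggy(ledgers, requestedLedgerId, () -> { });
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> ledgers = new TreeMap<>();
        ledgers.put(1L, "ledger-1");
        ledgers.put(5L, "ledger-5");
        System.out.println(throwsNpe(ledgers, 3L)); // false: ledger 5 is found
        System.out.println(throwsNpe(ledgers, 9L)); // true: no ledger id >= 9
    }
}
```

The second call corresponds to reading a position beyond the largest known ledger id, which is the situation the reviewer suggested covering with a unit test.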
Modifications
OpReadEntry logic in ManagedCursor#startReadOperationOnLedger: when the ledger id is null, we can return the original value. The ManagedLedger will validate whether this operation is legal.
From the perspective of the overall design, the OpReadEntry is just an intermediate-state object that may hold illegal values, so we have to check that this OpReadEntry is valid in ManagedLedger#asyncReadEntries (currently we already do that).
Verifying this change
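The idea described under Modifications can be checked with a small self-contained sketch. It is an illustration under assumptions, not the actual patch: the class, both method names, and the containsKey-based validity check are hypothetical stand-ins for the lookup and for the validation that the description says the managed ledger already performs.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical sketch of "return the original value, validate later"; not Pulsar's API.
class ReturnOriginalSketch {

    // When no ledger id is found, hand the requested id back unchanged
    // instead of failing the read and then unboxing null.
    static long resolveLedgerId(NavigableMap<Long, String> ledgers, long requestedLedgerId) {
        Long ledgerId = ledgers.ceilingKey(requestedLedgerId);
        if (ledgerId == null) {
            return requestedLedgerId; // original value; the caller decides if it is legal
        }
        return ledgerId;
    }

    // Assumed caller-side validation: a read is legal only if the ledger exists.
    static boolean isReadable(NavigableMap<Long, String> ledgers, long ledgerId) {
        return ledgers.containsKey(ledgerId);
    }

    public static void main(String[] args) {
        NavigableMap<Long, String> ledgers = new TreeMap<>();
        ledgers.put(1L, "ledger-1");
        ledgers.put(5L, "ledger-5");

        long found = resolveLedgerId(ledgers, 3L);   // resolves to ledger 5
        long missing = resolveLedgerId(ledgers, 9L); // stays 9, no exception

        System.out.println(found + " readable=" + isReadable(ledgers, found));
        System.out.println(missing + " readable=" + isReadable(ledgers, missing));
    }
}
```

Keeping the lookup free of failure callbacks means a single place (the validation step) decides whether the read proceeds, which matches the description of OpReadEntry as an intermediate object that may carry illegal values.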
Documentation
doc-not-needed (Please explain why)