Skip to content

Conversation

@casuallc
Copy link
Contributor

@casuallc casuallc commented Aug 27, 2021

Fixes #11796

[ISSUE 11796] throw NPE when readEntry

No need doc.

OpReadEntry op = OpReadEntry.create(this, readPosition, numberOfEntriesToRead, callback,
ctx, maxPosition);
if (op.readPosition == null) {
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we need to call the method callback.readEntriesFailed to make sure the caller gets a result.

Copy link
Contributor Author

@casuallc casuallc Sep 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do not need call the method here, if OpReadEntry init failed, must ensure OpReadEntry->readEntriesFailed will be called, callback.readEntriesFailed will be called then.

PENDING_READ_OPS_UPDATER.incrementAndGet(this);
OpReadEntry op = OpReadEntry.create(this, readPosition, numOfEntriesToRead, callback, ctx, maxPosition);
ledger.asyncReadEntries(op);
if (op.readPosition != null) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

op is not recycle if op.readPosition == null

By the way, could you add some unit test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

op recycled when the method OpReadEntry->readEntriesFailed called.

At present, op init failed only if op.readPosition = cursor.ledger.startReadOperationOnLedger(readPositionRef, op); called.

I do not know how to test this case ...

@lhotari
Copy link
Member

lhotari commented Sep 8, 2021

good work @casuallc . The changes in this PR seem to be necessary for handling the failure case that is highlighted.

btw. There seems to be a similar change #11292 in progress, but that is taking a different approach for solving a similar issue. I'd prefer the approach used in this PR over the changes in 11292 since the solution in this PR is very simple.

@lhotari lhotari requested a review from merlimat September 8, 2021 07:05
@lhotari lhotari requested a review from zymap September 9, 2021 06:30
@Jason918
Copy link
Contributor

good work @casuallc . The changes in this PR seem to be necessary for handling the failure case that is highlighted.

btw. There seems to be a similar change #11292 in progress, but that is taking a different approach for solving a similar issue. I'd prefer the approach used in this PR over the changes in 11292 since the solution in this PR is very simple.

@lhotari
IMHO, there is race condition in this solution.
If op.readPosition = cursor.ledger.startReadOperationOnLedger(readPositionRef, op); is failed in OpReadEntry.create, this OpReadEntry is recycled in readEntriesFailed.
So if we check op.readPosition after OpReadEntry.create, this OpReadEntry object could be reused in other threads, and maybe the readPosition is properly initialized.

That's why I try to init readPosition after each OpReadEntry.create method in #11292 , causing the complexity.

@casuallc
Copy link
Contributor Author

good work @casuallc . The changes in this PR seem to be necessary for handling the failure case that is highlighted.
btw. There seems to be a similar change #11292 in progress, but that is taking a different approach for solving a similar issue. I'd prefer the approach used in this PR over the changes in 11292 since the solution in this PR is very simple.

@lhotari
IMHO, there is race condition in this solution.
If op.readPosition = cursor.ledger.startReadOperationOnLedger(readPositionRef, op); is failed in OpReadEntry.create, this OpReadEntry is recycled in readEntriesFailed.
So if we check op.readPosition after OpReadEntry.create, this OpReadEntry object could be reused in other threads, and maybe the readPosition is properly initialized.

That's why I try to init readPosition after each OpReadEntry.create method in #11292 , causing the complexity.

@Jason918 Yes, you are right. I should check op but not op.readPosition.
I changed the code, PTAL.

@hezhangjian
Copy link
Member

@lhotari code managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerImpl.java the callback paas the ctx as null.
But in class PersistentDispatcherSingleActiveConsumer readEntryFailed method acquire the ctx(Consumer) not be null.
So the code should be

opReadEntry.readEntriesFailed(new ManagedLedgerException.NoMoreEntriesToReadException("The ceilingKey(K key) method is used to return the " + "least key greater than or equal to the given key, or null if there is no such key"), ctx);

Am i right ?

@lhotari
Copy link
Member

lhotari commented Oct 15, 2021

Closing and re-opening this PR to trigger a fresh build.

@lhotari lhotari closed this Oct 15, 2021
@lhotari
Copy link
Member

lhotari commented Oct 15, 2021

@casuallc It looks like your branch for this PR is gone and this PR cannot be re-opened. I closed it with the intention to trigger a rebuild. Rebuilding on CI was disabled and now I can see that the reason was that the original PR branch has been deleted.

@casuallc Would you mind creating a new PR with the same changes?

@lhotari
Copy link
Member

lhotari commented Oct 15, 2021

@nicoloboschi @eolivelli The changes in this PR would be useful for fixing a common NPE issue #11796.

@casuallc
Copy link
Contributor Author

@casuallc It looks like your branch for this PR is gone and this PR cannot be re-opened. I closed it with the intention to trigger a rebuild. Rebuilding on CI was disabled and now I can see that the reason was that the original PR branch has been deleted.

@casuallc Would you mind creating a new PR with the same changes?

#12396

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

throw NPE when readEntry

6 participants