[fix][broker] Fix markDeletedPosition race condition in ManagedLedgerImpl.maybeUpdateCursorBeforeTrimmingConsumedLedger() method#25110
Doesn't solve Seems we must ensure some happens-before order between |
…rsorBeforeTrimmingConsumedLedger
…d testFlushCursorAfterIndividualDeleteInactivity test
…ursorLedgerFailed test
…ursorLedgerFailed test
…erties in ManagedLedgerTest.testTrimmerRaceCondition
…Impl.internalAsyncMarkDelete
…eleteEntryToLatest, add test
…onWithThrottleMarkDeleteInDurableCursor
|
Fix Add Ready for review. |
…Impl.maybeUpdateCursorBeforeTrimmingConsumedLedger() method (apache#25110) (cherry picked from commit 1617bb2) (cherry picked from commit 252050c)
Motivation
Since #25087, `ManagedLedgerImpl.maybeUpdateCursorBeforeTrimmingConsumedLedger()` performs a mark-delete operation.
Mark-deleting an already mark-deleted position throws an exception.
Please see comment for race condition case: #25101 (comment).
See also:
Modifications
1. Modify the `ManagedLedgerImpl.maybeUpdateCursorBeforeTrimmingConsumedLedger()` method:
   a. Snapshot positions into local variables to avoid a race condition.
   b. Use `markDeletePosition` instead of `persistentMarkDeletedPosition` to set `lastAckedPosition`, because `persistentMarkDeletedPosition` is updated after `markDeletePosition`.
   c. Add an `if` check to avoid mark-deleting an already mark-deleted position.
   d. Wait for `maybeUpdateCursorBeforeTrimmingConsumedLedger()` to complete when opening a ledger.
   e. Move `maybeUpdateCursorBeforeTrimmingConsumedLedger()` into the synchronized block when creating a new ledger. I think a new ledger's initialization work should be completed before the first entry is added to it.
2. Modify the `ManagedCursorImpl.getNumberOfEntries()` method: snapshot positions into a local variable to avoid a race condition that showed up in error logs.
3. Ignore the changes in the `ManagedCursorImpl.asyncMarkDelete()` method; I just snapshot positions into a local variable to avoid a race condition. It should be fixed by "[fix][broker] Move newPosition to nextValidLedger:-1 to avoid cursor position and ledger inconsistency in ManagedCursorImpl" #25117.
4. Add the `testNeverThrowsMarkDeletingMarkedPositionInMaybeUpdateCursorBeforeTrimmingConsumedLedger` test in `ManagedLedgerImplTest` to verify the code change.
5. Fix tests affected by this PR's code change.
6. Fix some flaky tests introduced by PR "[fix][broker] Fix cursor position persistence in ledger trimming" #25087.
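
To illustrate items a to c above, here is a minimal, self-contained sketch of the "snapshot, then guard" pattern. It is not the code from this PR: `Pos`, `SketchCursor`, and `TrimSketch.maybeAdvance` are hypothetical stand-ins, and the real `ManagedCursorImpl` and `ManagedLedgerImpl` are asynchronous and considerably more involved. The sketch assumes that a mark-delete to a position not ahead of the current one is rejected with an exception, as described in the Motivation. Treating a rejection as a skipped update is also an assumption of the sketch, not necessarily how the PR closes the remaining check-then-act window.

```java
// Hypothetical, simplified position (not Pulsar's Position type).
final class Pos implements Comparable<Pos> {
    final long ledgerId;
    final long entryId;

    Pos(long ledgerId, long entryId) {
        this.ledgerId = ledgerId;
        this.entryId = entryId;
    }

    @Override
    public int compareTo(Pos other) {
        int c = Long.compare(ledgerId, other.ledgerId);
        return c != 0 ? c : Long.compare(entryId, other.entryId);
    }
}

// Hypothetical cursor that only models the in-memory mark-delete position.
final class SketchCursor {
    private volatile Pos markDeletePosition;

    SketchCursor(Pos initial) {
        this.markDeletePosition = initial;
    }

    Pos getMarkDeletePosition() {
        return markDeletePosition;
    }

    // Assumption: a mark-delete that does not move the position forward is rejected.
    synchronized void markDelete(Pos newPos) {
        if (newPos.compareTo(markDeletePosition) <= 0) {
            throw new IllegalArgumentException("position is already mark-deleted");
        }
        markDeletePosition = newPos;
    }
}

final class TrimSketch {
    // Returns true if the cursor was advanced to target, false if the update was skipped.
    static boolean maybeAdvance(SketchCursor cursor, Pos target) {
        // (a) Read the volatile field once; later checks use this snapshot,
        //     so they cannot observe two different values.
        Pos snapshot = cursor.getMarkDeletePosition();

        // (c) Guard: skip when the cursor is already at or past the target.
        if (snapshot.compareTo(target) >= 0) {
            return false;
        }

        try {
            cursor.markDelete(target);
            return true;
        } catch (IllegalArgumentException e) {
            // A concurrent acknowledgment moved the cursor after the snapshot was taken.
            // The sketch treats this as "nothing to do".
            return false;
        }
    }
}
```

Calling `TrimSketch.maybeAdvance(cursor, target)` twice with the same `target` advances the cursor on the first call and skips on the second, instead of throwing.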
Verifying this change
Does this pull request potentially affect one of the following parts:
If the box was checked, please highlight the changes
Documentation
- doc
- doc-required
- doc-not-needed
- doc-complete
Matching PR in forked repository
PR in forked repository: oneby-wang#18