Skip to content

Conversation

@merlimat
Copy link
Contributor

Motivation

The ReadHandle instances used in ManagedLedger are not properly closed on either cache invalidation or when the managed ledger instance is closed.

While this does not cause any issue with regular BK LedgerHandle instances (since there are no resources to dispose of and the objects just get garbage collected), this is a problem with some of the offloader implementations which might keep file descriptions, sockets or other resources open.

@merlimat merlimat added type/bug The PR fixed a bug or issue reported a bug release/2.7.3 labels May 20, 2021
@merlimat merlimat added this to the 2.8.0 milestone May 20, 2021
@merlimat merlimat self-assigned this May 20, 2021
@merlimat merlimat merged commit 00677bf into apache:master May 20, 2021
@merlimat merlimat deleted the fix-read-handle-leak branch May 20, 2021 21:44
@merlimat merlimat added the cherry-picked/branch-2.7 Archived: 2.7 is end of life label May 20, 2021
@addisonj
Copy link
Contributor

👏 this looks great

eolivelli pushed a commit to datastax/pulsar that referenced this pull request May 21, 2021
eolivelli added a commit to datastax/pulsar that referenced this pull request May 22, 2021
@eolivelli
Copy link
Contributor

FYI This commit on branch-2.7 make test ManagedCursorTest hang.
Probably there is some missing dependency.

@merlimat
Copy link
Contributor Author

@eolivelli Good point. The test is failing because of a thread deadlock in the test:

"test-OrderedScheduler-0-0"  prio=5 tid=16 in Object.wait()
java.lang.Thread.State: WAITING (on object monitor)
        at java.base@13.0.1/jdk.internal.misc.Unsafe.park(Native Method)
        at java.base@13.0.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
        at java.base@13.0.1/java.util.concurrent.locks.StampedLock.acquireWrite(StampedLock.java:1329)
        at java.base@13.0.1/java.util.concurrent.locks.StampedLock.writeLock(StampedLock.java:475)
        at app//org.apache.pulsar.common.util.collections.ConcurrentLongHashMap$Section.remove(ConcurrentLongHashMap.java:340)
        at app//org.apache.pulsar.common.util.collections.ConcurrentLongHashMap$Section.access$300(ConcurrentLongHashMap.java:194)
        at app//org.apache.pulsar.common.util.collections.ConcurrentLongHashMap.remove(ConcurrentLongHashMap.java:141)
        at app//org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.invalidateReadHandle(ManagedLedgerImpl.java:1619)
        at app//org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.lambda$asyncClose$9(ManagedLedgerImpl.java:1239)
        at app//org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl$$Lambda$250/0x0000000800d7c840.accept(Unknown Source)
        at app//org.apache.pulsar.common.util.collections.ConcurrentLongHashMap$Section.forEach(ConcurrentLongHashMap.java:431)
        at app//org.apache.pulsar.common.util.collections.ConcurrentLongHashMap.forEach(ConcurrentLongHashMap.java:164)
        at app//org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl.lambda$asyncClose$10(ManagedLedgerImpl.java:1238)
        at app//org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl$$Lambda$249/0x0000000800d7c040.closeComplete(Unknown Source)
        at app//org.apache.bookkeeper.client.PulsarMockLedgerHandle.lambda$asyncClose$1(PulsarMockLedgerHandle.java:96)
        at app//org.apache.bookkeeper.client.PulsarMockLedgerHandle$$Lambda$229/0x0000000800d71040.accept(Unknown Source)

This is because the fix to ConcurrentLongHashMap (#9787) never got back ported to 2.7. Cherry-picked and the branch is working now.

@eolivelli
Copy link
Contributor

@merlimat
Thanks.
I wasn't able to find the missing commit

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cherry-picked/branch-2.7 Archived: 2.7 is end of life release/2.7.3 type/bug The PR fixed a bug or issue reported a bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants