Contributor

@AnonHxy AnonHxy commented Jan 20, 2026

Descriptions of the changes in this PR:

Fix #4705

Motivation

Fix #4705

Changes

1. Set the compacting flag of entryLocationIndex to true when LedgerStorage shuts down, to stop any subsequent compaction.
2. When LedgerStorage shuts down, wait until the in-progress compaction ends.

@AnonHxy AnonHxy force-pushed the fix_rocksdb_compact_coredump branch from 560ace4 to bc91be9 Compare January 20, 2026 16:09
Comment on lines 350 to 352
while (!entryLocationIndex.compareAndSetCompacting(false, true)) {
Thread.sleep(100);
}
Member

Shouldn't it be done in the close method in the EntryLocationIndex?

Contributor Author

Makes sense, @zymap.

@AnonHxy AnonHxy force-pushed the fix_rocksdb_compact_coredump branch from ecaea19 to 7018ea1 Compare January 22, 2026 01:18
@AnonHxy AnonHxy force-pushed the fix_rocksdb_compact_coredump branch from 1556a35 to abee9aa Compare January 22, 2026 05:01
@Override
public void close() throws IOException {
log.info("Closing EntryLocationIndex");
while (!compacting.compareAndSet(false, true)) {
Member

This wait could leave shutdown stuck here indefinitely, or at least for an exceptionally long time.

Should we add a maximum waiting time, or do something else, such as modifying RocksDB operations?
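
For illustration, the maximum-wait idea could look roughly like the sketch below. It is not code from this PR: the class `BoundedAcquire`, the method `tryAcquire`, and both parameters are hypothetical names, and the sketch deliberately leaves open what the caller should do when the wait times out, since closing the DB while a compaction is still running is the situation this PR is trying to avoid.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical helper, not part of BookKeeper: polls an AtomicBoolean flag
// like the loop in close(), but gives up once a deadline has passed.
final class BoundedAcquire {
    private BoundedAcquire() {
    }

    /**
     * Tries to flip the flag from false to true, polling every pollMillis.
     * Returns true if the flag was acquired, false on timeout or interrupt.
     */
    static boolean tryAcquire(AtomicBoolean flag, long timeoutMillis, long pollMillis) {
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (!flag.compareAndSet(false, true)) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out; the flag is still held by someone else
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }
}
```

A caller would check the return value and decide between logging and continuing to wait, or aborting the shutdown; the sketch takes no position on which is right.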

Contributor Author

I don't think so. Closing a DB that is in the middle of a compaction may trigger core dumps or other unexpected errors. If closing the DB takes too long, I think we'd better find out why, or kill -9 if we need to.

The handling approach here is similar to the procedure below:

while (!compacting.compareAndSet(false, true)) {
// Wait till the thread stops compacting
Thread.sleep(100);

public void compact() throws IOException {
try {
isCompacting = true;
if (!compacting.compareAndSet(false, true)) {
Member

I have a question:

Is the way isCompacting is set the root cause of this problem?
Or is this simply a cleaner rewrite of the method that also rules out problems with the isCompacting setting?

Contributor Author

Yes, it's the root cause. We should either cancel compaction once the DB has been closed, or delay closing the DB if a compaction is already in progress. So we need an atomic variable here to serve as a flag for the DB status. @StevenLuMT
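
A minimal sketch of that idea, assuming a single AtomicBoolean shared by compact() and close(). The class `GuardedIndex`, the `Runnable` body, and the `closed` field are hypothetical stand-ins for illustration and are not the actual EntryLocationIndex code:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for EntryLocationIndex, not the real BookKeeper class.
class GuardedIndex {
    // true while a compaction is running, and permanently true once closed
    private final AtomicBoolean compacting = new AtomicBoolean(false);
    private volatile boolean closed = false;

    /** Runs the compaction body only if no compaction or close() holds the flag. */
    boolean compact(Runnable body) {
        if (!compacting.compareAndSet(false, true)) {
            return false; // another compaction is running, or the index is closing/closed
        }
        try {
            body.run();
            return true;
        } finally {
            compacting.set(false); // release so close() or a later compaction can proceed
        }
    }

    /** Waits for any in-progress compaction, then keeps the flag set for good. */
    void close() {
        while (!compacting.compareAndSet(false, true)) {
            try {
                Thread.sleep(100); // wait until the compacting thread releases the flag
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for compaction", e);
            }
        }
        // The flag is intentionally never reset, so later compact() calls are skipped.
        closed = true; // a real implementation would close the underlying DB here
    }

    boolean isClosed() {
        return closed;
    }
}
```

Because close() acquires the flag and never releases it, one variable covers both cases described above: a running compaction delays the close, and a completed close prevents any later compaction from starting.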

Development

Successfully merging this pull request may close these issues.

[Bug] Core dumps are triggered by rocksdb compacting when shutdown the bookkeeper

3 participants