Skip to content

Conversation

@gaozhangmin
Copy link
Contributor

@gaozhangmin gaozhangmin commented Sep 23, 2021

Motivation
Now ReplicationStats numUnderReplicatedLedger registers when publishSuspectedLedgersAsync, but its value doesn't decrease as with the ledger replicated successfully, We cannot know the progress of replication from the stat.

Changes
registers a notifyUnderReplicationLedgerChanged when auditor starts. numUnderReplicatedLedger value will decrease when the ledger path under replicate deleted.

@gaozhangmin
Copy link
Contributor Author

@eolivelli PTAL

Copy link
Contributor

@nicoloboschi nicoloboschi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, minor comments here

@gaozhangmin
Copy link
Contributor Author

@nicoloboschi PTAL

Copy link
Contributor

@nicoloboschi nicoloboschi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor

@dlg99 dlg99 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not convinced that it solves the problem as you outlined it (see my comment on the review) + the PR needs a unit test.

@gaozhangmin gaozhangmin force-pushed the replication-stats-num-under-replicated-ledgers branch from 431f26f to 7d985c8 Compare September 28, 2021 12:00
@gaozhangmin gaozhangmin requested a review from dlg99 September 28, 2021 12:01
@dlg99
Copy link
Contributor

dlg99 commented Sep 28, 2021

@gaozhangmin

[INFO] There is 1 error reported by Checkstyle 6.19 with buildtools/src/main/resources/bookkeeper/checkstyle.xml ruleset.
Error:  src/main/java/org/apache/bookkeeper/replication/Auditor.java:[744] (sizes) LineLength: Line is longer than 120 characters (found 126).

fyi: https://bookkeeper.apache.org/community/coding_guide/ and https://bookkeeper.apache.org/community/contributing/#optional-ide-setup for checkstyle configuration

@gaozhangmin gaozhangmin force-pushed the replication-stats-num-under-replicated-ledgers branch 2 times, most recently from fc639a2 to 11d6119 Compare October 8, 2021 13:00
@gaozhangmin gaozhangmin force-pushed the replication-stats-num-under-replicated-ledgers branch from 6f4ee21 to 7f0b333 Compare January 19, 2022 12:32
@gaozhangmin
Copy link
Contributor Author

@eolivelli PTAL again, It has been for a lone time

@gaozhangmin
Copy link
Contributor Author

@zymap PTAL

@gaozhangmin
Copy link
Contributor Author

rerun failure checks

@dlg99 dlg99 added this to the 4.15.0 milestone Feb 14, 2022
@dlg99 dlg99 merged commit 9fd35f9 into apache:master Feb 14, 2022
StevenLuMT pushed a commit to StevenLuMT/bookkeeper that referenced this pull request Feb 16, 2022
…cess of replication

Motivation
Now ReplicationStats numUnderReplicatedLedger registers when `publishSuspectedLedgersAsync`, but its value doesn't decrease as with the ledger replicated successfully, We cannot know the progress of replication from the stat.

Changes
registers a notifyUnderReplicationLedgerChanged when auditor starts. numUnderReplicatedLedger value will decrease when the ledger path under replicate deleted.

Reviewers: Nicolò Boschi <boschi1997@gmail.com>, Enrico Olivelli <eolivelli@gmail.com>, Andrey Yegorov <None>

This closes apache#2805 from gaozhangmin/replication-stats-num-under-replicated-ledgers
Copy link
Contributor

@hangc0276 hangc0276 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-1 for this change

}
};
try {
zkc.addWatch(urLedgerPath, w, AddWatchMode.PERSISTENT_RECURSIVE);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will register a lot of watcher on Zookeeper

public void operationComplete(int rc, Void result) {
Iterator<UnderreplicatedLedger> underreplicatedLedgersInfo = ledgerUnderreplicationManager
.listLedgersToRereplicate(null);
underReplicatedLedgersGuageValue.set(Iterators.size(underreplicatedLedgersInfo));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Iterators.size will iterate the whole Znodes in an under-replicated path which brings heavy pressure on Zookeeper and lead to dead lock.

@horizonzy
Copy link
Member

-1 for this change

Agree, I think we can revert it.

@hangc0276
Copy link
Contributor

This PR introduced a deadlock described: apache/pulsar#21010

Ghatage pushed a commit to sijie/bookkeeper that referenced this pull request Jul 12, 2024
…cess of replication

Motivation
Now ReplicationStats numUnderReplicatedLedger registers when `publishSuspectedLedgersAsync`, but its value doesn't decrease as with the ledger replicated successfully, We cannot know the progress of replication from the stat.

Changes
registers a notifyUnderReplicationLedgerChanged when auditor starts. numUnderReplicatedLedger value will decrease when the ledger path under replicate deleted.

Reviewers: Nicolò Boschi <boschi1997@gmail.com>, Enrico Olivelli <eolivelli@gmail.com>, Andrey Yegorov <None>

This closes apache#2805 from gaozhangmin/replication-stats-num-under-replicated-ledgers
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants