-
Notifications
You must be signed in to change notification settings - Fork 963
Add error handling to readLedgerMetadata in over-replicated ledger GC #2844
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
zymap
merged 1 commit into
apache:master
from
shustsud:add_error_handling_over-replicated_ledger_GC
Oct 25, 2021
Merged
Add error handling to readLedgerMetadata in over-replicated ledger GC #2844
zymap
merged 1 commit into
apache:master
from
shustsud:add_error_handling_over-replicated_ledger_GC
Oct 25, 2021
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
dlg99
approved these changes
Oct 23, 2021
zymap
approved these changes
Oct 25, 2021
massakam
reviewed
Oct 25, 2021
| } | ||
| } catch (Throwable t) { | ||
| latch.countDown(); | ||
| continue; |
Contributor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Exceptions other than BKNoSuchLedgerExistsException may be thrown, so it would be better to output the WARN log. There is also an option not to output the log only when the cause is BKNoSuchLedgerExistsException:
} catch (Throwable t) {
if (!(t.getCause() instanceof BKNoSuchLedgerExistsException)) {
LOG.warn("...");
}
latch.countDown();
continue;
}
zymap
pushed a commit
that referenced
this pull request
Oct 26, 2021
…#2844) ### Motivation For each ledger whose metadata is not in ZK, following stack trace will be output: ``` 15:30:17.925 [GarbageCollectorThread-11-1] ERROR o.a.b.b.ScanAndCompareGarbageCollector - Exception when iterating through the ledgers to check for over-replication java.util.concurrent.ExecutionException: org.apache.bookkeeper.client.BKException$BKNoSuchLedgerExistsException: No such ledger exists at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) at org.apache.bookkeeper.bookie.ScanAndCompareGarbageCollector.removeOverReplicatedledgers(ScanAndCompareGarbageCollector.java:199) at org.apache.bookkeeper.bookie.ScanAndCompareGarbageCollector.gc(ScanAndCompareGarbageCollector.java:120) at org.apache.bookkeeper.bookie.GarbageCollectorThread.doGcLedgers(GarbageCollectorThread.java:372) at org.apache.bookkeeper.bookie.GarbageCollectorThread.runWithFlags(GarbageCollectorThread.java:323) at org.apache.bookkeeper.bookie.GarbageCollectorThread.safeRun(GarbageCollectorThread.java:301) at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.bookkeeper.client.BKException$BKNoSuchLedgerExistsException: No such ledger exists at org.apache.bookkeeper.meta.AbstractZkLedgerManager$3.processResult(AbstractZkLedgerManager.java:397) at org.apache.bookkeeper.zookeeper.ZooKeeperClient$19$1.processResult(ZooKeeperClient.java:994) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:575) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:508) ``` It is noisy, makes the size of log files large and finally causes OOM during log rotation. So we should suppress the stacktrace. (This problem is due to [#2813](#2813).) ### Changes Add error handling to readLedgerMetadata in over-replicated ledger GC in order to suppress the stacktrace. (cherry picked from commit bd5c50b)
hsaputra
pushed a commit
that referenced
this pull request
Oct 29, 2021
### Motivation
In order to complete migration to Gradle we must build all the subprojects.
### Changes
- Enabled `sh` integration tests with gradle, located in `tests/scripts/src/test/bash/gradle`
- Added these modules to the build
- `bookkeeper-http:servlet-http-server`
- `metadata-drivers:etcd`
- `tests:backward-compat:*`
- `tests:shaded:*`
- `stream:bk-grpc-name-resolver`
- DL shading process is now performed (before it didn't build any jar)
- Groovy tests (`tests:backward-compat:*`) now are triggered by the build/tests itself; with Maven, there is a "runner" project (`tests/integration-tests-base-groovy`); in Gradle is useless so it is skipped
### Test
- Both `bin/bookkeper standalone` and `bin/bookkeper_gradle standalone` work locally
- Tests are passing locally
Master Issue: #2849
Reviewers: Henry Saputra <hsaputra@apache.org>, Prashant Kumar <None>
This closes #2850 from nicoloboschi/fix/2849/gradle and squashes the following commits:
00b49f4 [Nicolò Boschi] Fix common_gradle.sh regex
bd739fd [Nicolò Boschi] fix sh tests
43230ba [Nicolò Boschi] revert sh files. Avoid to modify maven files, create gradle versions to faciltate migration
d1f95e4 [Nicolò Boschi] fix shaded deps
bcab40d [Nicolò Boschi] fix build
5fd0341 [Nicolò Boschi] fix build
0082e0e [Nicolò Boschi] fix build
2c32ac1 [Nicolò Boschi] fixes
3bc0b26 [Nicolò Boschi] bookkeeper-server-shaded-tests
ba89132 [Nicolò Boschi] shaded tests
6d39e33 [Nicolò Boschi] sh tests
e0032bc [Nicolò Boschi] actually run arquillian groovy tests
08dcc39 [Nicolò Boschi] backwards
2361f79 [Nicolò Boschi] hierarchical-ledger-manager
8388e11 [Nicolò Boschi] current-server-old-clients
6a24344 [Nicolò Boschi] bc-non-fips
2faca01 [Nicolò Boschi] bk-grpc-name-resolver
991bc11 [Nicolò Boschi] servlet-http-server
675ef7b [Nicolò Boschi] etcd
b1d5e14 [ZhangJian He] A empty implement in EtcdLedgerManagerFactory to let the project can compile (#2845)
bd5c50b [shustsud] Add error handling to readLedgerMetadata in over-replicated ledger GC (#2844)
746f9f6 [Matteo Merli] Remove direct ZK access for Auditor (#2842)
4117200 [ZhangJian He] the compare should be >= instead of > (#2782)
14ef56f [Prashant Kumar] BookieId can not be cast to BookieSocketAddress (#2843)
e10f3fe [ZhangJian He] Forget to close preAllocator log on shutdown (#2819)
53954ca [shustsud] Add ensemble check to over-replicated ledger GC (#2813)
919fdd3 [Prashant Kumar] Issue:2840 Create bookie shellscript for gradle (#2841)
031d168 [gaozhangmin] fix-npe-when-pulsar-ZkBookieRackAffinityMapping-getBookieAddressResolver (#2788)
3dd671c [Prashant Kumar] Migrate bookkeepr-server:test to gradle run unit tests excepts org.apache.bookkeeper.bookie. org.apache.bookkeeper.client org.apache.bookkeeper.replication org.apache.bookkeeper.tls. (#2812)
f6903b8 [Jack Vanlightly] BP-44 USE metrics
a4afaa4 [Matteo Merli] Eliminate direct ZK access in ScanAndCompareGarbageCollector (#2833)
a9b576d [Yunze Xu] Release semaphore when addEntry accepts the same entries (#2832)
148bf22 [Yun Tang] Ensure to release cache during KeyValueStorageRocksDB#closec (#2821)
4dc4260 [gaozhangmin] Heap memory leak problem when ledger replication failed (#2794)
a522fa3 [Raúl Gracia] Issue 2815: Upgrade to log4j2 to get rid of CVE-2019-17571 (#2816)
0465052 [Nicolò Boschi] Upgrade httpclient from 4.5.5 to 4.5.13 (#2793)
594a056 [Raúl Gracia] Issue 2795: Bookkeeper upgrade using Bookie ID may fail due to cookie mismatch (#2796)
354cf37 [Raúl Gracia] Upgraded dependencies with CVEs (#2792)
e413c70 [Raúl Gracia] Issue 2728: Entry Log GC may get blocked when using entryLogPerLedgerEnabled option (#2779)
883231e [pradeepbn] Building bookkeeper with gradle on java11
dlg99
pushed a commit
that referenced
this pull request
Feb 14, 2022
…sOnMetadataServerException occurs in over-replicated ledger GC ### Motivation - Even if an exception other than BKNoSuchLedgerExistsOnMetadataServerException occurs of readLedgerMetadata in over-replicated ledger GC, nothing will be output to the log. (#2844 (comment)) ### Changes - If an exception other than BKNoSuchLedgerExistsOnMetadataServerException occurs in readLedgerMetadata, output information to the log. Reviewers: Andrey Yegorov <None>, Nicolò Boschi <boschi1997@gmail.com> This closes #2873 from shustsud/improved_error_handling
StevenLuMT
pushed a commit
to StevenLuMT/bookkeeper
that referenced
this pull request
Feb 16, 2022
…sOnMetadataServerException occurs in over-replicated ledger GC ### Motivation - Even if an exception other than BKNoSuchLedgerExistsOnMetadataServerException occurs of readLedgerMetadata in over-replicated ledger GC, nothing will be output to the log. (apache#2844 (comment)) ### Changes - If an exception other than BKNoSuchLedgerExistsOnMetadataServerException occurs in readLedgerMetadata, output information to the log. Reviewers: Andrey Yegorov <None>, Nicolò Boschi <boschi1997@gmail.com> This closes apache#2873 from shustsud/improved_error_handling
Ghatage
pushed a commit
to sijie/bookkeeper
that referenced
this pull request
Jul 12, 2024
…apache#2844) ### Motivation For each ledger whose metadata is not in ZK, following stack trace will be output: ``` 15:30:17.925 [GarbageCollectorThread-11-1] ERROR o.a.b.b.ScanAndCompareGarbageCollector - Exception when iterating through the ledgers to check for over-replication java.util.concurrent.ExecutionException: org.apache.bookkeeper.client.BKException$BKNoSuchLedgerExistsException: No such ledger exists at java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:357) at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:1908) at org.apache.bookkeeper.bookie.ScanAndCompareGarbageCollector.removeOverReplicatedledgers(ScanAndCompareGarbageCollector.java:199) at org.apache.bookkeeper.bookie.ScanAndCompareGarbageCollector.gc(ScanAndCompareGarbageCollector.java:120) at org.apache.bookkeeper.bookie.GarbageCollectorThread.doGcLedgers(GarbageCollectorThread.java:372) at org.apache.bookkeeper.bookie.GarbageCollectorThread.runWithFlags(GarbageCollectorThread.java:323) at org.apache.bookkeeper.bookie.GarbageCollectorThread.safeRun(GarbageCollectorThread.java:301) at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.bookkeeper.client.BKException$BKNoSuchLedgerExistsException: No such ledger exists at org.apache.bookkeeper.meta.AbstractZkLedgerManager$3.processResult(AbstractZkLedgerManager.java:397) at org.apache.bookkeeper.zookeeper.ZooKeeperClient$19$1.processResult(ZooKeeperClient.java:994) at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:575) at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:508) ``` It is noisy, makes the size of log files large and finally causes OOM during log rotation. So we should suppress the stacktrace. (This problem is due to [apache#2813](apache#2813).) ### Changes Add error handling to readLedgerMetadata in over-replicated ledger GC in order to suppress the stacktrace.
Ghatage
pushed a commit
to sijie/bookkeeper
that referenced
this pull request
Jul 12, 2024
### Motivation
In order to complete migration to Gradle we must build all the subprojects.
### Changes
- Enabled `sh` integration tests with gradle, located in `tests/scripts/src/test/bash/gradle`
- Added these modules to the build
- `bookkeeper-http:servlet-http-server`
- `metadata-drivers:etcd`
- `tests:backward-compat:*`
- `tests:shaded:*`
- `stream:bk-grpc-name-resolver`
- DL shading process is now performed (before it didn't build any jar)
- Groovy tests (`tests:backward-compat:*`) now are triggered by the build/tests itself; with Maven, there is a "runner" project (`tests/integration-tests-base-groovy`); in Gradle is useless so it is skipped
### Test
- Both `bin/bookkeper standalone` and `bin/bookkeper_gradle standalone` work locally
- Tests are passing locally
Master Issue: apache#2849
Reviewers: Henry Saputra <hsaputra@apache.org>, Prashant Kumar <None>
This closes apache#2850 from nicoloboschi/fix/2849/gradle and squashes the following commits:
00b49f4 [Nicolò Boschi] Fix common_gradle.sh regex
bd739fd [Nicolò Boschi] fix sh tests
43230ba [Nicolò Boschi] revert sh files. Avoid to modify maven files, create gradle versions to faciltate migration
d1f95e4 [Nicolò Boschi] fix shaded deps
bcab40d [Nicolò Boschi] fix build
5fd0341 [Nicolò Boschi] fix build
0082e0e [Nicolò Boschi] fix build
2c32ac1 [Nicolò Boschi] fixes
3bc0b26 [Nicolò Boschi] bookkeeper-server-shaded-tests
ba89132 [Nicolò Boschi] shaded tests
6d39e33 [Nicolò Boschi] sh tests
e0032bc [Nicolò Boschi] actually run arquillian groovy tests
08dcc39 [Nicolò Boschi] backwards
2361f79 [Nicolò Boschi] hierarchical-ledger-manager
8388e11 [Nicolò Boschi] current-server-old-clients
6a24344 [Nicolò Boschi] bc-non-fips
2faca01 [Nicolò Boschi] bk-grpc-name-resolver
991bc11 [Nicolò Boschi] servlet-http-server
675ef7b [Nicolò Boschi] etcd
b1d5e14 [ZhangJian He] A empty implement in EtcdLedgerManagerFactory to let the project can compile (apache#2845)
bd5c50b [shustsud] Add error handling to readLedgerMetadata in over-replicated ledger GC (apache#2844)
746f9f6 [Matteo Merli] Remove direct ZK access for Auditor (apache#2842)
4117200 [ZhangJian He] the compare should be >= instead of > (apache#2782)
14ef56f [Prashant Kumar] BookieId can not be cast to BookieSocketAddress (apache#2843)
e10f3fe [ZhangJian He] Forget to close preAllocator log on shutdown (apache#2819)
53954ca [shustsud] Add ensemble check to over-replicated ledger GC (apache#2813)
919fdd3 [Prashant Kumar] Issue:2840 Create bookie shellscript for gradle (apache#2841)
031d168 [gaozhangmin] fix-npe-when-pulsar-ZkBookieRackAffinityMapping-getBookieAddressResolver (apache#2788)
3dd671c [Prashant Kumar] Migrate bookkeepr-server:test to gradle run unit tests excepts org.apache.bookkeeper.bookie. org.apache.bookkeeper.client org.apache.bookkeeper.replication org.apache.bookkeeper.tls. (apache#2812)
f6903b8 [Jack Vanlightly] BP-44 USE metrics
a4afaa4 [Matteo Merli] Eliminate direct ZK access in ScanAndCompareGarbageCollector (apache#2833)
a9b576d [Yunze Xu] Release semaphore when addEntry accepts the same entries (apache#2832)
148bf22 [Yun Tang] Ensure to release cache during KeyValueStorageRocksDB#closec (apache#2821)
4dc4260 [gaozhangmin] Heap memory leak problem when ledger replication failed (apache#2794)
a522fa3 [Raúl Gracia] Issue 2815: Upgrade to log4j2 to get rid of CVE-2019-17571 (apache#2816)
0465052 [Nicolò Boschi] Upgrade httpclient from 4.5.5 to 4.5.13 (apache#2793)
594a056 [Raúl Gracia] Issue 2795: Bookkeeper upgrade using Bookie ID may fail due to cookie mismatch (apache#2796)
354cf37 [Raúl Gracia] Upgraded dependencies with CVEs (apache#2792)
e413c70 [Raúl Gracia] Issue 2728: Entry Log GC may get blocked when using entryLogPerLedgerEnabled option (apache#2779)
883231e [pradeepbn] Building bookkeeper with gradle on java11
Ghatage
pushed a commit
to sijie/bookkeeper
that referenced
this pull request
Jul 12, 2024
…sOnMetadataServerException occurs in over-replicated ledger GC ### Motivation - Even if an exception other than BKNoSuchLedgerExistsOnMetadataServerException occurs of readLedgerMetadata in over-replicated ledger GC, nothing will be output to the log. (apache#2844 (comment)) ### Changes - If an exception other than BKNoSuchLedgerExistsOnMetadataServerException occurs in readLedgerMetadata, output information to the log. Reviewers: Andrey Yegorov <None>, Nicolò Boschi <boschi1997@gmail.com> This closes apache#2873 from shustsud/improved_error_handling
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Motivation
For each ledger whose metadata is not in ZK, following stack trace will be output:
It is noisy, makes the size of log files large and finally causes OOM during log rotation.
So we should suppress the stacktrace.
(This problem is due to #2813.)
Changes
Add error handling to readLedgerMetadata in over-replicated ledger GC in order to suppress the stacktrace.