-
Notifications
You must be signed in to change notification settings - Fork 963
Make maxConcurrentZkRequests for gc configurable #2797
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make maxConcurrentZkRequests for gc configurable #2797
Conversation
nicoloboschi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job, I think we should rename the property avoiding to mention Zookkeeper
| protected static final String GC_WAIT_TIME = "gcWaitTime"; | ||
| protected static final String IS_FORCE_GC_ALLOW_WHEN_NO_SPACE = "isForceGCAllowWhenNoSpace"; | ||
| protected static final String GC_OVERREPLICATED_LEDGER_WAIT_TIME = "gcOverreplicatedLedgerWaitTime"; | ||
| protected static final String GC_OVERREPLICATED_LEDGER_MAX_CONCURRENT_ZK_REQUESTS = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since this property is not tightly related to ZK but it's more a logical option, IMHO we should not mention ZK in the name of the property
GC_OVERREPLICATED_LEDGER_MAX_CONCURRENT_REQUESTS ?
|
@nicoloboschi |
nicoloboschi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
eolivelli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm
I left a comment about a small typo
| this.zkLedgersRootPath = ZKMetadataDriverBase.resolveZkLedgersRootPath(conf); | ||
| LOG.info("Over Replicated Ledger Deletion : enabled=" + enableGcOverReplicatedLedger + ", interval=" | ||
| + gcOverReplicatedLedgerIntervalMillis); | ||
| LOG.info("Over Replicated Ledger Deletion : enabled={}, interval={}, maxConcurrentRequest={}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: maxConcurrentRequest > maxConcurrentRequests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eolivelli
can you please review again and merge this PR?
|
rerun failure checks |
1 similar comment
|
rerun failure checks |
### Motivation - Issue is as described in [PR#2797](#2797). > In one day, zookeepers became high cpu usage and disk full. > The cause of this is bookie's gc of overreplicated ledgers. > Gc created/deleted zk nodes under /ledgers/underreplication/locks very frequently and some bookies ran gc at same time. > As a result, zookeepers created a lot of snapshots and became disk full. - I want to reduce the number of lock node creations and deletions in ZK. ### Changes - Add an ensemble check before creating the lock node. This is to reduce the number of lock node creations and deletions in ZK. - ~~If [PR#2797](#2797) was merged, this PR needs to be fixed.~~
### Motivation - Issue is as described in [PR#2797](#2797). > In one day, zookeepers became high cpu usage and disk full. > The cause of this is bookie's gc of overreplicated ledgers. > Gc created/deleted zk nodes under /ledgers/underreplication/locks very frequently and some bookies ran gc at same time. > As a result, zookeepers created a lot of snapshots and became disk full. - I want to reduce the number of lock node creations and deletions in ZK. ### Changes - Add an ensemble check before creating the lock node. This is to reduce the number of lock node creations and deletions in ZK. - ~~If [PR#2797](#2797) was merged, this PR needs to be fixed.~~ (cherry picked from commit 53954ca)
6fd01be to
24c1dd1
Compare
24c1dd1 to
0118a27
Compare
|
PTAL Rebased to newer version of master and resolved conflicts. |
|
rerun failure checks |
eolivelli
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lgtm
### Motivation - Issue is as described in [PR#2797](apache#2797). > In one day, zookeepers became high cpu usage and disk full. > The cause of this is bookie's gc of overreplicated ledgers. > Gc created/deleted zk nodes under /ledgers/underreplication/locks very frequently and some bookies ran gc at same time. > As a result, zookeepers created a lot of snapshots and became disk full. - I want to reduce the number of lock node creations and deletions in ZK. ### Changes - Add an ensemble check before creating the lock node. This is to reduce the number of lock node creations and deletions in ZK. - ~~If [PR#2797](apache#2797) was merged, this PR needs to be fixed.~~
Motivation
In one day, zookeepers became high cpu usage and disk full.
The cause of this is bookie's gc of overreplicated ledgers.
Gc created/deleted zk nodes under
/ledgers/underreplication/locksvery frequently and some bookies ran gc at same time.As a result, zookeepers created a lot of snapshots and became disk full.
I want to configure max zk concurrent requests lower than 1000(default) to avoid heavy traffic at a specific time.
Changes