[Issue 12812] Introduce ZKBatchedMetadataStore. #12975
Conversation
/pulsarbot run-failure-checks
Force-pushed from 908fa55 to d6b7667
Force-pushed from d6b7667 to 86e8697
/pulsarbot run-failure-checks
/pulsarbot run-failure-checks
merlimat left a comment
This implements batching for the getData/getChildren calls, but not for create and setData, which would also be very interesting, even though we need to solve the problem of creating the parent nodes.
I'll submit a different PR with the logic for the write/create path.
import org.apache.zookeeper.ZooDefs;

@Slf4j
public class ZKBatchedMetadataStore extends ZKMetadataStore implements AsyncCallback.MultiCallback {
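The class above implements AsyncCallback.MultiCallback so that an entire batch of reads can be submitted through ZooKeeper's multi API and completed from a single callback. The following is a minimal sketch of that pattern, assuming ZooKeeper 3.6+ (which allows read-only getData/getChildren ops inside multi); the ReadOp holder and the surrounding class are hypothetical and not code from this PR:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

import org.apache.zookeeper.AsyncCallback;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.Op;
import org.apache.zookeeper.OpResult;
import org.apache.zookeeper.ZooKeeper;

class BatchedReadSketch implements AsyncCallback.MultiCallback {

    /** Hypothetical holder for one queued read and its pending future. */
    static class ReadOp {
        final String path;
        final CompletableFuture<byte[]> future = new CompletableFuture<>();

        ReadOp(String path) {
            this.path = path;
        }
    }

    /** Submit a whole batch of reads as a single ZooKeeper multi request. */
    void flush(ZooKeeper zk, List<ReadOp> batch) {
        List<Op> ops = new ArrayList<>(batch.size());
        batch.forEach(r -> ops.add(Op.getData(r.path)));
        // The batch is passed as the callback context so results can be matched by index.
        zk.multi(ops, this, batch);
    }

    @Override
    @SuppressWarnings("unchecked")
    public void processResult(int rc, String path, Object ctx, List<OpResult> results) {
        List<ReadOp> batch = (List<ReadOp>) ctx;
        if (rc != KeeperException.Code.OK.intValue() || results == null) {
            // A real implementation could fall back to issuing the reads one by one here.
            KeeperException failure = KeeperException.create(KeeperException.Code.get(rc));
            batch.forEach(r -> r.future.completeExceptionally(failure));
            return;
        }
        for (int i = 0; i < results.size(); i++) {
            OpResult res = results.get(i);
            if (res instanceof OpResult.GetDataResult) {
                batch.get(i).future.complete(((OpResult.GetDataResult) res).getData());
            } else if (res instanceof OpResult.ErrorResult) {
                int err = ((OpResult.ErrorResult) res).getErr();
                batch.get(i).future.completeExceptionally(
                        KeeperException.create(KeeperException.Code.get(err)));
            }
        }
    }
}
```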
Instead of extending the ZKMetadataStore, I believe we should have a common abstract class that implements the batching logic, which could be shared across multiple implementations.
I wrote an AbstractBatchedMetadataStore at first, but ran into a couple of issues:
- We have to use two different batching queues for ZooKeeper multi operations, one for reads and one for writes, but I believe other metadata store implementations (e.g. Memory, RocksDB, or a possible etcd implementation) don't need that split.
- Not much would be left in AbstractBatchedMetadataStore; most of the logic is wrapped in BatchWorker, leaving perhaps only fallbackToSingleOps.
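For reference, a rough sketch of the kind of shared base class being discussed, with the read/write queue split and a fallbackToSingleOps hook kept abstract; the class and method names here are illustrative and are not the AbstractBatchedMetadataStore mentioned above:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

/** Illustrative only: a shared batching base that concrete stores could extend. */
abstract class BatchedMetadataStoreBase {

    /** One queued operation; concrete stores decide what it carries. */
    interface BatchedOp {
        boolean isRead();
    }

    // Separate queues so that, for ZooKeeper, read batches and write batches
    // end up in different multi() calls; other stores could ignore the split.
    private final Queue<BatchedOp> readQueue = new ConcurrentLinkedQueue<>();
    private final Queue<BatchedOp> writeQueue = new ConcurrentLinkedQueue<>();

    void enqueue(BatchedOp op) {
        (op.isRead() ? readQueue : writeQueue).add(op);
    }

    /** Turn a drained batch of reads into one backend call. */
    protected abstract void flushReads(Queue<BatchedOp> reads);

    /** Turn a drained batch of writes into one backend call. */
    protected abstract void flushWrites(Queue<BatchedOp> writes);

    /** If a batched call fails as a whole, re-issue the operations one by one. */
    protected abstract void fallbackToSingleOps(Queue<BatchedOp> ops);
}
```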
protected final boolean metadataStoreBatchingEnable = true;
protected final int metadataStoreBatchingMaxDelayMillis = 5;
protected final int metadataStoreBatchingMaxOperations = 100;
protected final int metadataStoreBatchingMaxSizeKb = 128;
These parameters should be left configurable, in particular the enable/disable flag, as we might want to disable batching if we see bugs.
OK, I will fix this.
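A sketch of what exposing these settings could look like, in the style of Pulsar's @FieldContext-annotated configuration fields; the option names and doc strings below are illustrative and not necessarily those that landed in the final change:

```java
import org.apache.pulsar.common.configuration.FieldContext;

/** Illustrative configuration fields; names are not taken from the PR. */
public class MetadataBatchingConfigSketch {

    @FieldContext(doc = "Whether to batch metadata store read operations")
    private boolean metadataStoreBatchingEnabled = true;

    @FieldContext(doc = "Maximum delay, in milliseconds, before a pending batch is flushed")
    private int metadataStoreBatchingMaxDelayMillis = 5;

    @FieldContext(doc = "Maximum number of operations collected into a single batch")
    private int metadataStoreBatchingMaxOperations = 100;

    @FieldContext(doc = "Maximum payload size, in KB, of a single batch")
    private int metadataStoreBatchingMaxSizeKb = 128;
}
```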
@codelipenghui @hangc0276 FYI.
This is WIP. Adding create and setData on top of this PR is fairly easy. I plan to get a comparison result for batched read operations first, since the earlier ZooKeeper perf results in #12812 show that batched reads give the largest performance improvement.
Fixes #12812
Motivation
See #12812
Modifications
Add class ZKBatchedMetadataStore, which extends ZKMetadataStore and overrides storeGet and getChildrenFromStore. These two kinds of read operations are put into org.apache.pulsar.metadata.impl.batch.BatchWorker#opQueue, where a background thread sends a batch of ops once the queue reaches the target number of operations (set to 100) or has waited long enough (5 ms).
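As a rough illustration of the trigger just described (flush when 100 operations have accumulated or after a 5 ms wait), a minimal worker loop might look like the following; this is a sketch of the idea, not the BatchWorker code from the PR:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Illustrative worker that drains an op queue and flushes it in batches. */
class BatchWorkerSketch<T> implements Runnable {
    private final BlockingQueue<T> opQueue = new LinkedBlockingQueue<>();
    private final int maxOperations = 100;
    private final long maxDelayMillis = 5;
    private final Consumer<List<T>> flush;

    BatchWorkerSketch(Consumer<List<T>> flush) {
        this.flush = flush;
    }

    void submit(T op) {
        opQueue.add(op);
    }

    @Override
    public void run() {
        List<T> batch = new ArrayList<>(maxOperations);
        while (!Thread.currentThread().isInterrupted()) {
            try {
                // Block until at least one op is available, then start the 5 ms window.
                batch.add(opQueue.take());
                long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxDelayMillis);
                while (batch.size() < maxOperations) {
                    long remaining = deadline - System.nanoTime();
                    T next = remaining > 0 ? opQueue.poll(remaining, TimeUnit.NANOSECONDS) : null;
                    if (next == null) {
                        break; // waited long enough: flush whatever has accumulated
                    }
                    batch.add(next);
                }
                flush.accept(new ArrayList<>(batch));
                batch.clear();
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
            }
        }
    }
}
```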
Verifying this change
This change is already covered by existing tests: all test cases in pulsar-metadata.
This change added tests and can be verified as follows:
Does this pull request potentially affect one of the following parts:
If yes was chosen, please highlight the changes
Documentation
Check the box below and label this PR (if you have committer privilege).
Need to update docs?
no-need-doc
Perf optimization.