The implementation of the concurrent theta sketch is based on two main design choices:
- Updating threads write into a local buffer, which is propagated in the background to a shared sketch.
- Query threads read from a snapshot created by the shared sketch, so they always see a consistent state of the shared sketch.
More details can be found here: https://datasketches.github.io/docs/Theta/ConcurrentThetaSketch.html
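To make the two choices concrete, here is a minimal sketch of the buffer-and-snapshot pattern. It does not use the library: `BufferAndSnapshotSketch`, `SharedSketch`, `LocalBuffer`, `propagate`, `flush` and `distinctCount` are hypothetical names, an exact set stands in for a theta sketch, and propagation runs synchronously on the updating thread rather than in the background.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Illustrative only: exact sets stand in for theta sketches, and propagation
// runs synchronously on the updating thread instead of in the background.
class BufferAndSnapshotSketch {

  // Hypothetical stand-in for the shared sketch.
  static class SharedSketch {
    private final Set<Long> items = new HashSet<>();
    // Immutable snapshot for readers, republished after every propagation.
    private volatile Set<Long> snapshot = Collections.emptySet();

    // Merges a batch from a local buffer, then publishes a fresh snapshot.
    synchronized void propagate(Set<Long> batch) {
      items.addAll(batch);
      snapshot = Collections.unmodifiableSet(new HashSet<>(items));
    }

    // Query threads read only the snapshot, never the set being mutated.
    int distinctCount() {
      return snapshot.size();
    }
  }

  // Hypothetical stand-in for a per-thread local buffer.
  static class LocalBuffer {
    private final SharedSketch shared;
    private final int capacity;
    private final Set<Long> pending = new HashSet<>();

    LocalBuffer(SharedSketch shared, int capacity) {
      this.shared = shared;
      this.capacity = capacity;
    }

    // Buffers the item locally and propagates once the buffer is full.
    void update(long item) {
      pending.add(item);
      if (pending.size() >= capacity) {
        flush();
      }
    }

    // Pushes buffered items to the shared sketch and empties the buffer.
    void flush() {
      if (!pending.isEmpty()) {
        shared.propagate(pending);
        pending.clear();
      }
    }
  }

  public static void main(String[] args) {
    SharedSketch shared = new SharedSketch();
    LocalBuffer buffer = new LocalBuffer(shared, 2);
    buffer.update(1L);
    System.out.println(shared.distinctCount()); // nothing propagated yet
    buffer.update(2L);
    System.out.println(shared.distinctCount()); // buffer was full, so it was flushed
  }
}
```

Readers never lock: they dereference whichever immutable snapshot was published last, which is why a query always sees a consistent (if slightly stale) state.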
This issue discusses 3 design alternatives for implementing a concurrent union operation.
Design option I: This design works very similarly to the design of the concurrent theta sketch. It allows updating only the local buffers, and querying only the shared union object.
- ConcurrentSharedUnionImpl extends Union and overrides all update methods with UnsupportedException
- ConcurrentHeapUnionBuffer extends Union and overrides the getResult and getByteArray methods with UnsupportedException
- Add 2 methods to SetOperationBuilder:
  - buildShared gets a ConcurrentSharedThetaSketch and returns a ConcurrentSharedUnionImpl
  - buildLocal gets a ConcurrentSharedUnionImpl and returns a ConcurrentHeapUnionBuffer
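A rough sketch of how option I could look from the caller's side. Everything inside `OptionOneSketch` is a simplified stand-in rather than the library's code: `Union` is reduced to one update and one query method, exact sets replace theta sketches, `merge`, `snapshot`, `propagate` and `flush` are hypothetical helpers, and "UnsupportedException" is assumed to mean Java's `UnsupportedOperationException`.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Sketch of design option I with hypothetical, simplified stand-in types.
class OptionOneSketch {

  // Stand-in for the library's Union, reduced to two methods.
  abstract static class Union {
    abstract void update(long item);
    abstract Set<Long> getResult();
  }

  // Stand-in for ConcurrentSharedThetaSketch (an exact set, not a sketch).
  static class ConcurrentSharedThetaSketch {
    private final Set<Long> items = new HashSet<>();
    synchronized void merge(Set<Long> batch) { items.addAll(batch); }
    synchronized Set<Long> snapshot() {
      return Collections.unmodifiableSet(new HashSet<>(items));
    }
  }

  // Shared union: queries are allowed, direct updates are not.
  static class ConcurrentSharedUnionImpl extends Union {
    private final ConcurrentSharedThetaSketch sketch;
    ConcurrentSharedUnionImpl(ConcurrentSharedThetaSketch sketch) { this.sketch = sketch; }
    @Override void update(long item) {
      throw new UnsupportedOperationException("update through a local buffer");
    }
    @Override Set<Long> getResult() { return sketch.snapshot(); }
    // Hypothetical hook used by local buffers to push their batches.
    void propagate(Set<Long> batch) { sketch.merge(batch); }
  }

  // Local buffer: updates are allowed, queries are not.
  static class ConcurrentHeapUnionBuffer extends Union {
    private final ConcurrentSharedUnionImpl shared;
    private final Set<Long> pending = new HashSet<>();
    ConcurrentHeapUnionBuffer(ConcurrentSharedUnionImpl shared) { this.shared = shared; }
    @Override void update(long item) { pending.add(item); }
    @Override Set<Long> getResult() {
      throw new UnsupportedOperationException("query the shared union");
    }
    // Synchronous here; the proposal propagates in the background.
    void flush() { shared.propagate(pending); pending.clear(); }
  }

  // Stand-in for the two methods proposed for SetOperationBuilder.
  static class SetOperationBuilder {
    ConcurrentSharedUnionImpl buildShared(ConcurrentSharedThetaSketch sketch) {
      return new ConcurrentSharedUnionImpl(sketch);
    }
    ConcurrentHeapUnionBuffer buildLocal(ConcurrentSharedUnionImpl shared) {
      return new ConcurrentHeapUnionBuffer(shared);
    }
  }
}
```

In this shape each updating thread would hold its own `ConcurrentHeapUnionBuffer`, while query threads would hold only the `ConcurrentSharedUnionImpl`.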
Design option II: This design has some advantages over the first alternative, mainly simplicity. However, it means the “backend” is a shared sketch, and the union object is queried through one of the local buffers, which is available only to the updating threads. Question: is this a reasonable setting?
- ConcurrentHeapUnionBuffer extends Union and supports all methods, both updates and queries
- Add 1 method to SetOperationBuilder:
  - buildLocal gets a ConcurrentSharedThetaSketch and returns a ConcurrentHeapUnionBuffer
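One possible reading of option II, again with hypothetical stand-ins nested in `OptionTwoSketch`. The proposal does not say how a query on a local buffer treats updates that have not been propagated yet; this sketch assumes that `getResult` first pushes the buffer's own pending updates and then reads the backend.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Sketch of design option II with hypothetical, simplified stand-in types.
class OptionTwoSketch {

  // Stand-in for the library's Union, reduced to two methods.
  abstract static class Union {
    abstract void update(long item);
    abstract Set<Long> getResult();
  }

  // Stand-in for ConcurrentSharedThetaSketch (an exact set, not a sketch).
  static class ConcurrentSharedThetaSketch {
    private final Set<Long> items = new HashSet<>();
    synchronized void merge(Set<Long> batch) { items.addAll(batch); }
    synchronized Set<Long> snapshot() {
      return Collections.unmodifiableSet(new HashSet<>(items));
    }
  }

  // Local buffer that supports both updates and queries.
  static class ConcurrentHeapUnionBuffer extends Union {
    private final ConcurrentSharedThetaSketch backend;
    private final Set<Long> pending = new HashSet<>();
    ConcurrentHeapUnionBuffer(ConcurrentSharedThetaSketch backend) { this.backend = backend; }
    @Override void update(long item) { pending.add(item); }
    // Assumption: a query propagates this buffer's pending updates first.
    @Override Set<Long> getResult() {
      backend.merge(pending);
      pending.clear();
      return backend.snapshot();
    }
  }

  // Stand-in for the single method proposed for SetOperationBuilder.
  static class SetOperationBuilder {
    ConcurrentHeapUnionBuffer buildLocal(ConcurrentSharedThetaSketch backend) {
      return new ConcurrentHeapUnionBuffer(backend);
    }
  }
}
```

Under that assumption a buffer's result includes other buffers' updates only once those buffers have propagated them, which is one aspect of the question about whether this setting is reasonable.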
Design option III: This design is very similar to the second option; however, users cannot query the local buffer. It assumes that queries can be delegated to the ConcurrentSharedThetaSketch “backend”, which is available somewhere in the system. Question: is this a reasonable setting?
- ConcurrentHeapUnionBuffer extends Union and overrides the getResult and getByteArray methods with UnsupportedException
- Add 1 method to SetOperationBuilder:
  - buildLocal gets a ConcurrentSharedThetaSketch and returns a ConcurrentHeapUnionBuffer
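For comparison, a sketch of option III under the same simplifications. The types nested in `OptionThreeSketch` are hypothetical stand-ins, `flush` and `snapshot` are invented helpers, propagation is synchronous, and "UnsupportedException" is assumed to mean `UnsupportedOperationException`. The local buffer is write-only, and queries go directly to the backend sketch.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Sketch of design option III with hypothetical, simplified stand-in types.
class OptionThreeSketch {

  // Stand-in for the library's Union, reduced to two methods.
  abstract static class Union {
    abstract void update(long item);
    abstract Set<Long> getResult();
  }

  // Stand-in for ConcurrentSharedThetaSketch; queries are served from here.
  static class ConcurrentSharedThetaSketch {
    private final Set<Long> items = new HashSet<>();
    synchronized void merge(Set<Long> batch) { items.addAll(batch); }
    synchronized Set<Long> snapshot() {
      return Collections.unmodifiableSet(new HashSet<>(items));
    }
  }

  // Write-only local buffer: updates are accepted, queries are rejected.
  static class ConcurrentHeapUnionBuffer extends Union {
    private final ConcurrentSharedThetaSketch backend;
    private final Set<Long> pending = new HashSet<>();
    ConcurrentHeapUnionBuffer(ConcurrentSharedThetaSketch backend) { this.backend = backend; }
    @Override void update(long item) { pending.add(item); }
    @Override Set<Long> getResult() {
      throw new UnsupportedOperationException("query the shared sketch backend");
    }
    // Synchronous here; the proposal propagates in the background.
    void flush() { backend.merge(pending); pending.clear(); }
  }

  // Stand-in for the single method proposed for SetOperationBuilder.
  static class SetOperationBuilder {
    ConcurrentHeapUnionBuffer buildLocal(ConcurrentSharedThetaSketch backend) {
      return new ConcurrentHeapUnionBuffer(backend);
    }
  }
}
```

The caller must keep its own reference to the backend sketch in order to query, which is the assumption the question above asks about.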