Theta sketch - Concurrent union implementation #263

@Eshcar

The implementation of the concurrent theta sketch is based on two main design choices:

  1. Updating threads write into a local buffer, which is propagated in the background to a shared sketch.
  2. Query threads read from a snapshot created by the shared sketch, so they always see a consistent state of the shared sketch.

More details can be found at https://datasketches.github.io/docs/Theta/ConcurrentThetaSketch.html
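
To make the two choices concrete, here is a minimal, self-contained sketch of the pattern. It is not the library code: `ToySharedSketch` and `ToyLocalBuffer` are hypothetical names, and an exact set of values stands in for a real theta sketch. It only illustrates local buffering, background propagation, and reads from a published snapshot.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Toy stand-in for the shared sketch: it tracks exact distinct values, not a real theta sketch.
final class ToySharedSketch {
  private final Set<Long> state = new HashSet<>();  // touched only by the propagator thread
  private volatile long snapshot = 0;               // published count that query threads read
  private final ExecutorService propagator = Executors.newSingleThreadExecutor();

  // Merges a batch in the background, then publishes a fresh snapshot.
  void propagate(Set<Long> batch) {
    propagator.execute(() -> {
      state.addAll(batch);
      snapshot = state.size();
    });
  }

  // Queries never touch the mutable state, only the last published snapshot.
  long getEstimate() {
    return snapshot;
  }

  // Stops accepting batches and waits for pending propagations (demo helper).
  void shutdownAndDrain() {
    propagator.shutdown();
    try {
      propagator.awaitTermination(10, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}

// Per-thread buffer: absorbs updates locally and hands full batches to the shared sketch.
final class ToyLocalBuffer {
  private final ToySharedSketch shared;
  private final int capacity;
  private Set<Long> buffer = new HashSet<>();

  ToyLocalBuffer(ToySharedSketch shared, int capacity) {
    this.shared = shared;
    this.capacity = capacity;
  }

  void update(long value) {
    buffer.add(value);
    if (buffer.size() >= capacity) {
      flush();
    }
  }

  // Hands the current batch over and starts a new one, so a batch is never mutated after hand-off.
  void flush() {
    if (!buffer.isEmpty()) {
      shared.propagate(buffer);
      buffer = new HashSet<>();
    }
  }
}
```

In this toy model a query may lag behind the latest updates until the pending batches are propagated, which is the trade-off the design accepts in exchange for cheap updates.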

This issue discusses 3 design alternatives for implementing a concurrent union operation.

Design option I: This design works very similarly to the concurrent theta sketch. Updates go only to the local buffers, and queries go only to the shared union object.

  1. ConcurrentSharedUnionImpl extends Union and overrides all update methods to throw UnsupportedException.
  2. ConcurrentHeapUnionBuffer extends Union and overrides the getResult and getByteArray methods to throw UnsupportedException.
  3. Add 2 methods to SetOperationBuilder:
    - buildShared takes a ConcurrentSharedThetaSketch and returns a ConcurrentSharedUnionImpl
    - buildLocal takes a ConcurrentSharedUnionImpl and returns a ConcurrentHeapUnionBuffer
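
The shape of option I could look roughly like the following model. Everything here is a hypothetical stand-in, not the proposed classes: a "sketch" is modeled as a plain set of hashes, the names carry a `Model` suffix, and Java's standard `UnsupportedOperationException` is assumed for the rejected calls.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified stand-in for Union: a "sketch" is just a set of hashes.
abstract class UnionModelI {
  abstract void update(Set<Long> sketch);
  abstract Set<Long> getResult();
}

// Item 1: the shared union serves queries and rejects direct updates.
final class SharedUnionModelI extends UnionModelI {
  private final Set<Long> hashes = ConcurrentHashMap.newKeySet();

  SharedUnionModelI(Set<Long> sharedSketch) {
    hashes.addAll(sharedSketch);
  }

  @Override
  void update(Set<Long> sketch) {
    throw new UnsupportedOperationException("update through a local buffer");
  }

  @Override
  Set<Long> getResult() {
    return new HashSet<>(hashes);
  }

  // Internal path used only by local buffers.
  void propagate(Set<Long> batch) {
    hashes.addAll(batch);
  }
}

// Item 2: the local buffer accepts updates and rejects queries.
final class LocalUnionBufferModelI extends UnionModelI {
  private final SharedUnionModelI shared;

  LocalUnionBufferModelI(SharedUnionModelI shared) {
    this.shared = shared;
  }

  @Override
  void update(Set<Long> sketch) {
    shared.propagate(sketch);  // a real buffer would batch and propagate in the background
  }

  @Override
  Set<Long> getResult() {
    throw new UnsupportedOperationException("query the shared union");
  }
}

// Item 3: the two builder methods.
final class SetOperationBuilderModelI {
  static SharedUnionModelI buildShared(Set<Long> sharedSketch) {
    return new SharedUnionModelI(sharedSketch);
  }

  static LocalUnionBufferModelI buildLocal(SharedUnionModelI shared) {
    return new LocalUnionBufferModelI(shared);
  }
}
```

The cost of this option is visible in the model: both classes inherit the full Union surface, and each has to disable half of it at runtime.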

Design option II: This design has some advantages over the first alternative, mainly simplicity. However, it means the “backend” is a shared sketch, and the union object is queried through one of the local buffers, which is available only to the updating threads. Question: is this a reasonable setting?

  1. ConcurrentHeapUnionBuffer extends Union and supports all methods, both updates and queries.
  2. Add 1 method to SetOperationBuilder:
    - buildLocal takes a ConcurrentSharedThetaSketch and returns a ConcurrentHeapUnionBuffer
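
A rough model of option II, under the same caveats: the class names are hypothetical, a set of hashes stands in for the shared sketch, and the flush threshold and the choice to flush before answering a query are assumptions made for illustration, not part of the proposal.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for ConcurrentSharedThetaSketch: a concurrent set of hashes.
final class SharedSketchModelII {
  private final Set<Long> hashes = ConcurrentHashMap.newKeySet();

  void merge(Set<Long> batch) {
    hashes.addAll(batch);
  }

  // Weakly consistent copy; a real shared sketch would serve a consistent snapshot.
  Set<Long> copy() {
    return new HashSet<>(hashes);
  }
}

// Option II: one union class per updating thread, handling both updates and queries.
final class LocalUnionBufferModelII {
  private static final int FLUSH_THRESHOLD = 4;       // assumed batch size
  private final SharedSketchModelII backend;
  private final Set<Long> pending = new HashSet<>();  // confined to the owning thread

  LocalUnionBufferModelII(SharedSketchModelII backend) {
    this.backend = backend;
  }

  void update(Set<Long> sketch) {
    pending.addAll(sketch);
    if (pending.size() >= FLUSH_THRESHOLD) {
      flush();
    }
  }

  // Assumption: a query first pushes this buffer's own pending hashes to the backend.
  Set<Long> getResult() {
    flush();
    return backend.copy();
  }

  private void flush() {
    backend.merge(pending);
    pending.clear();
  }
}

final class SetOperationBuilderModelII {
  static LocalUnionBufferModelII buildLocal(SharedSketchModelII shared) {
    return new LocalUnionBufferModelII(shared);
  }
}
```

In this model, a query through one buffer does not see hashes still pending in another thread's buffer, which is one reason the "is this a reasonable setting?" question matters.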

Design option III: This design is very similar to the second option, except that users cannot query the local buffer. It assumes that queries can be delegated to the ConcurrentSharedThetaSketch “backend”, which is available somewhere in the system. Question: is this a reasonable setting?

  1. ConcurrentHeapUnionBuffer extends Union and overrides the getResult and getByteArray methods to throw UnsupportedException.
  2. Add 1 method to SetOperationBuilder:
    - buildLocal takes a ConcurrentSharedThetaSketch and returns a ConcurrentHeapUnionBuffer
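
For comparison, option III might be modeled as below. As before, the names are hypothetical, sets of hashes stand in for sketches, and `UnsupportedOperationException` is assumed. The point of the sketch is that the query side holds only a reference to the backend and never touches a local buffer.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for the ConcurrentSharedThetaSketch "backend".
final class SharedSketchModelIII {
  private final Set<Long> hashes = ConcurrentHashMap.newKeySet();

  void merge(Set<Long> batch) {
    hashes.addAll(batch);
  }

  // Query path used by whoever holds the backend reference.
  Set<Long> copy() {
    return new HashSet<>(hashes);
  }
}

// Option III: the local buffer is write-only from the user's point of view.
final class LocalUnionBufferModelIII {
  private final SharedSketchModelIII backend;

  LocalUnionBufferModelIII(SharedSketchModelIII backend) {
    this.backend = backend;
  }

  void update(Set<Long> sketch) {
    backend.merge(sketch);  // a real buffer would batch and propagate in the background
  }

  Set<Long> getResult() {
    throw new UnsupportedOperationException("query the shared sketch backend instead");
  }
}

final class SetOperationBuilderModelIII {
  static LocalUnionBufferModelIII buildLocal(SharedSketchModelIII shared) {
    return new LocalUnionBufferModelIII(shared);
  }
}
```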
