[Proposal] Resizable Buffer in BufferAggregator #2963

@himanshug

Currently, AggregatorFactory has a method
public abstract int getMaxIntermediateSize()
which returns the maximum number of bytes required to store the intermediate state of the "sketch" object used by the BufferAggregator. This maximum size is used to reserve space in the processing ByteBuffer while processing various queries.
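To make the cost of this reservation concrete, here is a minimal sketch of fixed-size slot arithmetic. FixedSlotLayout and its methods are hypothetical names made up for illustration, not Druid classes, and the sizes in the usage note are illustrative only.

```java
/**
 * Hypothetical illustration (not a Druid class): when every aggregation
 * slot reserves the maximum intermediate size, the number of slots that
 * fit in a processing buffer is fixed by that maximum.
 */
final class FixedSlotLayout {
    private final int bufferCapacity; // bytes in the processing buffer
    private final int slotSize;       // bytes reserved per slot, e.g. getMaxIntermediateSize()

    FixedSlotLayout(int bufferCapacity, int slotSize) {
        if (bufferCapacity <= 0 || slotSize <= 0) {
            throw new IllegalArgumentException("sizes must be positive");
        }
        this.bufferCapacity = bufferCapacity;
        this.slotSize = slotSize;
    }

    /** Number of slots that fit, regardless of how full each slot actually is. */
    int maxSlots() {
        return bufferCapacity / slotSize;
    }

    /** Byte offset of a slot inside the processing buffer. */
    int positionOf(int slotIndex) {
        if (slotIndex < 0 || slotIndex >= maxSlots()) {
            throw new IndexOutOfBoundsException("slot " + slotIndex);
        }
        return slotIndex * slotSize;
    }
}
```

With an assumed 1 MiB buffer and an assumed 16 KiB maximum size, new FixedSlotLayout(1 << 20, 16 * 1024).maxSlots() gives 64 slots, even if most slots only ever hold a few hundred bytes.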

In the case of complex aggregators like thetaSketch, sketches don't grow to full capacity more than 80% of the time, yet query processing still reserves the full max size. The idea is to let BufferAggregator use a resizable buffer, so that the maximum space is not reserved up front and the BufferAggregator can re-allocate the buffer on demand.

A high level implementation is described below.

  1. We would add a new method public int getInitSize() to AggregatorFactory whose default implementation will be to return getMaxIntermediateSize()
  2. BufferAggregator would use ResizableBuffer instead of ByteBuffer. ResizableBuffer can internally manage one or more ByteBuffers and supports an "allocate(int capacity)" method.
  3. Existing BufferAggregator implementations can continue to work as is by extending FixedSizeBufferAggregator instead of implementing BufferAggregator directly. For example, see LongSumBufferAggregator
  4. The ThetaSketch BufferAggregator can implement BufferAggregator directly to provide a resizable-buffer-based implementation. For example, see SketchResizableBufferAggregator
  5. Finally, GroupBy and TopN would need to be updated to take advantage of the getInitSize() method.
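The steps above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the proposed implementation: the proposal specifies just that ResizableBuffer manages one or more ByteBuffers and supports allocate(int capacity). The chunking strategy, the GrowableState class, and the doubling growth policy are all hypothetical.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: hands out regions carved from one or more backing ByteBuffers. */
final class ResizableBuffer {
    private final int chunkSize;
    private final List<ByteBuffer> chunks = new ArrayList<>();

    ResizableBuffer(int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        this.chunkSize = chunkSize;
    }

    /** Returns a zero-filled region of exactly the requested capacity. */
    ByteBuffer allocate(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        ByteBuffer current = chunks.isEmpty() ? null : chunks.get(chunks.size() - 1);
        if (current == null || current.remaining() < capacity) {
            // Assumption: start a new chunk when the current one cannot fit the request.
            current = ByteBuffer.allocate(Math.max(chunkSize, capacity));
            chunks.add(current);
        }
        ByteBuffer view = current.slice(); // starts at the chunk's current position
        view.limit(capacity);
        ByteBuffer region = view.slice();  // capacity() now equals the requested size
        current.position(current.position() + capacity);
        return region;
    }

    int chunkCount() {
        return chunks.size();
    }
}

/** Hypothetical aggregator state: starts at an initial size and grows up to a maximum. */
final class GrowableState {
    private final ResizableBuffer pool;
    private final int maxSize; // plays the role of getMaxIntermediateSize()
    private ByteBuffer region;
    private int used;

    GrowableState(ResizableBuffer pool, int initSize, int maxSize) {
        this.pool = pool;
        this.maxSize = maxSize;
        this.region = pool.allocate(initSize); // plays the role of getInitSize()
    }

    void append(long value) {
        while (used + Long.BYTES > region.capacity()) {
            grow();
        }
        region.putLong(used, value);
        used += Long.BYTES;
    }

    private void grow() {
        int newSize = Math.min(maxSize, region.capacity() * 2);
        if (newSize <= region.capacity()) {
            throw new IllegalStateException("state exceeds max intermediate size");
        }
        ByteBuffer bigger = pool.allocate(newSize);
        for (int i = 0; i < used; i++) {
            bigger.put(i, region.get(i)); // copy existing state into the larger region
        }
        region = bigger; // the old region is simply abandoned in this sketch
    }

    int capacity() {
        return region.capacity();
    }

    long get(int index) {
        return region.getLong(index * Long.BYTES);
    }
}
```

In this sketch a state created with an initial size of 16 bytes and a maximum of 64 bytes grows to 32 and then 64 bytes as longs are appended. Abandoned regions are never reclaimed here; a real implementation would need a policy for reusing or compacting them.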
