Skip to content

[Proposal] BufferAggregator support for growable sketches. #8126

@himanshug

Description

@himanshug

Motivation

  1. From IncrementalIndex generally overestimates theta sketch size #6743
    "Theta sketches have a very large max size by default, relative to typical row sizes (about 250KB with "size" set to the default of 16384). The ingestion-time row size estimator (getMaxBytesPerRowForAggregators in OnheapIncrementalIndex) uses this figure to estimate row sizes when theta sketches are used at ingestion time, leading to way more spills than is reasonable. It would be better to use an estimate based more on actual current size. I'm not sure how to get this, though."

  2. From [Proposal] Resizable Buffer in BufferAggregator #2963
    "In case of complex aggregators like thetaSketch, more than 80% of the time sketches don't grow to full capacity but query processing still reserves the full max size. The idea is to support use of a resizable aggregator by BufferAggregator so that maximum space is not reserved but BufferAggregator should be able to re-allocate the buffer on demand."

  3. Some sketches e.g. doublesSketch do not have an upper bound on size and can't provide a correct number for AggregatorFactory.getMaxIntermediateSize() . Current workaround, they use, is to use some number and then fall back to on-heap objects if sketch grows bigger than that.

Proposed changes

Add following methods to AggregatorFactory

  /**
   * Does BufferAggregator support handling of varying ByteBuffer sizes by overriding
   * {@link BufferAggregator#aggregate(ByteBuffer, int, int)}
   * @return
   */
  public boolean isDynamicallyResizable()
  {
    return getMinIntermediateSize() < getMaxIntermediateSize();
  }
  
  /**
   * Start size of ByteBuffer to be used with BufferAggregator.
   * @return
   */
  public int getMinIntermediateSize()
  {
    return getMaxIntermediateSize();
  }

Add following methods to BufferAggregator

  /**
   * Returns false if aggregation requires a bigger buffer than capacity arg or true.
   * Return status must be used exclusively to signal "low memory in buffer" condition and
   * nothing else.
   */
  default boolean aggregate(ByteBuffer buff, int position, int capacity)
  {
    aggregate(buff, position);
    return true;
  }

  // Following methods are equivalent of old methods with same name except they provide access to capacity
  // of ByteBuffer which is assumed to be AggregatorFactory.getMaxIntermediateSize() by older methods.

  default void init(ByteBuffer buff, int position, int capacity)
  {
    init(buff, position);
  }

  default void relocate(ByteBuffer oldBuff, int oldPosition, int oldCapacity, ByteBuffer newwBuff, int newwPosition, int newwCapacity)
  {
    relocate(oldPosition, newwPosition, oldBuff, newwBuff);
  }

  default Object get(ByteBuffer buff, int position, int capacity)
  {
    return get(buff, position);
  }

  default float getFloat(ByteBuffer buff, int position, int capacity)
  {
    return getFloat(buff, position);
  }

  default double getDouble(ByteBuffer buff, int position, int capacity)
  {
    return getDouble(buff, position);
  }

  default long getLong(ByteBuffer buff, int position, int capacity)
  {
    return getLong(buff, position);
  }

  default boolean isNull(ByteBuffer buff, int position, int capacity)
  {
    return isNull(buff, position);
  }

Update all of Druid core code to remove usage of Aggregator and use BufferAggregator in all places using the newly introduced methods. For example, see the changes made to OnheapIncrementalIndex in #8127 . Once that is done in all places, Aggregator interface and AggregatorFactory.factorize(ColumnSelectorFactory) and AggregatorFactory.getMaxIntermediateSize() can be removed.

At least BufferAggregator that work on top of sketches of variable sizes should be updated to implement newly introduced methods. In those extensions, AggregatorFactory.getMinIntermediateSize() should return a value that can be overridden by user per query/indexing to allow fine tuning.

Following interfaces and classes are introduced to aid usage of new methods in aggregator extension.

public interface MemoryAllocator
{
  BufferHolder allocate(int capacity);
  void free(BufferHolder bh);
}

public interface BufferHolder
{
  int position();
  int capacity();
  ByteBuffer get();
}

public class SimpleOnHeapMemoryAllocator implements MemoryAllocator
{
  @Override
  public BufferHolder allocate(int capacity)
  {
    return new SimplerBufferHolder(ByteBuffer.allocate(capacity));
  }

  @Override
  public void free(BufferHolder ignored)
  {

  }
  
  // This would be used in OnHeapIncrementalIndex
  private static class SimplerBufferHolder implements BufferHolder
  {
    private final ByteBuffer bb;

    public SimplerBufferHolder(ByteBuffer bb)
    {
      this.bb = bb;
    }

    @Override
    public int position()
    {
      return 0;
    }

    @Override
    public int capacity()
    {
      return bb.capacity();
    }

    @Override
    public ByteBuffer get()
    {
      return bb;
    }
  }
}

Rationale

#2963

Operational impact

None

Test plan (optional)

Unit Tests should cover the changes.

Future Work

I think MemoryAllocator interface might change a bit when we write code to use it in off-heap use cases in querying.
With these changes we can possibly use off-heap memory for aggregators in IncrementalIndex too.
This would also enable removal of v1 groupBy implementation that is kept around due to its usage of Aggregator that allowed onheap growable sketches.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions