Index merging without garbage #4622

@leventov

Current state

Currently, the data in several partial indexes (or just one, for transformations) is transformed during merging in the following way:

  1. Iterator < TimeAndDims + Object[] metrics (entry in IncrementalIndex) >
    --> sorting dimension value indexes, aka unsortedToSorted
  2. Iterator < Rowboat (Object[] dims, Object[] metrics) >
    --> optionally, reordering dims
  3. Iterator < Rowboat (Object[] dims, Object[] metrics) >
    // here the array elements are the same objects as at the previous step, but the Object[] arrays are new if reordering of dims and/or metrics is actually required
    --> another reindexing, based on the merged dictionary
  4. Iterator < Rowboat (Object[] dims, Object[] metrics) >
    --> final merge.

Here, Object[] elements are either int[] (DimensionSelector), Long, Double or Float (numeric ColumnValueSelectors, correspondingly).

So in the process of merge, each entry generates 2-3 extra Rowboat objects, 4-7 new Object[] arrays, and N (the number of string dimensions) * 2 new int[] arrays, and new boxed primitive objects, if merging is done with QueryableIndex as a source.
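To make the allocation pattern concrete, here is a minimal sketch of one reindexing step done the allocating way. `AllocatingRow` and `AllocatingReindex` are hypothetical, simplified stand-ins written for this illustration, not Druid's actual `Rowboat` or merger code; the `conversions` table is likewise an assumption about how a dictionary id mapping could be represented.

```java
// Hypothetical, simplified stand-in for a Rowboat-like row; not Druid's actual class.
final class AllocatingRow {
    final Object[] dims;    // each element is an int[] of dictionary ids
    final Object[] metrics; // each element is a boxed Long, Double or Float

    AllocatingRow(Object[] dims, Object[] metrics) {
        this.dims = dims;
        this.metrics = metrics;
    }
}

final class AllocatingReindex {
    // conversions[d][oldId] is the id of the same value in the target dictionary of dimension d.
    static AllocatingRow reindex(AllocatingRow row, int[][] conversions) {
        Object[] newDims = new Object[row.dims.length];    // one new Object[] per row
        for (int d = 0; d < row.dims.length; d++) {
            int[] oldIds = (int[]) row.dims[d];
            int[] newIds = new int[oldIds.length];          // one new int[] per dimension per row
            for (int i = 0; i < oldIds.length; i++) {
                newIds[i] = conversions[d][oldIds[i]];
            }
            newDims[d] = newIds;
        }
        return new AllocatingRow(newDims, row.metrics);     // one new row object per row
    }
}
```

Every call allocates a row object, an `Object[]` and one `int[]` per dimension, and the pipeline above runs a step like this more than once per entry.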

Garbage-free approach

Rowboat contains an array of ColumnValueSelector objects, representing the stream of dimensions, and another array of ColumnValueSelector objects, representing the stream of metrics, both "under cursor". When QueryableIndex is used as the source for merging, the existing Cursor and ColumnValueSelectorFactory infrastructure is reused with minimal modifications.

The 1->2 and 3->4 conversions, as described above, are implemented as ColumnValueSelector transformations, without creating new arrays, boxed primitives, etc. The 2->3 transformation is essentially a no-op: it creates a Rowboat object with the array of ColumnValueSelectors ordered differently.
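A rough sketch of the idea, under simplifying assumptions: `IdSelector`, `ColumnIdSelector`, `RemappingIdSelector` and `SelectorRow` are hypothetical names invented for this illustration and are much narrower than Druid's real `ColumnValueSelector` and `Rowboat`. The point is only that id conversion happens lazily on read, and that reordering is a one-time rearrangement of selector references.

```java
// Hypothetical minimal selector: exposes the dictionary ids of the row currently "under cursor".
interface IdSelector {
    int size();
    int get(int i);
}

// Source selector backed by a column of rows and a mutable cursor position.
final class ColumnIdSelector implements IdSelector {
    private final int[][] column;
    private int position;

    ColumnIdSelector(int[][] column) {
        this.column = column;
    }

    void advanceTo(int newPosition) {
        position = newPosition;
    }

    @Override
    public int size() {
        return column[position].length;
    }

    @Override
    public int get(int i) {
        return column[position][i];
    }
}

// Reindexing as a selector transformation: ids are converted on read, no per-row arrays.
final class RemappingIdSelector implements IdSelector {
    private final IdSelector delegate;
    private final int[] conversion; // conversion[oldId] is the id in the target dictionary

    RemappingIdSelector(IdSelector delegate, int[] conversion) {
        this.delegate = delegate;
        this.conversion = conversion;
    }

    @Override
    public int size() {
        return delegate.size();
    }

    @Override
    public int get(int i) {
        return conversion[delegate.get(i)];
    }
}

// Row view holding selectors instead of values.
final class SelectorRow {
    final IdSelector[] dims;

    SelectorRow(IdSelector[] dims) {
        this.dims = dims;
    }

    // Reordering allocates once at setup time, not once per row.
    static SelectorRow reordered(SelectorRow source, int[] newOrder) {
        IdSelector[] dims = new IdSelector[newOrder.length];
        for (int i = 0; i < newOrder.length; i++) {
            dims[i] = source.dims[newOrder[i]];
        }
        return new SelectorRow(dims);
    }
}
```

After the selector chain is built, advancing the cursor with `advanceTo` changes what every wrapped selector returns without allocating anything per row. Metrics could be handled the same way with primitive-returning selectors, which would avoid boxing.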
