Current state
Currently, the data in several partial indexes (or just one, for transformations) is transformed during merge in the following way:
- Iterator < TimeAndDims + Object[] metrics (entry in IncrementalIndex) >
--> sorting dimension value indexes, aka unsortedToSorted
- Iterator < Rowboat (Object[] dims, Object[] metrics) >
--> optionally, reordering dims
- Iterator < Rowboat (Object[] dims, Object[] metrics) >
// here the array elements are the same objects as at the previous step, but the Object[] arrays are new if reordering of dims and/or metrics is actually required
--> another reindexing, based on the merged dictionary
- Iterator < Rowboat (Object[] dims, Object[] metrics) >
--> final merge.
Here, Object[] elements are either int[] (DimensionSelector), Long, Double or Float (numeric ColumnValueSelectors, correspondingly).
So in the process of merge, each entry generates 2-3 extra Rowboat objects, 4-7 new Object[] arrays, and N (the number of string dimensions) * 2 new int[] arrays, and new boxed primitive objects, if merging is done with QueryableIndex as a source.
Garbage-free approach
Rowboat contains an array of ColumnValueSelector objects, representing the stream of dimensions, and another array of ColumnValueSelector objects, representing the stream of metrics, both "under cursor". When QueryableIndex is used as the source for merging, the existing Cursor and ColumnValueSelectorFactory infrastructure is reused with minimal modifications.
The 0->1 and 2->3 conversions, as described above, are implemented as ColumnValueSelector transformations, without creating new arrays, boxed primitives, etc. The 1->2 transformation is essentially a no-op: create a Rowboat object with an array of ColumnValueSelectors, ordered differently.
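To illustrate the idea of a conversion expressed as a selector transformation, here is a minimal sketch. It does not use the actual Druid interfaces: `IntValueSelector`, `ArrayCursor`, `RemappingSelector` and `remapAll` are hypothetical names invented for this example, and the conversion table stands in for something like unsortedToSorted. The point is only that the wrapper is allocated once per column, and reading a remapped value per row creates no arrays or boxed objects.

```java
public class SelectorRemapSketch {
    /** Hypothetical stand-in for a dimension selector positioned "under cursor". */
    interface IntValueSelector {
        int getInt();
    }

    /** Hypothetical cursor over the dictionary ids of one dimension; advancing allocates nothing. */
    static final class ArrayCursor implements IntValueSelector {
        private final int[] ids;
        private int position = 0;

        ArrayCursor(int[] ids) {
            this.ids = ids;
        }

        boolean isDone() {
            return position >= ids.length;
        }

        void advance() {
            position++;
        }

        @Override
        public int getInt() {
            return ids[position];
        }
    }

    /** Remaps ids through a conversion table (assumed to play the role of unsortedToSorted). */
    static final class RemappingSelector implements IntValueSelector {
        private final IntValueSelector delegate;
        private final int[] conversion;

        RemappingSelector(IntValueSelector delegate, int[] conversion) {
            this.delegate = delegate;
            this.conversion = conversion;
        }

        @Override
        public int getInt() {
            // Pure lookup on primitives: no new array, no boxing.
            return conversion[delegate.getInt()];
        }
    }

    /** Drives the cursor and collects remapped ids; the output array exists only for demonstration. */
    static int[] remapAll(int[] ids, int[] conversion) {
        ArrayCursor cursor = new ArrayCursor(ids);
        IntValueSelector remapped = new RemappingSelector(cursor, conversion);
        int[] out = new int[ids.length];
        int i = 0;
        while (!cursor.isDone()) {
            out[i++] = remapped.getInt();
            cursor.advance();
        }
        return out;
    }
}
```

Under these assumptions, a second remapping (for example, against a merged dictionary) would be another wrapper around the first selector, and reordering dims would amount to placing the same selector objects into an array in a different order.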