
Replace ByteBuffer with int[] for group-by key#3111

Closed
navis wants to merge 1 commit into apache:master from navis:intarray-for-groupby

Conversation

@navis
Contributor

@navis navis commented Jun 8, 2016

RowUpdater in GroupByQueryEngine uses a ByteBuffer for the group-by key. Replacing it with int[] seems faster and uses a smaller footprint.
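As a hedged sketch of the idea (hypothetical names, not the actual RowUpdater code), the two key representations compare roughly like this:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class GroupKeySketch {
    // ByteBuffer-style key: one 4-byte slot per dimension's dictionary id,
    // wrapped in a ByteBuffer object.
    static ByteBuffer byteBufferKey(int[] dimValues) {
        ByteBuffer key = ByteBuffer.allocate(dimValues.length * Integer.BYTES);
        for (int v : dimValues) {
            key.putInt(v);
        }
        key.flip();
        return key;
    }

    // int[]-style key: the dictionary ids themselves, no wrapper object and
    // no position/limit bookkeeping on read.
    static int[] intArrayKey(int[] dimValues) {
        return Arrays.copyOf(dimValues, dimValues.length);
    }
}
```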

ByteBuffer

Benchmark                                     (numSegments)  (rowsPerSegment)  (schemaAndQuery)  Mode  Cnt       Score       Error  Units
GroupByBenchmark.queryMultiQueryableIndex                 4            100000           basic.A  avgt   25  548195.687 ± 23942.989  us/op
GroupByBenchmark.querySingleIncrementalIndex              4            100000           basic.A  avgt   25  322880.320 ± 14818.112  us/op
GroupByBenchmark.querySingleQueryableIndex                4            100000           basic.A  avgt   25  244786.136 ± 13154.709  us/op

int array

Benchmark                                     (numSegments)  (rowsPerSegment)  (schemaAndQuery)  Mode  Cnt       Score       Error  Units
GroupByBenchmark.queryMultiQueryableIndex                 4            100000           basic.A  avgt   25  509435.673 ± 22015.450  us/op
GroupByBenchmark.querySingleIncrementalIndex              4            100000           basic.A  avgt   25  293420.200 ±  7727.087  us/op
GroupByBenchmark.querySingleQueryableIndex                4            100000           basic.A  avgt   25  199744.396 ±  9634.891  us/op

@fjy
Contributor

fjy commented Jun 8, 2016

@navis we are rewriting groupBy right now: #2998

master (ByteBuffer):

    ByteBuffer newKey = key.duplicate();
    newKey.putInt(dimValue);
    unaggregatedBuffers = updateValues(newKey, dims.subList(1, dims.size()));

this patch (int[]):

    int[] newKey = Arrays.copyOf(key, key.length);
Contributor

This changes behavior. In the master impl only one underlying copy of the data exists; here, you are expanding the number of heap objects.

We are already having heap problems on our highly utilized nodes. If it is possible to not create more heap objects during this workflow, that would be nice.
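The distinction being drawn here can be sketched as follows (illustrative, not Druid code): duplicate() allocates only a new view object over the same backing byte[], while Arrays.copyOf allocates a fresh array each call.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class CopySemantics {
    // duplicate() creates a new ByteBuffer *view*; the backing byte[] is
    // shared with the original (the view wrapper itself is still a small
    // new heap object).
    static boolean sharesBacking(ByteBuffer key) {
        return key.duplicate().array() == key.array();
    }

    // Arrays.copyOf allocates a completely new int[] on the heap.
    static boolean sharesBacking(int[] key) {
        return Arrays.copyOf(key, key.length) == key;
    }
}
```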

Contributor Author

I think in most cases, key.duplicate() (1 ref, 5 ints, 1 long, 1 boolean) uses more memory than a simple int[].

Contributor

You'd have to have 7 (I think) dimensions before the heap use of int[] exceeds ByteBuffer. Fewer than that and the int[] would actually be cheaper. Hard to say which is better without data, but my guess is that most real-world groupBy queries are roughly at or below 7 dimensions.
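A back-of-envelope version of this estimate, with the object sizes as stated assumptions (a 64-bit JVM with compressed oops; real sizes vary by JVM and these are guesses, not measurements):

```java
public class KeyFootprint {
    // key.duplicate() allocates only a new HeapByteBuffer wrapper; the
    // backing byte[] is shared, so the cost is roughly constant.
    static long duplicateBytes(int dims) {
        return 48;  // assumed wrapper size, independent of dimension count
    }

    // Arrays.copyOf(key, key.length) allocates an array header plus
    // 4 bytes per int slot.
    static long copyOfBytes(int dims) {
        return 16 + 4L * dims;
    }
}
```

Under these assumptions the int[] copy stays cheaper up to about 7 dimensions and overtakes the duplicate() wrapper around 8, consistent with the comment above.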

Contributor

btw, in v2, groupBy is reworked to avoid this recursive structure, which reduces allocations and stack depth.

@navis
Contributor Author

navis commented Jun 8, 2016

@fjy That patch uses ByteBuffer for the key as well. If this simple replacement has some effect on performance, it will have the same effect in #2998.

@navis
Contributor Author

navis commented Jul 29, 2016

I've done a similar test with HyperLogLogCollector, which also uses ByteBuffer.

Benchmark                                              (numSegments)  (rowsPerSegment)  (schemaAndQuery)  Mode  Cnt        Score       Error  Units
TimeseriesBenchmark.queryFilteredSingleQueryableIndex              1            750000           basic.A  avgt   25    25100.436 ±  1620.870  us/op
TimeseriesBenchmark.queryMultiQueryableIndex                       1            750000           basic.A  avgt   25   189828.354 ± 12740.761  us/op
TimeseriesBenchmark.querySingleIncrementalIndex                    1            750000           basic.A  avgt   25  1726906.360 ± 91623.333  us/op
TimeseriesBenchmark.querySingleQueryableIndex                      1            750000           basic.A  avgt   25   204556.793 ± 14508.094  us/op

After replacing the ByteBuffer with five fields + a byte[]:

Benchmark                                              (numSegments)  (rowsPerSegment)  (schemaAndQuery)  Mode  Cnt        Score        Error  Units
TimeseriesBenchmark.queryFilteredSingleQueryableIndex              1            750000           basic.A  avgt   25    23485.673 ±   2061.591  us/op
TimeseriesBenchmark.queryMultiQueryableIndex                       1            750000           basic.A  avgt   25   179789.402 ±  12704.263  us/op
TimeseriesBenchmark.querySingleIncrementalIndex                    1            750000           basic.A  avgt   25  1272280.840 ± 100677.017  us/op
TimeseriesBenchmark.querySingleQueryableIndex                      1            750000           basic.A  avgt   25   173549.084 ±  12147.194  us/op

You can see some change in the numbers, especially for IncrementalIndex.
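A hedged guess at what the "five fields + byte[]" layout looks like (field names mirror the HyperLogLogCollector header, but this is illustrative, not the attached patch): the collector's small header lives in plain fields and only the registers stay in a byte[], so register reads skip ByteBuffer position/limit bookkeeping.

```java
public class HllFields {
    // Header values stored as plain fields instead of bytes in a ByteBuffer.
    byte version;
    byte registerOffset;
    short numNonZeroRegisters;
    byte maxOverflowValue;
    short maxOverflowRegister;

    // Only the register block remains a raw byte[].
    byte[] registers;

    HllFields(int numRegisters) {
        this.registers = new byte[numRegisters];
    }

    // Direct array indexing instead of buffer.get(HEADER_SIZE + index).
    int register(int index) {
        return registers[index] & 0xFF;
    }
}
```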

For reference, this is after removing the HLL aggregator:

Benchmark                                              (numSegments)  (rowsPerSegment)  (schemaAndQuery)  Mode  Cnt       Score       Error  Units
TimeseriesBenchmark.queryFilteredSingleQueryableIndex              1            750000           basic.A  avgt   25   23912.216 ±  1884.648  us/op
TimeseriesBenchmark.queryMultiQueryableIndex                       1            750000           basic.A  avgt   25   81740.830 ±  7043.293  us/op
TimeseriesBenchmark.querySingleIncrementalIndex                    1            750000           basic.A  avgt   25  484488.713 ± 20284.775  us/op
TimeseriesBenchmark.querySingleQueryableIndex                      1            750000           basic.A  avgt   25   79206.648 ±  5434.338  us/op

@gianm
Contributor

gianm commented Jul 29, 2016

@navis The HLL effect looks pretty big, that's cool! What change did you make exactly?

@gianm
Contributor

gianm commented Jul 29, 2016

This change looks good to me assuming we can get consensus on the comment chain in #3111 (comment).

@navis
Contributor Author

navis commented Jul 30, 2016

hll-array.txt

@gianm I've attached the patch I used. It was just done for fun, not intended as a PR.

@navis
Contributor Author

navis commented Jul 30, 2016

I think ByteBuffer is the preferred way of using memory in Druid. Where it can be backed by off-heap memory via configuration, it's reasonable to keep it that way. But I saw some cases where it is never used with off-heap memory, and got curious whether replacing it with a simpler format (int[] or byte[]) affects performance. I don't have a strong intention for this to be merged into master, but wanted to show that using int[] or byte[] is reasonable in some cases.
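The on-heap/off-heap distinction being referred to, as a minimal sketch: a heap ByteBuffer wraps an ordinary byte[], while a direct ByteBuffer holds native memory outside the heap, which is the case where keeping the ByteBuffer abstraction clearly pays off.

```java
import java.nio.ByteBuffer;

public class BufferKinds {
    // Heap buffer: wraps a byte[] on the Java heap; functionally close to
    // using the array directly.
    static ByteBuffer onHeap(int capacity) {
        return ByteBuffer.allocate(capacity);
    }

    // Direct buffer: native memory outside the heap, not reachable via a
    // plain int[]/byte[] -- the configuration Druid keeps ByteBuffer for.
    static ByteBuffer offHeap(int capacity) {
        return ByteBuffer.allocateDirect(capacity);
    }
}
```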

@jon-wei
Contributor

jon-wei commented Oct 19, 2016

I think this approach would complicate numeric dimensions, since the key would no longer contain only ints.
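To illustrate the complication (hypothetical helper, not Druid code): a long- or double-typed dimension value no longer fits in one int slot, so an int[] key layout would have to split and reassemble wider values.

```java
public class NumericKeySlot {
    // A 64-bit value must occupy two int slots of the key.
    static void putLong(int[] key, int pos, long v) {
        key[pos] = (int) (v >>> 32);
        key[pos + 1] = (int) v;
    }

    static long getLong(int[] key, int pos) {
        return ((long) key[pos] << 32) | (key[pos + 1] & 0xFFFFFFFFL);
    }

    // Round-trip helper: write a long into a two-slot key, then read it back.
    static long roundTrip(long v) {
        int[] key = new int[2];
        putLong(key, 0, v);
        return getLong(key, 0);
    }
}
```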

@navis
Contributor Author

navis commented Oct 24, 2016

@jon-wei fair enough. I'm closing.

@navis navis closed this Oct 24, 2016
seoeun25 pushed a commit to seoeun25/incubator-druid that referenced this pull request Jan 10, 2020