optimize index merge #1938
Conversation
I used druid-0.8.1 to test the performance:
2015-11-06T06:18:04,977 INFO [main] io.druid.segment.IndexMerger - outDir[/home/tianzhao.blj/log/merge/v8-tmp] completed index.drd in 39 millis.

With the patch binlijin@35bb704, the result is:
2015-11-06T07:22:20,902 INFO [main] io.druid.segment.IndexMerger - outDir[/home/tianzhao.blj/log/merge/v8-tmp] completed index.drd in 40 millis.

With the patch binlijin@f2b736c, the result is:

With the patch binlijin@2e31b45, the result is:
Is this really faster than Lists.transform?
It's not about the speed of this loop, it's about the speed of access of items in that list. Lists.transform only creates a "view" of the original list, meaning the function gets applied every time you access an element.
Yes, during the merge there is an iteration like

```java
for (String dimValue : allDimValues) {
  for (int j = 0; j < indexes.size(); ++j) {
    // look up dimValue's id in indexes.get(j)
  }
}
```
The function will get applied "cardinality" times, and there is some disk IO in the constructor of QueryableIndexIndexableAdapter.
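To make the view-vs-materialized distinction concrete, here is a small self-contained sketch in plain Java. A hand-rolled lazy view stands in for Guava's `Lists.transform` (same semantics: the function is re-applied on every `get`), so the example can count how many times the transform actually runs:

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class TransformViewDemo {
    // Counts how many times the transform function runs.
    static final AtomicInteger calls = new AtomicInteger();

    // A lazy "view" list, analogous to Guava's Lists.transform:
    // the function is re-applied on every get().
    static List<String> lazyView(List<Integer> source) {
        return new AbstractList<String>() {
            @Override public String get(int i) {
                calls.incrementAndGet();
                return "value-" + source.get(i);
            }
            @Override public int size() { return source.size(); }
        };
    }

    public static void main(String[] args) {
        List<Integer> ids = List.of(1, 2, 3);

        // Access each element twice, as a merge loop over all dim values would.
        List<String> view = lazyView(ids);
        for (int pass = 0; pass < 2; pass++) {
            for (String s : view) { /* use s */ }
        }
        System.out.println("lazy calls = " + calls.get()); // 6: re-applied per access

        calls.set(0);
        // Materializing copies the list once; later passes never re-run the function.
        List<String> materialized = new ArrayList<>(lazyView(ids));
        for (int pass = 0; pass < 2; pass++) {
            for (String s : materialized) { /* use s */ }
        }
        System.out.println("materialized calls = " + calls.get()); // 3: once per element
    }
}
```

With a high-cardinality dimension the lazy view repeats the work once per access, which is exactly the cost the merge loop above multiplies out.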
Hi @binlijin, thanks for the awesome PR. A few points and questions:
Cheers and thanks again for making Druid better!
You could avoid the for loop by simply doing `Lists.newArrayList(Iterables.transform(..))`.
It would also be good to add a comment here that we are materializing the list for performance reasons.
Hi @drcrallen @himanshug, I want to explain the purpose of the BitmapIndexSeeker, which is where the main performance boost comes from. There are two properties we did not use during the merge:
There is still a potential optimization, though: since we use the Indexed returned by `getDimValueLookup` to get the value for an id, that lookup may cause some binary searches or disk IO later. But these costs are small right now. When they get higher, we can move the "seeker" into the bitmap index and traverse the values sequentially to get each (value, id) tuple in O(1). By then the complexity will be O(N * M).
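For illustration, the cursor idea behind a seeker can be sketched as follows. This is a simplified stand-in, not Druid's actual BitmapIndexSeeker API: each index's dimension dictionary is sorted, and the merged dimension values also arrive in sorted order, so a forward-only cursor can replace a per-value binary search (total O(C) cursor movement per index across the whole merge, instead of O(C log C)):

```java
import java.util.List;

public class SeekerDemo {
    // Hypothetical forward-only seeker over one index's sorted dimension values.
    static final class Seeker {
        private final List<String> sortedValues;
        private int pos = 0;

        Seeker(List<String> sortedValues) { this.sortedValues = sortedValues; }

        // Returns the id of 'value' in this index, or -1 if absent.
        // Callers must request values in ascending order, so each call only
        // moves the cursor forward; it never rescans earlier entries.
        int seek(String value) {
            while (pos < sortedValues.size()
                    && sortedValues.get(pos).compareTo(value) < 0) {
                pos++;
            }
            return (pos < sortedValues.size() && sortedValues.get(pos).equals(value))
                    ? pos : -1;
        }
    }

    public static void main(String[] args) {
        Seeker seeker = new Seeker(List.of("a", "c", "d"));
        // Merged dimension values arrive in sorted order.
        System.out.println(seeker.seek("a")); // 0
        System.out.println(seeker.seek("b")); // -1, not present in this index
        System.out.println(seeker.seek("c")); // 1
        System.out.println(seeker.seek("d")); // 2
    }
}
```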
This is pretty cool and should be a great addition. Totally makes sense that not doing a binary search for every value to get the id would speed things up; I didn't realize how much, though. Thanks for the contribution! 👍

@drcrallen You added 0.9.0, is that just an indication that we want it in the next release after 0.8.2 and that'll be 0.9.0? Or are you intending that this isn't in 0.8.3 if we choose to do 0.8.3? This is a fully backwards compatible change (the interface that was adjusted is a purely internal interface), so it should be able to go into whatever release happens after 0.8.2.
I'm OK with having this make it into 0.8.3.
@cheddar I meant next release, which at the time was going to be 0.9.0
Overall this is looking very good. I have to admit that following the change is a bit mind-bending. Would you mind squashing your commits?
Wait, I'm crazy, we have a CLA for @KurtYoung but not for @binlijin. @binlijin, would you mind filling out a CLA? http://druid.io/community/cla.html
Technically this is only slow for concise bitmaps, but it is fast with roaring bitmaps, so we should update this comment.
Also, it looks like this whole anonymous class is copy-pasted from IncrementalIndexAdapter. Is it possible to create a simple class that wraps a bitmapIndex and turns it into an IndexedInts, and use it in both places?
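As a rough sketch of the suggested shared wrapper (the interface here is a minimal stand-in, not Druid's actual `IndexedInts`, and `java.util.BitSet` stands in for the real bitmap type):

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Iterator;

public class BitmapIndexedIntsDemo {
    // Minimal stand-in for an IndexedInts-style interface: a random-access,
    // iterable sequence of ints.
    interface IndexedInts extends Iterable<Integer> {
        int size();
        int get(int index);
    }

    // Wraps the set bits of a bitmap as an IndexedInts, so the same adapter
    // code can be shared instead of duplicating anonymous classes.
    static IndexedInts wrap(BitSet bitmap) {
        final int[] rows = bitmap.stream().toArray(); // set bits, ascending
        return new IndexedInts() {
            @Override public int size() { return rows.length; }
            @Override public int get(int index) { return rows[index]; }
            @Override public Iterator<Integer> iterator() {
                return Arrays.stream(rows).boxed().iterator();
            }
        };
    }

    public static void main(String[] args) {
        BitSet bitmap = new BitSet();
        bitmap.set(2);
        bitmap.set(5);
        bitmap.set(9);
        IndexedInts rows = wrap(bitmap);
        System.out.println(rows.size()); // 3
        System.out.println(rows.get(1)); // 5
    }
}
```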
@drcrallen, done, I have filled out the CLA.
@xvrl, I refactored the code and reused it.
@binlijin Had a few minor suggestions about code-reuse and adding comments; I'm 👍 otherwise once commits are squashed.
I merged all commits into one and do not know how to reopen this pull request, so I opened the new PR #1960.
@binlijin thanks, GitHub is weird like that and won't let you re-open a PR that was force pushed.
@binlijin the trick is to reopen first, then force push.
@drcrallen @gianm, thank you very much.
optimize index merge performance.