Do persist IncrementalIndex in another thread in IndexGeneratorReducer by binlijin · Pull Request #2149 · apache/druid

binlijin · 2015-12-23T05:50:45Z

Currently, the reduce step in IndexGeneratorReducer builds a small incremental index, persists it, and repeats until there are no more rows; it then merges the persisted indexes.
This patch adds background threads that persist the small incremental indexes. Using this feature causes a notable increase in memory pressure and CPU usage, but makes the job finish more quickly.

drcrallen · 2015-12-23T16:35:13Z

Most places in code would use Futures, and then call Futures.allAsList(futures).get(1, TimeUnit.HOURS) or similar.

The catch with that approach is you need to be able to have the incremental index garbage collected, so you have to eliminate hard references to the incremental index in the future.
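
A minimal sketch of the wait-on-all-futures pattern described above, using only `java.util.concurrent` rather than Guava's `Futures.allAsList`. The class name, the `persist` helper, and the `int[]` stand-in for an incremental index are hypothetical and are not taken from the patch. Each task returns only a small result, so a completed future does not need to keep the index itself reachable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class PersistFuturesSketch {
    // Hypothetical stand-in for persisting one in-memory index; returns its row count.
    static int persist(int[] rows) {
        return rows.length;
    }

    // Submits one persist task per batch, then waits for all of them under a single shared deadline.
    static List<Integer> persistAll(List<int[]> batches, long timeout, TimeUnit unit)
            throws InterruptedException, ExecutionException, TimeoutException {
        ExecutorService exec = Executors.newSingleThreadExecutor();
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int[] batch : batches) {
                Callable<Integer> task = () -> persist(batch);
                futures.add(exec.submit(task));
            }
            long deadline = System.nanoTime() + unit.toNanos(timeout);
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                // Each get() only receives whatever time is left before the shared deadline.
                long remaining = deadline - System.nanoTime();
                results.add(future.get(remaining, TimeUnit.NANOSECONDS));
            }
            return results;
        } finally {
            exec.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        List<int[]> batches = Arrays.asList(new int[3], new int[5]);
        System.out.println(persistAll(batches, 1, TimeUnit.HOURS));
    }
}
```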

drcrallen · 2015-12-23T16:41:19Z

This PR has significant memory pressure changes in the reducer and changes the default behavior.

With this PR the JVM now holds onto 2 incremental index objects and the persist objects at the same time (instead of 1 incremental index object and the persist objects). This is a notable increase in memory pressure and should not be enabled by default.

To get around such constraints, the executor service can default to the sameThreadExecutorService and use a blocking service with a set backpressure size as an option.

Such an option could be represented by "io.druid.index.persist.background.count" or similar, which defaults to 0. In the case of 0 the sameThreadExecutorService can be used; in the case of > 0 the executor service with a blocking queue could have its capacity set to the config value. A value < 0 is an error.

There are use cases where this can be very handy, but this PR needs some major JVM heap pressure benchmarks before such behavior can be turned on by default.
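
The three cases proposed above (0, > 0, < 0) could be sketched roughly as follows. This is an illustration built only on `java.util.concurrent`, not the code in this PR: the class and method names are hypothetical, a plain `Executor` that runs the task inline stands in for Guava's same-thread executor service, and the choice of a single worker thread with the queue capacity set to the config value is an assumption.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PersistExecutorSketch {
    // backgroundCount mirrors the proposed config value (hypothetical name).
    static Executor create(int backgroundCount) {
        if (backgroundCount < 0) {
            throw new IllegalArgumentException("background persist count must be >= 0");
        }
        if (backgroundCount == 0) {
            // Default: run the persist on the calling thread, as before the patch.
            return Runnable::run;
        }
        // One worker thread with a bounded queue. When the queue is full, the
        // handler blocks the submitting thread instead of rejecting the task,
        // which gives backpressure on the number of indexes waiting to persist.
        return new ThreadPoolExecutor(
            1, 1, 0L, TimeUnit.MILLISECONDS,
            new ArrayBlockingQueue<>(backgroundCount),
            (task, pool) -> {
                try {
                    pool.getQueue().put(task);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new RejectedExecutionException("interrupted while queueing persist", e);
                }
            });
    }

    public static void main(String[] args) throws InterruptedException {
        Executor exec = create(1);
        CountDownLatch done = new CountDownLatch(3);
        for (int i = 0; i < 3; i++) {
            exec.execute(done::countDown);
        }
        done.await();
        if (exec instanceof ExecutorService) {
            ((ExecutorService) exec).shutdown();
        }
    }
}
```

With a value of 0 the behavior is unchanged, so the extra heap held by a second incremental index only appears when the option is explicitly enabled.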

binlijin · 2015-12-24T05:56:43Z

Yes, memory usage will increase, so maybe we can decrease the rowFlushBoundary.

gianm · 2015-12-29T03:30:34Z

Agree with @drcrallen, would be good to have this as an option that is off by default for reasons of increased memory pressure and increased cpu usage (2 threads instead of 1).

binlijin · 2015-12-29T06:07:31Z

@drcrallen @gianm
Yes, with the latest patch this feature is an option and is turned off by default.
And I do not know why Travis fails...

fjy · 2015-12-29T21:47:51Z

@binlijin can you pull from master and merge into this PR? it'll help with the failing travis-ci checks

binlijin · 2015-12-30T02:57:01Z

@fjy Yes, it is ok now.

drcrallen · 2016-01-04T18:50:12Z

Thread.currentThread().interrupt() to reset the interrupted flag status?
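
For context, the idiom being suggested looks roughly like the sketch below. The class and method names are hypothetical and the latch merely stands in for whatever blocking call the reducer makes; only the `catch` block is the point.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class InterruptSketch {
    // Waits on the latch; if interrupted, restores the flag so callers can still observe it.
    static boolean awaitQuietly(CountDownLatch latch, long timeoutMillis) {
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            // Throwing InterruptedException clears the interrupted status,
            // so set it again before returning.
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt();
        boolean completed = awaitQuietly(new CountDownLatch(1), 1000);
        System.out.println(completed + " " + Thread.currentThread().isInterrupted());
    }
}
```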

himanshug · 2016-01-05T17:22:05Z

Can you use com.google.common.base.Preconditions?

binlijin · 2016-01-11T08:39:04Z

@drcrallen @gianm @himanshug what about now?

rasahner · 2016-01-12T16:16:42Z

The number of new background threads to use for incremental persists. Using this feature causes a notable increase in memory pressure and cpu usage, but will make the job finish more quickly. If changing from the default of 0 (use current thread for persists), we recommend setting it to 1.

himanshug · 2016-01-12T16:17:35Z

👍 after https://github.com/druid-io/druid/pull/2149/files#r49475041 and https://github.com/druid-io/druid/pull/2149/files#r49477224 are resolved.

himanshug · 2016-01-12T16:32:11Z

Can you use Execs.newBlockingSingleThreaded(..) instead?

Execs.newBlockingSingleThreaded(..) provides only one background thread for persisting the incremental index, so I have not used it.

binlijin · 2016-01-19T11:46:20Z

What else can I do to get this merged? We use this feature in our Hadoop build job.

xvrl · 2016-01-20T01:20:18Z

@binlijin can we put a description in the PR that explains what problem this is solving?

binlijin · 2016-01-20T01:21:41Z

rebase

fjy · 2016-01-20T01:26:05Z

@binlijin Can you update the description of the problem being solved with this PR?

binlijin · 2016-01-20T01:34:57Z

@xvrl @fjy I have updated the description.

nishantmonu51 · 2016-01-20T12:46:51Z

👍, this looks good to me.
@drcrallen: I think this can be merged if you don't have any outstanding comments?

drcrallen · 2016-01-21T01:06:45Z

👍

xvrl · 2016-01-21T07:13:09Z

numBackgroundPersistThreads would be more consistent with our other properties, such as druid.processing.numThreads. We should try to keep property naming consistent.

@xvrl, this has been merged, file a new PR to rename the properties?

drcrallen reviewed Dec 23, 2015

binlijin closed this Dec 29, 2015

binlijin reopened this Dec 29, 2015

binlijin closed this Dec 30, 2015

binlijin reopened this Dec 30, 2015

drcrallen reviewed Jan 4, 2016

drcrallen mentioned this pull request Jan 4, 2016

Make OnHeapIncrementalIndex clean maps on close() #2197

Merged

himanshug reviewed Jan 5, 2016

binlijin closed this Jan 6, 2016

binlijin reopened this Jan 6, 2016

binlijin closed this Jan 6, 2016

binlijin reopened this Jan 6, 2016

rasahner reviewed Jan 12, 2016

himanshug reviewed Jan 12, 2016

binlijin closed this Jan 14, 2016

binlijin reopened this Jan 14, 2016

binlijin closed this Jan 19, 2016

binlijin reopened this Jan 19, 2016

binlijin closed this Jan 20, 2016

Do persist IncrementalIndex in another thread in IndexGeneratorReducer

8e43e2c

binlijin reopened this Jan 20, 2016

fjy added this to the 0.9.0 milestone Jan 20, 2016

drcrallen added a commit that referenced this pull request Jan 21, 2016

Merge pull request #2149 from binlijin/master

2a69a58

Do persist IncrementalIndex in another thread in IndexGeneratorReducer

drcrallen merged commit 2a69a58 into apache:master Jan 21, 2016

xvrl reviewed Jan 21, 2016

binlijin mentioned this pull request Jan 22, 2016

rename persistBackgroundCount to numBackgroundPersistThreads #2318

Merged

8 participants