explicitly unmap hydrant files when abandonSegment to recycle mmap memory#4341
leventov merged 4 commits into apache:master from
Conversation
```java
serverProperties.put("zookeeper.connect", zkTestServer.getConnectString() + zkKafkaPath);
serverProperties.put("zookeeper.session.timeout.ms", "10000");
serverProperties.put("zookeeper.sync.time.ms", "200");
serverProperties.put("port", String.valueOf(new Random().nextInt(9999) + 10000));
```
Please use ThreadLocalRandom.current()
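The suggested change can be sketched as follows (a minimal, hypothetical example; the class name and the exact port bounds are illustrative, not from the PR). `ThreadLocalRandom.current()` avoids allocating a fresh `Random` on every call and avoids contention on a shared seed:

```java
import java.util.concurrent.ThreadLocalRandom;

public class RandomPortDemo {
  public static void main(String[] args) {
    // Before: new Random().nextInt(9999) + 10000  — a new Random per call,
    // yielding a port in roughly the 10000..19998 range.
    // Suggested: ThreadLocalRandom with an explicit bounded range.
    int port = ThreadLocalRandom.current().nextInt(10000, 20000); // 10000..19999
    System.out.println(port >= 10000 && port < 20000);
  }
}
```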
```java
    mergedFile,
    sink.getSegment().withDimensions(Lists.newArrayList(index.getAvailableDimensions()))
);
index.close();
```
It doesn't seem that creating index is even needed here, IndexMerger.getMergedDimensions(indexes) could be used
```java
);
for (FireHydrant hydrant : sink) {
  cache.close(SinkQuerySegmentWalker.makeHydrantCacheIdentifier(hydrant));
  hydrant.getSegment().close();
```
Despite its documentation, abandonSegment() is called not only from mergeExecutor, so races between persistAndMerge() and abandonSegment() are possible. This should be resolved before merging this change, because it may lead to JVM crashes.
could you explain more on the race condition? @leventov
abandonSegment() is called in FlushingPlumber, not from mergeExecutor. It probably doesn't lead to a race, because in the context of FlushingPlumber the mergeExecutor is not used at all, but it's a dangerous situation. Could you please refactor RealtimePlumber/FlushingPlumber by extracting the logic and fields used by both into a superclass, and make RealtimePlumber and FlushingPlumber subclasses of that class, so that FlushingPlumber doesn't have unused Executor fields?
I got your point. You'd like to move the merge and handoff code out of the parent class and into RealtimePlumber, which is the only class that really needs to care about merge and handoff.
I tried to refactor as you described, but it produces a huge diff and I think it needs to be carefully tested. I think it's too big a risk to do in this PR.
Since FlushingPlumber never actually starts the mergeExecutor, there is no race condition here. Should we do the refactor in another PR? @leventov
IMO, another PR is ok. Or maybe even skipping it altogether, and instead, migrating users of Plumbers to Appenderator (which is meant to be an improved replacement).
The flushing plumber isn't really meant to be used in production anyway (I think it's not even documented). It was meant to be a way to set up some realtime demos that just throw away data after a period of time.
gianm
left a comment
@kaijianding, could you make a similar change in AppenderatorImpl as well, in the "abandonSegment" method?
I think there's no need to refactor flushing/realtime plumber, since:
- Plumbers could be rewritten in the future in terms of Appenderators, or alternatively replaced with Appenderators, which are more flexible (see original description of #2220)
- The flushing plumber is not expected to be used in prod anyway.
```java
// We are materializing the list for performance reasons. Lists.transform
// only creates a "view" of the original list, meaning the function gets
// applied every time you access an element.
return Lists.transform(
```
The behavior is different here. The list used to be materialized (Lists.newArrayList) but now it's a view (Lists.transform). The comment says it should be materialized, so I think the old code was right.
Also it's better to use Stream API for new code, instead of Guava
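The two suggestions can be combined in one sketch (hypothetical names and data; not the PR's actual code). Guava's `Lists.transform` returns a lazy view that re-applies the function on every element access, whereas collecting a `Stream` materializes the result once, which is what the original comment asks for:

```java
import java.util.List;
import java.util.stream.Collectors;

public class MaterializeDemo {
  public static void main(String[] args) {
    // Illustrative dimension names, not taken from the PR.
    List<String> dims = List.of("page", "language", "user");

    // Lists.transform(dims, fn) would be a view: fn runs on each get().
    // A collected Stream applies the mapping exactly once, up front.
    List<String> upper = dims.stream()
        .map(String::toUpperCase)
        .collect(Collectors.toList());

    System.out.println(upper); // [PAGE, LANGUAGE, USER]
  }
}
```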
Got an error from the TeamCity build: I clicked "Run" in its UI to see if it will work when run again.
Ah, got it.
I noticed the RSS memory grew to an extremely large number on a realtime node in one of my environments. The memory usage grows after index merge and never goes down after handoff, unlike in other environments.
Usually `FileUtils.deleteDirectory(target);` can recycle mmap memory, but it doesn't work in some environments. We should explicitly unmap hydrant files when abandonSegment to recycle mmap memory.
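Some background on why an explicit unmap helps: Java has no public API to release a `MappedByteBuffer`'s mapping eagerly; the mapping is freed only when the buffer is garbage-collected, so deleting the segment files alone can leave RSS inflated. Below is a minimal, hypothetical sketch of eager unmapping on Java 9+ via `sun.misc.Unsafe.invokeCleaner` (Druid's actual close logic lives in its own utilities and differs from this):

```java
import java.io.RandomAccessFile;
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

public class UnmapDemo {
  // Hypothetical helper: eagerly release a mapped buffer's native memory.
  static void unmap(MappedByteBuffer buffer) throws Exception {
    Class<?> unsafeClass = Class.forName("sun.misc.Unsafe");
    Field f = unsafeClass.getDeclaredField("theUnsafe");
    f.setAccessible(true);
    Object unsafe = f.get(null);
    Method invokeCleaner =
        unsafeClass.getMethod("invokeCleaner", java.nio.ByteBuffer.class);
    invokeCleaner.invoke(unsafe, buffer);
  }

  public static void main(String[] args) throws Exception {
    Path tmp = Files.createTempFile("hydrant", ".bin");
    try (RandomAccessFile raf = new RandomAccessFile(tmp.toFile(), "rw")) {
      raf.setLength(4096);
      MappedByteBuffer mapped =
          raf.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, 4096);
      mapped.put(0, (byte) 42);
      System.out.println("first byte = " + mapped.get(0));
      // Without this, the mapping stays resident until the buffer is GC'd,
      // even after the backing file is deleted.
      unmap(mapped);
    }
    Files.delete(tmp);
    System.out.println("unmapped and deleted");
  }
}
```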
Also includes a test fix in TestKafkaExtractionCluster to use a random port, like in KafkaSupervisorTest. The test failed if Kafka was already started in the environment (my environment is not as clean as the Travis environment).