Skip to content

KAFKA-14432: RocksDBStore relies on finalizers to not leak memory#12935

Merged
ableegoldman merged 5 commits intoapache:trunkfrom
lucasbru:leak_fix
Dec 8, 2022
Merged

KAFKA-14432: RocksDBStore relies on finalizers to not leak memory#12935
ableegoldman merged 5 commits intoapache:trunkfrom
lucasbru:leak_fix

Conversation

@lucasbru
Copy link
Copy Markdown
Member

@lucasbru lucasbru commented Dec 1, 2022

RocksDBStore relied on finalizers to not leak memory (and leaked memory after the upgrade to RocksDB 7).

The problem was that every call to options.statistics creates a new wrapper object that needs to be finalized.

I simplified the logic a bit and moved the ownership of the statistics from ValueProvider to RocksDBStore.

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

RocksDBStore relied on finalizers to not leak memory (and leaked memory
after the upgrade to RocksDB 7).

The problem was that every call to `options.statistics` creates a new
wrapper object that needs to be finalized.

I simplified the logic a bit and moved the ownership of the statistics
from `ValueProvider` to `RocksDBStore`.
@lucasbru lucasbru marked this pull request as ready for review December 2, 2022 13:47
@lucasbru
Copy link
Copy Markdown
Member Author

lucasbru commented Dec 2, 2022

@cadonna Could you have a look? I ran this for a few hours in the soak and the native memory is stable. So I think this was the main leak

Copy link
Copy Markdown
Member

@cadonna cadonna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lucasbru Thanks for the PR!

Nice finding!

Here my feedback!

Comment on lines +233 to +242
statistics = userSpecifiedOptions.statistics();
if (statistics == null) {
if (RecordingLevel.forName((String) configs.get(METRICS_RECORDING_LEVEL_CONFIG)) == RecordingLevel.DEBUG) {
statistics = new Statistics();
dbOptions.setStatistics(statistics);
}
userSpecifiedStatistics = false;
} else {
userSpecifiedStatistics = true;
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you extract this to a method? IMO, openDB() is already quite long.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For me personally, spreading linear code across methods with only one call-site does not help readability. Anyway, going with the PMC member on this one, done ;)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I get the joke, it is not a matter of PMC member or not. If you have good arguments, I am happy to be convinced.

When I read code, I do not want to be distracted by every detail right away. For the first overview when reading a method -- for me -- it is enough to know that the statistics are set up. So reading setupStatistics() would be enough for me. Then, if I need to understand how the statistics are setup I would look into setupStatistics().

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I was just joking. Normally, I'd be happy to convince you with arguments, but I think in this case it's really a matter of personal preference or convention among the engineers in the project (and you know those better, honestly!).

In this particular case, I think a comment and newlines help to split the function into correlated "blocks".

// setup statistics
...
...

// next block

Arguably, this is not as clearly scoped as a separate method, as you say. But it does give me extra information when reading: I do not at all have to worry about this code if I am trying to understand any of the other methods, because the code cannot be called from other methods. Also, I don't have to use my IDE to jump around to understand the flow of execution.

Everyone will make these kinds of trade-offs slightly differently, I am happy to err on the side of project convention here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a big fan of comments. They eventually start to lie because they are not maintained as well as executable parts of the code. I prefer to extract a code block into a method with a meaningful name instead of using a inline comment. One could argue, that the name of a method might also start to lie, but IMO that is less likely compared to an inline comment.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah watch out @lucasbru, don't mention code comments around @cadonna or he'll come after you 😜

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No comment!

Comment on lines +524 to 527
if (statistics != null) {
statistics.close();
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if a user-specified statistics object is closed by user code?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user still needs to close the statistics he provided. This will just close the statistics handle.

I added a test for this. It doesn't check everything, but that is hard to do.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user still needs to close the statistics he provided

We should add this to the docs then -- users forget to close things all the time, especially if we don't explicitly call out common objects that will need this. Can you file a followup ticket for this? Specifically we should definitely mention it in the rocksdb section of the Memory Management docs, but we also give an example RocksDBConfigSetter in the configuration docs where we might want to include this

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw -- why/when would a user want to specify the statistics themselves vs just letting Streams handle it?

And on a related note, is there any chance failing to close this (either by the user or by Streams) might be causing some of the recent reports of getting no results and just seeing 0.0 for a bunch of the rocksdb metrics? I just feel like I've started seeing these pop up again and can't tell if there's a bug in user code or somehow in ours

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you documented it already in the javadoc for close #6697

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I meant the "real" docs on the actual kafka site, not the javadocs. Seems like it would be useful to show how/why to use your own Statistics like this. Anyways I just filed a ticket for this so someone else can pick it up https://issues.apache.org/jira/browse/KAFKA-14459

Comment on lines -349 to -361
@Test
public void shouldCloseStatisticsWhenValueProvidersAreRemoved() {
recorder.addValueProviders(SEGMENT_STORE_NAME_1, dbToAdd1, cacheToAdd1, statisticsToAdd1);
recorder.removeValueProviders(SEGMENT_STORE_NAME_1);
verify(statisticsToAdd1).close();
}

@Test
public void shouldNotCloseStatisticsWhenValueProvidersWithoutStatisticsAreRemoved() {
recorder.addValueProviders(SEGMENT_STORE_NAME_1, dbToAdd1, cacheToAdd1, null);
recorder.removeValueProviders(SEGMENT_STORE_NAME_1);
verify(statisticsToAdd1, never()).close();
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to write some tests for RocksDBStore to ensure the statistics object is closed and the same instance is passed to the metrics recorder?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, done

@cadonna
Copy link
Copy Markdown
Member

cadonna commented Dec 5, 2022

@lucasbru Actually, I didn't want to approve. Could you please answer my questions before I merge?

@lucasbru
Copy link
Copy Markdown
Member Author

lucasbru commented Dec 6, 2022

Failing tests are unrelated

Copy link
Copy Markdown
Member

@cadonna cadonna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lucasbru Thanks for the updates!

Here my comments!

private Statistics statistics;

// visible for testing
protected Statistics statistics;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to avoid this with mocking the Statistics constructor with no luck.
However, I managed with reflection.

    @Test
    public void shouldCloseStatisticsWhenUserProvidesNoStatistics() throws Exception {
        rocksDBStore = getRocksDBStoreWithRocksDBMetricsRecorder();

        context = getProcessorContext(RecordingLevel.DEBUG);

        rocksDBStore.openDB(context.appConfigs(), context.stateDir());

        Statistics statistics = getStatistics(rocksDBStore);
        rocksDBStore.close();
        assertFalse(statistics.isOwningHandle());
        statistics = getStatistics(rocksDBStore);
        assertNull(statistics);
    }

    private Statistics getStatistics(final RocksDBStore rocksDBStore) throws Exception {
        final Field statisticsField = rocksDBStore.getClass().getDeclaredField("statistics");
        statisticsField.setAccessible(true);
        final Statistics statistics = (Statistics) statisticsField.get(rocksDBStore);
        statisticsField.setAccessible(false);
        return statistics;
    }

Normally, I am not a big fan of using reflection, but in this case were we need to verify off-heap memory leaks I think it is fine.

In such a way, we can keep statistics as a private field and still verify that there is not memory leak.

WDYT?

Copy link
Copy Markdown
Member Author

@lucasbru lucasbru Dec 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea, thanks! Done

lucasbru and others added 2 commits December 7, 2022 13:07
Copy link
Copy Markdown
Member

@cadonna cadonna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the updates! Once the buiilds are acceptable, I will merge this PR!

@ableegoldman
Copy link
Copy Markdown
Member

I did look over this and it seems reasonable but I'll refer to @cadonna's evaluation here as the expert on RocksDB metrics/statistics. Since he approved and the build passed (modulo a few unrelated/flaky test failures) I'm going to merge this and cherrypick back to 3.4

If there are any other/older branches you think this should go to, I'll leave it to @cadonna to port the fix as I suspect there could be nontrivial merge conflicts

@ableegoldman ableegoldman merged commit 36a2f7b into apache:trunk Dec 8, 2022
ableegoldman pushed a commit that referenced this pull request Dec 8, 2022
…2935)

RocksDBStore relied on finalizers to not leak memory (and leaked memory after the upgrade to RocksDB 7).
The problem was that every call to options.statistics creates a new wrapper object that needs to be finalized.

I simplified the logic a bit and moved the ownership of the statistics from ValueProvider to RocksDBStore.

Reviewers: Bruno Cadonna <cadonna@apache.org>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Christo Lolov <lolovc@amazon.com>
@ableegoldman
Copy link
Copy Markdown
Member

Merged to trunk and cherrypicked to 3.4

@lucasbru
Copy link
Copy Markdown
Member Author

lucasbru commented Dec 8, 2022

If there are any other/older branches you think this should go to, I'll leave it to @cadonna to port the fix as I suspect there could be nontrivial merge conflicts

Yes, hmm we probably have to port this back to all branches that got the RocksDB upgrade, right? @cadonna

@cadonna
Copy link
Copy Markdown
Member

cadonna commented Dec 8, 2022

I agree! I will backport this commit.

cadonna pushed a commit that referenced this pull request Dec 8, 2022
…2935)

RocksDBStore relied on finalizers to not leak memory (and leaked memory after the upgrade to RocksDB 7).
The problem was that every call to options.statistics creates a new wrapper object that needs to be finalized.

I simplified the logic a bit and moved the ownership of the statistics from ValueProvider to RocksDBStore.

Reviewers: Bruno Cadonna <cadonna@apache.org>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Christo Lolov <lolovc@amazon.com>
cadonna pushed a commit that referenced this pull request Dec 8, 2022
…2935)

RocksDBStore relied on finalizers to not leak memory (and leaked memory after the upgrade to RocksDB 7).
The problem was that every call to options.statistics creates a new wrapper object that needs to be finalized.

I simplified the logic a bit and moved the ownership of the statistics from ValueProvider to RocksDBStore.

Reviewers: Bruno Cadonna <cadonna@apache.org>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Christo Lolov <lolovc@amazon.com>
cadonna pushed a commit that referenced this pull request Dec 8, 2022
…2935)

RocksDBStore relied on finalizers to not leak memory (and leaked memory after the upgrade to RocksDB 7).
The problem was that every call to options.statistics creates a new wrapper object that needs to be finalized.

I simplified the logic a bit and moved the ownership of the statistics from ValueProvider to RocksDBStore.

Reviewers: Bruno Cadonna <cadonna@apache.org>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Christo Lolov <lolovc@amazon.com>
cadonna pushed a commit that referenced this pull request Dec 8, 2022
…2935)

RocksDBStore relied on finalizers to not leak memory (and leaked memory after the upgrade to RocksDB 7).
The problem was that every call to options.statistics creates a new wrapper object that needs to be finalized.

I simplified the logic a bit and moved the ownership of the statistics from ValueProvider to RocksDBStore.

Reviewers: Bruno Cadonna <cadonna@apache.org>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Christo Lolov <lolovc@amazon.com>
@cadonna
Copy link
Copy Markdown
Member

cadonna commented Dec 8, 2022

Backported to 3.3, 3.2, 3.1, and 3.0.

guozhangwang pushed a commit to guozhangwang/kafka that referenced this pull request Jan 25, 2023
…ache#12935)

RocksDBStore relied on finalizers to not leak memory (and leaked memory after the upgrade to RocksDB 7).
The problem was that every call to options.statistics creates a new wrapper object that needs to be finalized.

I simplified the logic a bit and moved the ownership of the statistics from ValueProvider to RocksDBStore.

Reviewers: Bruno Cadonna <cadonna@apache.org>, Anna Sophie Blee-Goldman <ableegoldman@apache.org>, Christo Lolov <lolovc@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants