
CURATOR-487 Make GzipCompressionProvider to recycle Deflaters and Inflaters in pools#282

Merged
asfgit merged 2 commits into apache:master from leventov:GzipCompressionProvider-references on Dec 10, 2018
Conversation

@leventov
Member

This PR addresses https://issues.apache.org/jira/browse/CURATOR-487 by recycling Deflaters and Inflaters in static concurrent pools. Since Deflaters and Inflaters are acquired and returned to the pools in try-finally blocks that contain no blocking calls themselves, the number of objects in a pool is not expected to exceed the number of hardware threads on the machine by much. It is therefore acceptable to use simple pools of strongly referenced objects.
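A minimal sketch of the pooling pattern described above; the class and method names (`DeflaterPool`, `acquire`, `release`) are illustrative, not Curator's actual API:

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.zip.Deflater;

// Hypothetical sketch: a simple static concurrent pool of strongly
// referenced Deflaters, as the description above outlines.
final class DeflaterPool {
    private static final ConcurrentLinkedQueue<Deflater> POOL =
            new ConcurrentLinkedQueue<>();

    static Deflater acquire() {
        Deflater deflater = POOL.poll();
        // Pool empty: create a fresh instance rather than block.
        return deflater != null ? deflater : new Deflater();
    }

    static void release(Deflater deflater) {
        deflater.reset(); // clear internal state for the next caller
        POOL.add(deflater);
    }
}
```

A caller acquires and releases in a try-finally block (`Deflater d = DeflaterPool.acquire(); try { ... } finally { DeflaterPool.release(d); }`); because nothing blocks between acquire and release, the pool size stays bounded by the number of concurrently running threads.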

Just an interesting cross project reference, similar task in Jetty: jetty/jetty.project#300

@cammckenzie
Contributor

Thanks for the PR @leventov. I will merge this shortly.

@asfgit merged commit 7467e59 into apache:master Dec 10, 2018
@Randgalt
Member

I came to this issue late. Are we certain that merging this was the right thing to do? Is there any additional reference or documentation of other projects doing something similar? I'm concerned about replacing a JDK library method. If the JDK authors had a better implementation, surely they would have updated the JDK, no?

@leventov
Member Author

leventov commented Dec 10, 2018

The JDK authors did the right thing... in OpenJDK 12 (see JDK-8212129). When Curator's minimum requirement is bumped to at least JDK 12 (realistically JDK 17, the next LTS version), this specialized code can be removed.

@leventov deleted the GzipCompressionProvider-references branch Dec 10, 2018, 14:05
@leventov
Member Author

However, the specialized code will still be more efficient:

    // Even when Curator's minimum supported Java version becomes
    // no less than Java 12, where finalize() methods are removed
    // in Deflater and Inflater classes and instead they are phantom-referenced
    // via Cleaner, it still makes sense to avoid GZIPInputStream
    // and GZIPOutputStream because phantom references are also not
    // entirely free for GC algorithms, and also to allocate less garbage
    // and make less unnecessary data copies.

@Randgalt
Member

// it still makes sense to avoid GZIPInputStream
// and GZIPOutputStream

I don't see how it makes sense to avoid JDK library code. If what you say is true, why wouldn't they update the JDK?

@leventov
Member Author

Because GZIPInputStream and GZIPOutputStream are more generic mechanisms: they can compress and decompress arbitrary streams of data, of unknown length, available only byte-by-byte, etc. GZIPInputStream can also decompress a series of GZIP sequences concatenated one after another. In GzipCompressionProvider, the problem statement is much narrower: compress and decompress a byte[], where the array being decompressed is expected to contain exactly one GZIP sequence.
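The narrow byte[]-in, byte[]-out shape can be sketched with Deflater/Inflater used directly. Note this is illustrative only: it uses the default zlib framing, whereas a full GZIP implementation like Curator's additionally writes the GZIP header and CRC32 trailer around a raw deflate stream; the class name `ByteArrayCodec` is hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Illustrative one-shot byte[] compression/decompression (zlib framing,
// not full GZIP). In the pooled version, end() would be replaced by
// returning the reset object to a pool.
final class ByteArrayCodec {
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(input);
            deflater.finish(); // whole input is known up front
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!deflater.finished()) {
                out.write(buf, 0, deflater.deflate(buf));
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources eagerly
        }
    }

    static byte[] decompress(byte[] input) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(input); // exactly one compressed sequence
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            while (!inflater.finished()) {
                out.write(buf, 0, inflater.inflate(buf));
            }
            return out.toByteArray();
        } finally {
            inflater.end();
        }
    }
}
```

Because the input length is known and the data arrives as a single array, none of GZIPInputStream's stream-oriented machinery (byte-by-byte availability, concatenated members) is needed.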

@Randgalt
Member

Can you point to other libraries that have taken the approach of re-writing these APIs? I see you opened https://issues.apache.org/jira/browse/COMPRESS-473. Are they taking this change as well?

@leventov
Member Author

There is no peer evidence here, because we are on the optimization forefront. See apache/druid#6677 (comment) and https://lists.apache.org/thread.html/1aff123193cec5c385821b2d745a4e846a8a5786146c047acbdf8ea3@%3Cdev.druid.apache.org%3E.

I've seen a Druid heap with more than 10k finalizable Deflater objects, about 8k of which were already dead, awaiting in the finalization queue. They come from GzipCompressionProvider.

Historically, Druid uses ZooKeeper somewhat wrongly (not for what ZooKeeper was designed): it announces data segment placement via ZooKeeper, which leads to the creation of many new nodes in ZooKeeper every second. By accident, this makes Druid a good stress test for ZooKeeper (and consequently for Curator), and we run what is probably the largest Druid cluster.

@Randgalt
Member

OK - interesting. It might make sense to develop a general-purpose library just for this. Larger projects like Curator could pull the new lib in via shading to avoid the dependency.
