FileUtils: Sync directory entry too on writeAtomically. #6677
leventov merged 5 commits into apache:master
Conversation
See the fsync(2) man page for why this is important: https://linux.die.net/man/2/fsync This also plumbs CompressionUtils's "zip" function through writeAtomically, so the code for handling atomic local filesystem writes is all done in the same place.
    if (fsync) {
      return FileUtils.writeAtomically(outputZipFile, out -> zip(directory, out));
    } else {
      try (final FileOutputStream out = new FileOutputStream(outputZipFile)) {
Please don't use FileOutputStream and FileInputStream in any new code (and probably they should be eliminated from old code too), because they are finalizable. Please use a static factory method in Files instead.
Oh, interesting, I didn't know that. I'll change it.
Do you have a sense of roughly how expensive a finalizer is? Is it bad enough that we should care a lot about adjusting old code, or not so bad, so we should only care about adjusting old code that creates a lot of objects (maybe code that creates lots of short-lived tmp files)?
From looking around online it seems that the JDK will probably remove finalization from these classes eventually (https://bugs.openjdk.java.net/browse/JDK-8187325).
Well, now that I look, it seems like there aren't that many anyway, so maybe not a big deal to adjust them all.
Personally I've only heard that all types of references from java.lang.ref are somehow bad for GC and could lead to effective memory leaks, both especially true for finalizable objects. But I cannot prove that.
Anecdote:
We are constantly suffering GC and apparent memory leak problems on pretty much all Druid node types, and they only get worse with Druid updates, but we cannot understand the reasons. We only try to make incremental improvements on this front, such as #6370 and other PRs.
On the suffering nodes, we cannot take full heap dumps to analyze them because the heaps are too big, but we can see heap stats (a la jmap -histo), and they show, for example, on one Historical node: more than 10k finalizable objects on the heap (determined by the number of java.lang.ref.Finalizer objects on the heap), tens of thousands of java.lang.ref.PhantomReferences, and several thousand each of WeakReferences and SoftReferences. I don't know where the hell all this comes from. (Except for the PhantomReferences, which are apparently connected to direct ByteBuffers; #4773 could help here.)
finalize() in FileInputStream and FileOutputStream is removed in OpenJDK 12: https://bugs.openjdk.java.net/browse/JDK-8192939. But Druid probably won't benefit from that until OpenJDK 17, because versions 12 through 16 are the "unsupported" ones that nobody is likely to risk using: https://blog.joda.org/2018/10/adopt-java-12-or-stick-on-11.html
Another minor reason to prefer the Files static factory methods is that they provide explicit control over how you open the file, via OpenOptions. FileOutputStream always creates the file if it doesn't exist, or truncates an existing file, and this cannot be configured.
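To illustrate the point, here is a minimal sketch (hypothetical class name, not from the PR): Files.newOutputStream accepts CREATE_NEW, which refuses to clobber an existing file, something FileOutputStream cannot be configured to do.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class OpenOptionsDemo
{
  /**
   * Creates "file" exclusively, then tries again. Returns true if the second
   * attempt is refused, i.e. CREATE_NEW prevented clobbering an existing file.
   * A FileOutputStream would have silently truncated it instead.
   */
  public static boolean secondExclusiveCreateFails(final Path file) throws IOException
  {
    try (OutputStream out = Files.newOutputStream(file, StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW)) {
      out.write("first".getBytes(StandardCharsets.UTF_8));
    }
    try (OutputStream out = Files.newOutputStream(file, StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW)) {
      out.write("second".getBytes(StandardCharsets.UTF_8));
      return false; // existing file was silently replaced
    }
    catch (FileAlreadyExistsException e) {
      return true; // CREATE_NEW refused to overwrite
    }
  }
}
```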
> We are constantly suffering GC and apparent memory leak problems with pretty much all Druid node types, that only gets worse with Druid updates, but we cannot understand the reasons. We only try to make incremental improvements on this front, such as #6370 and other PRs.
Hm interesting. I did a jmap -histo on a historical of a fairly active Druid cluster (~150 queries/sec, ~5000 segments/server) and I see about 900 WeakReferences, 200 SoftReferences, no PhantomReferences, and 1100 Finalizers. It is typically doing 0.5-2sec of GC time per minute (based on the jvm/gc/time metric). It has been running for a month without being restarted, and avg heap occupancy / gc time hasn't changed much in that time; so in that sense, I don't see signs of memory leaks. GC and heap options are -XX:+UseG1GC -XX:MaxGCPauseMillis=500 -Xms24g -Xmx24g. Is your experience different from this?
I don't understand how you don't see any PhantomReferences. Aren't there any segments loaded? Each loaded segment means one (or several) direct ByteBuffers (memory mapping). Each direct ByteBuffer does have a Cleaner, that means a PhantomReference.
I see 5183 io.druid.segment.QueryableIndexSegment objects (loaded segments), 2.6 million java.nio.DirectByteBufferR, 140k java.nio.DirectByteBuffer, 7422 sun.misc.Cleaner, but no PhantomReferences. This is on OpenJDK 1.8.0_181 and Druid 0.12.3-iap8 (like 0.12.3, but with some extra patches, see https://docs.imply.io/2.7.9/on-prem/misc/release).
In my JDK, Cleaner extends PhantomReference, so maybe that's why there are no explicit PhantomReference objects. The Cleaners are PhantomReferences. It doesn't explain why you do see them though.
Right, sorry for the confusion. I have Cleaners too; I just associated them with PhantomReferences.
Got it. So, sure, I do see a few thousand of those, on a box that has 5000 segments loaded. And that box does about 0.5s-2s of GC activity per minute.
    try {
      final T retVal;

      try (final FileOutputStream out = new FileOutputStream(tmpFile)) {
Please use FileChannel.open(...) and then Channels.newOutputStream()
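The suggested pattern, sketched as a standalone helper (hypothetical class name): open a FileChannel, which carries no finalizer, and wrap it with Channels.newOutputStream when an OutputStream is needed.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class ChannelWrite
{
  public static void write(final Path file, final String content) throws IOException
  {
    try (FileChannel channel = FileChannel.open(file, StandardOpenOption.WRITE, StandardOpenOption.CREATE)) {
      // Channels.newOutputStream is unbuffered; writes go straight to the channel,
      // so closing the channel is enough and no flush() is required.
      final OutputStream out = Channels.newOutputStream(channel);
      out.write(content.getBytes(StandardCharsets.UTF_8));
    }
  }
}
```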
    finally {
      tmpFile.delete();
      if (tmpFile.exists()) {
        if (!tmpFile.delete()) {
Files.delete() will let you see the reason for the failure (IO error, file not found, permissions, etc.).
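For example (hypothetical helper, just to show the difference): Files.delete throws a typed IOException describing why it failed, whereas File.delete() only returns false with no reason.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class DeleteDemo
{
  /** Deletes "path" and reports what happened, using the typed exceptions from Files.delete. */
  public static String deleteReason(final Path path)
  {
    try {
      Files.delete(path);
      return "deleted";
    }
    catch (NoSuchFileException e) {
      return "no such file: " + e.getFile();
    }
    catch (IOException e) {
      // Permissions problems, non-empty directories, etc. all arrive with details.
      return e.toString();
    }
  }
}
```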
Pushed an update.
    try {
      try (final FileOutputStream out = new FileOutputStream(tmpFile)) {
      //noinspection unused
      try (final Closeable deleter = () -> java.nio.file.Files.deleteIfExists(tmpFile.toPath())) {
Now IOExceptions that could be thrown from deleteIfExists() are propagated
I think that's fine. It would be a shock to have it fail, and maybe it's best to propagate in that case.
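The deleter idiom being discussed, extracted into a minimal standalone sketch (hypothetical class name): a Closeable lambda in try-with-resources removes the temp file whether the body succeeds or throws, and any IOException from deleteIfExists propagates.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleterDemo
{
  public static void withTempFile(final Path tmpFile) throws IOException
  {
    //noinspection unused
    try (Closeable deleter = () -> Files.deleteIfExists(tmpFile)) {
      Files.write(tmpFile, "scratch".getBytes(StandardCharsets.UTF_8));
      // ... work with tmpFile ...
    }
    // tmpFile is gone here, even if the body threw; a failed delete
    // surfaces as an IOException rather than being silently ignored.
  }
}
```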
        StandardOpenOption.WRITE,
        StandardOpenOption.CREATE_NEW
      );
      final OutputStream out = Channels.newOutputStream(fileChannel)
Maybe a bit clearer to factor this as a nested try-with-resources block and omit out.flush().
The fileChannel.force (fsync) has to be done while the stream is still open, so we can't close the OutputStream before that happens. So that's why the flush was first. I just pushed a change with some more comments describing what's going on.
But, I also removed the out.flush completely in this commit -- Channels.newOutputStream is specced as unbuffered, so a flush isn't necessary.
Right, thanks for researching that.
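Putting the thread together, the atomic-write pattern under discussion can be sketched roughly like this (a hypothetical class, not Druid's actual FileUtils; cleanup of the temp file on failure is omitted for brevity, and fsync-ing a directory by opening it with READ is a Unix-oriented trick that may not work on every platform):

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

public class AtomicWriteSketch
{
  public static void writeAtomically(final Path target, final byte[] content) throws IOException
  {
    final Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
    try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW)) {
      // Channels.newOutputStream is unbuffered, so no flush() is needed before force().
      final OutputStream out = Channels.newOutputStream(channel);
      out.write(content);
      // fsync the data while the channel is still open.
      channel.force(true);
    }
    // Atomically move into place, then fsync the directory entry too,
    // as the fsync(2) man page advises, so the rename itself is durable.
    Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    try (FileChannel dir = FileChannel.open(target.getParent(), StandardOpenOption.READ)) {
      dir.force(true);
    }
  }
}
```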
* FileUtils: Sync directory entry too on writeAtomically. See the fsync(2) man page for why this is important: https://linux.die.net/man/2/fsync This also plumbs CompressionUtils's "zip" function through writeAtomically, so the code for handling atomic local filesystem writes is all done in the same place.
* Remove unused import.
* Avoid FileOutputStream.
* Allow non-atomic writes to overwrite.
* Add some comments. And no need to flush an unbuffered stream.