
FileUtils: Sync directory entry too on writeAtomically. #6677

Merged
leventov merged 5 commits into apache:master from gianm:fix-atomic-file-write
Dec 8, 2018

Conversation

@gianm
Contributor

@gianm gianm commented Nov 28, 2018

See the fsync(2) man page for why this is important:
https://linux.die.net/man/2/fsync

This also plumbs CompressionUtils's "zip" function through
writeAtomically, so the code for handling atomic local filesystem
writes is all done in the same place.

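For illustration, the write / fsync / rename / fsync-directory pattern described above can be sketched as follows. This is a simplified stand-in, not the PR's actual FileUtils code; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

public class AtomicWriteSketch
{
  // Write data to a temp file, fsync it, rename it into place, then fsync the
  // parent directory so the rename itself is durable (see fsync(2)).
  static void writeAtomically(Path target, byte[] data) throws IOException
  {
    final Path tmp = target.resolveSibling(target.getFileName() + ".tmp");

    try (FileChannel channel = FileChannel.open(tmp, StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW)) {
      final OutputStream out = Channels.newOutputStream(channel);
      out.write(data);
      // fsync the file contents while the channel is still open.
      channel.force(true);
    }

    Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);

    // fsync the directory entry; on some platforms directories cannot be
    // opened or forced this way, so failures are ignored here.
    try (FileChannel dir = FileChannel.open(target.getParent(), StandardOpenOption.READ)) {
      dir.force(true);
    }
    catch (IOException ignored) {
      // best effort on non-Linux platforms
    }
  }

  public static void main(String[] args) throws IOException
  {
    final Path target = Files.createTempDirectory("atomic-demo").resolve("out.txt");
    writeAtomically(target, "hello".getBytes());
    System.out.println(new String(Files.readAllBytes(target))); // prints "hello"
  }
}
```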
if (fsync) {
return FileUtils.writeAtomically(outputZipFile, out -> zip(directory, out));
} else {
try (final FileOutputStream out = new FileOutputStream(outputZipFile)) {
Member

Please don't use FileOutputStream and FileInputStream in any new code (and they should probably be eliminated from old code too), because they are finalizable. Please use a static factory method in Files instead.
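A minimal illustration of the suggested replacement; the file name here is a placeholder standing in for `outputZipFile` from the snippet above.

```java
import java.io.File;
import java.io.OutputStream;
import java.nio.file.Files;

public class NoFinalizerWrite
{
  public static void main(String[] args) throws Exception
  {
    final File outputZipFile = File.createTempFile("example", ".zip");

    // Files.newOutputStream returns a channel-backed stream and, unlike
    // new FileOutputStream(file), carries no finalize() method.
    try (OutputStream out = Files.newOutputStream(outputZipFile.toPath())) {
      out.write(new byte[]{1, 2, 3});
    }

    System.out.println(outputZipFile.length()); // prints 3
  }
}
```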

Contributor Author

Oh, interesting, I didn't know that. I'll change it.

Do you have a sense of roughly how expensive a finalizer is? Is it bad enough that we should care a lot about adjusting old code, or only bad enough that we should adjust old code that creates a lot of objects (maybe code that creates lots of short-lived tmp files)?

From looking around online it seems that the JDK will probably remove finalization from these classes eventually (https://bugs.openjdk.java.net/browse/JDK-8187325).

Contributor Author

Well, now that I look, it seems like there aren't that many anyway, so maybe not a big deal to adjust them all.

Member

@leventov leventov Nov 28, 2018

Personally, I've only heard that all types of references from java.lang.ref are somehow bad for GC and can lead to effective memory leaks, both of which are especially true for finalizable objects. But I cannot prove that.

Anecdote:
We are constantly suffering GC and apparent memory leak problems with pretty much all Druid node types, problems that only get worse with Druid updates, but we cannot understand the reasons. We only try to make incremental improvements on this front, such as #6370 and other PRs.

On the suffering nodes, we cannot make full heap dumps to analyze them because the heaps are too big, but we can see heap stats (a la jmap -histo), and they show, for example, on one Historical node: more than 10k finalizable objects on the heap (determined by the number of java.lang.ref.Finalizer objects), tens of thousands of java.lang.ref.PhantomReferences, and several thousand each of WeakReferences and SoftReferences. I don't know where the hell all this comes from. (Except for the PhantomReferences, which are apparently connected to direct ByteBuffers; #4773 could help here.)

Member

finalize() in FileInputStream and FileOutputStream is removed in OpenJDK 12: https://bugs.openjdk.java.net/browse/JDK-8192939, but Druid probably won't enjoy that until OpenJDK 17, because versions 12 through 16 are the "unsupported" releases that nobody would likely risk using: https://blog.joda.org/2018/10/adopt-java-12-or-stick-on-11.html

Another minor reason to prefer the Files static factory methods is that they provide explicit control over how you open the file via OpenOptions. FileOutputStream always creates a file if it doesn't exist, or truncates an existing file, and this cannot be configured.
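The difference can be demonstrated directly; the default options of Files.newOutputStream behave like FileOutputStream, while CREATE_NEW adds behavior FileOutputStream cannot express. The file names here are illustrative only.

```java
import java.io.OutputStream;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class OpenOptionDemo
{
  public static void main(String[] args) throws Exception
  {
    final Path file = Files.createTempDirectory("opts").resolve("data.txt");

    // Default options mirror FileOutputStream: CREATE + TRUNCATE_EXISTING + WRITE.
    try (OutputStream out = Files.newOutputStream(file)) {
      out.write("first".getBytes());
    }

    // CREATE_NEW refuses to clobber an existing file, something that is
    // impossible to configure with new FileOutputStream(file).
    try (OutputStream out = Files.newOutputStream(file, StandardOpenOption.WRITE, StandardOpenOption.CREATE_NEW)) {
      out.write("second".getBytes());
    }
    catch (FileAlreadyExistsException expected) {
      System.out.println("refused to overwrite: " + expected.getFile());
    }
  }
}
```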

Contributor Author

We are constantly suffering GC and apparent memory leak problems with pretty much all Druid node types, problems that only get worse with Druid updates, but we cannot understand the reasons. We only try to make incremental improvements on this front, such as #6370 and other PRs.

Hm interesting. I did a jmap -histo on a historical of a fairly active Druid cluster (~150 queries/sec, ~5000 segments/server) and I see about 900 WeakReferences, 200 SoftReferences, no PhantomReferences, and 1100 Finalizers. It is typically doing 0.5-2sec of GC time per minute (based on the jvm/gc/time metric). It has been running for a month without being restarted, and avg heap occupancy / gc time hasn't changed much in that time; so in that sense, I don't see signs of memory leaks. GC and heap options are -XX:+UseG1GC -XX:MaxGCPauseMillis=500 -Xms24g -Xmx24g. Is your experience different from this?

Member

I don't understand how you don't see any PhantomReferences. Aren't there any segments loaded? Each loaded segment means one (or several) direct ByteBuffers (memory mapping). Each direct ByteBuffer does have a Cleaner, that means a PhantomReference.

Contributor Author

I see 5183 io.druid.segment.QueryableIndexSegment objects (loaded segments), 2.6 million java.nio.DirectByteBufferR, 140k java.nio.DirectByteBuffer, 7422 sun.misc.Cleaner, but no PhantomReferences. This is on OpenJDK 1.8.0_181 and Druid 0.12.3-iap8 (like 0.12.3, but with some extra patches, see https://docs.imply.io/2.7.9/on-prem/misc/release).

In my JDK, Cleaner extends PhantomReference, so maybe that's why there are no explicit PhantomReference objects. The Cleaners are PhantomReferences. It doesn't explain why you do see them though.

Member

Right, sorry for the confusion. I have Cleaners too; I just associated them with PhantomReferences.

Contributor Author

Got it. So, sure, I do see a few thousand of those on a box that has 5000 segments loaded. And that box does about 0.5s-2s of GC activity per minute.

try {
final T retVal;

try (final FileOutputStream out = new FileOutputStream(tmpFile)) {
Member

Please use FileChannel.open(...) and then Channels.newOutputStream()

finally {
tmpFile.delete();
if (tmpFile.exists()) {
if (!tmpFile.delete()) {
Member

Files.delete() will let you see the reason for the failure (IO error, file not found, permissions, etc.)
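The contrast is easy to see: File.delete() only returns a boolean, while Files.delete() throws a specific exception naming the file. The path below is generated just for this demo.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;

public class DeleteDiagnostics
{
  public static void main(String[] args)
  {
    final File missing = new File(System.getProperty("java.io.tmpdir"), "no-such-file-" + System.nanoTime());

    // File.delete() just reports false, with no hint as to why it failed.
    System.out.println(missing.delete());

    // Files.delete() throws an exception that carries the failure reason.
    try {
      Files.delete(missing.toPath());
    }
    catch (NoSuchFileException e) {
      System.out.println("no such file: " + e.getFile());
    }
    catch (IOException e) {
      System.out.println("other IO error: " + e);
    }
  }
}
```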

@gianm
Contributor Author

gianm commented Nov 28, 2018

Pushed an update.

try {
try (final FileOutputStream out = new FileOutputStream(tmpFile)) {
//noinspection unused
try (final Closeable deleter = () -> java.nio.file.Files.deleteIfExists(tmpFile.toPath())) {
Member

Now IOExceptions that could be thrown from deleteIfExists() are propagated

Contributor Author

I think that's fine. It would be a shock to have it fail, and maybe it's best to propagate in that case.
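The deleter pattern under discussion can be exercised in isolation. This sketch mimics the shape of the code under review rather than quoting it; resources close in reverse declaration order, so the stream is closed before the deleter runs, and any IOException from deleteIfExists() propagates to the caller.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class DeleterPattern
{
  public static void main(String[] args) throws IOException
  {
    final Path tmpFile = Files.createTempFile("deleter", ".tmp");

    // The deleter Closeable runs last when the try block exits, so the tmp
    // file is removed on both success and failure paths.
    //noinspection unused
    try (final Closeable deleter = () -> Files.deleteIfExists(tmpFile);
         final OutputStream out = Files.newOutputStream(tmpFile)) {
      out.write("scratch".getBytes());
    }

    System.out.println(Files.exists(tmpFile)); // prints false
  }
}
```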

StandardOpenOption.WRITE,
StandardOpenOption.CREATE_NEW
);
final OutputStream out = Channels.newOutputStream(fileChannel)
Member

Maybe a bit clearer to factor this as a nested try-with-resources block and omit out.flush()

Contributor Author

The fileChannel.force (fsync) has to be done while the stream is still open, so we can't close the OutputStream before that happens. So that's why the flush was first. I just pushed a change with some more comments describing what's going on.

But, I also removed the out.flush completely in this commit -- Channels.newOutputStream is specced as unbuffered, so a flush isn't necessary.

Member

Right, thanks for researching that.

@fjy fjy added this to the 0.13.1 milestone Dec 3, 2018
@leventov leventov merged commit b7709e1 into apache:master Dec 8, 2018
gianm added a commit to implydata/druid-public that referenced this pull request Dec 18, 2018
* FileUtils: Sync directory entry too on writeAtomically.

See the fsync(2) man page for why this is important:
https://linux.die.net/man/2/fsync

This also plumbs CompressionUtils's "zip" function through
writeAtomically, so the code for handling atomic local filesystem
writes is all done in the same place.

* Remove unused import.

* Avoid FileOutputStream.

* Allow non-atomic writes to overwrite.

* Add some comments. And no need to flush an unbuffered stream.
clintropolis pushed a commit to implydata/druid-public that referenced this pull request Feb 5, 2019
@leventov leventov mentioned this pull request Aug 19, 2019
@gianm gianm deleted the fix-atomic-file-write branch September 23, 2022 19:22

3 participants