TiffSaver: Only setLength of output file once by dgault · Pull Request #3239 · ome/bioformats

dgault · 2018-10-02T11:50:24Z

This relates to ongoing work on TIFF writing performance; see https://trello.com/c/OimiHAQY/39-ome-common-profile-tiff-writing and ome/ome-common-java#15

As shown in ome/ome-common-java#15, the problem lies in repeated calls to setLength on RandomAccessFile.

The issue, originally thought to be Windows-specific, looks like it may instead be related to the Java version, with RandomAccessFile.setLength being much slower on Java 10 (see https://stackoverflow.com/questions/50450317/randomaccessfile-setlength-much-slower-on-java-10-centos for some info)

In Java 8 setLength does:

[pid 49027] ftruncate(23, 53248)        = 0
[pid 49027] lseek(23, 0, SEEK_SET)      = 0
[pid 49027] lseek(23, 0, SEEK_CUR)      = 0

While in Java 10 it introduces the much slower fallocate (to fix a bug):

[pid   444] fstat(8, {st_mode=S_IFREG|0664, st_size=126976, ...}) = 0
[pid   444] fallocate(8, 0, 0, 131072)  = 0
[pid   444] lseek(8, 0, SEEK_SET)       = 0
[pid   444] lseek(8, 0, SEEK_CUR)       = 0

This is particularly problematic for us, as profiling shows we make a large number of calls to setLength. Essentially, every time we write a strip we call it to increase the length of the output file.
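The per-strip pattern described above can be sketched with plain java.io.RandomAccessFile. This is only an illustration, not the TiffSaver or NIOFileHandle code: the class name PerStripGrowth, the method writeStrips and the call counter are hypothetical, and the sketch assumes the file is grown immediately before each strip is written.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical illustration: grows the file once per strip, as the profiling suggests.
final class PerStripGrowth {

  /** Writes stripCount strips of stripSize bytes and returns how often setLength was called to grow the file. */
  static int writeStrips(File file, int stripCount, int stripSize) throws IOException {
    int setLengthCalls = 0;
    byte[] strip = new byte[stripSize];
    try (RandomAccessFile out = new RandomAccessFile(file, "rw")) {
      out.setLength(0); // start from an empty file (not counted below)
      for (int i = 0; i < stripCount; i++) {
        long end = out.getFilePointer() + stripSize;
        if (end > out.length()) {
          out.setLength(end); // one extension per strip
          setLengthCalls++;
        }
        out.write(strip);
      }
    }
    return setLengthCalls;
  }
}
```

With this pattern the number of setLength calls scales with the number of strips, which is the cost the PR tries to avoid.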

Buffering in NIOFileHandle will help, but since TiffSaver knows the length of the data it is writing, it might make more sense to make a single call that allocates the full length up front. In this PR the allocation is forced by seeking to the final length, which in turn calls setLength.
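A minimal sketch of the single up-front allocation, again using plain java.io.RandomAccessFile rather than the Bio-Formats output stream. The class PreallocateOnce and its method are hypothetical; the names stripStartPos and totalStripSize are borrowed from the diff in this PR, and setLength is called directly here instead of being triggered through a seek.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical illustration: size the file once, then write every strip.
final class PreallocateOnce {

  /** Sizes the file for all strips with one setLength call, writes them, and returns the final length. */
  static long writeStrips(File file, long stripStartPos, byte[][] strips) throws IOException {
    long totalStripSize = 0;
    for (byte[] strip : strips) {
      totalStripSize += strip.length;
    }
    try (RandomAccessFile out = new RandomAccessFile(file, "rw")) {
      // single call covering every strip, instead of one call per strip
      out.setLength(stripStartPos + totalStripSize);
      out.seek(stripStartPos);
      for (byte[] strip : strips) {
        out.write(strip); // stays within the preallocated length
      }
      return out.length();
    }
  }
}
```

Because the total size is known before the first strip is written, none of the writes needs to extend the file.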

dgault · 2018-10-02T20:17:46Z

+    // sets the output file size without having to allocate for each strip iteration
+    out.seek(stripStartPos + totalStripSize);
+    // return to original position
+    out.seek(stripStartPos);


This seek is unnecessary and can be removed

dgault · 2018-10-08T16:18:13Z

The attached screenshots show the call stacks before and after the change.

[Figure: call stack screenshots, before and after]

Using bfconvert with tubhiswt-4D dataset:

Windows - Java 8
Without PR: 71.031s elapsed (4.6918607+74.8314ms per plane, 2125ms overhead)
With PR: 26.018s elapsed (4.816279+23.474419ms per plane, 1016ms overhead)

Windows - Java 10 (actually not impacted as much as expected)
Without PR: 78.485s
With PR: 23.783s
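As a rough reading of the timings above, the speed-up is the elapsed time without the PR divided by the elapsed time with it. The helper below is a hypothetical convenience for that arithmetic and uses only the four elapsed times quoted in this comment.

```java
import java.util.Locale;

// Hypothetical helper: ratio of elapsed times reported in this comment.
final class Speedup {

  /** Returns how many times faster the "after" run is than the "before" run. */
  static double ratio(double beforeSeconds, double afterSeconds) {
    return beforeSeconds / afterSeconds;
  }

  public static void main(String[] args) {
    // Windows, Java 8: 71.031s without the PR, 26.018s with it
    System.out.println(String.format(Locale.ROOT, "Java 8:  %.2fx", ratio(71.031, 26.018)));
    // Windows, Java 10: 78.485s without the PR, 23.783s with it
    System.out.println(String.format(Locale.ROOT, "Java 10: %.2fx", ratio(78.485, 23.783)));
  }
}
```

This works out to roughly 2.7x on Java 8 and 3.3x on Java 10 for this dataset.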

Tiling performance is not improved, so that still needs to be addressed.

dgault · 2018-10-09T16:04:37Z

As a follow-up, I've been profiling and debugging the tiled scenario further. In the case of tiling there are some differences, but the short of it is that we still have to call setLength for each tile, so performance will remain similar to before. That said, setLength doesn't impact the tiled case as much in the first place; the overhead there comes from the reading of tiles in ImageConverter rather than from the writing.

sbesson

The change makes sense, and it has been tested over the last few weeks on various conversions without any adverse effects being reported. All unit tests are passing. Merging for the next milestone of Bio-Formats 6.

TiffSaver: Only setLength of output file once

b52a70d

sbesson self-requested a review December 17, 2018 12:36

sbesson approved these changes Dec 17, 2018

sbesson merged commit 67cef27 into ome:develop Dec 17, 2018

dgault mentioned this pull request Jun 15, 2023

bfconvert very slow when converting to ome.tiff to a disk where writing is "slow" #3983

Closed

sbesson merged 1 commit into ome:develop from dgault:TiffSaver-avoid-setLength
