feat(local): add fsync to LocalFileSystem for durability #643
Conversation
Call sync_all() on written files and fsync parent directories at all write-path boundaries (put, copy, rename, multipart complete) so that a successful return guarantees data is durable on disk, matching the implicit contract of cloud object stores.
I think this is technically what we should do. I wonder, however, whether we want to offer some kind of opt-out mechanism (in the vein of "safety + sanity by default", it shouldn't be opt-in). WDYT @tustvold?
I think calling fsync on the files makes a lot of sense, and ensures that we maintain atomicity (or at least do our best to). Calling fsync on directories also makes sense, but I think this PR doesn't go far enough: it needs to ensure that any recursively created directories are also fsynced.

I think it makes sense for this to be enabled by default (in a future breaking release), but it should be possible to turn this behaviour off, perhaps with a separate option that just fsyncs files, ensuring atomicity.

There is also the question to me of what happens if a process is writing lots of files to the same directory: fsyncing the directory on every new file is rather wasteful, and you really want some mechanism for the caller to fsync as part of some higher-level transaction. I don't know how we would expose this, though... Perhaps it doesn't matter - LocalFileSystem inherently trades performance for being able to interoperate with a filesystem; if someone wants optimal disk performance they're probably better off using io_uring and something custom anyway.
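For readers unfamiliar with the file-plus-directory pattern discussed above, here is a minimal sketch of a durable write: fsync the file contents, rename into place, then fsync the parent directory so the new directory entry also survives a crash. `durable_put` is a hypothetical helper for illustration, not object_store's actual code, and it assumes a Unix platform where a directory can be opened read-only and fsynced.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper (not part of object_store's public API): write `data`
/// to `dest` durably via a temp file + rename.
fn durable_put(dest: &Path, data: &[u8]) -> io::Result<()> {
    let dir = dest.parent().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "destination has no parent")
    })?;
    let tmp = dir.join(".tmp-put");

    // 1. Write the contents and flush them to stable storage.
    let mut f = File::create(&tmp)?;
    f.write_all(data)?;
    f.sync_all()?; // fsync(2): file data + metadata reach disk

    // 2. Atomically move the temp file into place.
    fs::rename(&tmp, dest)?;

    // 3. fsync the parent directory so the new directory entry is durable.
    //    On Unix a directory can be opened read-only and synced like a file;
    //    this does not work on Windows.
    File::open(dir)?.sync_all()?;
    Ok(())
}
```

Note that without step 3, a crash after a "successful" write can leave the directory entry missing even though the file data was synced.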
I don't think we should automatically call fsync on filesystem writes. Users who want that behavior can already call `sync_all()` themselves.
I think it would be best put in a layer higher up in the call stack (whatever is calling / orchestrating this higher-level transaction using the `ObjectStore` API).
Defaulting to false seems a bit reckless to me; users are most certainly expecting the local "object store" to be durable, as all the other object store implementations in this crate are (aside from memory...). It was definitely a great surprise to me to learn that it was not.
Is this true? You need a
For the same reason as above, I don't think this is really feasible. |
They can, but it's a major hassle to do. As written above, just calling fsync on the file isn't enough; to implement this properly, you also have to walk the directories and figure out which ones you need to fsync (technically only the ones where new files or sub-directories have been created). That seems impractical to me, hence I think object-store should implement this, at least as an option.

To avoid confusion: calling fsync on a directory does NOT recursively fsync the whole subtree. It really just fsyncs "what's in this directory at this specific level" plus "what's the state of the directory itself, e.g. permissions and attributes".
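The non-recursive behavior described here means that after `create_dir_all` makes several nested directories, each level that gained a new entry must be synced individually. A minimal sketch of that walk, assuming Unix directory-fsync semantics (`fsync_created_dirs` is a hypothetical name, not an object_store API):

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Hypothetical sketch: after creating `created` (somewhere under `base`)
/// with create_dir_all, a single fsync on `base` does NOT make the nested
/// entries durable. Sync each level from the deepest new directory up.
fn fsync_created_dirs(base: &Path, created: &Path) -> io::Result<()> {
    let mut dir = created;
    loop {
        // fsync covers only this directory's own entries, not the subtree.
        File::open(dir)?.sync_all()?; // Unix: directories can be opened and fsynced
        if dir == base {
            break;
        }
        dir = dir.parent().expect("created must live under base");
    }
    Ok(())
}
```

In a real implementation one would only sync the directories that were actually created or modified, as the comment above suggests.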
I agree that if we were writing this library from scratch, having writes to files automatically call fsync would make sense. Likewise, I am not debating that adding some way to call fsync is valuable. I recommend we do this in two PRs:
This is a good point @adamreeve -- I stand corrected. |
Rationale for this change
When LocalFileSystem::put (or copy/rename/multipart complete) returns Ok, callers reasonably expect the data to be durable on disk as this is the implicit contract of every cloud object store like S3 or GCS.
However, LocalFileSystem never called fsync/sync_all, meaning the OS was free to keep the data in its page cache indefinitely. A crash or power loss after a successful put could result in data loss or zero-length files.
This change adds sync_all() calls on written files and fsync on parent directories at every write-path boundary (put_opts, copy_opts, rename_opts, multipart complete), ensuring that when an operation returns success, both the file contents and the directory entry pointing to them are durable on stable storage.
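The rename boundary deserves the same treatment as put: rename(2) is atomic, but the updated directory entries only become durable once the affected parent directories are fsynced. A sketch of that step, under the same Unix assumptions as above (`durable_rename` is an illustrative helper, not the PR's actual code):

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;

/// Hypothetical helper: rename `from` to `to`, then fsync the affected
/// parent directories so the change survives a crash or power loss.
fn durable_rename(from: &Path, to: &Path) -> io::Result<()> {
    fs::rename(from, to)?;
    // Sync the destination's parent so the new entry is durable...
    if let Some(dst) = to.parent() {
        File::open(dst)?.sync_all()?;
    }
    // ...and the source's parent (if different) so the removal is too.
    match (from.parent(), to.parent()) {
        (Some(src), Some(dst)) if src != dst => {
            File::open(src)?.sync_all()?;
        }
        _ => {}
    }
    Ok(())
}
```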
Are there any user-facing changes?
No.