Conversation


@razzeee razzeee commented Dec 13, 2025

…porary file to prevent OOM for large files.

Hey,

my C is very bad, so this was found and done with AI after looking at the reports in flatpak/flatpak#6255.

I would expect this to be slightly less performant, but to work more reliably on systems with less RAM. Most of what we do here also seems to have been done elsewhere in the codebase here and there, so going this route doesn't seem unprecedented.

I haven't been able to runtime-test this and I'm unsure how, especially how to validate that it still works in the wild (e.g. that it doesn't break deltas).


openshift-ci bot commented Dec 13, 2025

Hi @razzeee. Thanks for your PR.

I'm waiting for a github.com member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This is a great approach to solving the OOM issue with large deltas. Using a memory-mapped temporary file is a solid strategy to reduce memory pressure. The implementation of reading the patched data in chunks is also well done.

However, I've found a critical issue with the handling of zero-sized output files. The new implementation using mmap will fail in this case, which was handled correctly by the previous g_malloc0 implementation. This would cause a regression for any delta that results in a zero-byte file.

I've added a few detailed comments with suggestions on how to fix this by special-casing when content_size is zero. Please take a look. With these changes, this should be a very robust improvement!


@simonmicro simonmicro left a comment


I would be interested to learn more about your reasoning behind changing the code around _ostree_repo_bare_content_write.

guint64 bytes_remaining = state->content_size;
while (bytes_remaining > 0)
{
guchar chunk_buf[8192];


Have you considered bigger block sizes like experimented with in src/libostree/ostree-repo-commit.c:1232?

Comment on lines +485 to +487
do
bytes_read = read (tmpf.fd, chunk_buf, chunk_size);
while (G_UNLIKELY (bytes_read == -1 && errno == EINTR));


Even though reading a chunk this small should not take "long enough" for a badly placed interrupt to land, you retry on EINTR here. Could you consider also patching the read in src/libostree/ostree-repo-commit.c:1240 the same way?

if (!_ostree_repo_bare_content_write (repo, &state->content_out, buf, state->content_size,
cancellable, error))
return FALSE;
/* Now read from tmpfile and write to repository (if content_size > 0) */


Why was this done? Previously, the (potentially) large output buffer was written into the repository via _ostree_repo_bare_content_write. Now this code reads the patched content back into memory chunk by chunk and then writes it to disk. But since the source was changed to a (potentially) memory-mapped buffer, this is not needed; I would even say it introduces an unnecessary copy.

Furthermore, the glnx_loop_write part of _ostree_repo_bare_content_write may now perform additional calls to write, as the chunk sizes may not align with its retry logic; see https://github.com/GNOME/libglnx/blob/de29c5f7d9df8d57b4f5caa9920f5d4caa7a8cfc/glnx-fdio.c#L752
