init: Add --transient flag for ephemeral builds#2889

Merged
jlebon merged 4 commits into coreos:main from cgwalters:transient-init
Jun 2, 2022

@cgwalters (Member) commented Jun 1, 2022

init: Add --transient flag for ephemeral builds

We go to a lot of effort to create various caches of builds, and this
is very useful for local incremental development to avoid re-downloading
RPMs and rewriting data.

Conversely, it is useless for CI builds that discard everything when they're done, and actually slows things down.

Add a `cosa init --transient` flag that for now just disables `fsync()`
on `tmp/repo`.

There's more we can do here in the future, for example we need to
propagate this too into `cache/repo-build` and `cache/pkgcache-repo`,
and actually in general we don't need the pkgcache repo at all here.
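
A minimal sketch of what the flag amounts to, for readers skimming the thread. The helper names and the `DRY_RUN` variable are hypothetical and not taken from `src/cmd-init`; the sketch also assumes the flag maps onto ostree's `core.fsync` repo option, which the description implies but does not spell out.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names are illustrative, not taken from src/cmd-init.

# Print 1 if --transient appears among the arguments, else 0.
parse_transient() {
    local arg transient=0
    for arg in "$@"; do
        if [ "$arg" = "--transient" ]; then
            transient=1
        fi
    done
    echo "$transient"
}

# Print the ostree command that turns off fsync for a repo.
# Assumption: the flag maps onto ostree's core.fsync repo option.
fsync_off_command() {
    local repo=$1
    echo "ostree --repo=${repo} config set core.fsync false"
}

# Run the command when transient is set; with DRY_RUN=1, only print it.
apply_transient() {
    local transient=$1 repo=$2
    [ "$transient" = 1 ] || return 0
    if [ "${DRY_RUN:-0}" = 1 ]; then
        fsync_off_command "$repo"
    else
        ostree --repo="$repo" config set core.fsync false
    fi
}
```

Calling `DRY_RUN=1 apply_transient "$(parse_transient "$@")" tmp/repo` prints the command instead of running it, which is handy on a machine without ostree installed.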


init: Add a success message

To make it clearer that things worked.


cgwalters added a commit to cgwalters/rpm-ostree that referenced this pull request Jun 1, 2022
This pairs with coreos/coreos-assembler#2889

If fsync is disabled on our target repository, take that as a hint
we should not care about durability for our cache repositories either.
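
The "take that as a hint" idea can be sketched in shell as below. This is not the rpm-ostree change itself: the function name is hypothetical, and reading the repo's `config` keyfile with `awk` is a simplification that only recognizes the literal value `false`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if the repo's config has fsync=false
# under [core]. Simplified keyfile parsing; only "false" is recognized.
repo_fsync_disabled() {
    local config="$1/config"
    [ -f "$config" ] || return 1
    awk '
        /^\[/ { in_core = ($0 == "[core]") }
        in_core && /^fsync[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")
            val = $0
        }
        END { exit (val == "false") ? 0 : 1 }
    ' "$config"
}
```

A caller could then skip durability work on the cache repositories whenever `repo_fsync_disabled tmp/repo` succeeds.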
Comment thread src/cmd-init
@dustymabe (Member)

> We go to a lot of effort to create various caches of builds, and this is very useful for local incremental development to avoid re-downloading RPMs and rewriting data.

I kind of feel like disabling fsync() should be the default
everywhere. If we lose power somewhere we're probably going
to start a build from scratch, right? Even if our local cache
got messed up we could easily just blow it away.

@cgwalters (Member, Author)

> Even if our local cache got messed up we could easily just blow it away.

Yes, in theory. But...the hairy thing is correctly detecting that. A perfect example here is today, if one does `yum -y install ccache`, it starts caching gcc compilations automatically in `~/.ccache`. Which works great until you force reboot or the kernel panics etc.

In that case, you may end up with zero-sized or truncated or otherwise corrupted cached objects in `~/.ccache`. And it's really confusing because you'll hit an obscure linker error, and do e.g. a `git clean -dfx` and rebuild and the problem still happens and you're thinking WTF... I think that's happened to me at least twice, and I eventually learned to `rm -rf ~/.ccache` when I see obscure linker errors.

ostree does this thing with using the boot-id but it's still not perfect.
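
For readers unfamiliar with the boot-id idea, here is a rough sketch of the general technique, not of how ostree implements it. The stamp file name `.boot-id`, the function names, and the `BOOT_ID_FILE` override are all hypothetical; the only real interface used is Linux's `/proc/sys/kernel/random/boot_id`, which holds a random UUID regenerated on every boot.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating a boot-id stamp on a cache directory.

# Print the current boot ID (Linux-specific; path overridable for testing).
current_boot_id() {
    cat "${BOOT_ID_FILE:-/proc/sys/kernel/random/boot_id}"
}

# Record the boot under which the cache was last written.
stamp_cache() {
    local cache=$1
    mkdir -p "$cache"
    current_boot_id > "$cache/.boot-id"
}

# Succeed only if the cache was stamped during the current boot.
cache_is_trusted() {
    local cache=$1
    [ -f "$cache/.boot-id" ] &&
        [ "$(cat "$cache/.boot-id")" = "$(current_boot_id)" ]
}
```

As the comment says, this is not perfect: it distrusts the cache after every reboot, clean or not, and it says nothing about writes that were lost without a reboot.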

I mean, I agree with you obviously, it's just one of those things where the last 10% of corner cases is 90% of the work.

So for now...I think we need to just make this opt-in for CI.

@dustymabe (Member)

Good point. We could "big hammer" this and just force-clean the caches if the system booted more recently than the cache was last touched. Or warn the user.
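
A rough sketch of that "big hammer", assuming Linux (`/proc/uptime`) and GNU `stat`. The function names are hypothetical and nothing like this is part of the PR.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: drop a cache that was last touched before the
# current boot. Assumes Linux /proc/uptime and GNU coreutils stat.

# Print the approximate boot time as seconds since the epoch.
boot_epoch() {
    local now up
    now=$(date +%s)
    up=$(cut -d. -f1 /proc/uptime)   # first field: seconds since boot
    echo $(( now - up ))
}

# Succeed if the cache mtime (epoch seconds) is older than the boot time.
cache_predates_boot() {
    local cache_mtime=$1 boot=$2
    [ "$cache_mtime" -lt "$boot" ]
}

# Remove the cache directory if it predates the boot; warn on stderr.
# The optional second argument overrides the boot time (for testing).
maybe_clean_cache() {
    local cache=$1 boot=${2:-}
    [ -d "$cache" ] || return 0
    [ -n "$boot" ] || boot=$(boot_epoch)
    local mtime
    mtime=$(stat -c %Y "$cache")
    if cache_predates_boot "$mtime" "$boot"; then
        echo "warning: $cache predates the current boot; removing it" >&2
        rm -rf "$cache"
    fi
}
```

A directory's mtime only changes when entries are added or removed directly inside it, so this is a coarse check; a stamp file touched on every cache write would be more reliable.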

jlebon (Member) previously approved these changes Jun 2, 2022

Makes sense to me!

jlebon enabled auto-merge (rebase) June 2, 2022 13:18

@miabbott (Member) commented Jun 2, 2022

gangplank tests are being flaky (same thing observed on #2892); restarted test

@cgwalters (Member, Author)

OK I pushed more changes here, most notably a bit that avoids the need for coreos/rpm-ostree#3719 in the virt case (most important one here).

jlebon (Member) previously approved these changes Jun 2, 2022

We'll want to make sure to update coreos-ci-lib and the pipelines to make use of this too.

Comment thread src/cmdlib.sh Outdated
Comment thread src/cmdlib.sh
cgwalters added 4 commits June 2, 2022 15:39
We go to a lot of effort to create various caches of builds, and this
is very useful for local incremental development to avoid re-downloading
RPMs and rewriting data.

It is useless for CI builds that discard everything when they're done.
Add a `cosa init --transient` flag that for now just disables `fsync()`
on `tmp/repo`.

There's more we can do here in the future, for example we need to
propagate this too into `cache/repo-build` and `cache/pkgcache-repo`,
and actually in general we don't need the pkgcache repo at all here.

To make it clearer that things worked.

Followup to the addition of `--transient` for CI flows.  If we had
coreos/rpm-ostree#3719 this would mostly be unnecessary, but today
in CI and prod builds, ostree is invoking `fsync()` inside the
supermin VM, and then qemu in the container is going to `fsync()`
all changes down to the host system (overlay)fs.

For transient flows, switch to `cache=unsafe` so we stop doing
that, which should greatly help speed.

So we test it, and so we also gain the speed benefits.
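
The `cache=unsafe` switch from the third commit message above could look roughly like this. The function name and the other drive options (`if=virtio`, `format=qcow2`) are assumptions for illustration, not the contents of `src/cmdlib.sh`; `cache=unsafe` itself is a standard QEMU `-drive` cache mode that ignores flush requests from the guest.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the value for qemu's -drive option.
# Only the cache=unsafe part comes from the commit message; the
# remaining options are illustrative.
qemu_drive_opts() {
    local image=$1 transient=${2:-0}
    local opts="if=virtio,format=qcow2,file=${image}"
    if [ "$transient" = 1 ]; then
        # Guest flushes are ignored, so data may be lost if the host
        # crashes. Acceptable only when the build output is disposable.
        opts="${opts},cache=unsafe"
    fi
    echo "$opts"
}
```

It would be used as `qemu-system-x86_64 -drive "$(qemu_drive_opts cache.qcow2 1)" ...`; without the second argument the cache mode is left at QEMU's default.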
jlebon merged commit 57ce5c9 into coreos:main Jun 2, 2022
cgwalters added a commit to cgwalters/os that referenced this pull request Jun 3, 2022
Opt into the faster infrastructure from
coreos/coreos-assembler#2889
for CI builds where we do not maintain any state across builds.