Avoid killing runc init early#2855
Merged
Contributor
Author
Some repo (as configured by github) appears to be broken:
Restarted; got a different but similar error:
a59f526 to
6c408c9
Contributor
Author
(validate CI is failing here -- fixed in #2860) @AkihiroSuda @cyphar @lifubang PTAL
Contributor
Author
@AkihiroSuda @cyphar @lifubang PTAL (please ignore failed CI -- it is fixed in #2860)
Member
Could you rebase? LGTM then.
Add some minimal validation for cgroups. The following checks are implemented:
- cgroup name and/or prefix (or path) is set;
- for cgroup v1, unified resources are not set;
- for cgroup v2, if memorySwap is set, memory is also set, and memorySwap > memory.

This makes some invalid configurations fail earlier (before runc init is started), which is better.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
The stars can be aligned in a way that results in runc leaving a stale bind mount in the container's state directory, which manifests itself later, while trying to remove the container, in an error like this:

> remove /run/runc/test2: unlinkat /run/runc/test2/runc.W24K2t: device or resource busy

The stale mount happens because runc start/run/exec kills runc init while it is inside ensure_cloned_binary(). One such scenario is when a unified cgroup resource is specified for cgroup v1, so that a cgroup manager's Apply returns an error (as of commit b006f4a) and (*initProcess).start() kills runc init just after it was started.

One solution is NOT to kill runc init too early. To achieve that, amend the libcontainer/nsenter code to send a \0 byte to signal that it is past the initial setup, and make start() (for both run/start and exec) wait for this byte before proceeding with kill on an error path.

While at it, improve some error messages.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
6c408c9 to
4ecff8d
Contributor
Author
Rebased to include #2860
AkihiroSuda
approved these changes
Apr 1, 2021
Two commits, each one solving the problem of a stale bind mount
left by runc init after an unsuccessful container start.
See #2843 for more details
and the initial investigation.
Closes: #2843
start: don't kill runc init too early
The stars can be aligned in a way that results in runc leaving a stale
bind mount in the container's state directory, which manifests itself later,
while trying to remove the container, in an error like this:
> remove /run/runc/test2: unlinkat /run/runc/test2/runc.W24K2t: device or resource busy
The stale mount happens because runc start/run/exec kills runc init
while it is inside ensure_cloned_binary(). One such scenario is when
a unified cgroup resource is specified for cgroup v1, so that a cgroup
manager's Apply returns an error (as of commit b006f4a) and
(*initProcess).start() kills runc init just after it was started.
One solution is NOT to kill runc init too early. To achieve that,
amend the libcontainer/nsenter code to send a \0 byte to signal
that it is past the initial setup, and make start() (for both
run/start and exec) wait for this byte before proceeding with
kill on an error path.
While at it, improve some error messages.
libct/configs/validator: add some cgroup support
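Add some minimal validation for cgroups. The following checks
are implemented:
- cgroup name and/or prefix (or path) is set;
- for cgroup v1, unified resources are not set;
- for cgroup v2, if memorySwap is set, memory is also set,
and memorySwap > memory.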
This makes some invalid configurations fail earlier (before runc init
is started), which is better, as this should prevent killing runc init
in the middle of ensure_cloned_binary().
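As a rough illustration of the checks listed in this commit message, here is a minimal Go sketch. The `Cgroup` and `Resources` types and the `validateCgroup` function are simplified, hypothetical stand-ins (not the libcontainer types or the validator added by this PR), and the sketch assumes that a zero value means "unset".

```go
package main

import (
	"errors"
	"fmt"
)

// Resources is a hypothetical, simplified stand-in for cgroup resources.
type Resources struct {
	Memory     int64             // memory limit in bytes; 0 is treated as unset
	MemorySwap int64             // memory+swap limit in bytes; 0 is treated as unset
	Unified    map[string]string // cgroup v2 unified resources
}

// Cgroup is a hypothetical, simplified stand-in for a cgroup config.
type Cgroup struct {
	Name      string
	Prefix    string
	Path      string
	Resources Resources
}

// validateCgroup applies the three checks from the commit message.
// isV2 tells whether the host is assumed to use cgroup v2.
func validateCgroup(c *Cgroup, isV2 bool) error {
	if c == nil {
		return nil
	}
	if c.Name == "" && c.Prefix == "" && c.Path == "" {
		return errors.New("cgroup: name, prefix, or path must be set")
	}
	r := c.Resources
	if !isV2 {
		if len(r.Unified) > 0 {
			return errors.New("cgroup: unified resources require cgroup v2")
		}
		return nil
	}
	if r.MemorySwap > 0 {
		if r.Memory <= 0 {
			return errors.New("cgroup: memorySwap is set but memory is not")
		}
		if r.MemorySwap <= r.Memory {
			return fmt.Errorf("cgroup: memorySwap (%d) must be greater than memory (%d)",
				r.MemorySwap, r.Memory)
		}
	}
	return nil
}

func main() {
	bad := &Cgroup{
		Path:      "/test",
		Resources: Resources{Unified: map[string]string{"memory.high": "1000000"}},
	}
	// On cgroup v1 this config is rejected before any child process is started.
	fmt.Println(validateCgroup(bad, false))
}
```

Running such checks before runc init is spawned means an invalid configuration produces an error without any process to kill.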