libct: fix a race with systemd removal #3812
@lifubang @thaJeztah PTAL
libcontainer/container_linux.go
Outdated
// for systemd cgroup, the unit's cgroup path will be auto removed if container's all processes exited
if status == Stopped && !c.cgroupManager.Exists() {
	pids, err := c.cgroupManager.GetAllPids()
	if c.ignoreCgroupError(err) == nil {
Maybe there is a more readable implementation, like this:
- change the return type of ignoreCgroupError to bool;
- if err == nil || c.ignoreCgroupError(err) {
Otherwise the reader needs to look at the body of ignoreCgroupError to find out what happens here.
Perhaps reverse the logic, which seems more natural:
if err := c.ignoreCgroupError(err); err != nil {
	return nil, fmt.Errorf("unable to get all container pids: %w", err)
}
return pids, nil
Oh well, my initial implementation was called isIgnorableCgroupError and returned bool. Then I took the approach used by func ignoreTerminateErrors.
Either way is fine with me.
|
Yes, I think there is a race condition here, because |
f4e94f3 to 463675a
|
Now, here's an interesting question: why |
I think this happened when randomly hitting this issue. |
Let me rephrase my question. Currently, on a stopped container,
[root@kir-rhat runc-tst]# ./runc list
ID PID STATUS BUNDLE CREATED OWNER
123 0 stopped /home/kir/go/src/github.com/opencontainers/runc-tst 2023-04-06T21:51:20.521334579Z root
[root@kir-rhat runc-tst]# ./runc kill 123; echo $?
ERRO[0000] container not running
1
[root@kir-rhat runc-tst]# ./runc kill -a 123; echo $?
0
My question was -- is this (the fact that adding |
As described in For |
Good question. So, it looks that There are a few ways to look at how
@giuseppe WDYT? (see previous comment) |
|
I've no strong opinion, I've followed the same runc behavior in crun. I am afraid to change the current behavior if anyone depends on it (although the entire mechanism seems like a big race condition).
OK let's agree to merely document the existing behavior of |
Are you planning to open a PR for that, or do we need a tracking ticket? (looks like this PR may need a rebase, as it's marked "outdated") |
For a previous attempt to fix that (and added test cases), see commit 9087f2e.

Alas, it's not always working because of cgroup directory TOCTOU.

To solve this and avoid the race, check the error _after_ the operation. Implement it as a method that ignores the error that should be ignored. Instead of currentStatus(), use faster runType(), since we are not interested in Paused status here.

For Processes(), remove the pre-op check, and only use it after getting an error, making the non-error path more straightforward.

For Signal(), add a second check after getting an error. The first check is left as is because signalAllProcesses might print a warning if the cgroup does not exist, and we'd like to avoid that.

This should fix an occasional failure like this one:

	not ok 84 kill detached busybox
	# (in test file tests/integration/kill.bats, line 27)
	# `[ "$status" -eq 0 ]' failed
	....
	# runc kill test_busybox KILL (status=0):
	# runc kill -a test_busybox 0 (status=1):
	# time="2023-04-04T18:24:27Z" level=error msg="lstat /sys/fs/cgroup/devices/system.slice/runc-test_busybox.scope: no such file or directory"

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
be1682d to fe278b9
Opened #3834
Rebased
1.1 backport: #3877
For the previous attempt to fix that (and added test cases), see commit 9087f2e (PR #2338).
Alas, it's not always working because of cgroup directory TOCTOU.
To solve this and avoid the race, check the error after the operation. Implement it as a method that ignores the error that should be ignored. Instead of currentStatus(), use faster runType(), since we are not interested in Paused status here.
For Processes(), remove the pre-op check, and only use it after getting an error, making the non-error path more straightforward.
For Signal(), add a second check after getting an error. The first check is left as is because signalAllProcesses might print a warning if the cgroup does not exist, and we'd like to avoid that.
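This should fix an occasional failure like this one:

	not ok 84 kill detached busybox
	# (in test file tests/integration/kill.bats, line 27)
	# `[ "$status" -eq 0 ]' failed
	....
	# runc kill test_busybox KILL (status=0):
	# runc kill -a test_busybox 0 (status=1):
	# time="2023-04-04T18:24:27Z" level=error msg="lstat /sys/fs/cgroup/devices/system.slice/runc-test_busybox.scope: no such file or directory"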
Fixes: #3372
Fixes: #3744