[1.1] Fix setting RLIMIT_NOFILE#4239
Conversation
8e2cc9a to
26c2019
Compare
|
CI failures in cross-i386 and cirrus are expected and are being fixed in #4231. |
This comment was marked as outdated.
This comment was marked as outdated.
26c2019 to
c08c687
Compare
Signed-off-by: lfbzhm <lifubang@acmcoder.com> (cherry picked from commit a596a05) Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since Go 1.21 (https://go-review.googlesource.com/c/go/+/476097), Go runtime saves the original value of RLIMIT_NOFILE upon startup and uses the saved value in StartProcess, unless RLIMIT_NOFILE is not explicitly changed by calling syscall.Setrlimit. Now, runc uses unix.Prlimit (rather than syscall.Setrlimit) to set RLIMIT_NOFILE, thus Golang runtime is not aware that it is changed, result in occasional reset of RLIMIT_NOFILE, reported in opencontainers#4195. Bumping x/sys/unix to v0.7.0 fixes this (via https://go-review.googlesource.com/c/sys/+/476695). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
c08c687 to
ec70039
Compare
Cherry-picked those CI commits from there to have green CI here. |
|
@jrivera-px @ls-ggg PTAL |
|
I can't reproduce this on my Fedora 39 machine. Can repro on Ubuntu 20.04 (40 times out of 1 million attempts). If you want to try binary runc bult from this PR, use a zip file from https://github.com/opencontainers/runc/actions/runs/8528034513/artifacts/1378752088 |
...which btw took almost 4 hours. Now retrying with runc built from this PR. |
|
OK I tested this in the same env I used to repro #4195. Not able to reproduce in 100000 execs. |
|
I'd love to add a test case but it takes a few minutes to repro so it doesn't make sense. Here's how the test would look like: [kir@kir-tp1 runc]$ cat tests/integration/ulimit.bats
#!/usr/bin/env bats
load helpers
function setup() {
setup_busybox
}
function teardown() {
teardown_bundle
}
@test "runc exec with RLIMIT_NOFILE" {
update_config ' .process.rlimits = [{"type": "RLIMIT_NOFILE", "hard": 10000, "soft": 10000}]'
runc run -d --console-socket "$CONSOLE_SOCKET" nofile
[ "$status" -eq 0 ]
for ((i = 0; i < 1000; i++)); do
runc exec nofile sh -c 'ulimit -n'
echo "[$i] $output"
[[ "${output}" == "10000" ]]
done
} |
|
Alas i can still repro this with the supposedly fixed version. Using the same setup as in #4195 except the container is run with i=0; while [ $i -lt 100000 ]; do
LIM=$(sudo /home/kir/git/runc/runc exec bionic sh -c "ulimit -n");
if [ $LIM -ne 65536 ]; then
echo "WHOOPSIE (iter $i) got numfile $LIM";
break;
fi;
((i%100==0)) && printf "[%s] %10d\n" "$(date)" "$i";
let i++;
doneIt takes 5-10 minutes to reproduce, here are a few examples: |
|
This is not working :-\ |
and our main branch is buggy, too |
Since Go 1.21 (https://go-review.googlesource.com/c/go/+/476097), Go runtime saves the original value of RLIMIT_NOFILE upon startup and uses the saved value in StartProcess, unless RLIMIT_NOFILE is not explicitly changed by calling syscall.Setrlimit.
Now, runc uses unix.Prlimit (rather than syscall.Setrlimit) to set RLIMIT_NOFILE, thus Golang runtime is not aware that it is changed, result in occasional reset of RLIMIT_NOFILE, reported in #4195.
Bumping x/sys/unix to v0.7.0 fixes this (via
https://go-review.googlesource.com/c/sys/+/476695).
Fixes: #4195.
PS this is similar to #3856 in main branch, except now we know we're fixing a real bug. Thus, it is only needed in release-1.1 branch.