Skip to content

cgroupv2: don't enable threaded mode by default#2390

Merged
kolyshkin merged 1 commit into
opencontainers:masterfrom
lifubang:threadedordomain
May 13, 2020
Merged

cgroupv2: don't enable threaded mode by default#2390
kolyshkin merged 1 commit into
opencontainers:masterfrom
lifubang:threadedordomain

Conversation

@lifubang
Copy link
Copy Markdown
Member

@lifubang lifubang commented May 8, 2020

After this commit: 60c647e
Runc enable threaded mode in cgroup v2 by default.
If the cgroupPath is set to a absolute path like /docker/********, the memory subsystem can't be used by this mode.
So, I think we should use domain mode by default.

Signed-off-by: lifubang lifubang@acmcoder.com

@lifubang lifubang changed the title cgroupv2: only enable threaded mode when rootless cgroupv2: enable threaded mode only in rootless May 8, 2020
@lifubang
Copy link
Copy Markdown
Member Author

lifubang commented May 8, 2020

@AkihiroSuda PTAL

@cyphar
Copy link
Copy Markdown
Member

cyphar commented May 8, 2020

I'm not sure I understand why we're putting cgroupv2 into threaded mode -- @AkihiroSuda is this to work around some odd permission issue (and why does it fix an EPERM)? The description in 60c647e and #2340 is pretty sparse.

Comment thread libcontainer/cgroups/fs2/create.go Outdated
// Otherwise ENOTSUP may happen.
cgType := filepath.Join(current, "cgroup.type")
_ = ioutil.WriteFile(cgType, []byte("threaded"), 0644)
if rootless {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cgType := filepath.Join(current, "cgroup.type")
cgTypeB, _ := ioutil.ReadFile(cgType)
if strings.TrimSpace(string(cgTypeB)) == "domain invalid" {
  _ = ioutil.WriteFile(cgType, []byte("threaded"), 0644)
}

@AkihiroSuda
Copy link
Copy Markdown
Member

@cyphar
Writing threaded to cgroup.type is required when cgroup.type becomes domain invalid, mostly in nested environment

$ sudo podman run -it --rm --privileged --runtime=crun alpine
/ # cd /sys/fs/cgroup/
/sys/fs/cgroup # cat cgroup.controllers 
cpuset cpu io memory pids
/sys/fs/cgroup # cat cgroup.subtree_control 
/sys/fs/cgroup # echo +cpu > cgroup.subtree_control
/sys/fs/cgroup # mkdir foo
/sys/fs/cgroup # cat foo/cgroup.type
domain invalid

@cyphar
Copy link
Copy Markdown
Member

cyphar commented May 9, 2020

Okay, but "domain invalid" means that the cgroup is in an invalid state (meaning that one of the cgroup rules has been violated -- most likely the no-internal-processes rule). Putting a cgroup into threaded mode doesn't fix that -- it switches it into an alternative mode which only allows controllers which are thread-aware to be enabled (such as cpu). This is why we can't enable the memory controller -- it isn't thread-aware.

IMHO, a more complete solution would be to figure out how to deal with the parent cgroup having child processes (which is a bit dodgy if we're going to move other programs on the system between cgroups) or to simply give an error if we hit domain invalid and explain what the issue is.

@lifubang
Copy link
Copy Markdown
Member Author

lifubang commented May 9, 2020

or to simply give an error if we hit domain invalid and explain what the issue is.

Personally, I prefer this one. @AkihiroSuda WDYT? Because in man7 cgroups:

2. "Internal" processes are not permitted.  With the exception of the
          root cgroup, processes may reside only in leaf nodes (cgroups that
          do not themselves contain child cgroups).  The details are
          somewhat more subtle than this, and are described below.
Cgroups v2 "no internal processes" rule
       Cgroups v2 enforces a so-called "no internal processes" rule.
       Roughly speaking, this rule means that, with the exception of the
       root cgroup, processes may reside only in leaf nodes (cgroups that do
       not themselves contain child cgroups).  This avoids the need to
       decide how to partition resources between processes which are members
       of cgroup A and processes in child cgroups of A.

       For instance, if cgroup /cg1/cg2 exists, then a process may reside in
       /cg1/cg2, but not in /cg1.  This is to avoid an ambiguity in cgroups
       v1 with respect to the delegation of resources between processes in
       /cg1 and its child cgroups.  The recommended approach in cgroups v2
       is to create a subdirectory called leaf for any nonleaf cgroup which
       should contain processes, but no child cgroups.  Thus, processes
       which previously would have gone into /cg1 would now go into
       /cg1/leaf.  This has the advantage of making explicit the relation‐
       ship between processes in /cg1/leaf and /cg1's other children.

       The "no internal processes" rule is in fact more subtle than stated
       above.  More precisely, the rule is that a (nonroot) cgroup can't
       both (1) have member processes, and (2) distribute resources into
       child cgroups—that is, have a nonempty cgroup.subtree_control file.
       Thus, it is possible for a cgroup to have both member processes and
       child cgroups, but before controllers can be enabled for that cgroup,
       the member processes must be moved out of the cgroup (e.g., perhaps
       into the child cgroups).

       With the Linux 4.14 addition of "thread mode" (described below), the
       "no internal processes" rule has been relaxed in some cases.

@AkihiroSuda
Copy link
Copy Markdown
Member

SGTM, but when no domain controller is enabled, we can write "threaded" without retuning error

@lifubang lifubang force-pushed the threadedordomain branch 2 times, most recently from fd7d8a4 to eea6058 Compare May 10, 2020 03:29
@lifubang lifubang changed the title cgroupv2: enable threaded mode only in rootless cgroupv2: don't enable threaded mode by default May 10, 2020
@lifubang lifubang force-pushed the threadedordomain branch 2 times, most recently from 106fdd8 to 7ca4d5a Compare May 10, 2020 05:32
Comment thread libcontainer/cgroups/fs2/create.go Outdated
if strings.TrimSpace(string(cgType)) == "domain invalid" {
cgTypeParentFile := filepath.Join(current, "../cgroup.type")
cgTypeParent, _ := ioutil.ReadFile(cgTypeParentFile)
if bytes.HasPrefix(cgTypeParent, []byte("domain")) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC we don't need to check parent, we just need to check whether the current config contains domain controller

@lifubang lifubang force-pushed the threadedordomain branch 2 times, most recently from 1e97975 to 757bb65 Compare May 10, 2020 15:52
@lifubang
Copy link
Copy Markdown
Member Author

lifubang commented May 10, 2020

echo threaded > "$CGROUP_MOUNT/$CGROUP_PATH/cgroup.type"

I think rootless doesn't need threaded mode in default?
I don't know whether my opinion is right or not. @AkihiroSuda

@AkihiroSuda
Copy link
Copy Markdown
Member

Let's check invalid cgroup.type and set threaded conditionally

@lifubang lifubang force-pushed the threadedordomain branch from 757bb65 to e0c9737 Compare May 10, 2020 23:14
Comment thread tests/rootless.sh Outdated
if grep -qw invalid "$CGROUP_MOUNT/$CGROUP_PATH/cgroup.type"; then
echo threaded > "$CGROUP_MOUNT/$CGROUP_PATH/cgroup.type"
fi
# Make sure cgroup.type doesn't contain "invalid". Otherwise write ops will fail with ENOTSUP.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can drop L106-L110 now

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lifubang ^^^

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been removed before LGTM.
I think we should keep these comments to let other people know why we need to write threaded to cgroups.type.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the echo threaded > still needed? Seems a bit odd to run the entire test suite under threaded.

@lifubang lifubang force-pushed the threadedordomain branch from e0c9737 to 2aefa92 Compare May 10, 2020 23:31
@AkihiroSuda
Copy link
Copy Markdown
Member

AkihiroSuda commented May 10, 2020

LGTM

Approved with PullApprove

Comment thread libcontainer/cgroups/fs2/create.go Outdated
Comment thread libcontainer/cgroups/fs2/create.go Outdated
Copy link
Copy Markdown
Contributor

@kolyshkin kolyshkin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few minor fixes, otherwise good

@lifubang lifubang force-pushed the threadedordomain branch from 2aefa92 to d6a76d9 Compare May 11, 2020 04:30
@AkihiroSuda
Copy link
Copy Markdown
Member

AkihiroSuda commented May 11, 2020

LGTM

Approved with PullApprove

@AkihiroSuda
Copy link
Copy Markdown
Member

@kolyshkin LGTY?

Comment thread libcontainer/cgroups/fs2/create.go Outdated
Comment thread libcontainer/cgroups/fs2/create.go Outdated
Comment thread tests/rootless.sh Outdated
Because in threaded mode, we can't enable the memory controller -- it isn't thread-aware.

Signed-off-by: lifubang <lifubang@acmcoder.com>
@lifubang lifubang force-pushed the threadedordomain branch from d6a76d9 to fe0669b Compare May 13, 2020 08:27
@cyphar
Copy link
Copy Markdown
Member

cyphar commented May 13, 2020

LGTM.

Approved with PullApprove

@cyphar cyphar dismissed kolyshkin’s stale review May 13, 2020 09:09

issues fixed

@giuseppe
Copy link
Copy Markdown
Member

@AkihiroSuda @kolyshkin PTAL

@kolyshkin
Copy link
Copy Markdown
Contributor

kolyshkin commented May 13, 2020

LGTM

Approved with PullApprove

@kolyshkin kolyshkin merged commit 85d4264 into opencontainers:master May 13, 2020
@lifubang lifubang deleted the threadedordomain branch May 28, 2020 01:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants