Remove global device mknod cgroup rules by dasizeman · Pull Request #1623 · opencontainers/runc

dasizeman · 2017-10-28T00:23:32Z

These global rules seem like they aren't needed and they cause conflicts with parent cgroup policies.

cyphar · 2017-10-28T03:58:36Z

You need to add a Signed-off-by: line to your commit(s) which indicates that you attest the Developer Certificate of Origin a statement about your contributions that you must read before signing (don't worry, it's quite short and easy-to-read). You can add it to your commits with git commit --amend -s, and then doing a git push --force.

Signed-off-by: Dave Sizer <dave@sizetron.net>

dasizeman · 2017-10-30T20:23:55Z

I have signed off on the commit, thanks.

cyphar · 2017-10-31T10:06:28Z

/cc @crosbymichael -- Do you remember why this was around originally?

crosbymichael · 2017-10-31T13:29:19Z

One of the reasons is that many things mknod devices and install fails. Things like java mknod fuse, etc. Mknod is safe across the board as long as rw is blocked via cgroups.

Also mknod is save because there is nothing stopping an image from bringing in device nodes so we have a wildcard to allow mknod and block access

dasizeman · 2017-10-31T20:46:57Z

I see, so it seems like really the issue I was originally having in #1618 is because SLURM shouldn't be blocking mknod on the devices in the parent policy.

@crosbymichael Do you have any more details/references on how you are sure that global mknod is safe, and maybe some examples of things that fail without it? I would like to understand this better and also I am going to need a compelling argument if I want a change in SLURM to be accepted.

crosbymichael · 2017-10-31T20:49:59Z

@dasizeman

I think this will give you all the history and information that you need. Everything from kernel devs to users involved ;)

moby/moby#2979

rhatdan · 2017-10-31T21:01:11Z

Personally I would prefer that this was off. There are hardly any containers that would ever need mknod, and I believe in relying on as many security mechanisms as possible to protect against container break out. Perhaps having separate capabilities for build and run would be helpful. But when far less then 1 % of all containers needs mknod, we should not enable it by default. But that is water under the bridge at this point for docker/moby I am afraid.

cyphar · 2017-11-01T03:29:07Z

@crosbymichael I agree with you from the Docker side of things, but it feels as though these default should be added to the specs that Docker generates rather than having them in runc -- especially since if you wanted to make different mknod policies you'd need to mess around with devices.deny a fair bit in order to make runc act nicely. Not to mention that tools (like umoci) that generate root filesystems for runtime bundles might do filtering of possible inode major+minor numbers.

And while a process inside a container cannot open(...) the device inode, I have found some quite worrying kernel bugs (which I've reported and am discussing upstream) that would make it theoretically possible to escalate the ability to access an inode into being able to trick another (more privileged) process into opening it.

dasizeman · 2017-11-01T23:12:55Z

@crosbymichael Thanks for pointing me toward that PR, it did a great job of catching me up on the history.

I actually tend to agree with @cyphar here. Based on the original need for these permissions in Docker, it seems like they only need to be present at image build time. The correct course of action in my opinion would be to remove them in runc, and have Docker only implicitly enable them during image builds, not when running containers (if they were needed at runtime, they could be set like any other special permission).

If everyone agrees, I would be happy to help with getting the needed changes merged on the moby side of things.

crosbymichael · 2017-11-02T14:27:59Z

I'm fine with it as long as we update projects that rely on runc so that no regressions are introduced. This change is not about security, its about where the responsibility lies. Also the moby change should not apply to just build time, it should apply to all, I'm not comfortable saying that people are not ever going to drop into a bash session and run an apt-get install and it breaks after this change.

dasizeman · 2017-11-02T18:20:26Z

Good point about apt-get installs from the shell, I forgot about that. I think you're right that it's a responsibility issue, and I think that the responsibility of allowing mknod to enable certain package installs should lie with moby, not runc. Perhaps the best move forward here is to get this merged in runc and update any downstream projects.

Unfortunately, on the moby side of things, this is a more complicated issue. These mknod permissions are required for certain use cases, but at the same time they are really an inconsistency as far as container security goes: they allow certain package installs, while simultaneously breaking --cgroup-parent.

I will open an issue in moby to start a discussion on how to integrate this change if it were merged.

@crosbymichael What other projects are there downstream? I believe moby actually uses runc indirectly through containerd correct? Also, what would the process be for releasing a change like this and making sure we make the needed changes in downstream projects? If we can't figure out a good way to "solve" the inconsistency on the moby side anyway, then I would imagine it may not be worth going through the process of integrating this right now, since there would be no motivating use case (my original use case with Docker would still be broken if no change was made on that side).

cyphar · 2017-11-03T03:32:38Z

@dasizeman

Unfortunately, on the moby side of things, this is a more complicated issue. These mknod permissions are required for certain use cases, but at the same time they are really an inconsistency as far as container security goes: they allow certain package installs, while simultaneously breaking --cgroup-parent.

[Insert common complaint that --cgroup-parent is usually misused.]

This is actually situation that can be detected somewhat trivially by whoever is making decisions on what devices should be allowed. devices.list gives you the effective set of devices that are allowed -- and userspace (whether that be runc or Docker) should be able to figure out whether a *:* m is going to be permitted in a sub-cgroup (of course you'd need to replicate the kernel parsing code and other set-related logic for devices). We could add some functions in libcontainer than Docker could then use, since we already have a lot of code in runc that does these sorts of things.

There are some slight complications for blacklist devices cgroups -- but in that situation there's nothing we can do in userspace other than just try the bad configuration and then adjust afterwards. And ultimately, if you're using a blacklist you're already doing things wrong.

What other projects are there downstream? I believe moby actually uses runc indirectly through containerd correct?

That is correct, but there are other downstreams. cri-o uses runc. Cloud Foundry has also switched to using runc. There's also tools like orca-build which use runc (and there are certainly many others).

cyphar · 2018-11-14T11:48:11Z

I'm removing this from 1.0 since we don't have time to land other external-to-the-project commits.

kolyshkin · 2025-12-15T21:19:16Z

I'm inclined to close this one, as every time we remove some permissions we end up realizing some upper-level tools depend on them and/or have no way to re-add those when needed, thus bringing in the unnecessary turmoil. Last time it happened then we tried to remove tun/tap device from the default set (see #3468, #4555).

Feel free to reopen or file a new PR/issue if you disagree.

Remove global device mknod cgroup rules

f8b2d32

Signed-off-by: Dave Sizer <dave@sizetron.net>

dasizeman force-pushed the global_cgroup_mknod_fix branch from eb6aff5 to f8b2d32 Compare October 30, 2017 20:22

mrunalp added this to the 1.0.0 milestone Nov 1, 2017

cyphar removed this from the 1.0.0 milestone Nov 14, 2018

kolyshkin closed this Dec 15, 2025

kolyshkin mentioned this pull request Dec 17, 2025

mknod permission always granted on block and char devs no matter what the allow string is in specs #5073

Closed

cyphar mentioned this pull request Dec 18, 2025

Default linux cgroup devices.allow rules too general #1618

Closed

Conversation

dasizeman commented Oct 28, 2017

Uh oh!

cyphar commented Oct 28, 2017

Uh oh!

dasizeman commented Oct 30, 2017

Uh oh!

cyphar commented Oct 31, 2017

Uh oh!

crosbymichael commented Oct 31, 2017

Uh oh!

dasizeman commented Oct 31, 2017

Uh oh!

crosbymichael commented Oct 31, 2017

Uh oh!

rhatdan commented Oct 31, 2017

Uh oh!

cyphar commented Nov 1, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

dasizeman commented Nov 1, 2017

Uh oh!

crosbymichael commented Nov 2, 2017

Uh oh!

dasizeman commented Nov 2, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cyphar commented Nov 3, 2017

Uh oh!

cyphar commented Nov 14, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

kolyshkin commented Dec 15, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

cyphar commented Nov 1, 2017 •

edited

Loading

dasizeman commented Nov 2, 2017 •

edited

Loading

cyphar commented Nov 14, 2018 •

edited

Loading