Additional user namespace checks#1649
Conversation
Signed-off-by: Lorenzo Fontana <lo@linux.com>
8545cf8 to
c05cc44
Compare
| // Verify that the current process is not inside a chroot, | ||
| // the kernel does not allow creating a new user namespace in a chroot. | ||
| // for information, see: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/user_namespace.c | ||
| if root.Sys().(*syscall.Stat_t).Ino != 2 { |
There was a problem hiding this comment.
From memory, this check will break if you're trying to nest containers or have any other form of weird rootfs mount structure set up.
Also, I just checked and on my system and / appears to have an inode number of 256 with btrfs:
% stat /
File: /
Size: 218 Blocks: 0 IO Block: 4096 directory
Device: 2dh/45d Inode: 256 Links: 1
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2017-11-12 07:05:28.393935000 +1100
Modify: 2017-04-27 22:57:13.841679200 +1000
Change: 2017-04-27 22:57:13.841679200 +1000
Birth: -
And I believe that XFS has a root inode of 128.
There was a problem hiding this comment.
Oh, also this trick won't be able to detect if you have mounted a filesystem onto /jail and then done chroot /jail -- because then the inode number will still be the same.
There are some ways to detect chroot "sort of", using a combination of /proc/$pid/root and /proc/1/root, or /proc/1/mountinfo. But unfortunately they're all very iffy and don't handle all of the corner cases. It would be ideal if the kernel exposed current_chrooted to userspace but I imagine that's not likely to happen. Another possibility is to propose that we should be able to introspect the root of /proc/$pid/ns/mnt -- something that Eric is more likely to consider (but passing paths around is generally not a good idea especially since the kernel likes to changes paths to / if it feels that something fishy is going on).
There was a problem hiding this comment.
@cyphar Tried the pivot_root approach however doesn't seem to be very reliable. Thanks for taking the time to look at this
| // the kernel is compiled without user namespaces by default. | ||
| // See: kernel.unprivileged_userns_clone | ||
| unc, err := ioutil.ReadFile("/proc/sys/kernel/unprivileged_userns_clone") | ||
| if err == nil && unc[0] == '0' && os.Getuid() != 0 { |
There was a problem hiding this comment.
I'm not a huge fan of distro-specific code (especially since Red Hat have a similar patch but it has different semantics and they also have some other patches as well). However, this is also not correct if the system has /usr/bin/newuidmap and /usr/bin/newgidmap set up -- see #1529.
|
@cyphar @crosbymichael there's any new review update regarding this PR? |
|
There are a lot of edge cases with this, can we not do validation and just let the kernel fail and return an error for us? |
|
Is this closable now? |
|
Yes let's close this, no need for that check. Thanks everyone for your time. |
The kernel does not allow creating user namespaces inside a chroot
see: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/user_namespace.c#n95
There's a debian-specific flag to enable unprivileged user
namespaces. In case of a rootless container warn the debian user about this.
Signed-off-by: Lorenzo Fontana lo@linux.com