mount: add enhanced mount functionality to support run container in userns with host network #3613
shidao1 wants to merge 5 commits into opencontainers:main
Extend bootstrap message to pass mount fds for open_tree()/move_mount(). Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Enhance mountToRootfs() to support MoveMount(), so it could be used to support cross user namespace mounting. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Introduce struct namespace_info_t to split join_namespaces() in stages, so it could be reused later. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
Prepare source mount fds for move_mount() to support cross user namespace mounting. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com>
When a user namespace is enabled for a pod/container, it may fail to mount /proc, /sys and /dev/mqueue under certain conditions. This may be solved by enabling cross user namespace mounting. Signed-off-by: Jiang Liu <gerry@linux.alibaba.com> Signed-off-by: shidao.ytt <shidao.ytt.kernel@linux.alibaba.com>
return err
}
// Selinux kernels do not support labeling of /proc or /sys
if m.IsMove() && *c.fd >= 0 {
Nit: unlike in C, you don't have to dereference a pointer to a struct when accessing its members.
IOW s/*c.fd/c.fd/
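To illustrate the nit, here is a minimal sketch of how Go treats selectors on a pointer to a struct. The `cfg` type and its field types are hypothetical stand-ins; the actual type of `fd` in `mountConfig` is not shown in this excerpt. Note that `*c.fd` parses as `*(c.fd)`, so it only compiles when the field itself is a pointer.

```go
package main

import "fmt"

// cfg is a hypothetical stand-in for mountConfig; both field types are
// assumptions made for illustration only.
type cfg struct {
	fd    int
	fdPtr *int
}

// readAll reads the same descriptor value in three ways.
func readAll(c *cfg) (int, int, int) {
	implicit := c.fd     // the selector dereferences c automatically
	explicit := (*c).fd  // the same access, spelled out as in C
	viaField := *c.fdPtr // parsed as *(c.fdPtr): dereferences the field, not c
	return implicit, explicit, viaField
}

func main() {
	n := 7
	a, b, v := readAll(&cfg{fd: n, fdPtr: &n})
	fmt.Println(a, b, v)
}
```

If the field is a plain `int`, `c.fd` is all that is needed; if it is a `*int`, the leading `*` applies to the field rather than to `c`.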
@@ -400,12 +400,25 @@ func mountToRootfs(m *configs.Mount, c *mountConfig) error {
return err
}
// Selinux kernels do not support labeling of /proc or /sys
This commit makes the comment above ^^^ misplaced. It used to explain why label.SetFileLabel is not called here.
if _, exist := nsList[configs.NEWPID]; exist {
}
Is some code missing here?
We need test cases for this.
Also, I'm afraid you'll have to redo this once #3599 is merged, which refactors some C code in nsenter.
This PR has not been active for a long time. May I take over the remaining work to get it ready to merge? cc @AkihiroSuda
@Zheaoli You'll need to base it on top of #3985, which reworks all of the mountfd logic. I'm not sure how easy it'll be to use the new Go-based setup to implement this though. I suspect you can do it by creating a locked goroutine that joins the container's non-userns namespaces, but the slight issue is that we cannot create a procfs mount that uses the containers pidns because procfs uses the active pidns, not the Also, you don't want to use
In public cloud service products, the serverless container runtime environment has some special requirements.
The goal is to run the container in a new user namespace while using the host network.
The main process uses the open_tree syscall to obtain fds for the sys, proc and mqueue mount points before runc switches to the new user namespace,
and then uses move_mount to mount sys, proc and mqueue after runc switches to the new user namespace.
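The two-stage flow described above can be sketched roughly as follows. This is not the code from this PR: the helper names `openTree` and `moveMount` are hypothetical, the syscall numbers are assumed to be the ones used on amd64 and arm64, and the flag values are copied from the Linux UAPI headers. It needs Linux 5.2 or later, and `OPEN_TREE_CLONE` needs CAP_SYS_ADMIN, so expect an error when run unprivileged.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
	"unsafe"
)

// Assumed values: syscall numbers for amd64/arm64, flags from the UAPI headers.
const (
	sysOpenTree         = 428
	sysMoveMount        = 429
	atFdCwd             = -100
	openTreeClone       = 0x1
	moveMountFEmptyPath = 0x4
)

// openTree (hypothetical helper) returns an fd that refers to the mount at path.
func openTree(dirfd int, path string, flags uint) (int, error) {
	p, err := syscall.BytePtrFromString(path)
	if err != nil {
		return -1, err
	}
	fd, _, errno := syscall.Syscall(sysOpenTree, uintptr(dirfd),
		uintptr(unsafe.Pointer(p)), uintptr(flags))
	if errno != 0 {
		return -1, errno
	}
	return int(fd), nil
}

// moveMount (hypothetical helper) attaches the mount held by fromFd at target.
func moveMount(fromFd int, target string) error {
	empty, err := syscall.BytePtrFromString("")
	if err != nil {
		return err
	}
	to, err := syscall.BytePtrFromString(target)
	if err != nil {
		return err
	}
	cwd := atFdCwd
	_, _, errno := syscall.Syscall6(sysMoveMount, uintptr(fromFd),
		uintptr(unsafe.Pointer(empty)), uintptr(cwd),
		uintptr(unsafe.Pointer(to)), moveMountFEmptyPath, 0)
	if errno != 0 {
		return errno
	}
	return nil
}

func main() {
	// Stage 1: before switching user namespaces, grab a detached clone of /proc.
	fd, err := openTree(atFdCwd, "/proc", openTreeClone|syscall.O_CLOEXEC)
	if err != nil {
		fmt.Fprintln(os.Stderr, "open_tree:", err)
		return
	}
	defer syscall.Close(fd)
	// Stage 2: after switching, attach it at a target given on the command line.
	if len(os.Args) > 1 {
		if err := moveMount(fd, os.Args[1]); err != nil {
			fmt.Fprintln(os.Stderr, "move_mount:", err)
		}
	}
}
```

The sketch omits the namespace switch between the two stages; in runc that happens in a separate process, which is why the fds have to be passed along in the bootstrap message.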