Linux 4.14 added "Namespaced File Capabilities"
The patch is described here:
torvalds/linux@8db6c34
http://man7.org/linux/man-pages/man7/capabilities.7.html
It seems to me that this feature could be used to support rootless containers over NFS, GPFS, or similar file systems, as the file is still stored under the user's own UID, while the namespaced UID is recorded in an extended attribute.
If a task writes a v3 security.capability, then it can provide a uid for
the xattr so long as the uid is valid in its own user namespace, and it
is privileged with CAP_SETFCAP over its namespace. The kernel will
translate that rootid to an absolute uid, and write that to disk. After
this, a task in the writer's namespace will not be able to use those
capabilities (unless rootid was 0), but a task in a namespace where the
given uid is root will.
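To make the quoted description more concrete, here is a minimal sketch of how a v3 `security.capability` value is laid out, based on my reading of `struct vfs_ns_cap_data` in the kernel's `linux/capability.h` (a magic/flags word, two permitted/inheritable pairs, then the rootid, all little-endian 32-bit). The helper names `encode_v3_cap` and `decode_v3_cap` are hypothetical, and the rootid of 100000 is just an illustrative value.

```python
import struct

# Constants as defined in the uapi header linux/capability.h.
VFS_CAP_REVISION_MASK = 0xFF000000
VFS_CAP_REVISION_3 = 0x03000000
VFS_CAP_FLAGS_EFFECTIVE = 0x000001

# magic_etc, data[0].permitted, data[0].inheritable,
# data[1].permitted, data[1].inheritable, rootid
_V3_LAYOUT = struct.Struct("<6I")


def encode_v3_cap(permitted: int, inheritable: int, effective: bool, rootid: int) -> bytes:
    """Build a raw v3 xattr value (hypothetical helper, for illustration only)."""
    magic_etc = VFS_CAP_REVISION_3 | (VFS_CAP_FLAGS_EFFECTIVE if effective else 0)
    return _V3_LAYOUT.pack(
        magic_etc,
        permitted & 0xFFFFFFFF, inheritable & 0xFFFFFFFF,  # low 32 bits
        permitted >> 32, inheritable >> 32,                # high 32 bits
        rootid,
    )


def decode_v3_cap(raw: bytes) -> dict:
    """Split a raw v3 xattr value into its fields (hypothetical helper)."""
    if len(raw) != _V3_LAYOUT.size:
        raise ValueError("not a v3 security.capability value (wrong size)")
    magic_etc, p_lo, i_lo, p_hi, i_hi, rootid = _V3_LAYOUT.unpack(raw)
    if (magic_etc & VFS_CAP_REVISION_MASK) != VFS_CAP_REVISION_3:
        raise ValueError("not a v3 security.capability value (wrong revision)")
    return {
        "effective": bool(magic_etc & VFS_CAP_FLAGS_EFFECTIVE),
        "permitted": (p_hi << 32) | p_lo,
        "inheritable": (i_hi << 32) | i_lo,
        "rootid": rootid,
    }


# Illustrative value: CAP_NET_BIND_SERVICE (bit 10), rootid 100000 (assumed).
sample = encode_v3_cap(permitted=1 << 10, inheritable=0, effective=True, rootid=100000)
fields = decode_v3_cap(sample)
```

On a real system the raw bytes could presumably be read with `os.getxattr(path, "security.capability")`, though the kernel may translate or downgrade the value depending on the reader's user namespace, so what a task sees is not necessarily the on-disk form.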
So, during a write, the uid supplied for the xattr is the UID as seen inside the user namespace (the one GPFS/NFS doesn't like), and the translated one is that of the user executing the rootless container.
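As a rough illustration of that translation, the sketch below maps a UID inside a user namespace to the host UID using the `/proc/<pid>/uid_map` format (inside ID, outside ID, range length). The function names and the example mapping (host user 1000 with a subordinate range starting at 100000) are assumptions for illustration, not taken from any particular setup.

```python
def parse_uid_map(text: str) -> list:
    """Parse uid_map content into (inside_start, outside_start, length) tuples."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue
        inside, outside, length = (int(field) for field in line.split())
        entries.append((inside, outside, length))
    return entries


def to_host_uid(ns_uid: int, entries: list):
    """Translate a namespaced UID to the host UID, or None if it is unmapped."""
    for inside, outside, length in entries:
        if inside <= ns_uid < inside + length:
            return outside + (ns_uid - inside)
    return None


# Hypothetical map for a rootless container run by host UID 1000.
example_map = parse_uid_map("0 1000 1\n1 100000 65536\n")

host_uid_of_ns_root = to_host_uid(0, example_map)
host_uid_of_ns_user = to_host_uid(1000, example_map)
unmapped = to_host_uid(70000, example_map)
```

Under this assumed mapping, root inside the namespace corresponds to the invoking user's own UID on the host, which is the value a file system such as NFS or GPFS would already accept.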
It would appear that this could resolve the more general case from containers/podman#4551, although I note that specific issue may be resolved soon thanks to #529.