
crio: set minimum_mappable_uid/minimum_mappable_gid#2862

Closed
nalind wants to merge 1 commit into openshift:master from nalind:minimum-mappable-id

Conversation

@nalind
Member

@nalind nalind commented Dec 6, 2021

- What I did

Set minimum_mappable_uid/minimum_mappable_gid to 1000000000, the start of the ranges we assign to namespaces in the default configuration, for use with unprivileged builds. Follow-on to #2805, part of https://issues.redhat.com/browse/OCPNODE-683.
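As a sketch of what the resulting configuration looks like, the settings would land in CRI-O's TOML config under the `[crio.runtime]` table (the drop-in path below is illustrative, not part of this PR):

```toml
# Hypothetical drop-in, e.g. /etc/crio/crio.conf.d/01-userns.conf
[crio.runtime]
# Host IDs below this value may not be used in pod user-namespace mappings.
minimum_mappable_uid = 1000000000
minimum_mappable_gid = 1000000000
```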

- How to verify it

Builds should be able to specify "io.kubernetes.cri-o.userns-mode" annotations in their build pods, using "private:..." values that specify mapping host IDs at or above 1000000000 for the build pod's containers.
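An illustrative pod using such an annotation might look like the following; the mapping syntax follows CRI-O's `private:uidmapping=<container-id>:<host-id>:<size>;gidmapping=...` form, and the pod name and image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: build-pod              # hypothetical name
  annotations:
    # Map container ID 0 onwards to host IDs starting at 1000000000,
    # i.e. at or above the new minimum_mappable_uid/gid.
    io.kubernetes.cri-o.userns-mode: "private:uidmapping=0:1000000000:65536;gidmapping=0:1000000000:65536"
spec:
  containers:
  - name: build
    image: example.invalid/builder:latest   # placeholder image
```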

- Description for the changelog
crio: set minimum_mappable_uid and minimum_mappable_gid to 1000000000

Set minimum_mappable_uid/minimum_mappable_gid to 1000000000, the start
of the ranges we assign to namespaces in the default configuration, for
use with unprivileged builds.

Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
@openshift-ci openshift-ci Bot requested review from mtrmac and umohnani8 December 6, 2021 20:43
@frasertweedale
Contributor

@nalind do I understand correctly that these settings apply when specifying the io.kubernetes.cri-o.userns-mode annotation with mode private: - which requires the mappings to be explicitly specified?

What I really need is support for auto: mode - which requires sub-id ranges to be defined for the container user. I do not think it is acceptable to require the workload creator to explicitly choose the host UIDs in the mapping - because there is a risk that different users could specify overlapping host ID ranges, reducing the isolation. The auto mode solves this nicely. Can we also use this PR to add the sub-id mappings?

@nalind
Member Author

nalind commented Dec 7, 2021

The current aim in openshift/openshift-controller-manager#173 is to construct those mappings using the ranges that have been allocated for the OpenShift namespace in which the build is being run. Are there cases outside of builds where io.kubernetes.cri-o.userns-mode=auto would be used?

I have no problem with adding changes to /etc/subuid and /etc/subgid to this PR. Which range should the containers entries be given, then?
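For reference, subordinate-ID files use one `name:start:count` entry per line; the range below is only a placeholder, since which range to grant is exactly the open question here:

```text
# /etc/subuid (and /etc/subgid) format: name:start:count
# The start/count values are placeholders, not a proposal.
containers:1000000000:268435456
```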

@nalind
Member Author

nalind commented Dec 7, 2021

Hmm, /etc/subuid and /etc/subgid on my test node include an entry for core that appears to have been added by running useradd at some point, so depending on the order in which things happen, adding entries here could overwrite that, but I haven't found where that content (or the base of what becomes that content) lives.

@nalind
Member Author

nalind commented Dec 7, 2021

/test e2e-agnostic-upgrade

@frasertweedale
Contributor

@nalind

Are there cases outside of builds where io.kubernetes.cri-o.userns-mode=auto would be used?

Yes. "Legacy" workloads that need to run as uid 0 (in the container's user namespace). Telco has a huge demand for this. My team is working on operationalising IDM (FreeIPA) for OpenShift, and we want to use a systemd-based workload running in a user namespace to avoid (a) a big up-front engineering effort to re-architect FreeIPA as a "cloud native" app and (b) years of increased maintenance and support costs from maintaining the same application under two separate architectures (because the existing architecture will be what we deliver in RHEL, for the foreseeable future at least). And the eventual goal is to offer IDM as a managed service - so user namespace isolation of workloads is essential to support secure multi-tenancy and avoid workloads running as privileged users in the host user namespace.

The current aim in openshift/openshift-controller-manager#173 is to construct those mappings using the ranges that have been allocated for the OpenShift namespace in which the build is being run.

Thanks for clarifying. That's an interesting approach. It is feasible for operators or other agents that create the workload to observe the host UID assignment of the project namespace and configure the user-ns mode to map into that range. But it is a burden that could (should?) be avoided. Furthermore, the default size of the slice is 10000, less than the 65536 required for "full Linux system in a container" workloads (which may need user 65534, "nobody", to be mapped). You can "chop up" the UID allocation to create non-contiguous mapping ranges covering the UIDs the workload actually uses (actually using more than 10000 UIDs is very unlikely), but this creates even more complexity for the controller.
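The arithmetic behind the "nobody" concern can be sketched in plain Python (values taken from the comment above; the function is just an illustration, not part of any project):

```python
def covers(start: int, size: int, uid: int) -> bool:
    """True if a single contiguous mapping of `size` container IDs,
    beginning at container ID `start`, includes `uid`."""
    return start <= uid < start + size

# A default 10000-ID project slice mapped from container ID 0 misses
# uid 65534 ("nobody"); a full 65536-ID mapping covers it.
print(covers(0, 10000, 65534))   # nobody is outside the default slice
print(covers(0, 65536, 65534))   # a 65536-ID mapping includes nobody
```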

In any case, I tested this PR using a pod configuration that explicitly mentions the project namespace UIDs in both the user-ns annotation and explicit RunAsUser in the security context. It does work, and avoids the need for anyuid SCC, which is a huge benefit of this approach. There are rough edges, but it is probably better to discuss in an issue or PR against CRI-O rather than in this PR.

Hmm, the /etc/subuid and /etc/subgid on my test node includes an entry for core that appears to have been added by running useradd at some point, so depending on the order in which things happen, adding them here could overwrite that, but I haven't found where that content (or the base of what becomes that content) lives.

I haven't found it either. As you say, it is likely a side-effect of core user creation. But based on my testing and discussion above, I think it is good to accept this PR as-is, and alter /etc/subuid and /etc/subgid later, if required.

@nalind
Member Author

nalind commented Dec 8, 2021

/retest

@openshift-ci
Contributor

openshift-ci Bot commented Dec 8, 2021

@nalind: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-disruptive | 32ec532 | link | false | /test e2e-aws-disruptive |
| ci/prow/e2e-aws-upgrade-single-node | 32ec532 | link | false | /test e2e-aws-upgrade-single-node |

Full PR test history. Your PR dashboard.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@mrunalp
Member

mrunalp commented Dec 10, 2021

/lgtm
/approve

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Dec 10, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Dec 10, 2021

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mrunalp, nalind
To complete the pull request process, please assign yuqi-zhang after the PR has been reviewed.
You can assign the PR to them by writing /assign @yuqi-zhang in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@frasertweedale
Contributor

Further to the discussion in cri-o/cri-o#5493, I'm not sure userns-mode private, which this PR enables, is the best approach. The fact that it allows mapping UIDs from the uid-range of a different project namespace seems problematic.

@nalind
Member Author

nalind commented Dec 13, 2021

Agreed. Closing this one to ensure it doesn't get merged in the meantime.

@nalind nalind closed this Dec 13, 2021