Update MetalLB how to with instructions for k8s 1.25+ #1291

Merged
openshift-merge-robot merged 2 commits into openshift:main from kevchu3:metallb-docs-update on Jan 26, 2023

@kevchu3 kevchu3 commented Jan 24, 2023

Which issue(s) this PR addresses: #1290

Closes #1290

@openshift-ci openshift-ci Bot requested review from dhellmann and mangelajo January 24, 2023 18:52

ggiguash commented Jan 25, 2023

Thank you, @kevchu3.
If we are fixing this document, can we also look further at the nginx installation and test? It should have the same problem, shouldn't it?

ggiguash commented

/assign @ggiguash

kevchu3 commented Jan 25, 2023

Hi @ggiguash,

Here are my thoughts on applying this to the nginx example application. As a best practice, end-user OpenShift (and consequently MicroShift) applications should not run as privileged pods (i.e. as root). The recommendations for the metallb-system load balancers are different because they are part of the infrastructure and may require root. I looked into the docker.io/library/nginx:1.18 image that is being used, and it is indeed intended to run as root, which may be fine on other k8s distros but is not a best practice for MicroShift. In fact, a quick traversal of the image showed that nearly all of the filesystem, including the configuration files nginx needs, is owned by the root user and group. That being said, the proper approach would likely be to select another base image that adheres to the OpenShift standard of running non-privileged containers. However, I couldn't immediately find a consumable nginx image that didn't require us to build from source.

The other item to call out for our nginx application is that its manifests do not apply the deprecated and now removed PodSecurityPolicy. Thus, the issues affecting this nginx application when it is deployed on OpenShift are entirely different and existed well before k8s 1.25 removed PodSecurityPolicy.

Anyway, I'll walk through a workaround hack that resolves the issue for testing purposes. Here are my results:

First, I ran the original instructions:

NAMESPACE=nginx-lb-test
oc create ns $NAMESPACE
oc apply -n $NAMESPACE -f https://raw.githubusercontent.com/openshift/microshift/main/docs/config/nginx-IP-header.yaml

This runs into the issue you've observed, where the pod can't modify files on its filesystem. Note that whoami confirms the pod runs as the high UID assigned in MicroShift.

$ oc get pods
NAME                     READY   STATUS             RESTARTS      AGE
nginx-65fd5cf854-cnvhw   0/1     CrashLoopBackOff   3 (52s ago)   90s
nginx-65fd5cf854-gnjz2   0/1     CrashLoopBackOff   3 (46s ago)   90s
nginx-65fd5cf854-trpt2   0/1     CrashLoopBackOff   3 (50s ago)   90s

$ oc logs nginx-65fd5cf854-cnvhw
/docker-entrypoint.sh: /docker-entrypoint.d/ is not empty, will attempt to perform configuration
/docker-entrypoint.sh: Looking for shell scripts in /docker-entrypoint.d/
/docker-entrypoint.sh: Launching /docker-entrypoint.d/10-listen-on-ipv6-by-default.sh
10-listen-on-ipv6-by-default.sh: error: can not modify /etc/nginx/conf.d/default.conf (read-only file system?)
/docker-entrypoint.sh: Launching /docker-entrypoint.d/20-envsubst-on-templates.sh
/docker-entrypoint.sh: Configuration complete; ready for start up
2023/01/25 17:37:50 [warn] 1#1: the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
nginx: [warn] the "user" directive makes sense only if the master process runs with super-user privileges, ignored in /etc/nginx/nginx.conf:2
2023/01/25 17:37:50 [emerg] 1#1: mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)
nginx: [emerg] mkdir() "/var/cache/nginx/client_temp" failed (13: Permission denied)
[root@microshift-starter ~]# oc debug pod/nginx-65fd5cf854-cnvhw
Starting pod/nginx-65fd5cf854-cnvhw-debug ...
Pod IP: 10.42.0.33
If you don't see a command prompt, try pressing enter.

$ cd /etc/nginx/conf.d
$ ls -l
total 8
-rw-r--r--. 1 root root       1093 Oct 29  2020 default.conf
-rw-r--r--. 1 root 1000160000   44 Jan 25 17:38 headers.conf
$ touch newfile
touch: cannot touch 'newfile': Permission denied
$ whoami
1000160000

I then grant the privileged SCC to the namespace's default service account. Note that the namespace label pod-security.kubernetes.io/enforce: privileged is now set.

$ oc adm policy add-scc-to-user privileged -z default -n $NAMESPACE
$ oc get ns $NAMESPACE -o yaml
apiVersion: v1
kind: Namespace
metadata:
  annotations:
    openshift.io/sa.scc.mcs: s0:c13,c2
    openshift.io/sa.scc.supplemental-groups: 1000160000/10000
    openshift.io/sa.scc.uid-range: 1000160000/10000
  creationTimestamp: "2023-01-25T17:37:45Z"
  labels:
    kubernetes.io/metadata.name: nginx-lb-test
    pod-security.kubernetes.io/enforce: privileged
    pod-security.kubernetes.io/enforce-version: v1.24
  name: nginx-lb-test
  resourceVersion: "109157"
  uid: a11ba70c-6ae8-4913-8b88-9e2dd784f82f
spec:
  finalizers:
  - kubernetes
status:

However, this isn't sufficient because the pod still runs as the high UID (in my case 1000160000). What we need to do is set spec.template.spec.securityContext.runAsUser: 0 in the pod template of the Deployment to explicitly tell the pod to run as root. My disclaimer still applies: this is NOT a security best practice. Here is the complete YAML manifest, which should probably replace the existing nginx-IP-header.yaml example:

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginx
data:
  headers.conf: |
    add_header X-Server-IP $server_addr always;
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      securityContext:
        runAsUser: 0  # run as root; for testing only, not a security best practice
      containers:
      - image: nginx:1.18
        imagePullPolicy: Always
        name: nginx
        ports:
        - containerPort: 80
        volumeMounts:
        - name: nginx-configs
          subPath: headers.conf
          mountPath: /etc/nginx/conf.d/headers.conf
      volumes:
        - name: nginx-configs
          configMap:
            name: nginx
            items:
              - key: headers.conf
                path: headers.conf

I'd like to hear your thoughts on whether we should move forward with this example for purposes of testing, before I go ahead and update the PR to include these changes.

ggiguash commented

@kevchu3, I think your suggestion makes a lot of sense. The purpose of this example is to demonstrate MetalLB load balancing, so having a simple-enough test is preferable.

Please update the PR with your proposal. Is running as the root user alone sufficient, or do we also need to apply the SCC?
If the SCC change is necessary, let's do it right after creating the namespace.
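
The ordering suggested here can be sketched as follows. The oc commands are the ones used earlier in this conversation; the OC variable and the deploy_nginx_test function name are hypothetical additions for illustration, so that the sequence can be dry-run without a cluster.

```shell
#!/bin/sh
# OC is a hypothetical override so the sequence can be dry-run (OC=echo).
OC=${OC:-oc}
NAMESPACE=${NAMESPACE:-nginx-lb-test}
MANIFEST=https://raw.githubusercontent.com/openshift/microshift/main/docs/config/nginx-IP-header.yaml

deploy_nginx_test() {
    # 1. Create the namespace.
    "$OC" create ns "$NAMESPACE" || return 1
    # 2. Grant the privileged SCC to the default service account right away,
    #    before any pod is created in the namespace.
    "$OC" adm policy add-scc-to-user privileged -z default -n "$NAMESPACE" || return 1
    # 3. Only then apply the workload manifest.
    "$OC" apply -n "$NAMESPACE" -f "$MANIFEST"
}
```

Setting `OC=echo` before calling `deploy_nginx_test` prints the three commands in order instead of running them.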

kevchu3 commented Jan 25, 2023

Looks like the SCC is required. Here's what happens when I create the namespace and workload but don't apply the SCC. I'll create a PR with your latest recommendations.

$ oc get events
LAST SEEN   TYPE      REASON              OBJECT                       MESSAGE
5s          Warning   FailedCreate        replicaset/nginx-9d96b8987   Error creating: pods "nginx-9d96b8987-" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, spec.containers[0].securityContext.runAsUser: Invalid value: 0: must be in the ranges: [1000130000, 1000139999], provider "restricted": Forbidden: not usable by user or serviceaccount, provider "nonroot-v2": Forbidden: not usable by user or serviceaccount, provider "nonroot": Forbidden: not usable by user or serviceaccount, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "hostnetwork-v2": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "topolvm-node": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]
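
The allowed range in the error above, [1000130000, 1000139999], is consistent with a namespace openshift.io/sa.scc.uid-range annotation of the form start/size, like the 1000160000/10000 value shown for the earlier namespace. A minimal sketch of that arithmetic, assuming the start/size form; uid_range_bounds is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print the first and last UID allowed by an
# openshift.io/sa.scc.uid-range annotation value of the form "<start>/<size>".
uid_range_bounds() {
    range=$1
    start=${range%/*}   # text before the slash
    size=${range#*/}    # text after the slash
    echo "$start $((start + size - 1))"
}

# Value taken from the namespace annotation shown earlier in this conversation.
uid_range_bounds "1000160000/10000"
```

Because 0 falls outside any such range, a pod requesting runAsUser: 0 is rejected unless its service account may use an SCC that allows it.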

ggiguash commented

/label tide/merge-method-squash

@openshift-ci openshift-ci Bot added the tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges. label Jan 26, 2023
ggiguash commented

@kevchu3, thank you for your help.
/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jan 26, 2023

openshift-ci Bot commented Jan 26, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ggiguash, kevchu3


@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jan 26, 2023

openshift-ci Bot commented Jan 26, 2023

@kevchu3: all tests passed!


@openshift-merge-robot openshift-merge-robot merged commit 245661e into openshift:main Jan 26, 2023

Development

Successfully merging this pull request may close these issues.

[BUG] MetalLB instructions fail on k8s 1.25+
