MCO-831: added feature gate to mco for on cluster builds #4060
|
/retest |
|
/retest |
cdoern
left a comment
Looks good, this should work since the operator is already wired up to wait for featuregates.
|
@dkhater-redhat here is some info on the failures: it looks like it's possible the FGs changed since we last vendored in the API, which is why unit is failing. Make sure you check the cluster config operator and/or API to see if anyone removed any features for the default FG. If they did, you'll need to edit
Other than that, verify is failing because you bumped k8s to a version that changed what Errorf and a few other functions allow as formatted args; you'll need to go into the functions and change
Going to take a look at e2e failures now, I have a feeling it's all FG related. |
|
actually. It seems the failure is on build: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/4060/pull-ci-openshift-machine-config-operator-master-e2e-gcp-op/1732782030187401216/build-log.txt at the bottom either
I'll go look at c/common to make sure they didn't delete anything |
yuqi-zhang
left a comment
Could you please squash the commits? Thanks!
f3efccc to b62162e
|
/test ci/prow/unit |
|
@dkhater-redhat: The specified target(s) for
The following commands are available to trigger optional jobs:
Use
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/test unit |
99c8cf4 to 06c6bb6
|
Do you have unit tests that depend on the build controller working? If so, those aren't going to work since the FG isn't enabled. Also, if they are in unit tests, and you try to check if a FG exists you will get errors. I can't tell if that is the issue or if the pods can't build. Try make binaries locally maybe and see if there are any failures? |
|
There are also "does not support" errors for formatting in here: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/4060/pull-ci-openshift-machine-config-operator-master-unit/1732880126053453824/build-log.txt |
|
/test unit |
9e1a1ac to 5b26e22
|
/retest-required |
|
I know the SCOS test isn't required, but the operator pod on SCOS is panicking because the SCOS featuregate list doesn't have the new featuregate in it:
And it's right, it doesn't:
But it does have |
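A rough sketch of why a missing gate can turn into a panic rather than a plain "false", assuming a strict lookup that treats unknown gate names as a programming error. `gateSet`, `mustEnabled`, and `enabledOrDefault` are hypothetical names for this illustration, not the operator's actual code.

```go
package main

import "fmt"

// gateSet is a hypothetical stand-in for a cluster's feature gate list:
// every known gate is either enabled (true) or disabled (false).
type gateSet map[string]bool

// mustEnabled mimics a strict lookup: asking about a gate that is not in
// the list at all is treated as a programming error and panics.
func (g gateSet) mustEnabled(name string) bool {
	enabled, known := g[name]
	if !known {
		panic(fmt.Sprintf("feature gate %q is not registered", name))
	}
	return enabled
}

// enabledOrDefault is a tolerant lookup: unknown gates fall back to def.
func (g gateSet) enabledOrDefault(name string, def bool) bool {
	if enabled, known := g[name]; known {
		return enabled
	}
	return def
}

func main() {
	// Hypothetical gate list that does not contain the new gate.
	gates := gateSet{"EventedPLEG": false}

	fmt.Println(gates.enabledOrDefault("OnClusterBuild", false))

	// The strict lookup panics; recover only to show the message.
	defer func() { fmt.Println("recovered:", recover()) }()
	gates.mustEnabled("OnClusterBuild")
}
```

With a strict lookup, the gate has to appear in every feature set the operator can run against (enabled or disabled) before the code asks about it.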
|
/hold |
|
/retest-required |
|
By default, featureGate OnClusterBuild is disabled.
$ oc get featuregate/cluster -o yaml | yq -y '.status.featureGates[].disabled' | grep OnClusterBuild
- name: OnClusterBuild
Try to turn on OCB when this featureGate is disabled.
# create custom mcp
$ cat infra.mcp.yaml
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfigPool
metadata:
  name: infra
spec:
  machineConfigSelector:
    matchExpressions:
      - {key: machineconfiguration.openshift.io/role, operator: In, values: [worker,infra]}
  nodeSelector:
    matchLabels:
      node-role.kubernetes.io/infra: ""
$ oc apply -f infra.mcp.yaml
machineconfigpool.machineconfiguration.openshift.io/infra created
$ oc label node/ip-10-0-118-209.us-west-1.compute.internal node-role.kubernetes.io/infra=
node/ip-10-0-118-209.us-west-1.compute.internal labeled
$ mcp infra
NAME CONFIG UPDATED UPDATING DEGRADED MACHINECOUNT READYMACHINECOUNT UPDATEDMACHINECOUNT DEGRADEDMACHINECOUNT AGE
infra rendered-infra-ee03b72baa24ec9b3184492526b023b3 True False False 1 1 1 0 11m
# create configmap, pullsecret etc.
$ cat cm-on-cluster-build-config.yaml
apiVersion: v1
data:
  baseImagePullSecretName: mco-global-pull-secret
  finalImagePushSecretName: mco-test-pull-secret
  finalImagePullspec: "quay.io/mcoqe/layering"
kind: ConfigMap
metadata:
  name: on-cluster-build-config
  namespace: openshift-machine-config-operator
$ oc apply -f cm-on-cluster-build-config.yaml
configmap/on-cluster-build-config created
$ oc apply -f base-image-pull-secret.yaml
secret/mco-global-pull-secret created
$ oc apply -f final-image-pull-secret.yaml
secret/mco-test-pull-secret created
$ oc get cm/on-cluster-build-config -n openshift-machine-config-operator
NAME DATA AGE
on-cluster-build-config 3 60s
$ oc get secret -n openshift-machine-config-operator | grep pull-secret
mco-global-pull-secret kubernetes.io/dockerconfigjson 1 81s
mco-test-pull-secret kubernetes.io/dockerconfigjson 1 65s
# label mcp/infra
$ oc label mcp/infra machineconfiguration.openshift.io/layering-enabled=
machineconfigpool.machineconfiguration.openshift.io/infra labeled
$ oc get mcp/infra -o yaml | yq -y '.metadata.labels'
machineconfiguration.openshift.io/layering-enabled: ''
# check deployment
$ oc get deployment -n openshift-machine-config-operator
NAME READY UP-TO-DATE AVAILABLE AGE
machine-config-controller 1/1 1 1 63m
machine-config-operator 1/1 1 1 67m
So when the featureGate is disabled, the deployment is not created even if the required resources exist, and the feature cannot be enabled.
Patch featuregate/cluster to enable featureGate OnClusterBuild:
$ cat ../mc/featuregate_techpreview.yaml
apiVersion: config.openshift.io/v1
kind: FeatureGate
metadata:
  name: cluster
spec:
  featureSet: TechPreviewNoUpgrade
$ oc apply -f ../mc/featuregate_techpreview.yaml
Warning: resource featuregates/cluster is missing the kubectl.kubernetes.io/last-applied-configuration annotation which is required by oc apply. oc apply should only be used on resources created declaratively by either oc create --save-config or oc apply. The missing annotation will be patched automatically.
featuregate.config.openshift.io/cluster configured
$ oc get featuregate/cluster -o yaml | yq -y '.status.featureGates[].enabled' | grep OnClusterBuild
- name: OnClusterBuild
Check whether the deployment and pod are ready and running:
$ oc get deployment/machine-os-builder -n openshift-machine-config-operator
NAME READY UP-TO-DATE AVAILABLE AGE
machine-os-builder 1/1 1 1 20m
$ oc get pod -n openshift-machine-config-operator -l k8s-app=machine-os-builder
NAME READY STATUS RESTARTS AGE
machine-os-builder-bdc4f7d8c-cm6c2 1/1 Running 0 20m
btw, featureGate:
$ oc get featuregate/cluster -o yaml | yq -y '.status.featureGates[].disabled'
- name: ClusterAPIInstall
- name: DisableKubeletCloudCredentialProviders
- name: EventedPLEG
- name: MachineAPIOperatorDisableMachineHealthCheckController |
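The `yq | grep` checks above can also be done programmatically. Below is a minimal sketch, assuming the FeatureGate object is available as JSON (for example from `oc get featuregate/cluster -o json`) and that `status.featureGates[]` carries `enabled` and `disabled` lists of `{name: ...}` entries, as the command output in this comment suggests. The type and function names, and the sample JSON in `main`, are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// gateName mirrors one "- name: ..." entry in the enabled/disabled lists.
type gateName struct {
	Name string `json:"name"`
}

// featureGate models only the fields this sketch needs from the
// FeatureGate object; everything else in the JSON is ignored.
type featureGate struct {
	Status struct {
		FeatureGates []struct {
			Enabled  []gateName `json:"enabled"`
			Disabled []gateName `json:"disabled"`
		} `json:"featureGates"`
	} `json:"status"`
}

// gateState reports "enabled", "disabled", or "unknown" for the named gate.
func gateState(raw []byte, name string) (string, error) {
	var fg featureGate
	if err := json.Unmarshal(raw, &fg); err != nil {
		return "", fmt.Errorf("decoding FeatureGate: %w", err)
	}
	for _, entry := range fg.Status.FeatureGates {
		for _, g := range entry.Enabled {
			if g.Name == name {
				return "enabled", nil
			}
		}
		for _, g := range entry.Disabled {
			if g.Name == name {
				return "disabled", nil
			}
		}
	}
	return "unknown", nil
}

func main() {
	// Hand-written sample data, not output captured from a real cluster.
	sample := []byte(`{"status":{"featureGates":[{
		"enabled":[{"name":"OnClusterBuild"}],
		"disabled":[{"name":"EventedPLEG"}]}]}}`)

	for _, name := range []string{"OnClusterBuild", "EventedPLEG", "NotListed"} {
		state, err := gateState(sample, name)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Printf("%s: %s\n", name, state)
	}
}
```

Returning "unknown" separately from "disabled" keeps the case where a gate is missing from the list entirely distinguishable from the case where it is present but turned off.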
|
/retitle MCO-831 added feature gate to mco for on cluster builds |
|
CI is just rotten today |
|
/hold Revision 251523f was retested 3 times: holding |
|
There were issues with build02 earlier, hopefully better now.
/hold cancel |
|
/test e2e-gcp-op |
|
/override ci/prow/e2e-gcp-op-single-node |
|
@jkyros: Overrode contexts on behalf of jkyros: ci/prow/e2e-gcp-op-single-node |
|
hahaha I don't think build02 is fixed: |
|
/test e2e-gcp-op |
|
It's not telling me here what the conflict is but locally it looks like we just got beat to a dependency bump, probably #4119 ? |
|
That looks like it worked -- the tests passed, we just failed in teardown. I'd pull out that "merge" commit 09ec41d (I assume that was just collateral damage from the rebase or somesuch) and then we can try again to get this in? 😄 |
12a01d0 to 1f41fcb
1f41fcb to df610a6
|
/test e2e-hypershift |
|
/lgtm |
|
/override ci/prow/e2e-gcp-op-single-node |
|
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: cdoern, dkhater-redhat, jkyros |
|
@jkyros: Overrode contexts on behalf of jkyros: ci/prow/e2e-gcp-op-single-node |
|
@dkhater-redhat: The following tests failed, say
Full PR test history. Your PR dashboard. |