From 377a78bb9fd2253b40035ac8c26f8bb217a8dc49 Mon Sep 17 00:00:00 2001
From: "W. Trevor King"
Date: Mon, 16 Dec 2024 17:38:15 -0800
Subject: [PATCH] pkg/operator/status: Drop PoolUpdating as an
 Upgradeable=False condition

956e7874dc (Implement Upgrade-Monitor, FeatureGate, and MachineConfigNode
types, 2023-11-28, #4012) had added the "this should no longer trigger
when adding a node to a pool" comment, but unfortunately, it's still
triggering. For example, in [1]:

  $ curl -s https://storage.googleapis.com/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-gcp-ovn-serial-crun/1868424902256627712/build-log.txt | grep 'PoolUpdating' | sort | uniq
  time="2024-12-16T01:43:52Z" level=info msg="operator status: processing event" event="Dec 16 00:55:35.662 W clusteroperator/machine-config condition/Upgradeable reason/PoolUpdating status/False One or more machine config pools are updating, please see `oc get mcp` for further details" operator=machine-config

Checking PromeCIeus, the Upgradeable=False window seems to have been
00:56 through 00:59, which correlates with the scale-up/scale-down of
the serial suite:

  $ curl -s https://storage.googleapis.com/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-gcp-ovn-serial-crun/1868424902256627712/build-log.txt | grep 'Managed cluster should grow and decrease when scaling different machineSets simultaneously'
  started: 0/20/74 "[sig-cluster-lifecycle][Feature:Machines][Serial] Managed cluster should grow and decrease when scaling different machineSets simultaneously [Timeout:30m][apigroup:machine.openshift.io] [Suite:openshift/conformance/serial]"
  passed: (5m42s) 2024-12-16T00:57:49 "[sig-cluster-lifecycle][Feature:Machines][Serial] Managed cluster should grow and decrease when scaling different machineSets simultaneously [Timeout:30m][apigroup:machine.openshift.io] [Suite:openshift/conformance/serial]"

confirmed via MCC logs:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-gcp-ovn-serial-crun/1868424902256627712/artifacts/e2e-gcp-ovn-serial-crun/gather-extra/artifacts/pods/openshift-machine-config-operator_machine-config-controller-6f4f46457c-v8b2l_machine-config-controller.log | grep rendered-
  I1216 00:55:35.430231 1 node_controller.go:584] Pool worker[zone=us-central1-f]: node ci-op-k8c03v6z-9149a-r27w7-worker-f-t7rmb: changed annotation machineconfiguration.openshift.io/currentConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2
  I1216 00:55:35.430252 1 node_controller.go:584] Pool worker[zone=us-central1-f]: node ci-op-k8c03v6z-9149a-r27w7-worker-f-t7rmb: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2
  I1216 00:55:36.174629 1 node_controller.go:584] Pool worker[zone=us-central1-a]: node ci-op-k8c03v6z-9149a-r27w7-worker-a-f7hkj: changed annotation machineconfiguration.openshift.io/currentConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2
  I1216 00:55:36.174738 1 node_controller.go:584] Pool worker[zone=us-central1-a]: node ci-op-k8c03v6z-9149a-r27w7-worker-a-f7hkj: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2
  I1216 00:55:41.296273 1 node_controller.go:584] Pool worker[zone=us-central1-b]: node ci-op-k8c03v6z-9149a-r27w7-worker-b-554bt: changed annotation machineconfiguration.openshift.io/currentConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2
  I1216 00:55:41.296306 1 node_controller.go:584] Pool worker[zone=us-central1-b]: node ci-op-k8c03v6z-9149a-r27w7-worker-b-554bt: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2
  I1216 00:55:47.106173 1 node_controller.go:584] Pool worker[zone=us-central1-c]: node ci-op-k8c03v6z-9149a-r27w7-worker-c-hshj2: changed annotation machineconfiguration.openshift.io/currentConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2
  I1216 00:55:47.106201 1 node_controller.go:584] Pool worker[zone=us-central1-c]: node ci-op-k8c03v6z-9149a-r27w7-worker-c-hshj2: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2

In this commit, I'm dropping the code that had been moving the
ClusterOperator to Upgradeable=False on PoolUpdating entirely, instead
of hoping that it doesn't trip. I haven't dug into why the code had
still been tripping. But we want to stay Upgradeable=True while new
nodes scale in, because clusters where nodes are joining should still
be able to update to 4.(y+1). There are node-vs.-control-plane skew
issues that should block updates to 4.(y+1), but they're enforced by
the Kube API server operator [2], and don't need the MCO chipping in.

[1]: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-gcp-ovn-serial-crun/1868424902256627712
[2]: https://github.com/openshift/cluster-kube-apiserver-operator/pull/1199/commits/9ce4f7477570945dabf414e6282296055cf60438
---
 pkg/operator/status.go | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/pkg/operator/status.go b/pkg/operator/status.go
index 63699ec5ca..eb015e0057 100644
--- a/pkg/operator/status.go
+++ b/pkg/operator/status.go
@@ -285,13 +285,6 @@ func (optr *Operator) syncUpgradeableStatus(co *configv1.ClusterOperator) error
 			break
 		}
 	}
-	// this should no longer trigger when adding a node to a pool. It should only trigger if the node actually has to go through an upgrade
-	// updating and degraded can occur together, in that case defer to the degraded Reason that is already set above
-	if updating && !degraded && !interrupted {
-		coStatusCondition.Status = configv1.ConditionFalse
-		coStatusCondition.Reason = "PoolUpdating"
-		coStatusCondition.Message = "One or more machine config pools are updating, please see `oc get mcp` for further details"
-	}
 	// don't overwrite status if updating or degraded
 	if !updating && !degraded && !interrupted {
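Editor's note (not part of the patch): the decision flow after this change can be sketched as a standalone toy function. The helper name `upgradeableStatus` and the collapsed boolean inputs are illustrative assumptions; the real `syncUpgradeableStatus` sets `Status`, `Reason`, and `Message` fields on a `ClusterOperator` condition rather than returning a string.

```go
package main

import "fmt"

// upgradeableStatus is a hypothetical simplification of the post-patch
// logic in syncUpgradeableStatus. With the PoolUpdating branch deleted,
// a pool that is merely updating (e.g. a freshly scaled-in node syncing
// its rendered config) no longer forces Upgradeable=False; the default
// Upgradeable=True is only written when nothing is in flight.
func upgradeableStatus(updating, degraded, interrupted bool) string {
	if degraded {
		return "False" // degraded Reason is set earlier in the real code
	}
	// The deleted branch used to return "False" here for
	// updating && !degraded && !interrupted.
	if !updating && !degraded && !interrupted {
		return "True"
	}
	return "unchanged" // don't overwrite status while pools are busy
}

func main() {
	fmt.Println(upgradeableStatus(true, false, false))  // updating alone: unchanged
	fmt.Println(upgradeableStatus(false, false, false)) // idle: True
	fmt.Println(upgradeableStatus(false, true, false))  // degraded: False
}
```

Under this sketch's assumptions, a node joining a pool leaves whatever Upgradeable status was previously reported in place instead of flipping it to False, which is the behavioral point of the patch.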