From 377a78bb9fd2253b40035ac8c26f8bb217a8dc49 Mon Sep 17 00:00:00 2001
From: "W. Trevor King"
Date: Mon, 16 Dec 2024 17:38:15 -0800
Subject: [PATCH] pkg/operator/status: Drop PoolUpdating as an
 Upgradeable=False condition

956e7874dc (Implement Upgrade-Monitor, FeatureGate, and MachineConfigNode
types, 2023-11-28, #4012) had added the "this should no longer trigger
when adding a node to a pool" comment, but unfortunately, it's still
triggering. For example, in [1]:

  $ curl -s https://storage.googleapis.com/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-gcp-ovn-serial-crun/1868424902256627712/build-log.txt | grep 'PoolUpdating' | sort | uniq
  time="2024-12-16T01:43:52Z" level=info msg="operator status: processing event" event="Dec 16 00:55:35.662 W clusteroperator/machine-config condition/Upgradeable reason/PoolUpdating status/False One or more machine config pools are updating, please see `oc get mcp` for further details" operator=machine-config

Checking PromeCIeus, the Upgradeable=False window seems to have been
00:56 through 00:59, which correlates with the scale-up/scale-down of
the serial suite:

  $ curl -s https://storage.googleapis.com/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-gcp-ovn-serial-crun/1868424902256627712/build-log.txt | grep 'Managed cluster should grow and decrease when scaling different machineSets simultaneously'
  started: 0/20/74 "[sig-cluster-lifecycle][Feature:Machines][Serial] Managed cluster should grow and decrease when scaling different machineSets simultaneously [Timeout:30m][apigroup:machine.openshift.io] [Suite:openshift/conformance/serial]"
  passed: (5m42s) 2024-12-16T00:57:49 "[sig-cluster-lifecycle][Feature:Machines][Serial] Managed cluster should grow and decrease when scaling different machineSets simultaneously [Timeout:30m][apigroup:machine.openshift.io] [Suite:openshift/conformance/serial]"

confirmed via MCC logs:

  $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-gcp-ovn-serial-crun/1868424902256627712/artifacts/e2e-gcp-ovn-serial-crun/gather-extra/artifacts/pods/openshift-machine-config-operator_machine-config-controller-6f4f46457c-v8b2l_machine-config-controller.log | grep rendered-
  I1216 00:55:35.430231 1 node_controller.go:584] Pool worker[zone=us-central1-f]: node ci-op-k8c03v6z-9149a-r27w7-worker-f-t7rmb: changed annotation machineconfiguration.openshift.io/currentConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2
  I1216 00:55:35.430252 1 node_controller.go:584] Pool worker[zone=us-central1-f]: node ci-op-k8c03v6z-9149a-r27w7-worker-f-t7rmb: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2
  I1216 00:55:36.174629 1 node_controller.go:584] Pool worker[zone=us-central1-a]: node ci-op-k8c03v6z-9149a-r27w7-worker-a-f7hkj: changed annotation machineconfiguration.openshift.io/currentConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2
  I1216 00:55:36.174738 1 node_controller.go:584] Pool worker[zone=us-central1-a]: node ci-op-k8c03v6z-9149a-r27w7-worker-a-f7hkj: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2
  I1216 00:55:41.296273 1 node_controller.go:584] Pool worker[zone=us-central1-b]: node ci-op-k8c03v6z-9149a-r27w7-worker-b-554bt: changed annotation machineconfiguration.openshift.io/currentConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2
  I1216 00:55:41.296306 1 node_controller.go:584] Pool worker[zone=us-central1-b]: node ci-op-k8c03v6z-9149a-r27w7-worker-b-554bt: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2
  I1216 00:55:47.106173 1 node_controller.go:584] Pool worker[zone=us-central1-c]: node ci-op-k8c03v6z-9149a-r27w7-worker-c-hshj2: changed annotation machineconfiguration.openshift.io/currentConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2
  I1216 00:55:47.106201 1 node_controller.go:584] Pool worker[zone=us-central1-c]: node ci-op-k8c03v6z-9149a-r27w7-worker-c-hshj2: changed annotation machineconfiguration.openshift.io/desiredConfig = rendered-worker-6d0e61dc44f24db3272625b901024ed2

In this commit, I'm dropping the code that had been moving the
ClusterOperator to Upgradeable=False on PoolUpdating entirely, instead
of hoping that it doesn't trip. I haven't dug into why the code had
still been tripping. But we want to stay Upgradeable=True while new
nodes scale in, because clusters where nodes are joining should still
be able to update to 4.(y+1). There are node-vs.-control-plane skew
issues that should block updates to 4.(y+1), but they're enforced by
the Kube API server operator [2], and don't need the MCO chipping in.

[1]: https://prow.ci.openshift.org/view/gs/test-platform-results/logs/periodic-ci-openshift-release-master-nightly-4.19-e2e-gcp-ovn-serial-crun/1868424902256627712
[2]: https://github.com/openshift/cluster-kube-apiserver-operator/pull/1199/commits/9ce4f7477570945dabf414e6282296055cf60438
---
 pkg/operator/status.go | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/pkg/operator/status.go b/pkg/operator/status.go
index 63699ec5ca..eb015e0057 100644
--- a/pkg/operator/status.go
+++ b/pkg/operator/status.go
@@ -285,13 +285,6 @@ func (optr *Operator) syncUpgradeableStatus(co *configv1.ClusterOperator) error
 			break
 		}
 	}
-	// this should no longer trigger when adding a node to a pool. It should only trigger if the node actually has to go through an upgrade
-	// updating and degraded can occur together, in that case defer to the degraded Reason that is already set above
-	if updating && !degraded && !interrupted {
-		coStatusCondition.Status = configv1.ConditionFalse
-		coStatusCondition.Reason = "PoolUpdating"
-		coStatusCondition.Message = "One or more machine config pools are updating, please see `oc get mcp` for further details"
-	}
 	// don't overwrite status if updating or degraded
 	if !updating && !degraded && !interrupted {
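Editor's note (not part of the patch): the decision flow after this change can be sketched as a standalone toy function. The helper name `upgradeableStatus` and the collapsed boolean inputs are illustrative assumptions; the real `syncUpgradeableStatus` sets `Status`, `Reason`, and `Message` fields on a `ClusterOperator` condition rather than returning a string.

```go
package main

import "fmt"

// upgradeableStatus is a hypothetical simplification of the post-patch
// logic in syncUpgradeableStatus. With the PoolUpdating branch deleted,
// a pool that is merely updating (e.g. a freshly scaled-in node syncing
// its rendered config) no longer forces Upgradeable=False; the default
// Upgradeable=True is only written when nothing is in flight.
func upgradeableStatus(updating, degraded, interrupted bool) string {
	if degraded {
		return "False" // degraded Reason is set earlier in the real code
	}
	// The deleted branch used to return "False" here for
	// updating && !degraded && !interrupted.
	if !updating && !degraded && !interrupted {
		return "True"
	}
	return "unchanged" // don't overwrite status while pools are busy
}

func main() {
	fmt.Println(upgradeableStatus(true, false, false))  // updating alone: unchanged
	fmt.Println(upgradeableStatus(false, false, false)) // idle: True
	fmt.Println(upgradeableStatus(false, true, false))  // degraded: False
}
```

Under this sketch's assumptions, a node joining a pool leaves whatever Upgradeable status was previously reported in place instead of flipping it to False, which is the behavioral point of the patch.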