node_controller: mark unavailable if configs differ #699

runcom wants to merge 1 commit into openshift:master
Conversation
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: runcom

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
Skip Unreconcilable nodes to allow them to roll back though. The NodeController shouldn't rely just on what it's syncing at that moment to deduce that a node is unavailable. It may happen that the pool is updating to rendered-config-A but isn't done, and the code would still go ahead and apply a new rendered-config-B, causing more than one node at a time to go unschedulable. This patch should fix that.
```diff
  nodeNotReady := !isNodeReady(node)
- if dconfig == currentConfig && (dconfig != cconfig || nodeNotReady) {
+ // we want to be able to roll back if a bad MC caused an unreconcilable state
+ if (dconfig != cconfig || nodeNotReady) && dstate != daemonconsts.MachineConfigDaemonStateUnreconcilable {
```
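For context, here is a minimal sketch of how a check like this could sit inside the controller's unavailability helper. The helper name `getUnavailableMachines`, the annotation keys, and the state constant are modeled on machine-config-operator conventions, but read them as assumptions rather than the actual code:

```go
package node

import corev1 "k8s.io/api/core/v1"

// Annotation keys and the unreconcilable state value, modeled on
// machine-config-daemon conventions; treat the exact strings as assumptions.
const (
	currentConfigAnnotation = "machineconfiguration.openshift.io/currentConfig"
	desiredConfigAnnotation = "machineconfiguration.openshift.io/desiredConfig"
	daemonStateAnnotation   = "machineconfiguration.openshift.io/state"
	stateUnreconcilable     = "Unreconcilable"
)

// isNodeReady is a stand-in for the real readiness check on node conditions.
func isNodeReady(node *corev1.Node) bool {
	for _, c := range node.Status.Conditions {
		if c.Type == corev1.NodeReady {
			return c.Status == corev1.ConditionTrue
		}
	}
	return false
}

// getUnavailableMachines (hypothetical) counts a node as unavailable whenever
// its desired and current configs differ or it is not ready -- regardless of
// which rendered config the pool is targeting right now -- but skips
// Unreconcilable nodes so a bad MachineConfig can still be rolled back.
func getUnavailableMachines(nodes []*corev1.Node) []*corev1.Node {
	var unavailable []*corev1.Node
	for _, node := range nodes {
		cconfig := node.Annotations[currentConfigAnnotation]
		dconfig := node.Annotations[desiredConfigAnnotation]
		dstate := node.Annotations[daemonStateAnnotation]
		nodeNotReady := !isNodeReady(node)
		if (dconfig != cconfig || nodeNotReady) && dstate != stateUnreconcilable {
			unavailable = append(unavailable, node)
		}
	}
	return unavailable
}
```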
I think this makes sense...just a note to self that the function is now not the inverse of getReadyMachines.
(also this PR will conflict with the doc comments I added in the other PR)
> I think this makes sense...just a note to self that the function is now not the inverse of getReadyMachines.
I'm not really sure about this actually, but wanted to check what tests say :(
/test e2e-aws-op
OK, right, so this PR is aiming to fix #697 (comment); I think I see now how it could be doing so. However, it feels like there are a lot of other related logic issues here. For example: if we have a node that went degraded, and we're targeting a new config, we should probably fix that one first, right? Also, this code should recognize that it's always safe to revert a node to its current config. IOW, we don't need to respect maxUnavailable for that.
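To illustrate the "reverting is always safe" point: if the desired config we're about to set equals the config the node is already running, the change cannot take the node down, so it arguably shouldn't count against the unavailable budget. A minimal sketch, reusing the hypothetical annotation constants from the sketch above:

```go
// isSafeRevert (hypothetical) reports whether assigning desiredConfig to the
// node is a revert to the config it is already running. Such an assignment
// can never make the node unavailable, so it need not be throttled by
// maxUnavailable.
func isSafeRevert(node *corev1.Node, desiredConfig string) bool {
	return node.Annotations[currentConfigAnnotation] == desiredConfig
}
```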
OK, I feel like I'm down a big rabbit hole here...trying to figure out what we really intend things like "ready" versus "updated" versus "unavailable" to mean, and what the state transitions should be. In the end...what's clearly broken right now is reverting an MC. I don't quite understand how this test even passed before. We also really need to verify that we're not ever exceeding maxUnavailable.
I believe we were never honoring that in reality, since we're wrongly checking a given MC for a pool (that's why I dropped that check here). So yeah, we need to make sure about that.
This has worked until now since we effectively allow, e.g., one node to be degraded for a given rendered-MC of a pool, but we can still make progress if the rendered-MC for the pool changes (I hope that makes sense; maybe I'm not communicating it well).
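To make the failure mode concrete: when the unavailable count is filtered by the pool's current target config, a node still mid-update to an older rendered config drops out of the count the moment the target changes, so each new rendered config effectively resets the budget. A hedged before/after sketch (names assumed, constants from the sketch above):

```go
// countUnavailableBuggy filters by the pool's latest rendered config, so a
// node mid-update to rendered-config-A stops counting as unavailable the
// moment the pool moves on to rendered-config-B.
func countUnavailableBuggy(nodes []*corev1.Node, poolTarget string) int {
	n := 0
	for _, node := range nodes {
		dconfig := node.Annotations[desiredConfigAnnotation]
		cconfig := node.Annotations[currentConfigAnnotation]
		if dconfig == poolTarget && dconfig != cconfig {
			n++
		}
	}
	return n
}

// countUnavailableFixed treats any node whose desired config differs from its
// current config as in flight, and therefore unavailable, no matter which
// rendered config the pool is targeting now.
func countUnavailableFixed(nodes []*corev1.Node) int {
	n := 0
	for _, node := range nodes {
		if node.Annotations[desiredConfigAnnotation] != node.Annotations[currentConfigAnnotation] {
			n++
		}
	}
	return n
}
```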
Yeah, I guess we need to reverse-engineer all that code and cross our fingers that the current unit tests will catch any regression we might introduce during a potential refactor.
I took a stab at this in #701
Is there a specific order these (now) 3 PRs need to go in?
Or is #701 going to be the PR?
I think we don't know yet 😦 The other upgrade BZ is taking most of the mental energy right now and this bug is...ugly. I am fearing the rabbit hole that trying to fix it is going to lead us down.
Retesting in the event that's a flake.

/test e2e-aws-op
So this one did pass CI, and it's obviously a lot simpler than #701. However, this PR isn't explicitly trying to fix https://bugzilla.redhat.com/show_bug.cgi?id=1707212 either, though it may be doing so also? I think the key is the combination of not filtering unavailable nodes by the target config, and also explicitly checking the MCD state. Both PRs do that. I found the logic very confusing before though, and #701 goes a lot farther in trying to address it; we avoid the duplicate logic.
I'm closing this in favor of #701, which I like more.
Bigger rewrite for openshift#699

This should ensure that the node controller avoids exceeding maxUnavailable, by changing the notion of "unavailable" to mean nodes which are either not ready, *or* may become not ready as they're working on an update. That list is also further filtered by nodes which are degraded/unreconcilable. Also while we're here, just hardcode `maxUnavailable=1` for any pool with the name `master`, since that's all we support. Note the clearer separation between "degraded" and "unavailable" now required changes in the tests.
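A rough sketch of the throttling that commit message describes, with a pared-down stand-in for the real MachineConfigPool type; the helper names are assumptions:

```go
// poolInfo is a pared-down stand-in for the real MachineConfigPool type.
type poolInfo struct {
	Name           string
	MaxUnavailable int // 0 means unset
}

// maxUnavailableFor mirrors the behavior described above: master pools are
// pinned to 1; other pools use their configured value, defaulting to 1.
func maxUnavailableFor(p poolInfo) int {
	if p.Name == "master" {
		return 1
	}
	if p.MaxUnavailable > 0 {
		return p.MaxUnavailable
	}
	return 1
}

// updateCapacity is how many additional nodes may start updating right now,
// given how many nodes are already unavailable.
func updateCapacity(p poolInfo, unavailable int) int {
	if c := maxUnavailableFor(p) - unavailable; c > 0 {
		return c
	}
	return 0
}
```

With unavailability defined as "not ready, or working on an update", this cap is what keeps a pool from surging past one master node at a time.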