Start of centralizing node writes via nodeWriter #3141

Closed
cgwalters wants to merge 7 commits into openshift:master from cgwalters:nodewriter-rework

Conversation

@cgwalters
Member

Part of #3135


daemon: Have nodeWriter maintain ref to lister and node name

The NodeWriter is effectively an instance of the "actor" pattern;
it processes messages to update its internal state.

However, it's a bit redundant to have every update take the node
lister and node name - we only use this on the daemon side,
where there is exactly one node we will be updating.

Have the clusterNodeWriter instance of this interface cache
references to that data and avoid passing it on every message.

Prep for reworking the NodeWriter to have its own clientset.
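
A minimal sketch of the shape this commit describes, with hypothetical names (`clusterNodeWriter` and `writeRequest` here are illustrative, not the exact MCO types): the writer captures the lister and node name once at construction, so messages no longer carry them.

```go
package daemon

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// nodeLister stands in for the cached informer lister (assumed shape).
type nodeLister interface {
	Get(name string) (*corev1.Node, error)
}

// writeRequest no longer carries the lister or node name.
type writeRequest struct {
	annotations map[string]string
	result      chan error
}

type clusterNodeWriter struct {
	nodeName string     // cached once at construction
	lister   nodeLister // cached once at construction
	requests chan writeRequest
}

func newClusterNodeWriter(nodeName string, lister nodeLister) *clusterNodeWriter {
	return &clusterNodeWriter{
		nodeName: nodeName,
		lister:   lister,
		requests: make(chan writeRequest),
	}
}

// Run is the actor loop: all node mutations are funneled through this
// single goroutine, which processes messages serially.
func (w *clusterNodeWriter) Run(stop <-chan struct{}) {
	for {
		select {
		case req := <-w.requests:
			node, err := w.lister.Get(w.nodeName)
			if err != nil {
				req.result <- err
				continue
			}
			// A real implementation would patch via a clientset here.
			fmt.Printf("would update annotations on node %s: %v\n", node.Name, req.annotations)
			req.result <- nil
		case <-stop:
			return
		}
	}
}

// SetAnnotations is the message-sending side; callers pass only the payload.
func (w *clusterNodeWriter) SetAnnotations(annos map[string]string) error {
	result := make(chan error)
	w.requests <- writeRequest{annotations: annos, result: result}
	return <-result
}
```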


controller: Export default resync period function

So I can use this from the daemon code.
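
A hedged sketch of what the export could look like; the jittered-interval pattern is common in Kubernetes controllers, and the exact name and base duration here are assumptions, not taken from this PR.

```go
package common

import (
	"math/rand"
	"time"
)

// DefaultResyncPeriod returns a function producing a jittered resync
// interval, so informers in different components don't all resync at once.
func DefaultResyncPeriod() func() time.Duration {
	return func() time.Duration {
		base := 24 * time.Hour
		factor := rand.Float64() + 1 // uniform in [1.0, 2.0)
		return time.Duration(float64(base.Nanoseconds()) * factor)
	}
}
```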


daemon: Have nodewriter use kubelet credentials


daemon: Change hypershift path to use nodewriter

Ensure that this code path is using kubelet credentials.


cgwalters added 2 commits May 9, 2022 19:29
@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 10, 2022
@cgwalters
Member Author

Digging into this, there's a fair amount of stuff that's needed:

  • Move all Event notifications to use nodeWriter
  • Fix up a few places that are calling setAnnotations directly to use nodeWriter
  • Drop privs from the RBAC

However, I think after that it would really make the most sense to change nodeWriter into "nodeActor" and have it also be the sole thing providing read access to the node object for the rest of the MCD. This would obviously touch more code. But one immediate thing it would clean up is that today the MCD (I think) watches all nodes via informers, which is clearly just wrong. We should only watch our own node, and then the nodeActor provides a "node changed" event to the rest of the MCD.
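
For the "watch only our own node" piece, a sketch under the assumption that client-go's shared informer factory is in use; the helper name is made up, but the field-selector option is the standard mechanism:

```go
package daemon

import (
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/fields"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
)

// newSingleNodeInformerFactory restricts the node informer's list/watch
// to exactly one node instead of every node in the cluster.
func newSingleNodeInformerFactory(client kubernetes.Interface, nodeName string) informers.SharedInformerFactory {
	return informers.NewSharedInformerFactoryWithOptions(
		client,
		time.Hour, // resync period; arbitrary for this sketch
		informers.WithTweakListOptions(func(opts *metav1.ListOptions) {
			opts.FieldSelector = fields.OneTermEqualSelector("metadata.name", nodeName).String()
		}),
	)
}
```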

Ensure that this code path is using kubelet credentials.
@cgwalters cgwalters marked this pull request as draft May 10, 2022 00:49
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 10, 2022
@cgwalters
Member Author

Ah yes, I forgot to start the informer; classic. Will look at fixing that up.

@cgwalters cgwalters force-pushed the nodewriter-rework branch from 9b7f193 to b8683ac Compare May 10, 2022 15:53
@cgwalters
Member Author

/test e2e-gcp-op

@cgwalters
Member Author

Does this direction make sense to everyone?

@cgwalters
Member Author

To back up to a high level on this - ideally we want to use the kubelet credentials for everything, but the problem is that kubelet can't read e.g. MachineConfig objects today. One option is to just add that power - then we use kubelet creds for everything. It's a bit tempting.

This patch splits our node mutations out to the nodeWriter, which uses kubelet credentials; our service account is then mostly just there to read machineconfigs (and our node - though a future enhancement to this work could also change the nodeWriter into a daemonNodeController or something that mediates access to the node).
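
A sketch of the "separate clientset with kubelet credentials" idea; the kubeconfig path is the conventional one on OpenShift nodes but is an assumption here, as is the function name:

```go
package daemon

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// kubeletKubeconfigPath is the conventional location on OpenShift nodes
// (assumed here, not taken from this PR).
const kubeletKubeconfigPath = "/var/lib/kubelet/kubeconfig"

// newKubeletClientset builds a clientset that authenticates as the node's
// kubelet, for use only by the nodeWriter.
func newKubeletClientset() (kubernetes.Interface, error) {
	config, err := clientcmd.BuildConfigFromFlags("", kubeletKubeconfigPath)
	if err != nil {
		return nil, err
	}
	return kubernetes.NewForConfig(config)
}
```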

@cgwalters
Member Author

I do kind of feel the right long-term direction is to change things so that we unify the hypershift and self-driving cases.

Probably we should move away from a daemonset for self-driving too. Once we do that, it then becomes natural to e.g. put machineconfig in a configmap (or really, a secret) like hypershift is doing. Then the daemon/pod doesn't need any special roles.

Things like ssh access monitoring and config drift detection could instead be implemented with a periodically scheduled pod rather than a daemonset (although at the cost of polling versus being event driven).

Another possible in-between model is daemonset-per-pool - there we can mount the current/desired machineconfig into the pods specifically, rather than having the daemonset fetch them.

@cgwalters
Member Author

> One option is to just add that power - then we use kubelet creds for everything. It's a bit tempting.

I looked at this briefly; I think it'd require us to carry a patch to the apiserver to do that. Definitely ugly. I think I'd lean towards going this direction instead, but open to debate.

@cgwalters
Member Author

/approve cancel
Per my SOP for big changes: big PRs should get another approver and an lgtm.

@openshift-ci openshift-ci Bot removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 10, 2022
@kikisdeliveryservice
Contributor

/test unit
/test verify

Contributor

@kikisdeliveryservice kikisdeliveryservice left a comment


As a first pass (without reviewing the discussion point on the PR), I think this is a solid optimization that yields more readable and centralized code. It always felt odd to have to pass node names through daemon functions because, well... shouldn't it know where it is? 😆

Comment thread cmd/machine-config-daemon/start.go Outdated
Contributor


q: there's a similar block above, but that one leads to a ctrl.common.WriteTerminationError while this one is a fatalf with a write-termination error for the block below. Is this consistent? (It might be; I might be misunderstanding.)

Member Author


(Forgot to look at this one, will do in the next pass)

@cgwalters
Member Author

OK cool. I took the first two commits and put them in #3143 - I think we can get that in now.

I'll add some more work on top of this one for the followup listed above.

@cgwalters cgwalters force-pushed the nodewriter-rework branch from b8683ac to 2ba5dd8 Compare May 11, 2022 20:54
We should be able to rely on `nodeWriter` using kubelet credentials
now.
@cgwalters
Member Author

/test unit
/test verify
/test e2e-gcp-op

@cgwalters
Member Author

/retest

I originally missed that `loadAnnotations` was actually calling
this API, which should have been internal to nodeWriter.

But when I went to fix it, I hit the problem that the code flow
in the daemon relied on getting an updated node object back.

Fixing that required some code churn to also pass the mutated
node object back through the response message.

I think what we *really* want is to wait for the informer
to update and pass that back instead, which then leads
to making this stuff part of a control loop on the daemon
rather than blocking randomly here.

For now though, let's keep the daemon code as is and just change
this bit to use nodeWriter for kubelet credentials.
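
Illustrating the "pass the mutated node back through the response message" shape this commit message describes; the type and field names here are hypothetical:

```go
package daemon

import corev1 "k8s.io/api/core/v1"

// setAnnotationsResponse carries the mutated node back to the caller, so
// daemon code that relied on receiving an updated node object keeps working.
type setAnnotationsResponse struct {
	node *corev1.Node
	err  error
}

type setAnnotationsRequest struct {
	annotations map[string]string
	response    chan setAnnotationsResponse
}

type annotationWriter struct {
	requests chan setAnnotationsRequest
}

// SetAnnotations blocks on the actor's reply and returns the node object
// as the API server saw it after the patch.
func (w *annotationWriter) SetAnnotations(annos map[string]string) (*corev1.Node, error) {
	respCh := make(chan setAnnotationsResponse)
	w.requests <- setAnnotationsRequest{annotations: annos, response: respCh}
	resp := <-respCh
	return resp.node, resp.err
}
```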
@cgwalters cgwalters marked this pull request as ready for review May 11, 2022 22:16
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 11, 2022
@cgwalters
Member Author

OK, let's see how this goes in a CI run; only compile tested locally.

Contributor

@yuqi-zhang yuqi-zhang left a comment


Approving the general direction. Also rebased #3135 onto this

@openshift-ci
Contributor

openshift-ci Bot commented May 11, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: yuqi-zhang

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 11, 2022
@kikisdeliveryservice
Contributor

kikisdeliveryservice commented May 11, 2022

level=fatal msg=failed to fetch Cluster: failed to fetch dependency of "Cluster": failed to generate asset "Platform Quota Check": error(MissingQuota): compute.googleapis.com/n2_cpus is not available in us-central1 because the required number of resources (12) is more than remaining quota of 8

Trying this again; will report if I see the same failure.
/test e2e-gcp-op

@kikisdeliveryservice
Contributor

kikisdeliveryservice commented May 11, 2022

new error, will report:

level=error msg=Error: Error waiting for instance to create: Quota 'N2_CPUS' exceeded.  Limit: 24.0 in region us-central1.
level=error

Reported; it's an issue with some deprovisioning hitting us today.

@cgwalters
Member Author

cgwalters commented May 12, 2022

> {Operator progressing (ProfileProgressing): Waiting for 1/6 Profiles to be applied Operator progressing (ProfileProgressing): Waiting for 1/6 Profiles to be applied}

Hmm, this was a bit concerning at first because I thought NTO might be generating some machineconfig, but AFAICS that isn't the case. I looked at their logs but didn't see the problem.
/test e2e-agnostic-upgrade

EDIT: OK, looks like this is a common failure

@yuqi-zhang
Contributor

The installer PR merged. Let's try

/test e2e-gcp-op

@sinnykumari
Contributor

/test e2e-gcp-op

@openshift-ci
Contributor

openshift-ci Bot commented May 13, 2022

@cgwalters: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-agnostic-upgrade | de3fa84 | link | true | /test e2e-agnostic-upgrade |
| ci/prow/e2e-gcp-op | de3fa84 | link | true | /test e2e-gcp-op |


@cgwalters
Member Author

E0513 12:40:14.615182 1711 writer.go:199] Marking Degraded due to: failed to cordon node (5 tries): [timed out waiting for the condition, cordon error: nodes "ci-op-bsr23qd1-1354f-4tpmg-worker-b-6dh2l" is forbidden: User "system:serviceaccount:openshift-machine-config-operator:machine-config-daemon" cannot patch resource "nodes" in API group "" at the cluster scope]

OK, I think we need to take this PR together with #3135

@cgwalters cgwalters closed this May 13, 2022