Skip to content

daemon: Have nodeWriter maintain ref to lister and node name#3143

Merged
openshift-merge-robot merged 2 commits intoopenshift:masterfrom
cgwalters:nodewriter-rework-prep
May 12, 2022
Merged

daemon: Have nodeWriter maintain ref to lister and node name#3143
openshift-merge-robot merged 2 commits intoopenshift:masterfrom
cgwalters:nodewriter-rework-prep

Conversation

@cgwalters
Copy link
Copy Markdown
Member

This is just the first two commits from #3141

I think we can safely land this now.

--

daemon: Have nodeWriter maintain ref to lister and node name

The NodeWriter is an instance of the "actor" pattern effectively;
it processes messages to update its internal state.

However, it's a bit redundant to have every update take the node
lister and node name - we only use this on the daemon side,
where there is exactly one node we will be updating.

Have the clusterNodeWriter instance of this interface cache
references to that data and avoid passing it on every message.

Prep for reworking the NodeWriter to have its own clientset.


controller: Export default resync period function

So I can use this from the daemon code.


cgwalters added 2 commits May 9, 2022 19:29
The NodeWriter is an instance of the "actor" pattern effectively;
it processes messages to update its internal state.

However, it's a bit redundant to have every update take the node
lister and node name - we only use this on the daemon side,
where there is exactly one node we will be updating.

Have the clusterNodeWriter instance of this interface cache
references to that data and avoid passing it on every message.

Prep for reworking the NodeWriter to have its own clientset.
So I can use this from the daemon code.
@openshift-ci openshift-ci Bot requested review from cheesesashimi and raisaat May 10, 2022 22:03
@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 10, 2022
@kikisdeliveryservice
Copy link
Copy Markdown
Contributor

WFM just waiting to see e2e
/assign

Comment thread pkg/daemon/daemon.go
dn.mcLister = mcInformer.Lister()
dn.mcListerSynced = mcInformer.Informer().HasSynced

dn.nodeWriter = newNodeWriter(dn.name, dn.kubeClient.CoreV1().Nodes(), dn.nodeLister)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you plan on creating both kubeclients in this main connect function, and then passing one to the nodewriter? I think having nodewriter maintaining it's own kubeclient (as you did in #3141) could work, and then the MCD SA could just have read/list/get to read off of the node. (Which would match the kubelet's perms according to https://kubernetes.io/docs/reference/access-authn-authz/node/ so its no extra harm)

Alternatively, we could have dn.kubeClient be initialized off of kubelet and have both nodewriter and regular sync loop use that, and then have just a special client for reading MC objects used by the update loop. Which do we prefer?

Copy link
Copy Markdown
Member Author

@cgwalters cgwalters May 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The later patch initializes the nodewriter kubeclient inside its initialization.

I think we have multiple phases here:

phase 0: node writes through nodewriter via kubelet auth
phase 1: s/nodewriter/nodeagent/ and all node operations through it including reads
phase 2: stop the daemon reading machineconfig at all, and switch to mounted configmap/secret and dedup things with hypershift then there's no separate service account at all

I'm mainly aiming for phase 0.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm ok, let's aim for phase 0 for now. I'll rebase onto 3141, and then still use the regular SA, and change the RBAC to only have node non-write perms and MC non-write perms. Do you think that makes sense?

I'm +1 on merging this as a refactor in general so

/approve

@kikisdeliveryservice
Copy link
Copy Markdown
Contributor

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label May 11, 2022
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 11, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, kikisdeliveryservice, yuqi-zhang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [cgwalters,kikisdeliveryservice,yuqi-zhang]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

4 similar comments
@openshift-bot
Copy link
Copy Markdown
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 12, 2022

@cgwalters: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit 605b94b into openshift:master May 12, 2022
yuqi-zhang added a commit to yuqi-zhang/machine-config-operator that referenced this pull request Jun 27, 2022
As part of openshift#3143
we re-ordered the clusterconnect operations. This meant that the login
monitor started before the nodewriter object was initialized, which
caused a race where the monitor could try to write the ssh accessed
annotation before the nodewriter was initialized, causing a panic.

This is good to get fixed, but also note that:
1. this is relatively rare since most users should not be ssh'ing
2. the general ssh accessed annotation is broken such that its not
   always applied, but we definitely should not panic here.
openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/machine-config-operator that referenced this pull request Jul 4, 2022
As part of openshift#3143
we re-ordered the clusterconnect operations. This meant that the login
monitor started before the nodewriter object was initialized, which
caused a race where the monitor could try to write the ssh accessed
annotation before the nodewriter was initialized, causing a panic.

This is good to get fixed, but also note that:
1. this is relatively rare since most users should not be ssh'ing
2. the general ssh accessed annotation is broken such that its not
   always applied, but we definitely should not panic here.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants