daemon: Have nodeWriter maintain ref to lister and node name #3143
The NodeWriter is an instance of the "actor" pattern effectively; it processes messages to update its internal state. However, it's a bit redundant to have every update take the node lister and node name - we only use this on the daemon side, where there is exactly one node we will be updating. Have the clusterNodeWriter instance of this interface cache references to that data and avoid passing it on every message. Prep for reworking the NodeWriter to have its own clientset.
So I can use this from the daemon code.
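A minimal Go sketch of the idea described above, not the actual MCO code: the writer stores the node name and lister once at construction, so each message carries only the change to apply. The `node`, `nodeLister`, `mapLister`, `message`, and `SetAnnotations` names are hypothetical stand-ins for the real client-go and daemon types.

```go
package main

import "fmt"

// node is a hypothetical stand-in for corev1.Node; only annotations matter here.
type node struct {
	name        string
	annotations map[string]string
}

// nodeLister is a hypothetical, minimal stand-in for a client-go node lister.
type nodeLister interface {
	Get(name string) (*node, error)
}

// mapLister is a toy in-memory lister used only for this sketch.
type mapLister map[string]*node

func (m mapLister) Get(name string) (*node, error) {
	n, ok := m[name]
	if !ok {
		return nil, fmt.Errorf("node %q not found", name)
	}
	return n, nil
}

// message is one update request; note it no longer carries the
// lister or the node name.
type message struct {
	annotations map[string]string
	done        chan error
}

// clusterNodeWriter caches the node name and lister for the single
// node the daemon manages.
type clusterNodeWriter struct {
	nodeName string
	lister   nodeLister
	writer   chan message
}

func newNodeWriter(nodeName string, lister nodeLister) *clusterNodeWriter {
	return &clusterNodeWriter{nodeName: nodeName, lister: lister, writer: make(chan message)}
}

// Run is the actor loop: it processes one message at a time until stop is closed.
func (w *clusterNodeWriter) Run(stop <-chan struct{}) {
	for {
		select {
		case <-stop:
			return
		case msg := <-w.writer:
			msg.done <- w.apply(msg.annotations)
		}
	}
}

// apply looks the node up through the cached lister and merges the annotations.
func (w *clusterNodeWriter) apply(annos map[string]string) error {
	n, err := w.lister.Get(w.nodeName)
	if err != nil {
		return err
	}
	if n.annotations == nil {
		n.annotations = map[string]string{}
	}
	for k, v := range annos {
		n.annotations[k] = v
	}
	return nil
}

// SetAnnotations sends a message to the actor loop and waits for the result.
func (w *clusterNodeWriter) SetAnnotations(annos map[string]string) error {
	done := make(chan error, 1)
	w.writer <- message{annotations: annos, done: done}
	return <-done
}

func main() {
	lister := mapLister{"worker-0": {name: "worker-0"}}
	w := newNodeWriter("worker-0", lister)
	stop := make(chan struct{})
	go w.Run(stop)
	defer close(stop)

	if err := w.SetAnnotations(map[string]string{"example.com/state": "Done"}); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(lister["worker-0"].annotations["example.com/state"])
}
```

Callers then only describe the change they want; which node to update and how to read it is fixed when the writer is created.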
WFM, just waiting to see e2e
dn.mcLister = mcInformer.Lister()
dn.mcListerSynced = mcInformer.Informer().HasSynced

dn.nodeWriter = newNodeWriter(dn.name, dn.kubeClient.CoreV1().Nodes(), dn.nodeLister)
Do you plan on creating both kubeclients in this main connect function, and then passing one to the nodewriter? I think having nodewriter maintain its own kubeclient (as you did in #3141) could work, and then the MCD SA could just have read/list/get to read off of the node. (Which would match the kubelet's perms according to https://kubernetes.io/docs/reference/access-authn-authz/node/ so it's no extra harm)
Alternatively, we could have dn.kubeClient be initialized off of kubelet and have both nodewriter and regular sync loop use that, and then have just a special client for reading MC objects used by the update loop. Which do we prefer?
The later patch initializes the nodewriter kubeclient inside its initialization.
I think we have multiple phases here:
phase 0: node writes through nodewriter via kubelet auth
phase 1: s/nodewriter/nodeagent/ and all node operations through it including reads
phase 2: stop the daemon reading machineconfig at all, switch to a mounted configmap/secret, and dedup things with hypershift; then there's no separate service account at all
I'm mainly aiming for phase 0.
Hmm ok, let's aim for phase 0 for now. I'll rebase onto 3141, and then still use the regular SA, and change the RBAC to only have node non-write perms and MC non-write perms. Do you think that makes sense?
I'm +1 on merging this as a refactor in general so
/approve
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: cgwalters, kikisdeliveryservice, yuqi-zhang
/retest-required
Please review the full test history for this PR and help us cut down flakes.
@cgwalters: all tests passed!
As part of openshift#3143 we re-ordered the clusterconnect operations. This meant that the login monitor started before the nodewriter object was initialized, which caused a race where the monitor could try to write the ssh accessed annotation before the nodewriter was initialized, causing a panic. This is good to get fixed, but also note that:
1. this is relatively rare, since most users should not be ssh'ing
2. the general ssh accessed annotation is broken such that it's not always applied, but we definitely should not panic here.
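The ordering problem described above can be illustrated with a small Go sketch. This is not the actual fix from the follow-up change; it only shows, under assumed names (`daemon`, `onLogin`, `connect`, `recordingWriter`, `SetSSHAccessed` are all hypothetical), that assigning the writer before starting the monitor goroutine removes the window in which a login event could hit an uninitialized writer, and that a nil check turns the remaining failure mode into an error instead of a panic.

```go
package main

import "fmt"

// nodeWriter is a hypothetical, trimmed-down version of the writer interface.
type nodeWriter interface {
	SetSSHAccessed() error
}

// recordingWriter counts writes so the sketch can be checked.
type recordingWriter struct{ calls int }

func (r *recordingWriter) SetSSHAccessed() error {
	r.calls++
	return nil
}

type daemon struct {
	nodeWriter nodeWriter
}

// onLogin is what a login monitor would call on each detected login.
// Calling a method on a nil interface value panics in Go, so the guard
// returns an error instead.
func (d *daemon) onLogin() error {
	if d.nodeWriter == nil {
		return fmt.Errorf("node writer not initialized")
	}
	return d.nodeWriter.SetSSHAccessed()
}

// connect assigns the writer first and only then starts the monitor
// goroutine, so the monitor can never observe a nil writer.
func (d *daemon) connect(w nodeWriter, logins <-chan struct{}, done chan<- error) {
	d.nodeWriter = w
	go func() {
		for range logins {
			done <- d.onLogin()
		}
		close(done)
	}()
}

func main() {
	d := &daemon{}
	// Before connect: an error is reported rather than a panic.
	fmt.Println(d.onLogin())

	w := &recordingWriter{}
	logins := make(chan struct{})
	done := make(chan error)
	d.connect(w, logins, done)

	logins <- struct{}{} // simulate one login event
	fmt.Println(<-done, w.calls)
	close(logins)
}
```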
This is just the first two commits from #3141
I think we can safely land this now.
--
daemon: Have nodeWriter maintain ref to lister and node name
The NodeWriter is an instance of the "actor" pattern effectively;
it processes messages to update its internal state.
However, it's a bit redundant to have every update take the node
lister and node name - we only use this on the daemon side,
where there is exactly one node we will be updating.
Have the clusterNodeWriter instance of this interface cache
references to that data and avoid passing it on every message.
Prep for reworking the NodeWriter to have its own clientset.
controller: Export default resync period function
So I can use this from the daemon code.