pkg/daemon: add a current config on disk check #612
openshift-merge-robot merged 1 commit into openshift:master
/hold
I was just discussing this with @derekwaynecarr, but thought I'd start experimenting to see whether this is needed and whether it works.
this is wrong (note to self)
force-pushed from 301a880 to 381e8c0
force-pushed from 485a3b6 to 6f5f21a
This isn't a fingerprint, it's the whole config right? I find this idea that we serialize all of the files back into JSON as a file in the filesystem kind of funny. But I guess it's not a huge amount of data...
oc get -o yaml machineconfigs/rendered-worker-74e29c89eb37d7aaa30c33c481fae174 | gzip | wc -c
10197
We can certainly assume 10k of data is free.
On the kind-of bikeshed topic, I think /var is probably a better place for this. The ostree model is that /etc is data humans might edit (e.g. "ssh to node and vi").
fixed naming and moved under /var/machine-config-daemon
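As a rough illustration of what "serialize the whole config back into JSON on disk" can look like, here is a minimal sketch. It is not the code from this PR: the `machineConfig` type, the `currentconfig.json` file name, and the `roundTrip` helper are all hypothetical, and the write goes through a temporary file plus rename, which is an assumption about how one might avoid leaving a half-written file behind.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// machineConfig is a hypothetical stand-in for the real MachineConfig object.
type machineConfig struct {
	Name  string            `json:"name"`
	Files map[string]string `json:"files"` // path -> contents
}

// currentConfigFile is an assumed file name, not the one used by the PR.
const currentConfigFile = "currentconfig.json"

// writeCurrentConfig serializes cfg to JSON in a temporary file and renames
// it into place, so a reader never observes a partially written file.
func writeCurrentConfig(dir string, cfg machineConfig) error {
	data, err := json.Marshal(cfg)
	if err != nil {
		return err
	}
	if err := os.MkdirAll(dir, 0o755); err != nil {
		return err
	}
	tmp, err := os.CreateTemp(dir, currentConfigFile+".tmp-*")
	if err != nil {
		return err
	}
	defer os.Remove(tmp.Name()) // fails harmlessly once the rename succeeded
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), filepath.Join(dir, currentConfigFile))
}

// readCurrentConfig loads the config previously written to dir.
func readCurrentConfig(dir string) (machineConfig, error) {
	var cfg machineConfig
	data, err := os.ReadFile(filepath.Join(dir, currentConfigFile))
	if err != nil {
		return cfg, err
	}
	err = json.Unmarshal(data, &cfg)
	return cfg, err
}

// roundTrip writes cfg into a throwaway directory and reads it back.
// A real daemon would use a fixed directory such as /var/machine-config-daemon.
func roundTrip(cfg machineConfig) (machineConfig, error) {
	dir, err := os.MkdirTemp("", "mcd-sketch-")
	if err != nil {
		return machineConfig{}, err
	}
	defer os.RemoveAll(dir)
	if err := writeCurrentConfig(dir, cfg); err != nil {
		return machineConfig{}, err
	}
	return readCurrentConfig(dir)
}

func main() {
	cfg := machineConfig{
		Name:  "rendered-worker-example",
		Files: map[string]string{"/etc/example.conf": "key=value\n"},
	}
	got, err := roundTrip(cfg)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("read back config:", got.Name)
}
```

Storing the full object rather than a hash keeps the later comparison simple, at the cost of the roughly 10k of data measured above.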
Don't we need to do this before we load currentConfig?
uhm, we may, but I can't see why. Would you explain? (I feel dumb)
If the current config isn't available in the apiserver, won't the Get above fail before we can find our cached version?
No wait, the scenario here is that after a restore we still have something in the apiserver, and that is what we really want on the machine itself. If current isn't on the apiserver, there's not much we can do anyway (that is the BZ we got from David yesterday, but this PR isn't tackling it).
This is something I wrote to understand this better:
- current=configA == desired=configA
upgrade
- current=configA != desired=configB (desired being generated by upgrade and it's being applied as new current)
roll out new configs and MCD syncs
- current=configB == desired=configB
...
trigger a backup restore pre-upgrade which isn't calling informers
- current=configA == desired=configA
but now what's on disk is still configB
we now need something which retriggers the MCD to sync to configA
configA is what we need anyway, so it must be in the apiserver after a restore
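The timeline above can be sketched as a small check. This is only an illustration under stated assumptions: `nodeState` and `needsSync` are hypothetical names, configs are reduced to their names, and an empty `OnDisk` value is assumed to mean "nothing recorded yet, so nothing to compare".

```go
package main

import "fmt"

// nodeState is a hypothetical, simplified view of one node. The field names
// are illustrative and are not taken from the machine-config-daemon code.
type nodeState struct {
	Current string // "current" config name from the node annotation (apiserver)
	Desired string // "desired" config name from the node annotation (apiserver)
	OnDisk  string // config name recorded on the node's own disk
}

// needsSync reports whether the daemon has work to do. Besides the usual
// current != desired check, it also flags drift between the annotation and
// what was actually written to disk (the post-restore case).
func needsSync(s nodeState) bool {
	if s.OnDisk != "" && s.OnDisk != s.Current {
		return true
	}
	return s.Current != s.Desired
}

func main() {
	steps := []struct {
		label string
		state nodeState
	}{
		{"steady state", nodeState{"configA", "configA", "configA"}},
		{"upgrade requested", nodeState{"configA", "configB", "configA"}},
		{"upgrade applied", nodeState{"configB", "configB", "configB"}},
		{"after etcd restore", nodeState{"configA", "configA", "configB"}},
	}
	for _, st := range steps {
		fmt.Printf("%-20s needsSync=%v\n", st.label, needsSync(st.state))
	}
}
```

The last step is the interesting one: the annotations agree with each other, so without the on-disk comparison nothing would retrigger a sync back to configA.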
can i ask: when is a backup restore triggered? i feel like i understand what this pr is supposed to do, but not fully why?
Others are still working to spec out how/when/where an etcd backup is performed (and a subsequent restore).
@cgwalters can you check my answer above? I think this is working as intended but I'm waiting on a way to test this out
force-pushed from 6f5f21a to 686b0e5
This is good to review and get in, as it's treated as a bug. The testing comes later, once the other teams/steps are complete.
aws route errors:
I know that this is a bug, but I can't find a BZ or anything else for it. Could we add Bug to the title?
there's no BZ for this, it came out of a conversation with Derek on Slack :( not sure we need one though
@runcom cool! just wanted to make sure I didn't miss it somewhere.
weird, the test didn't re-run
/retest
So I read your comment a few times...I am finding it really hard to capture the flow/state of things in my head right now.
Edit: ignore this
It feels like this would all be a lot simpler to reason about if basically we had /var/machine-config-daemon/config<hash>.json and we kept those for every config version that was relevant to us (current, desired) and used it if it was present instead of talking to the api server.
This would also help address the other BZ about having things be deleted from the apiserver.
That said...after thinking about this more I think you're right, it will work. What I was worried about is "if we're saying currentConfig is desiredConfig, how do we then actually honor the real desiredConfig?". But if they are different then we'll do a config transition and then on the next boot we should transition to the real desiredConfig?
OK here's another way to look at it - we're basically saying we can't trust our annotations to describe our current state, the real state is a file on disk. So why are we looking at the annotation at all, versus setting it to the right thing if it's different from what's on disk?
> So why are we looking at the annotation at all, versus setting it to the right thing if it's different from what's on disk?
Right, I had the same thought, assuming we're talking about the currentConfig (desiredConfig has to come from somewhere, i.e. the apiserver, otherwise we can't really progress, right?). So I guess a natural follow-up to this would be to avoid talking to the apiserver when querying the currentConfig and just use what's reflected on disk, right?
> That said...after thinking about this more I think you're right, it will work. What I was worried about is "if we're saying currentConfig is desiredConfig, how do we then actually honor the real desiredConfig?". But if they are different then we'll do a config transition and then on the next boot we should transition to the real desiredConfig?
right, that's my understanding of what will happen if there's a real drift between currentOnAPI and currentOnDisk.
Basically, the first sync would reconcile currentOnAPI with currentOnDisk. After we're done with that, there will be another immediate sync to honor desiredConfig (which, in our disaster recovery story, is almost always going to be the same as currentOnAPI, unless you back up an etcd with current != desired at the time of the backup, right?)
> It feels like this would all be a lot simpler to reason about if basically we had /var/machine-config-daemon/config<hash>.json and we kept those for every config version that was relevant to us (current, desired) and used it if it was present instead of talking to the api server.
I'm not sure how we could keep desired on disk 🤔 that always has to come from the apiserver, right? I feel I'm missing something
or are we saying that in this case the currentConfig is the desired from the fingerprint? would it go from fingerprint->current-> desired? or....?
is there any case where if we have fingerprint, current and desired we would not want to ultimately end up in the desiredConfig?
> fingerprint->current-> desired? or....?
exactly that but in the DR case, fingerprint=configB,current=configA,desired=configA. So it's just one real sync+reboot
> is there any case where if we have fingerprint, current and desired we would not want to ultimately end up in the desiredConfig?
nope, there shouldn't be any such case afaict. Desired is always evaluated at sync+1 if fingerprint and current drift
> One thing bugging me too is...why don't we return fingerprintMC, desiredConfig, nil?
Uhm, yeah, so using my flow from above comments, would that work in this case where I take a snapshot exactly in the middle of a sync where current != desired (cause yolo!):
- current=configA != desired=configB
snapshot
upgrade
- current=configB != desired=configC
roll out new configs and MCD syncs
- current=configC == desired=configC
...
trigger a backup restore pre-upgrade which isn't calling informers
- current=configA != desired=configB
but now what's on disk is still configC
should we first go from configC on disk to configA on disk, and only then configB? Your code would go straight from configC on disk to configB (which avoids a sync, but should we anyway or do we risk missing something?)
> (which avoids a sync, but should we anyway or do we risk missing something?)
I am not sure what we'd miss. It sounds like you're trying to be more conservative here just on general principle? I don't object to that to be clear. But I don't understand how this would be different from any other config change.
yup, it's really me being just conservative indeed, I guess it would be fine anyway to avoid a sync, so yeah, changing
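To make the two options from this thread concrete, here is a small sketch that plans the config transitions for each. It assumes configs can be reduced to names; `transition`, `planConservative`, and `planDirect` are hypothetical helpers, not functions from the daemon. The direct variant corresponds to treating the on-disk config as the real current config and pairing it with desiredConfig.

```go
package main

import "fmt"

// transition is one config change applied on the node (hypothetical type).
type transition struct{ From, To string }

// planConservative first reconciles the disk with the current annotation,
// and only then moves on to the desired config.
func planConservative(onDisk, current, desired string) []transition {
	var plan []transition
	if onDisk != current {
		plan = append(plan, transition{onDisk, current})
	}
	if current != desired {
		plan = append(plan, transition{current, desired})
	}
	return plan
}

// planDirect treats what is on disk as the real current config and goes
// straight to desired, skipping the intermediate sync.
func planDirect(onDisk, desired string) []transition {
	if onDisk == desired {
		return nil
	}
	return []transition{{onDisk, desired}}
}

func main() {
	// Snapshot taken mid-sync, restored after the node reached configC.
	fmt.Println(planConservative("configC", "configA", "configB"))
	// prints [{configC configA} {configA configB}]
	fmt.Println(planDirect("configC", "configB"))
	// prints [{configC configB}]
}
```

In the plain disaster recovery case (on disk configB, current and desired both configA) the two plans coincide in a single configB to configA transition, which is why skipping the extra sync only matters for the mid-sync snapshot scenario.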
Signed-off-by: Antonio Murdaca <runcom@linux.com>
force-pushed from 686b0e5 to 070dedd
alrighty, let's see what tests say and then pull the trigger on this for the DR story
/hold cancel
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: kikisdeliveryservice, runcom
/retest
Mimic (and copy from) pkg/controller testing. Hopefully one day I'll spend a week trying to abstract all the duplicate code in the pkg/controller tests and now pkg/daemon...but not today. This patch adds tests for openshift#612, which would greatly benefit from some testing. This can be used as a start to add more tests *hint hint*
Signed-off-by: Antonio Murdaca <runcom@linux.com>
- What I did
Add an on-disk fingerprint of what our current config is on a node (avoiding resync if we changed it or restored from an etcd backup). We check what we have on disk against what we have in annotations before triggering a sync.
- How to verify it
- Description for the changelog