Bug 1895360: pkg/daemon: don't delete a file if its replaced with a dropin #2196
360b99a to 92e733e
|
hey @vrutkovs this new func seems like it should have a test, can you add one? |
|
Sure, will add a unit test (verifying that it fixes OKD upgrade first). Not sure how to approach unittesting here - do you know any test which checks |
b2212a7 to b26c28c
|
/retest |
|
Added a unit test, verified that OKD 4.5 can be upgraded using MCO with this patch |
|
Flakes /retest |
|
@vrutkovs: This pull request references Bugzilla bug 1895360, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug |
|
/cherry-pick release-4.6 |
|
@vrutkovs: once the present PR merges, I will cherry-pick it on top of release-4.6 in a new PR and assign it to you. |
|
/skip e2e-gcp-op is permafailing |
|
/skip |
yuqi-zhang
left a comment
I think this patch should fix the bug in question; however, it doesn't seem very bulletproof. Consider the following scenarios:
- Systemd units in the units section would have this problem as well, right? If a systemd unit that was originally written as a file gets replaced by an entry in the units section, we should also handle that correctly.
- The opposite case is, ironically, not handled either: a dropin that was replaced by a file would also not work (why would you ever do this, though?). I guess in general any cross-dependency between the unit and file sections is not handled well.
- If there was a dropin originally on the system (call it version A), which the MCD made a backup of when you wrote the file via an MC (call it version B), and you then try to replace it with a dropin (call it version C), it looks like the logic will end up writing version A to the file instead of version C like we want, since we perform a backup and then a no-op. A rare edge case, but maybe we should account for it anyway for correctness.
Those 3 scenarios are just some that came to mind (my assessment might not be entirely correct here). I am not opposed to merging this fix as is (maybe fix scenario 3?), provided we don't regress any behaviour, but we should consider the interactions between the file and unit sections a bit deeper if that's the route we take.
the diff looks somewhat confusing, but this line should be the essence of the change right?
Right. I had to rework the whole function as gosec was complaining about nested ifs in e79a186 (#2196)
the way github is showing diff is indeed confusing
|
/retest Please review the full test history for this PR and help us cut down flakes. |
|
/hold Not sure why gcp-op won't pass. Holding it for >4 hours, so that fresh release image would be built |
We are waiting for #2229 to merge, so feel free to hold it and check it tomorrow or later to avoid endless retests |
|
/retest |
|
/refresh |
|
@vrutkovs: All pull requests linked via external trackers have merged: Bugzilla bug 1895360 has been moved to the MODIFIED state. |
|
@vrutkovs: new pull request created: #2246 |
Some systemd service settings may be defined either in .Storage.Files or in .Systemd.Units.Dropins. Dropins are preferable, so users may want to migrate to them.
However, if a file representing a dropin was defined in .Storage.Files and converted into .Systemd.Units.Dropins in a new version of the MC, it would be placed by Ignition's dropin implementation and then garbage-collected by the MCD's deleteStaleData.
This change ensures we check .Systemd.Units.Dropins before attempting to remove the file. This is crucial for OKD 4.5 -> 4.6 upgrades: the kubelet MCO config was defined as a file in 4.5 and converted into a dropin in 4.6.