Bug 2070805: pkg/cvo/updatepayload: Shift previous-download removal into the job#765
Conversation
|
@wking: This pull request references Bugzilla bug 2070805, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
Requesting review from QA contact: DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
4be3a8f to
624f7fd
Compare
With https://bugzilla.redhat.com/show_bug.cgi?id=2072389 still open a typical |
d70b3c9 to
dee4b9b
Compare
I've pushed d70b3c9 -> dee4b9b to pivot to populating |
| toReleasePath := filepath.Join(tmpDir, payload.ReleaseManifestDir) | ||
| releaseCmd := fmt.Sprintf("mv %s %s", fromReleasePath, toReleasePath) | ||
|
|
||
| moveInPlaceCmd := fmt.Sprintf("mv %s %s", tmpDirCmd, tdir) |
There was a problem hiding this comment.
I think you may have to first create tdir to make the rename atomic, https://man7.org/linux/man-pages/man2/rename.2.html
There was a problem hiding this comment.
From POSIX:
This
rename()function is equivalent for regular files to that defined by the ISO C standard. Its inclusion here expands that definition to include actions on directories and specifies behavior when the new parameter names a file that already exists. That specification requires that the action of the function be atomic.
The "If newpath already exists ..." section of the Linux man page is also in the "... and also when the target does exist..." category. Although it would be nice if someone said "this is atomic when the target does not exist" more explicitly on either side.
There was a problem hiding this comment.
This article talks through the atomic rename approach, and from that, it looks like we might need a pre-rename fsync to be very secure about the transaction. I'm not sure that covering the kernel-crash corner case is worth the overhead of calling fsync.
295db8e to
4be737e
Compare
I'd attempted to add this cleanup on the CVO side in 507b474 (pkg/cvo/updatepayload: Prune previous payload downloads, 2022-04-03, openshift#760). But Evgeni pointed out that that doesn't work because the CVO's volume mount is 'readOnly: true': 2022-04-11T20:42:32.056428784Z W0411 20:42:32.056387 1 updatepayload.go:149] failed to prune update payload directory: unlinkat /etc/cvo/updatepayloads/brRTeZnZSIkZ2J2YMdQozg: read-only file system This commit shifts removal into the job. To avoid issues with loading half-populated directories, I've also adjusted the job to populate {baseDir}/{targetName}-{randomSuffix}, and then rename that populated result to {baseDir}/{targetName}. That way, future targetUpdatePayloadDir calls using the ValidateDirectory precheck will successfully distinguish "previous job got everything for this release" (where we don't want a new download job) from "previous job got some bits over, but not everything" (where we do want a new download job). POSIX 2018 requires mv to use rename [1]. While rename discusses atomicity in the informative Rationale section, there's nothing in the normative description that mentions atomicity [2]. Still, "atomic renames" are a common dance on Linux, especially when renaming within a directory, where it is a single directory table being edited. [3] talks through the pattern, and points out the need for a pre-rename fsync or similar if you want to avoid: 1. Open a/file, write to it, and close the file descriptor. 2. Kernel holding file contents in memory, not yet on disk. 3. Atomically move a/ to b/. 4. Crash, losing any kernel-memory caches. 5. b/file exists, but it may be empty or partial. I think the risk of kernel-crashes leaving us with currupted manifests is small enough that we can ignore it, to save the cost of calling fsync. [1]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/mv.html [2]: https://pubs.opengroup.org/onlinepubs/9699919799/functions/rename.html [3]: https://lwn.net/Articles/789600/
d8d8e2c to
452f872
Compare
All of the input strings were internally controlled, so there wasn't a risk of them containing a shell-sensitive character. But that safety was not immediately obvious when reading the code. By converting to initContainers, which run sequentially to completion before the next container is run [1], we have the same effect without involving the shell. As a side benefit, we also get clearer status logging showing exactly which steps have succeeded or failed. [1]: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/#understanding-init-containers
452f872 to
c45a981
Compare
I've added c45a981 with a pivot to |
|
/test e2e-agnostic-upgrade |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jottofar, wking The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/override ci/prow/e2e-agnostic-upgrade |
|
@wking: Overrode contexts on behalf of wking: ci/prow/e2e-agnostic-upgrade DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
@wking: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
@wking: All pull requests linked via external trackers have merged: Bugzilla bug 2070805 has been moved to the MODIFIED state. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Some backround on recent changes: * 507b474 (pkg/cvo/updatepayload: Prune previous payload downloads, 2022-04-03, openshift#760) attempted to add CVO-side directory removal, but that failed because the CVO mounts the shared volume 'readOnly: true'. * a5af89d (pkg/cvo/updatepayload: Shift previous-download removal into the job, 2022-04-18, openshift#765) shifted removal into the job itself. As far as I can tell, this worked. * c45a981 (pkg/cvo/updatepayload: Use initContainers instead of shell &&-chains, 2022-04-20, openshift#765) addressed concerns with unquoted shell arguments by pivoting to initContainers and dropping the shell. This broke the * pathname expansion that rm depends on to find directories to remove. This commit returns to using the shell to invoke the rm call, so we get pathname expansion back [1]. But I avoid the possibility of unquoted argument injection by using workingDir to bring in baseDir. [1]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_06
Some background on recent changes: * 507b474 (pkg/cvo/updatepayload: Prune previous payload downloads, 2022-04-03, openshift#760) attempted to add CVO-side directory removal, but that failed because the CVO mounts the shared volume 'readOnly: true'. * a5af89d (pkg/cvo/updatepayload: Shift previous-download removal into the job, 2022-04-18, openshift#765) shifted removal into the job itself. As far as I can tell, this worked. * c45a981 (pkg/cvo/updatepayload: Use initContainers instead of shell &&-chains, 2022-04-20, openshift#765) addressed concerns with unquoted shell arguments by pivoting to initContainers and dropping the shell. This broke the * pathname expansion that rm depends on to find directories to remove. This commit returns to using the shell to invoke the rm call, so we get pathname expansion back [1]. But I avoid the possibility of unquoted argument injection by using workingDir to bring in baseDir. [1]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_06
Some background on recent changes: * 507b474 (pkg/cvo/updatepayload: Prune previous payload downloads, 2022-04-03, openshift#760) attempted to add CVO-side directory removal, but that failed because the CVO mounts the shared volume 'readOnly: true'. * a5af89d (pkg/cvo/updatepayload: Shift previous-download removal into the job, 2022-04-18, openshift#765) shifted removal into the job itself. As far as I can tell, this worked. * c45a981 (pkg/cvo/updatepayload: Use initContainers instead of shell &&-chains, 2022-04-20, openshift#765) addressed concerns with unquoted shell arguments by pivoting to initContainers and dropping the shell. This broke the * pathname expansion that rm depends on to find directories to remove. This commit returns to using the shell to invoke the rm call, so we get pathname expansion back [1]. But I avoid the possibility of unquoted argument injection by using workingDir to bring in baseDir. [1]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_06
Some background on recent changes: * 507b474 (pkg/cvo/updatepayload: Prune previous payload downloads, 2022-04-03, openshift#760) attempted to add CVO-side directory removal, but that failed because the CVO mounts the shared volume 'readOnly: true'. * a5af89d (pkg/cvo/updatepayload: Shift previous-download removal into the job, 2022-04-18, openshift#765) shifted removal into the job itself. As far as I can tell, this worked. * c45a981 (pkg/cvo/updatepayload: Use initContainers instead of shell &&-chains, 2022-04-20, openshift#765) addressed concerns with unquoted shell arguments by pivoting to initContainers and dropping the shell. This broke the * pathname expansion that rm depends on to find directories to remove. This commit returns to using the shell to invoke the rm call, so we get pathname expansion back [1]. But I avoid the possibility of unquoted argument injection by using workingDir to bring in baseDir. [1]: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_06
I'd attempted to add this cleanup on the CVO side in 507b474 (#760), but @evakhoni pointed out that that doesn't work because the CVO's volume mount is
readOnly: true:This commit shifts removal into the job.
And because we want that removal to also cover "a previous job run half-populated the target release directory", I've removed the pre-fetch
ValidateDirectorycheck. In cases where we already had valid content locally, that's one extra download, which seems like a lower price than assuming a previous partial download was actually complete. If we see many extra downloads, we use atomic renames or something so it's easy for an incoming CVO to distinguish between content that was completely populated by a previous job run from content that was incompletely populated.