Improved robustness of recycler script #3657
Conversation
|
@sdodson on the bash |
images/origin/scripts/recycler.sh
Outdated
According to the man page the return code for rm should be non zero if there were failures, why not just rely on that?
I just changed the script to rely on rm's exit code, as you suggest. I had to remove "set -o errexit", which I thought was the standard bash habit. The script is a little clearer now, though.
We don't want to remove set errexit, but you'll want to put the rm inside a test block and handle it being empty.
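A minimal sketch of what keeping errexit while testing rm directly might look like. A command used as the condition of an `if` does not trigger errexit, so the failure branch still gets a chance to run. The `scrub` function name and the messages are assumptions for illustration, not taken from the PR.

```shell
#!/bin/bash
set -o errexit

# Hypothetical helper; the name and messages are illustrative only.
scrub() {
  local dir="$1"
  # Handle the empty case first so rm never sees a bare "/*".
  if [ -z "${dir}" ]; then
    echo "Usage: recycler.sh some/path/to/scrub"
    return 1
  fi
  # errexit is ignored for a command tested by `if`, so a failing rm
  # reaches the failure message instead of aborting the script silently.
  if rm -rf -- "${dir}"/*; then
    echo "Scrub OK"
    return 0
  fi
  echo "Scrub failed"
  return 1
}
```

The function returns a status instead of calling `exit`, which makes it easier to try from an interactive shell; a real script would likely exit with that status.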
On Jul 13, 2015, at 2:43 PM, Mark Turansky notifications@github.com wrote:
In images/origin/scripts/recycler.sh
#3657 (comment):
+# first and only arg is the directory to scrub
+dir=${1:-}
+
+if [ -z $dir ]; then
+  echo "Usage: recycler.sh some/path/to/scrub"
+  exit 1
+fi
+echo "Scrubbing $dir"
+
+# capture the output, which can contain permission denied errors
+rm_output=$(rm -rf $dir/* 2>&1 || true)
+
+# silence is golden
+if [ -z "$rm_output" ]; then
Isn't that how I originally had it? Use errexit and capture the output and test its output for emptiness?
c8b74ed to b252b03
|
@smarterclayton 2 of 3 improvements implemented. Thanks for the feedback. It was helpful. The one about rm was commented on above. |
|
@smarterclayton is there more to do with this bash script? We can tweak it more, but this one is already better than the one that exists today. |
images/origin/scripts/recycler.sh
Outdated
Missing quotes and curly braces
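To illustrate the quoting point, here is a small sketch that is not part of the PR. `count_args` is a hypothetical helper that reports how many arguments a command would receive, and the path with a space in it is made up.

```shell
#!/bin/bash
# Hypothetical helper: print how many arguments were passed.
count_args() { echo "$#"; }

# Made-up path containing a space.
dir="/tmp/some dir"

# Unquoted: the value is split on whitespace into "/tmp/some" and "dir".
unquoted=$(count_args $dir)

# Quoted, with braces marking where the variable name ends: one argument.
quoted=$(count_args "${dir}")

echo "unquoted: ${unquoted} argument(s)"
echo "quoted: ${quoted} argument(s)"
```

With the default IFS the unquoted form yields two arguments and the quoted form one, which is why an unquoted `$dir` in `[ -z $dir ]` or `rm -rf $dir/*` can misbehave on paths with spaces.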
I commented inline about this one, but the comment block above got collapsed.
rm doesn't support wildcard expansion, so $dir/* returned true (but removed nothing) and the script would wrongly pass.
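One way to read this: the shell, not rm, expands `*`, and a pattern quoted as a whole is handed to rm as a literal name. With `-f`, rm does not treat a missing file as an error, so it exits 0 and removes nothing. The sketch below demonstrates that behavior under the assumption that the earlier revision quoted the entire pattern; the scratch directory and file name are made up.

```shell
#!/bin/bash
# Scratch directory with one file; names are illustrative only.
demo_dir=$(mktemp -d)
touch "${demo_dir}/data.txt"

# Quoted glob: rm receives the literal name ".../*", which does not exist.
# -f suppresses the missing-file error, so the exit status is still 0.
rm -rf "${demo_dir}/*"
quoted_status=$?
remaining_after_quoted=$(ls -A "${demo_dir}")

# Glob left outside the quotes: the shell expands * before rm runs.
rm -rf "${demo_dir}"/*
unquoted_status=$?
remaining_after_unquoted=$(ls -A "${demo_dir}")

echo "quoted:   status=${quoted_status} remaining='${remaining_after_quoted}'"
echo "unquoted: status=${unquoted_status} remaining='${remaining_after_unquoted}'"

rmdir "${demo_dir}"
```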
b252b03 to aa30eac
|
@markturansky Bump |
380cdde to 46e3ce3
|
This is awaiting review. I just squashed to make it easier. I added deletion of dotfiles to the scrub script and Recycle as the policy on the WordPress example PVs. Just ran through the whole thing in newly rebased Origin as a test and it's good to go. |
|
This requires attention and should receive priority for a release. Ignoring dotfiles in the recycler means many volumes might not successfully be recycled. This change fixes that and works when I run the WordPress example with "--latest-images". |
|
@bparees This PR might be relevant to your group. It is an Origin image. Someone needs to be assigned to this PR and get it merged. |
|
@mnagy @rhcarvalho ptal, we might want to do this in our persistent db templates? @markturansky should this be the default behavior for PVCs instead of having to specify it? |
|
@bparees The built-in Kube default will be different. I override the image already and have VolumeConfig patched into OpenShift. This PR just improves that image. |
images/origin/scripts/recycler.sh
Outdated
How about
shopt -s dotglob nullglob
rm -rf ${dir}/*?
For the current code we have there, if there's no dot file in $dir, here's what we get:
$ mkdir b
$ rm -rf b/.*
rm: refusing to remove ‘.’ or ‘..’ directory: skipping ‘b/.’
rm: refusing to remove ‘.’ or ‘..’ directory: skipping ‘b/..’
$ echo $?
1
|
@markturansky please consider:
set -e # same as set -o errexit
shopt -s dotglob nullglob
if [[ rm -rf ${dir}/* ]]; then
echo 'Scrub OK'
exit 0
fi
echo 'Scrub failed'
exit 1 |
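As posted, `[[ rm -rf ${dir}/* ]]` is parsed as a conditional expression rather than run as a command, so bash rejects it. A hedged sketch of what the suggestion appears to intend, with the command tested directly by `if`, follows. `dotglob` makes `*` match names that start with a dot (never `.` or `..`) and `nullglob` makes an unmatched pattern expand to nothing. The `scrub` wrapper is hypothetical, added only so the snippet can be tried without exiting the shell.

```shell
#!/bin/bash
set -e # same as set -o errexit

# Hypothetical wrapper around the suggested snippet.
scrub() {
  local dir="$1"
  # dotglob: * also matches dotfiles, but never . or ..
  # nullglob: a pattern with no matches expands to nothing.
  # Both options stay enabled in the current shell after this call.
  shopt -s dotglob nullglob
  # ${dir:?} stops with an error if dir is empty, instead of expanding to /*.
  if rm -rf "${dir:?}"/*; then
    echo 'Scrub OK'
    return 0
  fi
  echo 'Scrub failed'
  return 1
}
```

With `nullglob` set, an empty directory leaves rm with no operands, which `rm -f` accepts without error, so an already clean volume still reports success.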
|
@rhcarvalho if you've released the claim i'm not sure what expectation you should have to get it back later. so yes, i'm suggesting delete by default. |
|
Does deleting a PVC mean releasing the claim? |
|
The volume is released from its claim when the claim is deleted. That is the trigger to recycle the volume. |
|
@bparees I disagree with deleting data by default. My workflow when I'm working on the DB images is to create a PV, then instantiate a template (including Service, DC, PVC), do some work, write data, etc. Later I might put everything down with
Cluster admins could decide at some point that stale data on unclaimed PVs gets recycled, but that should happen after a grace period... Recycle might work fine for small demos where we don't care about the data. It will drive people mad when they discover all of their data is gone unintentionally. |
|
@rhcarvalho if that's your use case, you should not delete the claim. If you delete your claim, you cannot easily get it back. An admin could manually manipulate the data and bind the same PV to your new PVClaim, but I wouldn't think this is likely. |
|
@markturansky our templates include a PVC, it would be very clumsy to start over from the same template for a second, third time, without deleting the old PVC... |
|
I understand. I think @bparees is correct, then. Might as well recycle those volumes. You're not going to get them back (easily) if the claim is deleted. The resource might as well be freed for another user. |
|
@rhcarvalho you're basically taking advantage of a loophole. in fact personally if i deleted a claim and then recreated it and got old data back, i'd be pretty surprised. i think you need to consider a different workflow if you want your volume data to stick around. after all, if you had 3 volumes, what guarantee do you have that you'd get back the one you wanted w/ your next claim? |
|
@rhcarvalho thanks for the feedback on the script. I'll have it implemented and tested for you by Monday. |
|
Hmmm... I'm not trying to reuse the PV, but the data... as I mentioned, I'm running
I don't necessarily want to have the data back when I create a new PV and PVC and the rest of the template. It's not about doing this rematching, but about having the data there. For a production database I'd rather spare an extra PV if I needed rather than recycling... Sorry for the noise in this PR. We should probably be discussing this somewhere else. @bparees we can talk about it over email or IRC next week. |
|
@markturansky @rhcarvalho @bparees - just following up on this PR as it has been pending for the storage team for a while. Is there something left to be done here, or can this get merged? |
|
I owe this PR the suggested changes. I will follow up shortly. Sorry for the delay. |
46e3ce3 to e3abb51
|
@rhcarvalho I added the bash script exactly as you posted it above. It looks simple enough, and simple is good. |
images/origin/scripts/recycler.sh
Outdated
d'oh! cut and paste fail.
|
My example above was not supposed to replace the whole file :) |
e3abb51 to 92d738b
|
@rhcarvalho I added back what seems to be correct but still get the same error as posted above. I am, literally, a bash newb and have no idea how to script this. |
|
@markturansky I can give you a hand. Give me some minutes :-) |
|
@markturansky I've opened a PR against your repo - markturansky#1 |
92d738b to 5fcff7d
|
@rhcarvalho thank you so much for your help. Your script is perfect. I tested through it locally and it's good to go. @bparees @smarterclayton final review and merge, please? |
this is going to miss . (dot) directories. eg .config
also why not just
rm -rfv ${dir}/.* ${dir}/*
?
i guess that returns an error code because it fails to delete "." and "..", so maybe you need to incorporate it into the find statement.
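One way the dotfile concern might be folded into a find invocation, as suggested here: `-mindepth 1` skips the starting directory itself, so hidden entries such as `.config` are removed without ever touching `.` or `..`. This is a sketch, not the script that was merged. `-mindepth` and `-delete` are extensions found in GNU and BSD find rather than POSIX, and `scrub_with_find` is a hypothetical name.

```shell
#!/bin/bash
# Hypothetical helper: remove everything under a directory, dotfiles
# included, while leaving the directory itself in place.
scrub_with_find() {
  local dir="$1"
  if [ -z "${dir}" ] || [ ! -d "${dir}" ]; then
    echo "Usage: scrub_with_find some/path/to/scrub"
    return 1
  fi
  # -mindepth 1 excludes ${dir} itself; -delete removes contents before
  # their parent directories and makes find exit nonzero if a removal fails.
  if find "${dir}" -mindepth 1 -delete; then
    echo "Scrub OK"
    return 0
  fi
  echo "Scrub failed"
  return 1
}
```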
|
[merge] |
|
continuous-integration/openshift-jenkins/merge SUCCESS (https://ci.openshift.redhat.com/jenkins/job/merge_pull_requests_origin/3407/) (Image: devenv-fedora_2411) |
|
Evaluated for origin merge up to 5fcff7d |
Merged by openshift-bot
Resolves #3410
The recycler script needs to guarantee failure if the volume could not be scrubbed for any reason. This revision is better at doing so, but I am a Bash newb and could use a review by someone with good scripting skills.
@smarterclayton