first commit for first cloud-init PR#745
Conversation
raharper
left a comment
There was a problem hiding this comment.
Thanks for putting up this PR! This is a useful addition to cloud-init script handling, thank you!
I've included some feedback in the code. I would also like to see some unittests for these. We have some existing tests you can add to, I think tests/unittests/test_builtin_handlers.py could add a TestScriptPerHandler class testing the common scripts-per-frequency common implementation.
| def handle_part(data, ctype, script_path, payload): | ||
| if script_path is not None: | ||
| LOG.info("in shell_script-per-boot.handle_part() ...") | ||
| LOG.debug("script_path=%s", script_path) | ||
| (folder, filename) = os.path.split(script_path) | ||
| LOG.debug("folder=%s filename=%s", folder, filename) | ||
| path = f"/var/lib/cloud/scripts/per-boot/{filename}" | ||
| LOG.debug("path=%s", path) | ||
| util.write_file(path, payload, 0o700) |
There was a problem hiding this comment.
handle_part appears to be identical in each of these script_per_XXX files. Can we move to a common implementation which takes the variable element?
def script_per_handler(frequency, data, ctype, script_path, payload):
if script_path is not None:
And each of the files can import this and then call it like:
def handle_part(data, ctype, script_path, payload):
return script_per_handler("per-boot", data, ctype, script_path, payload)
There was a problem hiding this comment.
@raharper Happy to add tests, but I didn't see any shell script handler tests in the file you mentioned, or elsewhere, so I wasn't quite sure where to put tests. I'm also not quite sure how to unit test a module that simply saves a file to a folder, but I am open to ideas.
| LOG.info("in shell_script-per-boot.handle_part() ...") | ||
| LOG.debug("script_path=%s", script_path) |
There was a problem hiding this comment.
Let's drop the verbose logging in these handlers; util.write_file() already logs which file it writes.
| LOG.debug("script_path=%s", script_path) | ||
| (folder, filename) = os.path.split(script_path) | ||
| LOG.debug("folder=%s filename=%s", folder, filename) | ||
| path = f"/var/lib/cloud/scripts/per-boot/{filename}" |
There was a problem hiding this comment.
You need to use cloudinit/helpers.py.Paths class to get the correct path cross distro/platforms etc. There's a function in cloudinit/cmd/devel/__init__.py which will get you a Paths object:
from cloudinit.cmd.devel import read_cfg_paths
ci_paths = read_cfg_paths()
cloud_scripts_dir = ci_paths.get_cpath('scripts') # defaults to /var/lib/cloud/ + scripts
target = os.path.join(cloud_scripts_dir, frequency, os.path.basename(script_path))
There was a problem hiding this comment.
Thanks. I've made this change. Interestingly, the shell_script.py handler actually doesn't do what you described. It does some hocus pocus in the constructor to extract that scripts path, and assign it to a variable, which it then uses later in handle_part. I am not sure which is better -- yours is certainly simpler -- and maybe it'd be better to unify shell_script.py and my new handlers.
There was a problem hiding this comment.
Correction: I could not get this change to build, because of a circular reference issue: cloudinit.cmd.devel:init.py imports stages.py, which imports all the handlers including mine, so my handlers can't import cloud.cmd.devel to get that utility method. I ended up copy/pasting the shell_script.py approach.
The method you pointed out seems like a useful method though ... would it make sense to put that functionality up somewhere higher, like in cloudinit.util?
smoser
left a comment
There was a problem hiding this comment.
Hi.
If I understand correctly, you're adding handlers for 3 new mime types.
- text/x-shellscript-per-once
- text/x-shellscript-per-always
- text/x-shellscript-per-instance
You should be adding these to a new file in cloudinit/handlers/ and then
hooking them into cloudinit stages via cloudinit/stages.py.
|
Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close. If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging mitechie, and he will ensure that someone takes a look soon. (If the pull request is closed, please do feel free to reopen it if you wish to continue working on it.) |
|
Revisiting this ... I've just been reading some cloud-init code. Should I be trying to reuse some code from cc_write_files.py, like cc_write_files#write_files() to do what I want? Even if that requires lifting the code out of cc_write_files.py and into somewhere common, would that be the best way to go? |
|
Also, in the comment for the bot-closure of this issue, it says "(If the pull request is closed, please do feel free to reopen it if you wish to continue working on it.)" I'm happy to reopen but I'm not seeing where. Thanks! |
Hmm, I guess that must only be available to folks with commit rights; we'll fix that message. (I'll reopen it for you!) |
I've only scanned through the PR, but I don't see where this would be needed: |
|
Looks like the bot got cranky and re-closed since the tag was there. I've removed the stale-pr tag and reopened. |
|
One more commit, to merge in changes from master (no conflicts) |
There was a problem hiding this comment.
Hey @beantaxi , thanks for the updates. I have a few more (much more minor) comments inline.
One major issue though, is that I would have expected we no longer need the original three shell_script_per_boot.py, shell_script_per_instance.py, and shell_script_per_once.py files, but are you still using them somehow? How do you generate your mime archive now?
I can nuke those three files entirely and create the mime with something like:
cloud-init devel make-mime -a /tmp/script1.sh:x-shellscript-per-boot -a /tmp/script2.sh:x-shellscript-per-instance -a /tmp/script3.sh:x-shellscript-per-once > /tmp/user-dataIf there's no longer a use for these files, then lets get rid of them.
Also, you have a few functions with comments above them describing what they do. It'd be better to turn those comments into docstrings for the functions.
|
|
||
| def __init__(self, paths, **_kwargs): | ||
| # pylint: disable=too-many-function-args | ||
| ShellScriptByFreqPartHandler.__init__(self, paths, PER_ALWAYS, | ||
| **_kwargs) |
There was a problem hiding this comment.
| def __init__(self, paths, **_kwargs): | |
| # pylint: disable=too-many-function-args | |
| ShellScriptByFreqPartHandler.__init__(self, paths, PER_ALWAYS, | |
| **_kwargs) | |
| frequency = PER_ALWAYS |
|
|
||
| def __init__(self, paths, **_kwargs): | ||
| # pylint: disable=too-many-function-args | ||
| ShellScriptByFreqPartHandler.__init__(self, paths, PER_INSTANCE, | ||
| **_kwargs) |
There was a problem hiding this comment.
| def __init__(self, paths, **_kwargs): | |
| # pylint: disable=too-many-function-args | |
| ShellScriptByFreqPartHandler.__init__(self, paths, PER_INSTANCE, | |
| **_kwargs) | |
| frequency = PER_INSTANCE |
|
|
||
| def __init__(self, paths, **_kwargs): | ||
| # pylint: disable=too-many-function-args | ||
| ShellScriptByFreqPartHandler.__init__(self, paths, PER_ONCE, | ||
| **_kwargs) |
There was a problem hiding this comment.
| def __init__(self, paths, **_kwargs): | |
| # pylint: disable=too-many-function-args | |
| ShellScriptByFreqPartHandler.__init__(self, paths, PER_ONCE, | |
| **_kwargs) | |
| frequency = PER_ONCE |
There was a problem hiding this comment.
I can make these changes, but they seem odd. frequency is assigned by the Handler c'tor, which all these classes inherit from. Is it necessary in Python for subclasses to assign instance variables, even if they're assigned in a parent class? I wouldn't think so but I have been surprised by Python before.
There was a problem hiding this comment.
Hmmm, there was one other change that I thought I requested, but I'm not seeing now. These other changes won't make sense without it. It was to change ShellScriptByFreqPartHandler to look like this:
class ShellScriptByFreqPartHandler(Handler):
def __init__(self, paths, **_kwargs):
Handler.__init__(self, self.frequency)Looking at the Handler base class now, I can see how this would be a little confusing given that there is also a frequency instance variable there.
Since I'm proposing we define class variables named frequency in the subclasses, the call to that Handler.__init__(self, self.frequency) is passing a reference to the class variable named frequency which then sets the instance variable named frequency.
If that's confusing, I'd be ok with changing the names of the class variables (along with the corresponding Handler.__init__ call) to something like freq so it's not so easy to confuse class variable names with instance variable names.
Does that make sense?
There was a problem hiding this comment.
Ahhhh I suspected my mediocre Python skills played a role here. You were requesting class variables in the subclasses, not instance variables, although there happen to be instance variables of the same name up in Handler.
The whole frequency handling is confusing to me, since Handler takes a single frequency in its c'tor and then maintains it as in ivar, but then handle_part() takes a frequency as an argument as well.
Meanwhile had this idea: I just checked in stages where these handlers actually get created, and it looks like this:
shellscript_per_boot_handler = ShellScriptPerBootPartHandler(**opts)
shellscript_per_instance_handler = \
ShellScriptPerInstancePartHandler(**opts)
shellscript_per_once_handler = ShellScriptPerOncePartHandler(**opts)
In theory, those could all be c'tor invocations of a single class, parameterized on frequency, and then the subclasses can go away entirely. There'd still be the apparent redundancy of frequency as an arg to handle_parts() but that's ok.
If not, I'm happy to add class variables to subclasses and rename them to freq, or make them ALL CAPS ... whatever works.
There was a problem hiding this comment.
Ah, right. Your proposed solution makes sense and sounds like a good idea. Go for it!
There was a problem hiding this comment.
Thanks! Just committed. I had to add a prefixes collection to the base class, of all 3 prefixes, but that's the only part I was a bit unsure about.
|
@beantaxi , this PR looks good to me at this point. One additional thing I'd like to have before we merge it is an integration test. The docs for our integration tests are at https://cloudinit.readthedocs.io/en/latest/topics/integration_tests.html , and the tests themselves can be found at I've got the beginnings of one here: The test itself should be verifying that the scripts exist on the instance (via assert statements...not just printing stuff), and that they've all executed at least once. I don't actually think we need to verify the 'per-boot, per-instance, per-once' functionality as that is testing the behavior of those modules, not this new code. My one major hangup is that the make-mime code prints directly to the console rather than returning any sort of text anywhere, so my I unfortunately have some higher priority work happening right now, so I may not be able to do any additional work on it in the next few days. If you'd like to poke at it, feel free, but I also realize it's a lot to digest for somebody new to this code base and these tests |
|
@TheRealFalcon I'm in. A couple questions on your last comment and the paste ... "The test itself should be verifying ... that they've [the freq-specific scripts] all executed at least once" I am inferring that you'd be happy with something as simple as sample userdata scripts that each write a file, eg As far as the paste goes, it looks like there is a fair amount of module-level code that creates |
Yes, that plus verifying that the "per-boot, per-instance, per-once" scripts themselves are put in their proper places in /var/lib/cloud/scripts. What you have posted would implicitly test that behavior, but better to be explicit.
Either one could work, but module-level would be easiest here. I know it is sometimes frowned upon to do so much at the module level, but given that this is test code and tests are more independent than a large running program, I don't think it is a problem. The reason it is easier to do it this way rather than using a fixture is that the user data has to be defined before the test definition in order to use the |
|
"My one major hangup is that the make-mime code prints directly to the console rather than returning any sort of text anywhere ..." (etc) After reading the code, I took this to mean that your To that end I extracted a
Thanks! |
|
Pending the answers to my previous comment, I think I'm ready to go. I spent many hours reading about The one thing I can think of, is that it would be big help to be able to do Thanks! |
Yes, that's correct.
It'd be slightly more accurate to call it something like
I think so, but it's hard to understand completely without seeing the changes yet. If you want to push what you have so far, it'd be easier to understand exactly what you're doing.
I don't think that you're duplicating an existing function. The userdata is technically always provided by the user, so cloud-init usually isn't involved in the creating of that data. What we have here is a simple tool to create a mime message that can be provided as user data, but the vast majority of our use cases don't use this sort of thing, but rather provide a single shell script or cloud-config.
With this integration test 🙂 . There's certainly gaps in our testing in the code base, and especially for a convenience tool like this which won't break boot like a bug in cloud-init proper would. Going forward though, we're trying to ensure all changes get properly tested, hence my asking for an integration test here.
That's some good feedback. Being able to interact with an already running instance would be useful for debugging in a lot of cases. But just FYI, the conf_exists = class_client.execute(
'test -f /etc/apt/apt.conf.d/90cloud-init-pipelining'
).ok
assert conf_exists is False |
|
Thanks for the response, especially the color around how at the end of the day Thanks also for the detail on I'll get to this later today or tomorrow, & then it sounds like it might be good to go from my end. |
|
Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close. If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging mitechie, and he will ensure that someone takes a look soon. (If the pull request is closed and you would like to continue working on it, please do tag mitechie to reopen it.) |
|
Real quickly ... I got around to creating and running an integration test, and it appears that my tests are failing; my scripts didn't get copied to the proper scripts directory, and they didn't execute. I can't think of a next step to testing, other than installing a my-branch version of cloud-init on a VM and seeing what happens. I can think of 3 ways of doing so ...
All 3 have value to me, so in the absence of feedback I'll probably just start with #1 and see what happens. Thanks! |
|
Hi @chrislalos . Are you working with @beantaxi on this? Broadly, when it comes to debugging integration tests, I'll usually ensure the following are set in my This allows me to connect after a test has completed to see what happened. If that doesn't help, it's mostly print/logging statements and examining the logs. Be sure to set For this issue in particular, How about something like this for the refactored def create_mime_message(files):
sub_messages = []
errors = []
for i, (fh, filename, format_type) in enumerate(files):
contents = fh.read()
sub_message = MIMEText(contents, format_type, sys.getdefaultencoding())
sub_message.add_header('Content-Disposition',
'attachment; filename="%s"' % (filename))
content_type = sub_message.get_content_type().lower()
if content_type not in get_content_types():
msg = ("content type %r for attachment %s "
"may be incorrect!") % (content_type, i + 1)
errors.append(msg)
sub_messages.append(sub_message)
combined_message = MIMEMultipart()
for msg in sub_messages:
combined_message.attach(msg)
return (combined_message, errors)I like this because it doesn't write to stdout/stderr, so it can be called from other code without having side effects that assume we're running from the CLI. It also takes the Then it can be called further down with something like this: combined_message, errors = create_mime_message(args.files)
if errors:
level = "WARNING" if args.force else "ERROR"
for error in errors:
sys.stderr.write("{}: {}\n".format(level, error))
sys.stderr.write("Invalid content-types, override with --force\n")
if not args.force:
return 1
print(combined_message)
return 0Then using it in an integration test becomes much simpler too: FILES = [
(PER_BOOT_FILE, 'boot.sh', 'x-shellscript-per-boot'),
(PER_INSTANCE_FILE, 'instance.sh', 'x-shellscript-per-instance'),
(PER_ONCE_FILE, 'once.sh', 'x-shellscript-per-once'),
]
USER_DATA, _ = create_mime_message(FILES)(Note that I haven't thoroughly tested these code snippets, so feel free to correct something I missed) One thing I did notice when trying this is that all of the scripts are ending up in the |
|
Hello! Thank you for this proposed change to cloud-init. This pull request is now marked as stale as it has not seen any activity in 14 days. If no activity occurs within the next 7 days, this pull request will automatically close. If you are waiting for code review and you are seeing this message, apologies! Please reply, tagging mitechie, and he will ensure that someone takes a look soon. (If the pull request is closed and you would like to continue working on it, please do tag mitechie to reopen it.) |
|
May I request a reopen please? It's been a while, but seeing as this was a New Years Resolution for 2021, and I hadn't finished it, I figured I'd take a look last night / this morning for old time's sake ... and to my surprise I was able to get my integration test passing this morning! |
|
@chrislalos , this PR cannot be re-opened as we no longer have a |
Proposed Commit Message
Additional Context
Test Steps
I'm currently running a command like the following
Then, I deploy the file as used data, check logfiles, and check /var/lib/cloud/scripts/per-boot and -instance to make sure my files are there.
I'd love for the script files to be placed in cloud-init such that they do not need to be explicitly added to userdata. But I didn't know how to do that, so I put them in cloudinit/cmd/devel instead.
Checklist:
(I am happy to do these once my code looks like it's on the right track)