Cleanup Vultr support#987
Conversation
Cleanup and improve Vultr support
|
Thanks for this PR @eb3095 , would you be able to update this PR description with a Proposed Commit message that provides a lot more context about the changes you are making? Context in the single squashed commit message will enable future "us" to have context on the complexity of some of the changes to Vultr backend metadataservice |
blackboxsw
left a comment
There was a problem hiding this comment.
Just a couple of questions for you:
- Do you think this Datasource can persist your metadata_full as self.metadata or is there a side-effect or concern you have for your platform with doing that?
- are you seeing a bug on your platform with
cloud-init query instance-idnot showing your matching SUBID? and instead showing something like "datasource-iid"? If possible I'd like to see output ofsudo cloud-init query --allto confirm behavior (make sure if there is any private user-data (passwords, ssh key info) that you review that output before pasting it.
| 'ethtool-script': '', | ||
| 'raid1-script': '', | ||
| 'config': { | ||
| 'vendor-data': [ |
There was a problem hiding this comment.
This change int types from dict -> list makes me a little leery in this unittest as cloud-init handles those vendordata types differently depending on list vs dict processing in convert_vendordata. I'm not certain if this is just a unittest change for you or if Vultr platform actually provides as vendordata to the VM has changed type. Please provide sudo cloud-init query vendordata to confirm for us what structure of data is really surfaced.
There was a problem hiding this comment.
So basically before we were providing a json dict with defined endpoints for each script since we were doing little modifications here and there. With the new format we moved to just providing a json array of strings each being a specific script or config.
So how we provide it would be an array of strings, here I left it a dic to be be easily modifiable if things changed or needs to be adjusted and convert it to the expected format bellow. If you would prefer I just leave it in the format that is fed to cloud-init I can go ahead and make those changes.
There was a problem hiding this comment.
Thanks for the clarification here. I think this is ok, but I really need to see the output of both sudo cloud-init query vendordata and possibly sudo cloud-init query merged_cfg. Normally vendordata in datasources is structured such that it merges on top of (and overrides) config defaults that are originally present in /etc/cloud/cloud.cfg and /etc/cloud/cloud.cfg.d/*.cfg. Your format seems to diverge from this typical behavior for Datasources. Your first iteration of DatasourceVultr also seems to diverge from this format anyway, so I don't think you are regressing anything. But, I think we might have to consider how your are using vendordata because this seems a bit different than how almost all other platforms treat vendordata. The docs remind us that vendordata can be disabled by user-data and that vendordata shouldn't be "required" to make a system operational. I don't know if that necessarily applies to your case, but it's a tension we should be aware of here.
There was a problem hiding this comment.
root@t2:~# cloud-init query vendordata
[
"#cloud-config\n{\"package_upgrade\":\"true\",\"disable_root\":0,\"ssh_pwauth\":1,\"manage_etc_hosts\":\"true\",\"chpasswd\":{\"expire\":false,\"list\":[\"root:$6$snip\"]},\"system_info\":{\"default_user\":{\"name\":\"root\"}}}",
"#!/bin/bash\n\n#\n# Functions\n#\n\nfunction check_command_exists () {\n OUT=\"0\"\n if ! [ -z \"$(which $1)\" ]; then\n OUT=\"1\"\n fi\n echo \"${OUT}\"\n}\n\nfunction print () {\n echo \"${@}\" >> /var/log/cloudinit_networking.log\n echo \"${@}\"\n}\n\nfunction get_interfaces () {\n if [ -z \"${INTERFACES}\" ]; then\n INTERFACES=($(ls -l /sys/class/net/ | grep \"/net/e\" | awk -F' ' '{print $9}'))\n fi\n}\n\n#\n# Start script\n#\n\n# Get the interface list\nget_interfaces\n\nif [ \"$(check_command_exists ethtool)\" == \"1\" ]; then\n for int in \"${INTERFACES[@]}\"\n do\n ethtool -L ${int} combined $(nproc --all)\n done\nelse\n print \"Failed to find ethtool, cannot configure multi-queue!\"\nfi\n\necho \"\" > /usr/lib/sysctl.d/90-vultr.conf\necho \"# Accept IPv6 advertisements when forwarding is enabled\" >> /usr/lib/sysctl.d/90-vultr.conf\n\nfor int in \"${INTERFACES[@]}\"\ndo\n echo \"net.ipv6.conf.${int}.accept_ra = 2\" >> /usr/lib/sysctl.d/90-vultr.conf\ndone\n\necho 'net.core.default_qdisc=fq' >> /usr/lib/sysctl.d/90-vultr.conf\necho 'net.ipv4.tcp_congestion_control=bbr' >> /usr/lib/sysctl.d/90-vultr.conf\necho \"\" >> /usr/lib/sysctl.d/90-vultr.conf"
]
root@t2:~# cloud-init query merged_cfg
{
"_doc": "Merged cloud-init system config from /etc/cloud/cloud.cfg and /etc/cloud/cloud.cfg.d/",
"_log": [
"[loggers]\nkeys=root,cloudinit\n\n[handlers]\nkeys=consoleHandler,cloudLogHandler\n\n[formatters]\nkeys=simpleFormatter,arg0Formatter\n\n[logger_root]\nlevel=DEBUG\nhandlers=consoleHandler,cloudLogHandler\n\n[logger_cloudinit]\nlevel=DEBUG\nqualname=cloudinit\nhandlers=\npropagate=1\n\n[handler_consoleHandler]\nclass=StreamHandler\nlevel=WARNING\nformatter=arg0Formatter\nargs=(sys.stderr,)\n\n[formatter_arg0Formatter]\nformat=%(asctime)s - %(filename)s[%(levelname)s]: %(message)s\n\n[formatter_simpleFormatter]\nformat=[CLOUDINIT] %(filename)s[%(levelname)s]: %(message)s\n",
"[handler_cloudLogHandler]\nclass=FileHandler\nlevel=DEBUG\nformatter=arg0Formatter\nargs=('/var/log/cloud-init.log', 'a', 'UTF-8')\n",
"[handler_cloudLogHandler]\nclass=handlers.SysLogHandler\nlevel=DEBUG\nformatter=simpleFormatter\nargs=(\"/dev/log\", handlers.SysLogHandler.LOG_USER)\n"
],
"cloud_config_modules": [
"emit_upstart",
"snap",
"ssh-import-id",
"locale",
"set-passwords",
"grub-dpkg",
"apt-pipelining",
"apt-configure",
"ntp",
"timezone",
"disable-ec2-metadata",
"runcmd",
"byobu"
],
"cloud_final_modules": [
"package-update-upgrade-install",
"fan",
"landscape",
"lxd",
"puppet",
"chef",
"mcollective",
"salt-minion",
"reset_rmc",
"refresh_rmc_and_interface",
"rightscale_userdata",
"scripts-vendor",
"scripts-per-once",
"scripts-per-boot",
"scripts-per-instance",
"scripts-user",
"ssh-authkey-fingerprints",
"keys-to-console",
"phone-home",
"final-message",
"power-state-change"
],
"cloud_init_modules": [
"migrator",
"seed_random",
"bootcmd",
"write-files",
"growpart",
"resizefs",
"disk_setup",
"mounts",
"set_hostname",
"update_hostname",
"update_etc_hosts",
"ca-certs",
"rsyslog",
"users-groups",
"ssh"
],
"datasource_list": [
"Vultr",
"None"
],
"def_log_file": "/var/log/cloud-init.log",
"disable_root": true,
"log_cfgs": [
[
"[loggers]\nkeys=root,cloudinit\n\n[handlers]\nkeys=consoleHandler,cloudLogHandler\n\n[formatters]\nkeys=simpleFormatter,arg0Formatter\n\n[logger_root]\nlevel=DEBUG\nhandlers=consoleHandler,cloudLogHandler\n\n[logger_cloudinit]\nlevel=DEBUG\nqualname=cloudinit\nhandlers=\npropagate=1\n\n[handler_consoleHandler]\nclass=StreamHandler\nlevel=WARNING\nformatter=arg0Formatter\nargs=(sys.stderr,)\n\n[formatter_arg0Formatter]\nformat=%(asctime)s - %(filename)s[%(levelname)s]: %(message)s\n\n[formatter_simpleFormatter]\nformat=[CLOUDINIT] %(filename)s[%(levelname)s]: %(message)s\n",
"[handler_cloudLogHandler]\nclass=FileHandler\nlevel=DEBUG\nformatter=arg0Formatter\nargs=('/var/log/cloud-init.log', 'a', 'UTF-8')\n"
]
],
"output": {
"all": "| tee -a /var/log/cloud-init-output.log"
},
"preserve_hostname": false,
"syslog_fix_perms": [
"syslog:adm",
"root:adm",
"root:wheel",
"root:root"
],
"users": [
"default"
],
"vendor_data": {
"enabled": true,
"prefix": []
},
"vendor_data2": {
"enabled": true,
"prefix": []
}
}
So, in terms of how we are using vendordata. We are using it pass preferred settings (updates, packages, basic settings) and settings that the user chose for deployment. Outside of the root password, nothing is actually required for the system to be operational.
We do add scripts in afterwards to handle specific tasks at boot such as configuring second drives, network optimizations that do not exist as options in the config, and misc other things. The reason we do this is because we offer a marketplace and custom ISO features that allow users to end up with an image not made by us that we have no insight into but we would like the same tweaks and customizations/features we have available for our own images to still be available in these cases with the least user action required. Cloud-init vendordata looked like a suitable way to provide the startup scripts we have traditionally used to users own images with no modification required.
I am aware that user-data overwrites this. I wasn't sure if it was merged in or if the presence of user-data overrided the existing vendordata altogether, and if this applied only to that actual cloud-init config or the entire array, including scripts. If I could get clarification on that it would greatly help! Regardless the vendordata only results in an optimal system at this current moment and the system should be functional whether or not it runs (if not inaccessible barring the user accounted for a password in user-data).
We were also looking at this as an avenue to using the widely available cloud images without having to worry about building them with our requirements and having a way to make changes for compatibility purposes to OS's that need it without needing to directly modify the image.
Let me know if you have any suggestions or comment on how we intended to use this here.
There was a problem hiding this comment.
@blackboxsw was wondering if I could get a follow up to this portion.
There was a problem hiding this comment.
So, in terms of how we are using vendordata. We are using it pass preferred settings (updates, packages, basic settings) and settings that the user chose for deployment. Outside of the root password, nothing is actually required for the system to be operational.
This is a perfectly applicable use of vendor data. Thanks for the context.
We were also looking at this as an avenue to using the widely available cloud images without having to worry about building them with our requirements and having a way to make changes for compatibility purposes to OS's that need it without needing to directly modify the image.
Thank you. This too makes sense. I just setup a Vultr account to poke around to see the images you are hosting don't yet expose cloud-init to the end-user in the compute VM launched. :/ At some point if there is access to a Vultr instance under test with access to your metadata service I'd ❤️ to poke around for a few minutes to get my bearings, but that doesn't need to happen for this PR to land.
I also didn't grok initially that you were already packing a list in as vendordata_raw in the first iteration of Vultr datasource. So I got a bit prickly thinking that list wouldn't work for the vendordata handling, but I was wrong in that regard.
I am aware that user-data overwrites this. I wasn't sure if it was merged in or if the presence of user-data overrided the existing vendordata altogether, and if this applied only to that actual cloud-init config or the entire array, including scripts.
Correct, user-data provided #cloud-config values will overwrite/override specific #cloud-config values from vendor-data or meta-data, it will not blow away the whole of the vendordata list of "parts".
But, user-data can also complete disable honoring any vendordata at all by providing the following
#cloud-config
vendor_data:
enabled: falseIf this is the case on a VM you'll see the log (and none of your vendor scripts would be run I believe.
2021-09-23 04:06:19,234 - stages.py[DEBUG]: vendordata consumption is disabled.
If not set enabled: false by a user, I'd expect to see
2021-09-23 04:13:26,355 - stages.py[DEBUG]: vendordata will be consumed. disabled_handlers=None
As long as your aren't getting WARNING logs in /var/log/cloud-init.log or /var/log/cloud-init-output.log for failures to handle the parts of your vendordata in the "init-network/consume-vendor-data" section of logs your should be in good shape here with providing both scripts and #cloud-config items in your vendordata_raw list.
There was a problem hiding this comment.
Great, I am not too worried about being able to disable it. I am aware of that and enjoy the idea the user always has an option to turn it off completely!
As for the account, I'll get you setup first thing. Just do me a favor and open a ticket from your account and request it be forwarded to the Marketplace queue.
If not required, loop will just not execute.
blackboxsw
left a comment
There was a problem hiding this comment.
I finally believe I'm +1 on this having gotten my bearings on vendordata handling of lists.
Thanks for your patience here.
|
@eb3095, I know you've been on top of it. We will can land this once git rebase upstream/main is complete. Free free to ping us in #cloud-init IRC channel on Libera.chat. |
Proposed Commit Message
Cleanup Vultr support
Checklist: