Skip to content

Add "install hotplug" module (SC-476)#1069

Merged
blackboxsw merged 6 commits into
canonical:mainfrom
TheRealFalcon:hotplug-udev
Oct 27, 2021
Merged

Add "install hotplug" module (SC-476)#1069
blackboxsw merged 6 commits into
canonical:mainfrom
TheRealFalcon:hotplug-udev

Conversation

@TheRealFalcon
Copy link
Copy Markdown
Contributor

Proposed Commit Message

Add "install hotplug" module

This commit removes automatically installing udev rules for hotplug
and adds a module to install them instead.

Automatically including the udev rules and checking if hotplug was
enabled consumed too many resources in certain circumstances. Moving the
rules to a module ensures we don't spend extra extra cycles on hotplug
if hotplug functionality isn't desired.

LP: #1946003

Additional Context

https://bugs.launchpad.net/cloud-init/+bug/1946003

Test Steps

Note that when running integration tests, an extra step has to be added between installing the new deb and creating a new machine image:

self.execute('rm /etc/udev/rules.d/10-cloud-init-hook-hotplug.rules')
self.execute('udevadm control --reload-rules')

Checklist:

  • My code follows the process laid out in the documentation
  • I have updated or added any unit tests accordingly
  • I have updated or added any documentation accordingly

Comment thread cloudinit/stages.py
Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad your adding a module for this, it offers centralized docs for opting in to this feature.
Some doc and schema requests

Comment thread cloudinit/config/cc_install_hotplug.py Outdated
Comment on lines +19 to +21
updates:
network:
when: ['hotplug']
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's define json schema validation for this. As mentioned in standup, we can grow that schema validation to storage or other updates in the future by having individual config modules validate updates: network vs updates: storage or updates: user-data if we go down that path in the future. A module doesn't have to have exclusive rights to perform full schema validation of everything under "updates"

Comment thread cloudinit/config/cc_install_hotplug.py Outdated
--------------
**Summary:** Install hotplug if supported and enabled

This module will install the udev rules to enable hotplug if supported by
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we note the current datasources which support hotplug somewhere. Also it might be worth adding a config doc update to doc/rtd/topics/datasources/ec2.rst and openstack.rst configuration sections.

Comment thread cloudinit/stages.py
Comment thread config/cloud.cfg.tmpl
- scripts-user
- ssh-authkey-fingerprints
- keys-to-console
- install-hotplug
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Glad you are ordering this last. Avoid thrashing or collisions during initial network bringup

@TheRealFalcon
Copy link
Copy Markdown
Contributor Author

@blackboxsw , updated. I believe I addressed all of your comments.

Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be nice if schema.py:_get_property_doc knew about "enum" types

Instead of documenting this

Config schema:
    updates: (object)
        network: (object)
            when: (array of string)

It would be great if it could document the possible enum values somehow:

Config schema:
    updates: (object)
        network: (object)
            when: (array of enum ["boot-new-instance", "boot", "hotplug"...])

Not sure really how that would easily read. but just wanted more context in docs if we could.
Doesn't have to block this PR.

Comment thread cloudinit/config/cc_install_hotplug.py Outdated
Comment thread cloudinit/config/cc_install_hotplug.py
TheRealFalcon and others added 3 commits October 22, 2021 09:52
This commit removes automatically installing udev rules for hotplug
and adds a module to install them instead.

Automatically including the udev rules and checking if hotplug was
enabled consumed too many resources in certain circumstances. Moving the
rules to a module ensures we don't spend extra extra cycles on hotplug
if hotplug functionality isn't desired.

LP: #1946003
- supported check
- json schema
- supported datasources
- warning if enabled but not supported
- test refactor.
@blackboxsw
Copy link
Copy Markdown
Collaborator

Hrm schema validation seems to be having trouble with this structure:

cat > test1 <<EOF
#cloud-config
updates:
  network:
    when: [fail]
EOF

%   python3 -m cloudinit.cmd.main devel schema test1 --annotate
...
    errors_by_line[schemapaths[path]].append(msg)
KeyError: 'updates.network.when.0'

Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the compound conditionals in cc_install_hotplug is now disabling hotplug on openstack. Not sure if integration tests still pass here?

Comment thread cloudinit/config/cc_install_hotplug.py
Comment thread cloudinit/config/cc_install_hotplug.py Outdated
Comment on lines +106 to +108
EventType.HOTPLUG in cloud.datasource.get_supported_events(
[EventType.HOTPLUG]
) and
Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw Oct 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a bug, I'm seeing hotplug not getting enabled on openstack:
return value of get_supported_events is a dict keyed by EventScope:

{<EventScope.NETWORK: 'network'>: {<EventType.HOTPLUG: 'hotplug'>}}

So I think we need something like this:

Suggested change
EventType.HOTPLUG in cloud.datasource.get_supported_events(
[EventType.HOTPLUG]
) and
EventType.HOTPLUG in cloud.datasource.get_supported_events(
[EventType.HOTPLUG]
).get(EventScope.NETWORK, set()) and

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can validate easliy on openstack with sudo cloud-init single --name install_hotplug --frequency always

Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's a combined diff of what fixed OpenStack hotplug detection on my end for your consideration

diff --git a/cloudinit/config/cc_install_hotplug.py b/cloudinit/config/cc_install_hotplug.py
index 18b7a8b7..1a50aa31 100644
--- a/cloudinit/config/cc_install_hotplug.py
+++ b/cloudinit/config/cc_install_hotplug.py
@@ -96,28 +96,23 @@ LABEL="cloudinit_end"
 
 def handle(_name, cfg, cloud, log, _args):
     validate_cloudconfig_schema(cfg, schema)
-    network_hotplug_enabled = (
-        'updates' in cfg and
-        'network' in cfg['updates'] and
-        'when' in cfg['updates']['network'] and
-        'hotplug' in cfg['updates']['network']['when']
+    hotplug_when = cfg.get('updates', {}).get('network', {}).get('when', [])
+    userdata_hotplug_enabled = 'hotplug' in hotplug_when
+    ds_hotplug = EventType.HOTPLUG in cloud.datasource.get_supported_events(
+        [EventType.HOTPLUG]
+    ).get(EventScope.NETWORK, set())
+    stage_hotplug_enabled = stages.update_event_enabled(
+        datasource=cloud.datasource,
+        cfg=cfg,
+        event_source_type=EventType.HOTPLUG,
+        scope=EventScope.NETWORK,
     )
-    if not (
-        EventType.HOTPLUG in cloud.datasource.get_supported_events(
-            [EventType.HOTPLUG]
-        ) and
-        stages.update_event_enabled(
-            datasource=cloud.datasource,
-            cfg=cfg,
-            event_source_type=EventType.HOTPLUG,
-            scope=EventScope.NETWORK,
-        )
-    ):
+    if not (ds_hotplug and stage_hotplug_enabled):
         if os.path.exists(HOTPLUG_UDEV_PATH):
             log.debug("Uninstalling hotplug, not enabled")
             util.del_file(HOTPLUG_UDEV_PATH)
             subp.subp(["udevadm", "control", "--reload-rules"])
-        elif network_hotplug_enabled:
+        elif userdata_hotplug_enabled:
             log.warning(
                 "Hotplug is unsupported by current datasource. "
                 "Udev rules will NOT be installed."

@TheRealFalcon
Copy link
Copy Markdown
Contributor Author

Gah...sorry for that and thank you. That's what I get for trusting the unit tests that I made using wrong mocks 🤦‍♂️ . I addressed the main problem but haven't looked at the schema validation yet. I responded to one of your suggestions inline.

@TheRealFalcon
Copy link
Copy Markdown
Contributor Author

Not sure how the code change would have affected this, but I'm not seeing the same schema issue you are:

root@me:~# python3 -m cloudinit.cmd.main devel schema --config-file fail --annotate
#cloud-config
updates:
  network:
    when: [fail]		# E1

# Errors: -------------
# E1: 'fail' is not one of ['boot-new-instance', 'boot-legacy', 'boot', 'hotplug']

tried on both bionic and impish.

@blackboxsw
Copy link
Copy Markdown
Collaborator

Not sure how the code change would have affected this, but I'm not seeing the same schema issue you are:

root@me:~# python3 -m cloudinit.cmd.main devel schema --config-file fail --annotate
#cloud-config
updates:
  network:
    when: [fail]		# E1

# Errors: -------------
# E1: 'fail' is not one of ['boot-new-instance', 'boot-legacy', 'boot', 'hotplug']

tried on both bionic and impish.

OOPS: this was a local config issue. I was running on a system with cloud-init deb installed and I attempted to run locally python3 -m in cwd. This ended up falling back to system installed cloudinit modules from the updated code-local CLI entry point and getting a mismatch of expected return values from schema functions. Sorted when I ran in a clean env.

Comment thread cloudinit/config/cc_install_hotplug.py Outdated
from cloudinit.settings import PER_ALWAYS


frequency = PER_ALWAYS
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TheRealFalcon Do we really want to write out the udev rules PER_ALWAYS? I would think maybe PER_INSTANCE as we don't want to re-enable this feature every boot if an admin has turned it off again by removing the udev rules file. What do you think.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I thought there were pros and cons to both and I went back and forth in my head. PER_INSTANCE makes sense.

Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch looks really good and goes a long way toward cohesive docs and encapsulation of the hotplug offering. Thanks!

One functional item I think we need to sort is that we still fully come into Datasource._get_data on OpenStack and run EphemeralDCHP (copying out a /var/tmp/dhclient even though network is already "up". I think we might have to improve datasource detection for network up scenarios as part of hotplug support in general. No need to do extra work. Maybe we just use EphemeralDHCP(connectivity_url=...) in OpenStack's case here. This feels like a datasource shortfall in light of hotplug support, not a problem introduced by this branch.

@TheRealFalcon
Copy link
Copy Markdown
Contributor Author

Added a jira issue to the backlog for the ephemeral DHCP.

@TheRealFalcon
Copy link
Copy Markdown
Contributor Author

@blackboxsw , I think I've addressed your comments. Anything left from your end?

Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this great work James. +1 on this iteration and we can capture other backlog items separately

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants