Skip to content

ds-identify: also discover LXD by DMI board_name = LXD#1311

Merged
holmanb merged 2 commits into
canonical:mainfrom
blackboxsw:lxd-vm-ds-id
Mar 29, 2022
Merged

ds-identify: also discover LXD by DMI board_name = LXD#1311
holmanb merged 2 commits into
canonical:mainfrom
blackboxsw:lxd-vm-ds-id

Conversation

@blackboxsw
Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw commented Feb 28, 2022

Proposed Commit Message

ds-identify: also discover LXD by presence from DMI board_name = LXD

VMs will not start lxd-agent.service in systemd generator timeframe
which means /dev/lxd/sock will not exist yet on LXD VM.

For VM support, ds-identify will return DS_FOUND when
/sys/class/dmi/id/board_name == "LXD" which exists at
early boot regardless of LXD socket status.

Later, in cloud-init init-local boot stage, cloud-init's will only
discover the LXDDatasource if the /dev/lxd/sock is active.
This allows consumers to disable the LXD datasource
behavior by running:
   lxc config set MACHINE_NAME security.devlxd=false

Additional Context

Test Steps

# test VM
$ CLOUD_INIT_PLATFORM=lxd_vm CLOUD_INIT_KEEP_INSTANCE=true CLOUD_INIT_CLOUD_INIT_SOURCE=/home/csmith/src/cloud-init/cloud-init_22.1-25-g67b413dc-1~bddeb_all.deb CLOUD_INIT_OS_IMAGE=focal tox -e integration-tests tests/integration_tests/datasources/test_lxd_discovery.py 
# test container
$ CLOUD_INIT_PLATFORM=lxd_container CLOUD_INIT_KEEP_INSTANCE=true 
CLOUD_INIT_CLOUD_INIT_SOURCE=/home/csmith/src/cloud-init/cloud-init_22.1-25-g67b413dc-1~bddeb_all.deb CLOUD_INIT_OS_IMAGE=focal tox -e integration-tests tests/integration_tests/datasources/test_lxd_discovery.py 

Checklist:

  • My code follows the process laid out in the documentation
  • I have updated or added any unit tests accordingly
  • I have updated or added any documentation accordingly

@blackboxsw
Copy link
Copy Markdown
Collaborator Author

blackboxsw commented Feb 28, 2022

@stgraber I'm seeing Racy behavior with virtio device files not being setup at systemd generator timeframe. Not sure how we should resolve it given that ds-identify runs so early during generator timeframe

  1. ds-identify is run during generator timeframe on a system @ boot
-rw-r--r-- 1 root root   1171 2022-02-28 19:49:00.372919794 +0000 /run/cloud-init/ds-identify.log_

  1. udev event where virtio device is seen and processed by the kernel (set /etc/udev/udev.conf udev_log=debug)
# journalctl -b 0 -o short-precise | grep vport5p1
Feb 28 19:49:01.333628 cloudinit-0228-1914032ly6ffbu systemd-udevd[147]: vport5p1: Device (SEQNUM=915, ACTION=add) is queued
Feb 28 19:49:01.368207 cloudinit-0228-1914032ly6ffbu systemd-udevd[147]: vport5p1: sd-device-monitor: Passed 247 byte to netlink monitor
Feb 28 19:49:01.368340 cloudinit-0228-1914032ly6ffbu systemd-udevd[152]: vport5p1: Processing device (SEQNUM=915, ACTION=add)
Feb 28 19:49:01.380769 cloudinit-0228-1914032ly6ffbu systemd-udevd[152]: vport5p1: /usr/lib/udev/rules.d/50-udev-default.rules:7 LINK 'virtio-ports/org.linuxcontainers.lxd'
Feb 28 19:49:01.382186 cloudinit-0228-1914032ly6ffbu systemd-udevd[152]: vport5p1: Handling device node '/dev/vport5p1', devnum=c250:1
Feb 28 19:49:01.387501 cloudinit-0228-1914032ly6ffbu systemd-udevd[152]: vport5p1: Preserve permissions of /dev/vport5p1, uid=0, gid=0, mode=0600
Feb 28 19:49:01.387878 cloudinit-0228-1914032ly6ffbu systemd-udevd[152]: vport5p1: Creating symlink '/dev/char/250:1' to '../vport5p1'
Feb 28 19:49:01.388017 cloudinit-0228-1914032ly6ffbu systemd-udevd[152]: vport5p1: Creating symlink '/dev/virtio-ports/org.linuxcontainers.lxd' to '../vport5p1'
Feb 28 19:49:01.388733 cloudinit-0228-1914032ly6ffbu systemd-udevd[152]: vport5p1: sd-device: Created db file '/run/udev/data/c250:1' for '/devices/pci0000:00/0000:00:01.0/0000:01:00.5/virtio5/virtio-ports/vport5p1'
Feb 28 19:49:01.390647 cloudinit-0228-1914032ly6ffbu systemd-udevd[152]: vport5p1: Device (SEQNUM=915, ACTION=add) processed
Feb 28 19:49:01.393185 cloudinit-0228-1914032ly6ffbu systemd-udevd[152]: vport5p1: sd-device-monitor: Passed 313 byte to netlink monitor

  1. virtio-ports link not seen until half a second later than `ds-identify due to timing of udev add event and fuse/virtiofs module loads
lrwxrwxrwx 1 root root     11 2022-02-28 19:49:01.384919794 +0000 /dev/virtio-ports/org.linuxcontainers.lxd -> ../vport5p1
crw------- 1 root root 250, 1 2022-02-28 19:49:01.384919794 +0000 /dev/vport5p

@stgraber
Copy link
Copy Markdown
Contributor

stgraber commented Mar 1, 2022

@blackboxsw is that an alternative you could use?

root@buildd01:~# cat /sys/devices/virtual/dmi/id/board_name
LXD

@blackboxsw blackboxsw changed the title ds-identify: also discover LXD by presence of lxd virtio-ports ds-identify: also discover LXD by presence of board_name = "LXD" Mar 1, 2022
@blackboxsw blackboxsw changed the title ds-identify: also discover LXD by presence of board_name = "LXD" ds-identify: also discover LXD by DMI board_name = LXD Mar 1, 2022
@blackboxsw
Copy link
Copy Markdown
Collaborator Author

@blackboxsw is that an alternative you could use?

root@buildd01:~# cat /sys/devices/virtual/dmi/id/board_name
LXD

@stgraber works great. Thanks for this pointer. I've updated the PR here with docs and integration test which asserts that both kvm and container can be detected appropriately during ds-identify without races.

@smoser
Copy link
Copy Markdown
Collaborator

smoser commented Mar 1, 2022

@blackboxsw is that an alternative you could use?

root@buildd01:~# cat /sys/devices/virtual/dmi/id/board_name
LXD

@stgraber works great. Thanks for this pointer. I've updated the PR here with docs and integration test which asserts that both kvm and container can be detected appropriately during ds-identify without races.

Does it guarantee the presence of a LXD source? Vs just guaranteeing that you’re rubbing on lxd.

@stgraber
Copy link
Copy Markdown
Contributor

stgraber commented Mar 2, 2022

It's got the same guarantee as the virtio port minus the udev race.

It doesn't guarantee that the guest will spawn a lxd-agent process but neither did the virtio-serial device.

@blackboxsw
Copy link
Copy Markdown
Collaborator Author

blackboxsw commented Mar 3, 2022

It's got the same guarantee as the virtio port minus the udev race.

It doesn't guarantee that the guest will spawn a lxd-agent process but neither did the virtio-serial device.

Ultimately the only way cloud-init can detect an operable datasource at python DS discovery time is when /dev/lxd/sock exists. Since /dev/lxd/sock isn't yet present on kvm images during ds-identify due to a race on lxd-agent-loader systemd units, and /dev/virtioports being available due to delayed udev events, we need a fallback option to determine that LXD is viable. This PR allows ds-identify to queue a DS_FOUND for LXD and rely on DataSourceLXD.get_data to confirm the the desired /dev/lxd/sockis present and active. We are positioning LXD at the end of datasource_list:[ ... ] so that python get_data discovery won't happen unless no other datasources are discovered.

In al lxd container/kvm cases, /dev/lxd/sock presence can still be revoked during early boot, or anytime thereafter with lxc config set <your_vm> security.devlxd=false which will immediately remove the sock device from the system rendering cloud-init unable to discover the LXD datasource or reuse that datasource.

Maybe since this race condition with /dev/lxd/sock exists only on lxd kvm instances, we should limit this DMI board_name check to only where DI_VIRT = "kvm" too to limit false positives

dscheck_LXD() {
    [ -S /dev/lxd/sock ] && return ${DS_FOUND}
    if [ "$DI_VIRT" = "kvm" ]; then
        get_dmi_field board_name
        [ "$_RET" = "LXD" ] && return ${DS_FOUND}
    fi
    return ${DS_NOT_FOUND}
}

@smoser
Copy link
Copy Markdown
Collaborator

smoser commented Mar 3, 2022

It's got the same guarantee as the virtio port minus the udev race.
It doesn't guarantee that the guest will spawn a lxd-agent process but neither did the virtio-serial device.

Ultimately the only way cloud-init can detect an operable datasource at python DS discovery time is when /dev/lxd/sock exists. Since /dev/lxd/sock isn't yet present on kvm images during ds-identify due to a race on lxd-agent-loader systemd units, and /dev/virtioports being available due to delayed udev events, we need a fallback option to determine that LXD is viable. This PR allows ds-identify to queue a DS_FOUND for LXD and rely on DataSourceLXD.get_data to confirm the the desired /dev/lxd/sockis present and active. We are positioning LXD at the end of datasource_list:[ ... ] so that python get_data discovery won't happen unless no other datasources are discovered.

In al lxd container/kvm cases, /dev/lxd/sock presence can still be revoked during early boot, or anytime thereafter with lxc config set <your_vm> security.devlxd=false which will immediately remove the sock device from the system rendering cloud-init unable to discover the LXD datasource or reuse that datasource.

If you are running a guest on EC2 and something routes off the metadata service during boot, then cloud-init will fail. Thats not really something to be concerned about. There are many ways thing can fail.

The goal is for ds-identify to correctly identify "I am running on a platform with LXD metadata service." If all systems that have 'LXD' in dmi information can be expect to talk over /dev/lxd/sock, then this is good enough. If there are systems where that is not the case then it seems better to have the host present clear information.

Maybe since this race condition with /dev/lxd/sock exists only on lxd kvm instances, we should limit this DMI board_name check to only where DI_VIRT = "kvm" too to limit false positives

"limit the false positives" is just "limit the bugs to some unlucky people". Its not really a great situation.

@blackboxsw
Copy link
Copy Markdown
Collaborator Author

If you are running a guest on EC2 and something routes off the metadata service during boot, then cloud-init will fail. That's not really something to be concerned about. There are many ways thing can fail.

Agreed. This failure mode you mention, "something routing off Ec2 IMDS" or launching Ec2 instances wth "metadata access: disabled" is the same type of failure mode we can see in LXD when someone configures security.devlxd=false. Meaning that the LXD platform via DMI board_name says, I am LXD instance. But the IMDS access point through /dev/lxd/sock is not enabled, so LXD.get_data will quickly return False and and cloud-init fallsback to DataSourceNone.

On Ec2 with disabled IMDS at instance lauch we see ds-identify detect the EC2 platform yet DataSourceEc2.get_data ends up warning:
DataSourceEc2.py[WARNING]: Ec2 IMDS endpoint returned a 403 error. HTTP endpoint is disabled. Aborting.

and falling back to DataSourceNone when it doesn't find any matching active DS.

The goal is for ds-identify to correctly identify "I am running on a platform with LXD metadata service." If all systems that have 'LXD' in dmi information can be expect to talk over /dev/lxd/sock, then this is good enough. If there are systems where that is not the case then it seems better to have the host present clear information.

Maybe since this race condition with /dev/lxd/sock exists only on lxd kvm instances, we should limit this DMI board_name check to only where DI_VIRT = "kvm" too to limit false positives

"limit the false positives" is just "limit the bugs to some unlucky people". Its not really a great situation.

My phrasing was probably wrong above. The "false positive" I meant was only the ds-identify returning DS_FOUND even on containers which were launched with security.devlxd=false. We expect non-kvm LXD containers will have a /dev/lxd/sock setup at systemd generator time if so configured at instance launch. If we return DS_FOUND based only on DMI board name on containers, then we force cloud-init python discovery code to spend a couple of unnecessary cycles running DataSourceLXD.get_data to ultimately reject that datasource as False because the socket file still doesn't exist which we log as "Not an LXD datasource: No LXD socket found.". There will be no "false positives" from python datasource via LXD.get_data, just the potential of a spent cycle ultimately detecting absence of /dev/lxd/sock and moving on to other priority datasource in datasource_list (which really only equates to DataSourceNone)

Copy link
Copy Markdown
Contributor

@TheRealFalcon TheRealFalcon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with a small docs nit.

@smoser , does Chad's explanation make sense. Do you still have concerns here?

Comment thread doc/rtd/topics/datasources/lxd.rst Outdated
VMs will not start lxd-agent.service in systemd generator timeframe
which means /dev/lxd/sock will not exist yet on LXD VM.

For VM support, ds-identify will return DS_FOUND when
/sys/class/dmi/id/board_name == "LXD" which exists at
early boot regardless of LXD socket status.
@holmanb
Copy link
Copy Markdown
Member

holmanb commented Mar 9, 2022

I noticed this breaks the pattern of printing DMI info we gather, this question is basically: "why do we break the pattern?", along with a proposal how we might follow the pattern. If we don't care about this data or printing (I assume this is just for debugging) for some reason, feel free to disregard.

Is DI_MAIN=print_info in ds-identify used for anything other than debugging, and would it benefit from adding the board_name output? I see that all other DMI queries in ds-identify currently get printed if called with this env variable, so I'm curious if we want to match this pattern.

something along these lines, perhaps:

--- a/tools/ds-identify
+++ b/tools/ds-identify
@@ -507,6 +507,12 @@ read_dmi_product_serial() {
     DI_DMI_PRODUCT_SERIAL="$_RET"
 }
 
+read_dmi_product_board_name() {
+    cached "${DI_DMI_PRODUCT_BOARD_NAME}" && return
+    get_dmi_field board_name
+    DI_DMI_PRODUCT_BOARD_NAME="$_RET"
+}
+
 # shellcheck disable=2034
 read_uname_info() {
     # run uname, and parse output.
@@ -1470,6 +1476,7 @@ collect_info() {
     read_dmi_product_name
     read_dmi_product_serial
     read_dmi_product_uuid
+    read_dmi_product_board_name
     read_fs_info
 }
 
@@ -1482,7 +1489,7 @@ _print_info() {
     local n="" v="" vars=""
     vars="DMI_PRODUCT_NAME DMI_SYS_VENDOR DMI_PRODUCT_SERIAL"
     vars="$vars DMI_PRODUCT_UUID PID_1_PRODUCT_NAME DMI_CHASSIS_ASSET_TAG"
-    vars="$vars FS_LABELS ISO9660_DEVS KERNEL_CMDLINE VIRT"
+    vars="$vars DI_DMI_PRODUCT_BOARD_NAME FS_LABELS ISO9660_DEVS KERNEL_CMDLINE VIRT"
     vars="$vars UNAME_KERNEL_NAME UNAME_KERNEL_RELEASE UNAME_KERNEL_VERSION"
     vars="$vars UNAME_MACHINE UNAME_NODENAME UNAME_OPERATING_SYSTEM"
     vars="$vars DSNAME DSLIST"

in addition to replacing get_dmi_field board_name with a call to read_dmi_product_board_name?

@blackboxsw
Copy link
Copy Markdown
Collaborator Author

Excellent review suggestion @holmanb and +1 on this sentiment, we frequently use /run/cloud-init/ds-identify.log to triage and debug errors in systemd generator time detection. So, having this output factored into that print is helpful for ubuntu-bug cloud-init and cloud-init collect-logs processing during bug creation.

To aid in triage: read_dmi_board_name needs to print DMI board name.
Also factor out is_socket_file to aid in unit test of LXD and LXD-kvm.
@TheRealFalcon
Copy link
Copy Markdown
Contributor

@blackboxsw , is this ready for review again?

@blackboxsw
Copy link
Copy Markdown
Collaborator Author

@TheRealFalcon yes

@blackboxsw , is this ready for review again?

Yes, no changes expected here. We were waiting I think on a response from @smoser if there was an objection to this approach, but I believe I explained the merit of this approach per his original concerns.

Copy link
Copy Markdown
Member

@holmanb holmanb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@holmanb holmanb merged commit 30ccd51 into canonical:main Mar 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants