Skip to content

systemd-detect-virt can return qemu, which we support#1325

Merged
TheRealFalcon merged 1 commit into
canonical:mainfrom
holmanb:holmanb/qemu
Mar 14, 2022
Merged

systemd-detect-virt can return qemu, which we support#1325
TheRealFalcon merged 1 commit into
canonical:mainfrom
holmanb:holmanb/qemu

Conversation

@holmanb
Copy link
Copy Markdown
Member

@holmanb holmanb commented Mar 9, 2022

Work around bug in LXD VM detection

On kernels >=5.10, LXD starts qemu with kvm and hv_passthrough.
This causes `systemd-detect-virt` to identify the host as "qemu", rather than "kvm".

Cloud-init treats emulated (TCG) virtualization the same way as virtualized (KVM).
If systemd (see issue #22709) decides to report this as something other than
kvm/qemu, we'll need to extend our list of accepted types to include that as well.

https://github.com/systemd/systemd/issues/22709

See:
https://github.com/lxc/lxd/blob/master/lxd/instance/drivers/driver_qemu.go#L1268

systemd/systemd#22709

Copy link
Copy Markdown
Contributor

@TheRealFalcon TheRealFalcon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: since there's formatting issues anyway, in "set of things" isn't a convention I've seen many times. Usually it's a list or tuple.

@cpaelzer
Copy link
Copy Markdown
Collaborator

I'd expect that at least the commit message could say "Even once the detection fixed in systemd cloud-init treads emulated (TCG) virtualization the same way as virtualized (KVM) anyway. If systemd (see issue ...) decides to report this as something else then KVM/Qemu we'll need to extend our list of accepted types to include that as well".

Furthermore since this discussion made me see the code - depending on my minimal understanding what you use network_v1["config"][0]["name"] for there might be follow up changes needed.

Example: on s390x you define enc9, but for example I have 2: enc2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
To be clear that number reflects the ccw adress that the device is presented to the guest, in my case
cssid='0xfe' ssid='0x0' devno='0x0002'.
The others like enp0s5 are also just PCI(e) bus/slot adresses - and those devices might or might not be present already depending on guest config.
This is for LXD Datasource, did we just make assumptions for these addresses or did LXD tell us to rely on those as they always specify it that way?
What will these be used for later on?
A lot of questions I know, I'm fine if this is a follow up discussion ending in insight (for me) or a different PR (to fix this).

@cpaelzer
Copy link
Copy Markdown
Collaborator

Oh and for this very purpose - maybe systemd-detect-virt --vm would make you more resilient and future proof instead of listing accepted types?

@smoser
Copy link
Copy Markdown
Collaborator

smoser commented Mar 11, 2022

Oh and for this very purpose - maybe systemd-detect-virt --vm would make you more resilient and future proof instead of listing accepted types?

My personal feeling is that specific is better. When LXD runs a VMware guest you can add it here. That’s preferable to false-positive on a known invalid case now.

And, hello @cpaelzer

@holmanb
Copy link
Copy Markdown
Member Author

holmanb commented Mar 11, 2022

I'd expect that at least the commit message could say ...

@cpaelzer: I think I captured your suggestion in the latest updated "commit message". Agreed, context in the git log is important here.

Furthermore since this discussion made me see the code - depending on my minimal understanding what you use network_v1["config"][0]["name"] for there might be follow up changes needed.

Example: on s390x you define enc9, but for example I have 2: enc2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000 To be clear that number reflects the ccw adress that the device is presented to the guest, in my case cssid='0xfe' ssid='0x0' devno='0x0002'. The others like enp0s5 are also just PCI(e) bus/slot adresses - and those devices might or might not be present already depending on guest config. This is for LXD Datasource, did we just make assumptions for these addresses or did LXD tell us to rely on those as they always specify it that way? What will these be used for later on? A lot of questions I know, I'm fine if this is a follow up discussion ending in insight (for me) or a different PR (to fix this).

@cpaelzer: Thanks for noticing this discrepancy. I'll do some archeology and check if our assumptions are valid.

Oh and for this very purpose - maybe systemd-detect-virt --vm would make you more resilient and future proof instead of listing accepted types?

@cpaelzer @smoser I don't think we want to do this. We will still have to check for valid types with systemd-detect-virt --vm since "kvm" and "qemu" are both possible return types with --vm.

More importantly, systemd-detect-virt without flags reports the "innermost" virtualization method. Using --vm will cause it to skip checking for containers (since it will call detect_vm() directly with this flag, rather than detect_virtualization()[1]). This could result in a container inside a VM to being reported as a VM. Therefore, I think that this change in behavior (adding --vm) poses more risk than the current solution. This could cause issues for us specifically, for example, if a LXC container inside a LXD VM might possibly be reported as a LXC VM due to us skipping the container check. Is there some reason that we can do this safely that I am unaware of?

[1] https://github.com/systemd/systemd/blob/main/src/detect-virt/detect-virt.c#L163

@TheRealFalcon
Copy link
Copy Markdown
Contributor

I'm going to merge this as-is. I'm creating an additional card in Jira to follow up on @cpaelzer 's questions.

@TheRealFalcon TheRealFalcon merged commit be9389c into canonical:main Mar 14, 2022
@holmanb
Copy link
Copy Markdown
Member Author

holmanb commented Apr 3, 2022

Update: Systemd was updated to report "kvm" with hv_passthrough in systemd/systemd#22945. No further changes should be necessary; the current code should continue to work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants