Bug report
ds-identify can falsely identify a non-EC2 VM as running on AWS EC2 due to a collision in the UUID-based platform detection heuristic in ec2_identify_platform(). This causes cloud-init to remain enabled on VMs where it should be auto-disabled via the notfound=disabled policy.
In tools/ds-identify, the ec2_identify_platform() function uses two glob patterns on the first group of DMI_PRODUCT_UUID to detect EC2:
local start_uuid="${DI_DMI_PRODUCT_UUID%%-*}"
case "$start_uuid" in
    [Ee][Cc]2*)
        _RET="AWS"
        ;;
    *2[0-9a-fA-F][Ee][Cc])
        _RET="AWS"
        ;;
    *)
        _RET="$default"
        ;;
esac
AWS's own documentation acknowledges that this UUID method is potentially inaccurate.
False positive example:
DMI_SYS_VENDOR=SomeVendor
DMI_PRODUCT_NAME=Virtual Machine
DMI_PRODUCT_UUID=a83f26ec-1234-5678-abcd-ef0123456789
VIRT=microsoft
First UUID group: a83f26ec
The last four characters, 26ec, match *2[0-9a-fA-F][Ee][Cc], so _RET is set to "AWS".
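The collision can be checked without a VM. The snippet below is a minimal standalone sketch, not the ds-identify code itself: uuid_platform is a hypothetical helper that applies the same two glob patterns to the first group of a UUID passed as an argument, and "Unknown" stands in for whatever $default would hold.

```shell
#!/bin/sh
# Hypothetical helper (not part of ds-identify): applies the two glob
# patterns from ec2_identify_platform() to the first group of a UUID.
uuid_platform() {
    start_uuid="${1%%-*}"   # keep everything before the first "-"
    case "$start_uuid" in
        [Ee][Cc]2*) echo "AWS" ;;
        *2[0-9a-fA-F][Ee][Cc]) echo "AWS" ;;
        *) echo "Unknown" ;;   # stand-in for $default
    esac
}

uuid_platform "a83f26ec-1234-5678-abcd-ef0123456789"   # AWS (false positive)
uuid_platform "ec2f0000-1234-5678-abcd-ef0123456789"   # AWS (prefix match)
uuid_platform "a83f27ed-1234-5678-abcd-ef0123456789"   # Unknown
```

Assuming the first group is eight uniformly random hex digits, the suffix pattern fixes three of them (2, e, c), so roughly 1 in 4096 random UUIDs would match it.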
Steps to reproduce the problem
- Create a VM on any non-EC2 hypervisor
- Ensure DMI_PRODUCT_UUID first group ends in a pattern matching 2[0-9a-fA-F][Ee][Cc] (e.g., a83f26ec-...)
- Install cloud-init with default ds-identify policy (notfound=disabled)
- Boot the VM
- Observe in /run/cloud-init/ds-identify.log:
ec2 platform is 'AWS'
Found single datasource: Ec2
returning 0
- cloud-init remains enabled despite no valid datasource being reachable
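To confirm the symptom on an affected VM, the log can be checked for the lines quoted above. This is a rough sketch: log_shows_ec2 is a hypothetical helper, and it assumes the log messages have exactly the wording shown in this report, which may differ between cloud-init versions.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if a ds-identify log contains both lines
# of the false-positive signature quoted in the steps above.
log_shows_ec2() {
    log="$1"
    [ -r "$log" ] || return 2   # log missing or unreadable
    grep -q "ec2 platform is 'AWS'" "$log" &&
        grep -q "Found single datasource: Ec2" "$log"
}

# Default path is the one named in this report; pass another path as $1.
if log_shows_ec2 "${1:-/run/cloud-init/ds-identify.log}"; then
    echo "ds-identify selected Ec2 on this boot"
else
    echo "no Ec2 detection found (or log not readable)"
fi
```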
Suggested fix
The function already checks DMI_SYS_VENDOR for specific cloud providers (e24cloud, Tilaa, Outscale) but does not use it to gate the UUID heuristic. The UUID pattern is inherently a weak signal and should only be trusted when corroborated by a vendor string consistent with EC2. There are two options:
- Option A, allowlist (preferred): only trust the UUID heuristic for known EC2 vendors ("Amazon EC2"|"Xen").
- Option B, blocklist: exclude known non-EC2 hypervisors before the UUID check (e.g. "Microsoft Corporation"|"QEMU"|"Red Hat"|"Nutanix"|"Google").
Option A is preferred because it is forward-compatible: any new platform with random UUIDs is automatically excluded, whereas Option B requires adding an entry for each new vendor.
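A minimal sketch of the allowlist approach follows. It is standalone and illustrative only: ec2_uuid_platform is a hypothetical function that takes the vendor and UUID as arguments, whereas the real ec2_identify_platform() reads DI_DMI_* variables and contains other vendor checks not reproduced here. The vendor strings are the ones proposed above; whether they cover every genuine EC2 instance type is an assumption that would need verifying.

```shell
#!/bin/sh
# Sketch of Option A (allowlist). Hypothetical standalone function; sets
# _RET in the same way the excerpt from ds-identify does.
ec2_uuid_platform() {
    vendor="$1" uuid="$2" default="${3:-Unknown}"

    # Gate: only vendors on the allowlist may use the UUID heuristic.
    case "$vendor" in
        "Amazon EC2"|"Xen") : ;;
        *) _RET="$default"; return 0 ;;
    esac

    start_uuid="${uuid%%-*}"
    case "$start_uuid" in
        [Ee][Cc]2*) _RET="AWS" ;;
        *2[0-9a-fA-F][Ee][Cc]) _RET="AWS" ;;
        *) _RET="$default" ;;
    esac
}

ec2_uuid_platform "SomeVendor" "a83f26ec-1234-5678-abcd-ef0123456789"
echo "$_RET"   # Unknown: vendor is not on the allowlist
ec2_uuid_platform "Amazon EC2" "ec2f0000-1234-5678-abcd-ef0123456789"
echo "$_RET"   # AWS
```

With this gate, the false positive example from this report falls through to the default because SomeVendor is not on the allowlist, while a matching UUID under an allowlisted vendor is still reported as AWS.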
Environment details
- Cloud-init version: 24.3.1
- Operating System Distribution: AzureLinux
- Cloud provider, platform or installer type: Azure