This bug was originally filed in Launchpad as LP: #1966085
Launchpad details
affected_projects = ['cloud-init (Ubuntu)', 'cloud-init (Ubuntu Bionic)', 'cloud-init (Ubuntu Focal)', 'cloud-init (Ubuntu Impish)', 'cloud-init (Ubuntu Jammy)']
assignee = falcojr
assignee_name = James Falcon
date_closed = 2022-03-24T16:21:36.503591+00:00
date_created = 2022-03-23T14:02:18.034768+00:00
date_fix_committed = 2022-03-24T02:25:16.032161+00:00
date_fix_released = 2022-03-24T16:21:36.503591+00:00
id = 1966085
importance = high
is_complete = True
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1966085
milestone = None
owner = falcojr
owner_name = James Falcon
private = False
status = fix_released
submitter = falcojr
submitter_name = James Falcon
tags = ['regression-update', 'verification-done-bionic', 'verification-done-focal', 'verification-done-impish']
duplicates = []
Launchpad user James Falcon(falcojr) wrote on 2022-03-23T14:02:18.034768+00:00
[Impact]
* In environments where cloud-init-generator detects no viable datasources, cloud-init systemd units are not included in the boot process, and cloud-init status --wait blocks indefinitely, reporting a "not run" state instead of "disabled". Utilities that rely on the blocking status will hang, because cloud-init never runs in these scenarios.
This upload ensures that cloud-init-generator emits the expected /run/cloud-init/disabled file (which cloud-init status --wait expects to see when cloud-init is disabled at generator time), allowing cloud-init status --wait to return the expected "disabled" status.
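The marker-file check described above can be sketched roughly as follows. This is an illustration only, not cloud-init's actual implementation: the function name report_boot_status and its run_dir parameter are hypothetical, and the "enabled" marker is assumed to live next to the "disabled" one.

```shell
#!/bin/sh
# Sketch only: report_boot_status and run_dir are hypothetical names,
# not part of cloud-init. Marker file names are taken from this report.
report_boot_status() {
    run_dir="${1:-/run/cloud-init}"
    if [ -e "$run_dir/disabled" ]; then
        # Generator found no viable datasource: a waiter can stop here.
        echo "disabled"
    elif [ -e "$run_dir/enabled" ]; then
        # Generator enabled cloud-init: a waiter keeps waiting for boot stages.
        echo "enabled"
    else
        # No marker at all: the generator outcome is unknown ("not run").
        # Before this fix, the no-datasource case ended up here forever.
        echo "not run"
    fi
}
```

Passing a scratch directory instead of /run/cloud-init makes it possible to try the three cases without touching a real instance.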
[Test Plan]
* This scenario is reproducible by disabling LXD nocloud metadata templates and setting security.devlxd=false, which leaves cloud-init-generator with no viable datasources at systemd generator time. In this case, cloud-init status --wait should not block, but should return the 'disabled' status immediately after the systemd generator has run.
Procedure
* lxc launch ubuntu-daily:<series> test-<series>
# To avoid LXD datasource detection, don't allow /dev/lxd/sock in the instance
* lxc config set test-<series> security.devlxd=false
# Remove nocloud-net cloud-init instance metadata to avoid NoCloud
* lxc exec test-<series> -- rm -rf /var/lib/cloud/seed/nocloud-net
# Clean reboot of cloud-init to ensure generator detection
* lxc exec test-<series> -- cloud-init clean --logs --reboot
# Make sure cloud-init status --wait doesn't block indefinitely in this case
* lxc exec test-<series> -- cloud-init status --wait
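When running the last step by hand, it may help to bound it with a deadline so that a regression shows up as a clear failure rather than a hung terminal. The helper below is a minimal sketch: check_returns_within is a hypothetical name, the 120-second value in the usage comment is an arbitrary choice, and it assumes GNU coreutils timeout is available on the host.

```shell
#!/bin/sh
# Sketch only: check_returns_within is a hypothetical helper.
# Assumes GNU coreutils `timeout`, which exits with 124 when the deadline hits.
check_returns_within() {
    deadline="$1"
    shift
    timeout "$deadline" "$@"
    rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "HUNG: '$*' did not return within ${deadline}s"
        return 1
    fi
    # Any other exit code means the command returned on its own.
    echo "returned with exit code $rc"
    return 0
}

# Example use against the instance from the procedure (not run here):
# check_returns_within 120 lxc exec test-<series> -- cloud-init status --wait
```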
This test procedure is captured by the cloud-init integration test:
https://github.com/canonical/cloud-init/blob/main/tests/integration_tests/cmd/test_status.py
It can be run with the following:
for SERIES in bionic focal impish; do
CLOUD_INIT_PLATFORM=lxd_container CLOUD_INIT_CLOUD_INIT_SOURCE=PROPOSED CLOUD_INIT_OS_IMAGE=$SERIES tox -e integration-tests tests/integration_tests/cmd/test_status.py
done
[Where problems could occur]
* Problems could occur if cloud-init status --wait exits early without blocking: third-party apps or scripts that expect to block until cloud-init completes could start running before cloud-init is "done". This could lead to unexpected system configuration, or to apt-lock issues if those scripts attempt to install packages.
[Other Info]
[Original Description]
Calling "cloud-init status --wait" never returns if ds-identify cannot find a datasource.
In #1162, we modified status checks to wait until we get an "enabled" or "disabled" file from ds-identify. However, ds-identify never outputs a "disabled" file, so "status --wait" waits indefinitely if no datasource is found.
A simple reproduction is to change "ret=1" on this line:
https://github.com/canonical/cloud-init/blob/main/tools/ds-identify#L1866
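The generator-side behaviour that was missing can be sketched as below. This is not the actual cloud-init-generator or ds-identify code: write_generator_marker is a hypothetical helper, and treating exit code 0 as "datasource found" and any non-zero code as "none found" is an assumption made for the example.

```shell
#!/bin/sh
# Sketch only: write_generator_marker is hypothetical, not cloud-init code.
# Assumption: ds_result 0 means a datasource was found, non-zero means none.
write_generator_marker() {
    ds_result="$1"
    run_dir="${2:-/run/cloud-init}"
    mkdir -p "$run_dir"
    if [ "$ds_result" -eq 0 ]; then
        rm -f "$run_dir/disabled"
        : > "$run_dir/enabled"
    else
        # The step this bug is about: record the negative result too,
        # so that a waiter has something to observe and can stop waiting.
        rm -f "$run_dir/enabled"
        : > "$run_dir/disabled"
    fi
}
```

Writing exactly one of the two markers in every case means a waiter that polls for either file always terminates once the generator has run.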