Move GCE metadata fetch to init-local (SC-502)#1122
Conversation
GCE currently fetches metadata after network has come up. There's no reason we can't fetch at init-local time, so update GCE to fetch at init-local time to be more consistent with other datasources.
blackboxsw
left a comment
There was a problem hiding this comment.
Content looks good, I think we need at least some unittest and integration test coverage.
For integration tests can we add a @pytest.mark.gce test case to tests.integration_tests.test_combined.TestCombined for test_instance_json_gce somewhat like what we have for the other clouds.
I'd also like to extend those combined tests with a datasource detection test. something like
diff --git a/tests/integration_tests/modules/test_combined.py b/tests/integration_tests/modules/test_combined.py
index bc19c2a2..4bee0f20 100644
--- a/tests/integration_tests/modules/test_combined.py
+++ b/tests/integration_tests/modules/test_combined.py
@@ -195,6 +195,32 @@ class TestCombined:
'date "+%Z" --date="Thu, 03 Nov 2016 00:47:00 -0400"')
assert timezone_output.strip() == "HDT"
+ def test_correct_datasource_detected(self, class_client: IntegrationInstance):
+ """Test datasource is detected at the proper boot stage."""
+ client = class_client
+ status_file = client.read_from_file('/run/cloud-init/status.json')
+
+ if instance.settings.PLATFORM == 'azure':
+ datasource = 'DataSourceAzure [seed=/dev/sr0]'
+ elif instance.settings.PLATFORM == 'ec2':
+ datasource = 'DataSourceEc2Local'
+ elif instance.settings.PLATFORM == 'gce':
+ datasource = 'DataSourceGCELocal'
+ elif instance.settings.PLATFORM == 'oci':
+ datasource = 'DataSourceOracle'
+ elif instance.settings.PLATFORM == 'openstack':
+ datasource = 'DataSourceOpenStackLocal [net,ver=2]'
+ elif instance.settings.PLATFORM in 'lxd_container':
+ datasource = (
+ 'DataSourceNoCloud'
+ ' [seed=/var/lib/cloud/seed/nocloud-net][dsmode=net]'
+ )
+ elif instance.settings.PLATFORM == 'lxd_vm':
+ datasource = 'DataSourceNoCloud [seed=/dev/sr0][dsmode=net]'
+ else:
+ datasource = "UNKNOWN_PLATFORM"
+ assert datasrouce == json.loads(status_file)['v1']['datasource']
+
def test_no_problems(self, class_client: IntegrationInstance):
"""Test no errors, warnings, or tracebacks"""
client = class_clientWhat do you think?
|
@blackboxsw Added some tests. Please re-review. |
|
Cool BTW on the cloud-init analyze show on your boot logs looking around a 30% "performance" gain in the boot time for that GCP instance. We'll see if that performance improvement holds in general over a broad set of runs, detecting in init-local instead of init-network stage. |
blackboxsw
left a comment
There was a problem hiding this comment.
Much cleaner approach on your integration test updates than my suggestion +1!
Looks like some mock issues in CI which prevent package builds and result in failures on integration tests.
Could this work?
diff --git a/tests/unittests/test_datasource/test_gce.py b/tests/unittests/test_datasource/test_gce.py
index c9360da4..7a6848bc 100644
--- a/tests/unittests/test_datasource/test_gce.py
+++ b/tests/unittests/test_datasource/test_gce.py
@@ -371,8 +371,9 @@ class TestDataSourceGCE(test_helpers.HttprettyTestCase):
ds = DataSourceGCE.DataSourceGCELocal(
sys_cfg={}, distro=None, paths=None
)
+ _set_mock_metadata()
ds._get_data()
- m_dhcp.assert_called_once()
+ self.assertEqual(1, m_dhcp.call_count)
@mock.patch(
"cloudinit.sources.DataSourceGCE.EphemeralDHCPv4",
@@ -380,6 +381,7 @@ class TestDataSourceGCE(test_helpers.HttprettyTestCase):
)
def test_datasource_doesnt_use_ephemeral_dhcp(self, m_dhcp):
ds = DataSourceGCE.DataSourceGCE(sys_cfg={}, distro=None, paths=None)
+ _set_mock_metadata()
ds._get_data()
m_dhcp.assert_not_called()Approved once CI is fixed.
| sys_cfg={}, distro=None, paths=None | ||
| ) | ||
| ds._get_data() | ||
| m_dhcp.assert_called_once() |
There was a problem hiding this comment.
DNE, you probably meant assert_called_once_with. But we can instead just check call_count
| m_dhcp.assert_called_once() | |
| self.assertEqual(1, m_dhcp.call_count) |
There was a problem hiding this comment.
Heh...so since I knew the death of 3.5 is imminent for cloud-init, and Brett showed me a really cool package for black formatting of only changed code that is 3.6+, I figured I can upgrade my local dev environment because am I really going to run into any python 3.5-only stuff in the next 3 weeks?
Apparently the answer is yes 🤦♂️
|
It's interesting that these mocks are only necessary for 3.5. I wonder what sort of magic is or isn't happening in 3.5. |
Proposed Commit Message
Additional Context
Test Steps
Upload dpkg to GCE instance and install
cloud-init clean --logs --rebootVerify metadata is fetched at init-local time
Verify metadata is NOT fetched at init time
cloud-init.log can be seen at:
https://paste.ubuntu.com/p/tY3RkS9rKY/
Additionally, I ran integration tests on GCE
Checklist: