This bug was originally filed in Launchpad as LP: #1758409
Launchpad details
affected_projects = []
assignee = None
assignee_name = None
date_closed = 2019-07-19T19:24:20.052730+00:00
date_created = 2018-03-23T18:25:46.957162+00:00
date_fix_committed = 2019-07-19T19:24:20.052730+00:00
date_fix_released = 2019-07-19T19:24:20.052730+00:00
id = 1758409
importance = undecided
is_complete = True
lp_url = https://bugs.launchpad.net/cloud-init/+bug/1758409
milestone = None
owner = powersj
owner_name = Joshua Powers
private = False
status = fix_released
submitter = powersj
submitter_name = Joshua Powers
tags = []
duplicates = []
Launchpad user Joshua Powers(powersj) wrote on 2018-03-23T18:25:46.957162+00:00
Summary
During the integration tests, if an SSH connection to an instance times out, the repeated connection attempts hold up testing for over an hour; note the timestamp jump in: https://paste.ubuntu.com/p/NBQKwm9wdG/
The _ssh_connect function was originally written for the nocloud_kvm platform, where it was used to determine whether an instance was up and accessible. As a result, the function does double duty: it is not focused solely on SSH'ing to an instance that is already up and running, and it has a bug that makes it wait far too long.
Action plan
- For the nocloud_kvm platform, when starting and before _wait_for_system, there should be a check that the instance is accessible during the is_running check. This could again be done over SSH with a number of retries, but it should be handled inside the nocloud_kvm platform itself and not in the SSH connect function.
- Update _ssh_connect to time out quickly, reduce the wait on the banner, and retry at most 3 times.
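A rough sketch of the bounded retry described in the action plan, not the actual fix. The helper name connect_with_retries and its parameters are hypothetical. It assumes the caller supplies a zero-argument callable that opens a single SSH connection with its own short timeouts. If the underlying client is paramiko (an assumption, the report does not say), that callable could pass small timeout and banner_timeout values to SSHClient.connect.

```python
import time


def connect_with_retries(connect, retries=3, delay=1.0,
                         retry_on=(OSError,), sleep=time.sleep):
    """Call connect() at most `retries` times and return its result.

    All names here are illustrative. `connect` is a zero-argument callable
    that makes one connection attempt and raises one of `retry_on` on
    failure; it is responsible for its own (short) timeouts.
    """
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return connect()
        except retry_on as error:
            last_error = error
            # Only pause between attempts, not after the final one.
            if attempt < retries:
                sleep(delay)
    # Give up quickly instead of blocking the test run.
    raise ConnectionError(
        "SSH connection failed after %d attempts" % retries) from last_error
```

With 3 attempts and a timeout of a few seconds per attempt, the worst case is bounded to well under a minute rather than over an hour. A platform-specific "is the instance accessible yet" check, as suggested for nocloud_kvm, could reuse the same helper with a larger retry count inside the platform code.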
Noted Files
tests/cloud_tests/platforms/platforms.py:_ssh_connect()
tests/cloud_tests/platforms/nocloudkvm/instance.py:start()