lxd/instance: use SSH by default, allow switching to exec#114
Conversation
|
The corresponding cloud-init changes are at canonical/cloud-init#802 |
|
Looks like I need some test changes; I think the code is good to review while I look at them. |
|
Ah, and it also looks like I've introduced an error condition in the "default SSH key" codepath (which I don't take on my local machine), so I'll also need to address that. |
613d69c to
55e1162
Compare
And remove the parameterisation, as we only have a single parameter set.
051bc2f to
8f1a526
Compare
Analysis performed (and captured in canonical#112) indicates that accessing LXD instances via SSH rather than via `lxc exec` leads to faster test execution, and also brings LXD's default behaviour in line with the default behaviour of other clouds. This commit introduces a `execute_via_ssh` parameter and instance variable, which are used by `execute()` to determine whether to connect to the instance via SSH, or use `lxc exec`. The default value of `execute_via_ssh` is `True`. (Prior to this commit, we would execute via SSH but only if we had a key manually set against the instance, leading to inconsistencies in behaviour depending on local configuration. This also addresses that inconsistency, and removes code that was required to support it.) Fixes: canonical#112
| local_path=local_path | ||
| ) | ||
|
|
||
| with open(local_path, "r") as f: |
There was a problem hiding this comment.
I know that this code is not optimal at all, but the reason I have added it here is that we could change the ssh keys here, but still have launched instances that are using an old key. In that situation, we would provided an unusable instance.
However, I know that this is not optimal because, for example, if we can't exec into the machine, that pull_file call will not work. So maybe we can drop this check and live with this possibility.
But if you think there is a smarter way to handle that, no problem at all
There was a problem hiding this comment.
Thanks for the comment, this is the code I was hoping to discuss! You're correct, the issue I found is that we don't necessarily have exec access when this runs (and, indeed, we definitely don't when this runs in the demo scripts, which was causing CI failure). Adding an instance._wait_for_execute() call in get_instance doesn't seem like the right thing to do: if the instance isn't expected to be accessible, that would cause a long delay.
I think this logic is less necessary than it was previously, however, because instance.key_pair no longer determines our execute behaviour: even if instance.key_pair is None, we will still attempt to SSH in (using any keys available to paramiko).
However, I don't think we can say it's entirely unnecessary: while we don't need it to determine the SSH key to use (and, actually, never did because paramiko will try all keys by default), it did serve a purpose previously: if SSH execution could work (determined via exec), the LXDInstance would be configured to execute via SSH instead of via exec, otherwise it would be configured to use exec. We lose that behaviour here: we will always return a LXDInstance configured to execute via SSH.
cloud-init's tests only have a single place which uses get_instance (when CLOUD_INIT_EXISTING_INSTANCE_ID is set). It actually has a similar issue to the one we're trying to solve here: previously with a local SSH key configured, it would have always configured SSH access (even to instances for which SSH wouldn't work: the existence of an authorized key does not indicate, e.g., working networking), and would otherwise have always configured exec access. I need to update canonical/cloud-init#802 to handle things differently: we should set instance.execute_via_ssh after the instance has been instantiated. We would still not require the "execution method auto-detection" which is being removed here.
Does UA client have test code which would be affected by this removal? Or can we safely remove it (and modify the docstring here to explain the execution behaviour)?
There was a problem hiding this comment.
I did not known about the fact that paramiko would try to access the machine with all of the keys that is has seen. If that is the case, I don't see any problem removing the code and it should not affect uaclient at all.
Thanks for all the provided context @OddBloke :)
pycloudlib has modified the way LXD executes tests (canonical/pycloudlib#114): it will always use SSH to access them by default, instead of using `lxc exec`. This behaviour is transparent for them majority of cloud-init's integration tests, but some currently depend on using `lxc exec` to access instances with (intentionally) broken networking: obviously these are not accessible via SSH. pycloudlib retains support for switching an instance to use `lxc exec`. This commit introduces the `lxd_use_exec` mark, which tests can use to indicate to the integration testing framework that they should be so switched, and applies it to all applicable tests.
pycloudlib has modified the way LXD executes tests (canonical/pycloudlib#114): it will always use SSH to access them by default, instead of using `lxc exec`. This behaviour is transparent for them majority of cloud-init's integration tests, but some currently depend on using `lxc exec` to access instances with (intentionally) broken networking: obviously these are not accessible via SSH. pycloudlib retains support for switching an instance to use `lxc exec`. This commit introduces the `lxd_use_exec` mark, which tests can use to indicate to the integration testing framework that they should be so switched, and applies it to all applicable tests.
pycloudlib has modified the way LXD executes tests (canonical/pycloudlib#114): it will always use SSH to access them by default, instead of using `lxc exec`. This behaviour is transparent for them majority of cloud-init's integration tests, but some currently depend on using `lxc exec` to access instances with (intentionally) broken networking: obviously these are not accessible via SSH. pycloudlib retains support for switching an instance to use `lxc exec`. This commit introduces the `lxd_use_exec` mark, which tests can use to indicate to the integration testing framework that they should be so switched, and applies it to all applicable tests.
Analysis performed (and captured in #112) indicates that accessing LXD
instances via SSH rather than via
lxc execleads to faster testexecution, and also brings LXD's default behaviour in line with the
default behaviour of other clouds.
This commit introduces a
execute_via_sshparameter and instancevariable, which are used by
execute()to determine whether to connectto the instance via SSH, or use
lxc exec. The default value ofexecute_via_sshisTrue.(Prior to this commit, we would execute via SSH but only if we had a key
manually set against the instance, leading to inconsistencies in
behaviour depending on local configuration. This also addresses that.)
Fixes: #112