Skip to content

Add new network activators to bring up interfaces (SC-85)#919

Merged
TheRealFalcon merged 8 commits into
canonical:mainfrom
TheRealFalcon:bring-up
Jul 1, 2021
Merged

Add new network activators to bring up interfaces (SC-85)#919
TheRealFalcon merged 8 commits into
canonical:mainfrom
TheRealFalcon:bring-up

Conversation

@TheRealFalcon
Copy link
Copy Markdown
Contributor

@TheRealFalcon TheRealFalcon commented Jun 14, 2021

Proposed Commit Message

Add new network activators to bring up interfaces

Currently _bring_up_interfaces() is a no-op for any distro using
renderers. We need to be able to support bringing up a single
interfaces, a list of interfaces, and all interfaces. This should be
independent of the renderers, as the network config is often
generated independent of the mechanism used to apply it.

Additionally, I included a refactor to remove
"_supported_write_network_config". We had a confusing call chain of
apply_network_config->_write_network_config->_supported_write_network_config.
The last two have been combined.

Additional Context

This was motivated by needing to support hotplug.

_bring_up_interfaces() is only ever called by arch and gentoo these days, so I removed the other implementations.

Test Steps

I added some unit tests. Integration testing is difficult. Given that _bring_up_interfaces was previously a no-op, it's hard to imagine real world situations where we can test this. The only time we exercise this code path is when we're running in network mode, and we were unable to obtain DataSource information during the local phase.

One way to simulate this would be to create a testing DataSource that returns a different instance id every time it is queried. It would be an ugly hack, but it would work. Given that we'll need to test this when hotplug lands, I'm not sure it's worth the integration testing effort at the moment.

Checklist:

  • My code follows the process laid out in the documentation
  • I have updated or added any unit tests accordingly
  • I have updated or added any documentation accordingly

'bridge_waitport': None}}


def parse_net_config_data(net_config, skip_broken=True):
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved parse_net_config_data to the bottom of the file so I could return some type information (NetworkConfig isn't defined here yet).

I removed parse_net_config as it is dead code.

cur = (name, render_mod.Renderer)
if first:
return cur
return [cur]
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is much easier to reason about if we're always returning a list. The only place this function is being called is from the select function below (which I modified accordingly) and a couple unit tests.

@TheRealFalcon TheRealFalcon changed the title WIP: Add new network configurers to bring up interfaces WIP: Add new network configurers to bring up interfaces (SC-85) Jun 14, 2021
state = nsi.get_network_state()

if not state:
raise RuntimeError(
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, we were either returning NetworkState or None. In net_convert.py we handled the None accordingly, but the renderers don't check for None and we'll likely get an ugly AttributeError error if they're passed it. So failing early here makes more sense and makes the function easier to reason about.

@TheRealFalcon TheRealFalcon changed the title WIP: Add new network configurers to bring up interfaces (SC-85) Add new network configurers to bring up interfaces (SC-85) Jun 15, 2021
@blackboxsw
Copy link
Copy Markdown
Collaborator

Yeah I'm struggling a bit with calling it a Configurer, maybe NetworkBackpane? What stinks is that netplan already calls -NetworkManager and systemd-networkd renderers... yet cloud-init also calls netplan a renderer. "NetworkBackplaneManager/ BackplaneBringup? I'll hold on the rest of the bike-shedding and just review the code and we can discuss more broadly what this class should be called when we figure our what exactly it needs to do.

@TheRealFalcon
Copy link
Copy Markdown
Contributor Author

What about InterfaceBringerUpper? NetworkInterfaceInterfacer?
(these aren't serious suggestions just in case you were wondering 😉 )

Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TheRealFalcon, the code and refactor looks good I think the only critical thing at this point is solving the gentoo.py issues which I think lead to a break in network rendering.

The refactors and rename made it a bit time-consuming to review and understand for me, but I think I've walked through the setup path in general and the end result is a much cleaner set of method/function names.

See if any of my alternative bike-shed suggestions to "Configurer" ring as more appropriate/clear.... (I'm not sure they do).

util.logexc(LOG, "Running interface command %s failed", cmd)
return False
"""Deprecated. Remove if/when arch and gentoo support renderers."""
raise NotImplementedError
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break both gentoo which directly calls Distros._bring_up_interface.

Don't we need to either:

  1. leave the _bring_up_interface stub in place still in the base class?
    OR
  2. Migrate this function declaration into gentoo.py._bring_up_interface? Or do we want to correct the gentoo reference to just call self._bring_up_interface

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't actually. The call hierarchy would work like this:
base:apply_network -> gentoo:_bring_up_interfaces->base:_bring_up_interfaces->gentoo:_bring_up_interface

The distros.Distro._bring_up_interfaces is confusing (it should just call super), but since it's passing self to the base class, the "normal" method calling rules still apply.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh right.

raise NotImplementedError()

def _supported_write_network_config(self, network_config):
def _write_network_state(self, network_state):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good naming, makes it more clear

Comment thread cloudinit/net/configurers.py Outdated
return found


def select_configurer(priority=None, target=None) -> Type[NetworkConfigurer]:
Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw Jun 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're only calling this from the module, WDYT about aligning with our renderers.select|search naming?

Suggested change
def select_configurer(priority=None, target=None) -> Type[NetworkConfigurer]:
def select(priority=None, target=None) -> Type[NetworkConfigurer]:

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah...I named it this way because I was frustrated when grepping for usages of the renderers.select|search. Searching for 'select' or 'search' will return a million results. I figured the function call would be slightly more explicit too.

That said, I'm not super strongly opinionated about it. If you are, I don't mind changing it.

Comment thread cloudinit/net/configurer.py Outdated
from cloudinit.net.network_state import NetworkState


class NetworkConfigurer(ABC):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NetworkActivator ??? since all methods are bring_up related
NetworkActuator, NetworkUtilityController, NetworkCmdController, NetworkCmdAbstraction, NetworkCmd, NetworkTool, NetworkBackend?

Comment thread cloudinit/net/configurers.py Outdated
]


def search_configurer(
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def search_configurer(
def search(

???

Comment thread cloudinit/net/netplan.py Outdated
return header + contents


class NetplanConfigurer(NetworkConfigurer):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels a bit strange that we have completely new/separate modules for network_manager and ifupdown "configurers", but we still house NetplanConfigurer declaration in netplan.py. It feels a tiny bit like our naming of modules is muddy in this directory in general. Having the common module namespace for renderers vs "configurer" modules in this dir makes looking up renderers and configurer declarations a bit more costly as we have generic module naming like eni.py and ifupdown.py in the same directory.

What do you think about one of these options?

  • including these "configurer" declarations inline in net.configurers.py?
    • It doesn't seem like the netplanconfigurer needs anything specifcally from netplan.py anyway beyond the "available" function.
  • prefixing "configurer" module names with the bike-sheded configurer_(ifupdown|network_manager|netplan).py

Comment thread cloudinit/distros/arch.py
self.package_command('', pkgs=pkglist)

def _write_network_config(self, netconfig):
def _write_network_state(self, network_state):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also drop "_bring_up_interfaces" from arch as it as a near equivalent (yet less functional) of the parent class Distro._bring_up_interfaces

@blackboxsw
Copy link
Copy Markdown
Collaborator

I was able to go through inplace upgrades on existing systems greenfield and brownfield runs without error so, as expected, we haven't broken any the networking paths in this refactor.

@TheRealFalcon
Copy link
Copy Markdown
Contributor Author

@blackboxsw , I think I've addressed (or responded to) each of your comments. The 'configurers-activators' was a fairly large change, so it might be easier to review the other commits separately.

@TheRealFalcon TheRealFalcon requested a review from blackboxsw June 30, 2021 18:18
@TheRealFalcon TheRealFalcon changed the title Add new network configurers to bring up interfaces (SC-85) Add new network activators to bring up interfaces (SC-85) Jun 30, 2021
Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for working through this @TheRealFalcon

@blackboxsw
Copy link
Copy Markdown
Collaborator

blackboxsw commented Jul 1, 2021

One CI error is likely due to mock not operating as expected in travis build env:

E       AssertionError: assert [<class 'clou...anActivator'>] == [<class 'clou...anActivator'>]
3097E         At index 0 diff: <class 'cloudinit.net.activators.NetplanActivator'> != <class 'cloudinit.net.activators.NetworkManagerActivator'>
3098E         Right contains one more item: <class 'cloudinit.net.activators.NetplanActivator'>
3099E         Full diff:
3100E           [
3101E         -  <class 'cloudinit.net.activators.NetworkManagerActivator'>,
3102E            <class 'cloudinit.net.activators.NetplanActivator'>,
3103E           ]
3104
3105tests/unittests/test_net_activators.py:110: AssertionError

Looks like we need the following patch to actually apply mocks to avoid actually checking on NetworkManager and Netplan

diff --git a/tests/unittests/test_net_activators.py b/tests/unittests/test_net_activators.py
index 2cd37078..f11486ff 100644
--- a/tests/unittests/test_net_activators.py
+++ b/tests/unittests/test_net_activators.py
@@ -105,7 +105,7 @@ class TestSearchAndSelect:
 
     @patch('cloudinit.net.activators.IfUpDownActivator.available',
            return_value=False)
-    def test_first_not_available(self, m_available):
+    def test_first_not_available(self, m_available, available_mocks):
         resp = search_activator()
         assert resp == DEFAULT_PRIORITY[1:]
 

`

Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix mocked test to add available_mocks. then +1

Currently _bring_up_interfaces() is a no-op for any distro using
renderers. We need to be able to support bringing up a single
interfaces, a list of interfaces, and all interfaces. This should be
independent of the renderers, as the network config is often
generated independent of the mechanism used to apply it.

Additionally, I included a refactor to remove
"_supported_write_network_config". We had a confusing call chain of
apply_network_config->_write_network_config->_supported_write_network_config.
The last two have been combined.
Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@TheRealFalcon TheRealFalcon merged commit 81299de into canonical:main Jul 1, 2021
@TheRealFalcon TheRealFalcon deleted the bring-up branch July 1, 2021 19:43
TheRealFalcon added a commit to TheRealFalcon/cloud-init that referenced this pull request Oct 5, 2021
In canonical#919 (81299de), we refactored some of the code used to bring up
networks across distros. Previously, the call to bring up network
interfaces during 'init' stage unintentionally resulted in a no-op
such that network interfaces were NEVER brought up by cloud-init, even
if new network interfaces were found after crawling the metadata.

The code was altered to bring up these discovered network interfaces.
On ubuntu, this results in a 'netplan apply' call during 'init' stage
for any ubuntu-based distro on a datasource that has a NETWORK
dependency. On GCE, this additional 'netplan apply' conflicts with the
google-guest-agent service, resulting in an instance that can no
be connected to.

This commit adds a 'disable_network_activation' option that can be
enabled in /etc/cloud.cfg to disable the activation of network
interfaces in 'init' stage.
TheRealFalcon added a commit to TheRealFalcon/cloud-init that referenced this pull request Oct 5, 2021
In canonical#919 (81299de), we refactored some of the code used to bring up
networks across distros. Previously, the call to bring up network
interfaces during 'init' stage unintentionally resulted in a no-op
such that network interfaces were NEVER brought up by cloud-init, even
if new network interfaces were found after crawling the metadata.

The code was altered to bring up these discovered network interfaces.
On ubuntu, this results in a 'netplan apply' call during 'init' stage
for any ubuntu-based distro on a datasource that has a NETWORK
dependency. On GCE, this additional 'netplan apply' conflicts with the
google-guest-agent service, resulting in an instance that can no
be connected to.

This commit adds a 'disable_network_activation' option that can be
enabled in /etc/cloud.cfg to disable the activation of network
interfaces in 'init' stage.
TheRealFalcon added a commit to TheRealFalcon/cloud-init that referenced this pull request Oct 5, 2021
In canonical#919 (81299de), we refactored some of the code used to bring up
networks across distros. Previously, the call to bring up network
interfaces during 'init' stage unintentionally resulted in a no-op
such that network interfaces were NEVER brought up by cloud-init, even
if new network interfaces were found after crawling the metadata.

The code was altered to bring up these discovered network interfaces.
On ubuntu, this results in a 'netplan apply' call during 'init' stage
for any ubuntu-based distro on a datasource that has a NETWORK
dependency. On GCE, this additional 'netplan apply' conflicts with the
google-guest-agent service, resulting in an instance that can no
be connected to.

This commit adds a 'disable_network_activation' option that can be
enabled in /etc/cloud.cfg to disable the activation of network
interfaces in 'init' stage.
TheRealFalcon added a commit to TheRealFalcon/cloud-init that referenced this pull request Oct 5, 2021
In canonical#919 (81299de), we refactored some of the code used to bring up
networks across distros. Previously, the call to bring up network
interfaces during 'init' stage unintentionally resulted in a no-op
such that network interfaces were NEVER brought up by cloud-init, even
if new network interfaces were found after crawling the metadata.

The code was altered to bring up these discovered network interfaces.
On ubuntu, this results in a 'netplan apply' call during 'init' stage
for any ubuntu-based distro on a datasource that has a NETWORK
dependency. On GCE, this additional 'netplan apply' conflicts with the
google-guest-agent service, resulting in an instance that can no
be connected to.

This commit adds a 'disable_network_activation' option that can be
enabled in /etc/cloud.cfg to disable the activation of network
interfaces in 'init' stage.
TheRealFalcon added a commit to TheRealFalcon/cloud-init that referenced this pull request Oct 7, 2021
In canonical#919 (81299de), we refactored some of the code used to bring up
networks across distros. Previously, the call to bring up network
interfaces during 'init' stage unintentionally resulted in a no-op
such that network interfaces were NEVER brought up by cloud-init, even
if new network interfaces were found after crawling the metadata.

The code was altered to bring up these discovered network interfaces.
On ubuntu, this results in a 'netplan apply' call during 'init' stage
for any ubuntu-based distro on a datasource that has a NETWORK
dependency. On GCE, this additional 'netplan apply' conflicts with the
google-guest-agent service, resulting in an instance that can no
be connected to.

This commit adds a 'disable_network_activation' option that can be
enabled in /etc/cloud.cfg to disable the activation of network
interfaces in 'init' stage.

LP: #1938299
blackboxsw pushed a commit that referenced this pull request Oct 7, 2021
In #919 (81299de), we refactored some of the code used to bring up
networks across distros. Previously, the call to bring up network
interfaces during 'init' stage unintentionally resulted in a no-op
such that network interfaces were NEVER brought up by cloud-init, even
if new network interfaces were found after crawling the metadata.

The code was altered to bring up these discovered network interfaces.
On ubuntu, this results in a 'netplan apply' call during 'init' stage
for any ubuntu-based distro on a datasource that has a NETWORK
dependency. On GCE, this additional 'netplan apply' conflicts with the
google-guest-agent service, resulting in an instance that can no
be connected to.

This commit adds a 'disable_network_activation' option that can be
enabled in /etc/cloud.cfg to disable the activation of network
interfaces in 'init' stage.

LP: #1938299
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants