Skip to content

Consistently strip top-level network key (SC-963)#1417

Merged
blackboxsw merged 4 commits into
canonical:mainfrom
TheRealFalcon:strip-network
May 5, 2022
Merged

Consistently strip top-level network key (SC-963)#1417
blackboxsw merged 4 commits into
canonical:mainfrom
TheRealFalcon:strip-network

Conversation

@TheRealFalcon
Copy link
Copy Markdown
Contributor

Proposed Commit Message

Consistently strip top-level network key

Our NetworkState parsing code expects no top level network key even
though it's part of our v1 and v2 standard. NoCloud and LXD datasources
had their own code to strip off the top-level network key, but it failed
for v2 configs. This commit simplifies and moves that code into the
parser so that it is consistent across all datasources.

LP: #1906187

Additional Context

See https://bugs.launchpad.net/cloud-init/+bug/1906187

@TheRealFalcon TheRealFalcon changed the title Consistently strip top-level network key Consistently strip top-level network key (SC-963) Apr 29, 2022
Comment thread cloudinit/net/network_state.py Outdated
"""
state = None
if "network" in net_config:
net_config = net_config["network"]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we get a top-level network key we are off the typical expected path for providing network configuration. We probably should do some element of validation for the expected keys below "network". To avoid potentially changing behavior, I think we still need to minimally validate expected values below a top-level "network" key as our _maybe_remove_top_network function attempted.

Since we know valid network config can be either "network": {"config": "disabled"} or "network": {"version": [12], ...} let's at least validate that one of these structures is present, otherwise we use the unaltered net_config. I'd like to see some breadcrumb log when we net_config = net_config.pop("network"). Then we'd have something during triage that better informs us of the raw network-config that came in on a given platform.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this request of mine is just slightly adjusting the error message in parse_net_config_data to do a little better job of describing the failures on an invalid network-config I'd want to know if I have invalid config due to two failure cases:

  1. missing a "version:" key
  2. or that "config: <something_other_than_the_disabled_string>" was provided.

The 2nd case I could see happening if someone thinks network config needs to live under network: {config: {<my_v1_or_v2_config>} because of the generic name of the "config" key potentially confusing image creators.

Long-term, I think I'd like to also see network-config JSON schema in some form. But, you know, "when a developer has a hammer, everything looks like a nail".

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this request of mine is just slightly adjusting the error message in parse_net_config_data to do a little better job of describing the failures on an invalid network-config I'd want to know if I have invalid config due to two failure cases:

  1. missing a "version:" key
  2. or that "config: <something_other_than_the_disabled_string>" was provided.

The 2nd case I could see happening if someone thinks network config needs to live under network: {config: {<my_v1_or_v2_config>} because of the generic name of the "config" key potentially confusing image creators.

Long-term, I think I'd like to also see network-config JSON schema in some form. But, you know, "when a developer has a hammer, everything looks like a nail".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we get a top-level network key we are off the typical expected path for providing network configuration.

Not according to our networking config formats 😬 . I'm not sure why we ever considered it normal not have the top-level network key, but I consider the "normal" path of not having it to be the exception we're working around.

I noticed that we're not quite handling this early enough, so I pushed a new version that handles this little earlier (I still need to update tests).

About your concerns:

missing a "version:" key

That indirectly happens here: https://github.com/canonical/cloud-init/blob/main/cloudinit/net/__init__.py#L660

or that "config: <something_other_than_the_disabled_string>" was provided.

But <something_other_than_the_disabled_string> is valid, isn't it? V1 config is:

network:
  version: 1
  config:

so config is valid there. Or am I misunderstanding what you're asking for?

I added a log of the network config when we raise the exception. Do you think that is good enough?

I definitely agree that a schema for networking is a good idea. We should create one.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Good move on earlier in stages, right before the is_disabled check makes sense to make sure we pop off the network key before checking "config": "disabled "
  2. missing a "version:" key: Thanks for pointing out we are already validating and raising a descriptive error here, I was unaware of that still happening and that satisfies my concern
  3. When I made the comment I was thinking specifically if a string value other than "disabled" were provided. Agreed that our "schema" should permit 'config: { type: [disabled, array]....} for this. Nothing really to do here, I was just thinking about checking isinstance(net_config.get("config"), str) and net_config["config"] != "disabled": raise RuntimeError("bogus net config value dude")
  4. Yes let's make a network-config schema. https://warthogs.atlassian.net/browse/SC-969

"No valid network_state object created from network config. "
"Did you specify the correct version?"
"Did you specify the correct version? Network config:\n"
f"{net_config}"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx.

Comment thread cloudinit/stages.py
ncfg = self._remove_top_level_network_key(
available_cfgs[cfg_source]
)
if net.is_disabled_cfg(ncfg):
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just in time for the "config: disabled" check. +1.

Comment thread cloudinit/net/network_state.py Outdated
"""
state = None
if "network" in net_config:
net_config = net_config["network"]
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Good move on earlier in stages, right before the is_disabled check makes sense to make sure we pop off the network key before checking "config": "disabled "
  2. missing a "version:" key: Thanks for pointing out we are already validating and raising a descriptive error here, I was unaware of that still happening and that satisfies my concern
  3. When I made the comment I was thinking specifically if a string value other than "disabled" were provided. Agreed that our "schema" should permit 'config: { type: [disabled, array]....} for this. Nothing really to do here, I was just thinking about checking isinstance(net_config.get("config"), str) and net_config["config"] != "disabled": raise RuntimeError("bogus net config value dude")
  4. Yes let's make a network-config schema. https://warthogs.atlassian.net/browse/SC-969

@TheRealFalcon TheRealFalcon force-pushed the strip-network branch 2 times, most recently from 645f7a6 to 6c6300f Compare May 3, 2022 13:55
@TheRealFalcon TheRealFalcon requested a review from blackboxsw May 3, 2022 13:58
Our NetworkState parsing code expects no top level network key even
though it's part of our v1 and v2 standard. NoCloud and LXD datasources
had their own code to strip off the top-level network key, but it failed
for v2 configs. This commit simplifies and moves that code into the
parser so that it is consistent across all datasources.

LP: #1906187
* Stop using the CiTestCase base class
* Use a common autouse fixture as a replacement of the setup method
* Replace unittest assert calls to raw assert calls
* Replace custom tmpdir implementation with the pytest fixture
* parametrize the tests that test _find_networking_config to include a network config with a top-level network key and one without
Copy link
Copy Markdown
Collaborator

@blackboxsw blackboxsw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this comprehensive fix and dropping some cut-n-paste for NoCloud/LXD.
Verified expected failure prior to this PR and fix afterward:

cat > net-with-top-level.yaml <<EOF
network:
  version: 2
  ethernets:
        eth0:
            dhcp4: true
EOF
cat > net-no-top-level.yaml <<EOF
version: 2
ethernets:
        eth0:
            dhcp4: true
EOF

lxc launch ubuntu-daily:jammy dev-j -c cloud-init.network-config="$(cat net-with-top-level.yaml)"
lxc exec dev-j cloud-init status --long
# See error network doesn't have version key"
- upgrade to this PR
lxc exec dev-j cloud-init clean reboot --logs
# no traceback/error

@blackboxsw blackboxsw merged commit d335866 into canonical:main May 5, 2022
aciba90 pushed a commit to aciba90/cloud-init that referenced this pull request May 10, 2022
Our NetworkState parsing code expects no top level network key even
though it's part of our v1 and v2 standard. NoCloud and LXD datasources
had their own code to strip off the top-level network key, but it failed
for v2 configs. This commit simplifies and moves that code into the
parser so that it is consistent across all datasources.

LP: #1906187
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants