Skip to content

[WIP] baremetal: make dhcp range configurable for bootstrap node#2758

Closed
stbenjam wants to merge 4 commits intoopenshift:masterfrom
stbenjam:provisioning-dhcp
Closed

[WIP] baremetal: make dhcp range configurable for bootstrap node#2758
stbenjam wants to merge 4 commits intoopenshift:masterfrom
stbenjam:provisioning-dhcp

Conversation

@stbenjam
Copy link
Copy Markdown
Member

@stbenjam stbenjam commented Dec 5, 2019

This adds a ProvisioningDHCPRange to the baremetal platform, and makes
that information available to the startironic template that runs on the
bootstrap.

This introduces the ability to have platform-specific information in the
templating struct.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 5, 2019
@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 5, 2019
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: stbenjam
To complete the pull request process, please assign sdodson
You can assign the PR to them by writing /assign @sdodson in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@stbenjam
Copy link
Copy Markdown
Member Author

stbenjam commented Dec 5, 2019

@hardys Untested so far, but this gets the info into the bootstrap, what do you think of the approach? Not sure if the platformData stuff is too overengineered, but felt weird to put baremetal data at the top level.

@stbenjam
Copy link
Copy Markdown
Member Author

stbenjam commented Dec 5, 2019

I forgot we made an attempt at this already in #2449, I'll pull over the DHCP-related changes to this PR. There's some things in that one I didn't account for here.

@stbenjam stbenjam force-pushed the provisioning-dhcp branch 2 times, most recently from d5b898f to 7d8bd26 Compare December 5, 2019 19:21
@stbenjam
Copy link
Copy Markdown
Member Author

stbenjam commented Dec 5, 2019

/cc @imain

Would you mind taking a look at this? It's mostly from your PR, with the code needed to insert the vars into bootstrap's startironic.sh.

@stbenjam stbenjam force-pushed the provisioning-dhcp branch 3 times, most recently from dbb5faa to eb03173 Compare December 5, 2019 19:37
@stbenjam
Copy link
Copy Markdown
Member Author

stbenjam commented Dec 5, 2019

/label platform/baremetal

@openshift-ci-robot openshift-ci-robot added the platform/baremetal IPI bare metal hosts platform label Dec 5, 2019
@metal3ci
Copy link
Copy Markdown

metal3ci commented Dec 5, 2019

Build FAILURE, see build http://10.8.144.11:8080/job/dev-tools/1349/

@hardys
Copy link
Copy Markdown

hardys commented Dec 6, 2019

The approach looks good to me - it'd be nice if we could move all the provisioning network interfaces from #2449 into this template data, then we can fix #2091 completely for the bootstrap ironic services.

That can be done as a followup though, I'm fine if we add the DHCPRange now then iterate on the remaining template data, seems it should be easy enough to add those with the approach you have here.

This adds a ProvisioningDHCPRange to the baremetal platform, and makes
that information available to the startironic template that runs on the
bootstrap.

This introduces the ability to have platform-specific information in the
templating struct.

Co-authored-by: Ian Main <imain@redhat.com>
Co-authored-by: Stephen Benjamin <stephen@redhat.com>
@stbenjam
Copy link
Copy Markdown
Member Author

stbenjam commented Dec 6, 2019

The approach looks good to me - it'd be nice if we could move all the provisioning network interfaces from #2449 into this template data, then we can fix #2091 completely for the bootstrap ironic services.

That can be done as a followup though, I'm fine if we add the DHCPRange now then iterate on the remaining template data, seems it should be easy enough to add those with the approach you have here.

Ok thanks, I didn't want to expand the scope here too much beyond DHCP.

There is one thing I'm not sure about now that I'm seeing the code: it's a bit weird if we have defaults for all the provisioning network (cluster, provisioning IP, and the CIDR) but not a DHCP range, isn't it?

Maybe have a platform var that explicitly says enable DHCP or not would be better?

@stbenjam
Copy link
Copy Markdown
Member Author

stbenjam commented Dec 6, 2019

Alternatively, we could maybe make ProvisioningDHCPRange a pointer. nil (unspecified) would mean setup DHCP, but explicitly using "" would mean not use it at all. Not sure if you can do a string pointer in the install-config, and if you can, if it would be too confusing.

@hardys
Copy link
Copy Markdown

hardys commented Dec 6, 2019

The approach looks good to me - it'd be nice if we could move all the provisioning network interfaces from #2449 into this template data, then we can fix #2091 completely for the bootstrap ironic services.
That can be done as a followup though, I'm fine if we add the DHCPRange now then iterate on the remaining template data, seems it should be easy enough to add those with the approach you have here.

Ok thanks, I didn't want to expand the scope here too much beyond DHCP.

There is one thing I'm not sure about now that I'm seeing the code: it's a bit weird if we have defaults for all the provisioning network (cluster, provisioning IP, and the CIDR) but not a DHCP range, isn't it?

In #2449 the DHCPRange default is calculated based on the ProvisioningNetworkCIDR, which I think is reasonable - you can then just change ProvisioningNetworkCIDR and have all the other IPs switch to the new subnet (basically we just use the last octet of the current defaults).

211cfd1

Maybe have a platform var that explicitly says enable DHCP or not would be better?

Yeah maybe - I'm open to that idea but it would need an update to the ironic container, MAO and openshift/enhancements#119 accordingly.

I guess the other alternative would be to use a different value (null?) in the yaml config that doesn't conflict with the golang default for the string?

@metal3ci
Copy link
Copy Markdown

metal3ci commented Dec 6, 2019

Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/1351/

This allows 3 options:
        - nil: use default range (.10->.100) for Provisioning CIDR
        - specify a range, override the default range
        - "": disable DHCP
@stbenjam
Copy link
Copy Markdown
Member Author

stbenjam commented Dec 6, 2019

I guess the other alternative would be to use a different value (null?) in the yaml config that doesn't conflict with the golang default for the string?

@hardys I updated to use that approach, I think it looks ok. Specifying the DHCP range to be "" is the only way to disable it, otherwise we use the default (10-100) or whatever the user gave us.

Still have to test if it actually works in the YAML install-config to give a string to a pointer value

@metal3ci
Copy link
Copy Markdown

metal3ci commented Dec 6, 2019

Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/1352/

@metal3ci
Copy link
Copy Markdown

metal3ci commented Dec 6, 2019

Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/1354/

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

openshift-ci-robot commented Dec 6, 2019

@stbenjam: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/prow/e2e-aws-scaleup-rhel7 9c18a3c link /test e2e-aws-scaleup-rhel7

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@stbenjam
Copy link
Copy Markdown
Member Author

stbenjam commented Dec 6, 2019

A couple of things from my testing so far:

  • Control plane and workers DHCP from different places. Control plane DHCP's from .2, and then workers DHCP from in the cluster on .3. We'll have to document this, that when configuring an external DHCP you'll need to add individual leases for the control plane and workers with the appropriate next-server. This is not ideal, and I guess goes away with virtual media anyway.

  • I need to fix the interface configuration in startironic.sh as it hardcodes the IP's, so I guess I may need bring the interface stuff over

@metal3ci
Copy link
Copy Markdown

metal3ci commented Dec 6, 2019

Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/1356/

@hardys
Copy link
Copy Markdown

hardys commented Dec 9, 2019

Control plane DHCP's from .2, and then workers DHCP from in the cluster on .3.

I've been having discussion with @celebdor about adding a VIP for the controlplane, which would avoid this problem as we could use a single IP for both the bootstrap and cluster provisioning services.

IIUC @bcrochet started working on that, but I'm not sure on the latest status - for now we can probably go ahead as you say and document the two different IPs that exist for next-server.

@hardys
Copy link
Copy Markdown

hardys commented Dec 9, 2019

I need to fix the interface configuration in startironic.sh as it hardcodes the IP's, so I guess I may need bring the interface stuff over

It's more work but one option would be to bring all the install-config interfaces from #2449 (except the cache URL which has now been removed from openshift/enhancements#119) over into this PR as a series of commits.

I think that already includes all of the interface work needed, including tests etc, and that would resolve the bootstrap part of #2091 - then @imain could rebase #2449 to only add the CR creation aligned with openshift/enhancements#119

@stbenjam
Copy link
Copy Markdown
Member Author

@imain will incorporate my changes back into his original PR or the CRD, going to close this for now.

/close

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

@stbenjam: Closed this PR.

Details

In response to this:

@imain will incorporate my changes back into his original PR or the CRD, going to close this for now.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@hardys
Copy link
Copy Markdown

hardys commented Dec 16, 2019

@imain will incorporate my changes back into his original PR or the CRD, going to close this for now.

/close

I asked @imain on #2449 if there's going to be a new version we can help review/test soon - if not I'd be inclined to revive this so we can make progress on the installer changes for configurable subnet - that could be a first step to the #2449 changes, the second being to write out the CR being defined via openshift/api#540

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. platform/baremetal IPI bare metal hosts platform size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants