[baremetal & friends] Move on-prem api-int record to dnsmasq#2374

Closed
cybertron wants to merge 1 commit into openshift:master from cybertron:api-int-dnsmasq

Conversation

@cybertron
Member

This is in preparation for moving the cluster-hosted network services
to a separate operator. With coredns no longer running as a static
pod, it will not be usable for providing the api-int record needed
for the node to register.

We decided to use dnsmasq instead of /etc/hosts because when a
deployer wants to use an external loadbalancer it will be necessary
to change the api-int record. If it's in /etc/hosts, that will require
restarting many/all of the pods to pick up the change. Using dnsmasq
allows us to just change the record in dnsmasq and SIGHUP it.

To allow dnsmasq and coredns to coexist on the node, coredns is moved
to port 5333 and dnsmasq has a server entry added to send queries
for the cluster domain to coredns.

- Description for the changelog
Move api-int record for on-prem platforms to dnsmasq service.
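
The setup described above can be sketched as a dnsmasq drop-in. This is only an illustration; the cluster domain, VIP address, and file path are placeholders, not values from this PR:

```
# /etc/dnsmasq.d/on-prem.conf -- hypothetical sketch of the layout
# described above; domain and address are placeholders.

# Serve the api-int record directly from dnsmasq, so it can be
# changed and reloaded without restarting pods.
address=/api-int.mycluster.example.com/192.0.2.5

# Forward the remaining queries for the cluster domain to coredns,
# which now listens on port 5333 to avoid colliding with dnsmasq
# on port 53.
server=/mycluster.example.com/127.0.0.1#5333
```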

@cybertron
Member Author

/hold
/cc @yboaron @celebdor

The operator design is still in progress, but this is what we're currently planning. I've tested it with the operator and it works.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 29, 2021
@cybertron
Member Author

/test e2e-openstack
/test e2e-ovirt
/test e2e-vsphere

@darkmuggle

/hold

@crawford @miabbott looks like others had my idea too -- moving from /etc/hosts to dnsmasq.

From the perspective of the MCO, I think using dnsmasq is cleaner than manipulating /etc/hosts. I like this PR a lot better than the other PRs for manipulating hostnames.

Given the discussions about dnsmasq and its supportability, accepting this PR would make dnsmasq an explicit dependency.
Before we can accept it, we'll need ACKs from the RHCOS team that we're good to use dnsmasq.

Perhaps using this as an ExecStartPre and then using ExecStart=/usr/bin/dnsmasq -k?

Member Author


Makes sense. This is a holdover from the nodeip-configuration service I stole the pattern from, but in that case the podman call is the only thing being run.
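
For illustration, the reviewer's suggested split might look roughly like this in the unit file. The image reference, mount paths, and render command are hypothetical placeholders, not taken from this PR:

```ini
# Hypothetical sketch: render the configuration in ExecStartPre,
# then run dnsmasq in the foreground so systemd supervises it directly.
[Service]
Type=simple
# Render the dnsmasq config before the daemon starts (placeholder image).
ExecStartPre=/usr/bin/podman run --rm --net=host \
    -v /etc/dnsmasq.d:/etc/dnsmasq.d:z \
    example.com/render-image:latest render-dnsmasq-config
# -k keeps dnsmasq in the foreground instead of daemonizing.
ExecStart=/usr/bin/dnsmasq -k
```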

@cybertron
Member Author

> /hold
>
> @crawford @miabbott looks like others had my idea too -- moving from /etc/hosts to dnsmasq.
>
> From the perspective of the MCO, I think using dnsmasq is cleaner than manipulating /etc/hosts. I like this PR a lot better than the other PRs for manipulating hostnames.

One of our primary motivations for wanting this over /etc/hosts is that we need to be able to modify the records, potentially after initial deployment. With /etc/hosts I believe we'd have to restart all of the pods on the system to pick up changes. With dnsmasq, we just make the change, SIGHUP it (or the dbus equivalent), and all of the pods will use the new address immediately.
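
As a sketch of why that reload works (the file name and record below are hypothetical): dnsmasq re-reads /etc/hosts and any addn-hosts files on SIGHUP, but not its main configuration file, so the mutable record would live in a hosts-format file:

```
# Hypothetical hosts-format file, loaded via a line such as
#   addn-hosts=/etc/dnsmasq.d/api-int.hosts
# in the dnsmasq configuration. On SIGHUP, dnsmasq clears its cache
# and re-reads /etc/hosts and all addn-hosts files (the main config
# is not re-read), so editing this file and signaling the daemon
# updates the record in place:
192.0.2.5  api-int.mycluster.example.com
```

After editing the file, something like `systemctl kill -s HUP dnsmasq.service` (or the dbus equivalent mentioned above) would apply the change without restarting the daemon.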

@ashcrow
Member

ashcrow commented Feb 10, 2021

> Before we can accept this PR, we'll need ACKs from the RHCOS team that we're good to use dnsmasq.

I've spoken with a few folks and the consensus is that using dnsmasq is acceptable.

@darkmuggle

/unhold
/retest

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 10, 2021
@darkmuggle darkmuggle added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 10, 2021
@yboaron
Contributor

yboaron commented Feb 11, 2021

Don't you need to update the CoreDNS port for the 'friends' platform files as well (e.g. https://github.com/openshift/machine-config-operator/blob/master/templates/common/vsphere/files/coredns-corefile.yaml)?

@cybertron
Member Author

> Don't you need to update the CoreDNS port for the 'friends' platform files as well (e.g. https://github.com/openshift/machine-config-operator/blob/master/templates/common/vsphere/files/coredns-corefile.yaml)?

Hmm, that seems bad. It means both the on-prem and the vsphere template are going to be written to the same location. I guess it must happen to work out, but we should probably work with the vsphere team to converge those files.
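
For reference, the port move on the coredns side amounts to changing the listen port of the server block in the Corefile. This fragment is a generic illustration, not the rendered template from the repository:

```
# Hypothetical Corefile fragment: CoreDNS binds port 5333 instead of
# the default 53, leaving 53 free for dnsmasq on the node.
.:5333 {
    errors
    health
    forward . /etc/resolv.conf
    cache 30
}
```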

@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: cybertron
To complete the pull request process, please assign kikisdeliveryservice after the PR has been reviewed.
You can assign the PR to them by writing /assign @kikisdeliveryservice in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cybertron
Member Author

Okay, I moved the config rendering to Pre and removed the Reload command because we aren't using it and I'm not sure it was working correctly anyway. I also have #2410 up to de-dupe the Corefiles because I think that's something we should do anyway. We either need to merge that or I'll need to change the port in those configs as well.

/hold

@cybertron
Member Author

This is probably going to be superseded by #2450 but since that's a more significant change there's a possibility it will be nacked by the associated enhancement review.

@openshift-ci
Contributor

openshift-ci Bot commented May 21, 2021

@cybertron: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-openstack | 50839585c39ebbd35ddd68a537bc09db01b511fc | link | /test e2e-openstack |
| ci/prow/e2e-ovirt | 50839585c39ebbd35ddd68a537bc09db01b511fc | link | /test e2e-ovirt |
| ci/prow/e2e-vsphere | 50839585c39ebbd35ddd68a537bc09db01b511fc | link | /test e2e-vsphere |
| ci/prow/e2e-metal-ipi | eb88780 | link | /test e2e-metal-ipi |
| ci/prow/okd-e2e-aws | eb88780 | link | /test okd-e2e-aws |
| ci/prow/e2e-agnostic-upgrade | eb88780 | link | /test e2e-agnostic-upgrade |

Full PR test history. Your PR dashboard.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Contributor

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci Bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 19, 2021
@openshift-bot
Contributor

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci Bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Sep 19, 2021
@openshift-bot
Contributor

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci openshift-ci Bot closed this Oct 19, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Oct 19, 2021

@openshift-bot: Closed this PR.


In response to this:

> Rotten issues close after 30d of inactivity.
>
> Reopen the issue by commenting /reopen.
> Mark the issue as fresh by commenting /remove-lifecycle rotten.
> Exclude this issue from closing again by commenting /lifecycle frozen.
>
> /close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
