Skip to content

Bug 1946506: Add hosts template for monitor-based resolution#2465

Merged
openshift-merge-robot merged 3 commits intoopenshift:masterfrom
cybertron:monitor-node-dns
May 5, 2021
Merged

Bug 1946506: Add hosts template for monitor-based resolution#2465
openshift-merge-robot merged 3 commits intoopenshift:masterfrom
cybertron:monitor-node-dns

Conversation

@cybertron
Copy link
Copy Markdown
Member

Now that node resolution is not required for bootstrapping, we no
longer need to use mdns to build the list of nodes. We can just wait
until the api is up and retrieve it from there.

This change removes the mdns plugin from the coredns config and
replaces it with a hosts plugin template which will be populated
by the coredns-monitor.

- Description for the changelog
Removed mdns node resolution in favor of a kubernetes api-based implementation.

@cybertron
Copy link
Copy Markdown
Member Author

This is dependent on openshift/baremetal-runtimecfg#129. I suspect it may not pass ci until that merges.

@yboaron
Copy link
Copy Markdown
Contributor

yboaron commented Mar 25, 2021

/retest
openshift/baremetal-runtimecfg#129 got merged.

@cybertron
Copy link
Copy Markdown
Member Author

/test e2e-openstack
/test e2e-ovirt
/test e2e-vsphere
/cc @mandre @Gal-Zaidman @jcpowermac @patrickdillon

metal-ipi is currently broken, but the runtimecfg patch merged so this should be good to go whenever it starts working again.

@Gal-Zaidman
Copy link
Copy Markdown

/test e2e-ovirt

@Gal-Zaidman
Copy link
Copy Markdown

/lgtm

oVirt seems to be ok with the change :)

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Mar 30, 2021
@cybertron
Copy link
Copy Markdown
Member Author

/retest

@kikisdeliveryservice
Copy link
Copy Markdown
Contributor

Ping: @mandre @jcpowermac @patrickdillon

@cybertron
Copy link
Copy Markdown
Member Author

/test e2e-metal-ipi
/test e2e-vsphere
/test e2e-vsphere-upgrade

@jcpowermac
Copy link
Copy Markdown
Contributor

Reviewing the e2e-vsphere logs the test failures are unrelated.

Upgrade is failing at MCO which might indicate a problem - I will look more in depth tomorrow.

@kikisdeliveryservice
Copy link
Copy Markdown
Contributor

Reviewing the e2e-vsphere logs the test failures are unrelated.

Upgrade is failing at MCO which might indicate a problem - I will look more in depth tomorrow.

MCD having trouble with image registry pod: https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_machine-config-operator/2465/pull-ci-openshift-machine-config-operator-master-e2e-vsphere-upgrade/1377258688363892736/artifacts/e2e-vsphere-upgrade/gather-extra/artifacts/pods/openshift-machine-config-operator_machine-config-daemon-nzw6c_machine-config-daemon.log

It's covered by: https://bugzilla.redhat.com/show_bug.cgi?id=1944762

It's hitting some jobs incl vsphere

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we no longer need the rule for mdns in the security groups and it should be removed from the installer. Are we tracking this somewhere?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, no I hadn't considered that because baremetal doesn't have security groups. What would be the best way to make sure that doesn't get lost?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally, I would say in the same BZ that tracks the removal of mdns, or just a patch in the installer, but with only 3 days to feature freeze I think it's unrealistic to expect both patch to merge before FF. I'll create the BZ.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're now tracking mdns removal in https://bugzilla.redhat.com/show_bug.cgi?id=1946506. Aside from allowing to merge past feature freeze, this also ensures QE looks at and verifies the code change.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need mdns-publisher?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it stands right now, we do. My thought was that after this and #2410 merged we could remove mdns-publisher completely, but right now some platforms are still going to be using mdns.

Given how close we are to feature freeze now, maybe I need to pull 2410 into this PR and do everything at once.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the two changes of updating the path of the kubeconfig file and removing the mdns plugin related? If not they should probably go in separate PRs or at least separate commits.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in order for the coredns-monitor to read the node list and populate this new template we need the kubelet kubeconfig. The old one doesn't have permission to read nodes.

@cybertron
Copy link
Copy Markdown
Member Author

/test e2e-metal-ipi
/test e2e-vsphere

Copy link
Copy Markdown
Member

@mandre mandre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/retest
/lgtm

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally, I would say in the same BZ that tracks the removal of mdns, or just a patch in the installer, but with only 3 days to feature freeze I think it's unrealistic to expect both patch to merge before FF. I'll create the BZ.

@mandre
Copy link
Copy Markdown
Member

mandre commented Apr 6, 2021

/retitle Bug 1946506: Add hosts template for monitor-based resolution

@openshift-ci-robot openshift-ci-robot changed the title Add hosts template for monitor-based resolution Bug 1946506: Add hosts template for monitor-based resolution Apr 6, 2021
@openshift-ci-robot openshift-ci-robot added the bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. label Apr 6, 2021
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

@cybertron: This pull request references Bugzilla bug 1946506, which is invalid:

  • expected the bug to target the "4.8.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

Details

In response to this:

Bug 1946506: Add hosts template for monitor-based resolution

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Apr 6, 2021
@mandre
Copy link
Copy Markdown
Member

mandre commented Apr 6, 2021

/bugzilla refresh

@openshift-ci-robot openshift-ci-robot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Apr 6, 2021
@cybertron
Copy link
Copy Markdown
Member Author

/test e2e-metal-ipi

403 downloading go. o.O

@cybertron
Copy link
Copy Markdown
Member Author

/test e2e-metal-ipi

@cybertron
Copy link
Copy Markdown
Member Author

/test e2e-aws

I think the vsphere-upgrade and okd jobs are known flaky, but it would be nice to get an e2e-aws pass.

@cybertron
Copy link
Copy Markdown
Member Author

/test e2e-aws

I'm seeing the same errors in other PRs which eventually merged, so I don't think it's related to this one.

@cybertron
Copy link
Copy Markdown
Member Author

/test e2e-aws

@cybertron
Copy link
Copy Markdown
Member Author

Okay, this is now passing all of the expected jobs. The only change since the previous lgtms was adding fallthrough to fix external DNS resolution.

@cybertron
Copy link
Copy Markdown
Member Author

@mandre @Gal-Zaidman @jcpowermac Can someone lgtm this again? It should be good to go now, and this fixes a pretty big scaling issue with on-prem. Thanks!

@Gal-Zaidman
Copy link
Copy Markdown

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Apr 25, 2021
Copy link
Copy Markdown
Contributor

@JAORMX JAORMX left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/retest

@kikisdeliveryservice
Copy link
Copy Markdown
Contributor

/approve

@openshift-ci-robot
Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cybertron, Gal-Zaidman, JAORMX, kikisdeliveryservice, mandre, yboaron

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [kikisdeliveryservice]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 3, 2021
@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

10 similar comments
@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Copy Markdown
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 1dbe843 into openshift:master May 5, 2021
@openshift-ci-robot
Copy link
Copy Markdown
Contributor

@cybertron: Some pull requests linked via external trackers have merged:

The following pull requests linked via external trackers have not merged:

These pull request must merge or be unlinked from the Bugzilla bug in order for it to move to the next state. Once unlinked, request a bug refresh with /bugzilla refresh.

Bugzilla bug 1946506 has not been moved to the MODIFIED state.

Details

In response to this:

Bug 1946506: Add hosts template for monitor-based resolution

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

cybertron added a commit to cybertron/machine-config-operator that referenced this pull request May 6, 2021
mDNS was removed in openshift#2465, but a couple of the related templates were
missed. This just cleans those up too.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.