Bug 1884053: Configure CoreDNS to shut down gracefully #237

Merged
openshift-merge-robot merged 3 commits into openshift:master from Miciah:BZ1884053-use-CoreDNSs-ready-plugin-for-readiness-probe on Mar 8, 2021

@Miciah
Contributor

Miciah commented Feb 20, 2021

Configure CoreDNS to shut down gracefully

This PR is the same as #205, which was reverted with #213, except that this PR does not change DNS pods' termination grace period.

Note that this change does not by itself solve BZ#1884053. We still need the graceful node shutdown feature referenced in the BZ to get correct draining behavior on shutdown. However, this change should reduce glitches when the pod comes back up after a node reboot.

  • assets/dns/daemonset.yaml: Change the readiness probe to use :8181/ready.
  • pkg/manifests/bindata.go: Regenerate.
  • pkg/operator/controller/controller_dns_configmap.go (corefileTemplate): Configure CoreDNS's health plugin to sleep 20 seconds when CoreDNS is shut down. Enable CoreDNS's ready plugin in order to provide a readiness endpoint on :8181/ready, which doesn't report ready until all plugins are initialized and stops reporting ready when CoreDNS is shutting down.
  • pkg/operator/controller/controller_dns_configmap_test.go (TestDesiredDNSConfigmap): Adjust for changes to corefileTemplate.

Delete TestCoreDNSImageUpgrade

Delete the TestCoreDNSImageUpgrade CI test. This test is unreliable, and we can achieve sufficient test coverage without it.

  • test/e2e/operator_test.go (TestCoreDNSImageUpgrade, setVersion, setImage, checkCurrentDNSImage): Delete functions.

Add TestCoreDNSDaemonSetReconciliation

Add an end-to-end test that verifies that the operator reconciles changes to the dns-default daemonset. This new test adds a node selector to the daemonset and verifies that the operator reverts the change.

The operator already has unit tests to verify that the daemonset update logic handles changes to image pullspecs and other important fields. Together, the new end-to-end test and the existing unit tests should provide sufficient test coverage for reconciliation of daemonsets.

  • test/e2e/operator_test.go (TestCoreDNSDaemonSetReconciliation): New test. Verify that the operator reconciles the dns-default daemonset.

@openshift-ci-robot added the bugzilla/severity-high label (Referenced Bugzilla bug's severity is high for the branch this PR is targeting) on Feb 20, 2021
@openshift-ci-robot
Contributor

@Miciah: This pull request references Bugzilla bug 1884053, which is invalid:

  • expected the bug to target the "4.8.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

Details

In response to this:

Bug 1884053: Configure CoreDNS to shut down gracefully

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot added the bugzilla/invalid-bug label (Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting) on Feb 20, 2021
@openshift-ci-robot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files) on Feb 20, 2021
@sgreene570
Contributor

Timed out on TestCoreDNSImageUpgrade
/test e2e-aws-operator

@Miciah
Contributor Author

Miciah commented Feb 23, 2021

Seeing the same familiar flakes, and no evidence of DNS problems.
/retest

@candita
Contributor

candita commented Feb 25, 2021

/bugzilla refresh

@openshift-ci-robot added the bugzilla/valid-bug label (Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting) on Feb 25, 2021
@openshift-ci-robot
Contributor

@candita: This pull request references Bugzilla bug 1884053, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.8.0) matches configured target release for branch (4.8.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
Details

In response to this:

/bugzilla refresh

@openshift-ci-robot removed the bugzilla/invalid-bug label (Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting) on Feb 25, 2021
@candita
Contributor

candita commented Feb 25, 2021

/retest

@Miciah
Contributor Author

Miciah commented Feb 25, 2021

/test e2e-aws-operator

@Miciah
Contributor Author

Miciah commented Mar 4, 2021

/cherry-pick release-4.7

@openshift-cherrypick-robot

@Miciah: once the present PR merges, I will cherry-pick it on top of release-4.7 in a new PR and assign it to you.

Details

In response to this:

/cherry-pick release-4.7

@Miciah
Contributor Author

Miciah commented Mar 4, 2021

/cherry-pick release-4.6

@openshift-cherrypick-robot

@Miciah: once the present PR merges, I will cherry-pick it on top of release-4.6 in a new PR and assign it to you.

Details

In response to this:

/cherry-pick release-4.6

@sgreene570
Contributor

/lgtm

@openshift-ci-robot added the lgtm label (Indicates that a PR is ready to be merged) on Mar 4, 2021
@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

3 similar comments

@sgreene570
Contributor

Timed out on TestCoreDNSImageUpgrade again. Maybe the changes here require a timeout bump for that test, which would make sense?

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

1 similar comment

@openshift-ci-robot removed the lgtm label (Indicates that a PR is ready to be merged) on Mar 4, 2021
@Miciah force-pushed the BZ1884053-use-CoreDNSs-ready-plugin-for-readiness-probe branch from 52559fe to 2d15224 on March 5, 2021 18:28
@sgreene570
Contributor

/lgtm
/hold to make sure new e2e test works as intended.

@openshift-ci-robot added the lgtm label (Indicates that a PR is ready to be merged) on Mar 5, 2021
@Miciah force-pushed the BZ1884053-use-CoreDNSs-ready-plugin-for-readiness-probe branch from 2d15224 to 3751a67 on March 5, 2021 20:10
@openshift-ci-robot removed the lgtm label (Indicates that a PR is ready to be merged) on Mar 5, 2021
@sgreene570
Contributor

/lgtm

@openshift-ci-robot added the lgtm label (Indicates that a PR is ready to be merged) on Mar 5, 2021
@Miciah
Contributor Author

Miciah commented Mar 8, 2021

/test e2e-aws
/hold cancel

@openshift-ci-robot removed the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command) on Mar 8, 2021
Miciah added 3 commits March 8, 2021 00:20
Configure CoreDNS to shut down gracefully

This commit is the same as commit f094ddf,
which was reverted with commit a96c45e,
except that this commit does not change DNS pods' termination grace period.

This commit is related to bug 1884053.

https://bugzilla.redhat.com/show_bug.cgi?id=1884053

* assets/dns/daemonset.yaml: Change the readiness probe to use :8181/ready.
* pkg/manifests/bindata.go: Regenerate.
* pkg/operator/controller/controller_dns_configmap.go (corefileTemplate):
Configure CoreDNS's health plugin to sleep 20 seconds when CoreDNS is shut
down.  Enable CoreDNS's ready plugin in order to provide a readiness
endpoint on :8181/ready, which doesn't report ready until all plugins are
initialized and stops reporting ready when CoreDNS is shutting down.
* pkg/operator/controller/controller_dns_configmap_test.go
(TestDesiredDNSConfigmap): Adjust for changes to corefileTemplate.

Delete TestCoreDNSImageUpgrade

Delete the TestCoreDNSImageUpgrade CI test.  This test is unreliable, and
we can achieve sufficient test coverage without it.

* test/e2e/operator_test.go (TestCoreDNSImageUpgrade, setVersion, setImage,
checkCurrentDNSImage): Delete functions.

Add TestCoreDNSDaemonSetReconciliation

Add an end-to-end test that verifies that the operator reconciles changes
to the dns-default daemonset.  This new test adds a node selector to the
daemonset and verifies that the operator reverts the change.

The operator already has unit tests to verify that the daemonset update
logic handles changes to image pullspecs and other important fields.
Together, the new end-to-end test and the existing unit tests should
provide sufficient test coverage for reconciliation of daemonsets.

* test/e2e/operator_test.go (TestCoreDNSDaemonSetReconciliation): New
test.  Verify that the operator reconciles the dns-default daemonset.
@Miciah force-pushed the BZ1884053-use-CoreDNSs-ready-plugin-for-readiness-probe branch from 3751a67 to 709ad5f on March 8, 2021 05:20
@openshift-ci-robot removed the lgtm label (Indicates that a PR is ready to be merged) on Mar 8, 2021
@Miciah
Contributor Author

Miciah commented Mar 8, 2021

Rebased.

@Miciah
Contributor Author

Miciah commented Mar 8, 2021

/test e2e-upgrade

@sgreene570
Contributor

e2e operator is passing and this PR has been properly rebased. Looks good!
/lgtm

@openshift-ci-robot added the lgtm label (Indicates that a PR is ready to be merged) on Mar 8, 2021
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Miciah, sgreene570

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

5 similar comments

@openshift-merge-robot merged commit 2ee7e37 into openshift:master on Mar 8, 2021
@openshift-ci-robot
Contributor

@Miciah: All pull requests linked via external trackers have merged:

Bugzilla bug 1884053 has been moved to the MODIFIED state.

Details

In response to this:

Bug 1884053: Configure CoreDNS to shut down gracefully

@openshift-cherrypick-robot

@Miciah: #237 failed to apply on top of branch "release-4.7":

Applying: Configure CoreDNS to shut down gracefully
Using index info to reconstruct a base tree...
M	pkg/manifests/bindata.go
M	pkg/operator/controller/controller_dns_configmap.go
M	pkg/operator/controller/controller_dns_configmap_test.go
Falling back to patching base and 3-way merge...
Auto-merging pkg/operator/controller/controller_dns_configmap_test.go
Auto-merging pkg/operator/controller/controller_dns_configmap.go
Auto-merging pkg/manifests/bindata.go
CONFLICT (content): Merge conflict in pkg/manifests/bindata.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Configure CoreDNS to shut down gracefully
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

Details

In response to this:

/cherry-pick release-4.7

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • bugzilla/severity-high: Referenced Bugzilla bug's severity is high for the branch this PR is targeting.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
  • lgtm: Indicates that a PR is ready to be merged.

7 participants