baremetal: keepalived/coredns pods remove before create#2033

Closed
hardys wants to merge 1 commit into openshift:master from hardys:issue_2032

@hardys hardys commented Jul 18, 2019

If the container exits unexpectedly for any reason, ExecStop is not
called, and restarting it sometimes fails with "storage for container removed"
errors, presumably because podman cleanup removes the resources of
exited containers.

Closes: #2032

@openshift-ci-robot openshift-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jul 18, 2019
@hardys
Author

hardys commented Jul 18, 2019

cc @celebdor and @cybertron

@celebdor
Contributor

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 18, 2019
@hardys
Author

hardys commented Jul 18, 2019

/label platform/baremetal

@russellb russellb added the platform/baremetal IPI bare metal hosts platform label Jul 18, 2019
Comment thread data/data/bootstrap/baremetal/files/usr/local/bin/coredns.sh Outdated
If the container exits unexpectedly for any reason, ExecStop is not
called, and restarting it sometimes fails with "storage for container removed"
errors, presumably because podman cleanup removes the resources of
exited containers.

So we add --rm on the podman create, and add logic to be completely sure any
stale containers are removed on startup of the services.

Closes: openshift#2032
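
The approach described in the commit message could look roughly like the sketch below. It is not the code from this PR: the function names, the `PODMAN` override variable, and the container and image names in the usage note are hypothetical placeholders. It only illustrates the idea of removing any leftover container before creating a new one with `--rm`.

```shell
# Illustrative sketch only; helper names and the PODMAN override are
# assumptions, not taken from the PR diff.
PODMAN="${PODMAN:-podman}"

# Remove a leftover container with the given name, if one is present.
remove_stale_container() {
    local name="$1"
    # `podman container exists` exits 0 when the container is present.
    if "$PODMAN" container exists "$name"; then
        "$PODMAN" rm -f "$name"
    fi
}

# Clear out any stale container, then create a fresh one with --rm so
# that podman removes it automatically once it exits.
recreate_container() {
    local name="$1"
    local image="$2"
    remove_stale_container "$name"
    "$PODMAN" create --rm --name "$name" "$image"
}
```

A startup script could then call something like `recreate_container coredns "$COREDNS_IMAGE"` (with `COREDNS_IMAGE` standing in for whatever image reference the script already uses) before starting the container.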
@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jul 22, 2019
@openshift-ci-robot
Contributor

New changes are detected. LGTM label has been removed.

@metal3ci

Build SUCCESS, see build http://10.8.144.11:8080/job/dev-tools/916/

@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: celebdor, hardys
To complete the pull request process, please assign steveej
You can assign the PR to them by writing /assign @steveej in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@hardys
Author

hardys commented Jul 24, 2019

/retest

@openshift-ci-robot
Contributor

@hardys: The following test failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-aws-scaleup-rhel7 | b97c7fd | link | /test e2e-aws-scaleup-rhel7 |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@hardys
Author

hardys commented Jul 25, 2019

/assign @steveej

@hardys
Author

hardys commented Jul 26, 2019

@abhinavdahiya Thanks for the previous review. The comments have been addressed, so I would appreciate another review when you get a moment. Is there anything else that needs addressing before this can merge?

@abhinavdahiya
Contributor

will this still be required if we move to static pods for bare-metal using mco-bootstrap... ?

@stbenjam
Member

will this still be required if we move to static pods for bare-metal using mco-bootstrap... ?

This should no longer be required, now that openshift/machine-config-operator#1002 is in.

/close

@openshift-ci-robot
Contributor

@stbenjam: Closed this PR.

In response to this:

will this still be required if we move to static pods for bare-metal using mco-bootstrap... ?

This should no longer be required, now that openshift/machine-config-operator#1002 is in.

/close

Labels

platform/baremetal: IPI bare metal hosts platform
size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

baremetal: keepalived/coredns systemd restarts can permanently fail

8 participants