
Bug 2017756: Remove crio settings that overwrite /etc/containers/storage.conf #2811

Merged
openshift-merge-robot merged 1 commit into openshift:master from palonsoro:criosize
Nov 29, 2021

Conversation

@palonsoro
Contributor

This intends to fix https://bugzilla.redhat.com/show_bug.cgi?id=2017756

It does so by:

  • Removing crio options from /etc/crio/crio.conf.d/00-default that may overwrite changes in /etc/containers/storage.conf introduced by ContainerRuntimeConfig custom resources.
  • Wiping default /etc/crio/crio.conf, as it also includes the offending settings and any other interesting default has already been moved to MCO-managed configuration /etc/crio/crio.conf.d/00-default.
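
To make the conflict concrete, here is a minimal sketch of how one might spot the offending settings in an already-parsed CRI-O config. The helper name `shadowing_keys`, the key list, and the example values are assumptions made for illustration; they are not taken from this PR's diff.

```python
# Hypothetical helper. The key names are the storage-related options assumed
# (for this sketch) to live in the [crio] table of a CRI-O config file.
STORAGE_KEYS = ("root", "runroot", "storage_driver", "storage_option")


def shadowing_keys(crio_config: dict) -> list:
    """Return the storage keys that a parsed CRI-O config sets explicitly.

    Any key returned here would take precedence over the value configured
    in /etc/containers/storage.conf.
    """
    crio_table = crio_config.get("crio", {})
    return [key for key in STORAGE_KEYS if key in crio_table]


# Illustrative drop-in content (values are examples, not the real template).
drop_in = {
    "crio": {
        "storage_driver": "overlay",
        "storage_option": ["overlay.override_kernel_check=1"],
        "runtime": {"default_runtime": "runc"},
    }
}

print(shadowing_keys(drop_in))
```

An empty result for every CRI-O config file would mean storage settings come only from /etc/containers/storage.conf, which is the state this PR aims for.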

@openshift-ci
Contributor

openshift-ci Bot commented Oct 27, 2021

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Oct 27, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Oct 27, 2021

@palonsoro: This pull request references Bugzilla bug 2017756, which is invalid:

  • expected the bug to target the "4.10.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

Details

In response to this:

Bug 2017756: Remove crio settings that overwrite /etc/containers/storage.conf

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci Bot requested review from mtrmac and umohnani8 October 27, 2021 11:53
@palonsoro
Contributor Author

/bugzilla refresh

@openshift-ci openshift-ci Bot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Oct 27, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Oct 27, 2021

@palonsoro: This pull request references Bugzilla bug 2017756, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.10.0) matches configured target release for branch (4.10.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (schoudha@redhat.com), skipping review request.

@openshift-ci openshift-ci Bot removed the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Oct 27, 2021
@palonsoro palonsoro marked this pull request as ready for review October 27, 2021 12:35
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 27, 2021
@palonsoro
Contributor Author

/retest

Contributor

@yuqi-zhang yuqi-zhang left a comment

Some initial comments (I am not the most knowledgeable regarding crio conf so we likely need someone from the container runtimes side to review)

Contributor

What happens if you don't provide a ContainerRuntimeConfig object (which, as I understand, would provide these configs)? Do they have defaults? Or are they simply gone?

It sounds to me like the better approach is to have the ContainerRuntimeConfig controller handle this and override the defaults correctly.

Contributor Author

There are defaults and MCO has full control on them. Defaults can be found here and match the removed config: https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/files/container-storage.yaml

Doing what you suggest (making the controller handle this) might require rewriting the whole controller almost from scratch and would go against its design principles (https://github.com/openshift/machine-config-operator/blob/master/docs/ContainerRuntimeConfigDesign.md).

Contributor Author

So I honestly think that we should just keep /etc/containers/storage.conf as the only source for CRI-O storage configuration (either relying just on defaults or letting them be modified as needed), which also aligns with the current container runtime config controller and requires the least coding effort.

Contributor

There are defaults and MCO has full control on them. Defaults can be found here and match the removed config

Ah ok, thanks!

Given that @haircommander authored the original #2723, let's see if he has any insight into wiping crio.conf (I'm leaning towards somehow not shipping it in the first place, as opposed to having it overwritten by an empty file).

Contributor Author

@yuqi-zhang I agree that not shipping it is the best definitive solution. However, I also included this because I thought it would be easier to empty it via MCO while a later fix to remove it from the RPM is developed.

Comment thread templates/worker/01-worker-container-runtime/_base/files/crio-conf-wipe.yml Outdated
Member

I'm not seeing this line in https://github.com/openshift/machine-config-operator/blob/master/templates/common/_base/files/container-storage.yaml though frankly I'm not sure what it does or why it's there in the first place
(@rhatdan @giuseppe @nalind what does this do/do we need it)

Member

The overlay driver used to retrieve the version information about the running kernel, and if it was below 4.0 in general, or below 4.7 if the backing storage was on btrfs, it would error out during initialization unless this flag was set. The logic that checks for and heeds that flag was replaced by run-time try-it-and-find-out logic some time ago.
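
As an illustration only, the old behavior described above can be sketched like this. This is not the containers/storage code (which is written in Go); the function name, parameters, and error message are hypothetical, and only the version thresholds come from the comment above.

```python
def legacy_overlay_kernel_check(kernel, on_btrfs, override_kernel_check=False):
    """Sketch of the old init-time check; raises if the kernel is too old.

    kernel is a (major, minor) tuple. The thresholds (4.0 in general,
    4.7 on btrfs) follow the description in the comment above.
    """
    minimum = (4, 7) if on_btrfs else (4, 0)
    # Tuples compare element by element, so (3, 10) < (4, 0) is True.
    if kernel < minimum and not override_kernel_check:
        raise RuntimeError(
            f"kernel {kernel[0]}.{kernel[1]} is older than "
            f"{minimum[0]}.{minimum[1]}; overlay is not supported"
        )


# A recent kernel passes silently, with or without the override flag.
legacy_overlay_kernel_check((4, 18), on_btrfs=True)
```

With the check replaced by run-time probing, the flag has nothing left to bypass, which is why dropping it from the config is expected to be harmless.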

Member

https://github.com/containers/storage/blob/77999de8b3ded5a4d41fb2642a68a17aa7c9eb3d/drivers/overlay/overlay.go#L418
It seems c/storage no longer handles this override_kernel_check option, so we can get this PR in, or at least merge it to the master branch?

Contributor Author

Member

oh haha I missed it

@haircommander
Member

in general I'm in favor of cri-o doing the right thing regardless of the config. @QiWang19 has opened a PR to have cri-o handle this situation more logically: cri-o/cri-o#5423

That said, it is worth evaluating the two pieces of this PR. Dropping the cri-o specific storage information in favor of deferring to /etc/containers/storage.conf makes sense to me, as long as we don't need "override_kernel_check". I'll wait for the experts on this to chime in

dropping the /etc/crio/crio.conf here solves the case where a user directly changed /etc/crio/crio.conf and it's saved by the RPM (which caused the need for #2723 in the first place), so I think I'm in favor of it.

TL;DR:
/approve

assuming we don't want override_kernel_check
though I don't think we should abandon cri-o/cri-o#5423

@palonsoro
Contributor Author

Closing this PR. The bug turned out to be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2012838 . In that bug, the proposed code fix is to make cri-o merge configs from /etc/containers/storage.conf, /etc/crio/crio.conf and /etc/crio/crio.conf.d/*, so this PR wouldn't be needed and that would be a cleaner solution.
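
A rough sketch of what such a merge could look like, assuming each file has already been parsed into a flat dict of storage settings. This is an illustration, not CRI-O's actual implementation; the function name, key names, and values are hypothetical.

```python
def merge_storage_settings(storage_conf, crio_layers):
    """Start from storage.conf values and apply CRI-O layers in order.

    Only keys that a layer sets explicitly override earlier values, so a
    layer that omits a key inherits it from storage.conf.
    """
    effective = dict(storage_conf)  # copy, so the input is not mutated
    for layer in crio_layers:
        effective.update(layer)
    return effective


# Illustrative values: storage.conf carries a customized overlay size.
storage_conf = {
    "storage_driver": "overlay",
    "storage_option": ["overlay.size=20G"],
}
crio_conf = {}  # no storage fields set in crio.conf
drop_in = {"runroot": "/run/containers/storage"}  # a crio.conf.d/* file

effective = merge_storage_settings(storage_conf, [crio_conf, drop_in])
print(effective)
```

Under this model, the customized `storage_option` survives because no CRI-O layer sets that key, which is the outcome both this PR and the merge-based fix are after.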

@palonsoro palonsoro closed this Oct 27, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Oct 27, 2021

@palonsoro: This pull request references Bugzilla bug 2017756. The bug has been updated to no longer refer to the pull request using the external bug tracker.

@palonsoro
Contributor Author

@haircommander I realized that I closed the PR before seeing your last comment, because the main issue was being addressed by the other bug and PR.

But if you think this is still worth it, I can reopen.

@haircommander
Member

/reopen

yeah I think this is worth investigating

@openshift-ci openshift-ci Bot reopened this Oct 28, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Oct 28, 2021

@haircommander: Reopened this PR.

@openshift-ci openshift-ci Bot removed the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Oct 28, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Oct 28, 2021

@palonsoro: This pull request references Bugzilla bug 2017756, which is invalid:

  • expected the bug to be open, but it isn't
  • expected the bug to be in one of the following states: NEW, ASSIGNED, ON_DEV, POST, POST, but it is CLOSED (DUPLICATE) instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

@openshift-ci openshift-ci Bot added the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Oct 28, 2021
@palonsoro
Contributor Author

/retest

@haircommander
Member

I think we should pivot a bit here. I don't think we should overwrite /etc/crio/crio.conf, nor do I think we should stop shipping it in the RPM. The reason for the latter is that, unless cri-o is configured out of the box to update its CNI plugin dir, cri-o starts but doesn't accept container creation requests and basically stalls endlessly. Even though no one is using RHCOS+CRI-O without MCO, it feels too weird to have that behavior come from the vanilla RPM.

Instead, I have dropped the storage-related fields in the crio.conf from the RPM, and I think we should do the same here. Then we will inherit them from /etc/containers/storage.conf, which correctly configures everything.

WDYT @palonsoro

@palonsoro
Contributor Author

If you have removed the storage-related overrides from crio.conf in the RPM, and that removal can be backported to the crio versions used in older OCP releases (as necessary), I think I can just drop the crio.conf modification completely and leave only the rest of the changes. Please confirm whether that makes sense to you and I'll go ahead and do it.

@palonsoro
Contributor Author

@haircommander ^^^

@haircommander
Member

yup sounds good to me, thanks @palonsoro

…te settings from /etc/containers/storage.conf.
@palonsoro
Contributor Author

Done.

@haircommander
Member

/lgtm

thanks for sticking with this and waiting while we figured out the solution here :)

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Nov 29, 2021
Contributor

@yuqi-zhang yuqi-zhang left a comment

New changes lgtm

@openshift-ci
Contributor

openshift-ci Bot commented Nov 29, 2021

@palonsoro: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                              Commit   Details  Required  Rerun command
ci/prow/e2e-aws-disruptive             dd0c6bb  link     false     /test e2e-aws-disruptive
ci/prow/e2e-aws-upgrade-single-node    dd0c6bb  link     false     /test e2e-aws-upgrade-single-node
ci/prow/okd-e2e-aws                    dd0c6bb  link     false     /test okd-e2e-aws
ci/prow/e2e-aws-single-node            dd0c6bb  link     false     /test e2e-aws-single-node
ci/prow/e2e-aws-workers-rhel7          dd0c6bb  link     false     /test e2e-aws-workers-rhel7

Full PR test history. Your PR dashboard.

@openshift-ci
Contributor

openshift-ci Bot commented Nov 29, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: haircommander, palonsoro, yuqi-zhang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 29, 2021
@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@haircommander
Member

/bugzilla refresh

@openshift-ci openshift-ci Bot added the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Nov 29, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Nov 29, 2021

@haircommander: This pull request references Bugzilla bug 2017756, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.10.0) matches configured target release for branch (4.10.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @lyman9966

@openshift-ci openshift-ci Bot removed the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Nov 29, 2021
@openshift-ci openshift-ci Bot requested a review from lyman9966 November 29, 2021 21:18
@openshift-merge-robot openshift-merge-robot merged commit 4e5bd43 into openshift:master Nov 29, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Nov 29, 2021

@palonsoro: All pull requests linked via external trackers have merged:

Bugzilla bug 2017756 has been moved to the MODIFIED state.

@palonsoro
Contributor Author

Thanks to you for all your assistance

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.

7 participants