Bug 1981549: lib/resourcemerge: handle container env var deletions #2800

Merged
openshift-merge-robot merged 2 commits into openshift:master from yuqi-zhang:fix-proxy-daemonset-env
Oct 19, 2021

@yuqi-zhang
Contributor

The logic in the resourcemerge functions only iterates through required
variables, meaning any removed variable is not handled.

As a fix to bug 1981549, this adds a removal check for env vars, which
ensures that e.g. a removed proxy will correctly propagate to the
daemonset definition, which needs the proxy injected as an environment
variable to allow the MCD to pull OS image updates.
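
For illustration only, here is a minimal sketch of what a merge with a removal check could look like. It is not the code from this PR: `EnvVar` is a simplified stand-in for the Kubernetes `corev1.EnvVar` type, and `mergeEnvVars` is a hypothetical helper name. The assumption is that the merge should make the existing list match the required list and flag whether anything changed.

```go
package main

import "fmt"

// EnvVar is a simplified stand-in for the Kubernetes corev1.EnvVar type
// (assumption: only Name and Value matter for this illustration).
type EnvVar struct {
	Name  string
	Value string
}

// mergeEnvVars is a hypothetical helper, not the function changed in this PR.
// It makes *existing match required and sets *modified if anything changed.
func mergeEnvVars(modified *bool, existing *[]EnvVar, required []EnvVar) {
	// Removal pass: drop every existing variable that is no longer required.
	// Without this pass, a removed proxy variable would linger forever.
	wanted := make(map[string]struct{}, len(required))
	for _, r := range required {
		wanted[r.Name] = struct{}{}
	}
	kept := (*existing)[:0] // filter in place, reusing the backing array
	for _, e := range *existing {
		if _, ok := wanted[e.Name]; ok {
			kept = append(kept, e)
		} else {
			*modified = true
		}
	}
	*existing = kept

	// Add/update pass: iterate through required variables only.
	for _, r := range required {
		found := false
		for i := range *existing {
			if (*existing)[i].Name != r.Name {
				continue
			}
			found = true
			if (*existing)[i] != r {
				(*existing)[i] = r
				*modified = true
			}
			break
		}
		if !found {
			*existing = append(*existing, r)
			*modified = true
		}
	}
}

func main() {
	// State of the daemonset container while a cluster proxy is configured.
	existing := []EnvVar{
		{Name: "HTTP_PROXY", Value: "http://proxy.example.com:3128"},
		{Name: "NODE_NAME", Value: "worker-0"},
	}
	// The proxy has since been removed, so only NODE_NAME is required.
	required := []EnvVar{{Name: "NODE_NAME", Value: "worker-0"}}

	modified := false
	mergeEnvVars(&modified, &existing, required)
	fmt.Println(modified, existing)
}
```

With only the add/update pass, `HTTP_PROXY` would stay in the list and `modified` would remain false, which matches the symptom described above.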

This of course is also a problem for everything else being synced via
these lib functions, but for now I only added the fix to EnvVar for
proxy, as it is the most likely issue for users to hit. If we want to
fix all other variables, we should probably also consider reworking
the resource* libraries in general, since they are outdated and error
prone.

To test, you can pause the pool, add a cluster proxy and then remove
it, checking the MCD daemonset after both steps to see the proxy being
added/removed. Interestingly, adding the proxy is almost instant,
whereas the removal can take up to 10 minutes because the MCO seemingly
does not resync (no action from the proxy informer?). I am investigating
that separately.

Also add some initial tests for EnvVar handling specifically, to make sure
we don't regress proxy. Other unit tests should be added as we clean up
the code.

This aims to add some initial tests for EnvVar handling specifically,
to make sure we don't regress proxy.

Other unit tests should be added as we clean up the code.

Signed-off-by: Yu Qi Zhang <jerzhang@redhat.com>
@openshift-ci
Contributor

openshift-ci Bot commented Oct 12, 2021

@yuqi-zhang: This pull request references Bugzilla bug 1981549, which is invalid:

  • expected the bug to target the "4.10.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 1981549: lib/resourcemerge: handle container env var deletions

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci (bot) added the labels bugzilla/severity-unspecified (referenced Bugzilla bug's severity is unspecified for the PR) and bugzilla/invalid-bug (referenced Bugzilla bug is invalid for the branch this PR is targeting) on Oct 12, 2021
@openshift-ci (bot) added the label approved (the PR has been approved by an approver from all required OWNERS files) on Oct 12, 2021
@yuqi-zhang
Contributor Author

/bugzilla refresh

@openshift-ci (bot) added the labels bugzilla/severity-high (referenced Bugzilla bug's severity is high for the branch this PR is targeting) and bugzilla/valid-bug (referenced Bugzilla bug is valid for the branch this PR is targeting), and removed bugzilla/severity-unspecified (referenced Bugzilla bug's severity is unspecified for the PR), on Oct 12, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Oct 12, 2021

@yuqi-zhang: This pull request references Bugzilla bug 1981549, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.10.0) matches configured target release for branch (4.10.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @jianzhangbjz

@openshift-ci (bot) removed the label bugzilla/invalid-bug (referenced Bugzilla bug is invalid for the branch this PR is targeting) on Oct 12, 2021
@openshift-ci openshift-ci Bot requested a review from jianzhangbjz October 12, 2021 20:43
@dmage

dmage commented Oct 18, 2021

/retest

@cgwalters
Member

> we should probably also consider reworking the resource* libraries in general, since they are outdated and error prone.

This code has to exist in other places...what are other operators using? Also I'd guess all "gitops" things like argocd must have similar code.

@yuqi-zhang
Contributor Author

CVO, as an example, has its own fork as well: https://github.com/openshift/cluster-version-operator/tree/master/lib. There is a card, https://issues.redhat.com/browse/GRPA-3832, but I am not sure what came out of that discussion.

@sinnykumari
Contributor

In the past there was an attempt in the MCO repo (#829) to move away from the old version of resourcemerge and use library-go instead, but it did not get merged. We have corresponding jira card https://issues.redhat.com/browse/MCO-17 as well, if it makes sense we can prioritize to get it fixed.

@yuqi-zhang
Contributor Author

yuqi-zhang commented Oct 19, 2021

> We have corresponding jira card https://issues.redhat.com/browse/MCO-17 as well, if it makes sense we can prioritize to get it fixed.

I think it still makes sense, although I am not sure how much effort that would be, since the library-go version and the MCO fork now look significantly different.

Since this is blocking CI for other teams, would you be ok with merging this as a temp fix, and we can re-evaluate MCO-17 during a later planning?

Contributor

@sinnykumari sinnykumari left a comment

Thanks Jerry for fixing this!
Tested locally, proxy changes get applied as expected to MCD pods.
/lgtm

@sinnykumari
Contributor

> Since this is blocking CI for other teams, would you be ok with merging this as a temp fix, and we can re-evaluate MCO-17 during a later planning?

Yes, that's what I meant.

@openshift-ci (bot) added the label lgtm (the PR is ready to be merged) on Oct 19, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Oct 19, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sinnykumari, yuqi-zhang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [sinnykumari,yuqi-zhang]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci
Contributor

openshift-ci Bot commented Oct 19, 2021

@yuqi-zhang: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/okd-e2e-aws | 2ad0f61 | link | false | /test okd-e2e-aws |
| ci/prow/e2e-aws-upgrade-single-node | 2ad0f61 | link | false | /test e2e-aws-upgrade-single-node |
| ci/prow/e2e-aws-workers-rhel7 | 2ad0f61 | link | false | /test e2e-aws-workers-rhel7 |
| ci/prow/e2e-aws-workers-rhel8 | 2ad0f61 | link | false | /test e2e-aws-workers-rhel8 |
| ci/prow/e2e-aws-disruptive | 2ad0f61 | link | false | /test e2e-aws-disruptive |

Full PR test history. Your PR dashboard.

@openshift-merge-robot openshift-merge-robot merged commit 03ad769 into openshift:master Oct 19, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Oct 19, 2021

@yuqi-zhang: All pull requests linked via external trackers have merged:

Bugzilla bug 1981549 has been moved to the MODIFIED state.

@openshift-cherrypick-robot

@palonsoro: new pull request created: #3057

In response to this:

/cherry-pick release-4.9

@palonsoro
Contributor

(I removed my cherry-pick comment because I thought it had a typo but it didn't)

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • bugzilla/severity-high: Referenced Bugzilla bug's severity is high for the branch this PR is targeting.
  • bugzilla/valid-bug: Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting.
  • lgtm: Indicates that a PR is ready to be merged.

8 participants