OCPBUGS-15200: Filter out shallowly UpdateEffectNone errors from a MultipleErrors message in the Failing condition #1050
|
@Davoska: This pull request references Jira Issue OCPBUGS-15200, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/jira refresh |
|
@Davoska: This pull request references Jira Issue OCPBUGS-15200, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact:
|
I would like to test this on a live cluster (edit: and fix the failing CI). Thus, I am putting this PR on hold for the time being. /hold |
|
/uncc LalatenduMohanty |
|
Approach SGTM 👍 |
|
I have not looked at the code closely yet but one piece to check for possible interaction is #1041 which renders all reconciliation problems (including the If possible we'd like to keep |
|
/hold I am re-working the PR. |
e11d635 to 8b4d632
|
@Davoska: This pull request references Jira Issue OCPBUGS-15200, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact:
Title changed: "… UpdateEffectNone errors from the summarized task graph error" to "… UpdateEffectNone errors from the Failing condition"
petr-muller
left a comment
approach lgtm, some code readability nits + what Trevor says ;)
|
Test Scenario: Degrade a CO (authentication).
Original Failure: Install a 4.17 cluster and degrade the authentication cluster operator. Trigger an upgrade to a version which doesn't contain the PR changes. The upgrade proceeds with the error and the CVO throws the error
Expected/New Failure: Trigger an upgrade to a version which contains the PR changes
|
|
@dis016 this looks good, right? if so (and unless you plan more testing) can you please drop a |
|
@petr-muller I am looking for more testing scenarios, as @Davoska mentioned.
|
|
Hi @Davoska, after degrading the Operator, upgrade is not triggered. Please check once when you have time. |
|
Oh, I thought that the verification was successful. It is uncommon for the CVO to not trigger an update and not provide any information. I would expect the
Then finally (notice the
Is it possible that the release no longer existed in your run as well? It's maybe possible that the
Updating to a freshly created build of this PR is successful: |
|
Edit: This comment is wrong. It checks the version that does not contain the PR.
|
|
Expected/New Failure: Reason: ClusterOperatorDegraded; Message: CO A is degraded
Trigger an upgrade to a version which contains the PR changes. The upgrade is triggered and the CVO throws the new error after some time.
After the upgrade is stuck with
Now the CVO error should disappear, then the upgrade should resume. |
|
To help with the verification, there is another method that combines a degraded CO and another issue.
I have a cluster that contains this PR using the Cluster Bot. I have also set the
My goal is to create another issue while upgrading the cluster. In the same run-level as the
I have created a custom
The policy and its binding:
Apply the resources:
Request an upgrade to a release that contains this PR:
After a while, we get the
We can check the CVO logs to be sure that message was filtered as expected.
As we can see, the filtering successfully filtered out the |
This is AWESOME |
|
/label qe-approved |
|
@Davoska: This pull request references Jira Issue OCPBUGS-15200, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker.
|
/jira refresh Fixed up the target version, we missed 4.17 |
|
@petr-muller: This pull request references Jira Issue OCPBUGS-15200, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact:
|
@Davoska: Jira Issue OCPBUGS-15200: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-15200 has been moved to the MODIFIED state.
|
🎉🎉🎉 |
|
/cherry-pick release-4.17 |
|
@dis016: new pull request created: #1082
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
[ART PR BUILD NOTIFIER] Distgit: cluster-version-operator |
@dis016 fyi |
|
/cherry-pick release-4.17 |
|
@DavidHurta: new pull request created: #1114
Various errors get propagated to users, such as the summarized task
graph error, for example in the form of the message in the Failing
condition. However, update errors set with the update effect of
UpdateEffectNone can confuse users, as these primarily informational
messages get displayed together with valid update errors that heavily
impact the update. This can result in a message such as:
The Failing condition is not true because of the UpdateEffectNone
error ("Cluster operator authentication is updating versions"), but
its message still gets displayed.
This PR makes sure that update errors that do not heavily affect
the update are removed from the Failing condition message, to an
extent.
This pull request references https://issues.redhat.com/browse/OCPBUGS-15200