OCPBUGS-3714: pkg/cli/admin/upgrade: Report on Failing!=False conditions#900

Merged
openshift-merge-robot merged 1 commit into openshift:master from wking:drop-cluster-version-degraded-check
Nov 18, 2022

Conversation

@wking
Member

@wking wking commented Aug 17, 2021

Spun out of this comment.

cae0b5e (openshift/origin#22644) moved this code from Failing to Degraded, likely inspired by openshift/api#287. But Degraded is only used in ClusterOperator. ClusterVersion kept using Failing, as seen in openshift/cluster-version-operator#191. This commit returns us to watching for Failing (the condition the CVO has been setting the whole time), and informing the caller for any non-happy statuses (or the lack of a Failing condition at all).

Even though the issue causing Failing=True may block the current update from progressing, it should not block admins from requesting a new update target. For some bugs, retargeting is the recommended way to resolve the issue that is currently sticking the update (one example).
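
As a rough illustration of the check described above, here is a minimal sketch. It is not the code from this PR: the `Condition` type and the `findCondition` and `failingNotice` helpers are hypothetical stand-ins, and the real code works on the ClusterVersion conditions from the OpenShift API. The sketch looks only at `Failing`, ignores `Degraded`, and reports any status other than `False`; it only reports, and leaves the decision about retargeting to the caller.

```go
package main

import "fmt"

// Condition is a hypothetical stand-in for a ClusterVersion status condition.
type Condition struct {
	Type    string
	Status  string // "True", "False", or "Unknown"
	Reason  string
	Message string
}

// findCondition returns the first condition of the given type, or nil if absent.
func findCondition(conds []Condition, condType string) *Condition {
	for i := range conds {
		if conds[i].Type == condType {
			return &conds[i]
		}
	}
	return nil
}

// failingNotice returns a non-empty notice when a Failing condition is present
// and its status is anything other than "False". Degraded is never consulted.
func failingNotice(conds []Condition) string {
	c := findCondition(conds, "Failing")
	if c == nil || c.Status == "False" {
		return ""
	}
	return fmt.Sprintf("Failing=%s: Reason: %s, Message: %s", c.Status, c.Reason, c.Message)
}

func main() {
	conds := []Condition{
		{Type: "Degraded", Status: "True", Reason: "Ignored", Message: "only meaningful on ClusterOperator"},
		{Type: "Failing", Status: "True", Reason: "BadStuff", Message: "The widgets are slow."},
	}
	fmt.Println(failingNotice(conds))
}
```

The missing-condition case is deliberately left out of `failingNotice` here so that the caller can treat it as a separate warning.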

@openshift-ci
Contributor

openshift-ci Bot commented Aug 17, 2021

@wking: This pull request references Bugzilla bug 1992680, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.9.0) matches configured target release for branch (4.9.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @zhouying7780

Details

In response to this:

Bug 1992680: pkg/cli/admin/upgrade: Report on Failing!=False conditions

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci Bot added bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Aug 17, 2021
@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 17, 2021
@wking wking force-pushed the drop-cluster-version-degraded-check branch from f42948c to 004261b Compare August 17, 2021 00:48
Comment thread pkg/cli/admin/upgrade/upgrade.go Outdated
}
return fmt.Errorf("The cluster can't be upgraded, see `oc describe clusterversion`")
} else {
fmt.Fprintf(o.ErrOut, "warning: No current %s info, see `oc describe clusterversion` for more details", ClusterStatusFailing)
Member

Why is this a warning? Also, the wording should change: s/No current %s info/No information on current %s/

Member Author

It's a warning, because we should always be setting Failing. Ideally to Failing=False, but sometimes to Failing=True. A ClusterVersion with no Failing at all is a sign that something is probably wrong with the CVO.

"...No current Failing info, see..." sounds pretty clear to me. I'd be fine rephrasing to something like "...No current conditions with type Failing, see..." here and below for Progressing if you prefer.
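
To make the missing-condition case concrete, here is a small sketch under stated assumptions. The `Condition` type and the `missingWarning` helper are hypothetical and not taken from the PR; only the message text is copied from the diff context quoted in this thread. It returns a warning string whenever a condition type is absent, so the same helper could cover both Failing and Progressing.

```go
package main

import (
	"fmt"
	"os"
)

// Condition is a hypothetical stand-in for a ClusterVersion status condition.
type Condition struct {
	Type   string
	Status string
}

// missingWarning returns a warning when no condition of condType is present,
// and an empty string when the condition exists (whatever its status).
func missingWarning(conds []Condition, condType string) string {
	for _, c := range conds {
		if c.Type == condType {
			return ""
		}
	}
	return fmt.Sprintf("warning: No current %s info, see `oc describe clusterversion` for more details", condType)
}

func main() {
	// Progressing is set but Failing is absent, which hints at a CVO problem.
	conds := []Condition{{Type: "Progressing", Status: "False"}}
	for _, t := range []string{"Failing", "Progressing"} {
		if w := missingWarning(conds, t); w != "" {
			fmt.Fprintln(os.Stderr, w)
		}
	}
}
```

Returning the string instead of writing it directly keeps the helper easy to unit-test; the caller chooses the output stream.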

@openshift-merge-robot
Contributor

/bugzilla refresh

The requirements for Bugzilla bugs have changed, recalculating validity.

@openshift-ci openshift-ci Bot added bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. and removed bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Sep 6, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Sep 6, 2021

@openshift-merge-robot: This pull request references Bugzilla bug 1992680, which is invalid:

  • expected the bug to target the "4.10.0" release, but it targets "4.9.0" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

@wking
Member Author

wking commented Nov 16, 2021

/bugzilla refresh

@openshift-ci openshift-ci Bot added bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. and removed bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels Nov 16, 2021
@openshift-ci
Contributor

openshift-ci Bot commented Nov 16, 2021

@wking: This pull request references Bugzilla bug 1992680, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.10.0) matches configured target release for branch (4.10.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

Requesting review from QA contact:
/cc @shellyyang1989

@openshift-bot
Contributor

/bugzilla refresh

The requirements for Bugzilla bugs have changed (BZs linked to PRs on master branch need to target OCP 4.11), recalculating validity.

@openshift-ci openshift-ci Bot added bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. and removed bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Jan 28, 2022
@openshift-ci
Contributor

openshift-ci Bot commented Jan 28, 2022

@openshift-bot: This pull request references Bugzilla bug 1992680, which is invalid:

  • expected the bug to target the "4.11.0" release, but it targets "4.10.0" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 28, 2022
@wking wking force-pushed the drop-cluster-version-degraded-check branch from 304931d to fa302a7 Compare February 9, 2022 19:01
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 9, 2022
@LalatenduMohanty
Member

@wking Do you have sample output from the old code versus this PR?

@wking wking force-pushed the drop-cluster-version-degraded-check branch from fa302a7 to f95fea0 Compare February 9, 2022 20:03
@wking
Member Author

wking commented Feb 9, 2022

Before this PR, oc ignored Failing=True. Using the unit-test dummy data, a cluster with Progressing=True and Failing=True would have produced:

the cluster is already upgrading:

  Reason: RollingOut
  Message: Updating to v2.

With this PR, it will look like:

the cluster is experiencing a possibly-upgrade-blocking error:

  Reason: BadStuff
  Message: The widgets are slow.

the cluster is already upgrading:

  Reason: RollingOut
  Message: Updating to v2.
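
For readers who want to see how output of this shape could be assembled, here is a minimal sketch. It is an illustration only: the `Condition` type and the `find`, `describe`, and `report` helpers are hypothetical, and the headings are copied from the sample output above rather than from the PR's source. Failing is checked before Progressing so that the error section comes first.

```go
package main

import (
	"fmt"
	"strings"
)

// Condition is a hypothetical stand-in for a ClusterVersion status condition.
type Condition struct {
	Type, Status, Reason, Message string
}

// find returns the first condition of the given type, or nil if absent.
func find(conds []Condition, condType string) *Condition {
	for i := range conds {
		if conds[i].Type == condType {
			return &conds[i]
		}
	}
	return nil
}

// describe formats one condition under a heading, indenting Reason and Message.
func describe(heading string, c *Condition) string {
	return fmt.Sprintf("%s:\n\n  Reason: %s\n  Message: %s", heading, c.Reason, c.Message)
}

// report emits a section for Failing!=False, then one for Progressing=True.
func report(conds []Condition) string {
	var sections []string
	if c := find(conds, "Failing"); c != nil && c.Status != "False" {
		sections = append(sections, describe("the cluster is experiencing a possibly-upgrade-blocking error", c))
	}
	if c := find(conds, "Progressing"); c != nil && c.Status == "True" {
		sections = append(sections, describe("the cluster is already upgrading", c))
	}
	return strings.Join(sections, "\n\n")
}

func main() {
	conds := []Condition{
		{Type: "Progressing", Status: "True", Reason: "RollingOut", Message: "Updating to v2."},
		{Type: "Failing", Status: "True", Reason: "BadStuff", Message: "The widgets are slow."},
	}
	fmt.Println(report(conds))
}
```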

@openshift-ci openshift-ci Bot removed the bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. label Nov 15, 2022
@openshift-ci
Contributor

openshift-ci Bot commented Nov 15, 2022

@wking: No Bugzilla bug is referenced in the title of this pull request.
To reference a bug, add 'Bug XXX:' to the title of this pull request and request another bug refresh with /bugzilla refresh.

@openshift-ci openshift-ci Bot removed the bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. label Nov 15, 2022
@openshift-ci-robot openshift-ci-robot added the jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. label Nov 15, 2022
@openshift-ci-robot

@wking: This pull request references Jira Issue OCPBUGS-3714, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.13.0) matches configured target version for branch (4.13.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @zhouying7780

The bug has been updated to refer to the pull request using the external bug tracker.

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Nov 15, 2022
@wking wking force-pushed the drop-cluster-version-degraded-check branch 3 times, most recently from a1d6433 to ea3dd70 Compare November 16, 2022 17:39
@evakhoni

pre-merge verified in https://issues.redhat.com/browse/OCPBUGS-3714#comment-21273892
/label qe-approved

@openshift-ci openshift-ci Bot added the qe-approved Signifies that QE has signed off on this PR label Nov 16, 2022
Comment thread pkg/cli/admin/upgrade/upgrade.go
Comment thread pkg/cli/admin/upgrade/upgrade.go Outdated
Comment thread pkg/cli/admin/upgrade/upgrade.go
@wking wking force-pushed the drop-cluster-version-degraded-check branch from ea3dd70 to 1900577 Compare November 16, 2022 22:19
Comment thread pkg/cli/admin/upgrade/upgrade.go
@wking wking force-pushed the drop-cluster-version-degraded-check branch 5 times, most recently from b6e8e43 to 13196b1 Compare November 17, 2022 21:04
cae0b5e (React to degraded condition change, 2019-04-23,
openshift/origin#22644) moved this code from Failing to Degraded,
likely inspired by [1].  But Degraded is only used in ClusterOperator.
ClusterVersion kept using Failing, as seen in [2].  This commit
returns us to watching for Failing (the condition the CVO has been
setting the whole time), and informing the caller for any non-happy
statuses (or the lack of a Failing condition at all).

Even though the issue causing `Failing=True` may block the current
update from progressing, it should not block admins from requesting a
new update target.  For some bugs, retargeting is the recommended way
to resolve the issue that is currently sticking the update [3].

[1]: openshift/api#287
[2]: openshift/cluster-version-operator#191
[3]: https://bugzilla.redhat.com/show_bug.cgi?id=1988576#c30
@wking wking force-pushed the drop-cluster-version-degraded-check branch from 13196b1 to d071b82 Compare November 17, 2022 21:08
Member

@LalatenduMohanty LalatenduMohanty left a comment

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Nov 17, 2022
@openshift-ci
Contributor

openshift-ci Bot commented Nov 17, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: LalatenduMohanty, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Contributor

openshift-ci Bot commented Nov 17, 2022

@wking: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-agnostic-cmd | 33c93e1 | link | true | /test e2e-agnostic-cmd |

Full PR test history. Your PR dashboard.

@LalatenduMohanty
Member

/test e2e-agnostic-cmd
/test e2e-aws-ovn
/test e2e-aws-ovn-serial

@openshift-merge-robot openshift-merge-robot merged commit 3ac1b02 into openshift:master Nov 18, 2022
@openshift-ci-robot

@wking: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-3714 has been moved to the MODIFIED state.

@wking wking deleted the drop-cluster-version-degraded-check branch November 18, 2022 17:08

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/severity-moderate: Referenced Jira bug's severity is moderate for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • lgtm: Indicates that a PR is ready to be merged.
  • qe-approved: Signifies that QE has signed off on this PR

7 participants