Introducing Rollback informing jobs#39488

Merged
openshift-merge-robot merged 1 commit into openshift:master from bradmwilliams:rollback-informing-jobs
May 19, 2023

@bradmwilliams
Contributor

bradmwilliams commented May 18, 2023

It was decided in the 5/18 SHIP architecture call that we would introduce the "rollback" jobs as release-informing jobs on the various nightly streams. This PR adds the jobs to versions 4.10 through 4.13.

@bradmwilliams
Contributor Author

/hold
For stakeholder buy-in

openshift-ci-robot added the rehearsals-ack label (Signifies that rehearsal jobs have been acknowledged) May 18, 2023
openshift-ci Bot added the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command) May 18, 2023
openshift-ci Bot requested review from sosiouxme and xueqzhan May 18, 2023 16:52
openshift-ci Bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files) May 18, 2023
@stbenjam
Member

  • Who owns these jobs?
  • Where can TRT send slack Alertmanager alerts for them?
  • Why are they SDN? If we're going to have one, I'd suggest making it the default OCP configuration (OVN)

@vikaslaad Can you help? I noticed you were listed on the agenda item

bradmwilliams force-pushed the rollback-informing-jobs branch from d827d4c to 56b242f May 18, 2023 17:50
@vikaslaad
Contributor

>   • Who owns these jobs?
>   • Where can TRT send slack Alertmanager alerts for them?
>   • Why are they SDN? If we're going to have one, I'd suggest making it the default OCP configuration (OVN)
>
> @vikaslaad Can you help? I noticed you were listed on the agenda item

  • QE would own the jobs, we are making them informing for now so that it does not block anything.
  • TRT should not be looking into the failures, patch manager and QE would look into them for z-stream releases.
  • I think you worked with Brad on the third one.

@stbenjam
Member

stbenjam commented May 18, 2023

> >   • Who owns these jobs?
> >   • Where can TRT send slack Alertmanager alerts for them?
> >   • Why are they SDN? If we're going to have one, I'd suggest making it the default OCP configuration (OVN)
> >
> > @vikaslaad Can you help? I noticed you were listed on the agenda item
>
> * QE would own the jobs, we are making them informing for now so that it does not block anything.
> * TRT should not be looking into the failures, patch manager and QE would look into them for z-stream releases.
> * I think you worked with Brad on the third one.

The PR still adds them to 4.14. TRT owns all the jobs on the dev branch, and we no longer accept adding informers to the dev branch without someone who is actually accountable for fixing them (someone who gets alerts when they fail). Otherwise we know they'll eventually fail forever and no one will look.

If you just want to add them to 4.13 and earlier, that's fine and TRT is out of the picture, but in general I find the answer "QE would own the jobs" a bit vague about who is accountable when they start failing.

bradmwilliams force-pushed the rollback-informing-jobs branch from 56b242f to cf76138 May 18, 2023 20:02
@bradmwilliams
Contributor Author

@stbenjam @vikaslaad I have dropped the job definitions from 4.14.

@stbenjam
Member

/lgtm

Feel free to merge then if 4.14 is out of the picture

openshift-ci Bot added the lgtm label (Indicates that a PR is ready to be merged) May 18, 2023
bradmwilliams force-pushed the rollback-informing-jobs branch from cf76138 to ef34f72 May 19, 2023 13:11
openshift-ci Bot removed the lgtm label (Indicates that a PR is ready to be merged) May 19, 2023
@openshift-ci-robot
Contributor

[REHEARSALNOTIFIER]
@bradmwilliams: no rehearsable tests are affected by this change

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 10 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 20 rehearsals
Comment: /pj-rehearse max to run up to 35 rehearsals
Comment: /pj-rehearse auto-ack to run up to 10 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse abort to abort all active rehearsals

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@vikaslaad
Contributor

@stbenjam it needs another lgtm please

@bradmwilliams
Contributor Author

/unhold

openshift-ci Bot removed the do-not-merge/hold label (Indicates that a PR should not merge because someone has issued a /hold command) May 19, 2023
@stbenjam
Member

/lgtm

openshift-ci Bot added the lgtm label (Indicates that a PR is ready to be merged) May 19, 2023
@openshift-ci
Contributor

openshift-ci Bot commented May 19, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bradmwilliams, stbenjam

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci
Contributor

openshift-ci Bot commented May 19, 2023

@bradmwilliams: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

openshift-merge-robot merged commit 421c921 into openshift:master May 19, 2023
bradmwilliams deleted the rollback-informing-jobs branch May 19, 2023 15:56
wking added a commit to wking/openshift-release that referenced this pull request Jun 1, 2023
The job flavor was originally added in 0837634 (Add
ovn-upgrade-rollback job for 4.7->4.8, 2021-02-24, openshift#16260).  The jobs
have subsequently been cloned forward to new minors as part of the
branching process.  And as older jobs started failing, I'd been
dropping them gradually like 856aab2
(ci-operator/config/openshift/release/openshift-release-master__ci-4.11-upgrade-from-stable-4.10:
Drop failing rollback jobs, 2022-10-11, openshift#33005).  But rounding with
Jamo, the jobs no longer serve a useful role, and as 856aab2 points
out, rollbacks between minor releases are not supported.  Drop the
likely-to-fail and not-useful-even-when-it-passes jobs in their
entirety, so they stop getting cloned forward during branching.

I'm also adjusting the release controller changes from 421c921
(Introducing Rollback informing jobs, 2023-05-19, openshift#39488).  I'm
dropping 4.12 and earlier rollback informers, so we can focus on 4.13
while we feel out the new process.  And I'm pivoting 4.13 away from
the cross-minor job that this pull request drops, and towards the
rollback-oldest-supported job that will help back [1].

[1]: https://issues.redhat.com/browse/OTA-455
openshift-merge-robot pushed a commit that referenced this pull request Jun 7, 2023
…39897)

* ci-operator/config/openshift/release: Drop cross-minor rollback jobs

* hack/validate-release-controller-config: Supplemental Git diff

Because [1]:

  ERROR: The following differences were found:
  3a4
  > 03c544e5d55a55ae9f19d0de7d786341  .//core-services/release-controller/_releases/priv/release-ocp-4.12.json
  35d35
  < 1826a1b520574b66f152f814811c19f6  .//core-services/release-controller/_releases/priv/release-ocp-4.13.json
  42a43
  ...

tells me what files need changing, but not what changes to make to them.

[1]: https://prow.ci.openshift.org/view/gs/origin-ci-test/pr-logs/pull/openshift_release/39897/pull-ci-openshift-release-master-release-controller-config/1664331471080394752

---------

Co-authored-by: wking <wking@penguin>
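
The second commit message above notes that a checksum comparison shows which files need changing but not what to change in them. A minimal Python sketch of that gap follows. It is not the actual hack/validate-release-controller-config script; the function names, file names, and file contents are all hypothetical, chosen only for illustration.

```python
import difflib
import hashlib


def md5_manifest(files):
    """Map each path to the MD5 hex digest of its content."""
    return {path: hashlib.md5(text.encode()).hexdigest() for path, text in files.items()}


def changed_paths(expected, actual):
    """Return paths whose checksum differs, or that exist on only one side.

    This is all a checksum-only comparison can report.
    """
    old, new = md5_manifest(expected), md5_manifest(actual)
    return sorted(p for p in old.keys() | new.keys() if old.get(p) != new.get(p))


def content_diff(expected, actual):
    """Return unified-diff lines for every path that changed_paths reports."""
    lines = []
    for path in changed_paths(expected, actual):
        before = expected.get(path, "").splitlines(keepends=True)
        after = actual.get(path, "").splitlines(keepends=True)
        lines.extend(
            difflib.unified_diff(before, after, fromfile="a/" + path, tofile="b/" + path)
        )
    return lines


if __name__ == "__main__":
    # Hypothetical file contents, not taken from the real repository.
    expected = {"release-a.json": '{"x": 1}\n', "release-b.json": '{"y": 2}\n'}
    actual = {"release-a.json": '{"x": 1}\n', "release-b.json": '{"y": 3}\n'}
    print(changed_paths(expected, actual))
    print("".join(content_diff(expected, actual)), end="")
```

Under these assumptions, the checksum step names the changed file, and the unified diff shows the actual line-level change a contributor would need to make.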
jtaleric pushed a commit to jtaleric/release that referenced this pull request Jun 9, 2023
Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.
rehearsals-ack: Signifies that rehearsal jobs have been acknowledged.
