core-services/release-controller/_releases/release-ocp-4.9-ci: Cred-request freeze informer #24177

Merged
openshift-merge-robot merged 3 commits into openshift:master from wking:blocker-4.9-cred-request-freeze
Dec 13, 2021

Member

@wking wking commented Dec 1, 2021

I'm skipping the cooking and optional phases [edit: now starting with the optional/informer phase], because this job should be very reliable, and reverts are cheap if I'm wrong.

This will increase the odds that we notice the #24126 periodic dying before shipping a patch release with an accidental credentials change.

@wking wking force-pushed the blocker-4.9-cred-request-freeze branch from 20f3b35 to 6dfe51d Compare December 1, 2021 22:22
@wking wking force-pushed the blocker-4.9-cred-request-freeze branch 2 times, most recently from 64a5210 to bc478b9 Compare December 8, 2021 17:16
@wking wking changed the title from "core-services/release-controller/_releases/release-ocp-4.9-ci: Require cred-request freeze" to "core-services/release-controller/_releases/release-ocp-4.9-ci: Cred-request freeze informer" Dec 8, 2021
Member Author

wking commented Dec 8, 2021

Internal discussion with @dgoodwin and the technical release team ended up with "no new blockers at the moment, use an informer", so 64a5210407 -> bc478b9bb9 pivots to that. I've also added Slack notifications to ping the patch manager (in charge of monitoring released 4.y CI health) when these fail, similarly to #24387.

@wking wking force-pushed the blocker-4.9-cred-request-freeze branch from bc478b9 to 488903e Compare December 8, 2021 17:21
Contributor

What is this subteam syntax? Will this clearly indicate to ping the hive team?

Member Author

This is pinging the patch manager. From the 488903ef81 commit message:

Syntax described in [1]. SMZ7PJ1L0 is @Patch-Manager.
...
[1]: https://api.slack.com/reference/surfaces/formatting#mentioning-groups
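
To illustrate the syntax in question: Slack's formatting reference describes mentioning a user group as `<!subteam^ID>`, where `ID` is the group's ID. The sketch below builds that markup for the `SMZ7PJ1L0` ID quoted above. The function names, message wording, job name, and URL are hypothetical and are not taken from this pull request's job configuration.

```python
# Hypothetical helper names; only the group ID and the <!subteam^ID> markup
# come from the thread and the linked Slack documentation.
PATCH_MANAGER_GROUP_ID = "SMZ7PJ1L0"  # @Patch-Manager, per the commit message


def subteam_mention(group_id: str) -> str:
    """Return Slack's markup for mentioning a user group by its ID."""
    return f"<!subteam^{group_id}>"


def failure_message(group_id: str, job_name: str, job_url: str) -> str:
    """Compose an illustrative failure notification that pings a user group."""
    return f"{subteam_mention(group_id)}, test job {job_name} failed, see {job_url}"


if __name__ == "__main__":
    # Placeholder job name and URL, for illustration only.
    print(failure_message(PATCH_MANAGER_GROUP_ID, "example-job", "https://example.com/run/1"))
```

Slack resolves the group ID to the group's handle when it renders the message, so a reader sees a mention of the patch manager group instead of the raw ID.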

Contributor

My concern was redirecting the patch manager quickly to the hive team for help figuring out who did what. Do you think that would be a good idea or can we assume the patch manager can read the failure and know exactly who to talk to quickly?

Member Author

I think the patch manager should be able to figure this out, and I've pushed 488903ef81 -> 7e1759c, rebasing on master and printing some context to help with interpreting and acting on failures.

…equest freeze informer

This will increase the odds that we notice the 9e91c0d
(ci-operator/config/openshift/release: Add an
oldest-supported-credential-request job, 2021-11-30, openshift#24126) periodic
dying before shipping a patch release with an accidental credentials
change.

The technical release team wouldn't be watching this 4.9 job, because
they're focused on 4.dev (currently 4.10).  But they are being very
strict about accepting new blocking jobs today, so I'm adding this as
an informer, per [1].

[1]: https://docs.ci.openshift.org/docs/architecture/release-gating/#add-the-job-to-the-release-gating-suite-as-optional
…ze failures

Syntax described in [1].  SMZ7PJ1L0 is @Patch-Manager.

This should get eyeballs on failures by the patch-manager (whose only
remaining job is monitoring released 4.y health), without making the
job formally blocking (which the technical release team isn't on board
with, see 058214e8ec,
core-services/release-controller/_releases/release-ocp-4.9-ci:
Cred-request freeze informer, 2021-12-01, openshift#24177).

[1]: https://api.slack.com/reference/surfaces/formatting#mentioning-groups
…de suggested next steps on failure

The folks responding to a failing job may not be familiar with its
intended purpose.  Give them an overview, and suggest some possible
next-steps, so they can drive resolution themselves, and don't need to
track down a step/job expert to interpret for them.
@wking wking force-pushed the blocker-4.9-cred-request-freeze branch from 488903e to 7e1759c Compare December 11, 2021 07:05
@dgoodwin
Contributor

/lgtm

@dgoodwin
Contributor

/approve

@openshift-ci openshift-ci Bot added the lgtm label (Indicates that a PR is ready to be merged.) Dec 13, 2021
Contributor

openshift-ci Bot commented Dec 13, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dgoodwin, wking

@openshift-ci openshift-ci Bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Dec 13, 2021
Contributor

openshift-ci Bot commented Dec 13, 2021

@wking: all tests passed!

@openshift-merge-robot openshift-merge-robot merged commit 7e603d8 into openshift:master Dec 13, 2021
Contributor

openshift-ci Bot commented Dec 13, 2021

@wking: Updated the following 2 configmaps:

  • job-config-master-periodics configmap in namespace ci at cluster app.ci using the following files:
    • key openshift-release-master-periodics.yaml using file ci-operator/jobs/openshift/release/openshift-release-master-periodics.yaml
  • step-registry configmap in namespace ci at cluster app.ci using the following files:
    • key openshift-credentials-request-freeze-commands.sh using file ci-operator/step-registry/openshift/credentials-request-freeze/openshift-credentials-request-freeze-commands.sh

@wking wking deleted the blocker-4.9-cred-request-freeze branch December 13, 2021 17:58
wking added a commit to wking/openshift-release that referenced this pull request Jan 13, 2022
…y-4.10: Add oldest-* jobs

Pulling e5e2d16 (ci-operator/config/openshift/release: Add 4.9
nightly to 4.9.0 rollback tests, 2021-10-19, openshift#22854) and 9e91c0d
(ci-operator/config/openshift/release: Add an
oldest-supported-credential-request job, 2021-11-30, openshift#24126) forward
into 4.10, now that we have our first feature candidate to pin them to
[1].  We'll keep bumping the pinned version forward until we get to
our first GA 4.10 release.

The bulk of the ci-operator/jobs content is from:

  $ make jobs

But then I manually edited to inject reporter_config, as described in
08db24d (ci-operator/jobs/openshift/release: Ping @Patch-Manager
for cred-freeze failures, 2021-12-08, openshift#24177).

[1]: openshift/cincinnati-graph-data#1360
wking added a commit to wking/openshift-release that referenced this pull request Jul 28, 2022
…y-4.11: Add oldest-* jobs

Pulling e5e2d16 (ci-operator/config/openshift/release: Add 4.9
nightly to 4.9.0 rollback tests, 2021-10-19, openshift#22854) and 9e91c0d
(ci-operator/config/openshift/release: Add an
oldest-supported-credential-request job, 2021-11-30, openshift#24126) forward
into 4.11, now that we have our first feature candidate to pin them to
[1].  We'll keep bumping the pinned version forward until we get to
our first GA 4.11 release.  This is the 4.11 equivalent of 4.10's
923db4f
(ci-operator/config/openshift/release/openshift-release-master__nightly-4.10:
Add oldest-* jobs, 2022-01-12, openshift#25213).

The bulk of the ci-operator/jobs content is from:

  $ make jobs

But then I manually edited to inject reporter_config, as described in
08db24d (ci-operator/jobs/openshift/release: Ping @Patch-Manager
for cred-freeze failures, 2021-12-08, openshift#24177).

I also dropped "the canary", which I'd been copy/pasting around,
because this isn't the canary job.  Without it, messages will be
rendered like:

  @ patch-manager, test job
  periodic-ci-openshift-release-master-nightly-4.11-credentials-request-freeze
  failed, see https://prow.ci.openshift.org/...

where the job name is sufficient context without attempting to echo
some portion of it in the earlier string.

[1]: openshift/cincinnati-graph-data#2001
openshift-merge-robot pushed a commit that referenced this pull request Jul 29, 2022
…y-4.11: Add oldest-* jobs (#30899)

wking added a commit to wking/openshift-release that referenced this pull request Nov 28, 2022
…y-4.12: Add oldest-* jobs

Like b7da7de
(ci-operator/config/openshift/release/openshift-release-master__nightly-4.11:
Add oldest-* jobs, 2022-07-29, openshift#30899), but for 4.12.  I guess we
could have done this back with ec.0, but it's probably good to wait
until these later engineering candidates when the bigger changes have
likely already landed.  We could have waited until early release
candidates, but I don't want to forget ;).

The bulk of the ci-operator/jobs content is from:

  $ make jobs

But then I manually edited to inject reporter_config, as described in
08db24d (ci-operator/jobs/openshift/release: Ping @Patch-Manager
for cred-freeze failures, 2021-12-08, openshift#24177).

Also, I seem to have neglected to actually add the reporter_config
block in 4.11, despite claiming I'd added it in the commit message :/.
Luckily, no changes have slipped in yet, and I'm catching up for that
mistake now.
openshift-merge-robot pushed a commit that referenced this pull request Nov 29, 2022
…y-4.12: Add oldest-* jobs (#33134)

wking added a commit to wking/openshift-release that referenced this pull request Apr 25, 2023
…y-4.13: Add oldest-* jobs

Like f1e912d
(ci-operator/config/openshift/release/openshift-release-master__nightly-4.12:
Add oldest-* jobs, 2022-11-29, openshift#33134), but for 4.13.  I guess we
could have done this back with ec.0, but it's probably good to wait
until later engineering candidates, or in this case, later release
candidates, when the bigger changes have likely already landed.

The bulk of the ci-operator/jobs content is from:

  $ make jobs

But then I manually edited to inject reporter_config, as described in
08db24d (ci-operator/jobs/openshift/release: Ping @Patch-Manager
for cred-freeze failures, 2021-12-08, openshift#24177).
openshift-merge-robot pushed a commit that referenced this pull request Apr 26, 2023
…y-4.13: Add oldest-* jobs (#38728)

ascerra pushed a commit to ascerra/release that referenced this pull request May 8, 2023
…y-4.13: Add oldest-* jobs (openshift#38728)

wking added a commit to wking/openshift-release that referenced this pull request Sep 21, 2023
4.14, because we want to freeze these through the life of 4.14,
following the existing pattern, most recently d666767
(ci-operator/config/openshift/release/openshift-release-master__nightly-4.13:
Add oldest-* jobs, 2023-04-26, openshift#38728).  I'm not pulling in the
rollback job this time, because that's moving under QE and is in
flight separately in [1].

I'm also adding a 4.15 cred-freeze job this time, to catch up with
dbcbb85 (add explanation of blocking jobs in master before service
streams, 2023-09-18, openshift#43418).  As I pointed out in d666767, I'm
still concerned about the amount of churn that I expect will land
during the engineering candidate, but I'm not on the release-oversight
team, and if they prefer having a blocker job in the development
branch with occasional pin bumps, that's fine with me.

The bulk of the ci-operator/jobs content is from:

  $ make jobs

But then I manually edited to inject reporter_config, as described in
08db24d (ci-operator/jobs/openshift/release: Ping @Patch-Manager
for cred-freeze failures, 2021-12-08, openshift#24177).

[1]: openshift#43401