[release-4.13] OCPBUGS-38254: [CARRY] perform operator apiService certificate validity checks directly (#836)
Conversation
The issue for this is open for years and it's not super interesting to go debug it. The test threads will exit when the test process does. Having teardown fail means none of the other tests run for me.

Signed-off-by: Steve Kuznetsov <skuznets@redhat.com>
Upstream-repository: operator-lifecycle-manager
Upstream-commit: b683c28b31ad12f8acb8f7fd4d7beb85c74a751f
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ankitathomas

The full list of commands accepted by this bot can be found here. The pull request process is described here.
@ankitathomas: This pull request references Jira Issue OCPBUGS-38254, which is valid. The bug has been moved to the POST state. 7 validation(s) were run on this bug.

No GitHub users were found matching the public email listed for the QA contact in Jira (jiazha@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
@ankitathomas: This pull request references Jira Issue OCPBUGS-38254, which is valid. 7 validation(s) were run on this bug.

No GitHub users were found matching the public email listed for the QA contact in Jira (jiazha@redhat.com), skipping review request.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
Hi @ankitathomas, I got the below error when building a cluster with this PR, could you help have a look when you get a chance? Thanks! https://prow.ci.openshift.org/view/gs/test-platform-results/logs/release-openshift-origin-installer-launch-gcp-modern/1822827253600358400

Go compliance shim [6124] [rhel-8-golang-1.19][openshift-golang-builder]: invoking real go binary
# github.com/operator-framework/operator-lifecycle-manager/pkg/controller/install
vendor/github.com/operator-framework/operator-lifecycle-manager/pkg/controller/install/certresources.go:280:19: undefined: sets.New
Go compliance shim [6124] [rhel-8-golang-1.19][openshift-golang-builder]: Exited with: 1
make[1]: Leaving directory '/build'
make[1]: *** [Makefile:79: github.com/operator-framework/operator-lifecycle-manager/cmd/catalog] Error 1
make: *** [Makefile:67: build/olm] Error 2
error: build error: building at STEP "RUN make build/olm bin/cpb": while running runtime: exit status 2
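The `undefined: sets.New` failure above happens because the generic `sets.New[T]` constructor only exists in newer `k8s.io/apimachinery` releases than the one vendored on this release branch; older code uses `sets.NewString`. A dependency-free sketch of the string-set shape involved (illustrative only, not the apimachinery implementation):

```go
package main

import (
	"fmt"
	"sort"
)

// Set is a minimal string set standing in for apimachinery's generic
// sets.New[string]. On branches whose vendored apimachinery predates
// generics support, only sets.NewString exists, which is why the
// backported call to sets.New fails to compile here.
type Set map[string]struct{}

// New builds a set from its arguments, deduplicating as it goes.
func New(items ...string) Set {
	s := Set{}
	for _, i := range items {
		s[i] = struct{}{}
	}
	return s
}

// Has reports membership.
func (s Set) Has(item string) bool { _, ok := s[item]; return ok }

// List returns the members in sorted order for stable output.
func (s Set) List() []string {
	out := make([]string, 0, len(s))
	for i := range s {
		out = append(out, i)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Hypothetical hostnames, mirroring the kind of cert-hostname
	// tracking the carried commit uses sets for.
	hosts := New("olm-service.olm.svc", "olm-service.olm.svc.cluster.local")
	fmt.Println(hosts.Has("olm-service.olm.svc"))
	fmt.Println(hosts.List())
}
```

The fix on a branch like this is typically either bumping the vendored apimachinery or rewriting the call in terms of `sets.NewString`.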
Force-push: a7bd263 to 840c367
Tests pass, details: https://issues.redhat.com/browse/OCPBUGS-38254
Force-push: 840c367 to 493cfa9

/retest
Force-push: 493cfa9 to cb34d66
/retest

1 similar comment

/retest

/test e2e-gcp-olm-flaky

/retest

/test e2e-gcp-olm

/test e2e-gcp-olm-flaky

/test e2e-gcp-olm
level=error msg=Cluster operator kube-scheduler Degraded is True with MissingStaticPodController_SyncError::StaticPods_Error: MissingStaticPodControllerDegraded: static pod lifecycle failure - static pod: "openshift-kube-scheduler" in namespace: "openshift-kube-scheduler" for revision: 7 on node: "ci-op-di2mvmr9-ae41b-g86vb-master-1" didn't show up, waited: 3m0s
level=error msg=StaticPodsDegraded: pod/openshift-kube-scheduler-ci-op-di2mvmr9-ae41b-g86vb-master-1 container "kube-scheduler" is terminated: Completed:
level=error msg=StaticPodsDegraded: pod/openshift-kube-scheduler-ci-op-di2mvmr9-ae41b-g86vb-master-1 container "kube-scheduler-cert-syncer" is terminated: Error: go:169: Failed to watch *v1.Secret: failed to list *v1.Secret: Get "https://localhost:6443/api/v1/namespaces/openshift-kube-scheduler/secrets?limit=500&resourceVersion=0": x509: certificate signed by unknown authority
level=error msg=StaticPodsDegraded: W0816 10:51:46.493141 1 reflector.go:424] k8s.io/client-go@v0.26.10/tools/cache/reflector.go:169: failed to list *v1.ConfigMap: Get "https://localhost:6443/api/v1/namespaces/openshift-kube-scheduler/configmaps?limit=500&resourceVersion=0": x509: certificate signed by unknown authority
...
@ankitathomas e2e is failing because of another fetchCSV call with the wrong parameter ordering (namespace <-> name). The flake ("blocks a CRD upgrade that could cause data loss") is a perma-fail. While I think the workload is still being protected (i.e. the upgrade doesn't go through), it seems the InstallPlan isn't getting updated with the error (at least not in the right place), or isn't being put in the right state after detecting it?
Eventually(func() error {
	// Fetch the current csv
	fetchedCSV, err := fetchCSV(crc, csv.Name, generatedNamespace.GetName(), csvSucceededChecker)

Suggested change:
- fetchedCSV, err := fetchCSV(crc, csv.Name, generatedNamespace.GetName(), csvSucceededChecker)
+ fetchedCSV, err := fetchCSV(crc, generatedNamespace.GetName(), csv.Name, csvSucceededChecker)
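The suggested change above swaps two arguments of the same type, `string`, which is exactly why the compiler could not catch the original bug. A hypothetical Go sketch (fetchCSV's real signature lives in the e2e suite; the named types and values here are illustrative, not OLM's) of how distinct string types would surface the swap at compile time:

```go
package main

import "fmt"

// Namespace and CSVName are hypothetical defined types. With plain strings,
// fetchCSV(crc, name, namespace, ...) compiles cleanly and only fails at
// runtime, as happened in this PR; with defined types the swap is a
// compile-time error.
type Namespace string
type CSVName string

// fetchCSV is a stub standing in for the e2e helper; it just renders the
// lookup key so the argument order is visible.
func fetchCSV(ns Namespace, name CSVName) string {
	return fmt.Sprintf("%s/%s", ns, name)
}

func main() {
	fmt.Println(fetchCSV(Namespace("olm-e2e"), CSVName("my-csv.v1.0.0")))
	// The swapped call below would not compile:
	// fetchCSV(CSVName("my-csv.v1.0.0"), Namespace("olm-e2e"))
}
```

This is a trade-off rather than a rule: defined types add conversion noise at call sites, which is presumably why the e2e helpers use plain strings.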
Some of the e2e loops are a bit flaky; make them more robust:
* Retry on certain errors
* Use Eventually() consistently (avoid wait.Poll())
* Clarify logging
* Reduce some logging
* Change csvExists() to waitForCsvToDelete(), as that's how it's used
* Change awaitCSV() to fetchCSV()

Signed-off-by: Todd Short <todd.short@me.com>
Upstream-repository: operator-lifecycle-manager
Upstream-commit: e5f7320f29ee4e9def114d6dcc1d22b4c7bb2b0d
Fix #3151. Remove non-InstallPlan related checks for this test. Also:
* Clean up some looping log messages
* Clean up some logging added when comments were converted

These comments/logs are at the beginning of the test, and are also part of the test sequence, so they are redundant (and possibly confusing).

Signed-off-by: Todd Short <todd.short@me.com>
Upstream-repository: operator-lifecycle-manager
Upstream-commit: 5299830576c8e8e6cd728b08a3a2e60f212ba387
perform operator apiService certificate validity checks directly (#3217)

* perform operator apiService certificate validity checks directly
* use sets to track certs to install, revert to checking for installPlan timeouts after API availability checks, add service FQDN to list of hostnames

Signed-off-by: Ankita Thomas <ankithom@redhat.com>
Upstream-repository: operator-lifecycle-manager
Upstream-commit: 908da0c05363da40ad09ab774d9904b22aca7869
Force-push: cb34d66 to c7ddc4f
/retest

1 similar comment

/retest

/test e2e-gcp-olm-flaky
It failed at:

Summarizing 1 Failure:
[FAIL] CRD Versions [It] [FLAKE] blocks a CRD upgrade that could cause data loss
/go/src/github.com/openshift/operator-framework-olm/staging/operator-lifecycle-manager/test/e2e/crd_e2e_test.go:275
Ran 8 of 199 Specs in 305.024 seconds
FAIL! -- 7 Passed | 1 Failed | 2 Pending | 189 Skipped
--- FAIL: TestEndToEnd (305.04s)
That is a known flake.

/retest

/test e2e-gcp-olm-flaky
Hi @tmshort, I guess it needs the backport-risk-assessed label.

/label backport-risk-assessed
@ankitathomas: Jira Issue OCPBUGS-38254: All pull requests linked via external trackers have merged. Jira Issue OCPBUGS-38254 has been moved to the MODIFIED state.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
[ART PR BUILD NOTIFIER] Distgit: operator-lifecycle-manager

[ART PR BUILD NOTIFIER] Distgit: operator-registry

/cherrypick release-4.12
@grokspawn: new pull request created: #866

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Manual cherry-pick of #821 to 4.14