Flake: Metrics are generated for OLM managed resources/a CSV is created/the OLM pod restarts #2390

@timflannagan

#2216 fixed a bug where the csv_succeeded metric was lost across OLM deployment pod restarts. Those changes added a new metrics e2e test that restarts the olm-operator deployment (scaling it down and back up) to check that the metric is retained after the restart:

			When("the OLM pod restarts", func() {
				BeforeEach(func() {
					restartDeploymentWithLabel(c, "app=olm-operator")
				})
				It("CSV metric is preserved", func() {
					Expect(getMetricsFromPod(c, getPodWithLabel(c, "app=olm-operator"))).To(
						ContainElement(LikeMetric(WithFamily("csv_succeeded"), WithName(csv.Name), WithValue(1))),
					)
				})
			})

It looks like restartDeploymentWithLabel(...) doesn't have enough safeguards to verify that the restarted deployment is ready and available. This makes the e2e test flaky on more heavily loaded clusters, because we attempt to grab the metric before the metrics endpoint has been set up and is ready to serve traffic.
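
One possible shape for the missing safeguard is to poll the deployment status until the rollout has completed before returning from the restart helper. The sketch below is only an illustration under stated assumptions, not the fix that landed: `deploymentStatus`, `rolledOut`, `waitForRollout`, and `errNotReady` are hypothetical names, and the struct is a trimmed-down stand-in for the generation and replica fields of a real `appsv1.Deployment`. It uses only the standard library so it can be read on its own.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// deploymentStatus is a hypothetical stand-in for the Deployment fields
// that matter for a rollout check.
type deploymentStatus struct {
	Generation         int64 // metadata.generation
	ObservedGeneration int64 // status.observedGeneration
	DesiredReplicas    int32 // spec.replicas
	UpdatedReplicas    int32 // status.updatedReplicas
	AvailableReplicas  int32 // status.availableReplicas
}

// rolledOut reports whether the controller has observed the latest spec and
// every desired replica is both updated and available.
func rolledOut(s deploymentStatus) bool {
	return s.ObservedGeneration >= s.Generation &&
		s.DesiredReplicas > 0 &&
		s.UpdatedReplicas == s.DesiredReplicas &&
		s.AvailableReplicas == s.DesiredReplicas
}

// errNotReady is returned when the deadline passes before the rollout completes.
var errNotReady = errors.New("deployment not ready before timeout")

// waitForRollout calls get every interval until rolledOut is true, get
// returns an error, or timeout elapses. get is a hypothetical callback that
// would wrap the real API call in the e2e suite.
func waitForRollout(timeout, interval time.Duration, get func() (deploymentStatus, error)) error {
	deadline := time.Now().Add(timeout)
	for {
		s, err := get()
		if err != nil {
			return fmt.Errorf("fetching deployment status: %w", err)
		}
		if rolledOut(s) {
			return nil
		}
		if time.Now().After(deadline) {
			return errNotReady
		}
		time.Sleep(interval)
	}
}
```

In the e2e suite, restartDeploymentWithLabel(...) could call something like `waitForRollout` after scaling back up, with `get` reading the deployment through the test client. Even with an available deployment, the metrics endpoint may lag slightly, so wrapping the metric assertion in Gomega's `Eventually` may also be worth considering.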

Labels: kind/bug