test: Assess average series rather than max over the test window #25783

smarterclayton · 2021-01-06T22:17:13Z

With the recent increase in cluster metrics, some disruptive tests
can trigger errors that result in a burst of
`cluster_operator_conditions` or alerts series that clear once the
disruption ends. We want to run the full suite after we run a
disruption, and in general we care about the average rather than the
max, so shorten the interval we check to 1h and calculate the average.

When looking at telemetry from 4.7 CI clusters, the disruptive tests
briefly peak at 600 series and then fall to 300 almost immediately
after. Using the average, the total count is closer to 400 over the
hour the tests run, which better represents the goal of the test
(to limit average load, not spikes). The maximum is still checked,
against double the average.
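The reasoning above can be sketched in Go. This is a minimal illustration, not the code changed by this PR: the function names `seriesStats` and `withinLimits`, the limit of 500, and the sample values are all hypothetical. It also assumes one reading of "check the maximum as double the average", namely that the peak is allowed up to twice the limit applied to the average.

```go
package main

import "fmt"

// seriesStats returns the mean and the peak of a set of series-count samples.
// Hypothetical helper for illustration only.
func seriesStats(samples []float64) (avg, peak float64) {
	if len(samples) == 0 {
		return 0, 0
	}
	sum := 0.0
	for _, s := range samples {
		sum += s
		if s > peak {
			peak = s
		}
	}
	return sum / float64(len(samples)), peak
}

// withinLimits reports whether the average stays at or below avgLimit and
// the peak stays at or below double that limit (an assumed interpretation).
func withinLimits(samples []float64, avgLimit float64) bool {
	avg, peak := seriesStats(samples)
	return avg <= avgLimit && peak <= 2*avgLimit
}

func main() {
	// Made-up one-minute samples over a 1h window: a burst at 600 series
	// followed by a steady 300, chosen so the average works out to 400.
	samples := make([]float64, 0, 60)
	for i := 0; i < 60; i++ {
		if i < 20 {
			samples = append(samples, 600)
		} else {
			samples = append(samples, 300)
		}
	}

	avg, peak := seriesStats(samples)
	fmt.Printf("avg=%.0f peak=%.0f ok=%v\n", avg, peak, withinLimits(samples, 500))
}
```

With these made-up samples, a check on the peak alone against a limit of 500 would fail on the 600 burst, while the averaged check passes because the burst is short.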

Resolves failures encountered when attempting to run the disruptive
suite (destroy the cluster and recover) followed by the conformance
suite. A subsequent PR will remove the skip on disruptive.

@marun, @lilic

marun · 2021-01-06T23:25:09Z

/lgtm

openshift-ci-robot · 2021-01-06T23:25:25Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: marun, smarterclayton

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~test/extended/prometheus/OWNERS~~ [smarterclayton]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci · 2021-01-06T23:40:31Z

@smarterclayton: The following test failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-agnostic-cmd | `25a026e` | link | `/test e2e-agnostic-cmd` |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

openshift-bot · 2021-01-06T23:52:14Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-ci-robot requested review from metalmatze and squat January 6, 2021 22:17

openshift-ci-robot added the `approved` label (indicates a PR has been approved by an approver from all required OWNERS files) Jan 6, 2021

smarterclayton mentioned this pull request Jan 6, 2021

test: Allow tests that check invariants over time to be constrained #25784

Merged

openshift-ci-robot assigned marun Jan 6, 2021

openshift-ci-robot added the `lgtm` label (indicates that a PR is ready to be merged) Jan 6, 2021

openshift-merge-robot merged commit 8ca3f31 into openshift:master Jan 7, 2021
