Conversation

@hasbro17
Contributor

Resolves ETCD-296
This moves the etcd vertical scaling test to its own test suite to remove noise from the serial jobs: the disruption from adding and removing a node can have downstream effects on other tests and invariants in the serial suite.

While the scaling feature itself is stable, we still need the early/late invariants so we can troubleshoot this test in isolation without blocking the serial suite.

Also requires a new workflow and presubmits to run this test on the different platforms.
Blocked by openshift/release#32623

@hasbro17
Contributor Author

/hold

@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Sep 26, 2022
@hasbro17
Contributor Author

/cc @tjungblu @dgoodwin

@stbenjam
Member

You'll need to run hack/update-generated.sh to update the annotations

@hasbro17 hasbro17 force-pushed the move-etcd-scaling-test branch from 4b0dcf6 to aeb0034 Compare September 26, 2022 21:51
@stbenjam
Member

Looks correct to me

/lgtm

@openshift-ci openshift-ci bot added the lgtm and approved labels Sep 26, 2022
@tjungblu
Contributor

yep, also

/lgtm

@tjungblu
Contributor

/retest-required

@stbenjam
Member

@hasbro17 If this is ready to go can you remove the WIP from the title?

@hasbro17 hasbro17 changed the title WIP: Add etcd vertical scaling test suite Add etcd vertical scaling test suite Sep 27, 2022
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress label Sep 27, 2022
@hasbro17
Contributor Author

/unhold

I guess we can merge this first then and test it out in the release PR.

@openshift-ci openshift-ci bot removed the do-not-merge/hold label Sep 27, 2022
@hasbro17
Contributor Author

/retest-required

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 31c1187 and 2 for PR HEAD aeb0034 in total

@hasbro17
Contributor Author

/retest-required

@tjungblu
Contributor

/retest-required

@tjungblu
Contributor

/retest-required

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD d2336da and 1 for PR HEAD aeb0034 in total

@tjungblu
Contributor

/retest-required

@openshift-ci openshift-ci bot removed the approved label Sep 30, 2022
@hasbro17 hasbro17 force-pushed the move-etcd-scaling-test branch from aeb0034 to ae4f791 Compare September 30, 2022 20:48
@hasbro17
Contributor Author

/test ci/prow/e2e-aws-ovn-serial

@openshift-ci
Contributor

openshift-ci bot commented Sep 30, 2022

@hasbro17: The specified target(s) for /test were not found.
The following commands are available to trigger required jobs:

  • /test e2e-aws-image-registry
  • /test e2e-aws-jenkins
  • /test e2e-aws-ovn-fips
  • /test e2e-aws-ovn-serial
  • /test e2e-gcp-builds
  • /test e2e-gcp-image-ecosystem
  • /test e2e-gcp-ovn
  • /test e2e-gcp-ovn-upgrade
  • /test extended_gssapi
  • /test extended_ldap_groups
  • /test extended_networking
  • /test images
  • /test lint
  • /test verify
  • /test verify-deps

The following commands are available to trigger optional jobs:

  • /test e2e-agnostic-ovn-cmd
  • /test e2e-aws
  • /test e2e-aws-csi
  • /test e2e-aws-csi-migration
  • /test e2e-aws-disruptive
  • /test e2e-aws-multitenant
  • /test e2e-aws-ovn
  • /test e2e-aws-ovn-cgroupsv2
  • /test e2e-aws-ovn-single-node
  • /test e2e-aws-ovn-single-node-serial
  • /test e2e-aws-ovn-single-node-upgrade
  • /test e2e-aws-proxy
  • /test e2e-aws-upgrade
  • /test e2e-azure
  • /test e2e-gcp-csi
  • /test e2e-gcp-disruptive
  • /test e2e-gcp-fips-serial
  • /test e2e-gcp-ovn-rt-upgrade
  • /test e2e-metal-ipi
  • /test e2e-metal-ipi-ovn-dualstack
  • /test e2e-metal-ipi-ovn-ipv6
  • /test e2e-metal-ipi-serial
  • /test e2e-metal-ipi-serial-ovn-ipv6
  • /test e2e-metal-ipi-virtualmedia
  • /test e2e-openstack
  • /test e2e-openstack-serial
  • /test e2e-vsphere
  • /test okd-e2e-gcp

Use /test all to run the following jobs that were automatically triggered:

  • pull-ci-openshift-origin-master-e2e-agnostic-ovn-cmd
  • pull-ci-openshift-origin-master-e2e-aws-csi
  • pull-ci-openshift-origin-master-e2e-aws-ovn-cgroupsv2
  • pull-ci-openshift-origin-master-e2e-aws-ovn-fips
  • pull-ci-openshift-origin-master-e2e-aws-ovn-serial
  • pull-ci-openshift-origin-master-e2e-aws-ovn-single-node
  • pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-serial
  • pull-ci-openshift-origin-master-e2e-aws-ovn-single-node-upgrade
  • pull-ci-openshift-origin-master-e2e-gcp-builds
  • pull-ci-openshift-origin-master-e2e-gcp-csi
  • pull-ci-openshift-origin-master-e2e-gcp-ovn
  • pull-ci-openshift-origin-master-e2e-gcp-ovn-rt-upgrade
  • pull-ci-openshift-origin-master-e2e-gcp-ovn-upgrade
  • pull-ci-openshift-origin-master-e2e-metal-ipi-ovn-ipv6
  • pull-ci-openshift-origin-master-images
  • pull-ci-openshift-origin-master-lint
  • pull-ci-openshift-origin-master-verify
  • pull-ci-openshift-origin-master-verify-deps
Details

In response to this:

/test ci/prow/e2e-aws-ovn-serial

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@hasbro17
Contributor Author

/test e2e-aws-ovn-serial
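
The bot's reply above rejected `/test ci/prow/e2e-aws-ovn-serial` and listed `/test e2e-aws-ovn-serial` among the valid commands, which suggests that `/test` expects the short job name rather than the full status context. A minimal sketch of that mapping follows, assuming the only difference is the `ci/prow/` prefix. The helper `rerun_command` is hypothetical and is not part of Prow:

```python
# Hypothetical helper for illustration only; not part of Prow or openshift-ci.
PROW_CONTEXT_PREFIX = "ci/prow/"


def rerun_command(context):
    """Turn a status context such as 'ci/prow/e2e-aws-ovn-serial'
    into the '/test <job>' comment form the bot lists as valid."""
    if context.startswith(PROW_CONTEXT_PREFIX):
        job = context[len(PROW_CONTEXT_PREFIX):]
    else:
        # Assume the caller already passed the short job name.
        job = context
    return "/test " + job


print(rerun_command("ci/prow/e2e-aws-ovn-serial"))
```

The `/override` command later in this thread is used with the full `ci/prow/...` context, so the two commands appear to take different forms of the name.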

@hasbro17
Contributor Author

hasbro17 commented Oct 2, 2022

/retest-required

@stbenjam
Member

stbenjam commented Oct 2, 2022

/hold cancel
/lgtm

@openshift-ci openshift-ci bot added the lgtm label and removed the do-not-merge/hold label Oct 2, 2022
@openshift-ci
Contributor

openshift-ci bot commented Oct 2, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hasbro17, stbenjam, tjungblu

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Oct 2, 2022
@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 1000ea0 and 2 for PR HEAD ae4f791 in total

@hasbro17
Contributor Author

hasbro17 commented Oct 3, 2022

Seems like infra setup failures
/retest-required

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 9edb6c7 and 1 for PR HEAD ae4f791 in total

@tjungblu
Contributor

tjungblu commented Oct 4, 2022

/retest-required

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD efd2179 and 0 for PR HEAD ae4f791 in total

@hasbro17
Contributor Author

hasbro17 commented Oct 4, 2022

The ci/prow/e2e-gcp-ovn-upgrade failure again seems unrelated to this change; it shows readiness probes failing:

 [sig-network] there should be nearly zero single second disruptions for ns/openshift-authentication route/oauth-openshift disruption/ingress-to-oauth-server connection/new

/retest-required

@openshift-ci
Contributor

openshift-ci bot commented Oct 5, 2022

@hasbro17: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aws-ovn-single-node-serial | ae4f791 | link | false | /test e2e-aws-ovn-single-node-serial |
| ci/prow/e2e-aws-ovn-cgroupsv2 | ae4f791 | link | false | /test e2e-aws-ovn-cgroupsv2 |
| ci/prow/e2e-gcp-ovn-rt-upgrade | ae4f791 | link | false | /test e2e-gcp-ovn-rt-upgrade |
| ci/prow/e2e-metal-ipi-ovn-ipv6 | ae4f791 | link | false | /test e2e-metal-ipi-ovn-ipv6 |
| ci/prow/e2e-aws-ovn-single-node-upgrade | ae4f791 | link | false | /test e2e-aws-ovn-single-node-upgrade |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-ci-robot

/hold

Revision ae4f791 was retested 3 times: holding

@openshift-ci openshift-ci bot added the do-not-merge/hold label Oct 5, 2022
@tjungblu
Contributor

tjungblu commented Oct 5, 2022

/retest-required

@tjungblu
Contributor

tjungblu commented Oct 5, 2022

/hold cancel

mostly OVN flakes

@openshift-ci openshift-ci bot removed the do-not-merge/hold label Oct 5, 2022
@stbenjam
Member

stbenjam commented Oct 5, 2022

/override ci/prow/e2e-gcp-ovn
/skip

@openshift-ci
Contributor

openshift-ci bot commented Oct 5, 2022

@stbenjam: Overrode contexts on behalf of stbenjam: ci/prow/e2e-gcp-ovn

Details

In response to this:

/override ci/prow/e2e-gcp-ovn
/skip

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

"[Top Level] [sig-etcd][Feature:DisasterRecovery][Disruptive] [Feature:EtcdRecovery] Cluster should restore itself after quorum loss [apigroup:machine.openshift.io][apigroup:operator.openshift.io]": "[Feature:EtcdRecovery] Cluster should restore itself after quorum loss [apigroup:machine.openshift.io][apigroup:operator.openshift.io] [Serial]",

"[Top Level] [sig-etcd][Serial] etcd [apigroup:config.openshift.io] is able to vertically scale up and down with a single node [Timeout:60m][apigroup:machine.openshift.io]": "is able to vertically scale up and down with a single node [Timeout:60m][apigroup:machine.openshift.io] [Suite:openshift/conformance/serial]",
"[Top Level] [sig-etcd][Feature:EtcdVerticalScaling] etcd [apigroup:config.openshift.io] is able to vertically scale up and down with a single node [Timeout:60m][apigroup:machine.openshift.io]": "is able to vertically scale up and down with a single node [Timeout:60m][apigroup:machine.openshift.io] [Suite:openshift/conformance/parallel]",
Member

This change made this test run in [Suite:openshift/conformance/parallel] instead of the intended new etcd suite. It's causing jobs to fail, so I'm going to have to revert this change.
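
A minimal sketch of how label-based suite assignment could produce this outcome, assuming (this thread does not confirm it) that the annotation step falls back to the parallel suite whenever a test name carries no explicit `[Suite:...]` tag and none of the `[Serial]`, `[Disruptive]`, or `[Slow]` labels. The function `assign_suite` and the label list are hypothetical and are not origin's actual annotation code:

```python
import re

# Labels that, in this sketch, keep a test out of the conformance suites.
# Assumption for illustration; not origin's real rule set.
NO_SUITE_LABELS = ("[Disruptive]", "[Slow]")


def assign_suite(test_name):
    """Return the suite tag this sketch would append to a test name."""
    explicit = re.search(r"\[Suite:[^\]]+\]", test_name)
    if explicit:
        # An explicit suite tag in the name wins.
        return explicit.group(0)
    if any(label in test_name for label in NO_SUITE_LABELS):
        return ""
    if "[Serial]" in test_name:
        return "[Suite:openshift/conformance/serial]"
    # No distinguishing label: fall back to the parallel suite.
    return "[Suite:openshift/conformance/parallel]"


TAIL = (" etcd [apigroup:config.openshift.io] is able to vertically scale up"
        " and down with a single node [Timeout:60m][apigroup:machine.openshift.io]")
old_name = "[sig-etcd][Serial]" + TAIL
new_name = "[sig-etcd][Feature:EtcdVerticalScaling]" + TAIL

print(assign_suite(old_name))
print(assign_suite(new_name))
```

Under that assumption, replacing `[Serial]` with `[Feature:EtcdVerticalScaling]` without also adding an explicit suite tag or an excluding label moves the test into the parallel suite, which matches the two annotation entries quoted above.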

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.

5 participants