Single node reserve 3GiB of system memory to avoid alerts #17403

Merged
openshift-merge-robot merged 1 commit into openshift:master from omertuc:snosysmem on Apr 22, 2021

omertuc commented Apr 1, 2021

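Running the E2E conformance tests on a single-node cluster triggers the
SystemMemoryExceedsReservation alert (defined [here](https://github.com/openshift/machine-config-operator/blob/8da1e3c21c46c80a54e83824839c6244af69437b/install/0000_90_machine-config-operator_01_prometheus-rules.yaml#L51-L60)).

To let the E2E tests run on a single node without firing this alert, we create a
KubeletConfig installer manifest that raises the reserved system memory from the
(current) default of 1Gi to 3Gi.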
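
For readers unfamiliar with this kind of manifest, here is a minimal sketch of what a step script could write. It is not the actual contents of `single-node-conf-e2e-commands.sh`, which are not shown in this conversation. The `MANIFEST_DIR` variable, the manifest file name, and the `metadata.name` are hypothetical. The sketch also assumes that the master pool carries the usual `pools.operator.machineconfiguration.openshift.io/master` label.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical output location for this sketch; a real CI step would use
# whatever directory the installer reads extra manifests from.
MANIFEST_DIR="${MANIFEST_DIR:-./manifests}"
MANIFEST="${MANIFEST_DIR}/manifest_single-node-kubelet-config.yaml"
mkdir -p "${MANIFEST_DIR}"

# Write a KubeletConfig that reserves 3Gi of memory for system daemons.
# The quoted 'EOF' keeps the shell from expanding anything inside the manifest.
cat > "${MANIFEST}" <<'EOF'
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: single-node-reserve-sys-mem  # hypothetical name
spec:
  machineConfigPoolSelector:
    matchLabels:
      # Assumption: on a single-node cluster the only node is in the master pool.
      pools.operator.machineconfiguration.openshift.io/master: ""
  kubeletConfig:
    systemReserved:
      memory: 3Gi
EOF

echo "Wrote ${MANIFEST}"
```

Because the alert compares system memory usage against the reservation, raising `systemReserved.memory` raises the threshold instead of reducing usage. That suits CI, but it also shrinks the memory allocatable to pods by the same amount.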

omertuc changed the title from "Single node reserve 3GiB of system memory to avoid alerts" to "WIP - Single node reserve 3GiB of system memory to avoid alerts" on Apr 1, 2021
openshift-ci-robot added the do-not-merge/work-in-progress and approved labels on Apr 1, 2021
omertuc commented Apr 1, 2021

/test pj-rehearse

omertuc commented Apr 1, 2021

/test pj-rehearse

@romfreiman

/test pj-rehearse

@smarterclayton

/retest

@romfreiman

/retest

@praveenkumar

/retest

omertuc commented Apr 3, 2021

/test pj-rehearse

omertuc commented Apr 6, 2021

It seems that this is currently failing because of an MCO bug, probably related to automatic memory sizing.

openshift-ci Bot commented Apr 12, 2021

@omertuc: PR needs rebase.

openshift-ci Bot added the needs-rebase label on Apr 12, 2021
openshift-ci-robot removed the needs-rebase label on Apr 13, 2021
omertuc commented Apr 13, 2021

Waiting for openshift/machine-config-operator#2517

omertuc commented Apr 20, 2021

/retest

omertuc changed the title from "WIP - Single node reserve 3GiB of system memory to avoid alerts" to "Single node reserve 3GiB of system memory to avoid alerts" on Apr 20, 2021
openshift-ci-robot removed the do-not-merge/work-in-progress label on Apr 20, 2021
omertuc commented Apr 20, 2021

/retest

omertuc commented Apr 20, 2021

/retest

openshift-ci Bot commented Apr 20, 2021

@omertuc: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/rehearse/periodic-ci-openshift-release-master-ci-4.8-e2e-aws-upgrade-single-node | 7735c01f1a48ec738da5cecc70691a3c9a0dec9a | link | /test pj-rehearse |
| ci/rehearse/periodic-ci-openshift-release-master-nightly-4.8-e2e-aws-single-node | 7735c01f1a48ec738da5cecc70691a3c9a0dec9a | link | /test pj-rehearse |
| ci/rehearse/openshift/machine-config-operator/release-4.9/e2e-gcp-single-node | 7735c01f1a48ec738da5cecc70691a3c9a0dec9a | link | /test pj-rehearse |
| ci/prow/pj-rehearse | 7735c01f1a48ec738da5cecc70691a3c9a0dec9a | link | /test pj-rehearse |

@romfreiman

/lgtm

@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: omertuc, romfreiman

openshift-ci-robot added the lgtm label on Apr 22, 2021
openshift-merge-robot merged commit a340397 into openshift:master on Apr 22, 2021
@openshift-ci-robot

@omertuc: Updated the step-registry configmap in namespace ci at cluster app.ci using the following files:

  • key openshift-e2e-aws-single-node-workflow.yaml using file ci-operator/step-registry/openshift/e2e/aws/single-node/openshift-e2e-aws-single-node-workflow.yaml
  • key openshift-e2e-gcp-single-node-workflow.yaml using file ci-operator/step-registry/openshift/e2e/gcp/single-node/openshift-e2e-gcp-single-node-workflow.yaml
  • key OWNERS using file ci-operator/step-registry/single-node/conf/e2e/OWNERS
  • key single-node-conf-e2e-commands.sh using file ci-operator/step-registry/single-node/conf/e2e/single-node-conf-e2e-commands.sh
  • key single-node-conf-e2e-ref.metadata.json using file ci-operator/step-registry/single-node/conf/e2e/single-node-conf-e2e-ref.metadata.json
  • key single-node-conf-e2e-ref.yaml using file ci-operator/step-registry/single-node/conf/e2e/single-node-conf-e2e-ref.yaml