
ci-operator/templates/openshift: m4.xlarge compute nodes #4027

Merged
openshift-merge-robot merged 1 commit into openshift:master from wking:larger-workers
Jun 11, 2019

Conversation

@wking
Member

@wking wking commented Jun 11, 2019

Prometheus started making memory requests with openshift/prometheus-operator@cda68a3f (Merge pull request openshift/prometheus-operator#30 from paulfantom/merge-release-0.30.1, 2019-06-04):

$ diff -u <(curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-serial-4.2/591/artifacts/e2e-aws-serial/pods.json | jq '.items[] | select(.metadata.name | contains("prometheus")) | {name: .metadata.name, resources: [.spec.containers[].resources | select((. | length) > 0)]}') <(curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-serial-4.2/592/artifacts/e2e-aws-serial/pods.json | jq '.items[] | select(.metadata.name | contains("prometheus")) | {name: .metadata.name, resources: [.spec.containers[].resources | select((. | length) > 0)]}')
--- /dev/fd/63    2019-06-04 14:10:31.908436038 -0700
+++ /dev/fd/62    2019-06-04 14:10:31.908436038 -0700
@@ -1,5 +1,5 @@
{
-  "name": "prometheus-adapter-5f78cc955d-2899k",
+  "name": "prometheus-adapter-64f4f64b7-pvmhn",
  "resources": [
    {
      "requests": {
@@ -10,7 +10,7 @@
  ]
}
{
-  "name": "prometheus-adapter-5f78cc955d-2rlnx",
+  "name": "prometheus-adapter-64f4f64b7-tgnld",
  "resources": [
    {
      "requests": {
@@ -22,14 +22,56 @@
}
{
  "name": "prometheus-k8s-0",
-  "resources": []
+  "resources": [
+    {
+      "limits": {
+        "cpu": "100m",
+        "memory": "25Mi"
+      },
+      "requests": {
+        "cpu": "100m",
+        "memory": "25Mi"
+      }
+    },
+    {
+      "limits": {
+        "cpu": "100m",
+        "memory": "25Mi"
+      },
+      "requests": {
+        "cpu": "100m",
+        "memory": "25Mi"
+      }
+    }
+  ]
}
{
  "name": "prometheus-k8s-1",
-  "resources": []
+  "resources": [
+    {
+      "limits": {
+        "cpu": "100m",
+        "memory": "25Mi"
+      },
+      "requests": {
+        "cpu": "100m",
+        "memory": "25Mi"
+      }
+    },
+    {
+      "limits": {
+        "cpu": "100m",
+        "memory": "25Mi"
+      },
+      "requests": {
+        "cpu": "100m",
+        "memory": "25Mi"
+      }
+    }
+  ]
}
{
-  "name": "prometheus-operator-68f7b6bd55-hmqtj",
+  "name": "prometheus-operator-d8745bf44-l9khn",
  "resources": [
    {
      "requests": {

With that change, our nodes no longer satisfied the assumptions that the SchedulerPreemption tests make about the scheduled load on test nodes (i.e. less than 40% of capacity is scheduled). openshift/origin@13b6d0e4a7 (test/e2e: scheduling: disable preemption tests, 2019-06-04, openshift/origin#23029) disabled the tests, but this change takes the alternative temporary workaround of bumping our node capacity to re-satisfy the existing tests' assumptions.
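The 40% assumption can be checked directly from the same pods.json data, in the jq style used above. This is a minimal sketch, not the tests' actual implementation: the inlined pod data, the application of the 40% figure to CPU only, and the 4000m allocatable figure for an m4.xlarge are all illustrative assumptions.

```shell
# Hypothetical sketch of the SchedulerPreemption assumption: total CPU
# requested on a node should stay under 40% of allocatable capacity.
# Pod data is inlined here; against a live cluster you would feed in
# 'kubectl get pods --all-namespaces -o json' instead.
cat <<'EOF' > /tmp/pods.json
{"items": [{"spec": {"containers": [
  {"resources": {"requests": {"cpu": "100m"}}},
  {"resources": {"requests": {"cpu": "250m"}}},
  {"resources": {}}
]}}]}
EOF
# Sum the millicore requests; containers without requests count as 0m.
requested_m=$(jq '[.items[].spec.containers[].resources.requests.cpu? // "0m"
                   | sub("m$"; "") | tonumber] | add' /tmp/pods.json)
allocatable_m=4000  # assumption: m4.xlarge has 4 vCPUs, ~4000m before reservations
ceiling_m=$((allocatable_m * 40 / 100))
echo "requested ${requested_m}m of ${allocatable_m}m (preemption-test ceiling: ${ceiling_m}m)"
```

With the sample data this reports 350m requested against a 1600m ceiling; the real tests also account for memory and for system-reserved capacity.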

We have sufficient capacity for doubling our xlarge consumption:

$ export AWS_PROFILE=ci
$ aws --region us-east-1 support describe-trusted-advisor-checks --language en --query "checks[? category == 'service_limits'].{id: @.id, name: @.name}" --output text | grep 'EC2 On-Demand Instances'
0Xc6LMYG8P   EC2 On-Demand Instances
$ AWS_PROFILE=ci aws --region us-east-1 support describe-trusted-advisor-check-result --check-id 0Xc6LMYG8P --query "join(\`\\n\`, result.flaggedResources[].join(\`\\t\`, [@.metadata[4] || '0', @.metadata[3], @.region || '-', '0Xc6LMYG8P', @.metadata[2]]))" --output text
91  3000  us-east-1  0Xc6LMYG8P  On-Demand instances - m4.large
97  3000  us-east-1  0Xc6LMYG8P  On-Demand instances - m4.xlarge
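The headroom claim reduces to simple arithmetic; a minimal sketch, with the in-use and limit numbers copied from the Trusted Advisor output above and "doubling" taken as the scenario named in the text:

```shell
# Rough headroom check: 97 m4.xlarge instances in use against a
# 3000-instance on-demand limit, so even doubling consumption stays
# far below the quota.
limit=3000
in_use=97
projected=$((in_use * 2))
echo "projected m4.xlarge usage: ${projected} of ${limit}"
test "$projected" -lt "$limit" && echo "headroom OK"
```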

@openshift-ci-robot openshift-ci-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Jun 11, 2019
@openshift-ci-robot
Contributor

@wking: The following tests failed, say /retest to rerun them all:

Test name                                                     Commit   Rerun command
ci/rehearse/openshift/installer/master/e2e-aws-upi            7eb64eb  /test pj-rehearse
ci/rehearse/openshift/installer/master/e2e-aws-scaleup-rhel7  7eb64eb  /test pj-rehearse
ci/rehearse/openshift/installer/master/e2e-vsphere            7eb64eb  /test pj-rehearse
ci/prow/pj-rehearse                                           7eb64eb  /test pj-rehearse

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@vrutkovs
Contributor

Ansible part looks fine
/approve

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 11, 2019
@ravisantoshgudimetla
Contributor

/lgtm

From scheduling side

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 11, 2019
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ravisantoshgudimetla, vrutkovs, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 544ea94 into openshift:master Jun 11, 2019
@openshift-ci-robot
Contributor

@wking: Updated the following 10 configmaps:

  • prow-job-cluster-launch-installer-upi-e2e configmap in namespace ci using the following files:
    • key cluster-launch-installer-upi-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-upi-e2e.yaml
  • prow-job-cluster-launch-installer-upi-e2e configmap in namespace ci-stg using the following files:
    • key cluster-launch-installer-upi-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-upi-e2e.yaml
  • prow-job-cluster-launch-installer-console configmap in namespace ci-stg using the following files:
    • key cluster-launch-installer-console.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-console.yaml
  • prow-job-cluster-launch-installer-e2e configmap in namespace ci using the following files:
    • key cluster-launch-installer-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-e2e.yaml
  • prow-job-cluster-launch-installer-e2e configmap in namespace ci-stg using the following files:
    • key cluster-launch-installer-e2e.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-e2e.yaml
  • prow-job-cluster-scaleup-e2e-40 configmap in namespace ci using the following files:
    • key cluster-scaleup-e2e-40.yaml using file ci-operator/templates/openshift/openshift-ansible/cluster-scaleup-e2e-40.yaml
  • prow-job-cluster-scaleup-e2e-40 configmap in namespace ci-stg using the following files:
    • key cluster-scaleup-e2e-40.yaml using file ci-operator/templates/openshift/openshift-ansible/cluster-scaleup-e2e-40.yaml
  • prow-job-cluster-launch-installer-console configmap in namespace ci using the following files:
    • key cluster-launch-installer-console.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-console.yaml
  • prow-job-cluster-launch-installer-src configmap in namespace ci using the following files:
    • key cluster-launch-installer-src.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-src.yaml
  • prow-job-cluster-launch-installer-src configmap in namespace ci-stg using the following files:
    • key cluster-launch-installer-src.yaml using file ci-operator/templates/openshift/installer/cluster-launch-installer-src.yaml

@wking wking deleted the larger-workers branch August 10, 2019 04:07