test/e2e: scheduling: disable preemption tests#23029
Merged
openshift-merge-robot merged 1 commit into openshift:master on Jun 5, 2019
Conversation
Member
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: sjenning, wking
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Contributor
/retest
Please review the full test history for this PR and help us cut down flakes.
Member
/retest
Now that openshift/cluster-kube-apiserver-operator#495 has landed.
wking added a commit to wking/openshift-release that referenced this pull request on Jun 11, 2019
Prometheus started making memory requests with openshift/prometheus-operator@cda68a3f (Merge pull request openshift/prometheus-operator#30 from paulfantom/merge-release-0.30.1, 2019-06-04):

    $ diff -u <(curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-serial-4.2/591/artifacts/e2e-aws-serial/pods.json | jq '.items[] | select(.metadata.name | contains("prometheus")) | {name: .metadata.name, resources: [.spec.containers[].resources | select((. | length) > 0)]}') <(curl -s https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-serial-4.2/592/artifacts/e2e-aws-serial/pods.json | jq '.items[] | select(.metadata.name | contains("prometheus")) | {name: .metadata.name, resources: [.spec.containers[].resources | select((. | length) > 0)]}')
    --- /dev/fd/63 2019-06-04 14:10:31.908436038 -0700
    +++ /dev/fd/62 2019-06-04 14:10:31.908436038 -0700
    @@ -1,5 +1,5 @@
     {
    -  "name": "prometheus-adapter-5f78cc955d-2899k",
    +  "name": "prometheus-adapter-64f4f64b7-pvmhn",
       "resources": [
         {
           "requests": {
    @@ -10,7 +10,7 @@
       ]
     }
     {
    -  "name": "prometheus-adapter-5f78cc955d-2rlnx",
    +  "name": "prometheus-adapter-64f4f64b7-tgnld",
       "resources": [
         {
           "requests": {
    @@ -22,14 +22,56 @@
     }
     {
       "name": "prometheus-k8s-0",
    -  "resources": []
    +  "resources": [
    +    {
    +      "limits": {
    +        "cpu": "100m",
    +        "memory": "25Mi"
    +      },
    +      "requests": {
    +        "cpu": "100m",
    +        "memory": "25Mi"
    +      }
    +    },
    +    {
    +      "limits": {
    +        "cpu": "100m",
    +        "memory": "25Mi"
    +      },
    +      "requests": {
    +        "cpu": "100m",
    +        "memory": "25Mi"
    +      }
    +    }
    +  ]
     }
     {
       "name": "prometheus-k8s-1",
    -  "resources": []
    +  "resources": [
    +    {
    +      "limits": {
    +        "cpu": "100m",
    +        "memory": "25Mi"
    +      },
    +      "requests": {
    +        "cpu": "100m",
    +        "memory": "25Mi"
    +      }
    +    },
    +    {
    +      "limits": {
    +        "cpu": "100m",
    +        "memory": "25Mi"
    +      },
    +      "requests": {
    +        "cpu": "100m",
    +        "memory": "25Mi"
    +      }
    +    }
    +  ]
     }
     {
    -  "name": "prometheus-operator-68f7b6bd55-hmqtj",
    +  "name": "prometheus-operator-d8745bf44-l9khn",
       "resources": [
         {
           "requests": {

With that change, our nodes no longer satisfied the assumptions that the SchedulerPreemption tests make about the scheduled load on test nodes (i.e. less than 40% of capacity is scheduled). openshift/origin@13b6d0e4a7 (test/e2e: scheduling: disable preemption tests, 2019-06-04, openshift/origin#23029) disabled the test, but this change takes the alternative temporary workaround of bumping our node capacity to re-satisfy the existing test's assumptions. We have sufficient capacity for doubling our xlarge consumption:

    $ export AWS_PROFILE=ci
    $ aws --region us-east-1 support describe-trusted-advisor-checks --language en --query "checks[? category == 'service_limits'].{id: @.id, name: @.name}" --output text | grep 'EC2 On-Demand Instances'
    0Xc6LMYG8P  EC2 On-Demand Instances
    $ AWS_PROFILE=ci aws --region us-east-1 support describe-trusted-advisor-check-result --check-id 0Xc6LMYG8P --query "join(\`\\n\`, result.flaggedResources[].join(\`\\t\`, [@.metadata[4] || '0', @.metadata[3], @.region || '-', '0Xc6LMYG8P', @.metadata[2]]))" --output text
    91  3000  us-east-1  0Xc6LMYG8P  On-Demand instances - m4.large
    97  3000  us-east-1  0Xc6LMYG8P  On-Demand instances - m4.xlarge
This was referenced Jun 11, 2019
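The "sufficient capacity for doubling" claim in the commit message above can be checked with a little arithmetic. The following is only an illustrative sketch, not part of the commit: the helper name `headroom_after_doubling` is hypothetical, and it assumes the first two columns of the Trusted Advisor output are current usage and service limit, in that order, with tab-separated fields.

```python
# Hypothetical helper (not from the commit): parse the tab-separated rows
# printed by the Trusted Advisor query and compute what is left of each
# limit if current usage were doubled.
# Assumption: column 1 is current usage, column 2 is the service limit.
ROWS = (
    "91\t3000\tus-east-1\t0Xc6LMYG8P\tOn-Demand instances - m4.large\n"
    "97\t3000\tus-east-1\t0Xc6LMYG8P\tOn-Demand instances - m4.xlarge\n"
)


def headroom_after_doubling(rows: str) -> dict:
    """Map each instance type to (limit - 2 * current usage)."""
    result = {}
    for line in rows.strip().splitlines():
        used, limit, _region, _check_id, name = line.split("\t")
        instance_type = name.rsplit(" - ", 1)[1]
        result[instance_type] = int(limit) - 2 * int(used)
    return result


print(headroom_after_doubling(ROWS))
```

Under those assumptions, doubling m4.xlarge usage from 97 to 194 instances would still leave 2806 of the 3000-instance limit unused.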
After Prometheus started making reasonable memory requests, the assumptions that the SchedulerPreemption tests make about the scheduled load on test nodes no longer hold (i.e. that less than 40% of capacity is scheduled).
Example e2e failure
https://openshift-gce-devel.appspot.com/build/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-serial-4.2/598
Flaking since openshift/prometheus-operator#30, which allowed the resource requests for the Prometheus StatefulSet to flow through
https://testgrid.k8s.io/redhat-openshift-release-blocking#redhat-release-openshift-origin-installer-e2e-aws-serial-4.2&sort-by-flakiness=
BZ to track re-enablement
https://bugzilla.redhat.com/show_bug.cgi?id=1717198
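To make the 40% assumption from the description concrete, here is a rough sketch of how the scheduled fraction of a node could be estimated from pod resource requests. It is not how the e2e test itself computes load. All function names and sample numbers are hypothetical, and the quantity parsing only covers the simple forms seen in this PR (`100m`, `25Mi`) plus a few binary suffixes.

```python
# Minimal sketch; only the 0.4 threshold comes from the PR description.
# Every name and sample value below is hypothetical.
_BINARY = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def parse_cpu(quantity: str) -> int:
    """Return millicores: '100m' is 100, '2' is 2000."""
    if quantity.endswith("m"):
        return int(quantity[:-1])
    return int(float(quantity) * 1000)


def parse_memory(quantity: str) -> int:
    """Return bytes for plain integers or binary suffixes (Ki, Mi, Gi, Ti)."""
    for suffix, factor in _BINARY.items():
        if quantity.endswith(suffix):
            return int(quantity[:-2]) * factor
    return int(quantity)


def scheduled_fraction(requests, allocatable, parse) -> float:
    """Sum of requests divided by the node's allocatable amount."""
    return sum(parse(r) for r in requests) / parse(allocatable)


def satisfies_assumption(requests, allocatable, parse, threshold=0.4) -> bool:
    """True when less than `threshold` of the node is already scheduled."""
    return scheduled_fraction(requests, allocatable, parse) < threshold


# Hypothetical node: 700m of CPU requested out of 2 cores (35%).
print(satisfies_assumption(["100m", "100m", "500m"], "2", parse_cpu))
# Hypothetical node: 1824Mi of memory requested out of 4Gi (about 45%).
print(satisfies_assumption(["1Gi", "800Mi"], "4Gi", parse_memory))
```

In this sketch the first node stays under the threshold and the second does not. The second case mirrors the situation described here, where new memory requests push a node past the load the tests expect.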
@smarterclayton @wking @ravisantoshgudimetla