Skip to content

Bug 1731263: Balance preemption e2e nodes#24947

Merged
openshift-merge-robot merged 2 commits intoopenshift:masterfrom
damemi:balance-preemption-e2e-nodes
May 11, 2020
Merged

Bug 1731263: Balance preemption e2e nodes#24947
openshift-merge-robot merged 2 commits intoopenshift:masterfrom
damemi:balance-preemption-e2e-nodes

Conversation

@damemi
Copy link
Copy Markdown
Contributor

@damemi damemi commented May 4, 2020

These preemption e2es attempt to create filler "victim" pods that use 60% of a node's available resources, and assumes that with all nodes full, the low priority pod will get evicted.

However, if the nodes have very different usages (say one is at 500/1500 cpu and another is at 1000/1500), then the request of 60% of the low-priority-pod's-node's usage may not fit on one node, but will fit on another and the test fails.

This PR creates balanced pods in these e2es to ensure a level field for the preemption tests.

Upstream: kubernetes/kubernetes#90740

@openshift-ci-robot openshift-ci-robot added the bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. label May 4, 2020
@openshift-ci-robot
Copy link
Copy Markdown

@damemi: This pull request references Bugzilla bug 1731263, which is invalid:

  • expected the bug to be in one of the following states: NEW, ASSIGNED, ON_DEV, POST, POST, but it is ON_QA instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

Details

In response to this:

Bug 1731263: Balance preemption e2e nodes

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label May 4, 2020
@damemi damemi force-pushed the balance-preemption-e2e-nodes branch from 1f9ad12 to 6db1ee9 Compare May 4, 2020 20:18
@openshift-ci-robot openshift-ci-robot added vendor-update Touching vendor dir or related files labels May 4, 2020
@damemi
Copy link
Copy Markdown
Contributor Author

damemi commented May 5, 2020

So, re-enabled and with these fixes the tests fail with:

Unschedulable 0/6 nodes are available: 2 Insufficient memory, 3 Insufficient cpu, 3 node(s) didn't match node selector.}]
But they pass locally for me, because this is exactly the case I would expect preemption to happen it means that for some reason it must not be.

Something else I see is that the calculated usage for the balanced pods differs from the usage we calculate when creating the victim pods

@damemi damemi force-pushed the balance-preemption-e2e-nodes branch 3 times, most recently from f5d05cd to 2824c6b Compare May 5, 2020 18:47
@damemi
Copy link
Copy Markdown
Contributor Author

damemi commented May 6, 2020

/retest

@damemi damemi force-pushed the balance-preemption-e2e-nodes branch 3 times, most recently from 2b16bd1 to b540d91 Compare May 6, 2020 20:22
@damemi
Copy link
Copy Markdown
Contributor Author

damemi commented May 7, 2020

/retest

@damemi damemi force-pushed the balance-preemption-e2e-nodes branch from b540d91 to 36b09a0 Compare May 8, 2020 18:24
@damemi damemi force-pushed the balance-preemption-e2e-nodes branch from 36b09a0 to c584cec Compare May 8, 2020 19:54
@damemi
Copy link
Copy Markdown
Contributor Author

damemi commented May 8, 2020

/bugzilla refresh

@openshift-ci-robot openshift-ci-robot added bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. and removed bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. labels May 8, 2020
@openshift-ci-robot
Copy link
Copy Markdown

@damemi: This pull request references Bugzilla bug 1731263, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.5.0) matches configured target release for branch (4.5.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
Details

In response to this:

/bugzilla refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@damemi
Copy link
Copy Markdown
Contributor Author

damemi commented May 8, 2020

/kind failing-test

@openshift-ci-robot openshift-ci-robot added the kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. label May 8, 2020
@damemi
Copy link
Copy Markdown
Contributor Author

damemi commented May 9, 2020

/retest

Copy link
Copy Markdown
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label May 11, 2020
@openshift-ci-robot
Copy link
Copy Markdown

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: damemi, soltysh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 11, 2020
@soltysh
Copy link
Copy Markdown
Contributor

soltysh commented May 11, 2020

/hold
for k8s bump, feel to drop when that lands.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 11, 2020
@damemi
Copy link
Copy Markdown
Contributor Author

damemi commented May 11, 2020

#24936 has merged the 1.18.2 bump
/hold cancel

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 11, 2020
@openshift-merge-robot openshift-merge-robot merged commit 65834ad into openshift:master May 11, 2020
@openshift-ci-robot
Copy link
Copy Markdown

@damemi: Some pull requests linked via external trackers have merged: . The following pull requests linked via external trackers have not merged:

Details

In response to this:

Bug 1731263: Balance preemption e2e nodes

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. kind/failing-test Categorizes issue or PR as related to a consistently or frequently failing test. lgtm Indicates that a PR is ready to be merged. vendor-update Touching vendor dir or related files

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants