Bug 1731263: Disable failing preemption e2es #24868
openshift-merge-robot merged 1 commit into openshift:master from
soltysh
left a comment
I'm not a fan of disabling tests, but it looks like there's a major problem with this test upstream that should be fixed first; then we can rely on the actual results. The band-aid that upstream applied to prevent flakes apparently doesn't solve the root cause.
/lgtm
/approve
/hold
you'll need to fix the verify script
@soltysh what are you referring to? There don't appear to be any major problems with the test upstream, and I think the issue I linked was actually caused by us trying to patch up the test to work for our clusters.
3532feb to f8c378c
/retest
/retitle Bug 1731263: Disable failing preemption e2es
/hold cancel
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: damemi, soltysh
@damemi: This pull request references Bugzilla bug 1731263, which is valid. The bug has been updated to refer to the pull request using the external bug tracker. 3 validation(s) were run on this bug
/retest
Please review the full test history for this PR and help us cut down flakes.
@damemi: Some pull requests linked via external trackers have merged: . The following pull requests linked via external trackers have not merged:
Main BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1731263. The BZ contains discussion of the failure causes. Usually the "preemptor" pod does not land on the expected "victim" pod's node, and so it does not evict the pod we expect it to.
These tests have been disabled for several releases, and were recently re-enabled by 5f52d7e#diff-6ba77494282f6e840f44b01ce97335afL88
After being re-enabled, they are still failing: https://testgrid.k8s.io/redhat-openshift-ocp-release-4.5-informing#release-openshift-ocp-installer-e2e-azure-serial-4.5
These have had the following attempted fixes:
(among others that have been scrapped or overwritten by the above)
There was an upstream issue for some of these tests: kubernetes/kubernetes#88441. They do not currently flake in upstream CI runs.
There are some comments in this PR explaining how the test currently assigns a pod to a node with affinity: kubernetes/kubernetes#90118. I believe my attempt to fix this by manually setting the NodeName won't actually work because that bypasses scheduling, and so preemption does not run.
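To make the NodeName point concrete, here is a rough toy model in Python. It is not the kube-scheduler and not the e2e test code: `Pod`, `Node`, `submit`, `cluster`, and the `required_node` field (a stand-in for a required node affinity) are all hypothetical names invented for this sketch, and the eviction rule is deliberately simplified. It only illustrates the reasoning above: a pod that goes through scheduling with an affinity to the full node can preempt the victim, a pod with the node name set up front never enters the scheduling path, and an unconstrained pod may simply land on a different node.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Pod:
    name: str
    priority: int
    cpu: int
    node_name: Optional[str] = None      # set up front: the scheduler is skipped
    required_node: Optional[str] = None  # hypothetical stand-in for node affinity


@dataclass
class Node:
    name: str
    capacity: int
    pods: List[Pod] = field(default_factory=list)

    def free(self) -> int:
        return self.capacity - sum(p.cpu for p in self.pods)


def submit(pod: Pod, nodes: Dict[str, Node]) -> List[str]:
    """Place the pod and return the names of pods evicted to make room."""
    if pod.node_name is not None:
        # Pre-bound pod: no scheduling cycle runs, so no preemption runs.
        # The toy just records the pod; what a real kubelet would do with
        # an over-committed node is out of scope here.
        nodes[pod.node_name].pods.append(pod)
        return []

    candidates = [n for n in nodes.values()
                  if pod.required_node in (None, n.name)]

    # First try to fit without evicting anything.
    for node in candidates:
        if node.free() >= pod.cpu:
            node.pods.append(pod)
            pod.node_name = node.name
            return []

    # Otherwise try preemption: evict strictly lower-priority pods,
    # lowest priority first, until the incoming pod fits.
    for node in candidates:
        victims: List[Pod] = []
        free = node.free()
        lower = sorted((p for p in node.pods if p.priority < pod.priority),
                       key=lambda p: p.priority)
        for victim in lower:
            if free >= pod.cpu:
                break
            victims.append(victim)
            free += victim.cpu
        if free >= pod.cpu:
            for victim in victims:
                node.pods.remove(victim)
            node.pods.append(pod)
            pod.node_name = node.name
            return [v.name for v in victims]

    return []  # unschedulable in this toy model


def cluster() -> Dict[str, Node]:
    """node-a is full with a low-priority victim; node-b is empty."""
    victim = Pod("victim", priority=1, cpu=2, node_name="node-a")
    return {
        "node-a": Node("node-a", capacity=2, pods=[victim]),
        "node-b": Node("node-b", capacity=2),
    }


if __name__ == "__main__":
    print(submit(Pod("preemptor", 10, 2, required_node="node-a"), cluster()))
    print(submit(Pod("preemptor", 10, 2, node_name="node-a"), cluster()))
    unconstrained = Pod("preemptor", 10, 2)
    print(submit(unconstrained, cluster()), unconstrained.node_name)
```

In this sketch only the first call evicts the victim. The second call returns no evictions because the preemption branch is never reached, and the third places the preemptor on node-b, which mirrors the "does not land on the expected victim pod's node" failure described in the BZ.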