test/helpers: use RetryOnConflict for node writes #2921
openshift-merge-robot merged 1 commit into openshift:master
I saw this race in a PR:
```
Error: Expected nil, but got: &errors.StatusError{ErrStatus:v1.Status{TypeMeta:v1.TypeMeta{Kind:"", APIVersion:""}, ListMeta:v1.ListMeta{SelfLink:"", ResourceVersion:"", Continue:"", RemainingItemCount:(*int64)(nil)}, Status:"Failure", Message:"Operation cannot be fulfilled on nodes \"ci-op-pjcj27z5-ff089-5jw6w-worker-b-dhsmk\": the object has been modified; please apply your changes to the latest version and try again", Reason:"Conflict", Details:(*v1.StatusDetails)(0xc000450cc0), Code:409}}
Test: TestKernelType
```
A general problem with our tests is that they tend to be written
in "one shot" mode but in cases like this we need to do retries
and reconciliation.
(Actually in this case it'd be better to format a strategic patch
I think, but for now let's just do a standard `RetryOnConflict`
as is done elsewhere)
|
Do we want this against [...]? Just 'cause I don't understand: how is there a race? Are our e2e tests run in parallel? |
The kubelet running on the node may also update the node object. This code isn't using a strategic merge patch or the newer server-side apply; it reads the whole object, modifies it, and writes it back. That means any concurrent modification (e.g. from kubelet) between the read and the write will result in a conflict error, as we saw. I don't think our e2e tests here run in parallel, and in practice we probably don't hit this issue often because I think kubelet doesn't touch the node object very frequently. |
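To illustrate why a patch sidesteps this kind of conflict, here is a rough, standard-library-only sketch. The `node` and `store` types and the `patchLabels` and `heartbeat` methods are hypothetical and only model the idea: a patch carries just the fields the caller wants to change, and the server merges them into whatever version it currently holds, so the caller has no stale `resourceVersion` to be rejected on. With client-go this would typically be a `Patch` call using `types.StrategicMergePatchType`; that is not shown here.

```go
package main

import "fmt"

// node is a hypothetical, stripped-down stand-in for a Kubernetes Node.
type node struct {
	resourceVersion int
	labels          map[string]string
	heartbeats      int // stands in for status fields that kubelet owns
}

// store is a hypothetical in-memory API server holding a single node.
type store struct{ current node }

// heartbeat simulates kubelet touching the node, which bumps resourceVersion.
func (s *store) heartbeat() {
	s.current.heartbeats++
	s.current.resourceVersion++
}

// patchLabels merges only the given labels into whatever is stored right now.
// The caller supplies no resourceVersion, so there is nothing to go stale.
func (s *store) patchLabels(patch map[string]string) {
	if s.current.labels == nil {
		s.current.labels = map[string]string{}
	}
	for k, v := range patch {
		s.current.labels[k] = v
	}
	s.current.resourceVersion++
}

func main() {
	s := &store{}
	s.heartbeat() // kubelet writes first
	s.patchLabels(map[string]string{"example/role": "infra"})
	s.heartbeat() // kubelet writes again; neither side lost its change
	fmt.Println(s.current.labels, s.current.heartbeats, s.current.resourceVersion)
}
```

The trade-off in this model is that the patch author must know exactly which fields to send, whereas the read-modify-write approach with a retry loop works with whatever mutation the test already performs.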
We could, but I suspect just retrying the tests there will work. If we hit the race again let's try pulling this commit there too. Or perhaps better, merge to main/master and do another PR to rebase mcbs. |
|
Gotcha, thanks! And thanks for the fix; the failure was looking pretty mysterious to me. |
|
Nice, this makes sense to me. 👍 |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: cgwalters, kikisdeliveryservice, mkenigs |
|
/retest-required Please review the full test history for this PR and help us cut down flakes. |
I can confirm that our e2e tests do not run in parallel. To my understanding, Golang will run all tests within a given package sequentially (unless one makes use of |
/skip |
|
Since this change only affects our e2e-gcp-op tests (which have long since passed), I'm going to override the failing e2e-agnostic-upgrade job: this helper isn't called in that test, and spending more retest cycles on that job for this PR seems wasteful. We tried. 😆 /override ci/prow/e2e-agnostic-upgrade |
|
@kikisdeliveryservice: Overrode contexts on behalf of kikisdeliveryservice: ci/prow/e2e-agnostic-upgrade |
|
ughh why is this retesting if i overrode?? /override ci/prow/e2e-agnostic-upgrade |
|
/override ci/prow/e2e-agnostic-upgrade |