-
Notifications
You must be signed in to change notification settings - Fork 3.9k
Fix restart container test #6390
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
AkihiroSuda
merged 2 commits into
containerd:main
from
gabriel-samfira:fix-restart-container-test
Dec 24, 2021
Merged
Changes from all commits
Commits
Show all changes
2 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand the comment. The NOT-READY state means that we should stop the sandbox container and we can get NOT-Ready state after restart containerd. Even if there is no running containers, we are handling the sandbox one. It is confusing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. This comment was confusing. Sorry about that.
Sending a
Kill()to a task that is already stopped, or that was never started, will return anErrorNotFoundon Windows. This sandbox only has a container in stopped state and one in created state. So sending a kill here will always return anErrorNotFoundon Windows. By the time we send thisKill(), the wait channel for the task should already be closed and anErrorNotFoundcan be safely ignored.On Linux, there is also a container in started state, added here: https://github.com/containerd/containerd/blob/main/integration/restart_test.go#L83-L91
so on Linux, the
Kill()function should always return anilerror.Sorry for the confusion. I rewrote the comment. Hopefully it's clear now.
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On Windows, is the sandbox container running after RunPodSandbox? Even if there is no running containers.
The L103 is about to run pod sandbox container. the sandbox container with sid ID is running after L103. Since the test case is to make the sandbox into not-ready, the L134 is to kill it. So, if it is running state, the not-found doesn't make senses. That is why I am confusing 😂. If I understand the case correctly...
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After
containerdrestart, we end up with one sandbox inSANDBOX_READYand another one inSANDBOX_NOTREADY. This is checked here:https://github.com/containerd/containerd/blob/main/integration/restart_test.go#L175-L181
so the test case should be fulfilled. Unless I'm reading it wrong 😀.
Another option is to simply not try and send a kill there (on Windows), as there is no running container, if you prefer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before restart, how to change the running sandbox container into exited state? We should send signal, right? If so, why we can receive not-found error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you wait for a while? The state is updated in async way.
containerd/pkg/cri/server/events.go
Line 418 in 653f2f1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added logging to the test that showed the status was unchanged by the time containerd restarted. Should we change the test and poll until the sandbox transitions to
NOTREADYbefore restarting containerd?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to do that. The
task.Waitreturn can guarantee that. :)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the change I made:
https://paste.samfira.com/public/p/TNTzRQIaJREB14JFwJ7i7NIS
This is the test output:
https://paste.samfira.com/public/p/mXhv2fZLvmgxJjapkKd2LG4h
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. It is about timing. 😂