Conversation
7417cf5 to
ed9cc54
Compare
ed9cc54 to
29fa205
Compare
|
This problem has been reported to runc a long time ago, but it seems like runc devs decided to keep things as they are opencontainers/runc#1765 |
Should we probably change the OCI spec itself? |
|
I've opened an issue in the spec repo: opencontainers/runtime-spec#1309 |
Thanks. Please refer this link in your TODO comment. |
29fa205 to
9ad2ca6
Compare
|
added link to the TODO comment |
Signed-off-by: Pavel Safronov <pv.safronov@gmail.com>
|
rebased with deduplicated helper methods. But otherwise not sure how to proceed with this one. Spec guys don't seem to be wiling to change spec, and changing youki is definitely out of scope of this PR. |
Description
This implements a test similar to https://github.com/opencontainers/runtime-tools/blob/master/validation/poststop_fail/poststop_fail.go as a part of the #361
NOTE: I believe that the current behaviour is incorrect. See "Additional Context" section for more details. I have left TODOs in the code to highlight the problem.
Type of Change
Testing
Related Issues
#361
Additional Context
When a poststop hook exits with a non-zero status, youki stops executing subsequent hooks and propagates the error, causing the delete operation to fail. This contradicts the OCI runtime spec.
The spec (https://github.com/opencontainers/runtime-spec/blob/main/runtime.md#lifecycle) states:
Where https://github.com/opencontainers/runtime-spec/blob/main/runtime.md#warnings is defined as:
Note that poststop is the only hook type with this "warn and continue" requirement. All other hooks (prestart, createRuntime, createContainer, startContainer, poststart) require generating an error on failure.
Current behavior:
Given three poststop hooks where the second one exits with status 1:
Expected behavior:
Bug: https://github.com/youki-dev/youki/blob/d541c752/crates/libcontainer/src/hooks.rs#L149-L156 uses ? at the end of the match block inside the loop, which returns immediately on the first hook failure:
The caller in https://github.com/youki-dev/youki/blob/d541c752/crates/libcontainer/src/container/container_delete.rs#L100-L106 then propagates the error, failing the delete.
This bug (or rather divergence from the spec) exists in
runcas well. So maybe the spec instead should be aligned to the actual behaviour?