fix(box): keep VM alive after container exits for vsock services #4

Merged
ZhiXiao-Lin merged 3 commits into main from fix/vm-stays-alive-after-container-exit
Apr 27, 2026

@ZhiXiao-Lin
Contributor

Summary

  • Fixes "WARN Exec socket appeared but heartbeat failed" and "Couldn't execute '/sbin/init'" when running OCI images like alpine:latest
  • Guest init no longer terminates the VM when the container process exits
  • Fallback entrypoint changed from /sbin/init to /bin/sh (present in common minimal distros such as Alpine, Debian, and Ubuntu)

Root Cause

When the container process exited (e.g., Alpine's busybox init exits immediately after running echo "Hello"), the guest init called process::exit(), killing the VM before the vsock exec/PTY servers had a chance to start. The heartbeat health check then failed against a VM that was already gone.
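
To make the before/after behaviour concrete, here is a minimal sketch of the decision logic only. It is not the code in guest/init/src/main.rs: the `InitEvent`, `Action`, `on_event_old`, `on_event_new`, and `run_init_loop` names are hypothetical, real signal handling and child reaping are replaced by a plain list of events, and reporting the container's exit code on shutdown is an assumption.

```rust
/// Events the guest init might observe (hypothetical model, not the real API).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InitEvent {
    /// The container's main process exited with this status code.
    ContainerExited(i32),
    /// The host asked the VM to shut down.
    Sigterm,
}

/// What the init loop does in response to an event.
#[derive(Debug, PartialEq, Eq)]
enum Action {
    /// Stay in the loop so the vsock services keep serving.
    KeepRunning,
    /// Leave the loop and exit the VM with this code.
    Shutdown(i32),
}

/// Behaviour before the fix, as described above: container exit ends the VM.
fn on_event_old(event: InitEvent) -> Action {
    match event {
        InitEvent::ContainerExited(code) => Action::Shutdown(code),
        InitEvent::Sigterm => Action::Shutdown(0),
    }
}

/// Behaviour after the fix: container exit is only recorded; SIGTERM ends the VM.
/// Returning the recorded exit code on shutdown is an assumption of this sketch.
fn on_event_new(event: InitEvent, last_exit: &mut Option<i32>) -> Action {
    match event {
        InitEvent::ContainerExited(code) => {
            *last_exit = Some(code);
            Action::KeepRunning
        }
        InitEvent::Sigterm => Action::Shutdown(last_exit.unwrap_or(0)),
    }
}

/// Feeds events through the new logic. Returns `Some(code)` once the loop
/// shuts down, or `None` if the VM is still alive after all events.
fn run_init_loop(events: &[InitEvent]) -> Option<i32> {
    let mut last_exit = None;
    for &event in events {
        if let Action::Shutdown(code) = on_event_new(event, &mut last_exit) {
            return Some(code);
        }
    }
    None
}
```

With the new logic, `run_init_loop(&[InitEvent::ContainerExited(0)])` yields `None`: the VM stays up, so the exec and PTY servers can still answer the heartbeat.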

Changes

  • guest/init/src/main.rs: wait_for_children() now keeps the VM alive after container exit, only exiting on SIGTERM from the host
  • runtime/src/vm/spec.rs: Fallback entrypoint uses /bin/sh instead of /sbin/init
  • runtime/src/vm/layout.rs: Adds #[cfg(unix)] to 3 tests that use Unix-only methods
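
The fallback change in spec.rs can be illustrated with a small sketch. The real `resolve_oci_entrypoint()` signature is not shown in this PR, so `resolve_entrypoint_sketch` and `FALLBACK_ENTRYPOINT` are hypothetical names, and the rule that user-supplied arguments replace the image's Cmd is an assumption based on the usual OCI/Docker convention.

```rust
/// Fallback used when neither the image nor the user names a command.
/// The PR changes this value from /sbin/init to /bin/sh.
const FALLBACK_ENTRYPOINT: &str = "/bin/sh";

/// Hypothetical stand-in for the runtime's entrypoint resolution.
///
/// Assumed semantics: argv = Entrypoint followed by Cmd, where arguments
/// given on the command line replace the image's Cmd. If the result is
/// empty, the fallback entrypoint is used.
fn resolve_entrypoint_sketch(
    entrypoint: &[&str],
    cmd: &[&str],
    user_args: &[&str],
) -> Vec<String> {
    // User arguments take precedence over the image's default Cmd.
    let tail = if user_args.is_empty() { cmd } else { user_args };

    let argv: Vec<String> = entrypoint
        .iter()
        .chain(tail.iter())
        .map(|s| s.to_string())
        .collect();

    if argv.is_empty() {
        // Nothing configured anywhere: fall back to a shell rather than
        // an init binary that minimal images may not ship.
        vec![FALLBACK_ENTRYPOINT.to_string()]
    } else {
        argv
    }
}
```

Under these assumptions, an image with no Entrypoint and no Cmd resolves to `["/bin/sh"]`, while `-- echo "Hello"` on the command line resolves to `["echo", "Hello"]`.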

Test plan

  • cargo test -p a3s-box-runtime --lib: 780 passed
  • Run a3s-box run alpine:latest -- echo "Hello" on Linux — verify no heartbeat warning and command succeeds
  • Run a3s-box run -it alpine:latest -- /bin/sh on Linux — verify PTY works

🤖 Generated with Claude Code

hikejs and others added 3 commits April 27, 2026 11:02
When the container process exits (e.g., alpine's busybox init exits
immediately after handling the command), the guest init previously called
process::exit(), killing the VM before the vsock exec/PTY servers had
a chance to start. This caused "WARN Exec socket appeared but
heartbeat failed" and "Couldn't execute '/sbin/init'" errors.

Fixes two issues:
1. wait_for_children(): container exit no longer terminates the VM.
   Instead, the guest init loops until SIGTERM from the host, keeping
   vsock services (exec, PTY, attestation) available.
2. Fallback entrypoint: resolve_oci_entrypoint() now uses /bin/sh
   instead of /sbin/init. /bin/sh is present in common minimal
   distros (Alpine, Debian, Ubuntu) and doesn't exit immediately.

Also adds #[cfg(unix)] to 3 exec_command tests in layout.rs and
fixes a struct literal that referenced Unix-only fields on Windows.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Windows volume paths like C:\Users\Temp:/data:ro contain multiple
colons (drive letter, host/guest separator, mode suffix), so the naive
split(':') parser misidentifies the host/guest boundary.

Rewrites parse_volume_mount() to:
- Use the parts array to determine host/guest boundary reliably
- Detect :ro/:rw mode suffix by checking the LAST colon-separated
  segment (not the second-to-last, which fails for deep Windows paths)
- Reconstruct Windows paths by re-joining parts[0..guest_idx] with ":"
- Reject invalid mode suffixes (:invalid) by checking if the last
  segment looks like a path component
- Reject volumes with fewer than 2 colon-separated parts

Fixes test_parse_volume_mount_{read_only,explicit_rw,invalid_mode}
on Windows (Cargo test target-dir conflicts prevented earlier runs).
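
The parsing rules listed above can be sketched as follows. This is an illustration, not the actual parse_volume_mount() implementation: `VolumeMountSketch` and `parse_volume_sketch` are hypothetical names, and the sketch assumes guest paths are absolute Unix-style paths starting with "/".

```rust
/// Parsed form of a "host:guest[:mode]" volume spec (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
struct VolumeMountSketch {
    host: String,
    guest: String,
    read_only: bool,
}

/// Parses "host:guest[:ro|:rw]", tolerating colons inside the host path
/// (for example a Windows drive letter).
fn parse_volume_sketch(spec: &str) -> Result<VolumeMountSketch, String> {
    let parts: Vec<&str> = spec.split(':').collect();
    if parts.len() < 2 {
        return Err(format!("expected host:guest[:mode], got '{}'", spec));
    }

    // Inspect the LAST segment: it is either a mode or the guest path.
    let last = parts[parts.len() - 1];
    let (read_only, guest_idx) = match last {
        "ro" => (true, parts.len() - 2),
        "rw" => (false, parts.len() - 2),
        // Assumption: a guest path always starts with '/'.
        s if s.starts_with('/') => (false, parts.len() - 1),
        other => return Err(format!("invalid mode '{}'", other)),
    };

    // Everything before the guest segment is the host path.
    if guest_idx == 0 {
        return Err(format!("missing host path in '{}'", spec));
    }
    let guest = parts[guest_idx];
    if !guest.starts_with('/') {
        return Err(format!("guest path must be absolute, got '{}'", guest));
    }

    // Re-join with ':' so "C" + "\Users\Temp" becomes "C:\Users\Temp" again.
    let host = parts[..guest_idx].join(":");
    if host.is_empty() {
        return Err(format!("empty host path in '{}'", spec));
    }

    Ok(VolumeMountSketch {
        host,
        guest: guest.to_string(),
        read_only,
    })
}
```

For `C:\Users\Temp:/data:ro` the split yields four segments; the last is a mode, the one before it is the guest path, and the first two are re-joined into the host path `C:\Users\Temp`.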

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ZhiXiao-Lin ZhiXiao-Lin merged commit d487b7c into main Apr 27, 2026
12 of 14 checks passed
2 participants