diff --git a/.claude/CLAUDE.md b/.claude/CLAUDE.md index 5d630dc8..f46649ec 100644 --- a/.claude/CLAUDE.md +++ b/.claude/CLAUDE.md @@ -5,12 +5,29 @@ fcvm is a Firecracker VM manager for running Podman containers in lightweight mi ## Quick Reference +### Shell Scripts to /tmp + +**Write complex shell logic to /tmp instead of fighting escaping issues:** +```bash +# BAD - escaping nightmare +for dir in ...; do count=$(grep ... | wc -l); done + +# GOOD - write to file, execute +cat > /tmp/script.sh << 'EOF' +for dir in */; do + count=$(grep pattern "$dir"/*.rs | wc -l) + echo "$dir: $count" +done +EOF +chmod +x /tmp/script.sh && /tmp/script.sh +``` + ### Streaming Test Output **Use `STREAM=1` to see test output in real-time:** ```bash -make test-vm FILTER=sanity STREAM=1 # Host tests with streaming -make container-test-vm FILTER=sanity STREAM=1 # Container tests with streaming +make test-root FILTER=sanity STREAM=1 # Host tests with streaming +make container-test-root FILTER=sanity STREAM=1 # Container tests with streaming ``` Without `STREAM=1`, nextest captures output and only shows it after tests complete (better for parallel runs). @@ -20,11 +37,14 @@ Without `STREAM=1`, nextest captures output and only shows it after tests comple # Build make build # Build fcvm + fc-agent make test # Run fuse-pipe tests -make rebuild # Full rebuild including rootfs update +make setup-fcvm # Download kernel and create rootfs -# Run a VM +# Run a VM (requires setup first, or use --setup flag) sudo fcvm podman run --name my-vm --network bridged nginx:alpine +# Or run with auto-setup (first run takes 5-10 minutes) +sudo fcvm podman run --name my-vm --network bridged --setup nginx:alpine + # Snapshot workflow fcvm snapshot create --pid --tag my-snapshot fcvm snapshot serve my-snapshot # Start UFFD server (prints serve PID) @@ -120,7 +140,7 @@ our NO LEGACY policy prohibits. Rootless tests work fine under sudo. Removed function and all 12 call sites across test files.
-Tested: make test-vm FILTER=sanity (both rootless and bridged pass) +Tested: make test-root FILTER=sanity (both rootless and bridged pass) ``` **Bad example:** @@ -131,8 +151,8 @@ Fix tests **Testing section format** - show actual commands: ``` Tested: - make test-vm FILTER=sanity # 2 passed - make container-test-vm FILTER=sanity # 2 passed + make test-root FILTER=sanity # passed + make container-test-root FILTER=sanity # passed ``` Not vague claims like "tested and works" or "verified manually". @@ -175,35 +195,33 @@ If a test fails intermittently, that's a **concurrency bug** or **race condition ### Race Condition Debugging Protocol -**Workarounds are NOT acceptable.** When a test fails due to a race condition: +**Show, don't tell. We have extensive logs - it's NEVER a guess.** + +1. **NEVER "fix" with timing changes** (timeouts, sleeps, reducing parallelism) -1. **NEVER "fix" it with timing changes** like: - - Increasing timeouts - - Adding sleeps - - Separating phases that should work concurrently - - Reducing parallelism +2. **ALWAYS find the smoking gun in logs** - compare failing vs passing timestamps -2. **ALWAYS examine the actual output:** - - Capture FULL logs from failing test runs - - Look at what the SPECIFIC failing component did/didn't do - - Trace timestamps to understand ordering - - Find the EXACT operation that failed +3. **Real example - Firecracker crash during parallel tests:** -3. **Ask the right questions:** - - What's different about the failing component vs. successful ones? - - What resource/state is being contended? - - What initialization happens on first access? - - Are there orphaned processes or stale state? + ``` + # FAILING (truncate): + 05:01:26 Exporting image with skopeo + 05:03:34 Image exported (128s later - lock contention!)
+ 05:03:34.835 Firecracker spawned + 05:03:34.859 VM setup failed (24ms - crashed immediately) + + # PASSING (chmod): + 05:01:27 Exporting image with skopeo + 05:03:10 Image exported (103s - finished earlier) + 05:03:11.258 Firecracker spawned + 05:03:11.258 API server received request (success) + ``` -4. **Find and fix the ROOT CAUSE:** - - If it's a lock ordering issue, fix the locking - - If it's uninitialized state, fix the initialization - - If it's resource exhaustion, fix the resource management - - If it's a cleanup issue, fix the cleanup + **Root cause from logs:** All 17 tests serialize on podman storage lock, then thundering herd of VMs start at once. -**Example bad fix:** "Clone-0 times out while clones 1-99 succeed" → "Let's wait for all spawns before health checking" + **Fix:** Content-addressable image cache - first test exports, others hit cache. -**Correct approach:** Look at clone-0's logs to see WHY it specifically failed. What did clone-0 do differently? What resource did it touch first? +4. **The mantra:** What do timestamps show? What's different between failing and passing? The logs ALWAYS have the answer. ### NO TEST HEDGES @@ -244,7 +262,7 @@ assert!(localhost_works, "Localhost port forwarding should work (requires route_ - `#[cfg(feature = "privileged-tests")]`: Tests requiring sudo (iptables, root podman storage) - No feature flag: Unprivileged tests run by default - Features are compile-time gates - tests won't exist unless the feature is enabled -- Use `FILTER=` to further filter by name pattern: `make test-vm FILTER=exec` +- Use `FILTER=` to further filter by name pattern: `make test-root FILTER=exec` **Common parallel test pitfalls and fixes:** @@ -254,9 +272,16 @@ assert!(localhost_works, "Localhost port forwarding should work (requires route_ // Returns: mytest-base-12345-0, mytest-clone-12345-0, etc. ``` -2. 
**Port conflicts**: Loopback IP allocation checks port availability before assigning - - If orphaned processes hold ports, allocation skips those IPs - - Implemented in `state/manager.rs::is_port_available()` +2. **Port forwarding**: Both networking modes use unique IPs, so same port works + ```rust + // BRIDGED: DNAT scoped to veth IP (172.30.x.y) - same port works across VMs + "--publish", "8080:80" // Test curls veth's host_ip:8080 + + // ROOTLESS: each VM gets unique loopback IP (127.x.y.z) - same port works + "--publish", "8080:80" // Test curls loopback_ip:8080 + ``` + - Tests must curl the VM's assigned IP (veth host_ip or loopback_ip), not localhost + - Get the IP from VM state: `config.network.host_ip` (bridged) or `config.network.loopback_ip` (rootless) 3. **Disk cleanup**: VM data directories are cleaned up on exit - `podman.rs` and `snapshot.rs` both delete `data_dir` on VM exit @@ -272,30 +297,34 @@ assert!(localhost_works, "Localhost port forwarding should work (requires route_ ### Build and Test Rules -**CRITICAL: NEVER run `cargo build` or `cargo test` directly. ALWAYS use Makefile targets.** +**CRITICAL: NEVER use `sudo cargo` or `sudo cargo test`. ALWAYS use Makefile targets.** -The Makefile handles: -- Correct `CARGO_TARGET_DIR` for sudo vs non-sudo builds (avoids permission conflicts) -- Proper feature flags (`--features privileged-tests`) -- btrfs setup prerequisites -- Container image building for container tests +The Makefile uses `CARGO_TARGET_*_RUNNER='sudo -E'` to run test **binaries** with sudo, not cargo itself. Using `sudo cargo` creates root-owned files in `target/` that break subsequent non-sudo builds. 
```bash # CORRECT - always use make -make build # Build fcvm + fc-agent -make test # Run fuse-pipe tests -make test-vm # All VM tests (runs with sudo via target runner) -make test-vm FILTER=exec # Only exec tests -make test-vm FILTER=sanity # Only sanity tests -make container-test # Run tests in container -make clean # Clean build artifacts +make build # Build fcvm + fc-agent (no sudo) +make test-unit # Unit tests only, no sudo +make test-fast # + quick VM tests, no sudo (rootless only) +make test-all # + slow VM tests, no sudo (rootless only) +make test-root # + privileged tests (bridged, pjdfstest), uses sudo runner +make test # Alias for test-root # WRONG - never do this -sudo cargo build ... # Wrong target dir, permission issues +sudo cargo build ... # Creates root-owned target/, breaks everything +sudo cargo test ... # Same problem cargo test -p fcvm ... # Missing feature flags, setup ``` -**Test feature flags**: Tests use `#[cfg(feature = "privileged-tests")]` for tests requiring sudo. Unprivileged tests run by default (no feature flag). Use `FILTER=` to further filter by name. +**Test tiers (additive):** +| Target | Features | Sudo | Tests | +|--------|----------|------|-------| +| test-unit | none | no | lint, cli, state manager | +| test-fast | integration-fast | no | + quick VM (rootless) | +| test-all | + integration-slow | no | + slow VM (rootless) | +| test-root | + privileged-tests | yes | + bridged, pjdfstest | + +**Feature flags**: `privileged-tests` gates bridged networking tests and pjdfstest. Rootless tests compile without it. Use `FILTER=` to filter by name pattern. ### Container Build Rules @@ -338,7 +367,7 @@ sleep 5 && ... 
cp /tmp/test.log /tmp/fcvm-failed-test_exec_rootless-$(date +%Y%m%d-%H%M%S).log # Then continue with other tests using a fresh log file -make test-vm 2>&1 | tee /tmp/test-run2.log +make test-root 2>&1 | tee /tmp/test-run2.log ``` **Why this matters:** @@ -398,11 +427,16 @@ When a FUSE operation fails unexpectedly, trace the full path from kernel to fus This pattern found the ftruncate bug: kernel sends `FATTR_FH` with file handle, but fuse-pipe's `VolumeRequest::Setattr` didn't have an `fh` field. -### Container Testing for Full POSIX Compliance +### POSIX Compliance (pjdfstest) -All 8789 pjdfstest tests pass when running in a container with proper device cgroup rules. Use `make container-test-pjdfstest` for the full POSIX compliance test. +All 8789 pjdfstest tests pass via two parallel test matrices: -**Why containers work better**: The container runs with `sudo podman` and `--device-cgroup-rule` flags that allow mknod for block/char devices. +| Matrix | Location | What it tests | +|--------|----------|---------------| +| Host-side | `fuse-pipe/tests/pjdfstest_matrix_root.rs` | fuse-pipe FUSE directly (no VM) | +| In-VM | `tests/test_fuse_in_vm_matrix.rs` | Full stack: host VolumeServer → vsock → guest FUSE | + +Both matrices run 17 categories in parallel via nextest. Each category is a separate test, so all 34 tests (17 × 2) can run concurrently. Total time is ~2-3 minutes (limited by slowest category: chown ~82s). 
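Because each category is its own nextest test, wall-clock time is bounded by the slowest category rather than the sum of all seventeen. A minimal illustrative sketch of that scheduling property (plain `sleep`s standing in for categories; not fcvm code):

```bash
#!/bin/sh
# Simulate three "categories" running concurrently under a parallel runner.
start=$(date +%s)
sleep 2 &   # stand-in for the slowest category (e.g. chown)
sleep 1 &   # faster category
sleep 1 &   # faster category
wait        # the runner waits for all categories to finish
elapsed=$(( $(date +%s) - start ))
# Wall time is ~2s (the max), not 4s (the sum).
echo "wall time: ${elapsed}s"
```

The same reasoning explains why total pjdfstest time tracks the ~82s chown category plus fixed setup overhead.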
## CI and Testing Philosophy @@ -412,12 +446,12 @@ All 8789 pjdfstest tests pass when running in a container with proper device cgr | Target | What | |--------|------| -| `make test` | fuse-pipe tests | -| `make test-vm` | All VM tests (rootless + bridged) | -| `make test-vm FILTER=exec` | Only exec tests | -| `make container-test` | fuse-pipe in container | -| `make container-test-vm` | VM tests in container | -| `make test-all` | Everything | +| `make test-unit` | Unit tests only (no VMs, no sudo) | +| `make test-fast` | + quick VM tests (rootless, no sudo) | +| `make test-all` | + slow VM tests (rootless, no sudo) | +| `make test-root` | + privileged tests (bridged, pjdfstest, sudo) | +| `make test` | Alias for test-root | +| `make container-test` | All tests in container | ### Path Overrides for CI @@ -425,7 +459,7 @@ Makefile paths can be overridden via environment: ```bash export FUSE_BACKEND_RS=/path/to/fuse-backend-rs export FUSER=/path/to/fuser -make container-test-pjdfstest +make container-test ``` ### CI Structure @@ -545,14 +579,13 @@ src/ └── setup/ # Setup subcommands tests/ -├── common/mod.rs # Shared test utilities (VmFixture, poll_health_by_pid) -├── test_sanity.rs # End-to-end VM sanity tests (rootless + bridged) -├── test_state_manager.rs # State manager unit tests -├── test_health_monitor.rs # Health monitoring tests -├── test_fuse_posix.rs # FUSE POSIX compliance in VM -├── test_fuse_in_vm.rs # FUSE integration in VM -├── test_localhost_image.rs # Local image tests -└── test_snapshot_clone.rs # Snapshot/clone workflow tests +├── common/mod.rs # Shared test utilities (VmFixture, poll_health_by_pid) +├── test_sanity.rs # End-to-end VM sanity tests (rootless + bridged) +├── test_state_manager.rs # State manager unit tests +├── test_health_monitor.rs # Health monitoring tests +├── test_fuse_in_vm_matrix.rs # In-VM pjdfstest (17 categories, parallel via nextest) +├── test_localhost_image.rs # Local image tests +└── test_snapshot_clone.rs # 
Snapshot/clone workflow tests fuse-pipe/tests/ ├── integration.rs # Basic FUSE operations (no root) @@ -561,7 +594,7 @@ fuse-pipe/tests/ ├── test_mount_stress.rs # Mount/unmount stress tests ├── test_allow_other.rs # AllowOther flag tests ├── test_unmount_race.rs # Unmount race condition tests -├── pjdfstest_matrix.rs # POSIX compliance (17 categories, parallel via nextest) +├── pjdfstest_matrix_root.rs # Host-side pjdfstest (17 categories, parallel) └── pjdfstest_common.rs # Shared pjdfstest utilities fuse-pipe/benches/ @@ -683,15 +716,17 @@ pub fn vm_runtime_dir(vm_id: &str) -> PathBuf { } ``` -**Setup**: Automatic via `make test-vm` or `make container-test-vm` (idempotent btrfs loopback + kernel copy). +**Setup**: Run `make setup-fcvm` before tests (called automatically by `make test-root` or `make container-test-root`). **⚠️ CRITICAL: Changing VM base image (fc-agent, rootfs)** -ALWAYS use Makefile commands to update the VM base: -- `make rebuild` - Rebuild fc-agent and regenerate rootfs/initrd -- Rootfs is auto-regenerated when setup script changes (via SHA-based caching) +When you change fc-agent or setup scripts, regenerate the rootfs: +1. Delete existing rootfs: `sudo rm -f /mnt/fcvm-btrfs/rootfs/layer2-*.raw /mnt/fcvm-btrfs/initrd/fc-agent-*.initrd` +2. Run setup: `make setup-fcvm` + +The rootfs is cached by SHA of setup script + kernel URL. Changes to these automatically invalidate the cache. -NEVER manually edit rootfs files. The setup script in `rootfs-plan.toml` and `src/setup/rootfs.rs` control what gets installed. Changes trigger automatic regeneration on next VM start. +NEVER manually edit rootfs files. The setup script in `rootfs-plan.toml` and `src/setup/rootfs.rs` control what gets installed. ### Memory Sharing (UFFD) @@ -761,12 +796,12 @@ Run `make help` for full list. 
Key targets: #### Testing | Target | Description | |--------|-------------| -| `make test` | fuse-pipe tests | -| `make test-vm` | All VM tests (rootless + bridged) | -| `make test-vm FILTER=exec` | Only exec tests | -| `make test-all` | Everything | -| `make container-test` | fuse-pipe in container | -| `make container-test-vm` | VM tests in container | +| `make test-unit` | Unit tests only (no VMs, no sudo) | +| `make test-fast` | + quick VM tests (rootless, no sudo) | +| `make test-all` | + slow VM tests (rootless, no sudo) | +| `make test-root` | + privileged tests (bridged, pjdfstest, sudo) | +| `make test` | Alias for test-root | +| `make container-test` | All tests in container | | `make container-shell` | Interactive shell | #### Linting @@ -792,18 +827,34 @@ Run `make help` for full list. Key targets: | Target | Description | |--------|-------------| | `make setup-btrfs` | Create btrfs loopback | -| `make setup-rootfs` | Trigger rootfs creation (~90 sec first run) | +| `make setup-fcvm` | Download kernel and create rootfs (runs `fcvm setup`) | ### How Setup Works -**What Makefile does (prerequisites):** -1. `setup-btrfs` - Creates 20GB btrfs loopback at `/mnt/fcvm-btrfs` +**Setup is explicit, not automatic.** VMs require kernel, rootfs, and initrd to exist before running. + +**Two ways to set up:** + +1. **`fcvm setup`** (explicit, works for all modes): + - Downloads kernel and creates rootfs + - Required before running VMs with bridged networking (root) -**What fcvm binary does (auto on first VM start):** -1. `ensure_kernel()` - Downloads Kata kernel from URL in `rootfs-plan.toml` if not present (cached by URL hash) -2. `ensure_rootfs()` - Creates Layer 2 rootfs if SHA doesn't match (downloads Ubuntu cloud image, runs setup in VM, creates initrd with fc-agent) +2. 
**`fcvm podman run --setup`** (rootless only): + - Adds `--setup` flag to opt-in to auto-setup + - Only works for rootless mode (no root) + - Disallowed when running as root - use `fcvm setup` instead -**Kernel source**: Kata Containers kernel (6.12.47 from Kata 3.24.0 release) with `CONFIG_FUSE_FS=y` built-in. This is specified in `rootfs-plan.toml` and auto-downloaded on first run. +**Without setup**, fcvm fails immediately if assets are missing: +``` +ERROR fcvm: Error: setting up rootfs: Rootfs not found. Run 'fcvm setup' first, or use --setup flag. +``` + +**What `fcvm setup` does:** +1. Downloads Kata kernel from URL in `rootfs-plan.toml` (~15MB, cached by URL hash) +2. Creates Layer 2 rootfs (~10GB, downloads Ubuntu cloud image, boots VM to install packages) +3. Creates fc-agent initrd (embeds statically-linked fc-agent binary) + +**Kernel source**: Kata Containers kernel (6.12.47 from Kata 3.24.0 release) with `CONFIG_FUSE_FS=y` built-in. ### Data Layout ``` @@ -853,6 +904,34 @@ ip addr add 172.16.29.1/24 dev tap-vm-c93e8 # Guest is 172.16.29.2 - Traffic flows: Guest → NAT → Host's DNS servers - No dnsmasq required +### Container Resource Limits (EAGAIN Debugging) + +**Symptom:** Tests fail with "Resource temporarily unavailable (os error 11)" or "fork/exec: resource temporarily unavailable" + +**Debugging steps:** +1. Check dmesg for cgroup rejections: + ```bash + sudo dmesg | grep -i "fork rejected" + # Look for: "cgroup: fork rejected by pids controller in /machine.slice/libpod-..." + ``` + +2. Check actual process/thread counts (usually much lower than limits): + ```bash + ps aux | wc -l # Process count + ps -eLf | wc -l # Thread count + ps -eo user,nlwp,comm --sort=-nlwp | head -20 # Top by threads + ``` + +3. 
Check container pids limit (NOT ulimit - cgroup is separate!): + ```bash + sudo podman run --rm alpine cat /sys/fs/cgroup/pids.max + # Default: 2048 (way too low for parallel VM tests) + ``` + +**Root cause:** Podman sets cgroup pids limit to 2048 by default. This is NOT the same as `ulimit -u` (nproc). The cgroup pids controller limits total processes/threads in the container. + +**Fix:** Use `--pids-limit=65536` in container run command (already in Makefile). + ### Pipe Buffer Deadlock in Tests (CRITICAL) **Problem:** Tests hang indefinitely when spawning fcvm with `Stdio::piped()` but not reading the pipes. @@ -897,9 +976,11 @@ let (mut child, pid) = common::spawn_fcvm(&["podman", "run", "--name", &vm_name, | Command | Description | |---------|-------------| -| `make container-test` | fuse-pipe tests | -| `make container-test-vm` | VM tests (rootless + bridged) | -| `make container-test-vm FILTER=exec` | Only exec tests | +| `make container-test-unit` | Unit tests in container | +| `make container-test-fast` | + quick VM tests (rootless) | +| `make container-test-all` | + slow VM tests (rootless) | +| `make container-test-root` | + privileged tests | +| `make container-test` | Alias for container-test-root | | `make container-shell` | Interactive shell | ### Tracing Targets diff --git a/.config/nextest.toml b/.config/nextest.toml index 3fc41ea0..755d4a35 100644 --- a/.config/nextest.toml +++ b/.config/nextest.toml @@ -43,22 +43,25 @@ retries = 0 max-threads = 1 # VM tests run at full parallelism (num-cpus) -# Previously limited to 16 threads due to namespace holder process deaths, -# but root cause was rootless tests running under sudo. Now that privileged -# tests filter out rootless tests (-E '!test(/rootless/)'), full parallelism works. 
[test-groups.vm-tests] max-threads = "num-cpus" [[profile.default.overrides]] filter = "package(fcvm) & test(/stress_100/)" test-group = "stress-tests" -slow-timeout = { period = "300s", terminate-after = 1 } +slow-timeout = { period = "600s", terminate-after = 1 } -# VM tests run with limited parallelism to avoid resource exhaustion +# VM tests get 10 minute timeout [[profile.default.overrides]] -filter = "package(fcvm) & test(/test_/) & !test(/stress_100/)" +filter = "package(fcvm) & test(/test_/) & !test(/stress_100/) & !test(/pjdfstest_vm/)" test-group = "vm-tests" -slow-timeout = { period = "300s", terminate-after = 1 } +slow-timeout = { period = "600s", terminate-after = 1 } + +# In-VM pjdfstest needs 15 minutes (image import via FUSE over vsock is slow) +[[profile.default.overrides]] +filter = "package(fcvm) & test(/pjdfstest_vm/)" +test-group = "vm-tests" +slow-timeout = { period = "900s", terminate-after = 1 } # fuse-pipe tests can run with full parallelism [[profile.default.overrides]] diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index d08f5e3c..d9a9d917 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -15,7 +15,7 @@ env: jobs: container-rootless: name: Container (rootless) - runs-on: ubuntu-latest + runs-on: buildjet-32vcpu-ubuntu-2204 steps: - uses: actions/checkout@v4 with: @@ -36,7 +36,7 @@ jobs: container-sudo: name: Container (sudo) - runs-on: ubuntu-latest + runs-on: buildjet-32vcpu-ubuntu-2204 steps: - uses: actions/checkout@v4 with: @@ -56,7 +56,7 @@ jobs: run: make ci-container-sudo vm: - name: Host (sudo+rootless) + name: VM (bare metal) runs-on: buildjet-32vcpu-ubuntu-2204 steps: - uses: actions/checkout@v4 @@ -72,6 +72,27 @@ jobs: repository: ejc3/fuser ref: master path: fuser + - name: Install Rust + run: | + curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y + echo "$HOME/.cargo/bin" >> $GITHUB_PATH + - name: Install dependencies + run: | + sudo apt-get update + sudo apt-get 
install -y fuse3 libfuse3-dev libclang-dev clang musl-tools \ + iproute2 iptables slirp4netns dnsmasq qemu-utils e2fsprogs parted \ + podman skopeo busybox-static cpio zstd + - name: Install Firecracker + run: | + curl -L -o /tmp/firecracker.tgz \ + https://github.com/firecracker-microvm/firecracker/releases/download/v1.14.0/firecracker-v1.14.0-x86_64.tgz + sudo tar -xzf /tmp/firecracker.tgz -C /usr/local/bin --strip-components=1 \ + release-v1.14.0-x86_64/firecracker-v1.14.0-x86_64 \ + release-v1.14.0-x86_64/jailer-v1.14.0-x86_64 + sudo mv /usr/local/bin/firecracker-v1.14.0-x86_64 /usr/local/bin/firecracker + sudo mv /usr/local/bin/jailer-v1.14.0-x86_64 /usr/local/bin/jailer + - name: Install cargo-nextest + run: cargo install cargo-nextest --locked - name: Setup KVM and networking run: | sudo chmod 666 /dev/kvm @@ -83,6 +104,6 @@ fi sudo chmod 666 /dev/userfaultfd sudo sysctl -w vm.unprivileged_userfaultfd=1 - - name: make container-test-vm + - name: make test-root working-directory: fcvm - run: make container-test-vm + run: make test-root diff --git a/.gitignore b/.gitignore index ae2f9378..4500c3c7 100644 --- a/.gitignore +++ b/.gitignore @@ -8,3 +8,4 @@ sync-test/ # Local settings (machine-specific) *.local.* *.local +cargo-home/ diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 42c1676b..c487bbde 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -40,12 +40,16 @@ Have an idea?
[Open an issue](https://github.com/ejc3/fcvm/issues/new) describin # Build everything make build +# First-time setup (downloads kernel + creates rootfs, ~5-10 min) +make setup-btrfs +fcvm setup + # Run lints (must pass before PR) make lint # Run tests make test # fuse-pipe tests -make test-vm # VM integration tests (requires KVM) +make test-root # VM tests (requires sudo + KVM) # Format code make fmt diff --git a/Cargo.lock b/Cargo.lock index d50c9806..44ff6036 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -105,17 +105,6 @@ version = "1.1.2" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "1505bd5d3d116872e7271a6d4e16d81d0c8570876c8de68093a09ac269d8aac0" -[[package]] -name = "atty" -version = "0.2.14" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "d9b39be18770d11421cdb1b9947a45dd3f37e93092cbf377614828a319d5fee8" -dependencies = [ - "hermit-abi 0.1.19", - "libc", - "winapi", -] - [[package]] name = "autocfg" version = "1.5.0" @@ -570,10 +559,10 @@ version = "0.1.0" dependencies = [ "anyhow", "async-trait", - "atty", "chrono", "clap", "criterion", + "fs2", "fuse-pipe", "hex", "hyper 0.14.32", @@ -869,15 +858,6 @@ version = "0.5.0" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "2304e00983f87ffb38b55b444b5e3b60a884b5d30c0fca7d82fe33449bbe55ea" -[[package]] -name = "hermit-abi" -version = "0.1.19" -source = "registry+https://github.com/rust-lang/crates.io-index" -checksum = "62b467343b94ba476dcb2500d242dadbb39557df889310ac77c5d99100aaac33" -dependencies = [ - "libc", -] - [[package]] name = "hermit-abi" version = "0.5.2" @@ -1223,7 +1203,7 @@ version = "0.4.17" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "3640c1c38b8e4e43584d8df18be5fc6b0aa314ce6ebf51b53313d4306cca8e46" dependencies = [ - "hermit-abi 0.5.2", + "hermit-abi", "libc", "windows-sys 0.61.2", ] diff --git a/Cargo.toml b/Cargo.toml index be5d4880..b9a664ad 100644 --- a/Cargo.toml +++ 
b/Cargo.toml @@ -4,6 +4,8 @@ members = [".", "fuse-pipe", "fc-agent"] default-members = [".", "fuse-pipe", "fc-agent"] # Exclude sync-test (used only for Makefile sync verification) exclude = ["sync-test"] +# Resolver v2 makes --no-default-features work across all workspace members +resolver = "2" [package] name = "fcvm" @@ -12,7 +14,6 @@ edition = "2021" [dependencies] anyhow = "1" -atty = "0.2" clap = { version = "4", features = ["derive", "env"] } serde = { version = "1", features = ["derive"] } serde_json = "1" @@ -42,11 +43,18 @@ fuse-pipe = { path = "fuse-pipe", default-features = false } url = "2" tokio-util = "0.7" regex = "1.12.2" +fs2 = "0.4.3" [features] -# Test category - only gate tests that require sudo -# Unprivileged tests run by default (no feature flag needed) -privileged-tests = [] # Tests requiring sudo (iptables, root podman storage) +# Default: all integration tests that work without sudo (rootless networking) +default = ["integration-fast", "integration-slow"] + +# Test speed tiers (unit tests always run, no feature flag needed) +integration-fast = [] # Quick VM tests, < 30s each (sanity, signal, exec, port forward) +integration-slow = [] # Slow VM tests, > 30s each (clone, snapshot, fuse posix, egress) + +# Privileged tests require sudo (bridged networking, pjdfstest, iptables) +privileged-tests = [] [dev-dependencies] serial_test = "3" diff --git a/Containerfile b/Containerfile index b5ca506e..ade28ec3 100644 --- a/Containerfile +++ b/Containerfile @@ -1,122 +1,48 @@ -# fcvm test container -# -# Build context must include fuse-backend-rs and fuser alongside fcvm: -# cd ~/fcvm && podman build -t fcvm-test -f Containerfile \ -# --build-context fuse-backend-rs=../fuse-backend-rs \ -# --build-context fuser=../fuser . 
-# -# Test with: podman run --rm --privileged --device /dev/fuse fcvm-test - FROM docker.io/library/rust:1.83-bookworm -# Copy rust-toolchain.toml to read version from single source of truth +# Install Rust toolchain from rust-toolchain.toml COPY rust-toolchain.toml /tmp/rust-toolchain.toml - -# Install toolchain version from rust-toolchain.toml (avoids version drift) -# Edition 2024 is stable since Rust 1.85 -# Also add musl targets for statically linked fc-agent (portable across glibc versions) RUN RUST_VERSION=$(grep 'channel' /tmp/rust-toolchain.toml | cut -d'"' -f2) && \ rustup toolchain install $RUST_VERSION && \ rustup default $RUST_VERSION && \ rustup component add rustfmt clippy && \ rustup target add aarch64-unknown-linux-musl x86_64-unknown-linux-musl -# Install cargo-nextest for better test parallelism and output -RUN cargo install cargo-nextest --locked +# Install cargo tools +RUN cargo install cargo-nextest cargo-audit cargo-deny --locked # Install system dependencies RUN apt-get update && apt-get install -y \ - # FUSE support - fuse3 \ - libfuse3-dev \ - # pjdfstest build deps - autoconf \ - automake \ - libtool \ - # pjdfstest runtime deps - perl \ - # Build deps for bindgen (userfaultfd-sys) - libclang-dev \ - clang \ - # musl libc for statically linked fc-agent (portable across glibc versions) - musl-tools \ - # fcvm VM test dependencies - iproute2 \ - iptables \ - slirp4netns \ - dnsmasq \ - qemu-utils \ - e2fsprogs \ - parted \ - # Container runtime for localhost image tests - podman \ - skopeo \ - # Utilities - git \ - curl \ - sudo \ - procps \ - # Required for initrd creation (must be statically linked for kernel boot) - busybox-static \ - cpio \ - # Clean up + fuse3 libfuse3-dev autoconf automake libtool perl libclang-dev clang \ + musl-tools iproute2 iptables slirp4netns dnsmasq qemu-utils e2fsprogs \ + parted podman skopeo git curl sudo procps zstd busybox-static cpio uidmap \ && rm -rf /var/lib/apt/lists/* -# Download and install 
Firecracker (architecture-aware) -# v1.14.0 adds network_overrides support for snapshot cloning +# Install Firecracker ARG ARCH=aarch64 -RUN curl -L -o /tmp/firecracker.tgz \ +RUN curl -fsSL -o /tmp/fc.tgz \ https://github.com/firecracker-microvm/firecracker/releases/download/v1.14.0/firecracker-v1.14.0-${ARCH}.tgz \ - && tar --no-same-owner -xzf /tmp/firecracker.tgz -C /tmp \ + && tar --no-same-owner -xzf /tmp/fc.tgz -C /tmp \ && mv /tmp/release-v1.14.0-${ARCH}/firecracker-v1.14.0-${ARCH} /usr/local/bin/firecracker \ - && chmod +x /usr/local/bin/firecracker \ - && rm -rf /tmp/firecracker.tgz /tmp/release-v1.14.0-${ARCH} - -# Build and install pjdfstest (tests expect it at /tmp/pjdfstest-check/) -RUN git clone --depth 1 https://github.com/pjd/pjdfstest /tmp/pjdfstest-check \ - && cd /tmp/pjdfstest-check \ - && autoreconf -ifs \ - && ./configure \ - && make + && rm -rf /tmp/fc.tgz /tmp/release-v1.14.0-${ARCH} -# Create non-root test user with access to fuse group -RUN groupadd -f fuse \ +# Setup testuser with sudo and namespace support +RUN echo "user_allow_other" >> /etc/fuse.conf \ + && groupadd -f fuse && groupadd -f kvm \ && useradd -m -s /bin/bash testuser \ - && usermod -aG fuse testuser - -# Rust tools are installed system-wide at /usr/local/cargo (owned by root) -# Symlink to /usr/local/bin so sudo can find them (sudo uses secure_path) -RUN ln -s /usr/local/cargo/bin/cargo /usr/local/bin/cargo \ - && ln -s /usr/local/cargo/bin/rustc /usr/local/bin/rustc \ - && ln -s /usr/local/cargo/bin/cargo-nextest /usr/local/bin/cargo-nextest - -# Allow testuser to sudo without password (like host dev setup) -RUN echo "testuser ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers - -# Configure subordinate UIDs/GIDs for rootless user namespaces -# testuser (UID 1000) gets subordinate range 100000-165535 (65536 IDs) -# This enables `unshare --user --map-auto` without root -RUN echo "testuser:100000:65536" >> /etc/subuid \ + && usermod -aG fuse,kvm testuser \ + && echo "testuser 
ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers \ + && echo "testuser:100000:65536" >> /etc/subuid \ && echo "testuser:100000:65536" >> /etc/subgid -# Install uidmap package for newuidmap/newgidmap setuid helpers -# These are required for --map-auto to work -RUN apt-get update && apt-get install -y uidmap && rm -rf /var/lib/apt/lists/* - -# Create workspace structure matching local paths -# Source code is mounted at runtime, not copied - ensures code is always fresh -WORKDIR /workspace - -# Create directories that will be mount points -RUN mkdir -p /workspace/fcvm /workspace/fuse-backend-rs /workspace/fuser - -# Make workspace owned by testuser for non-root tests -RUN chown -R testuser:testuser /workspace +# Symlink cargo tools to /usr/local/bin for sudo +RUN for bin in cargo rustc rustfmt cargo-clippy clippy-driver cargo-nextest cargo-audit cargo-deny; do \ + ln -s /usr/local/cargo/bin/$bin /usr/local/bin/$bin 2>/dev/null || true; done +# Setup workspace WORKDIR /workspace/fcvm +RUN mkdir -p /workspace/fcvm /workspace/fuse-backend-rs /workspace/fuser \ + && chown -R testuser:testuser /workspace -# Switch to testuser - tests run as normal user with sudo like on host USER testuser - -# Default command runs all fuse-pipe tests -CMD ["cargo", "nextest", "run", "--release", "-p", "fuse-pipe"] +CMD ["make", "test-unit"] diff --git a/DESIGN.md b/DESIGN.md index a2fdf4ba..5866df08 100644 --- a/DESIGN.md +++ b/DESIGN.md @@ -40,7 +40,11 @@ - Process blocks until VM exits (hanging/foreground mode) - VM dies when process is killed (lifetime binding) -2. **`fcvm snapshot` Commands** +2. **`fcvm exec` Command** + - Execute commands in running VMs + - Supports running in guest OS or inside container (`-c` flag) + +3. 
**`fcvm snapshot` Commands** - `fcvm snapshot create`: Create snapshot from running VM - `fcvm snapshot serve`: Start UFFD memory server for cloning - `fcvm snapshot run`: Spawn clone from memory server @@ -48,23 +52,23 @@ - Shares memory via UFFD page fault handler - Creates independent VM with its own networking -3. **Networking Modes** +4. **Networking Modes** - **Rootless**: Works without root privileges using slirp4netns - - **Privileged**: Uses nftables + bridge for better performance + - **Privileged**: Uses iptables + TAP for better performance - **Port mapping**: `[HOSTIP:]HOSTPORT:GUESTPORT[/PROTO]` syntax - Support multiple ports, TCP/UDP protocols -4. **Volume Mounting** +5. **Volume Mounting** - Map local directories to guest filesystem - Support block devices, sshfs, and NFS modes - Read-only and read-write mounts -5. **Resource Configuration** +6. **Resource Configuration** - vCPU overcommit (more vCPUs than physical cores) - Memory overcommit with balloon device - Configurable memory ballooning -6. **Snapshot & Clone** +7. **Snapshot & Clone** - Save VM state at "warm" checkpoint (after container ready) - Fast restore from snapshot - CoW disks for instant cloning @@ -240,37 +244,42 @@ async fn setup() -> Result { #### Privileged Networking (`bridged.rs`) -Uses Linux bridge + nftables for native performance. +Uses TAP devices + iptables for native performance. 
**Features**: - Requires root or CAP_NET_ADMIN - Better performance than rootless -- Uses DNAT for port forwarding -- Bridge networking for VM isolation +- Uses DNAT for port forwarding (scoped to veth IP) +- Network namespace isolation per VM **Implementation**: ```rust -struct PrivilegedNetwork { +struct BridgedNetwork { vm_id: String, tap_device: String, - bridge: String, + namespace_id: String, + host_veth: String, // veth_outer in host namespace + guest_veth: String, // veth_inner in VM namespace guest_ip: String, - host_ip: String, + host_ip: String, // veth's host IP (used for port forwarding) port_mappings: Vec, } async fn setup() -> Result { - create_tap_device(tap_name) - add_to_bridge(tap_name, bridge) + create_namespace(namespace_id) + create_veth_pair(host_veth, guest_veth) + move_veth_to_namespace(guest_veth, namespace_id) + create_tap_device_in_namespace(tap_name, namespace_id) for mapping in port_mappings { - setup_nat_rule(mapping, guest_ip) + // Scope DNAT to veth IP so same port works across VMs + setup_nat_rule(mapping, guest_ip, host_ip) } } ``` -**NAT Rule Example**: +**NAT Rule Example** (scoped to veth IP): ```bash -nft add rule ip nat PREROUTING tcp dport 8080 dnat to 172.16.0.10:80 +iptables -t nat -A PREROUTING -d 172.30.x.1 -p tcp --dport 8080 -j DNAT --to-destination 172.30.x.2:80 ``` #### Port Mapping Format @@ -465,61 +474,65 @@ Host (127.0.0.2:8080) → slirp4netns → slirp0 (10.0.2.100:8080) → IP forwar - Works in nested VMs and restricted environments - Fully compatible with rootless Podman in guest -### Privileged Mode (nftables + bridge) +### Privileged Mode (Network Namespace + veth + iptables) **Topology**: ``` -┌───────────────────────────────────────┐ -│ Host │ -│ ┌─────────┐ │ -│ │ fcvmbr0 │ (172.16.0.1) │ -│ └────┬────┘ │ -│ │ │ -│ ┌────┴─────┐ │ -│ │ tap-vm1 │ ← connected to VM │ -│ └──────────┘ │ -│ │ -│ nftables DNAT rules: │ -│ tcp dport 8080 → 172.16.0.10:80 │ -└───────────────────────────────────────┘ - │ - ▼ - 
┌──────────────┐ - │ Firecracker │ - │ eth0: │ - │ 172.16.0.10 │ - └──────────────┘ -``` - -**Bridge Setup**: +┌─────────────────────────────────────────────────────────────────┐ +│ Host Namespace │ +│ ┌──────────────┐ veth pair ┌──────────────────┐ │ +│ │ veth_outer │◄─────────────────────────►│ VM Namespace │ │ +│ │ 172.30.x.1 │ │ (fcvm-vm-xxxxx) │ │ +│ └──────────────┘ │ │ │ +│ │ veth_inner │ │ +│ iptables DNAT (scoped to veth IP): │ 172.30.x.2 │ │ +│ -d 172.30.x.1 --dport 8080 → 172.30.x.2 │ │ │ │ +│ │ ▼ │ │ +│ │ ┌──────────┐ │ │ +│ │ │ TAP │ │ │ +│ │ └────┬─────┘ │ │ +│ │ │ │ │ +│ │ ┌────▼─────┐ │ │ +│ │ │Firecracker│ │ │ +│ │ │eth0: │ │ │ +│ │ │172.30.x.2 │ │ │ +│ │ └───────────┘ │ │ +│ └──────────────────┘ │ +└─────────────────────────────────────────────────────────────────┘ +``` + +**Accessing port-forwarded services**: ```bash -ip link add fcvmbr0 type bridge -ip addr add 172.16.0.1/24 dev fcvmbr0 -ip link set fcvmbr0 up -``` +# Curl the veth's host IP (172.30.x.1), NOT localhost +curl http://172.30.x.1:8080 -**TAP Device**: -```bash -ip tuntap add tap-vm1 mode tap -ip link set tap-vm1 master fcvmbr0 -ip link set tap-vm1 up +# Get the veth IP from VM state +fcvm ls --json | jq '.[0].config.network.host_ip' ``` -**nftables Rules**: +**iptables Rules** (from `src/network/portmap.rs`): ```bash -# Create NAT table -nft add table ip nat +# DNAT for external traffic - scoped to veth's host IP to avoid port conflicts +# Each VM has unique veth IP (172.30.x.y) so same port works across VMs +iptables -t nat -A PREROUTING -d 172.30.x.1 -p tcp --dport 8080 -j DNAT --to-destination 172.30.x.2:80 -# DNAT for port forwarding -nft add rule ip nat PREROUTING tcp dport 8080 dnat to 172.16.0.10:80 +# DNAT for localhost traffic (OUTPUT chain) - also scoped to veth IP +iptables -t nat -A OUTPUT -d 172.30.x.1 -p tcp --dport 8080 -j DNAT --to-destination 172.30.x.2:80 -# MASQUERADE for outbound -nft add rule ip nat POSTROUTING oifname "eth0" masquerade +# MASQUERADE for 
outbound (guest → internet) +iptables -t nat -A POSTROUTING -s 172.30.x.0/30 -j MASQUERADE +``` **IP Allocation**: -- Bridge: `172.16.0.1/24` -- VMs: `172.16.0.10`, `172.16.0.11`, ... (incrementing) +- Each VM gets unique /30 subnet: `172.30.{x}.{y}/30` +- Veth host IP: `172.30.{x}.{y}` (used for port forwarding) +- Guest IP: `172.30.{x}.{y+1}` --- @@ -898,6 +911,19 @@ The guest is configured to support rootless Podman: ### Commands +#### `fcvm setup` + +**Purpose**: Download kernel and create rootfs (first-time setup). + +**Usage**: +```bash +fcvm setup +``` + +This downloads the Kata kernel (~15MB) and creates the Layer 2 rootfs (~10GB with Ubuntu + Podman). Takes 5-10 minutes on first run. + +**Note**: Must be run before `fcvm podman run` with bridged networking. For rootless mode, you can use the `--setup` flag on `fcvm podman run` instead. + #### `fcvm podman run` **Purpose**: Launch a container in a new Firecracker VM. @@ -923,6 +949,7 @@ fcvm podman run --name <NAME> [OPTIONS] <IMAGE> --balloon Memory balloon target --health-check HTTP health check URL --privileged Run container in privileged mode +--setup Run setup if kernel/rootfs missing (rootless only) ``` **Examples**: @@ -958,6 +985,36 @@ sudo fcvm podman run \ ml-training:latest ``` +#### `fcvm exec` + +**Purpose**: Execute a command in a running VM. + +**Usage**: +```bash +fcvm exec --pid <PID> [OPTIONS] -- <COMMAND> [ARGS...]
+``` + +**Options**: +``` +--pid PID of the fcvm process managing the VM (required) +-c, --container Run command inside the container (not just guest OS) +``` + +**Examples**: +```bash +# Run command in guest OS +sudo fcvm exec --pid 12345 -- ls -la / + +# Run command inside container +sudo fcvm exec --pid 12345 -c -- curl -s http://localhost/health + +# Check egress connectivity from guest +sudo fcvm exec --pid 12345 -- curl -s ifconfig.me + +# Check egress connectivity from container +sudo fcvm exec --pid 12345 -c -- wget -q -O - http://ifconfig.me +``` + #### `fcvm snapshot create` **Purpose**: Create a snapshot from a running VM. @@ -1097,13 +1154,13 @@ fcvm/ │ │ │ ├── commands/ # CLI command implementations │ │ ├── mod.rs +│ │ ├── common.rs # Shared utilities +│ │ ├── exec.rs # fcvm exec │ │ ├── ls.rs # fcvm ls │ │ ├── podman.rs # fcvm podman run -│ │ ├── snapshot.rs # fcvm snapshot {create,serve,run} -│ │ ├── snapshots.rs # fcvm snapshots │ │ ├── setup.rs # fcvm setup -│ │ ├── memory_server.rs # UFFD memory server subprocess -│ │ └── common.rs # Shared utilities +│ │ ├── snapshot.rs # fcvm snapshot {create,serve,run} + UFFD server +│ │ └── snapshots.rs # fcvm snapshots │ │ │ ├── firecracker/ # Firecracker integration │ │ ├── mod.rs @@ -1220,94 +1277,78 @@ All builds are done via the root Makefile. 
make build # Build fcvm + fc-agent make clean # Clean build artifacts -# Testing -make test # Run fuse-pipe tests (noroot + root) -make test-vm # Run VM tests (rootless + bridged) -make test-all # Everything: test + test-vm + test-pjdfstest +# Testing (3 tiers) +make test-unit # Unit tests only (no VMs, <1s each) +make test-integration-fast # Quick VM tests (<30s each) +make test-root # All tests including slow (pjdfstest) + +# Container testing +make container-test-unit # Unit tests in container +make container-test-integration-fast # Quick VM tests in container +make container-test-root # All tests in container +make container-shell # Interactive shell # Linting make lint # Run clippy + fmt-check make fmt # Format code -# Container testing -make container-test # fuse-pipe tests in container -make container-test-vm # VM tests in container -make container-shell # Interactive shell +# Options +FILTER=pattern # Filter tests by name +STREAM=1 # Stream output (no capture) +LIST=1 # List tests without running ``` See `make help` for the complete list of targets. -### Configuration File - -**Location**: `~/.config/fcvm/config.yml` or `/etc/fcvm/config.yml` +### Data Directory -**Format**: -```yaml -# Data directory for VM state -data_dir: /var/lib/fcvm +All fcvm data is stored under `/mnt/fcvm-btrfs/` (btrfs filesystem for CoW reflinks). +Override with `FCVM_BASE_DIR` environment variable. 
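As a quick sketch of how the override composes with the documented layout (the `fcvm_base` helper and the `vm-abc123` ID below are illustrative, not fcvm code):

```shell
#!/bin/sh
# Resolve the fcvm base directory, honoring the FCVM_BASE_DIR override.
# The helper and example VM ID are illustrative; the directory layout
# follows the documented /mnt/fcvm-btrfs tree.
fcvm_base() {
    printf '%s\n' "${FCVM_BASE_DIR:-/mnt/fcvm-btrfs}"
}

vm_id="vm-abc123"
state_file="$(fcvm_base)/state/${vm_id}.json"
vm_disk="$(fcvm_base)/vm-disks/${vm_id}/disks/rootfs.raw"
echo "$state_file"
echo "$vm_disk"
```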
-# Firecracker binary path -firecracker_bin: /usr/local/bin/firecracker - -# Kernel image -kernel_path: /var/lib/fcvm/kernels/vmlinux.bin - -# Base rootfs directory (layer2-{sha}.raw files) -rootfs_dir: /var/lib/fcvm/rootfs - -# Default settings -defaults: - mode: auto - vcpu: 2 - memory_mib: 2048 - map_mode: block - logs: stream - -# Network configuration -network: - mode: auto - bridge: fcvmbr0 - subnet: 172.16.0.0/24 - guest_ip_start: 172.16.0.10 - -# Logging -logging: - level: info - format: json +**Layout** (from `src/paths.rs`): +``` +/mnt/fcvm-btrfs/ +├── kernels/ # Kernel binaries +│ └── vmlinux-{sha}.bin +├── rootfs/ # Base rootfs images +│ └── layer2-{sha}.raw +├── initrd/ # fc-agent injection initrds +│ └── fc-agent-{sha}.initrd +├── vm-disks/ # Per-VM CoW disk copies +│ └── {vm-id}/disks/rootfs.raw +├── snapshots/ # Firecracker snapshots +├── state/ # VM state JSON files +│ └── {vm-id}.json +└── cache/ # Downloaded images ``` ### State Persistence -**VM State** (`~/.local/share/fcvm/vms//state.json`): +**VM State** (`/mnt/fcvm-btrfs/state/{vm-id}.json`): ```json { - "vm_id": "abc123", + "schema_version": 1, + "vm_id": "vm-abc123...", "name": "my-nginx", "status": "running", + "health_status": "healthy", + "exit_code": null, "pid": 12345, "created_at": "2025-01-09T12:00:00Z", + "last_updated": "2025-01-09T12:00:05Z", "config": { - "image": "nginx:latest", + "image": "nginx:alpine", "vcpu": 2, "memory_mib": 2048, "network": { - "mode": "rootless", "tap_device": "tap-abc123", - "guest_mac": "02:aa:bb:cc:dd:ee", - "guest_ip": "10.0.2.15", - "port_mappings": [ - {"host_port": 8080, "guest_port": 80, "proto": "tcp"} - ] + "guest_ip": "172.16.29.2", + "loopback_ip": "127.0.0.2" }, - "disks": [ - { - "path": "/var/lib/fcvm/vms/abc123/rootfs.raw", - "is_root": true - } - ], - "volumes": [ - {"host": "/data", "guest": "/mnt/data", "readonly": false} - ] + "volumes": [], + "process_type": "vm", + "snapshot_name": null, + "serve_pid": null } } ``` @@ -1392,13 
+1433,12 @@ RUST_LOG=trace fcvm run nginx:latest - PID-based naming for additional uniqueness - Automatic cleanup on test exit -**Privileged/Unprivileged Test Organization**: -- Tests requiring sudo use `#[cfg(feature = "privileged-tests")]` -- Unprivileged tests run by default (no feature flag needed) -- Privileged tests: Need sudo for iptables, root podman storage -- Unprivileged tests: Run without sudo, use slirp4netns networking -- Makefile uses `--features` for selection: `make test-vm FILTER=exec` runs all exec tests -- Container tests: Use appropriate container run configurations (CONTAINER_RUN_FCVM vs CONTAINER_RUN_UNPRIVILEGED) +**Test Tier Organization** (feature-gated): +- `test-unit`: No feature flags, fast tests without VMs +- `test-integration-fast`: `--features integration-fast,privileged-tests` (quick VM tests <30s) +- `test-root`: All features including `integration-slow` (pjdfstest, slow VM tests) +- Filter by name pattern: `make test-root FILTER=exec` +- Container configs: `CONTAINER_RUN_ROOTLESS` (unit) and `CONTAINER_RUN_ROOT` (VM tests) ### Unit Tests @@ -1470,6 +1510,40 @@ kill $CLONE_PID $SERVE_PID $BASELINE_PID **Note**: `--network rootless` uses slirp4netns (no root required). `--network bridged` (default) uses iptables/TAP devices (requires sudo). +### POSIX Compliance (pjdfstest) + +The fuse-pipe library passes the pjdfstest POSIX compliance suite. Tests run via `make test-root` or `make container-test-root`. 
+ +**Test Counts**: +- 237 total test files in pjdfstest +- 54 skipped on Linux (FreeBSD/ZFS/UFS-specific) +- 183 real test files run +- **8789 assertions** pass + +**Skipped Categories** (via `quick_exit()` - outputs trivial "ok 1"): + +| Category | Files | Skipped | Real | Reason | +|----------|-------|---------|------|--------| +| granular | 7 | 7 | 0 | FreeBSD extended ACLs only | +| open | 26 | 8 | 18 | FreeBSD-specific open behaviors | +| link | 18 | 6 | 12 | FreeBSD hardlink semantics | +| rename | 25 | 5 | 20 | FreeBSD rename edge cases | +| rmdir | 16 | 4 | 12 | FreeBSD rmdir behaviors | +| ftruncate | 15 | 3 | 12 | FreeBSD:UFS specific | +| mkdir | 13 | 3 | 10 | FreeBSD:UFS specific | +| mkfifo | 13 | 3 | 10 | FreeBSD:UFS specific | +| symlink | 13 | 3 | 10 | FreeBSD:UFS specific | +| truncate | 15 | 3 | 12 | FreeBSD:UFS specific | +| unlink | 15 | 3 | 12 | FreeBSD:UFS specific | +| chflags | 14 | 2 | 12 | Some UFS-specific flags | +| chmod | 13 | 2 | 11 | FreeBSD:ZFS specific | +| chown | 11 | 2 | 9 | FreeBSD:ZFS specific | +| mknod | 12 | 0 | 12 | All run | +| posix_fallocate | 1 | 0 | 1 | All run | +| utimensat | 10 | 0 | 10 | All run | + +**Skip mechanism**: Tests check `${os}:${fs}` and call `quick_exit()` for unsupported OS/filesystem combinations. This outputs TAP format `1..1` + `ok 1` (trivial pass) rather than running real assertions. 
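The `${os}:${fs}` gate described above can be sketched as a tiny TAP emitter (function and variable names here are illustrative, not pjdfstest's actual helpers):

```shell
#!/bin/sh
# Sketch of the quick_exit() skip mechanism: unsupported ${os}:${fs}
# combinations emit a trivial one-assertion TAP pass instead of running
# real assertions. Names are illustrative, not pjdfstest's own code.
tap_for() {
    os="$1"; fs="$2"
    case "${os}:${fs}" in
    FreeBSD:UFS|FreeBSD:ZFS)
        # placeholder for the real assertion body
        printf 'running real assertions\n'
        ;;
    *)
        # quick_exit(): trivial TAP pass
        printf '1..1\nok 1\n'
        ;;
    esac
}

tap_for Linux ext4     # trivially passes
tap_for FreeBSD UFS    # would run the real tests
```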
+ --- ## Performance Targets @@ -1527,7 +1601,7 @@ kill $CLONE_PID $SERVE_PID $BASELINE_PID ### Privileged Mode -- **Requires CAP_NET_ADMIN**: For TAP/bridge/nftables setup +- **Requires CAP_NET_ADMIN**: For TAP/iptables setup - **Minimal privileges**: Only for network setup, not VM execution - **Firecracker jailer**: Can use jailer for additional sandboxing (future) @@ -1596,25 +1670,62 @@ kill $CLONE_PID $SERVE_PID $BASELINE_PID - **TAP device**: Virtual network interface (TUN/TAP) - **slirp4netns**: User-mode networking for rootless containers - **CoW**: Copy-on-Write, disk strategy for fast cloning -- **nftables**: Linux firewall/NAT configuration tool +- **iptables**: Linux firewall/NAT configuration tool - **vsock**: Virtual socket for host-guest communication - **Balloon device**: Memory reclamation mechanism for VMs --- +## Build Performance + +Benchmarked on c6g.metal (64 ARM cores, 128GB RAM). + +### Compilation Times + +| Scenario | Time | Notes | +|----------|------|-------| +| Cold build (clean target) | 44s | ~12 parallel rustc processes | +| Incremental (touch main.rs) | 13s | Only recompiles fcvm | +| test-unit LIST (cold) | 24s | Compiles test binaries | +| test-unit LIST (warm) | 1.2s | No recompilation | + +### Optimization Attempts + +| Tool | Cold Build | Incremental | Verdict | +|------|------------|-------------|---------| +| Default (no tools) | 44s | 13.7s | Baseline | +| mold linker | 43s | 12.7s | ~1s savings, not worth config | +| sccache | 52s cold / 21s warm | 13s | Overhead > benefit for local dev | + +### Why Only 12 Parallel Processes? + +Cargo parallelizes by **crate**, limited by the dependency graph: +- Early build: many leaf crates → high parallelism (11+ rustc) +- Late build: waiting on syn, tokio → low parallelism (1-3 rustc) + +The 64 CPUs help within each crate (LLVM codegen), but crate-level parallelism is dependency-limited. + +### Recommendations + +- **Local dev**: Use defaults. Incremental builds are fast (13s). 
+- **CI**: Consider sccache if rebuilding from scratch frequently. +- **mold**: Not worth it - linking is not the bottleneck. + +--- + ## References - [Firecracker Documentation](https://github.com/firecracker-microvm/firecracker/tree/main/docs) - [Firecracker API Specification](https://github.com/firecracker-microvm/firecracker/blob/main/src/api_server/swagger/firecracker.yaml) - [Podman Documentation](https://docs.podman.io/) - [slirp4netns](https://github.com/rootless-containers/slirp4netns) -- [nftables Wiki](https://wiki.nftables.org/) +- [iptables Documentation](https://netfilter.org/documentation/) - [KVM Documentation](https://www.linux-kvm.org/page/Documents) --- **End of Design Specification** -*Version: 2.1* -*Date: 2025-12-21* +*Version: 2.2* +*Date: 2025-12-24* *Author: fcvm project* diff --git a/Makefile b/Makefile index ef06303f..8ebb3d40 100644 --- a/Makefile +++ b/Makefile @@ -1,591 +1,148 @@ SHELL := /bin/bash -# Paths (can be overridden via environment for CI) +# Paths (can be overridden via environment) FUSE_BACKEND_RS ?= /home/ubuntu/fuse-backend-rs FUSER ?= /home/ubuntu/fuser -# SUDO prefix - override to empty when already root (e.g., in container) -SUDO ?= sudo - -# Separate target directories for sudo vs non-sudo builds -# This prevents permission conflicts when running tests in parallel -TARGET_DIR := target -TARGET_DIR_ROOT := target-root - -# Container image name and architecture -CONTAINER_IMAGE := fcvm-test +# Container settings +CONTAINER_TAG := fcvm-test:latest CONTAINER_ARCH ?= aarch64 -# Test filter - use to run subset of tests -# Usage: make test-vm FILTER=sanity (runs only *sanity* tests) -# make test-vm FILTER=exec (runs only *exec* tests) +# Test options: FILTER=pattern STREAM=1 LIST=1 FILTER ?= - -# Stream test output (disable capture) - use for debugging -# Usage: make test-vm STREAM=1 (show output as tests run) -STREAM ?= 0 ifeq ($(STREAM),1) NEXTEST_CAPTURE := --no-capture -else -NEXTEST_CAPTURE := endif - -# Enable fc-agent 
strace debugging - use to diagnose fc-agent crashes -# Usage: make test-vm STRACE=1 (runs fc-agent under strace in VM) -STRACE ?= 0 -ifeq ($(STRACE),1) -FCVM_STRACE_AGENT := 1 +ifeq ($(LIST),1) +NEXTEST_CMD := list else -FCVM_STRACE_AGENT := +NEXTEST_CMD := run endif -# Test commands - organized by root requirement -# Uses cargo-nextest for better parallelism and output handling -# Host tests use CARGO_TARGET_DIR for sudo/non-sudo isolation -# Container tests don't need CARGO_TARGET_DIR - volume mounts provide isolation -# -# nextest benefits: -# - Each test runs in own process (better isolation) -# - Smart parallelism with test groups (see .config/nextest.toml) -# - No doctests by default (no --tests flag needed) -# - Better output: progress, timing, failures highlighted - -# No root required (uses TARGET_DIR): -TEST_UNIT := CARGO_TARGET_DIR=$(TARGET_DIR) cargo nextest run --release --lib -TEST_FUSE_NOROOT := CARGO_TARGET_DIR=$(TARGET_DIR) cargo nextest run --release -p fuse-pipe --test integration -TEST_FUSE_STRESS := CARGO_TARGET_DIR=$(TARGET_DIR) cargo nextest run --release -p fuse-pipe --test test_mount_stress - -# Root required (uses TARGET_DIR_ROOT): -TEST_FUSE_ROOT := CARGO_TARGET_DIR=$(TARGET_DIR_ROOT) cargo nextest run --release -p fuse-pipe --test integration_root -# Note: test_permission_edge_cases requires C pjdfstest with -u/-g flags, only available in container -# Matrix tests run categories in parallel via nextest process isolation -TEST_PJDFSTEST := CARGO_TARGET_DIR=$(TARGET_DIR_ROOT) cargo nextest run --release -p fuse-pipe --test pjdfstest_matrix - -# VM tests: privileged-tests feature gates tests that require sudo -# Unprivileged tests run by default (no feature flag) -# Use -p fcvm to only run fcvm package tests (excludes fuse-pipe) -# -# VM test command - runs all tests with privileged-tests feature -# Sets target runner to "sudo -E" so test binaries run with privileges -# (not set globally in .cargo/config.toml to avoid affecting non-root 
tests) -# Excludes rootless tests which have signal handling issues under sudo -TEST_VM := sh -c "CARGO_TARGET_DIR=$(TARGET_DIR) FCVM_STRACE_AGENT=$(FCVM_STRACE_AGENT) CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_RUNNER='sudo -E' CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUNNER='sudo -E' cargo nextest run -p fcvm --release $(NEXTEST_CAPTURE) --features privileged-tests -E '!test(/rootless/)' $(FILTER)" - -# Container test commands (no CARGO_TARGET_DIR - volume mounts provide isolation) -# No global target runner in .cargo/config.toml, so these run without sudo by default -CTEST_UNIT := cargo nextest run --release --lib -CTEST_FUSE_NOROOT := cargo nextest run --release -p fuse-pipe --test integration -CTEST_FUSE_STRESS := cargo nextest run --release -p fuse-pipe --test test_mount_stress -CTEST_FUSE_ROOT := cargo nextest run --release -p fuse-pipe --test integration_root -CTEST_FUSE_PERMISSION := cargo nextest run --release -p fuse-pipe --test test_permission_edge_cases -CTEST_PJDFSTEST := cargo nextest run --release -p fuse-pipe --test pjdfstest_matrix - -# Container VM tests now use `make test-vm-*` inside container (see container-test-vm-* targets) - -# Benchmark commands (fuse-pipe) -BENCH_THROUGHPUT := cargo bench -p fuse-pipe --bench throughput -BENCH_OPERATIONS := cargo bench -p fuse-pipe --bench operations -BENCH_PROTOCOL := cargo bench -p fuse-pipe --bench protocol - -# Benchmark commands (fcvm - requires VMs) -BENCH_EXEC := cargo bench --bench exec - -.PHONY: all help build build-root build-all clean \ - test test-noroot test-root test-unit test-fuse test-vm test-all \ - test-pjdfstest test-all-host test-all-container ci-local pre-push \ - bench bench-throughput bench-operations bench-protocol bench-exec bench-quick bench-logs bench-clean \ - lint clippy fmt fmt-check \ - container-build container-build-root container-build-rootless container-build-only container-build-allow-other \ - container-test container-test-unit container-test-noroot container-test-root 
container-test-fuse \ - container-test-vm container-test-pjdfstest container-test-all container-test-allow-other \ - ci-container-rootless ci-container-sudo \ - container-bench container-bench-throughput container-bench-operations container-bench-protocol container-bench-exec \ - container-shell container-clean \ - setup-btrfs setup-rootfs setup-all - -all: build - -help: - @echo "fcvm Build System" - @echo "" - @echo "Development:" - @echo " make build - Build fcvm and fc-agent" - @echo " make clean - Clean build artifacts" - @echo "" - @echo "Testing (with optional FILTER and STREAM):" - @echo " VM tests run with sudo (via CARGO_TARGET_*_RUNNER env vars)" - @echo " Use FILTER= to filter tests matching a pattern, STREAM=1 for live output." - @echo "" - @echo " make test-vm - All VM tests" - @echo " make test-vm FILTER=exec - Only *exec* tests" - @echo " make test-vm FILTER=sanity - Only *sanity* tests" - @echo "" - @echo " make test - All fuse-pipe tests" - @echo " make test-pjdfstest - POSIX compliance (8789 tests)" - @echo " make test-all - Everything" - @echo "" - @echo "Container Testing:" - @echo " make container-test-vm - All VM tests" - @echo " make container-test-vm FILTER=exec - Only *exec* tests" - @echo " make container-test - fuse-pipe tests" - @echo " make container-test-pjdfstest - POSIX compliance" - @echo " make container-test-all - Everything" - @echo " make container-shell - Interactive shell" - @echo "" - @echo "Linting:" - @echo " make lint - Run clippy + fmt-check" - @echo " make fmt - Format code" - @echo "" - @echo "Setup:" - @echo " make setup-btrfs - Create btrfs loopback (kernel/rootfs auto-created by fcvm)" - -#------------------------------------------------------------------------------ -# Setup targets (idempotent) -#------------------------------------------------------------------------------ - -# Create btrfs loopback filesystem if not mounted -# Kernel is auto-downloaded by fcvm binary from Kata release (see rootfs-plan.toml) 
-setup-btrfs: - @if ! mountpoint -q /mnt/fcvm-btrfs 2>/dev/null; then \ - echo '==> Creating btrfs loopback...'; \ - if [ ! -f /var/fcvm-btrfs.img ]; then \ - sudo truncate -s 20G /var/fcvm-btrfs.img && \ - sudo mkfs.btrfs /var/fcvm-btrfs.img; \ - fi && \ - sudo mkdir -p /mnt/fcvm-btrfs && \ - sudo mount -o loop /var/fcvm-btrfs.img /mnt/fcvm-btrfs && \ - sudo mkdir -p /mnt/fcvm-btrfs/{kernels,rootfs,initrd,state,snapshots,vm-disks,cache} && \ - sudo chown -R $$(id -un):$$(id -gn) /mnt/fcvm-btrfs && \ - echo '==> btrfs ready at /mnt/fcvm-btrfs'; \ - fi - -# Create base rootfs if missing (requires build + setup-btrfs) -# Rootfs and kernel are auto-created by fcvm binary on first VM start -setup-rootfs: build setup-btrfs - @echo '==> Rootfs and kernel will be auto-created on first VM start' - -# Full setup -setup-all: setup-btrfs setup-rootfs - @echo "==> Setup complete" - -#------------------------------------------------------------------------------ -# Build targets -#------------------------------------------------------------------------------ - -# Detect musl target for current architecture +# Architecture detection ARCH := $(shell uname -m) ifeq ($(ARCH),aarch64) MUSL_TARGET := aarch64-unknown-linux-musl -else ifeq ($(ARCH),x86_64) -MUSL_TARGET := x86_64-unknown-linux-musl else -MUSL_TARGET := unknown -endif - -# Build non-root targets (uses TARGET_DIR) -# Builds fcvm, fc-agent binaries AND test harnesses -# fc-agent is built with musl for static linking (portable across glibc versions) -build: - @echo "==> Building non-root targets..." - CARGO_TARGET_DIR=$(TARGET_DIR) cargo build --release -p fcvm - @echo "==> Building fc-agent with musl (statically linked)..." 
- CARGO_TARGET_DIR=$(TARGET_DIR) cargo build --release -p fc-agent --target $(MUSL_TARGET) - @mkdir -p $(TARGET_DIR)/release - cp $(TARGET_DIR)/$(MUSL_TARGET)/release/fc-agent $(TARGET_DIR)/release/fc-agent - CARGO_TARGET_DIR=$(TARGET_DIR) cargo test --release --all-targets --no-run - -# Build root targets (uses TARGET_DIR_ROOT, run with sudo) -# Builds fcvm, fc-agent binaries AND test harnesses -# fc-agent is built with musl for static linking (portable across glibc versions) -build-root: - @echo "==> Building root targets..." - sudo CARGO_TARGET_DIR=$(TARGET_DIR_ROOT) cargo build --release -p fcvm - @echo "==> Building fc-agent with musl (statically linked)..." - sudo CARGO_TARGET_DIR=$(TARGET_DIR_ROOT) cargo build --release -p fc-agent --target $(MUSL_TARGET) - sudo mkdir -p $(TARGET_DIR_ROOT)/release - sudo cp -f $(TARGET_DIR_ROOT)/$(MUSL_TARGET)/release/fc-agent $(TARGET_DIR_ROOT)/release/fc-agent - sudo CARGO_TARGET_DIR=$(TARGET_DIR_ROOT) cargo test --release --all-targets --no-run - -# Build everything (both target dirs) -build-all: build build-root - -clean: - # Use sudo to ensure we can remove any root-owned files - sudo rm -rf $(TARGET_DIR) $(TARGET_DIR_ROOT) - -#------------------------------------------------------------------------------ -# Testing (native) - organized by root requirement -#------------------------------------------------------------------------------ - -# Tests that don't require root (run first for faster feedback) -test-noroot: build - @echo "==> Running tests (no root required)..." - $(TEST_UNIT) - $(TEST_FUSE_NOROOT) - $(TEST_FUSE_STRESS) - -# Tests that require root -test-root: build-root - @echo "==> Running tests (root required)..." 
- sudo $(TEST_FUSE_ROOT) - -# All fuse-pipe tests: noroot first, then root -test: test-noroot test-root - -# Unit tests only -test-unit: build - $(TEST_UNIT) - -# All fuse-pipe tests (needs both builds) -test-fuse: build build-root - $(TEST_FUSE_NOROOT) - $(TEST_FUSE_STRESS) - sudo $(TEST_FUSE_ROOT) - -# VM tests - runs all tests with privileged-tests feature -# Test binaries run with sudo via CARGO_TARGET_*_RUNNER env vars -# Use FILTER= to run subset, e.g.: make test-vm FILTER=exec -test-vm: build setup-btrfs -ifeq ($(STREAM),1) - @echo "==> STREAM=1: Output streams live (parallel disabled)" -else - @echo "==> STREAM=0: Output captured until test completes (use STREAM=1 for live output)" +MUSL_TARGET := x86_64-unknown-linux-musl endif - $(TEST_VM) - -# POSIX compliance tests (host - requires pjdfstest installed) -test-pjdfstest: build-root - @echo "==> Running POSIX compliance tests (8789 tests)..." - sudo $(TEST_PJDFSTEST) - -# Run everything (use container-test-pjdfstest for POSIX compliance) -test-all: test test-vm test-pjdfstest - -#------------------------------------------------------------------------------ -# Benchmarks (native) -#------------------------------------------------------------------------------ - -bench: build - @echo "==> Running all benchmarks..." 
- sudo $(BENCH_THROUGHPUT) - sudo $(BENCH_OPERATIONS) - $(BENCH_PROTOCOL) -bench-throughput: build - sudo $(BENCH_THROUGHPUT) +# Base test command +NEXTEST := CARGO_TARGET_DIR=target cargo nextest $(NEXTEST_CMD) --release -bench-operations: build - sudo $(BENCH_OPERATIONS) +# Container run command +CONTAINER_RUN := podman run --rm --privileged --userns=keep-id --group-add keep-groups \ + -v .:/workspace/fcvm -v $(FUSE_BACKEND_RS):/workspace/fuse-backend-rs -v $(FUSER):/workspace/fuser \ + -v ./target:/workspace/fcvm/target -v ./cargo-home:/home/testuser/.cargo \ + -e CARGO_HOME=/home/testuser/.cargo --device /dev/fuse --device /dev/kvm \ + --ulimit nofile=65536:65536 --pids-limit=65536 -v /mnt/fcvm-btrfs:/mnt/fcvm-btrfs -bench-protocol: build - $(BENCH_PROTOCOL) +.PHONY: all help build clean test test-unit test-fast test-all test-root \ + _test-unit _test-fast _test-all _test-root \ + container-build container-test container-test-unit container-test-fast container-test-all \ + container-shell container-clean setup-btrfs setup-fcvm setup-pjdfstest bench lint fmt -bench-exec: build setup-btrfs - @echo "==> Running exec benchmarks (bridged vs rootless)..." - sudo $(BENCH_EXEC) - -bench-quick: build - @echo "==> Running quick benchmarks..." - sudo cargo bench -p fuse-pipe --bench throughput -- --quick - sudo cargo bench -p fuse-pipe --bench operations -- --quick - -bench-logs: - @echo "==> Recent benchmark logs..." - @ls -lt /tmp/fuse-bench-*.log 2>/dev/null | head -5 || echo 'No logs found' - @echo "" - @echo "==> Latest telemetry..." - @cat $$(ls -t /tmp/fuse-bench-telemetry-*.json 2>/dev/null | head -1) 2>/dev/null | jq . || echo 'No telemetry found' - -bench-clean: - @echo "==> Cleaning benchmark artifacts..." 
-	rm -rf target/criterion
-	rm -f /tmp/fuse-bench-*.log /tmp/fuse-bench-telemetry-*.json /tmp/fuse-stress*.sock /tmp/fuse-ops-bench-*.sock
-
-#------------------------------------------------------------------------------
-# Linting
-#------------------------------------------------------------------------------
-
-lint: clippy fmt-check
-
-clippy:
-	@echo "==> Running clippy..."
-	cargo clippy --all-targets --all-features -- -D warnings
+all: build
 
-fmt:
-	@echo "==> Formatting code..."
-	cargo fmt
+help:
+	@echo "fcvm: make build | test-unit | test-fast | test-all | test-root"
+	@echo "      make container-test-unit | container-test-fast | container-test-all"
+	@echo "Options: FILTER=pattern STREAM=1 LIST=1"
 
-fmt-check:
-	@echo "==> Checking format..."
-	cargo fmt -- --check
+build:
+	@echo "==> Building..."
+	CARGO_TARGET_DIR=target cargo build --release -p fcvm
+	CARGO_TARGET_DIR=target cargo build --release -p fc-agent --target $(MUSL_TARGET)
+	@mkdir -p target/release && cp target/$(MUSL_TARGET)/release/fc-agent target/release/fc-agent
+clean:
+	sudo rm -rf target cargo-home
 
-#------------------------------------------------------------------------------
-# Container testing
-#------------------------------------------------------------------------------
+# Run-only targets (no setup deps, used by container)
+_test-unit:
+	$(NEXTEST) --no-default-features
 
-# Container tag - podman layer caching handles incremental builds
-CONTAINER_TAG := fcvm-test:latest
+_test-fast:
+	$(NEXTEST) $(NEXTEST_CAPTURE) --no-default-features --features integration-fast $(FILTER)
 
-# CI mode: use host directories instead of named volumes (for artifact sharing)
-# Set CI=1 to enable artifact-compatible mode
-# Note: Container tests use separate volumes for root vs non-root to avoid permission conflicts
-CI ?= 0
-ifeq ($(CI),1)
-VOLUME_TARGET := -v ./target:/workspace/fcvm/target
-VOLUME_TARGET_ROOT := -v ./target-root:/workspace/fcvm/target
-VOLUME_CARGO := -v ./cargo-home:/home/testuser/.cargo
-else
-VOLUME_TARGET := -v fcvm-cargo-target:/workspace/fcvm/target
-VOLUME_TARGET_ROOT := -v fcvm-cargo-target-root:/workspace/fcvm/target
-VOLUME_CARGO := -v fcvm-cargo-home:/home/testuser/.cargo
-endif
+_test-all:
+	$(NEXTEST) $(NEXTEST_CAPTURE) $(FILTER)
 
-# Container run with source mounts (code always fresh, can't run stale)
-# Cargo cache goes to testuser's home so non-root builds work
-# Note: We have separate bases for root vs non-root to use different target volumes
-# Uses rootless podman - no sudo needed. --privileged grants capabilities within
-# user namespace which is sufficient for fuse tests and VM tests.
-CONTAINER_RUN_BASE := podman run --rm --privileged \
-	--group-add keep-groups \
-	-v .:/workspace/fcvm \
-	-v $(FUSE_BACKEND_RS):/workspace/fuse-backend-rs \
-	-v $(FUSER):/workspace/fuser \
-	$(VOLUME_TARGET) \
-	$(VOLUME_CARGO) \
-	-e CARGO_HOME=/home/testuser/.cargo
+_test-root:
+	CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_RUNNER='sudo -E' \
+	CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUNNER='sudo -E' \
+	$(NEXTEST) $(NEXTEST_CAPTURE) --features privileged-tests $(FILTER)
 
-# Same as CONTAINER_RUN_BASE but uses sudo podman for root tests
-# Must use sudo because container-build-root builds with sudo podman,
-# and sudo/rootless podman have separate image stores
-CONTAINER_RUN_BASE_ROOT := sudo podman run --rm --privileged \
-	--group-add keep-groups \
-	-v .:/workspace/fcvm \
-	-v $(FUSE_BACKEND_RS):/workspace/fuse-backend-rs \
-	-v $(FUSER):/workspace/fuser \
-	$(VOLUME_TARGET_ROOT) \
-	$(VOLUME_CARGO) \
-	-e CARGO_HOME=/home/testuser/.cargo
+# Host targets (with setup)
+test-unit: build _test-unit
+test-fast: setup-fcvm _test-fast
+test-all: setup-fcvm _test-all
+test-root: setup-fcvm setup-pjdfstest _test-root
+test: test-root
 
-# Container run options for fuse-pipe tests (non-root)
-CONTAINER_RUN_FUSE := $(CONTAINER_RUN_BASE) \
-	--device /dev/fuse \
-	--ulimit nofile=65536:65536 \
-	--ulimit nproc=65536:65536 \
-	--pids-limit=-1
+# Container targets (setup on host where needed, run-only in container)
+container-test-unit: container-build
+	@echo "==> Running unit tests in container..."
+	$(CONTAINER_RUN) $(CONTAINER_TAG) make build _test-unit
 
-# Container run options for fuse-pipe tests (root)
-# Note: --device-cgroup-rule not supported in rootless mode
-# Uses --user root to override Containerfile's USER testuser
-CONTAINER_RUN_FUSE_ROOT := $(CONTAINER_RUN_BASE_ROOT) \
-	--user root \
-	--device /dev/fuse \
-	--ulimit nofile=65536:65536 \
-	--ulimit nproc=65536:65536 \
-	--pids-limit=-1
+container-test-fast: setup-fcvm container-build
+	@echo "==> Running fast tests in container..."
+	$(CONTAINER_RUN) $(CONTAINER_TAG) make _test-fast
 
-# Container run options for fcvm tests (adds KVM, btrfs, netns)
-# Used for bridged mode tests that require root/iptables
-# REQUIRES sudo - network namespace creation needs real root, not user namespace root
-# Uses VOLUME_TARGET_ROOT for isolation from rootless podman builds
-# Note: /run/systemd/resolve mount provides real DNS servers when host uses systemd-resolved
-CONTAINER_RUN_FCVM := sudo podman run --rm --privileged \
-	--group-add keep-groups \
-	-v .:/workspace/fcvm \
-	-v $(FUSE_BACKEND_RS):/workspace/fuse-backend-rs \
-	-v $(FUSER):/workspace/fuser \
-	$(VOLUME_TARGET_ROOT) \
-	$(VOLUME_CARGO) \
-	-e CARGO_HOME=/home/testuser/.cargo \
-	--device /dev/kvm \
-	--device /dev/fuse \
-	--ulimit nofile=65536:65536 \
-	--ulimit nproc=65536:65536 \
-	--pids-limit=-1 \
-	-v /mnt/fcvm-btrfs:/mnt/fcvm-btrfs \
-	-v /var/run/netns:/var/run/netns:rshared \
-	-v /run/systemd/resolve:/run/systemd/resolve:ro \
-	--network host
+container-test-all: setup-fcvm container-build
+	@echo "==> Running all tests in container..."
+	$(CONTAINER_RUN) $(CONTAINER_TAG) make _test-all
 
-# Container run for rootless networking tests
-# Uses rootless podman (no sudo!) with --privileged for user namespace capabilities.
-# --privileged with rootless podman grants capabilities within the user namespace,
-# not actual host root. We're root inside the container but unprivileged on host.
-# --group-add keep-groups preserves host user's groups (kvm) for /dev/kvm access.
-# --device /dev/userfaultfd needed for snapshot/clone UFFD memory sharing.
-# The container's user namespace is the isolation boundary.
-ifeq ($(CI),1)
-VOLUME_TARGET_ROOTLESS := -v ./target:/workspace/fcvm/target
-VOLUME_CARGO_ROOTLESS := -v ./cargo-home:/home/testuser/.cargo
-else
-VOLUME_TARGET_ROOTLESS := -v fcvm-cargo-target-rootless:/workspace/fcvm/target
-VOLUME_CARGO_ROOTLESS := -v fcvm-cargo-home-rootless:/home/testuser/.cargo
-endif
-CONTAINER_RUN_ROOTLESS := podman --root=/tmp/podman-rootless run --rm \
-	--privileged \
-	--group-add keep-groups \
-	-v .:/workspace/fcvm \
-	-v $(FUSE_BACKEND_RS):/workspace/fuse-backend-rs \
-	-v $(FUSER):/workspace/fuser \
-	$(VOLUME_TARGET_ROOTLESS) \
-	$(VOLUME_CARGO_ROOTLESS) \
-	-e CARGO_HOME=/home/testuser/.cargo \
-	--device /dev/kvm \
-	--device /dev/net/tun \
-	--device /dev/userfaultfd \
-	-v /mnt/fcvm-btrfs:/mnt/fcvm-btrfs \
-	--network host
+container-test: container-test-all
 
-# Build containers - podman layer caching handles incremental builds
-# CONTAINER_ARCH can be overridden: export CONTAINER_ARCH=x86_64 for CI
 container-build:
-	@echo "==> Building rootless container (ARCH=$(CONTAINER_ARCH))..."
 	podman build -t $(CONTAINER_TAG) -f Containerfile --build-arg ARCH=$(CONTAINER_ARCH) .
 
-container-build-root:
-	@echo "==> Building root container (ARCH=$(CONTAINER_ARCH))..."
-	sudo podman build -t $(CONTAINER_TAG) -f Containerfile --build-arg ARCH=$(CONTAINER_ARCH) .
-
-container-build-rootless: container-build
-
-# Container tests - organized by root requirement
-# Non-root tests run with --user testuser to verify they don't need root
-# fcvm unit tests with network ops skip themselves when not root
-# Uses CTEST_* commands (no CARGO_TARGET_DIR - volume mounts provide isolation)
-container-test-unit: container-build
-	@echo "==> Running unit tests as non-root user..."
-	$(CONTAINER_RUN_FUSE) --user testuser $(CONTAINER_TAG) $(CTEST_UNIT)
-
-container-test-noroot: container-build
-	@echo "==> Running tests as non-root user..."
-	$(CONTAINER_RUN_FUSE) --user testuser $(CONTAINER_TAG) $(CTEST_UNIT)
-	$(CONTAINER_RUN_FUSE) --user testuser $(CONTAINER_TAG) $(CTEST_FUSE_NOROOT)
-	$(CONTAINER_RUN_FUSE) --user testuser $(CONTAINER_TAG) $(CTEST_FUSE_STRESS)
-
-# Root tests run as root inside container (uses separate volume)
-container-test-root: container-build-root
-	@echo "==> Running tests as root..."
-	$(CONTAINER_RUN_FUSE_ROOT) $(CONTAINER_TAG) $(CTEST_FUSE_ROOT)
-	$(CONTAINER_RUN_FUSE_ROOT) $(CONTAINER_TAG) $(CTEST_FUSE_PERMISSION)
-
-# All fuse-pipe tests (explicit) - matches native test-fuse
-# Note: Uses both volumes since it mixes root and non-root tests
-container-test-fuse: container-build container-build-root
-	@echo "==> Running all fuse-pipe tests..."
-	$(CONTAINER_RUN_FUSE) --user testuser $(CONTAINER_TAG) $(CTEST_FUSE_NOROOT)
-	$(CONTAINER_RUN_FUSE) --user testuser $(CONTAINER_TAG) $(CTEST_FUSE_STRESS)
-	$(CONTAINER_RUN_FUSE_ROOT) $(CONTAINER_TAG) $(CTEST_FUSE_ROOT)
-	$(CONTAINER_RUN_FUSE_ROOT) $(CONTAINER_TAG) $(CTEST_FUSE_PERMISSION)
-
-# Test AllowOther with user_allow_other configured (non-root with config)
-# Uses separate image with user_allow_other pre-configured
-CONTAINER_IMAGE_ALLOW_OTHER := fcvm-test-allow-other
-
-container-build-allow-other: container-build
-	@echo "==> Building allow-other container..."
-	podman build -t $(CONTAINER_IMAGE_ALLOW_OTHER) -f Containerfile.allow-other .
-
-container-test-allow-other: container-build-allow-other
-	@echo "==> Testing AllowOther with user_allow_other in fuse.conf..."
-	$(CONTAINER_RUN_FUSE) --user testuser $(CONTAINER_IMAGE_ALLOW_OTHER) cargo test --release -p fuse-pipe --test test_allow_other -- --nocapture
-
-# All fuse-pipe tests: noroot first, then root
-container-test: container-test-noroot container-test-root
-
-# VM tests in container
-# Uses privileged container, test binaries run with sudo via CARGO_TARGET_*_RUNNER
-# Use FILTER= to run subset, e.g.: make container-test-vm FILTER=exec
-container-test-vm: container-build-root setup-btrfs
-	$(CONTAINER_RUN_FCVM) $(CONTAINER_TAG) make test-vm TARGET_DIR=target FILTER=$(FILTER) STREAM=$(STREAM) STRACE=$(STRACE)
-
-container-test-pjdfstest: container-build-root
-	$(CONTAINER_RUN_FUSE_ROOT) $(CONTAINER_TAG) $(CTEST_PJDFSTEST)
-
-# Run everything in container
-container-test-all: container-test container-test-vm container-test-pjdfstest
-
-#------------------------------------------------------------------------------
-# CI Targets (one command per job)
-#------------------------------------------------------------------------------
-
-# CI Job 1: Lint + rootless FUSE tests
-ci-container-rootless: container-build
-	$(MAKE) lint
-	$(CONTAINER_RUN_FUSE) --user testuser $(CONTAINER_TAG) \
-		cargo nextest run --release --lib -p fuse-pipe --test integration --test test_mount_stress --test test_unmount_race
-
-# CI Job 2: Root FUSE tests + POSIX compliance
-ci-container-sudo: container-build-root
-	$(CONTAINER_RUN_FUSE_ROOT) $(CONTAINER_TAG) \
-		cargo nextest run --release -p fuse-pipe --test integration_root --test test_permission_edge_cases --test pjdfstest_matrix
-
-# CI Job 3: VM tests (container-test-vm already exists above)
-
-# Container benchmarks - uses same commands as native benchmarks
-container-bench: container-build
-	@echo "==> Running all fuse-pipe benchmarks..."
-	$(CONTAINER_RUN_FUSE) $(CONTAINER_TAG) $(BENCH_THROUGHPUT)
-	$(CONTAINER_RUN_FUSE) $(CONTAINER_TAG) $(BENCH_OPERATIONS)
-	$(CONTAINER_RUN_FUSE) $(CONTAINER_TAG) $(BENCH_PROTOCOL)
-
-container-bench-throughput: container-build
-	$(CONTAINER_RUN_FUSE) $(CONTAINER_TAG) $(BENCH_THROUGHPUT)
-
-container-bench-operations: container-build
-	$(CONTAINER_RUN_FUSE) $(CONTAINER_TAG) $(BENCH_OPERATIONS)
-
-container-bench-protocol: container-build
-	$(CONTAINER_RUN_FUSE) $(CONTAINER_TAG) $(BENCH_PROTOCOL)
-
-# fcvm exec benchmarks - requires VMs (uses CONTAINER_RUN_FCVM)
-container-bench-exec: container-build setup-btrfs
-	@echo "==> Running exec benchmarks (bridged vs rootless)..."
-	$(CONTAINER_RUN_FCVM) $(CONTAINER_TAG) $(BENCH_EXEC)
-
 container-shell: container-build
-	$(CONTAINER_RUN_FUSE) -it $(CONTAINER_TAG) bash
+	$(CONTAINER_RUN) -it $(CONTAINER_TAG) bash
 
-# Force container rebuild (removes images and volumes)
 container-clean:
 	podman rmi $(CONTAINER_TAG) 2>/dev/null || true
-	sudo podman rmi $(CONTAINER_TAG) 2>/dev/null || true
-	podman volume rm fcvm-cargo-target fcvm-cargo-target-root fcvm-cargo-home 2>/dev/null || true
 
-#------------------------------------------------------------------------------
-# CI Simulation (local)
-#------------------------------------------------------------------------------
+# Setup targets
+setup-pjdfstest:
+	@if [ ! -x /tmp/pjdfstest-check/pjdfstest ]; then \
+		echo '==> Building pjdfstest...'; \
+		rm -rf /tmp/pjdfstest-check && \
+		git clone --depth 1 https://github.com/pjd/pjdfstest /tmp/pjdfstest-check && \
+		cd /tmp/pjdfstest-check && autoreconf -ifs && ./configure && make; \
+	fi
 
-# Run full CI locally with max parallelism
-# Phase 1: Build all 5 target directories in parallel (host x2, container x3)
-# Phase 2: Run all tests in parallel (they use pre-built binaries)
-ci-local:
-	@echo "==> Phase 1: Building all targets in parallel..."
-	$(MAKE) -j build build-root container-build container-build-root container-build-rootless
-	@echo "==> Phase 2: Running all tests in parallel..."
-	$(MAKE) -j \
-		lint \
-		test-unit \
-		test-fuse \
-		test-pjdfstest \
-		test-vm \
-		container-test-noroot \
-		container-test-root \
-		container-test-pjdfstest \
-		container-test-vm
-	@echo "==> CI local complete"
+setup-btrfs:
+	@if ! mountpoint -q /mnt/fcvm-btrfs 2>/dev/null; then \
+		echo '==> Creating btrfs loopback...'; \
+		if [ ! -f /var/fcvm-btrfs.img ]; then \
+			sudo truncate -s 20G /var/fcvm-btrfs.img && sudo mkfs.btrfs /var/fcvm-btrfs.img; \
+		fi && \
+		sudo mkdir -p /mnt/fcvm-btrfs && \
+		sudo mount -o loop /var/fcvm-btrfs.img /mnt/fcvm-btrfs && \
+		sudo mkdir -p /mnt/fcvm-btrfs/{kernels,rootfs,initrd,state,snapshots,vm-disks,cache} && \
+		sudo chown -R $$(id -un):$$(id -gn) /mnt/fcvm-btrfs && \
+		echo '==> btrfs ready at /mnt/fcvm-btrfs'; \
+	fi
 
-# Quick pre-push check (just lint + unit, parallel)
-pre-push: build
-	$(MAKE) -j lint test-unit
-	@echo "==> Ready to push"
+setup-fcvm: build setup-btrfs
+	@FREE_GB=$$(df -BG /mnt/fcvm-btrfs 2>/dev/null | awk 'NR==2 {gsub("G",""); print $$4}'); \
+	if [ -n "$$FREE_GB" ] && [ "$$FREE_GB" -lt 15 ]; then \
+		echo "ERROR: Need 15GB on /mnt/fcvm-btrfs (have $${FREE_GB}GB)"; \
+		exit 1; \
+	fi
+	@echo "==> Running fcvm setup..."
+	./target/release/fcvm setup
 
-# Host-only tests (parallel, builds both target dirs first)
-# test-vm runs all VM tests (privileged + unprivileged)
-test-all-host:
-	$(MAKE) -j build build-root
-	$(MAKE) -j lint test-unit test-fuse test-pjdfstest test-vm
+bench: build
+	@echo "==> Running benchmarks..."
+	sudo cargo bench -p fuse-pipe --bench throughput
+	sudo cargo bench -p fuse-pipe --bench operations
+	cargo bench -p fuse-pipe --bench protocol
 
-# Container-only tests (parallel, builds all 3 container target dirs first)
-test-all-container:
-	$(MAKE) -j container-build container-build-root container-build-rootless
-	$(MAKE) -j container-test-noroot container-test-root container-test-pjdfstest container-test-vm
+lint:
+	cargo test --test lint
+
+fmt:
+	cargo fmt
diff --git a/README.md b/README.md
index 8054ba00..596e6fcb 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@ A Rust implementation that launches Firecracker microVMs to run Podman container
 **Runtime Dependencies**
 - Rust 1.83+ with cargo (nightly for fuser crate)
 - Firecracker binary in PATH
-- For bridged networking: sudo, iptables, iproute2, dnsmasq
+- For bridged networking: sudo, iptables, iproute2
 - For rootless networking: slirp4netns
 - For building rootfs: qemu-utils, e2fsprogs
 
@@ -37,9 +37,9 @@ A Rust implementation that launches Firecracker microVMs to run Podman container
 **Container Testing (Recommended)** - All dependencies bundled:
 ```bash
 # Just needs podman and /dev/kvm
-make container-test                    # fuse-pipe tests
-make container-test-vm                 # VM tests (rootless + bridged)
-make container-test-all                # Everything
+make container-test-unit               # Unit tests (no VMs)
+make container-test-integration-fast   # Quick VM tests (<30s each)
+make container-test-root               # All tests including pjdfstest
 ```
 
 **Native Testing** - Additional dependencies required:
@@ -50,7 +50,7 @@ make container-test-all                # Everything
 | pjdfstest build | autoconf, automake, libtool |
 | pjdfstest runtime | perl |
 | bindgen (userfaultfd-sys) | libclang-dev, clang |
-| VM tests | iproute2, iptables, slirp4netns, dnsmasq |
+| VM tests | iproute2, iptables, slirp4netns |
 | Rootfs build | qemu-utils, e2fsprogs |
 | User namespaces | uidmap (for newuidmap/newgidmap) |
 
@@ -66,7 +66,7 @@ sudo apt-get update && sudo apt-get install -y \
 	fuse3 libfuse3-dev \
 	autoconf automake libtool perl \
 	libclang-dev clang \
-	iproute2 iptables slirp4netns dnsmasq \
+	iproute2 iptables slirp4netns \
 	qemu-utils e2fsprogs \
 	uidmap
 ```
@@ -81,6 +81,13 @@ sudo apt-get update && sudo apt-get install -y \
 cargo build --release --workspace
 ```
 
+### Setup (First Time)
+```bash
+# Create btrfs filesystem and download kernel + rootfs (takes 5-10 minutes)
+make setup-btrfs
+fcvm setup
+```
+
 ### Run a Container
 ```bash
 # Run nginx in a Firecracker VM (using AWS ECR public registry to avoid Docker Hub rate limits)
@@ -262,311 +269,109 @@ sudo fcvm podman run --name full \
 
 ```
 fcvm/
-├── src/                 # Host CLI
-│   ├── main.rs          # Entry point
-│   ├── cli/             # Command-line parsing
-│   ├── commands/        # Command implementations (podman, snapshot, ls)
-│   ├── firecracker/     # Firecracker API client
-│   ├── network/         # Networking (bridged, slirp)
-│   ├── storage/         # Disk/snapshot management
-│   ├── state/           # VM state persistence
-│   ├── health.rs        # Health monitoring
-│   ├── uffd/            # UFFD memory sharing
-│   └── volume/          # Volume/FUSE mount handling
-│
-├── fc-agent/            # Guest agent
-│   └── src/main.rs      # Container orchestration inside VM
-│
-├── fuse-pipe/           # FUSE passthrough library
-│   ├── src/             # Client/server for host directory sharing
-│   ├── tests/           # Integration tests
-│   └── benches/         # Performance benchmarks
-│
-└── tests/               # Integration tests
-    ├── common/mod.rs    # Shared test utilities
-    ├── test_sanity.rs   # Basic VM lifecycle
-    ├── test_state_manager.rs
-    ├── test_health_monitor.rs
-    ├── test_fuse_posix.rs
-    ├── test_fuse_in_vm.rs
-    ├── test_localhost_image.rs
-    └── test_snapshot_clone.rs
+├── src/                 # Host CLI (fcvm binary)
+├── fc-agent/            # Guest agent (runs inside VM)
+├── fuse-pipe/           # FUSE passthrough library
+└── tests/               # Integration tests (16 files)
 ```
+See [DESIGN.md](DESIGN.md#directory-structure) for detailed structure.
+
 
 ---
 
 ## CLI Reference
 
-### Global Options
-
-| Option | Description |
-|--------|-------------|
-| `--base-dir ` | Base directory for all fcvm data (default: `/mnt/fcvm-btrfs` or `FCVM_BASE_DIR` env) |
-| `--sub-process` | Running as subprocess (disables timestamp/level in logs) |
+Run `fcvm --help` or `fcvm <command> --help` for full options.
 
 ### Commands
 
-#### `fcvm ls`
-List running VMs.
-
-| Option | Description |
-|--------|-------------|
-| `--json` | Output in JSON format |
-| `--pid ` | Filter by fcvm process PID |
-
-#### `fcvm snapshots`
-List available snapshots.
-
-#### `fcvm podman run`
-Run a container in a Firecracker VM.
-
-| Option | Default | Description |
-|--------|---------|-------------|
-| `` | (required) | Container image (e.g., `nginx:alpine` or `localhost/myimage`) |
-| `--name ` | (required) | VM name |
-| `--cpu ` | 2 | Number of vCPUs |
-| `--mem ` | 2048 | Memory in MiB |
-| `--map ` | | Volume mapping(s), comma-separated. Append `:ro` for read-only |
-| `--env ` | | Environment variables, comma-separated or repeated |
-| `--cmd ` | | Command to run inside container |
-| `--publish <[IP:]HPORT:GPORT[/PROTO]>` | | Port forwarding, comma-separated |
-| `--network ` | bridged | Network mode: `bridged` or `rootless` |
-| `--health-check ` | | HTTP health check URL. If not specified, uses container ready signal via vsock |
-| `--balloon ` | (none) | Balloon device target MiB. If not specified, no balloon device is configured |
-| `--privileged` | false | Run container in privileged mode (allows mknod, device access) |
-
-#### `fcvm snapshot create`
-Create a snapshot from a running VM.
-
-| Option | Description |
-|--------|-------------|
-| `` | VM name to snapshot (mutually exclusive with `--pid`) |
-| `--pid ` | VM PID to snapshot (mutually exclusive with name) |
-| `--tag ` | Custom snapshot name (defaults to VM name) |
-
-#### `fcvm snapshot serve `
-Start UFFD memory server to serve pages on-demand for cloning.
-
-#### `fcvm snapshot run`
-Run a clone from a snapshot.
-
-| Option | Default | Description |
-|--------|---------|-------------|
-| `--pid ` | (required) | Serve process PID to clone from |
-| `--name ` | (auto) | Custom name for cloned VM |
-| `--publish <[IP:]HPORT:GPORT[/PROTO]>` | | Port forwarding |
-| `--network ` | bridged | Network mode: `bridged` or `rootless` |
-| `--exec ` | | Execute command in container after clone starts, then cleanup |
-
-#### `fcvm snapshot ls`
-List running snapshot servers.
-
-#### `fcvm exec`
-Execute a command in a running VM or container. Mirrors `podman exec` behavior.
-
-| Option | Description |
-|--------|-------------|
-| `` | VM name (mutually exclusive with `--pid`) |
-| `--pid ` | VM PID (mutually exclusive with name) |
-| `--vm` | Execute in the VM instead of inside the container |
-| `-i, --interactive` | Keep STDIN open |
-| `-t, --tty` | Allocate pseudo-TTY |
-| `-- ...` | Command and arguments to execute |
-
-**Auto-detection**: When running a shell (bash, sh, zsh, etc.) with a TTY stdin, `-it` is enabled automatically.
-
-**Examples:**
-```bash
-# Execute inside container (default, sudo needed to read VM state)
-sudo fcvm exec my-vm -- cat /etc/os-release
-sudo fcvm exec --pid 12345 -- wget -q -O - ifconfig.me
+| Command | Description |
+|---------|-------------|
+| `fcvm setup` | Download kernel (~15MB) and create rootfs (~10GB). Takes 5-10 min first run |
+| `fcvm podman run` | Run container in Firecracker VM |
+| `fcvm exec` | Execute command in running VM/container |
+| `fcvm ls` | List running VMs (`--json` for JSON output) |
+| `fcvm snapshot create` | Create snapshot from running VM |
+| `fcvm snapshot serve` | Start UFFD memory server for cloning |
+| `fcvm snapshot run` | Spawn clone from memory server |
+| `fcvm snapshots` | List available snapshots |
 
-# Execute in VM (guest OS)
-sudo fcvm exec my-vm --vm -- hostname
-sudo fcvm exec --pid 12345 --vm -- curl -s ifconfig.me
+See [DESIGN.md](DESIGN.md#commands) for full option reference.
 
-# Interactive shell (auto-detects -it when stdin is a TTY)
-sudo fcvm exec my-vm -- bash
-sudo fcvm exec my-vm --vm -- bash
+### Key Options
 
-# Explicit TTY flags (like podman exec -it)
-sudo fcvm exec my-vm -it -- sh
-sudo fcvm exec my-vm --vm -it -- bash
+**`fcvm podman run`** - Essential options:
+```
+--name          VM name (required)
+--network       bridged (default, needs sudo) or rootless
+--publish       Port forward host:guest (e.g., 8080:80)
+--map           Volume mount host:guest (optional :ro for read-only)
+--env           Environment variable
+--setup         Auto-setup if kernel/rootfs missing (rootless only)
+```
+
+**`fcvm exec`** - Execute in VM/container:
+```bash
+sudo fcvm exec my-vm -- cat /etc/os-release       # In container
+sudo fcvm exec my-vm --vm -- curl -s ifconfig.me  # In guest OS
+sudo fcvm exec my-vm -- bash                      # Interactive shell
 ```
 
 ---
 
 ## Network Modes
 
-| Mode | Flag | Root Required | Performance |
-|------|------|---------------|-------------|
-| Bridged | `--network bridged` | Yes | Better |
-| Rootless | `--network rootless` | No | Good |
-
-**Bridged**: Uses iptables NAT, requires sudo. Port forwarding via DNAT rules.
+| Mode | Flag | Root | Notes |
+|------|------|------|-------|
+| Bridged | `--network bridged` | Yes | iptables NAT, better performance |
+| Rootless | `--network rootless` | No | slirp4netns, works without root |
 
-**Rootless**: Uses slirp4netns in user namespace. Port forwarding via slirp4netns API.
+See [DESIGN.md](DESIGN.md#networking) for architecture details.
 
 ---
 
 ## Container Behavior
 
-### Exit Code Forwarding
-
-When a container exits, fcvm forwards its exit code:
+- **Exit codes**: Container exit code forwarded to host via vsock
+- **Logs**: Container stdout/stderr prefixed with `[ctr:out]`/`[ctr:err]`
+- **Health**: Default uses vsock ready signal; optional `--health-check` for HTTP
 
-```bash
-# Container exits with code 0 → fcvm returns 0
-sudo fcvm podman run --name test --cmd "exit 0" public.ecr.aws/nginx/nginx:alpine
-echo $?  # 0
-
-# Container exits with code 42 → fcvm returns error
-sudo fcvm podman run --name test --cmd "exit 42" public.ecr.aws/nginx/nginx:alpine
-# ERROR fcvm: Error: container exited with code 42
-echo $?  # 1
-```
-
-Exit codes are communicated from fc-agent (inside VM) to fcvm (host) via vsock status channel (port 4999).
-
-### Container Logs
-
-Container stdout/stderr flows through the serial console:
-1. Container writes to stdout/stderr
-2. fc-agent prefixes with `[ctr:out]` or `[ctr:err]` and writes to serial console
-3. Firecracker sends serial output to fcvm
-4. fcvm logs via tracing (visible on stderr)
-
-Example output:
-```
-INFO firecracker: fc-agent[292]: [ctr:out] hello world
-INFO firecracker: fc-agent[292]: [ctr:err] error message
-```
-
-### Health Checks
-
-**Default behavior**: fcvm waits for fc-agent to signal container readiness via vsock. No HTTP polling needed.
-
-**Custom HTTP health check**: Use `--health-check` for HTTP-based health monitoring:
-```bash
-sudo fcvm podman run --name web --health-check http://localhost:80/health nginx:alpine
-```
-
-With custom health checks, fcvm polls the URL until it returns 2xx status.
+See [DESIGN.md](DESIGN.md#guest-agent) for details.
 
 ---
 
 ## Environment Variables
 
-| Variable | Description | Default |
-|----------|-------------|---------|
-| `FCVM_BASE_DIR` | Base directory for all fcvm data | `/mnt/fcvm-btrfs` |
-| `RUST_LOG` | Logging level and filters | `info` |
-
-### Examples
-
-```bash
-# Use different base directory
-FCVM_BASE_DIR=/data/fcvm sudo fcvm podman run ...
-
-# Increase logging verbosity
-RUST_LOG=debug sudo fcvm podman run ...
-
-# Debug specific component
-RUST_LOG=firecracker=debug,health-monitor=debug sudo fcvm podman run ...
-
-# Silence all logs
-RUST_LOG=off sudo fcvm podman run ... 2>/dev/null
-```
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `FCVM_BASE_DIR` | `/mnt/fcvm-btrfs` | Base directory for all data |
+| `RUST_LOG` | `info` | Logging level (e.g., `debug`, `firecracker=debug`) |
 
 ---
 
 ## Testing
 
-### Makefile Targets
-
-Run `make help` for the full list. Key targets:
-
-#### Development
-| Target | Description |
-|--------|-------------|
-| `make build` | Build fcvm and fc-agent |
-| `make clean` | Clean build artifacts |
-
-#### Testing (with optional FILTER and STREAM)
-
-VM tests run with sudo via `CARGO_TARGET_*_RUNNER` env vars (set in Makefile).
-Use `FILTER=` to filter tests by name, `STREAM=1` for live output.
-
-| Target | Description |
-|--------|-------------|
-| `make test-vm` | All VM tests (runs with sudo via target runner) |
-| `make test-vm FILTER=sanity` | Only sanity tests |
-| `make test-vm FILTER=exec` | Only exec tests |
-| `make test-vm STREAM=1` | All tests with live output |
-| `make container-test-vm` | VM tests in container |
-| `make container-test-vm FILTER=exec` | Only exec tests in container |
-| `make test-all` | Everything |
-
-#### Linting
-| Target | Description |
-|--------|-------------|
-| `make lint` | Run clippy + fmt-check |
-| `make clippy` | Run cargo clippy |
-| `make fmt` | Format code |
-| `make fmt-check` | Check formatting |
-
-#### Benchmarks
-| Target | Description |
-|--------|-------------|
-| `make bench` | All benchmarks (throughput + operations + protocol) |
-| `make bench-throughput` | I/O throughput benchmarks |
-| `make bench-operations` | FUSE operation latency benchmarks |
-| `make bench-protocol` | Wire protocol benchmarks |
-| `make bench-quick` | Quick benchmarks (faster iteration) |
-| `make bench-logs` | View recent benchmark logs/telemetry |
-| `make bench-clean` | Clean benchmark artifacts |
-
-### Test Files
-
-#### fcvm Integration Tests (`tests/`)
-| File | Description |
-|------|-------------|
-| `test_sanity.rs` | Basic VM startup and health check (rootless + bridged) |
-| `test_state_manager.rs` | State management unit tests |
-| `test_health_monitor.rs` | Health monitoring tests |
-| `test_fuse_posix.rs` | POSIX FUSE compliance tests |
-| `test_fuse_in_vm.rs` | FUSE-in-VM integration |
-| `test_localhost_image.rs` | Local image tests |
-| `test_snapshot_clone.rs` | Snapshot/clone workflow, clone port forwarding |
-| `test_port_forward.rs` | Port forwarding for regular VMs |
-
-#### fuse-pipe Tests (`fuse-pipe/tests/`)
-| File | Description |
-|------|-------------|
-| `integration.rs` | Basic FUSE operations (no root) |
-| `integration_root.rs` | FUSE operations requiring root |
-| `test_permission_edge_cases.rs` | Permission edge cases, setuid/setgid |
-| `test_mount_stress.rs` | Mount/unmount stress tests |
-| `test_allow_other.rs` | AllowOther flag tests |
-| `test_unmount_race.rs` | Unmount race condition tests |
-| `pjdfstest_matrix.rs` | POSIX compliance (17 categories run in parallel via nextest) |
-
-### Running Tests
-
 ```bash
-# Container testing (recommended)
-make container-test              # All fuse-pipe tests
-make container-test-vm           # VM tests
-
-# Native testing
-make test                        # fuse-pipe tests
-make test-vm                     # VM tests
-
-# Direct cargo commands (for debugging)
-cargo test --release -p fuse-pipe --test integration -- --nocapture
-sudo cargo test --release --test test_sanity -- --nocapture
+# Quick start
+make build                  # Build fcvm + fc-agent
+make test-root              # Run all tests (requires sudo + KVM)
+
+# Test tiers
+make test-unit              # Unit tests only (no VMs)
+make test-integration-fast  # Quick VM tests (<30s each)
+make test-root              # All tests including pjdfstest
+
+# Container testing (recommended - all deps bundled)
+make container-test-root    # All tests in container
+
+# Options
+make test-root FILTER=exec  # Filter by name
+make test-root STREAM=1     # Live output
+make test-root LIST=1       # List without running
 ```
+
+See [DESIGN.md](DESIGN.md#test-infrastructure) for test architecture and file listing.
+
 
 ### Debugging Tests
 
 Enable tracing:
@@ -595,50 +400,12 @@ sudo fusermount3 -u /tmp/fuse-*-mount*
 
 ## Data Layout
 
-```
-/mnt/fcvm-btrfs/
-├── kernels/
-│   ├── vmlinux.bin          # Symlink to active kernel
-│   └── vmlinux-{sha}.bin    # Kernel (SHA of URL for cache key)
-├── rootfs/
-│   └── layer2-{sha}.raw     # Base Ubuntu + Podman (~10GB, SHA of setup script)
-├── initrd/
-│   └── fc-agent-{sha}.initrd  # fc-agent injection initrd (SHA of binary)
-├── vm-disks/{vm_id}/        # Per-VM disk (CoW reflink)
-├── snapshots/               # Firecracker snapshots
-├── state/                   # VM state JSON files
-└── cache/                   # Downloaded cloud images
-```
-
----
-
-## Setup
-
-### dnsmasq Setup
-
-```bash
-# One-time: Install dnsmasq for DNS forwarding to VMs
-sudo apt-get update && sudo apt-get install -y dnsmasq
-sudo tee /etc/dnsmasq.d/fcvm.conf > /dev/null < anyhow::Result<()> {
 	eprintln!(
-		"[fc-agent] mounting FUSE volume at {} via vsock port {}",
-		mount_point, port
+		"[fc-agent] mounting FUSE volume at {} via vsock port {} ({} readers)",
+		mount_point, port, NUM_READERS
 	);
-	fuse_pipe::mount_vsock(HOST_CID, port, mount_point)
+	fuse_pipe::mount_vsock_with_readers(HOST_CID, port, mount_point, NUM_READERS)
 }
 
 /// Mount a FUSE filesystem with multiple reader threads.
diff --git a/fc-agent/src/main.rs b/fc-agent/src/main.rs
index a094cb3e..9b79a1ed 100644
--- a/fc-agent/src/main.rs
+++ b/fc-agent/src/main.rs
@@ -1550,16 +1550,12 @@ async fn main() -> Result<()> {
     let mut pull_succeeded = false;
     for attempt in 1..=MAX_RETRIES {
-        eprintln!(
-            "[fc-agent] =========================================="
-        );
+        eprintln!("[fc-agent] ==========================================");
         eprintln!(
             "[fc-agent] PULLING IMAGE: {} (attempt {}/{})",
             plan.image, attempt, MAX_RETRIES
         );
-        eprintln!(
-            "[fc-agent] =========================================="
-        );
+        eprintln!("[fc-agent] ==========================================");
 
         // Spawn podman pull and stream output in real-time
         let mut child = Command::new("podman")
@@ -1571,21 +1567,19 @@ async fn main() -> Result<()> {
             .context("spawning podman pull")?;
 
         // Stream stdout in real-time
-        let stdout_task = if let Some(stdout) = child.stdout.take() {
-            Some(tokio::spawn(async move {
+        let stdout_task = child.stdout.take().map(|stdout| {
+            tokio::spawn(async move {
                 let reader = BufReader::new(stdout);
                 let mut lines = reader.lines();
                 while let Ok(Some(line)) = lines.next_line().await {
                     eprintln!("[fc-agent] [podman] {}", line);
                 }
-            }))
-        } else {
-            None
-        };
+            })
+        });
 
         // Stream stderr in real-time and capture for error reporting
-        let stderr_task = if let Some(stderr) = child.stderr.take() {
-            Some(tokio::spawn(async move {
+        let stderr_task = child.stderr.take().map(|stderr| {
+            tokio::spawn(async move {
                 let reader = BufReader::new(stderr);
                 let mut lines = reader.lines();
                 let mut captured = Vec::new();
@@ -1594,10 +1588,8 @@ async fn main() -> Result<()> {
                     captured.push(line);
                 }
                 captured
-            }))
-        } else {
-            None
-        };
+            })
+        });
 
         // Wait for podman to finish
         let status = child.wait().await.context("waiting for podman pull")?;
@@ -1620,20 +1612,13 @@ async fn main() -> Result<()> {
             // Capture error for final bail message
             last_error = stderr_lines.join("\n");
 
-            eprintln!(
-                "[fc-agent] =========================================="
-            );
+            eprintln!("[fc-agent] ==========================================");
             eprintln!(
                 "[fc-agent] IMAGE PULL FAILED (attempt {}/{})",
                 attempt, MAX_RETRIES
            );
-            eprintln!(
-                "[fc-agent] exit code: {:?}",
-                status.code()
-            );
-            eprintln!(
-                "[fc-agent] =========================================="
-            );
+            eprintln!("[fc-agent] exit code: {:?}", status.code());
+            eprintln!("[fc-agent] ==========================================");
 
             if attempt < MAX_RETRIES {
                 eprintln!("[fc-agent] retrying in {} seconds...", RETRY_DELAY_SECS);
@@ -1642,16 +1627,12 @@ async fn main() -> Result<()> {
     }
 
     if !pull_succeeded {
-        eprintln!(
-            "[fc-agent] =========================================="
-        );
+        eprintln!("[fc-agent] ==========================================");
         eprintln!(
             "[fc-agent] FATAL: IMAGE PULL FAILED AFTER {} ATTEMPTS",
             MAX_RETRIES
        );
-        eprintln!(
-            "[fc-agent] =========================================="
-        );
+        eprintln!("[fc-agent] ==========================================");
         anyhow::bail!(
             "Failed to pull image after {} attempts:\n{}",
             MAX_RETRIES,
@@ -1718,7 +1699,10 @@ async fn main() -> Result<()> {
     // Port 4997 is dedicated for stdout/stderr
     let output_fd = create_output_vsock();
     if output_fd >= 0 {
-        eprintln!("[fc-agent] output vsock connected (port {})", OUTPUT_VSOCK_PORT);
+        eprintln!(
+            "[fc-agent] output vsock connected (port {})",
+            OUTPUT_VSOCK_PORT
+        );
     }
 
     // Stream stdout via vsock (wrapped in Arc for sharing across tasks)
@@ -1729,7 +1713,11 @@ async fn main() -> Result<()> {
             let reader = BufReader::new(stdout);
             let mut lines = reader.lines();
             while let Ok(Some(line)) = lines.next_line().await {
-                send_output_line(fd.load(std::sync::atomic::Ordering::Relaxed), "stdout", &line);
+                send_output_line(
+                    fd.load(std::sync::atomic::Ordering::Relaxed),
+                    "stdout",
+                    &line,
+                );
             }
         }))
     } else {
@@ -1743,7 +1731,11 @@ async fn main() -> Result<()> {
             let reader = BufReader::new(stderr);
             let mut lines = reader.lines();
             while let Ok(Some(line)) = lines.next_line().await {
-                send_output_line(fd.load(std::sync::atomic::Ordering::Relaxed), "stderr", &line);
+                send_output_line(
+                    fd.load(std::sync::atomic::Ordering::Relaxed),
+                    "stderr",
+                    &line,
+                );
             }
         }))
     } else {
diff --git a/fuse-pipe/Cargo.toml b/fuse-pipe/Cargo.toml
index 502f0365..37e3e3ac 100644
--- a/fuse-pipe/Cargo.toml
+++ b/fuse-pipe/Cargo.toml
@@ -9,9 +9,10 @@ keywords = ["fuse", "filesystem", "vsock", "async", "pipelining"]
 categories = ["filesystem", "asynchronous"]
 
 [features]
-default = ["fuse-client"]
-fuse-client = ["dep:fuser"]
+default = ["integration-slow"]
 trace-benchmarks = []  # Enable tracing in benchmarks
+privileged-tests = []  # Gate tests requiring root
+integration-slow = []  # Gate slow tests (pjdfstest)
 
 [dependencies]
 # Core
@@ -36,9 +37,9 @@ tracing-subscriber = { version = "0.3", features = ["env-filter"] }
 # Using local path for development - synced to EC2 via `make sync`
 fuse-backend-rs = { path = "../../fuse-backend-rs", default-features = false, features = ["fusedev"] }
 
-# Optional: FUSE client (local fork with multi-reader support via FUSE_DEV_IOC_CLONE)
+# FUSE client (local fork with multi-reader support via FUSE_DEV_IOC_CLONE)
 # Using local path for development - synced to EC2 via `make sync`
-fuser = { path = "../../fuser", optional = true }
+fuser = { path = "../../fuser" }
 
 # Concurrent data structures
 dashmap = "5.5"
@@ -61,5 +62,5 @@ name = "operations"
 harness = false
 
 [[test]]
-name = "pjdfstest_matrix"
-path = "tests/pjdfstest_matrix.rs"
+name = "pjdfstest_matrix_root"
+path = "tests/pjdfstest_matrix_root.rs"
diff --git a/fuse-pipe/src/lib.rs b/fuse-pipe/src/lib.rs
index b5153987..5b617a5d 100644
--- a/fuse-pipe/src/lib.rs
+++ b/fuse-pipe/src/lib.rs
@@ -57,7 +57,6 @@ pub mod server;
 pub mod telemetry;
 pub mod transport;
 
-#[cfg(feature = "fuse-client")]
 pub mod client;
 
 // Re-export protocol types at crate root for convenience
@@ -78,9 +77,8 @@ pub use server::{AsyncServer, FilesystemHandler, PassthroughFs, ServerConfig};
 pub use telemetry::{SpanCollector, SpanSummary};
 
 // Re-export client types
-#[cfg(feature = "fuse-client")]
 pub use client::{mount, mount_spawn, FuseClient, MountConfig, MountHandle, Multiplexer};
-#[cfg(all(feature = "fuse-client", target_os = "linux"))]
+#[cfg(target_os = "linux")]
 pub use client::{mount_vsock, mount_vsock_with_options, mount_vsock_with_readers};
 
 /// Prelude for common imports.
diff --git a/fuse-pipe/tests/common/mod.rs b/fuse-pipe/tests/common/mod.rs
index 0c9f02ee..5fddde27 100644
--- a/fuse-pipe/tests/common/mod.rs
+++ b/fuse-pipe/tests/common/mod.rs
@@ -44,19 +44,6 @@ fn init_tracing() {
 /// Global counter for unique test IDs
 static TEST_COUNTER: AtomicU64 = AtomicU64::new(0);
 
-/// Panic if running as root. Use this in tests that should NOT require root
-/// to catch accidental `sudo cargo test` invocations.
-pub fn require_nonroot() {
-    let euid = unsafe { libc::geteuid() };
-    if euid == 0 {
-        panic!(
-            "This test should NOT be run as root. \
-             Use `cargo test` not `sudo cargo test`. \
-             Root tests are in integration_root.rs and test_permission_edge_cases.rs"
-        );
-    }
-}
-
 /// Join a thread with timeout. Returns true if joined successfully, false if timed out.
 fn join_with_timeout(thread: JoinHandle<()>, timeout: Duration) -> bool {
     let start = std::time::Instant::now();
diff --git a/fuse-pipe/tests/integration.rs b/fuse-pipe/tests/integration.rs
index 7729bbe1..649d3f62 100644
--- a/fuse-pipe/tests/integration.rs
+++ b/fuse-pipe/tests/integration.rs
@@ -12,12 +12,11 @@ mod common;
 use std::fs;
 use std::os::unix::io::AsRawFd;
 
-use common::{cleanup, require_nonroot, unique_paths, FuseMount};
+use common::{cleanup, unique_paths, FuseMount};
 use nix::unistd::{lseek, Whence};
 
 #[test]
 fn test_create_and_read_file() {
-    require_nonroot();
     let (data_dir, mount_dir) = unique_paths("fuse-integ");
     let fuse = FuseMount::new(&data_dir, &mount_dir, 1);
 
@@ -33,7 +32,6 @@ fn test_create_and_read_file() {
 
 #[test]
 fn test_create_directory() {
-    require_nonroot();
     let (data_dir, mount_dir) = unique_paths("fuse-integ");
     let fuse = FuseMount::new(&data_dir, &mount_dir, 1);
 
@@ -48,7 +46,6 @@ fn test_create_directory() {
 
 #[test]
 fn test_list_directory() {
-    require_nonroot();
     let (data_dir, mount_dir) = unique_paths("fuse-integ");
     let fuse = FuseMount::new(&data_dir, &mount_dir, 1);
     let mount = fuse.mount_path();
@@ -77,7 +74,6 @@ fn test_list_directory() {
 
 #[test]
 fn test_nested_file() {
-    require_nonroot();
     let (data_dir, mount_dir) = unique_paths("fuse-integ");
     let fuse = FuseMount::new(&data_dir, &mount_dir, 1);
 
@@ -99,7 +95,6 @@ fn test_nested_file() {
 
 #[test]
 fn test_file_metadata() {
-    require_nonroot();
     let (data_dir, mount_dir) = unique_paths("fuse-integ");
     let fuse = FuseMount::new(&data_dir, &mount_dir, 1);
 
@@ -120,7 +115,6 @@ fn test_file_metadata() {
 
 #[test]
 fn test_rename_across_directories() {
-    require_nonroot();
     let (data_dir, mount_dir) = unique_paths("fuse-integ");
     let fuse = FuseMount::new(&data_dir, &mount_dir, 1);
     let mount = fuse.mount_path();
@@ -150,7 +144,6 @@ fn test_rename_across_directories() {
 
 #[test]
 fn test_symlink_and_readlink() {
-    require_nonroot();
     let (data_dir, mount_dir) = unique_paths("fuse-integ");
     let fuse = FuseMount::new(&data_dir, &mount_dir, 1);
     let mount = fuse.mount_path();
@@ -176,7 +169,6 @@ fn test_symlink_and_readlink() {
 
 #[test]
 fn test_hardlink_survives_source_removal() {
-    require_nonroot();
     let (data_dir, mount_dir) = unique_paths("fuse-integ");
     let fuse = FuseMount::new(&data_dir, &mount_dir, 1);
     let mount = fuse.mount_path();
@@ -199,7 +191,6 @@ fn test_hardlink_survives_source_removal() {
 
 #[test]
 fn test_multi_reader_mount_basic_io() {
-    require_nonroot();
     let (data_dir, mount_dir) = unique_paths("fuse-integ");
     let fuse = FuseMount::new(&data_dir, &mount_dir, 4);
     let mount = fuse.mount_path().to_path_buf();
@@ -229,7 +220,6 @@ fn test_multi_reader_mount_basic_io() {
 /// Test that lseek supports negative offsets relative to SEEK_END.
 #[test]
 fn test_lseek_supports_negative_offsets() {
-    require_nonroot();
     common::increase_ulimit();
 
     let (data_dir, mount_dir) = unique_paths("fuse-integ");
diff --git a/fuse-pipe/tests/integration_root.rs b/fuse-pipe/tests/integration_root.rs
index a632a9ba..98f8dbe3 100644
--- a/fuse-pipe/tests/integration_root.rs
+++ b/fuse-pipe/tests/integration_root.rs
@@ -5,7 +5,9 @@
 //! - setfsuid()/setfsgid() credential switching
 //! - mkdir as non-root user via credential switching
 //!
-//! Run with: `sudo cargo test --release -p fuse-pipe --test integration_root`
+//! Run with: `sudo cargo test --release -p fuse-pipe --features privileged-tests --test integration_root`
+
+#![cfg(feature = "privileged-tests")]
 
 mod common;
diff --git a/fuse-pipe/tests/pjdfstest_common.rs b/fuse-pipe/tests/pjdfstest_common.rs
index f9d7ebdf..e01b2d48 100644
--- a/fuse-pipe/tests/pjdfstest_common.rs
+++ b/fuse-pipe/tests/pjdfstest_common.rs
@@ -191,10 +191,10 @@ pub fn run_single_category(category: &str, jobs: usize) -> (bool, usize, usize)
     init_tracing();
     raise_fd_limit();
 
-    if !is_pjdfstest_installed() {
-        eprintln!("pjdfstest not found - skipping {}", category);
-        return (true, 0, 0); // Skip, don't fail
-    }
+    assert!(
+        is_pjdfstest_installed(),
+        "pjdfstest binary not found - install it or exclude pjdfstest tests from run"
+    );
 
     // Unique paths for this test process
     let pid = std::process::id();
diff --git a/fuse-pipe/tests/pjdfstest_matrix.rs b/fuse-pipe/tests/pjdfstest_matrix_root.rs
similarity index 75%
rename from fuse-pipe/tests/pjdfstest_matrix.rs
rename to fuse-pipe/tests/pjdfstest_matrix_root.rs
index 3c569098..6c80c68b 100644
--- a/fuse-pipe/tests/pjdfstest_matrix.rs
+++ b/fuse-pipe/tests/pjdfstest_matrix_root.rs
@@ -1,7 +1,13 @@
-//! Matrix pjdfstest runner - each category is a separate test for parallel execution.
+//! Host-side pjdfstest matrix - tests fuse-pipe FUSE directly (no VM)
 //!
-//! Run with: cargo nextest run -p fuse-pipe --test pjdfstest_matrix
-//! Categories run in parallel via nextest's process isolation.
+//! Each category is a separate test, allowing nextest to run all 17 in parallel.
+//! Tests fuse-pipe's PassthroughFs via local FUSE mount.
+//!
+//! See also: tests/test_fuse_in_vm_matrix.rs (in-VM matrix, tests full vsock stack)
+//!
+//! Run with: cargo nextest run -p fuse-pipe --test pjdfstest_matrix_root --features privileged-tests,integration-slow
+
+#![cfg(all(feature = "privileged-tests", feature = "integration-slow"))]
 
 mod pjdfstest_common;
 
@@ -22,8 +28,7 @@ macro_rules! pjdfstest_category {
     };
 }
 
-// Generate a test function for each pjdfstest category
-// These will run in parallel via nextest
+// All categories require root for chown/mknod/user-switching
 pjdfstest_category!(test_pjdfstest_chflags, "chflags");
 pjdfstest_category!(test_pjdfstest_chmod, "chmod");
 pjdfstest_category!(test_pjdfstest_chown, "chown");
diff --git a/fuse-pipe/tests/test_allow_other.rs b/fuse-pipe/tests/test_allow_other.rs
index a77fde36..652b4bdb 100644
--- a/fuse-pipe/tests/test_allow_other.rs
+++ b/fuse-pipe/tests/test_allow_other.rs
@@ -5,7 +5,7 @@
 
 mod common;
 
-use common::{cleanup, require_nonroot, unique_paths, FuseMount};
+use common::{cleanup, unique_paths, FuseMount};
 use std::fs;
 use std::process::Command;
 
@@ -13,16 +13,12 @@ use std::process::Command;
 /// This test creates a file as the mounting user, then verifies another user can access it.
 #[test]
 fn test_allow_other_with_fuse_conf() {
-    require_nonroot();
-
-    // Skip if user_allow_other is not configured
+    // Require user_allow_other in fuse.conf - fail if not configured
     let fuse_conf = fs::read_to_string("/etc/fuse.conf").unwrap_or_default();
-    if !fuse_conf.lines().any(|l| l.trim() == "user_allow_other") {
-        eprintln!(
-            "Skipping test_allow_other_with_fuse_conf - user_allow_other not in /etc/fuse.conf"
-        );
-        return;
-    }
+    assert!(
+        fuse_conf.lines().any(|l| l.trim() == "user_allow_other"),
+        "Test requires user_allow_other in /etc/fuse.conf"
+    );
 
     let (data_dir, mount_dir) = unique_paths("allow-other");
     let fuse = FuseMount::new(&data_dir, &mount_dir, 1);
diff --git a/fuse-pipe/tests/test_mount_stress.rs b/fuse-pipe/tests/test_mount_stress.rs
index 61dbbb35..78d9330d 100644
--- a/fuse-pipe/tests/test_mount_stress.rs
+++ b/fuse-pipe/tests/test_mount_stress.rs
@@ -5,7 +5,7 @@
 
 mod common;
 
-use common::{cleanup, require_nonroot, unique_paths, FuseMount};
+use common::{cleanup, unique_paths, FuseMount};
 use std::fs;
 use std::sync::atomic::{AtomicUsize, Ordering};
 use std::sync::Arc;
@@ -16,7 +16,6 @@ use std::time::{Duration, Instant};
 /// This catches resource leaks, cleanup issues, and deadlocks.
 #[test]
 fn test_parallel_mount_stress() {
-    require_nonroot();
     const NUM_THREADS: usize = 8;
     const ITERATIONS_PER_THREAD: usize = 5;
 
@@ -96,7 +95,6 @@ fn test_parallel_mount_stress() {
 /// This catches cleanup issues that only manifest under rapid cycling.
 #[test]
 fn test_rapid_mount_unmount_cycles() {
-    require_nonroot();
     const CYCLES: usize = 20;
 
     let start = Instant::now();
@@ -131,7 +129,6 @@ fn test_rapid_mount_unmount_cycles() {
 /// All mounts are created first, then operations run in parallel.
 #[test]
 fn test_concurrent_operations_on_multiple_mounts() {
-    require_nonroot();
     const NUM_MOUNTS: usize = 4;
     const OPS_PER_MOUNT: usize = 10;
diff --git a/fuse-pipe/tests/test_permission_edge_cases.rs b/fuse-pipe/tests/test_permission_edge_cases.rs
index ca9a1904..a6f54a93 100644
--- a/fuse-pipe/tests/test_permission_edge_cases.rs
+++ b/fuse-pipe/tests/test_permission_edge_cases.rs
@@ -3,9 +3,9 @@
 //! These tests reproduce specific pjdfstest failures to enable fast iteration.
 //! They test edge cases in chmod, chown, open, truncate, and link operations.
 //!
-//! Run with: `sudo cargo test --test test_permission_edge_cases -- --nocapture`
+//! Run with: `sudo cargo test --features privileged-tests --test test_permission_edge_cases -- --nocapture`
 
-// Allow unused variables - test code often has unused return values
+#![cfg(feature = "privileged-tests")]
 #![allow(unused_variables)]
 
 mod common;
diff --git a/fuse-pipe/tests/test_unmount_race.rs b/fuse-pipe/tests/test_unmount_race.rs
index a22a129e..7279090f 100644
--- a/fuse-pipe/tests/test_unmount_race.rs
+++ b/fuse-pipe/tests/test_unmount_race.rs
@@ -11,7 +11,7 @@ use std::fs::{self, File};
 use std::io::{Read, Write};
 use std::thread;
 
-use common::{cleanup, require_nonroot, unique_paths, FuseMount};
+use common::{cleanup, unique_paths, FuseMount};
 
 /// Reproduce the unmount race with heavy I/O.
 ///
@@ -20,7 +20,6 @@ use common::{cleanup, require_nonroot, unique_paths, FuseMount};
 /// is called, causing ERROR logs.
 #[test]
 fn test_unmount_after_heavy_io() {
-    require_nonroot();
     // Use many readers to increase chance of race
     const NUM_READERS: usize = 16;
     const NUM_FILES: usize = 100;
@@ -79,7 +78,6 @@ fn test_unmount_after_heavy_io() {
 /// Run the test multiple times to increase chance of hitting the race.
 #[test]
 fn test_unmount_race_repeated() {
-    require_nonroot();
     for i in 0..5 {
         eprintln!("\n=== Iteration {} ===", i);
         test_unmount_after_heavy_io_inner(i);
diff --git a/src/cli/args.rs b/src/cli/args.rs
index 82fba71e..ad0fb456 100644
--- a/src/cli/args.rs
+++ b/src/cli/args.rs
@@ -31,6 +31,8 @@ pub enum Commands {
     Snapshots,
     /// Execute a command in a running VM
     Exec(ExecArgs),
+    /// Setup kernel and rootfs (kernel ~15MB download, rootfs ~10GB creation, takes 5-10 minutes)
+    Setup,
 }
 
 // ============================================================================
@@ -107,6 +109,11 @@ pub struct RunArgs {
     /// Useful for diagnosing fc-agent startup issues
     #[arg(long)]
     pub strace_agent: bool,
+
+    /// Run setup if kernel/rootfs are missing (takes 5-10 minutes on first run)
+    /// Without this flag, fcvm will fail if setup hasn't been run
+    #[arg(long)]
+    pub setup: bool,
 }
 
 // ============================================================================
diff --git a/src/commands/mod.rs b/src/commands/mod.rs
index 36261571..f8ac07c9 100644
--- a/src/commands/mod.rs
+++ b/src/commands/mod.rs
@@ -2,6 +2,7 @@ pub mod common;
 pub mod exec;
 pub mod ls;
 pub mod podman;
+pub mod setup;
 pub mod snapshot;
 pub mod snapshots;
 
@@ -9,5 +10,6 @@ pub use exec::cmd_exec;
 pub use ls::cmd_ls;
 pub use podman::cmd_podman;
+pub use setup::cmd_setup;
 pub use snapshot::cmd_snapshot;
 pub use snapshots::cmd_snapshots;
diff --git a/src/commands/podman.rs b/src/commands/podman.rs
index c381240b..8cce558a 100644
--- a/src/commands/podman.rs
+++ b/src/commands/podman.rs
@@ -1,4 +1,5 @@
 use anyhow::{bail, Context, Result};
+use fs2::FileExt;
 use std::path::PathBuf;
 use tokio::signal::unix::{signal, SignalKind};
 use tracing::{debug, info, warn};
@@ -155,10 +156,7 @@ async fn run_status_listener(
 /// Host → Guest: "stdin:content" (written to container stdin)
 ///
 /// Returns collected output lines as Vec<(stream, line)>.
-async fn run_output_listener(
-    socket_path: &str,
-    vm_id: &str,
-) -> Result<Vec<(String, String)>> {
+async fn run_output_listener(socket_path: &str, vm_id: &str) -> Result<Vec<(String, String)>> {
     use tokio::io::{AsyncBufReadExt, AsyncWriteExt, BufReader};
     use tokio::net::UnixListener;
 
@@ -256,14 +254,21 @@ async fn cmd_podman_run(args: RunArgs) -> Result<()> {
     // Validate VM name before any setup work
     validate_vm_name(&args.name).context("invalid VM name")?;
 
-    // Ensure kernel, rootfs, and initrd exist (auto-setup on first run)
-    let kernel_path = crate::setup::ensure_kernel()
+    // Disallow --setup when running as root
+    // Root users should run `fcvm setup` explicitly
+    if args.setup && nix::unistd::geteuid().is_root() {
+        bail!("--setup is not allowed when running as root. Run 'fcvm setup' first.");
+    }
+
+    // Get kernel, rootfs, and initrd paths
+    // With --setup: create if missing; without: fail if missing
+    let kernel_path = crate::setup::ensure_kernel(args.setup)
         .await
         .context("setting up kernel")?;
-    let base_rootfs = crate::setup::ensure_rootfs()
+    let base_rootfs = crate::setup::ensure_rootfs(args.setup)
         .await
         .context("setting up rootfs")?;
-    let initrd_path = crate::setup::ensure_fc_agent_initrd()
+    let initrd_path = crate::setup::ensure_fc_agent_initrd(args.setup)
         .await
         .context("setting up fc-agent initrd")?;
 
@@ -287,43 +292,91 @@ async fn cmd_podman_run(args: RunArgs) -> Result<()> {
         .collect::<Result<Vec<_>>>()
         .context("parsing volume mappings")?;
 
-    // For localhost/ images, use skopeo to copy image to a directory
-    // The guest will use skopeo to import it into local storage
+    // For localhost/ images, use content-addressable cache for skopeo export
+    // This avoids lock contention when multiple VMs export the same image
     let _image_export_dir = if args.image.starts_with("localhost/") {
-        let image_dir = paths::vm_runtime_dir(&vm_id).join("image-export");
-        tokio::fs::create_dir_all(&image_dir)
-            .await
-            .context("creating image export directory")?;
-
-        info!(image = %args.image, "Exporting localhost image with skopeo");
-
-        let output = tokio::process::Command::new("skopeo")
-            .arg("copy")
-            .arg(format!("containers-storage:{}", args.image))
-            .arg(format!("dir:{}", image_dir.display()))
+        // Get image digest for content-addressable storage
+        let inspect_output = tokio::process::Command::new("podman")
+            .args(["image", "inspect", &args.image, "--format", "{{.Digest}}"])
             .output()
             .await
-            .context("running skopeo copy")?;
+            .context("inspecting image digest")?;
 
-        if !output.status.success() {
-            let stderr = String::from_utf8_lossy(&output.stderr);
+        if !inspect_output.status.success() {
+            let stderr = String::from_utf8_lossy(&inspect_output.stderr);
             bail!(
-                "Failed to export image '{}' with skopeo: {}",
+                "Failed to get digest for image '{}': {}",
                 args.image,
                 stderr
             );
         }
 
-        info!(dir = %image_dir.display(), "Image exported to OCI directory");
+        let digest = String::from_utf8_lossy(&inspect_output.stdout)
+            .trim()
+            .to_string();
+
+        // Use content-addressable cache: /mnt/fcvm-btrfs/image-cache/{digest}/
+        let image_cache_dir = paths::base_dir().join("image-cache");
+        tokio::fs::create_dir_all(&image_cache_dir)
+            .await
+            .context("creating image-cache directory")?;
+
+        let cache_dir = image_cache_dir.join(&digest);
+
+        // Lock per-digest to prevent concurrent exports of the same image
+        let lock_path = image_cache_dir.join(format!("{}.lock", &digest));
+        let lock_file =
+            std::fs::File::create(&lock_path).context("creating image cache lock file")?;
+        lock_file
+            .lock_exclusive()
+            .context("acquiring image cache lock")?;
 
-        // Add the image directory as a read-only volume mount
+        // Check if already cached (inside lock to prevent race)
+        let manifest_path = cache_dir.join("manifest.json");
+        if !manifest_path.exists() {
+            info!(image = %args.image, digest = %digest, "Exporting localhost image with skopeo");
+
+            // Create cache dir
+            tokio::fs::create_dir_all(&cache_dir)
+                .await
+                .context("creating image cache directory")?;
+
+            let output = tokio::process::Command::new("skopeo")
+                .arg("copy")
+                .arg(format!("containers-storage:{}", args.image))
+                .arg(format!("dir:{}", cache_dir.display()))
+                .output()
+                .await
+                .context("running skopeo copy")?;
+
+            if !output.status.success() {
+                let stderr = String::from_utf8_lossy(&output.stderr);
+                // Clean up partial export
+                let _ = tokio::fs::remove_dir_all(&cache_dir).await;
+                drop(lock_file); // Release lock before bailing
+                bail!(
+                    "Failed to export image '{}' with skopeo: {}",
+                    args.image,
+                    stderr
+                );
+            }
+
+            info!(dir = %cache_dir.display(), "Image exported to OCI directory");
+        } else {
+            info!(image = %args.image, digest = %digest, "Using cached image export");
+        }
+
+        // Lock released when lock_file is dropped
+        drop(lock_file);
+
+        // Add the cached image directory as a read-only volume mount
         volume_mappings.push(VolumeMapping {
-            host_path: image_dir.clone(),
+            host_path: cache_dir.clone(),
             guest_path: "/tmp/fcvm-image".to_string(),
             read_only: true,
         });
 
-        Some(image_dir)
+        Some(cache_dir)
     } else {
         None
     };
@@ -661,56 +714,150 @@ async fn run_vm_setup(
     // This is fully rootless - no sudo required!
 
     // Step 1: Spawn holder process (keeps namespace alive)
+    // Retry for up to 2 seconds if holder dies (transient failures under load)
     let holder_cmd = slirp_net.build_holder_command();
     info!(cmd = ?holder_cmd, "spawning namespace holder for rootless networking");
 
-    // Spawn holder with piped stderr to capture errors if it fails
-    let mut child = tokio::process::Command::new(&holder_cmd[0])
-        .args(&holder_cmd[1..])
-        .stdin(std::process::Stdio::null())
-        .stdout(std::process::Stdio::null())
-        .stderr(std::process::Stdio::piped())
-        .spawn()
-        .with_context(|| format!("failed to spawn holder: {:?}", holder_cmd))?;
-
-    let holder_pid = child.id().context("getting holder process PID")?;
-    info!(holder_pid = holder_pid, "namespace holder started");
-
-    // Give holder a moment to potentially fail, then check status
-    tokio::time::sleep(std::time::Duration::from_millis(50)).await;
-    match child.try_wait() {
-        Ok(Some(status)) => {
-            // Holder exited - capture stderr to see why
-            let stderr = if let Some(mut stderr_pipe) = child.stderr.take() {
+    let retry_deadline = std::time::Instant::now() + std::time::Duration::from_secs(2);
+    let mut attempt = 0;
+    #[allow(unused_assignments)]
+    let mut _last_error: Option<String> = None;
+
+    let (mut child, holder_pid, mut holder_stderr) = loop {
+        attempt += 1;
+
+        // Spawn holder with piped stderr to capture errors if it fails
+        let mut child = tokio::process::Command::new(&holder_cmd[0])
+            .args(&holder_cmd[1..])
+            .stdin(std::process::Stdio::null())
+            .stdout(std::process::Stdio::null())
+            .stderr(std::process::Stdio::piped())
+            .spawn()
+            .with_context(|| format!("failed to spawn holder: {:?}", holder_cmd))?;
+
+        let holder_pid = child.id().context("getting holder process PID")?;
+        if attempt > 1 {
+            info!(
+                holder_pid = holder_pid,
+                attempt = attempt,
+                "namespace holder started (retry)"
+            );
+        } else {
+            info!(holder_pid = holder_pid, "namespace holder started");
+        }
+
+        // Give holder a moment to potentially fail, then check status
+        tokio::time::sleep(std::time::Duration::from_millis(50)).await;
+
+        // Take stderr pipe - we'll use it for diagnostics if holder dies later
+        let mut holder_stderr = child.stderr.take();
+
+        match child.try_wait() {
+            Ok(Some(status)) => {
+                // Holder exited - capture stderr to see why
+                let stderr = if let Some(ref mut pipe) = holder_stderr {
+                    use tokio::io::AsyncReadExt;
+                    let mut buf = String::new();
+                    let _ = pipe.read_to_string(&mut buf).await;
+                    buf
+                } else {
+                    String::new()
+                };
+
+                _last_error = Some(format!(
+                    "holder exited immediately: status={}, stderr='{}'",
+                    status,
+                    stderr.trim()
+                ));
+
+                if std::time::Instant::now() < retry_deadline {
+                    warn!(
+                        holder_pid = holder_pid,
+                        attempt = attempt,
+                        status = %status,
+                        stderr = %stderr.trim(),
+                        "holder died, retrying..."
+                    );
+                    tokio::time::sleep(std::time::Duration::from_millis(100)).await;
+                    continue;
+                } else {
+                    bail!(
+                        "holder process exited immediately after {} attempts: status={}, stderr={}, cmd={:?}",
+                        attempt,
+                        status,
+                        stderr.trim(),
+                        holder_cmd
+                    );
+                }
+            }
+            Ok(None) => {
+                debug!(holder_pid = holder_pid, "holder still running after 50ms");
+            }
+            Err(e) => {
+                warn!(holder_pid = holder_pid, error = ?e, "failed to check holder status");
+            }
+        }
+
+        // Additional delay for namespace setup
+        // The --map-root-user option invokes setuid helpers asynchronously
+        tokio::time::sleep(std::time::Duration::from_millis(50)).await;
+
+        // Check if holder is still alive before proceeding
+        if !crate::utils::is_process_alive(holder_pid) {
+            // Try to capture stderr from the dead holder process
+            let holder_stderr_content = if let Some(ref mut pipe) = holder_stderr {
                 use tokio::io::AsyncReadExt;
                 let mut buf = String::new();
-                let _ = stderr_pipe.read_to_string(&mut buf).await;
-                buf
+                match tokio::time::timeout(
+                    std::time::Duration::from_millis(100),
+                    pipe.read_to_string(&mut buf),
+                )
+                .await
+                {
+                    Ok(Ok(_)) => buf,
+                    _ => String::new(),
+                }
             } else {
                 String::new()
             };
-            bail!(
-                "holder process exited immediately: status={}, stderr={}, cmd={:?}",
-                status,
-                stderr.trim(),
-                holder_cmd
-            );
-        }
-        Ok(None) => {
-            debug!(holder_pid = holder_pid, "holder still running after 50ms");
-            // Holder is running - drop the stderr pipe so it doesn't block
-            drop(child.stderr.take());
-        }
-        Err(e) => {
-            warn!(holder_pid = holder_pid, error = ?e, "failed to check holder status");
+
+            let _ = child.kill().await;
+
+            _last_error = Some(format!(
+                "holder died after 100ms: stderr='{}'",
+                holder_stderr_content.trim()
+            ));
+
+            if std::time::Instant::now() < retry_deadline {
+                warn!(
+                    holder_pid = holder_pid,
+                    attempt = attempt,
+                    holder_stderr = %holder_stderr_content.trim(),
+                    "holder died after initial check, retrying..."
+                );
+                tokio::time::sleep(std::time::Duration::from_millis(100)).await;
+                continue;
+            } else {
+                let max_user_ns = std::fs::read_to_string("/proc/sys/user/max_user_namespaces")
+                    .unwrap_or_else(|_| "unknown".to_string());
+                bail!(
+                    "holder process (PID {}) died after {} attempts. \
+                     stderr='{}', max_user_namespaces={}. \
+                     This may indicate resource exhaustion or namespace limit reached.",
+                    holder_pid,
+                    attempt,
+                    holder_stderr_content.trim(),
+                    max_user_ns.trim()
+                );
+            }
         }
-    }
 
-    // Additional delay for namespace setup (already waited 50ms above)
-    // The --map-auto option invokes setuid helpers asynchronously
-    tokio::time::sleep(std::time::Duration::from_millis(50)).await;
+        // Holder is alive - break out of retry loop
+        break (child, holder_pid, holder_stderr);
+    };
 
     // Step 2: Run setup script via nsenter (creates TAPs, iptables, etc.)
+    // This is also inside retry logic - if holder dies during nsenter, retry everything
     let setup_script = slirp_net.build_setup_script();
     let nsenter_prefix = slirp_net.build_nsenter_prefix(holder_pid);
 
@@ -737,15 +884,6 @@ async fn run_vm_setup(
         warn!("/dev/net/tun not available - TAP device creation will fail");
     }
 
-    // Verify holder is still alive before attempting nsenter
-    if !crate::utils::is_process_alive(holder_pid) {
-        let _ = child.kill().await;
-        bail!(
-            "holder process (PID {}) died before network setup could run",
-            holder_pid
-        );
-    }
-
     info!(holder_pid = holder_pid, "running network setup via nsenter");
 
     // Log the setup script for debugging
@@ -767,32 +905,171 @@ async fn run_vm_setup(
     if !setup_output.status.success() {
         let stderr = String::from_utf8_lossy(&setup_output.stderr);
         let stdout = String::from_utf8_lossy(&setup_output.stdout);
-        // Kill holder before bailing
-        let _ = child.kill().await;
+        // Re-check state for diagnostics
         let holder_alive = std::path::Path::new(&proc_dir).exists();
         let ns_user_exists = std::path::Path::new(&ns_user).exists();
         let ns_net_exists = std::path::Path::new(&ns_net).exists();
 
-        // Log comprehensive error info at ERROR level (always visible)
-        warn!(
-            holder_pid = holder_pid,
-            holder_alive = holder_alive,
-            tun_exists = tun_exists,
-            ns_user_exists = ns_user_exists,
-            ns_net_exists = ns_net_exists,
-            stderr = %stderr.trim(),
-            stdout = %stdout.trim(),
-            "network setup failed - diagnostics"
-        );
+        // If holder died during nsenter, this is a retryable error
+        if !holder_alive && std::time::Instant::now() < retry_deadline {
+            // Holder died during nsenter - retry the whole thing
+            let holder_stderr_content = if let Some(ref mut pipe) = holder_stderr {
+                use tokio::io::AsyncReadExt;
+                let mut buf = String::new();
+                match tokio::time::timeout(
+                    std::time::Duration::from_millis(100),
+                    pipe.read_to_string(&mut buf),
+                )
+                .await
+                {
+                    Ok(Ok(_)) => buf,
+                    _ => String::new(),
+                }
+            } else {
+                String::new()
+            };
 
-        bail!(
-            "network setup failed: {} (tun={}, holder_alive={}, ns_user={}, ns_net={})",
-            stderr.trim(),
-            tun_exists,
-            holder_alive,
-            ns_user_exists,
-            ns_net_exists
+            let _ = child.kill().await;
+
+            warn!(
+                holder_pid = holder_pid,
+                attempt = attempt,
+                holder_stderr = %holder_stderr_content.trim(),
+                nsenter_stderr = %stderr.trim(),
+                "holder died during nsenter, retrying..."
+            );
+
+            // Jump back to the retry loop by recursing into this block
+            // We need to restructure - for now just retry once more inline
+            tokio::time::sleep(std::time::Duration::from_millis(100)).await;
+
+            // Retry: spawn new holder
+            attempt += 1;
+            let mut retry_child = tokio::process::Command::new(&holder_cmd[0])
+                .args(&holder_cmd[1..])
+                .stdin(std::process::Stdio::null())
+                .stdout(std::process::Stdio::null())
+                .stderr(std::process::Stdio::piped())
+                .spawn()
+                .with_context(|| {
+                    format!("failed to spawn holder on retry: {:?}", holder_cmd)
+                })?;
+
+            let retry_holder_pid = retry_child.id().context("getting retry holder PID")?;
+            info!(
+                holder_pid = retry_holder_pid,
+                attempt = attempt,
+                "namespace holder started (retry after nsenter failure)"
+            );
+
+            tokio::time::sleep(std::time::Duration::from_millis(100)).await;
+
+            if !crate::utils::is_process_alive(retry_holder_pid) {
+                let _ = retry_child.kill().await;
+                bail!(
+                    "holder died on retry after nsenter failure (attempt {})",
+                    attempt
+                );
+            }
+
+            // Retry nsenter with new holder
+            let retry_nsenter_prefix = slirp_net.build_nsenter_prefix(retry_holder_pid);
+            let retry_output = tokio::process::Command::new(&retry_nsenter_prefix[0])
+                .args(&retry_nsenter_prefix[1..])
+                .arg("bash")
+                .arg("-c")
+                .arg(&setup_script)
+                .output()
+                .await
+                .context("running network setup via nsenter (retry)")?;
+
+            if !retry_output.status.success() {
+                let retry_stderr = String::from_utf8_lossy(&retry_output.stderr);
+                let _ = retry_child.kill().await;
+                bail!(
+                    "network setup failed on retry: {} (attempt {})",
+                    retry_stderr.trim(),
+                    attempt
+                );
+            }
+
+            // Success on retry - update variables for rest of function
+            child = retry_child;
+            // Note: holder_pid is shadowed in the outer scope, but we continue with retry_holder_pid
+            info!(
+                holder_pid = retry_holder_pid,
+                attempts = attempt,
+                "network setup succeeded after retry"
+            );
+        } else {
+            // If holder died, try to capture its stderr for more context
+            let holder_stderr_content = if !holder_alive {
+                if let Some(ref mut pipe) = holder_stderr {
+                    use tokio::io::AsyncReadExt;
+                    let mut buf = String::new();
+                    match tokio::time::timeout(
+                        std::time::Duration::from_millis(100),
+                        pipe.read_to_string(&mut buf),
+                    )
+                    .await
+                    {
+                        Ok(Ok(_)) => buf,
+                        _ => String::new(),
+                    }
+                } else {
+                    String::new()
+                }
+            } else {
+                String::new()
+            };
+
+            // Kill holder before bailing
+            let _ = child.kill().await;
+
+            // Log comprehensive error info at ERROR level (always visible)
+            warn!(
+                holder_pid = holder_pid,
+                holder_alive = holder_alive,
+                holder_stderr = %holder_stderr_content.trim(),
+                tun_exists = tun_exists,
+                ns_user_exists = ns_user_exists,
+                ns_net_exists = ns_net_exists,
+                nsenter_stderr = %stderr.trim(),
+                nsenter_stdout = %stdout.trim(),
+                "network setup failed - diagnostics"
+            );
+
+            if !holder_alive {
+                bail!(
+                    "network setup failed: holder died during nsenter after {} attempts. \
+                     nsenter_stderr='{}', holder_stderr='{}', \
+                     (tun={}, ns_user={}, ns_net={})",
+                    attempt,
+                    stderr.trim(),
+                    holder_stderr_content.trim(),
+                    tun_exists,
+                    ns_user_exists,
+                    ns_net_exists
+                );
+            } else {
+                bail!(
+                    "network setup failed: {} (tun={}, holder_alive={}, ns_user={}, ns_net={})",
+                    stderr.trim(),
+                    tun_exists,
+                    holder_alive,
+                    ns_user_exists,
+                    ns_net_exists
+                );
+            }
+        }
+    }
+
+    if attempt > 1 {
+        info!(
+            holder_pid = holder_pid,
+            attempts = attempt,
+            "namespace setup succeeded after retries"
         );
     }
diff --git a/src/commands/setup.rs b/src/commands/setup.rs
new file mode 100644
index 00000000..7d3ecc66
--- /dev/null
+++ b/src/commands/setup.rs
@@ -0,0 +1,31 @@
+use anyhow::{Context, Result};
+
+/// Run setup to download kernel and create rootfs.
+///
+/// This downloads the Kata kernel (~15MB) and creates the Layer 2 rootfs (~10GB).
+/// The rootfs creation downloads Ubuntu cloud image and installs podman, taking 5-10 minutes.
+pub async fn cmd_setup() -> Result<()> {
+    println!("Setting up fcvm (this may take 5-10 minutes on first run)...");
+
+    // Ensure kernel exists (downloads Kata kernel if missing)
+    let kernel_path = crate::setup::ensure_kernel(true)
+        .await
+        .context("setting up kernel")?;
+    println!("  ✓ Kernel ready: {}", kernel_path.display());
+
+    // Ensure rootfs exists (creates Layer 2 if missing)
+    let rootfs_path = crate::setup::ensure_rootfs(true)
+        .await
+        .context("setting up rootfs")?;
+    println!("  ✓ Rootfs ready: {}", rootfs_path.display());
+
+    // Ensure fc-agent initrd exists
+    let initrd_path = crate::setup::ensure_fc_agent_initrd(true)
+        .await
+        .context("setting up fc-agent initrd")?;
+    println!("  ✓ Initrd ready: {}", initrd_path.display());
+
+    println!("\nSetup complete! You can now run VMs with: fcvm podman run ...");
+
+    Ok(())
+}
diff --git a/src/commands/snapshot.rs b/src/commands/snapshot.rs
index 5c0b38b2..dfcf4eb9 100644
--- a/src/commands/snapshot.rs
+++ b/src/commands/snapshot.rs
@@ -428,7 +428,7 @@ async fn cmd_snapshot_serve(args: SnapshotServeArgs) -> Result<()> {
     let running_clones: Vec<_> = all_vms
         .into_iter()
         .filter(|vm| vm.config.serve_pid == Some(my_pid))
-        .filter(|vm| vm.pid.map(|p| crate::utils::is_process_alive(p)).unwrap_or(false))
+        .filter(|vm| vm.pid.map(crate::utils::is_process_alive).unwrap_or(false))
         .collect();
 
     if running_clones.is_empty() {
diff --git a/src/main.rs b/src/main.rs
index 316280e3..71f055f2 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -54,7 +54,8 @@ async fn main() -> Result<()> {
             .init();
     } else {
         // Parent process: only use colors when outputting to a TTY (not when piped to file)
-        let use_color = atty::is(atty::Stream::Stderr);
+        use std::io::IsTerminal;
+        let use_color = std::io::stderr().is_terminal();
         tracing_subscriber::fmt()
             .with_env_filter(
                 EnvFilter::from_default_env().add_directive(tracing::Level::INFO.into()),
@@ -72,6 +73,7 @@ async fn main() -> Result<()> {
         Commands::Snapshot(args) => commands::cmd_snapshot(args).await,
         Commands::Snapshots => commands::cmd_snapshots().await,
         Commands::Exec(args) => commands::cmd_exec(args).await,
+        Commands::Setup => commands::cmd_setup().await,
     };
 
     // Handle errors
diff --git a/src/network/bridged.rs b/src/network/bridged.rs
index fa726f8e..4d3a9b01 100644
--- a/src/network/bridged.rs
+++ b/src/network/bridged.rs
@@ -134,7 +134,13 @@ impl NetworkManager for BridgedNetwork {
             "clone using In-Namespace NAT"
         );
 
-        (host_ip, veth_subnet, guest_ip, Some(orig_gateway), Some(veth_inner_ip))
+        (
+            host_ip,
+            veth_subnet,
+            guest_ip,
+            Some(orig_gateway),
+            Some(veth_inner_ip),
+        )
     } else {
         // Baseline VM case: use 172.30.x.y/30 for everything
         let third_octet = (subnet_id / 64) as u8;
@@ -281,7 +287,18 @@ impl NetworkManager for BridgedNetwork {
             guest_ip.clone()
         };
 
-        match portmap::setup_port_mappings(&target_ip, &self.port_mappings).await {
+        // Scope DNAT rules to the veth's host IP - this allows parallel VMs to use
+        // the same port since each VM has a unique veth IP
+        let scoped_mappings: Vec<_> = self
+            .port_mappings
+            .iter()
+            .map(|m| super::PortMapping {
+                host_ip: Some(host_ip.clone()),
+                ..m.clone()
+            })
+            .collect();
+
+        match portmap::setup_port_mappings(&target_ip, &scoped_mappings).await {
             Ok(rules) => self.port_mapping_rules = rules,
             Err(e) => {
                 let _ = self.cleanup().await;
diff --git a/src/network/namespace.rs b/src/network/namespace.rs
index ce6b138c..89f80bfa 100644
--- a/src/network/namespace.rs
+++ b/src/network/namespace.rs
@@ -111,17 +111,12 @@ pub async fn list_namespaces() -> Result<Vec<String>> {
     Ok(namespaces)
 }
 
-#[cfg(test)]
+#[cfg(all(test, feature = "privileged-tests"))]
 mod tests {
     use super::*;
 
     #[tokio::test]
     async fn test_namespace_lifecycle() {
-        if unsafe { libc::geteuid() } != 0 {
-            eprintln!("Skipping test_namespace_lifecycle - requires root");
-            return;
-        }
-
         let ns_name = "fcvm-test-ns";
 
         // Clean up if exists from previous test
@@ -143,10 +138,8 @@ mod tests {
     }
 
     // Requires CAP_SYS_ADMIN to remount /sys in new namespace (doesn't work in containers)
-    #[cfg(feature = "privileged-tests")]
     #[tokio::test]
     async fn test_exec_in_namespace() {
-
         let ns_name = "fcvm-test-exec";
 
         // Clean up if exists
diff --git a/src/network/portmap.rs b/src/network/portmap.rs
index 07c260c9..9c7ac80b 100644
--- a/src/network/portmap.rs
+++ b/src/network/portmap.rs
@@ -352,30 +352,28 @@ mod tests {
         }
     }
 
+    #[cfg(feature = "privileged-tests")]
     #[tokio::test]
    async fn test_port_mapping_lifecycle() {
-        // Test that we can create and cleanup rules
-        // Note: This test requires root and modifies iptables, so it's
-        // more of an integration test. Skip in CI.
-        let guest_ip = "172.30.0.2";
+        // Test that we can create and cleanup rules (requires root for iptables)
+        // Use a scoped host_ip so rules don't conflict with parallel tests
+        let veth_ip = "172.30.99.1"; // Fake veth IP for testing
+        let guest_ip = "172.30.99.2";
         let mappings = vec![PortMapping {
-            host_ip: None,
-            host_port: 18080,
+            host_ip: Some(veth_ip.to_string()), // Scope DNAT to this IP
+            host_port: 8080,
             guest_port: 80,
             proto: Protocol::Tcp,
         }];
 
         // Setup
-        let rules = setup_port_mappings(guest_ip, &mappings).await;
+        let rules = setup_port_mappings(guest_ip, &mappings)
+            .await
+            .expect("setup port mappings (requires root)");
 
-        if let Ok(rules) = rules {
-            assert_eq!(rules.len(), 4); // DNAT (PREROUTING) + DNAT (OUTPUT) + MASQUERADE + FORWARD
+        assert_eq!(rules.len(), 4); // DNAT (PREROUTING) + DNAT (OUTPUT) + MASQUERADE + FORWARD
 
-            // Cleanup
-            cleanup_port_mappings(&rules).await.unwrap();
-        } else {
-            // If we can't setup (not root), that's OK for this test
-            println!("Skipping port mapping test (requires root)");
-        }
+        // Cleanup
+        cleanup_port_mappings(&rules).await.unwrap();
     }
 }
diff --git a/src/setup/kernel.rs b/src/setup/kernel.rs
index 0951e7fb..79017a30 100644
--- a/src/setup/kernel.rs
+++ b/src/setup/kernel.rs
@@ -24,19 +24,22 @@ pub fn get_kernel_url_hash() -> Result<String> {
Ok(compute_sha256_short(kernel_config.url.as_bytes())) } -/// Ensure kernel exists, downloading from Kata release if needed -pub async fn ensure_kernel() -> Result { +/// Ensure kernel exists, downloading from Kata release if needed. +/// If `allow_create` is false, bail if kernel doesn't exist. +pub async fn ensure_kernel(allow_create: bool) -> Result { let (plan, _, _) = load_plan()?; let kernel_config = plan.kernel.current_arch()?; - download_kernel(kernel_config).await + download_kernel(kernel_config, allow_create).await } /// Download kernel from Kata release tarball. /// /// Uses file locking to prevent race conditions when multiple VMs start /// simultaneously and all try to download the same kernel. -async fn download_kernel(config: &KernelArchConfig) -> Result { +/// +/// If `allow_create` is false, bail if kernel doesn't exist. +async fn download_kernel(config: &KernelArchConfig, allow_create: bool) -> Result { let kernel_dir = paths::kernel_dir(); // Cache by URL hash - changing URL triggers re-download @@ -49,6 +52,11 @@ async fn download_kernel(config: &KernelArchConfig) -> Result { return Ok(kernel_path); } + // Bail if creation not allowed + if !allow_create { + bail!("Kernel not found. 
Run 'fcvm setup' first, or use --setup flag."); + } + // Create directory (needed for lock file) tokio::fs::create_dir_all(&kernel_dir) .await @@ -123,10 +131,7 @@ async fn download_kernel(config: &KernelArchConfig) -> Result { let extract_path = format!("./{}", config.path); let output = Command::new("tar") - .args([ - "--use-compress-program=zstd", - "-xf", - ]) + .args(["--use-compress-program=zstd", "-xf"]) .arg(&tarball_path) .arg("-C") .arg(&cache_dir) diff --git a/src/setup/rootfs.rs b/src/setup/rootfs.rs index 606818e5..ddfbd641 100644 --- a/src/setup/rootfs.rs +++ b/src/setup/rootfs.rs @@ -282,7 +282,10 @@ pub fn generate_setup_script(plan: &Plan) -> String { s.push_str("# Fix /etc/fstab\n"); for pattern in &plan.fstab.remove_patterns { // Use sed to remove lines containing the pattern - s.push_str(&format!("sed -i '/{}/d' /etc/fstab\n", pattern.replace('/', "\\/"))); + s.push_str(&format!( + "sed -i '/{}/d' /etc/fstab\n", + pattern.replace('/', "\\/") + )); } s.push('\n'); } @@ -338,7 +341,6 @@ pub fn generate_setup_script(plan: &Plan) -> String { s } - // ============================================================================ // Plan Loading and SHA256 // ============================================================================ @@ -359,7 +361,7 @@ fn find_plan_file() -> Result { for path in &candidates { if path.exists() { - return Ok(path.canonicalize().context("canonicalizing plan file path")?); + return path.canonicalize().context("canonicalizing plan file path"); } } @@ -371,7 +373,10 @@ fn find_plan_file() -> Result { bail!( "rootfs-plan.toml not found. Checked: {:?}", - candidates.iter().map(|p| p.display().to_string()).collect::>() + candidates + .iter() + .map(|p| p.display().to_string()) + .collect::>() ) } @@ -425,7 +430,9 @@ pub fn compute_sha256(data: &[u8]) -> String { /// /// NOTE: fc-agent is NOT included in Layer 2. It will be injected per-VM at boot time. /// Layer 2 only contains packages (podman, crun, etc.). 
-pub async fn ensure_rootfs() -> Result { +/// +/// If `allow_create` is false, bail if rootfs doesn't exist. +pub async fn ensure_rootfs(allow_create: bool) -> Result { let (plan, _plan_sha_full, _plan_sha_short) = load_plan()?; // Generate all scripts and compute hash of the complete init script @@ -462,6 +469,11 @@ pub async fn ensure_rootfs() -> Result { return Ok(rootfs_path); } + // Bail if creation not allowed + if !allow_create { + bail!("Rootfs not found. Run 'fcvm setup' first, or use --setup flag."); + } + // Create directory for lock file tokio::fs::create_dir_all(&rootfs_dir) .await @@ -506,7 +518,8 @@ pub async fn ensure_rootfs() -> Result { let temp_rootfs_path = rootfs_path.with_extension("raw.tmp"); let _ = tokio::fs::remove_file(&temp_rootfs_path).await; - let result = create_layer2_rootless(&plan, script_sha_short, &setup_script, &temp_rootfs_path).await; + let result = + create_layer2_rootless(&plan, script_sha_short, &setup_script, &temp_rootfs_path).await; if result.is_ok() { tokio::fs::rename(&temp_rootfs_path, &rootfs_path) @@ -748,7 +761,9 @@ exec switch_root /newroot /sbin/init /// /// Uses file locking to prevent race conditions when multiple VMs start /// simultaneously and all try to create the initrd. -pub async fn ensure_fc_agent_initrd() -> Result { +/// +/// If `allow_create` is false, bail if initrd doesn't exist. +pub async fn ensure_fc_agent_initrd(allow_create: bool) -> Result { // Find fc-agent binary let fc_agent_path = find_fc_agent_binary()?; let fc_agent_bytes = std::fs::read(&fc_agent_path) @@ -775,6 +790,11 @@ pub async fn ensure_fc_agent_initrd() -> Result { return Ok(initrd_path); } + // Bail if creation not allowed + if !allow_create { + bail!("fc-agent initrd not found. 
Run 'fcvm setup' first, or use --setup flag."); + } + // Create initrd directory (needed for lock file) tokio::fs::create_dir_all(&initrd_dir) .await @@ -858,7 +878,11 @@ pub async fn ensure_fc_agent_initrd() -> Result { // Write service files (normal and strace version) tokio::fs::write(temp_dir.join("fc-agent.service"), FC_AGENT_SERVICE).await?; - tokio::fs::write(temp_dir.join("fc-agent.service.strace"), FC_AGENT_SERVICE_STRACE).await?; + tokio::fs::write( + temp_dir.join("fc-agent.service.strace"), + FC_AGENT_SERVICE_STRACE, + ) + .await?; // Create cpio archive (initrd format) // Use bash with pipefail so cpio errors aren't masked by gzip success (v3) @@ -910,7 +934,12 @@ pub async fn ensure_fc_agent_initrd() -> Result { /// Find busybox binary (prefer static version) fn find_busybox() -> Result { // Check for busybox-static first - for path in &["/bin/busybox-static", "/usr/bin/busybox-static", "/bin/busybox", "/usr/bin/busybox"] { + for path in &[ + "/bin/busybox-static", + "/usr/bin/busybox-static", + "/bin/busybox", + "/usr/bin/busybox", + ] { let p = PathBuf::from(path); if p.exists() { return Ok(p); @@ -960,8 +989,10 @@ async fn create_layer2_rootless( let output = Command::new("qemu-img") .args([ "convert", - "-f", "qcow2", - "-O", "raw", + "-f", + "qcow2", + "-O", + "raw", path_to_str(&cloud_image)?, path_to_str(&full_disk_path)?, ]) @@ -1010,11 +1041,14 @@ async fn create_layer2_rootless( ptype: String, } - let sfdisk_output: SfdiskOutput = serde_json::from_slice(&output.stdout) - .context("parsing sfdisk JSON output")?; + let sfdisk_output: SfdiskOutput = + serde_json::from_slice(&output.stdout).context("parsing sfdisk JSON output")?; // Find the Linux filesystem partition (type ends with 0FC63DAF-8483-4772-8E79-3D69D8477DE4 or similar) - let root_part = sfdisk_output.partitiontable.partitions.iter() + let root_part = sfdisk_output + .partitiontable + .partitions + .iter() .find(|p| p.ptype.contains("0FC63DAF") || p.node.ends_with("1")) 
.ok_or_else(|| anyhow::anyhow!("Could not find root partition in GPT disk"))?; @@ -1055,7 +1089,10 @@ async fn create_layer2_rootless( .context("expanding partition")?; if !output.status.success() { - bail!("truncate failed: {}", String::from_utf8_lossy(&output.stderr)); + bail!( + "truncate failed: {}", + String::from_utf8_lossy(&output.stderr) + ); } // Resize the ext4 filesystem to fill the partition @@ -1074,7 +1111,10 @@ async fn create_layer2_rootless( .context("running resize2fs")?; if !output.status.success() { - bail!("resize2fs failed: {}", String::from_utf8_lossy(&output.stderr)); + bail!( + "resize2fs failed: {}", + String::from_utf8_lossy(&output.stderr) + ); } // Step 4b: Fix /etc/fstab to remove BOOT and UEFI entries @@ -1141,9 +1181,7 @@ async fn fix_fstab_in_image(image_path: &Path) -> Result<()> { // Filter out BOOT and UEFI entries let new_fstab: String = fstab_content .lines() - .filter(|line| { - !line.contains("LABEL=BOOT") && !line.contains("LABEL=UEFI") - }) + .filter(|line| !line.contains("LABEL=BOOT") && !line.contains("LABEL=UEFI")) .collect::>() .join("\n"); @@ -1158,12 +1196,7 @@ async fn fix_fstab_in_image(image_path: &Path) -> Result<()> { // Write the new fstab back using debugfs -w // debugfs command: rm /etc/fstab; write /tmp/fstab.new /etc/fstab let output = Command::new("debugfs") - .args([ - "-w", - "-R", - &format!("rm /etc/fstab"), - path_to_str(image_path)?, - ]) + .args(["-w", "-R", "rm /etc/fstab", path_to_str(image_path)?]) .output() .await .context("removing old fstab with debugfs")?; @@ -1253,7 +1286,10 @@ async fn create_layer2_setup_initrd( .context("making init executable")?; if !output.status.success() { - bail!("Failed to chmod init: {}", String::from_utf8_lossy(&output.stderr)); + bail!( + "Failed to chmod init: {}", + String::from_utf8_lossy(&output.stderr) + ); } // Copy busybox static binary (prefer busybox-static if available) @@ -1271,7 +1307,10 @@ async fn create_layer2_setup_initrd( .context("making busybox 
executable")?; if !output.status.success() { - bail!("Failed to chmod busybox: {}", String::from_utf8_lossy(&output.stderr)); + bail!( + "Failed to chmod busybox: {}", + String::from_utf8_lossy(&output.stderr) + ); } // Copy packages into initrd @@ -1339,7 +1378,12 @@ async fn download_packages(plan: &Plan, script_sha_short: &str) -> Result Result Result Result { let url_hash = &compute_sha256(arch_config.url.as_bytes())[..12]; let image_path = cache_dir.join(format!( "ubuntu-{}-{}-{}.img", - plan.base.version, - arch_name, - url_hash + plan.base.version, arch_name, url_hash )); // If cached, use it @@ -1531,20 +1579,27 @@ async fn boot_vm_for_setup(disk_path: &Path, initrd_path: &Path) -> Result<()> { let log_path = temp_dir.join("firecracker.log"); // Find kernel - downloaded from Kata release if needed - let kernel_path = crate::setup::kernel::ensure_kernel().await?; + // We pass true since we're in the rootfs creation path (allow_create=true) + let kernel_path = crate::setup::kernel::ensure_kernel(true).await?; // Create serial console output file let serial_path = temp_dir.join("serial.log"); - let serial_file = std::fs::File::create(&serial_path) - .context("creating serial console file")?; + let serial_file = + std::fs::File::create(&serial_path).context("creating serial console file")?; // Start Firecracker with serial console output - info!("starting Firecracker for Layer 2 setup (serial output: {})", serial_path.display()); + info!( + "starting Firecracker for Layer 2 setup (serial output: {})", + serial_path.display() + ); let mut fc_process = Command::new("firecracker") .args([ - "--api-sock", path_to_str(&api_socket)?, - "--log-path", path_to_str(&log_path)?, - "--level", "Info", + "--api-sock", + path_to_str(&api_socket)?, + "--log-path", + path_to_str(&log_path)?, + "--level", + "Info", ]) .stdout(serial_file.try_clone().context("cloning serial file")?) 
.stderr(std::process::Stdio::null()) @@ -1611,7 +1666,9 @@ async fn boot_vm_for_setup(disk_path: &Path, initrd_path: &Path) -> Result<()> { // No network needed! Packages are installed from local ISO. // Start the VM - client.put_action(crate::firecracker::api::InstanceAction::InstanceStart).await?; + client + .put_action(crate::firecracker::api::InstanceAction::InstanceStart) + .await?; info!("Layer 2 setup VM started, waiting for completion (this takes several minutes)"); // Wait for VM to shut down (setup script runs shutdown -h now when done) @@ -1624,7 +1681,10 @@ async fn boot_vm_for_setup(disk_path: &Path, initrd_path: &Path) -> Result<()> { match fc_process.try_wait() { Ok(Some(status)) => { let elapsed = start.elapsed(); - info!("Firecracker exited with status: {:?} after {:?}", status, elapsed); + info!( + "Firecracker exited with status: {:?} after {:?}", + status, elapsed + ); return Ok(elapsed); } Ok(None) => { @@ -1658,7 +1718,9 @@ async fn boot_vm_for_setup(disk_path: &Path, initrd_path: &Path) -> Result<()> { match result { Ok(Ok(elapsed)) => { // Check for completion marker in serial output - let serial_content = tokio::fs::read_to_string(&serial_path).await.unwrap_or_default(); + let serial_content = tokio::fs::read_to_string(&serial_path) + .await + .unwrap_or_default(); if !serial_content.contains("FCVM_SETUP_COMPLETE") { warn!("Setup failed! 
Serial console output:\n{}", serial_content); if let Ok(log_content) = tokio::fs::read_to_string(&log_path).await { @@ -1668,7 +1730,10 @@ async fn boot_vm_for_setup(disk_path: &Path, initrd_path: &Path) -> Result<()> { bail!("Layer 2 setup failed (no FCVM_SETUP_COMPLETE marker found)"); } let _ = tokio::fs::remove_dir_all(&temp_dir).await; - info!(elapsed_secs = elapsed.as_secs(), "Layer 2 setup VM completed successfully"); + info!( + elapsed_secs = elapsed.as_secs(), + "Layer 2 setup VM completed successfully" + ); Ok(()) } Ok(Err(e)) => { diff --git a/tests/common/mod.rs b/tests/common/mod.rs index aa0cb4a6..d6697dde 100644 --- a/tests/common/mod.rs +++ b/tests/common/mod.rs @@ -5,6 +5,9 @@ use std::path::PathBuf; /// Default test image - use AWS ECR to avoid Docker Hub rate limits pub const TEST_IMAGE: &str = "public.ecr.aws/nginx/nginx:alpine"; + +/// Polling interval for status checks (100ms) +pub const POLL_INTERVAL: Duration = Duration::from_millis(100); use std::process::{Command, Stdio}; use std::sync::atomic::{AtomicUsize, Ordering}; use std::time::Duration; @@ -13,7 +16,6 @@ use tokio::time::sleep; /// Global counter for unique test IDs static TEST_COUNTER: AtomicUsize = AtomicUsize::new(0); - /// Check if we're running inside a container. /// /// Containers create marker files that we can use to detect containerized environments. @@ -464,7 +466,7 @@ pub async fn start_memory_server( // Wait for serve process to save its state file // Serve processes don't have health status, so we just check state exists - poll_serve_state_by_pid(serve_pid, 10).await?; + poll_serve_state_by_pid(serve_pid, 30).await?; Ok((child, serve_pid)) } diff --git a/tests/lint.rs b/tests/lint.rs new file mode 100644 index 00000000..223092df --- /dev/null +++ b/tests/lint.rs @@ -0,0 +1,52 @@ +//! Lint tests - run fmt, clippy, audit, deny in parallel via cargo test. 
+
+#![cfg(feature = "integration-fast")]
+
+use std::process::Command;
+
+fn run_cargo(args: &[&str]) -> std::process::Output {
+    Command::new("cargo")
+        .args(args)
+        .output()
+        .unwrap_or_else(|e| panic!("failed to run cargo {}: {}", args.join(" "), e))
+}
+
+fn assert_success(name: &str, output: std::process::Output) {
+    assert!(
+        output.status.success(),
+        "{} failed:\n{}{}",
+        name,
+        String::from_utf8_lossy(&output.stdout),
+        String::from_utf8_lossy(&output.stderr)
+    );
+}
+
+#[test]
+fn fmt() {
+    assert_success("cargo fmt", run_cargo(&["fmt", "--", "--check"]));
+}
+
+#[test]
+fn clippy() {
+    assert_success(
+        "cargo clippy",
+        run_cargo(&[
+            "clippy",
+            "--all-targets",
+            "--all-features",
+            "--",
+            "-D",
+            "warnings",
+        ]),
+    );
+}
+
+#[test]
+fn audit() {
+    assert_success("cargo audit", run_cargo(&["audit"]));
+}
+
+#[test]
+fn deny() {
+    assert_success("cargo deny", run_cargo(&["deny", "check"]));
+}
diff --git a/tests/test_clone_connection.rs b/tests/test_clone_connection.rs
index 9ec8fe6f..c2de638b 100644
--- a/tests/test_clone_connection.rs
+++ b/tests/test_clone_connection.rs
@@ -6,6 +6,8 @@
 //! 3. We snapshot and clone the VM
 //! 4. Observe: does the clone's connection reset? Can it reconnect?
+#![cfg(feature = "integration-slow")] + mod common; use anyhow::{Context, Result}; @@ -104,6 +106,33 @@ impl BroadcastServer { } } +/// Timeout for waiting for connections +const CONNECTION_TIMEOUT_SECS: u64 = 30; + +/// Poll until connection count exceeds threshold, with timeout +async fn wait_for_connections(counter: &Arc, min_count: u64) -> Result { + let start = Instant::now(); + let timeout = Duration::from_secs(CONNECTION_TIMEOUT_SECS); + + loop { + let count = counter.load(Ordering::Relaxed); + if count >= min_count { + return Ok(count); + } + + if start.elapsed() > timeout { + anyhow::bail!( + "timeout ({}s) waiting for connections: got {}, need {}", + CONNECTION_TIMEOUT_SECS, + count, + min_count + ); + } + + tokio::time::sleep(common::POLL_INTERVAL).await; + } +} + /// Test that cloning a VM resets TCP connections properly #[tokio::test] async fn test_clone_connection_reset_rootless() -> Result<()> { @@ -364,6 +393,7 @@ async fn test_clone_reconnect_latency_rootless() -> Result<()> { let server_port = server.port(); let stop_handle = server.stop_handle(); let server_seq = Arc::clone(&server.seq); + let conn_counter = Arc::clone(&server.conn_counter); let _server_thread = server.run_in_background(); println!(" Listening on port {}", server_port); @@ -437,7 +467,7 @@ async fn test_clone_reconnect_latency_rootless() -> Result<()> { }; // Wait for client to connect - tokio::time::sleep(Duration::from_secs(2)).await; + wait_for_connections(&conn_counter, 1).await?; let seq_before_snapshot = server_seq.load(Ordering::Relaxed); println!(" Client connected (server seq: {})", seq_before_snapshot); @@ -568,6 +598,7 @@ async fn test_clone_connection_timing_rootless() -> Result<()> { let server_port = server.port(); let stop_handle = server.stop_handle(); let server_seq = Arc::clone(&server.seq); + let conn_counter = Arc::clone(&server.conn_counter); let _server_thread = server.run_in_background(); println!(" Listening on port {}", server_port); @@ -637,7 +668,7 @@ 
async fn test_clone_connection_timing_rootless() -> Result<()> { } // Wait for connection - tokio::time::sleep(Duration::from_secs(2)).await; + wait_for_connections(&conn_counter, 1).await?; let seq_at_connect = server_seq.load(Ordering::Relaxed); println!( " Persistent client connected! (server seq: {})", @@ -743,8 +774,8 @@ async fn test_clone_connection_timing_rootless() -> Result<()> { println!(" Clone healthy (PID: {})", clone_pid); // The clone's nc process woke up in a new network namespace - // It has a stale socket fd - what happened? - tokio::time::sleep(Duration::from_secs(1)).await; + // It has a stale socket fd - give it a moment to react + tokio::time::sleep(Duration::from_millis(100)).await; println!("\nStep 8: Checking clone's inherited nc process..."); let output = tokio::process::Command::new(&fcvm_path) @@ -997,7 +1028,7 @@ done .await?; // Wait for initial connection - tokio::time::sleep(Duration::from_secs(2)).await; + wait_for_connections(&conn_counter, 1).await?; let initial_conns = conn_counter.load(Ordering::Relaxed); println!( " Client connected! (server has {} connections)", diff --git a/tests/test_egress.rs b/tests/test_egress.rs index bef92f95..2720a388 100644 --- a/tests/test_egress.rs +++ b/tests/test_egress.rs @@ -9,13 +9,15 @@ //! //! Both bridged and rootless networking modes are tested. 
+#![cfg(feature = "integration-slow")]
+
 mod common;

 use anyhow::{Context, Result};
 use std::time::Duration;

-/// External URL to test egress connectivity - Docker Hub auth endpoint (returns 200)
-const EGRESS_TEST_URL: &str = "https://auth.docker.io/token?service=registry.docker.io";
+/// External URL to test egress connectivity - AWS checkip endpoint (fast, returns 200)
+const EGRESS_TEST_URL: &str = "https://checkip.amazonaws.com";

 /// Test egress connectivity for fresh VM with bridged networking
 #[cfg(feature = "privileged-tests")]
@@ -188,7 +190,7 @@ async fn egress_clone_test_impl(network: &str) -> Result<()> {
         .context("spawning memory server")?;

     // Wait for serve process to save its state file
-    common::poll_serve_state_by_pid(serve_pid, 10).await?;
+    common::poll_serve_state_by_pid(serve_pid, 30).await?;
     println!(" ✓ Memory server ready (PID: {})", serve_pid);

     // Step 4: Spawn clone
@@ -260,7 +262,7 @@ async fn test_egress(fcvm_path: &std::path::Path, pid: u32) -> Result<()> {
         "curl",
         "-s",
         "--max-time",
-        "15",
+        "5",
         "-o",
         "/dev/null",
         "-w",
@@ -302,7 +304,7 @@
         "-q",
         "-O",
         "/dev/null",
-        "--timeout=15",
+        "--timeout=5",
         EGRESS_TEST_URL,
     ])
     .output()
diff --git a/tests/test_egress_stress.rs b/tests/test_egress_stress.rs
index 4c5904a3..0fd86733 100644
--- a/tests/test_egress_stress.rs
+++ b/tests/test_egress_stress.rs
@@ -7,6 +7,8 @@
 //! 4. Runs parallel curl commands from each clone to the local HTTP server
 //! 5.
Verifies all requests succeed +#![cfg(feature = "integration-slow")] + mod common; use anyhow::{Context, Result}; @@ -185,8 +187,8 @@ async fn egress_stress_impl( .await .context("spawning memory server")?; - // Wait for server to be ready - tokio::time::sleep(Duration::from_secs(2)).await; + // Wait for serve process to save its state file + common::poll_serve_state_by_pid(serve_pid, 30).await?; println!(" ✓ Memory server ready (PID: {})", serve_pid); // Step 4: Spawn clones in parallel diff --git a/tests/test_exec.rs b/tests/test_exec.rs index 599d45b4..db01bd55 100644 --- a/tests/test_exec.rs +++ b/tests/test_exec.rs @@ -6,6 +6,8 @@ //! Uses common::spawn_fcvm() to prevent pipe buffer deadlock. //! See CLAUDE.md "Pipe Buffer Deadlock in Tests" for details. +#![cfg(feature = "integration-fast")] + mod common; use anyhow::{Context, Result}; diff --git a/tests/test_fuse_in_vm.rs b/tests/test_fuse_in_vm.rs deleted file mode 100644 index fc16fdd5..00000000 --- a/tests/test_fuse_in_vm.rs +++ /dev/null @@ -1,257 +0,0 @@ -//! FUSE-in-VM integration test -//! -//! Tests fuse-pipe by running pjdfstest inside a Firecracker VM: -//! 1. Create temp directory with test data -//! 2. Start VM with --map to mount the directory via fuse-pipe -//! 3. Run pjdfstest container inside VM against the FUSE mount -//! 4. Verify all tests pass -//! -//! This tests the full fuse-pipe stack: -//! - Host: VolumeServer serving directory via vsock -//! - Guest: fc-agent mounting via fuse-pipe FuseClient -//! - Guest: pjdfstest container running against the mount - -mod common; - -use anyhow::{Context, Result}; -use std::path::PathBuf; -use std::process::Stdio; -use std::time::{Duration, Instant}; - -/// Quick smoke test - run just posix_fallocate category (~100 tests) -/// Requires sudo for reliable podman storage access. 
-#[cfg(feature = "privileged-tests")] -#[tokio::test] -async fn test_fuse_in_vm_smoke() -> Result<()> { - fuse_in_vm_test_impl("posix_fallocate", 8).await -} - -/// Full pjdfstest suite in VM (8789 tests) -/// Run with: cargo test --test test_fuse_in_vm test_fuse_in_vm_full -- --ignored -/// Requires sudo for reliable podman storage access. -#[cfg(feature = "privileged-tests")] -#[tokio::test] -#[ignore] -async fn test_fuse_in_vm_full() -> Result<()> { - fuse_in_vm_test_impl("all", 64).await -} - -async fn fuse_in_vm_test_impl(category: &str, jobs: usize) -> Result<()> { - // Full test suite needs privileged mode for mknod tests - let privileged = category == "all"; - fuse_in_vm_test_impl_inner(category, jobs, privileged).await -} - -async fn fuse_in_vm_test_impl_inner(category: &str, jobs: usize, privileged: bool) -> Result<()> { - let test_id = format!("fuse-vm-{}", std::process::id()); - let test_start = Instant::now(); - - println!("\n╔═══════════════════════════════════════════════════════════════╗"); - println!( - "║ FUSE-in-VM Test: {} ({} jobs) ║", - category, jobs - ); - if privileged { - println!("║ [PRIVILEGED MODE] ║"); - } - println!("╚═══════════════════════════════════════════════════════════════╝\n"); - - // Paths - let data_dir = PathBuf::from(format!("/tmp/fuse-{}-data", test_id)); - let vm_name = format!("fuse-vm-{}", std::process::id()); - - // Cleanup from previous runs - let _ = tokio::fs::remove_dir_all(&data_dir).await; - - // Create data directory for the FUSE mount - tokio::fs::create_dir_all(&data_dir).await?; - - // Set permissions for pjdfstest (needs write access) - #[cfg(unix)] - { - use std::os::unix::fs::PermissionsExt; - tokio::fs::set_permissions(&data_dir, std::fs::Permissions::from_mode(0o777)).await?; - } - - // Find fcvm binary - let fcvm_path = common::find_fcvm_binary()?; - - // ========================================================================= - // Step 1: Build pjdfstest container if needed - // 
========================================================================= - println!("Step 1: Ensuring pjdfstest container exists..."); - let step1_start = Instant::now(); - - // Check if pjdfstest container exists (in root's storage) - let check_output = tokio::process::Command::new("podman") - .args(["image", "exists", "localhost/pjdfstest"]) - .output() - .await?; - - if !check_output.status.success() { - println!(" Building pjdfstest container (sudo podman build)..."); - let build_output = tokio::process::Command::new("podman") - .args([ - "build", - "-t", - "pjdfstest", - "-f", - "Containerfile.pjdfstest", - ".", - ]) - .output() - .await - .context("building pjdfstest container")?; - - if !build_output.status.success() { - anyhow::bail!( - "Failed to build pjdfstest container: {}", - String::from_utf8_lossy(&build_output.stderr) - ); - } - } - println!( - " ✓ pjdfstest container ready (took {:.1}s)", - step1_start.elapsed().as_secs_f64() - ); - - // ========================================================================= - // Step 2: Start VM with FUSE mount - // ========================================================================= - println!("\nStep 2: Starting VM with FUSE-mounted directory..."); - let step2_start = Instant::now(); - - // Map the data directory into the VM via fuse-pipe - // The guest will mount it at /mnt/volumes/0 (default for first volume) - let map_arg = format!("{}:/testdir", data_dir.display()); - - // Build the pjdfstest command - // Select tests based on category - let prove_cmd = if category == "all" { - format!("prove -v -j {} -r /opt/pjdfstest/tests/", jobs) - } else { - format!("prove -v -j {} -r /opt/pjdfstest/tests/{}/", jobs, category) - }; - - // Preserve SUDO_USER from the outer sudo (if any) so that fcvm can - // find containers in the correct user's storage - let mut cmd = tokio::process::Command::new(fcvm_path); - let mut args = vec![ - "podman", - "run", - "--name", - &vm_name, - "--network", - "rootless", - 
"--map", - &map_arg, - "--cmd", - &prove_cmd, - ]; - // Add --privileged for full test suite (needed for mknod tests) - if privileged { - args.push("--privileged"); - } - args.push("localhost/pjdfstest"); - cmd.args(&args) - .stdout(Stdio::piped()) - .stderr(Stdio::piped()); - - // If SUDO_USER is set (we're running under sudo), preserve it - if let Ok(sudo_user) = std::env::var("SUDO_USER") { - cmd.env("SUDO_USER", sudo_user); - } - - let mut vm_child = cmd.spawn().context("spawning VM")?; - - let vm_pid = vm_child - .id() - .ok_or_else(|| anyhow::anyhow!("failed to get VM PID"))?; - - // Spawn log consumers - common::spawn_log_consumer(vm_child.stdout.take(), "vm"); - common::spawn_log_consumer_stderr(vm_child.stderr.take(), "vm"); - - println!( - " ✓ VM started (PID: {}, took {:.1}s)", - vm_pid, - step2_start.elapsed().as_secs_f64() - ); - - // ========================================================================= - // Step 3: Wait for VM to complete - // ========================================================================= - println!("\nStep 3: Waiting for pjdfstest to complete..."); - let step3_start = Instant::now(); - - // Wait for VM process with timeout - let timeout = if category == "all" { - Duration::from_secs(3600) // 1 hour for full test - } else { - Duration::from_secs(600) // 10 minutes for single category - }; - - let result = tokio::time::timeout(timeout, vm_child.wait()).await; - - let exit_status = match result { - Ok(Ok(status)) => status, - Ok(Err(e)) => anyhow::bail!("Error waiting for VM: {}", e), - Err(_) => { - common::kill_process(vm_pid).await; - anyhow::bail!("VM timeout after {} seconds", timeout.as_secs()); - } - }; - - let test_time = step3_start.elapsed(); - println!( - " VM exited with status: {} (took {:.1}s)", - exit_status, - test_time.as_secs_f64() - ); - - // ========================================================================= - // Cleanup - // 
========================================================================= - println!("\nCleaning up..."); - let _ = tokio::fs::remove_dir_all(&data_dir).await; - - let total_time = test_start.elapsed(); - - // ========================================================================= - // Results - // ========================================================================= - println!("\n╔═══════════════════════════════════════════════════════════════╗"); - println!("║ RESULTS ║"); - println!("╠═══════════════════════════════════════════════════════════════╣"); - println!( - "║ Category: {:>10} ║", - category - ); - println!( - "║ Jobs: {:>10} ║", - jobs - ); - println!( - "║ Test time: {:>10.1}s ║", - test_time.as_secs_f64() - ); - println!( - "║ Total time: {:>10.1}s ║", - total_time.as_secs_f64() - ); - println!( - "║ Exit status: {:>10} ║", - exit_status.code().unwrap_or(-1) - ); - println!("╚═══════════════════════════════════════════════════════════════╝"); - - if !exit_status.success() { - anyhow::bail!( - "pjdfstest failed with exit code: {}", - exit_status.code().unwrap_or(-1) - ); - } - - println!("\n✅ FUSE-IN-VM TEST PASSED!"); - Ok(()) -} diff --git a/tests/test_fuse_in_vm_matrix.rs b/tests/test_fuse_in_vm_matrix.rs new file mode 100644 index 00000000..8d3d70ee --- /dev/null +++ b/tests/test_fuse_in_vm_matrix.rs @@ -0,0 +1,171 @@ +//! In-VM pjdfstest matrix - runs pjdfstest categories inside VMs +//! +//! Each category is a separate test, allowing nextest to run all 17 in parallel. +//! Tests the full stack: host VolumeServer → vsock → guest FUSE mount. +//! +//! See also: fuse-pipe/tests/pjdfstest_matrix_root.rs (host-side matrix, tests fuse-pipe directly) +//! +//! 
Run with: cargo nextest run --test test_fuse_in_vm_matrix --features "privileged-tests,integration-slow"
+
+#![cfg(all(feature = "privileged-tests", feature = "integration-slow"))]
+
+mod common;
+
+use anyhow::{Context, Result};
+use std::process::Stdio;
+use std::time::Instant;
+
+/// Number of parallel jobs within prove (inside VM)
+const JOBS: usize = 8;
+
+/// Run a single pjdfstest category inside a VM
+async fn run_category_in_vm(category: &str) -> Result<()> {
+    let test_id = format!("pjdfs-vm-{}-{}", category, std::process::id());
+    let vm_name = format!("pjdfs-{}-{}", category, std::process::id());
+    let start = Instant::now();
+
+    // Find fcvm binary
+    let fcvm_path = common::find_fcvm_binary()?;
+
+    // Build prove command for this category
+    let prove_cmd = format!("prove -v -j {} -r /opt/pjdfstest/tests/{}/", JOBS, category);
+
+    // Check if pjdfstest container exists
+    let check = tokio::process::Command::new("podman")
+        .args(["image", "exists", "localhost/pjdfstest"])
+        .output()
+        .await?;
+
+    if !check.status.success() {
+        // Build pjdfstest container
+        let build = tokio::process::Command::new("podman")
+            .args([
+                "build",
+                "-t",
+                "pjdfstest",
+                "-f",
+                "Containerfile.pjdfstest",
+                ".",
+            ])
+            .output()
+            .await
+            .context("building pjdfstest container")?;
+
+        if !build.status.success() {
+            anyhow::bail!(
+                "Failed to build pjdfstest: {}",
+                String::from_utf8_lossy(&build.stderr)
+            );
+        }
+    }
+
+    // Create temp directory for FUSE mount
+    let data_dir = format!("/tmp/fuse-{}-data", test_id);
+    tokio::fs::create_dir_all(&data_dir).await?;
+
+    #[cfg(unix)]
+    {
+        use std::os::unix::fs::PermissionsExt;
+        tokio::fs::set_permissions(&data_dir, std::fs::Permissions::from_mode(0o777)).await?;
+    }
+
+    let map_arg = format!("{}:/testdir", data_dir);
+
+    // Start VM with pjdfstest container
+    let mut cmd = tokio::process::Command::new(&fcvm_path);
+    cmd.args([
+        "podman",
+        "run",
+        "--name",
+        &vm_name,
+        "--network",
+        "bridged",
+        "--map",
+        &map_arg,
+        "--cmd",
&prove_cmd, + "--privileged", // Needed for mknod tests + "localhost/pjdfstest", + ]) + .stdout(Stdio::piped()) + .stderr(Stdio::piped()); + + // Preserve SUDO_USER if set + if let Ok(sudo_user) = std::env::var("SUDO_USER") { + cmd.env("SUDO_USER", sudo_user); + } + + let mut child = cmd.spawn().context("spawning VM")?; + let vm_pid = child.id().ok_or_else(|| anyhow::anyhow!("no VM PID"))?; + + // Consume output + common::spawn_log_consumer(child.stdout.take(), &format!("vm-{}", category)); + common::spawn_log_consumer_stderr(child.stderr.take(), &format!("vm-{}", category)); + + // Wait for completion (10 min timeout per category) + let timeout = std::time::Duration::from_secs(600); + let result = tokio::time::timeout(timeout, child.wait()).await; + + // Cleanup + let _ = tokio::fs::remove_dir_all(&data_dir).await; + + let exit_status = match result { + Ok(Ok(status)) => status, + Ok(Err(e)) => anyhow::bail!("Error waiting for VM: {}", e), + Err(_) => { + common::kill_process(vm_pid).await; + anyhow::bail!("VM timeout after {} seconds", timeout.as_secs()); + } + }; + + let duration = start.elapsed(); + + if !exit_status.success() { + anyhow::bail!( + "pjdfstest category {} failed in VM: exit={} ({:.1}s)", + category, + exit_status.code().unwrap_or(-1), + duration.as_secs_f64() + ); + } + + println!( + "[FUSE-VM] \u{2713} {} ({:.1}s)", + category, + duration.as_secs_f64() + ); + + Ok(()) +} + +macro_rules! 
pjdfstest_vm_category { + ($name:ident, $category:literal) => { + #[tokio::test] + async fn $name() { + run_category_in_vm($category).await.expect(concat!( + "pjdfstest category ", + $category, + " failed in VM" + )); + } + }; +} + +// All 17 pjdfstest categories - each runs in a separate VM +pjdfstest_vm_category!(test_pjdfstest_vm_chflags, "chflags"); +pjdfstest_vm_category!(test_pjdfstest_vm_chmod, "chmod"); +pjdfstest_vm_category!(test_pjdfstest_vm_chown, "chown"); +pjdfstest_vm_category!(test_pjdfstest_vm_ftruncate, "ftruncate"); +pjdfstest_vm_category!(test_pjdfstest_vm_granular, "granular"); +pjdfstest_vm_category!(test_pjdfstest_vm_link, "link"); +pjdfstest_vm_category!(test_pjdfstest_vm_mkdir, "mkdir"); +pjdfstest_vm_category!(test_pjdfstest_vm_mkfifo, "mkfifo"); +pjdfstest_vm_category!(test_pjdfstest_vm_mknod, "mknod"); +pjdfstest_vm_category!(test_pjdfstest_vm_open, "open"); +pjdfstest_vm_category!(test_pjdfstest_vm_posix_fallocate, "posix_fallocate"); +pjdfstest_vm_category!(test_pjdfstest_vm_rename, "rename"); +pjdfstest_vm_category!(test_pjdfstest_vm_rmdir, "rmdir"); +pjdfstest_vm_category!(test_pjdfstest_vm_symlink, "symlink"); +pjdfstest_vm_category!(test_pjdfstest_vm_truncate, "truncate"); +pjdfstest_vm_category!(test_pjdfstest_vm_unlink, "unlink"); +pjdfstest_vm_category!(test_pjdfstest_vm_utimensat, "utimensat"); diff --git a/tests/test_fuse_posix.rs b/tests/test_fuse_posix.rs deleted file mode 100644 index 2412e5f0..00000000 --- a/tests/test_fuse_posix.rs +++ /dev/null @@ -1,292 +0,0 @@ -//! POSIX FUSE compliance tests using pjdfstest -//! -//! These tests run the pjdfstest suite against fcvm's FUSE volume implementation. -//! Tests use snapshot/clone pattern: one baseline VM + multiple clones for parallel testing. -//! -//! Prerequisites: -//! - pjdfstest must be installed at /tmp/pjdfstest-check/pjdfstest -//! - Test directory at /tmp/pjdfstest-check/tests/ -//! -//! Install with: -//! ```bash -//! 
git clone https://github.com/pjd/pjdfstest /tmp/pjdfstest-check
-//! cd /tmp/pjdfstest-check && autoreconf -ifs && ./configure && make
-//! ```
-//!
-//! Run with:
-//! ```bash
-//! # Sequential (one VM, all categories)
-//! cargo test --test test_fuse_posix test_posix_all_sequential -- --ignored --nocapture
-//!
-//! # Parallel (one baseline + multiple clones, one category per test)
-//! cargo test --test test_fuse_posix -- --ignored --nocapture --test-threads=4
-//! ```
-
-mod common;
-
-use std::fs;
-use std::path::Path;
-use std::process::{Command, Stdio};
-use std::time::Instant;
-
-const PJDFSTEST_BIN: &str = "/tmp/pjdfstest-check/pjdfstest";
-const PJDFSTEST_TESTS: &str = "/tmp/pjdfstest-check/tests";
-const TIMEOUT_SECS: u64 = 60;
-
-#[derive(Debug)]
-struct TestResult {
-    category: String,
-    passed: bool,
-    tests: usize,
-    failures: usize,
-    duration_secs: f64,
-    output: String,
-}
-
-/// Discover all pjdfstest categories
-fn discover_categories() -> Vec<String> {
-    let tests_dir = Path::new(PJDFSTEST_TESTS);
-    let mut categories = Vec::new();
-
-    if let Ok(entries) = fs::read_dir(tests_dir) {
-        for entry in entries.filter_map(|e| e.ok()) {
-            if entry.file_type().map(|t| t.is_dir()).unwrap_or(false) {
-                if let Some(name) = entry.file_name().to_str() {
-                    categories.push(name.to_string());
-                }
-            }
-        }
-    }
-
-    categories.sort();
-    categories
-}
-
-/// Run a single pjdfstest category against a directory
-async fn run_category(category: &str, work_dir: &Path) -> TestResult {
-    let start = Instant::now();
-    let tests_dir = Path::new(PJDFSTEST_TESTS);
-    let category_tests = tests_dir.join(category);
-
-    // Create isolated work directory for this category
-    let category_work = work_dir.join(category);
-    let _ = fs::remove_dir_all(&category_work);
-    if let Err(e) = fs::create_dir_all(&category_work) {
-        return TestResult {
-            category: category.to_string(),
-            passed: false,
-            tests: 0,
-            failures: 0,
-            duration_secs: start.elapsed().as_secs_f64(),
-            output: format!("Failed
to create work directory: {}", e), - }; - } - - // Copy pjdfstest binary to work directory (POSIX tests require this) - let local_pjdfstest = category_work.join("pjdfstest"); - if let Err(e) = fs::copy(PJDFSTEST_BIN, &local_pjdfstest) { - return TestResult { - category: category.to_string(), - passed: false, - tests: 0, - failures: 0, - duration_secs: start.elapsed().as_secs_f64(), - output: format!("Failed to copy pjdfstest: {}", e), - }; - } - - // Run prove for this category - let output = Command::new("timeout") - .args([ - &TIMEOUT_SECS.to_string(), - "prove", - "-v", - "-r", - category_tests.to_str().unwrap(), - ]) - .current_dir(&category_work) - .stdout(Stdio::piped()) - .stderr(Stdio::piped()) - .output(); - - let duration = start.elapsed().as_secs_f64(); - - match output { - Ok(out) => { - let stdout = String::from_utf8_lossy(&out.stdout); - let stderr = String::from_utf8_lossy(&out.stderr); - let combined = format!("{}\n{}", stdout, stderr); - - let (tests, failures) = parse_prove_output(&combined); - let passed = out.status.success() && failures == 0; - - TestResult { - category: category.to_string(), - passed, - tests, - failures, - duration_secs: duration, - output: combined, - } - } - Err(e) => TestResult { - category: category.to_string(), - passed: false, - tests: 0, - failures: 0, - duration_secs: duration, - output: format!("Failed to run prove: {}", e), - }, - } -} - -/// Parse prove output to extract test counts and failures -fn parse_prove_output(output: &str) -> (usize, usize) { - let mut tests = 0usize; - let mut failures = 0usize; - - for line in output.lines() { - // Parse "Files=N, Tests=M" - if line.starts_with("Files=") { - if let Some(tests_part) = line.split("Tests=").nth(1) { - if let Some(num_str) = tests_part.split(',').next() { - tests = num_str.trim().parse().unwrap_or(0); - } - } - } - - // Parse "Failed X/Y subtests" - if line.contains("Failed") && line.contains("subtests") { - let parts: Vec<&str> = 
line.split_whitespace().collect();
-            for (i, part) in parts.iter().enumerate() {
-                if *part == "Failed" && i + 1 < parts.len() {
-                    if let Some(failed_str) = parts[i + 1].split('/').next() {
-                        failures += failed_str.parse::<usize>().unwrap_or(0);
-                    }
-                }
-            }
-        }
-    }
-
-    (tests, failures)
-}
-
-/// Check that pjdfstest is installed
-fn check_prerequisites() {
-    if !Path::new(PJDFSTEST_BIN).exists() {
-        panic!(
-            "pjdfstest not found at {}. Install with:\n\
-             git clone https://github.com/pjd/pjdfstest /tmp/pjdfstest-check\n\
-             cd /tmp/pjdfstest-check && autoreconf -ifs && ./configure && make",
-            PJDFSTEST_BIN
-        );
-    }
-}
-
-/// Utility test to list all available categories
-#[test]
-#[ignore = "utility test - just prints available categories"]
-fn list_categories() {
-    if !Path::new(PJDFSTEST_TESTS).exists() {
-        println!("pjdfstest tests directory not found at {}", PJDFSTEST_TESTS);
-        println!("Install with:");
-        println!("  git clone https://github.com/pjd/pjdfstest /tmp/pjdfstest-check");
-        println!("  cd /tmp/pjdfstest-check && autoreconf -ifs && ./configure && make");
-        return;
-    }
-
-    let categories = discover_categories();
-    println!("\nAvailable pjdfstest categories ({}):", categories.len());
-    for cat in categories {
-        println!("  - {}", cat);
-    }
-}
-
-/// Run all categories sequentially on a single VM
-///
-/// This test creates ONE VM with a FUSE volume and runs all pjdfstest categories
-/// sequentially. Useful for comprehensive testing without parallelism complexity.
-#[cfg(feature = "privileged-tests")] -#[tokio::test] -#[ignore = "comprehensive test - runs all categories sequentially"] -async fn test_posix_all_sequential_bridged() { - check_prerequisites(); - - // Create VM with FUSE volume - let fixture = common::VmFixture::new("posix-all-seq") - .await - .expect("failed to create VM fixture"); - - println!("\n╔═══════════════════════════════════════════════════════════════╗"); - println!("║ pjdfstest POSIX Compliance Test (Sequential) ║"); - println!("╚═══════════════════════════════════════════════════════════════╝\n"); - - let categories = discover_categories(); - println!("Running {} categories sequentially...\n", categories.len()); - - let mut all_passed = true; - let mut total_tests = 0; - let mut total_failures = 0; - let mut failed_categories = Vec::new(); - - for category in &categories { - let result = run_category(category, fixture.host_dir()).await; - - let status = if result.passed { "✓" } else { "✗" }; - println!( - "[{}] {} {} ({} tests, {} failures, {:.1}s)", - categories.iter().position(|c| c == category).unwrap() + 1, - status, - result.category, - result.tests, - result.failures, - result.duration_secs - ); - - total_tests += result.tests; - total_failures += result.failures; - - if !result.passed { - all_passed = false; - failed_categories.push(result.category.clone()); - - // Print output for failed categories - if result.output.len() < 5000 { - eprintln!("\n━━━ {} output ━━━", result.category); - eprintln!("{}", result.output); - } - } - } - - println!("\n╔═══════════════════════════════════════════════════════════════╗"); - println!("║ TEST SUMMARY ║"); - println!("╠═══════════════════════════════════════════════════════════════╣"); - println!( - "║ Total tests: {:>10} ║", - total_tests - ); - println!( - "║ Total failures: {:>10} ║", - total_failures - ); - println!( - "║ Categories: {:>10} ║", - categories.len() - ); - println!( - "║ Failed categories:{:>10} ║", - failed_categories.len() - ); - 
println!("╚═══════════════════════════════════════════════════════════════╝"); - - if !failed_categories.is_empty() { - panic!( - "\n{} categories failed: {:?}", - failed_categories.len(), - failed_categories - ); - } - - assert!(all_passed, "all test categories should pass"); - assert_eq!(total_failures, 0, "should have no failures"); -} diff --git a/tests/test_health_monitor.rs b/tests/test_health_monitor.rs index 32b12c1e..3669a30a 100644 --- a/tests/test_health_monitor.rs +++ b/tests/test_health_monitor.rs @@ -13,7 +13,7 @@ fn create_unique_test_dir() -> std::path::PathBuf { let id = TEST_COUNTER.fetch_add(1, Ordering::SeqCst); let pid = std::process::id(); let temp_dir = tempfile::tempdir().expect("create temp base dir"); - let path = temp_dir.into_path(); + let path = temp_dir.keep(); // Rename to include unique suffix for debugging let unique_path = std::path::PathBuf::from(format!("/tmp/fcvm-test-health-{}-{}", pid, id)); let _ = std::fs::remove_dir_all(&unique_path); diff --git a/tests/test_localhost_image.rs b/tests/test_localhost_image.rs index 85bde9a8..535069c2 100644 --- a/tests/test_localhost_image.rs +++ b/tests/test_localhost_image.rs @@ -4,6 +4,8 @@ //! The image is exported from the host using skopeo, mounted into the VM via FUSE, //! and then imported by fc-agent using skopeo before running with podman. 
+#![cfg(all(feature = "integration-fast", feature = "privileged-tests"))] + mod common; use anyhow::{Context, Result}; @@ -12,7 +14,6 @@ use std::time::Duration; use tokio::io::{AsyncBufReadExt, BufReader}; /// Test that a localhost/ container image can be built and run in a VM -#[cfg(feature = "privileged-tests")] #[tokio::test] async fn test_localhost_hello_world_bridged() -> Result<()> { println!("\nLocalhost Image Test"); @@ -77,7 +78,9 @@ async fn test_localhost_hello_world_bridged() -> Result<()> { found_hello = true; } // Check for container exit with code 0 - if line.contains("Container exit notification received") && line.contains("exit_code=0") { + if line.contains("Container exit notification received") + && line.contains("exit_code=0") + { exited_zero = true; } } @@ -86,7 +89,8 @@ async fn test_localhost_hello_world_bridged() -> Result<()> { }); // Wait for the process to exit (with timeout) - let timeout = Duration::from_secs(60); + // 120s to handle podman storage lock contention during parallel test runs + let timeout = Duration::from_secs(120); let result = tokio::time::timeout(timeout, child.wait()).await; match result { @@ -121,7 +125,9 @@ async fn test_localhost_hello_world_bridged() -> Result<()> { Ok(()) } else { println!("\n❌ LOCALHOST IMAGE TEST FAILED!"); - println!(" - Did not find expected output: '[ctr:stdout] Hello from localhost container!'"); + println!( + " - Did not find expected output: '[ctr:stdout] Hello from localhost container!'" + ); println!(" - Check logs above for error details"); anyhow::bail!("Localhost image test failed") } diff --git a/tests/test_port_forward.rs b/tests/test_port_forward.rs index ff7b7322..b99683bd 100644 --- a/tests/test_port_forward.rs +++ b/tests/test_port_forward.rs @@ -2,6 +2,8 @@ //! //! 
Verifies that --publish correctly forwards ports from host to guest +#![cfg(feature = "integration-fast")] + mod common; use anyhow::{Context, Result}; @@ -28,6 +30,9 @@ fn test_port_forward_bridged() -> Result<()> { let fcvm_path = common::find_fcvm_binary()?; let vm_name = format!("port-bridged-{}", std::process::id()); + // Port 8080:80 - DNAT is scoped to veth IP so same port works across parallel VMs + let host_port: u16 = 8080; + // Start VM with port forwarding let mut fcvm = Command::new(&fcvm_path) .args([ @@ -38,7 +43,7 @@ fn test_port_forward_bridged() -> Result<()> { "--network", "bridged", "--publish", - "18080:80", + "8080:80", "nginx:alpine", ]) .spawn() @@ -51,9 +56,10 @@ fn test_port_forward_bridged() -> Result<()> { let start = std::time::Instant::now(); let mut healthy = false; let mut guest_ip = String::new(); + let mut veth_host_ip = String::new(); while start.elapsed() < Duration::from_secs(60) { - std::thread::sleep(Duration::from_secs(2)); + std::thread::sleep(common::POLL_INTERVAL); let output = Command::new(&fcvm_path) .args(["ls", "--json", "--pid", &fcvm_pid.to_string()]) @@ -75,12 +81,18 @@ fn test_port_forward_bridged() -> Result<()> { // Find our VM and check health (filtered by PID so should be only one) if let Some(display) = vms.first() { if matches!(display.vm.health_status, fcvm::state::HealthStatus::Healthy) { - // Extract guest_ip from config.network + // Extract guest_ip and host_ip (veth's host IP) from config.network if let Some(ref ip) = display.vm.config.network.guest_ip { guest_ip = ip.clone(); } + if let Some(ref ip) = display.vm.config.network.host_ip { + veth_host_ip = ip.clone(); + } healthy = true; - println!("VM is healthy, guest_ip: {}", guest_ip); + println!( + "VM is healthy, guest_ip: {}, veth_host_ip: {}", + guest_ip, veth_host_ip + ); break; } } @@ -114,64 +126,40 @@ fn test_port_forward_bridged() -> Result<()> { ); } - // Test 2: Access via forwarded port (external interface) - // Get the host's primary IP - 
let host_ip_output = Command::new("hostname") - .arg("-I") - .output() - .context("getting host IP")?; - let host_ip = String::from_utf8_lossy(&host_ip_output.stdout) - .split_whitespace() - .next() - .unwrap_or("127.0.0.1") - .to_string(); - - println!("Testing access via host IP {}:18080...", host_ip); + // Test 2: Access via port forwarding (veth's host IP) + // DNAT rules are scoped to the veth IP, so this is what we test + println!( + "Testing port forwarding via veth IP {}:{}...", + veth_host_ip, host_port + ); let output = Command::new("curl") .args([ "-s", "--max-time", "5", - &format!("http://{}:18080", host_ip), + &format!("http://{}:{}", veth_host_ip, host_port), ]) .output() .context("curl to forwarded port")?; let forward_works = output.status.success() && !output.stdout.is_empty(); println!( - "Forwarded port (host IP): {}", + "Port forwarding (veth IP): {}", if forward_works { "OK" } else { "FAIL" } ); - // Test 3: Access via localhost (this is the tricky one) - println!("Testing access via localhost:18080..."); - let output = Command::new("curl") - .args(["-s", "--max-time", "5", "http://127.0.0.1:18080"]) - .output() - .context("curl to localhost")?; - - let localhost_works = output.status.success() && !output.stdout.is_empty(); - println!( - "Localhost access: {}", - if localhost_works { "OK" } else { "FAIL" } - ); - // Cleanup println!("Cleaning up..."); let _ = Command::new("kill") .args(["-TERM", &fcvm_pid.to_string()]) .output(); - std::thread::sleep(Duration::from_secs(2)); + std::thread::sleep(common::POLL_INTERVAL); let _ = fcvm.wait(); - // Assertions - ALL port forwarding methods must work + // Assertions - both direct and port forwarding must work assert!(direct_works, "Direct access to guest should work"); - assert!(forward_works, "Port forwarding via host IP should work"); - assert!( - localhost_works, - "Localhost port forwarding should work (requires route_localnet)" - ); + assert!(forward_works, "Port forwarding via veth IP should 
work"); println!("test_port_forward_bridged PASSED"); Ok(()) @@ -189,7 +177,7 @@ fn test_port_forward_rootless() -> Result<()> { let vm_name = format!("port-rootless-{}", std::process::id()); // Start VM with rootless networking and port forwarding - // Use unprivileged port 8080 since rootless can't bind to 80 + // Rootless uses unique loopback IPs (127.x.y.z) per VM, so port 8080 is fine let mut fcvm = Command::new(&fcvm_path) .args([ "podman", @@ -214,7 +202,7 @@ fn test_port_forward_rootless() -> Result<()> { let mut loopback_ip = String::new(); while start.elapsed() < Duration::from_secs(90) { - std::thread::sleep(Duration::from_secs(2)); + std::thread::sleep(common::POLL_INTERVAL); let output = Command::new(&fcvm_path) .args(["ls", "--json", "--pid", &fcvm_pid.to_string()]) @@ -287,7 +275,7 @@ fn test_port_forward_rootless() -> Result<()> { .args(["-TERM", &fcvm_pid.to_string()]) .output(); - std::thread::sleep(Duration::from_secs(2)); + std::thread::sleep(common::POLL_INTERVAL); let _ = fcvm.wait(); // Assertions diff --git a/tests/test_readme_examples.rs b/tests/test_readme_examples.rs index a977bd58..ddfe2038 100644 --- a/tests/test_readme_examples.rs +++ b/tests/test_readme_examples.rs @@ -9,6 +9,8 @@ //! `Stdio::inherit()` to prevent pipe buffer deadlock. See CLAUDE.md //! "Pipe Buffer Deadlock in Tests" for details. 
+#![cfg(all(feature = "integration-fast", feature = "privileged-tests"))] + mod common; use anyhow::{Context, Result}; @@ -21,7 +23,6 @@ use std::time::Duration; /// ``` /// sudo fcvm podman run --name web1 --map /host/config:/config:ro nginx:alpine /// ``` -#[cfg(feature = "privileged-tests")] #[tokio::test] async fn test_readonly_volume_bridged() -> Result<()> { println!("\ntest_readonly_volume_bridged"); @@ -118,7 +119,6 @@ async fn test_readonly_volume_bridged() -> Result<()> { /// ``` /// sudo fcvm podman run --name web1 --env DEBUG=1 nginx:alpine /// ``` -#[cfg(feature = "privileged-tests")] #[tokio::test] async fn test_env_variables_bridged() -> Result<()> { println!("\ntest_env_variables_bridged"); @@ -197,7 +197,6 @@ async fn test_env_variables_bridged() -> Result<()> { /// ``` /// sudo fcvm podman run --name web1 --cpu 4 --mem 4096 nginx:alpine /// ``` -#[cfg(feature = "privileged-tests")] #[tokio::test] async fn test_custom_resources_bridged() -> Result<()> { println!("\ntest_custom_resources_bridged"); @@ -276,7 +275,6 @@ async fn test_custom_resources_bridged() -> Result<()> { /// fcvm ls --json /// fcvm ls --pid 12345 /// ``` -#[cfg(feature = "privileged-tests")] #[tokio::test] async fn test_fcvm_ls_bridged() -> Result<()> { println!("\ntest_fcvm_ls_bridged"); @@ -407,7 +405,6 @@ async fn test_fcvm_ls_bridged() -> Result<()> { /// ``` /// sudo fcvm podman run --name web1 --cmd "nginx -g 'daemon off;'" nginx:alpine /// ``` -#[cfg(feature = "privileged-tests")] #[tokio::test] async fn test_custom_command_bridged() -> Result<()> { println!("\ntest_custom_command_bridged"); diff --git a/tests/test_sanity.rs b/tests/test_sanity.rs index e21c44fb..8729a111 100644 --- a/tests/test_sanity.rs +++ b/tests/test_sanity.rs @@ -3,6 +3,8 @@ //! Uses common::spawn_fcvm() to prevent pipe buffer deadlock. //! See CLAUDE.md "Pipe Buffer Deadlock in Tests" for details. 
+#![cfg(feature = "integration-fast")] + mod common; use anyhow::{Context, Result}; diff --git a/tests/test_signal_cleanup.rs b/tests/test_signal_cleanup.rs index 29a5370d..df44109f 100644 --- a/tests/test_signal_cleanup.rs +++ b/tests/test_signal_cleanup.rs @@ -3,6 +3,8 @@ //! Verifies that when fcvm receives SIGINT/SIGTERM, it properly cleans up //! child processes (firecracker, slirp4netns, etc.) +#![cfg(feature = "integration-fast")] + mod common; use anyhow::{Context, Result}; @@ -61,7 +63,7 @@ fn test_sigint_kills_firecracker_bridged() -> Result<()> { let start = std::time::Instant::now(); let mut healthy = false; while start.elapsed() < Duration::from_secs(60) { - std::thread::sleep(Duration::from_secs(2)); + std::thread::sleep(common::POLL_INTERVAL); let output = Command::new(&fcvm_path) .args(["ls", "--json"]) @@ -114,7 +116,7 @@ fn test_sigint_kills_firecracker_bridged() -> Result<()> { break; } Ok(None) => { - std::thread::sleep(Duration::from_millis(100)); + std::thread::sleep(common::POLL_INTERVAL); } Err(e) => { println!("Error waiting for fcvm: {}", e); @@ -130,7 +132,7 @@ fn test_sigint_kills_firecracker_bridged() -> Result<()> { } // Give a moment for cleanup - std::thread::sleep(Duration::from_secs(2)); + std::thread::sleep(common::POLL_INTERVAL); // Check if our specific firecracker is still running let still_running = process_exists(fc_pid); @@ -192,7 +194,7 @@ fn test_sigterm_kills_firecracker_bridged() -> Result<()> { let start = std::time::Instant::now(); let mut healthy = false; while start.elapsed() < Duration::from_secs(60) { - std::thread::sleep(Duration::from_secs(2)); + std::thread::sleep(common::POLL_INTERVAL); let output = Command::new(&fcvm_path) .args(["ls", "--json"]) @@ -238,14 +240,14 @@ fn test_sigterm_kills_firecracker_bridged() -> Result<()> { break; } Ok(None) => { - std::thread::sleep(Duration::from_millis(100)); + std::thread::sleep(common::POLL_INTERVAL); } Err(_) => break, } } // Give a moment for cleanup - 
std::thread::sleep(Duration::from_secs(2)); + std::thread::sleep(common::POLL_INTERVAL); // Check if our specific firecracker is still running let still_running = process_exists(fc_pid); @@ -305,7 +307,7 @@ fn test_sigterm_cleanup_rootless() -> Result<()> { let start = std::time::Instant::now(); let mut healthy = false; while start.elapsed() < Duration::from_secs(60) { - std::thread::sleep(Duration::from_secs(2)); + std::thread::sleep(common::POLL_INTERVAL); let output = Command::new(&fcvm_path) .args(["ls", "--json"]) @@ -355,14 +357,14 @@ fn test_sigterm_cleanup_rootless() -> Result<()> { break; } Ok(None) => { - std::thread::sleep(Duration::from_millis(100)); + std::thread::sleep(common::POLL_INTERVAL); } Err(_) => break, } } // Give a moment for cleanup - std::thread::sleep(Duration::from_secs(2)); + std::thread::sleep(common::POLL_INTERVAL); // Verify our SPECIFIC processes are cleaned up if let Some(fc_pid) = our_fc_pid { @@ -509,7 +511,7 @@ fn test_sigterm_cleanup_bridged() -> Result<()> { let start = std::time::Instant::now(); let mut healthy = false; while start.elapsed() < Duration::from_secs(60) { - std::thread::sleep(Duration::from_secs(2)); + std::thread::sleep(common::POLL_INTERVAL); let output = Command::new(&fcvm_path) .args(["ls", "--json"]) @@ -553,12 +555,12 @@ fn test_sigterm_cleanup_bridged() -> Result<()> { println!("fcvm exited with status: {:?}", status); break; } - Ok(None) => std::thread::sleep(Duration::from_millis(100)), + Ok(None) => std::thread::sleep(common::POLL_INTERVAL), Err(_) => break, } } - std::thread::sleep(Duration::from_secs(2)); + std::thread::sleep(common::POLL_INTERVAL); // Verify our SPECIFIC processes are cleaned up if let Some(fc_pid) = our_fc_pid { diff --git a/tests/test_snapshot_clone.rs b/tests/test_snapshot_clone.rs index f0438d65..bbd7a5fe 100644 --- a/tests/test_snapshot_clone.rs +++ b/tests/test_snapshot_clone.rs @@ -7,6 +7,8 @@ //! 4. Spawn clones from snapshot (concurrently) //! 5. 
Verify clones become healthy (concurrently)
 
+#![cfg(feature = "integration-slow")]
+
 mod common;
 
 use anyhow::{Context, Result};
@@ -769,6 +771,9 @@ async fn test_clone_http(fcvm_path: &std::path::Path, clone_pid: u32) -> Result<
 async fn test_clone_port_forward_bridged() -> Result<()> {
     let (baseline_name, clone_name, snapshot_name, _) = common::unique_names("pf-bridged");
 
+    // Port 8080:80 - DNAT is scoped to veth IP so same port works across parallel VMs
+    let host_port: u16 = 8080;
+
     println!("\n╔═══════════════════════════════════════════════════════════════╗");
     println!("║  Clone Port Forwarding Test (bridged)                         ║");
     println!("╚═══════════════════════════════════════════════════════════════╝\n");
@@ -833,7 +838,8 @@
     println!("  ✓ Memory server ready (PID: {})", serve_pid);
 
     // Step 4: Spawn clone WITH port forwarding
-    println!("\nStep 4: Spawning clone with --publish 19080:80...");
+    let publish_arg = format!("{}:80", host_port);
+    println!("\nStep 4: Spawning clone with --publish {}...", publish_arg);
     let serve_pid_str = serve_pid.to_string();
     let (_clone_child, clone_pid) = common::spawn_fcvm_with_logs(
         &[
@@ -846,7 +852,7 @@
             "--network",
             "bridged",
            "--publish",
-            "19080:80",
+            &publish_arg,
         ],
         &clone_name,
     )
@@ -869,55 +875,35 @@
         .context("getting clone state")?;
     let stdout = String::from_utf8_lossy(&output.stdout);
 
-    let guest_ip: String = serde_json::from_str::<Vec<serde_json::Value>>(&stdout)
-        .ok()
-        .and_then(|v| v.first().cloned())
-        .and_then(|v| {
-            v.get("config")?
-                .get("network")?
-                .get("guest_ip")?
-                .as_str()
-                .map(|s| s.to_string())
-        })
-        .unwrap_or_default();
+    let parsed: Vec<serde_json::Value> = serde_json::from_str(&stdout).unwrap_or_default();
+    let network = parsed.first().and_then(|v| v.get("config")?.get("network"));
+
+    let guest_ip = network
+        .and_then(|n| n.get("guest_ip")?.as_str())
+        .unwrap_or_default()
+        .to_string();
+    let veth_host_ip = network
+        .and_then(|n| n.get("host_ip")?.as_str())
+        .unwrap_or_default()
+        .to_string();
 
-    println!("  Clone guest IP: {}", guest_ip);
-
-    // Note: Direct access to guest IP (172.30.x.y) is NOT expected to work for clones.
-    // Clones use In-Namespace NAT where the guest IP is only reachable inside the namespace.
-    // Port forwarding goes through veth_inner_ip (10.x.y.z) which then gets DNATed to guest_ip.
-    // We test this only to document the expected behavior.
-    println!("  Testing direct access to guest (expected to fail for clones)...");
-    let direct_result = tokio::process::Command::new("curl")
-        .args(["-s", "--max-time", "5", &format!("http://{}:80", guest_ip)])
-        .output()
-        .await;
-
-    let direct_works = direct_result
-        .map(|o| o.status.success() && !o.stdout.is_empty())
-        .unwrap_or(false);
     println!(
-        "  Direct access: {} (expected for clones)",
-        if direct_works { "✓ OK" } else { "✗ N/A" }
+        "  Clone guest_ip: {}, veth_host_ip: {}",
+        guest_ip, veth_host_ip
     );
 
-    // Test 2: Access via host's primary IP and forwarded port
-    let host_ip = tokio::process::Command::new("hostname")
-        .arg("-I")
-        .output()
-        .await
-        .ok()
-        .and_then(|o| String::from_utf8(o.stdout).ok())
-        .and_then(|s| s.split_whitespace().next().map(|ip| ip.to_string()))
-        .unwrap_or_else(|| "127.0.0.1".to_string());
-
-    println!("  Testing access via host IP {}:19080...", host_ip);
+    // Test: Access via port forwarding (veth's host IP)
+    // DNAT rules are scoped to the veth IP, so this is what we test
+    println!(
+        "  Testing port forwarding via veth IP {}:{}...",
+        veth_host_ip, host_port
+    );
    let forward_result =
tokio::process::Command::new("curl") .args([ "-s", "--max-time", "10", - &format!("http://{}:19080", host_ip), + &format!("http://{}:{}", veth_host_ip, host_port), ]) .output() .await; @@ -926,29 +912,10 @@ async fn test_clone_port_forward_bridged() -> Result<()> { .map(|o| o.status.success() && !o.stdout.is_empty()) .unwrap_or(false); println!( - " Port forward (host IP): {}", + " Port forward (veth IP): {}", if forward_works { "✓ OK" } else { "✗ FAIL" } ); - // Test 3: Access via localhost - println!(" Testing access via localhost:19080..."); - let localhost_result = tokio::process::Command::new("curl") - .args(["-s", "--max-time", "10", "http://127.0.0.1:19080"]) - .output() - .await; - - let localhost_works = localhost_result - .map(|o| o.status.success() && !o.stdout.is_empty()) - .unwrap_or(false); - println!( - " Localhost access: {}", - if localhost_works { - "✓ OK" - } else { - "✗ FAIL" - } - ); - // Cleanup println!("\nCleaning up..."); common::kill_process(clone_pid).await; @@ -961,37 +928,23 @@ async fn test_clone_port_forward_bridged() -> Result<()> { println!("║ RESULTS ║"); println!("╠═══════════════════════════════════════════════════════════════╣"); println!( - "║ Direct access to guest: {} (N/A for clones) ║", - if direct_works { "✓ WORKS" } else { "✗ N/A " } - ); - println!( - "║ Port forward (host IP): {} ║", + "║ Port forward (veth IP): {} ║", if forward_works { "✓ PASSED" } else { "✗ FAILED" } ); - println!( - "║ Localhost port forward: {} ║", - if localhost_works { - "✓ PASSED" - } else { - "✗ FAILED" - } - ); println!("╚═══════════════════════════════════════════════════════════════╝"); - // For clones, only port forwarding methods must work. - // Direct access is NOT expected to work due to In-Namespace NAT architecture. 
- if forward_works && localhost_works { + // Port forwarding via veth IP must work + if forward_works { println!("\n✅ CLONE PORT FORWARDING TEST PASSED!"); Ok(()) } else { anyhow::bail!( - "Clone port forwarding test failed: forward={}, localhost={}", - forward_works, - localhost_works + "Clone port forwarding test failed: forward={}", + forward_works ) } }