Closed
23 commits
64f70c7
Fix Layer 2 setup: proper error handling and package resolution
ejc3 Dec 25, 2025
b0654f7
Update docs to reflect Layer 2 setup improvements
ejc3 Dec 25, 2025
f12d7d9
README: Expand setup section with detailed steps
ejc3 Dec 25, 2025
053cec8
README: Document --setup flag for auto-setup on first run
ejc3 Dec 25, 2025
8a90821
README: Clarify why --setup is rootless only
ejc3 Dec 25, 2025
0210025
CI: Auto-cancel in-progress runs on new push
ejc3 Dec 25, 2025
e3e75bc
CI: Add missing dependencies to Container job
ejc3 Dec 25, 2025
5f1853a
Fix VM shutdown in Layer 2 setup
ejc3 Dec 25, 2025
4b640fa
CI: Run setup inside container, add sanity checks
ejc3 Dec 25, 2025
62aeb8c
CI: Run setup inside container, add sanity checks
ejc3 Dec 25, 2025
7e4fa8b
Fix podman-in-podman for rootless container setup
ejc3 Dec 25, 2025
8059098
CI: Add Rust cache for faster builds
ejc3 Dec 25, 2025
b30f67a
CI: Add cargo cache for container builds
ejc3 Dec 25, 2025
676388d
Fix: Add --cgroups=disabled to actual podman command
ejc3 Dec 25, 2025
b27bb30
Refactor: Use single source for download script
ejc3 Dec 25, 2025
9290255
Separate lint tests from integration tests
ejc3 Dec 25, 2025
f0b9cec
CI: Install cargo-audit/deny for CVSS 4.0 support
ejc3 Dec 25, 2025
d98ed1d
docs: Add NO HACKS policy to CLAUDE.md
ejc3 Dec 25, 2025
a83bdbe
CI: Add shared-key to rust-cache for cache reuse
ejc3 Dec 25, 2025
d2b678a
CI: Save rust cache even on failure
ejc3 Dec 25, 2025
8e714fd
Fix disk space exhaustion in CI snapshot tests
ejc3 Dec 25, 2025
0cad682
CI: Enable userfaultfd in Container job
ejc3 Dec 25, 2025
39512ce
CI: Create and pass /dev/userfaultfd to container
ejc3 Dec 25, 2025
42 changes: 39 additions & 3 deletions .claude/CLAUDE.md
@@ -1,5 +1,18 @@
# fcvm Development Log

## NO HACKS

**Fix the root cause, not the symptom.** When something fails:
1. Understand WHY it's failing
2. Fix the actual problem
3. Don't hide errors, disable tests, or add workarounds

Examples of hacks to avoid:
- Gating tests behind feature flags to skip failures
- Adding sleeps or retries without understanding the race
- Clearing caches instead of updating tools
- Using `|| true` to ignore errors

## Overview
fcvm is a Firecracker VM manager for running Podman containers in lightweight microVMs. This document tracks implementation findings and decisions.

@@ -727,9 +740,25 @@ fuse-pipe/benches/
- Initrd: `/mnt/fcvm-btrfs/initrd/fc-agent-{sha}.initrd` (injects fc-agent at boot)

**Layer System:**
The rootfs is named after the SHA of the setup script + kernel URL. This ensures automatic cache invalidation when:
The rootfs is named after the SHA of a combined script that includes:
- Init script (embeds install script + setup script)
- Kernel URL
- Download script (packages + Ubuntu codename)

This ensures automatic cache invalidation when:
- The init logic, install script, or setup script changes
- The kernel URL changes (different kernel version)
- The package list or target Ubuntu version changes

**Package Download:**
Packages are downloaded using `podman run ubuntu:{codename}` with `apt-get install --download-only`.
This ensures packages match the target Ubuntu version (Noble/24.04), not the host OS.
The `codename` is specified in `rootfs-plan.toml`.

**Setup Verification:**
Layer 2 setup writes a marker file `/etc/fcvm-setup-complete` on successful completion.
After the setup VM exits, fcvm mounts the rootfs and verifies this marker exists.
If missing, setup fails with a clear error.
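
The marker check described above can be sketched in Rust as follows. This is an illustration under stated assumptions, not fcvm's actual code: the function name `verify_setup_marker` and the `mount_root` parameter are hypothetical, and the sketch assumes the rootfs image is already mounted at that directory.

```rust
use std::path::Path;

/// Marker path relative to the root of the mounted rootfs
/// (the document names it as /etc/fcvm-setup-complete inside the guest).
const SETUP_MARKER: &str = "etc/fcvm-setup-complete";

/// Returns Ok(()) if the Layer 2 setup marker exists under `mount_root`.
/// `mount_root` is a hypothetical parameter: the host directory where the
/// rootfs image was mounted after the setup VM exited.
pub fn verify_setup_marker(mount_root: &Path) -> Result<(), String> {
    let marker = mount_root.join(SETUP_MARKER);
    if marker.is_file() {
        Ok(())
    } else {
        // Fail loudly instead of continuing with a half-installed rootfs.
        Err(format!(
            "Layer 2 setup did not complete: marker {} is missing",
            marker.display()
        ))
    }
}
```

Returning an error rather than logging and continuing keeps the check in line with the NO HACKS policy: a missing marker stops setup instead of being hidden.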

The initrd contains a statically-linked busybox and fc-agent binary, injected at boot before systemd.

@@ -887,8 +916,15 @@ ERROR fcvm: Error: setting up rootfs: Rootfs not found. Run 'fcvm setup' first,

**What `fcvm setup` does:**
1. Downloads Kata kernel from URL in `rootfs-plan.toml` (~15MB, cached by URL hash)
2. Creates Layer 2 rootfs (~10GB, downloads Ubuntu cloud image, boots VM to install packages)
3. Creates fc-agent initrd (embeds statically-linked fc-agent binary)
2. Downloads packages using `podman run ubuntu:noble` with `apt-get install --download-only`
- Packages specified in `rootfs-plan.toml` (podman, crun, fuse-overlayfs, skopeo, fuse3, haveged, chrony, strace)
- Uses target Ubuntu version (noble/24.04) to get correct package versions
3. Creates Layer 2 rootfs (~10GB):
- Downloads Ubuntu cloud image
- Boots VM with packages embedded in initrd
- Runs install script (dpkg) + setup script (config files, services)
- Verifies setup completed by checking for `/etc/fcvm-setup-complete` marker file
4. Creates fc-agent initrd (embeds statically-linked fc-agent binary)

**Kernel source**: Kata Containers kernel (6.12.47 from Kata 3.24.0 release) with `CONFIG_FUSE_FS=y` built-in.

14 changes: 12 additions & 2 deletions .config/nextest.toml
@@ -42,6 +42,10 @@ retries = 0
[test-groups.stress-tests]
max-threads = 1

# Snapshot tests limited to 3 concurrent (each snapshot is ~5.6GB on disk)
[test-groups.snapshot-tests]
max-threads = 3

# VM tests run at full parallelism (num-cpus)
[test-groups.vm-tests]
max-threads = "num-cpus"
@@ -51,9 +55,15 @@ filter = "package(fcvm) & test(/stress_100/)"
test-group = "stress-tests"
slow-timeout = { period = "600s", terminate-after = 1 }

# VM tests get 10 minute timeout
# Snapshot tests: limited to 3 concurrent (each creates ~5.6GB snapshot on disk)
[[profile.default.overrides]]
filter = "package(fcvm) & (test(/snapshot/) | test(/clone/))"
test-group = "snapshot-tests"
slow-timeout = { period = "600s", terminate-after = 1 }

# VM tests get 10 minute timeout (non-snapshot tests)
[[profile.default.overrides]]
filter = "package(fcvm) & test(/test_/) & !test(/stress_100/) & !test(/pjdfstest_vm/)"
filter = "package(fcvm) & test(/test_/) & !test(/stress_100/) & !test(/pjdfstest_vm/) & !test(/snapshot/) & !test(/clone/)"
test-group = "vm-tests"
slow-timeout = { period = "600s", terminate-after = 1 }

38 changes: 34 additions & 4 deletions .github/workflows/ci.yml
@@ -6,6 +6,11 @@ on:
push:
branches: [main]

# Cancel in-progress runs when a new revision is pushed
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

env:
CARGO_TERM_COLOR: always
FUSE_BACKEND_RS: ${{ github.workspace }}/fuse-backend-rs
@@ -36,6 +41,11 @@ jobs:
run: |
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
echo "$HOME/.cargo/bin" >> $GITHUB_PATH
- uses: Swatinem/rust-cache@v2
with:
cache-provider: buildjet
workspaces: fcvm -> target
cache-on-failure: "true"
- name: Install dependencies
run: |
sudo apt-get update
@@ -51,8 +61,8 @@ jobs:
release-v1.14.0-x86_64/jailer-v1.14.0-x86_64
sudo mv /usr/local/bin/firecracker-v1.14.0-x86_64 /usr/local/bin/firecracker
sudo mv /usr/local/bin/jailer-v1.14.0-x86_64 /usr/local/bin/jailer
- name: Install cargo-nextest
run: cargo install cargo-nextest --locked
- name: Install cargo tools
run: cargo install cargo-nextest cargo-audit cargo-deny --locked
- name: Setup KVM and networking
run: |
sudo chmod 666 /dev/kvm
@@ -102,15 +112,35 @@ jobs:
- name: Setup KVM and rootless podman
run: |
sudo chmod 666 /dev/kvm
# Create userfaultfd device for snapshot cloning
if [ ! -e /dev/userfaultfd ]; then
sudo mknod /dev/userfaultfd c 10 126
fi
sudo chmod 666 /dev/userfaultfd
sudo sysctl -w vm.unprivileged_userfaultfd=1
# Configure rootless podman to use cgroupfs (no systemd session on CI)
mkdir -p ~/.config/containers
printf '[engine]\ncgroup_manager = "cgroupfs"\nevents_logger = "file"\n' > ~/.config/containers/containers.conf
# Create cargo cache directory for container
mkdir -p ${{ github.workspace }}/cargo-cache/registry ${{ github.workspace }}/cargo-cache/target
- name: Cache container cargo
uses: actions/cache@v4
with:
path: ${{ github.workspace }}/cargo-cache
key: container-cargo-${{ hashFiles('fcvm/Cargo.lock') }}
restore-keys: container-cargo-
- name: container-test-unit
env:
CARGO_CACHE_DIR: ${{ github.workspace }}/cargo-cache
working-directory: fcvm
run: make container-test-unit
- name: setup-fcvm
- name: container-setup-fcvm
env:
CARGO_CACHE_DIR: ${{ github.workspace }}/cargo-cache
working-directory: fcvm
run: make setup-fcvm
run: make container-setup-fcvm
- name: container-test
env:
CARGO_CACHE_DIR: ${{ github.workspace }}/cargo-cache
working-directory: fcvm
run: make container-test
2 changes: 1 addition & 1 deletion Containerfile
@@ -15,7 +15,7 @@ RUN cargo install cargo-nextest cargo-audit cargo-deny --locked
RUN apt-get update && apt-get install -y \
fuse3 libfuse3-dev autoconf automake libtool perl libclang-dev clang \
musl-tools iproute2 iptables slirp4netns dnsmasq qemu-utils e2fsprogs \
parted podman skopeo git curl sudo procps zstd busybox-static cpio uidmap \
parted fdisk podman skopeo git curl sudo procps zstd busybox-static cpio uidmap \
&& rm -rf /var/lib/apt/lists/*

# Install Firecracker
27 changes: 22 additions & 5 deletions DESIGN.md
@@ -920,7 +920,14 @@ The guest is configured to support rootless Podman:
fcvm setup
```

This downloads the Kata kernel (~15MB) and creates the Layer 2 rootfs (~10GB with Ubuntu + Podman). Takes 5-10 minutes on first run.
**What it does:**
1. Downloads Kata kernel (~15MB, cached by URL hash)
2. Downloads packages via `podman run ubuntu:noble` with `apt-get install --download-only`
3. Creates Layer 2 rootfs (~10GB): boots VM, installs packages, writes config
4. Verifies setup by checking `/etc/fcvm-setup-complete` marker file
5. Creates fc-agent initrd (embeds statically-linked fc-agent binary)

Takes 5-10 minutes on first run. Subsequent runs are instant (cached by content hash).

**Note**: Must be run before `fcvm podman run` with bridged networking. For rootless mode, you can use `--setup` flag on `fcvm podman run` instead.

@@ -1310,7 +1317,7 @@ Override with `FCVM_BASE_DIR` environment variable.
/mnt/fcvm-btrfs/
├── kernels/ # Kernel binaries
│ └── vmlinux-{sha}.bin
├── rootfs/ # Base rootfs images
├── rootfs/ # Base rootfs images (contains /etc/fcvm-setup-complete marker)
│ └── layer2-{sha}.raw
├── initrd/ # fc-agent injection initrds
│ └── fc-agent-{sha}.initrd
@@ -1319,9 +1326,19 @@ Override with `FCVM_BASE_DIR` environment variable.
├── snapshots/ # Firecracker snapshots
├── state/ # VM state JSON files
│ └── {vm-id}.json
└── cache/ # Downloaded images
└── cache/ # Downloaded images and packages
├── ubuntu-24.04-arm64-{sha}.img # Cloud image cache
└── packages-{sha}/ # Downloaded .deb files
```

**Rootfs Hash Calculation:**
The layer2-{sha}.raw name is computed from:
- Init script (embeds install + setup scripts)
- Kernel URL
- Download script (package list + Ubuntu codename)

This ensures automatic cache invalidation when any component changes.
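
A minimal sketch of how such a content-derived name could be computed from the three inputs listed above. The names `rootfs_key` and `rootfs_file_name` are hypothetical, and `DefaultHasher` is only a std-only stand-in for whatever SHA function fcvm actually uses; the point is that changing any one input changes the resulting file name.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Derives a short hex key from the init script, kernel URL, and download script.
/// ASSUMPTION: DefaultHasher stands in for a real SHA so the sketch needs only std.
pub fn rootfs_key(init_script: &str, kernel_url: &str, download_script: &str) -> String {
    let mut hasher = DefaultHasher::new();
    for part in [init_script, kernel_url, download_script] {
        // Length-prefix each part so ("ab", "c") and ("a", "bc") hash differently.
        hasher.write_u64(part.len() as u64);
        hasher.write(part.as_bytes());
    }
    format!("{:016x}", hasher.finish())
}

/// Builds the file name used in the layout above: layer2-{sha}.raw
pub fn rootfs_file_name(key: &str) -> String {
    format!("layer2-{}.raw", key)
}
```

With this shape, editing the package list or the Ubuntu codename in the download script yields a new key, so the old image is simply never looked up again and no explicit cache clearing is needed.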

### State Persistence

**VM State** (`/mnt/fcvm-btrfs/state/{vm-id}.json`):
@@ -1726,6 +1743,6 @@ The 64 CPUs help within each crate (LLVM codegen), but crate-level parallelism i

**End of Design Specification**

*Version: 2.2*
*Date: 2025-12-24*
*Version: 2.3*
*Date: 2025-12-25*
*Author: fcvm project*
33 changes: 27 additions & 6 deletions Makefile
@@ -30,11 +30,19 @@ endif
# Base test command
NEXTEST := CARGO_TARGET_DIR=target cargo nextest $(NEXTEST_CMD) --release

# Container run command (runs as testuser via Containerfile USER directive)
# Optional cargo cache directory (for CI caching)
CARGO_CACHE_DIR ?=
ifneq ($(CARGO_CACHE_DIR),)
CARGO_CACHE_MOUNT := -v $(CARGO_CACHE_DIR)/registry:/usr/local/cargo/registry -v $(CARGO_CACHE_DIR)/target:/workspace/fcvm/target
else
CARGO_CACHE_MOUNT :=
endif

# Container run command
CONTAINER_RUN := podman run --rm --privileged \
-v .:/workspace/fcvm -v $(FUSE_BACKEND_RS):/workspace/fuse-backend-rs -v $(FUSER):/workspace/fuser \
--device /dev/fuse --device /dev/kvm \
--ulimit nofile=65536:65536 --pids-limit=65536 -v /mnt/fcvm-btrfs:/mnt/fcvm-btrfs
--device /dev/fuse --device /dev/kvm --device /dev/userfaultfd \
--ulimit nofile=65536:65536 --pids-limit=65536 -v /mnt/fcvm-btrfs:/mnt/fcvm-btrfs $(CARGO_CACHE_MOUNT)

.PHONY: all help build clean test test-unit test-fast test-all test-root \
_test-unit _test-fast _test-all _test-root \
@@ -84,11 +92,11 @@ container-test-unit: container-build
@echo "==> Running unit tests in container..."
$(CONTAINER_RUN) $(CONTAINER_TAG) make build _test-unit

container-test-fast: setup-fcvm container-build
container-test-fast: container-setup-fcvm
@echo "==> Running fast tests in container..."
$(CONTAINER_RUN) $(CONTAINER_TAG) make _test-fast

container-test-all: setup-fcvm container-build
container-test-all: container-setup-fcvm
@echo "==> Running all tests in container..."
$(CONTAINER_RUN) $(CONTAINER_TAG) make _test-all

@@ -117,7 +125,7 @@ setup-btrfs:
@if ! mountpoint -q /mnt/fcvm-btrfs 2>/dev/null; then \
echo '==> Creating btrfs loopback...'; \
if [ ! -f /var/fcvm-btrfs.img ]; then \
sudo truncate -s 20G /var/fcvm-btrfs.img && sudo mkfs.btrfs /var/fcvm-btrfs.img; \
sudo truncate -s 60G /var/fcvm-btrfs.img && sudo mkfs.btrfs /var/fcvm-btrfs.img; \
fi && \
sudo mkdir -p /mnt/fcvm-btrfs && \
sudo mount -o loop /var/fcvm-btrfs.img /mnt/fcvm-btrfs && \
@@ -135,6 +143,19 @@ setup-fcvm: build setup-btrfs
@echo "==> Running fcvm setup..."
./target/release/fcvm setup

# Run setup inside container (for CI - container has Firecracker)
container-setup-fcvm: container-build setup-btrfs
@echo "==> Running fcvm setup in container..."
$(CONTAINER_RUN) $(CONTAINER_TAG) make build _setup-fcvm

_setup-fcvm:
@FREE_GB=$$(df -BG /mnt/fcvm-btrfs 2>/dev/null | awk 'NR==2 {gsub("G",""); print $$4}'); \
if [ -n "$$FREE_GB" ] && [ "$$FREE_GB" -lt 15 ]; then \
echo "ERROR: Need 15GB on /mnt/fcvm-btrfs (have $${FREE_GB}GB)"; \
exit 1; \
fi
./target/release/fcvm setup

bench: build
@echo "==> Running benchmarks..."
sudo cargo bench -p fuse-pipe --bench throughput
20 changes: 19 additions & 1 deletion README.md
@@ -83,11 +83,29 @@ cargo build --release --workspace

### Setup (First Time)
```bash
# Create btrfs filesystem and download kernel + rootfs (takes 5-10 minutes)
# Create btrfs filesystem
make setup-btrfs

# Download kernel and create rootfs (takes 5-10 minutes first time)
fcvm setup
```

**What `fcvm setup` does:**
1. Downloads Kata kernel (~15MB, cached by URL hash)
2. Downloads packages via `podman run ubuntu:noble` (ensures correct Ubuntu 24.04 versions)
3. Creates Layer 2 rootfs (~10GB): boots VM, installs packages, writes config files
4. Verifies setup completed successfully (checks marker file)
5. Creates fc-agent initrd

Subsequent runs are instant - everything is cached by content hash.

**Alternative: Auto-setup on first run (rootless only)**
```bash
# Skip explicit setup - does it automatically on first run
fcvm podman run --name web1 --network rootless --setup nginx:alpine
```
The `--setup` flag triggers setup if kernel/rootfs are missing. Only works with `--network rootless` to avoid file ownership issues when running as root.
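
The restriction on `--setup` could be enforced with a check along these lines. This is a hedged sketch only: `NetworkMode` and `check_setup_flag` are hypothetical names, and fcvm's real argument handling may be structured differently.

```rust
/// Networking modes mentioned in the document (hypothetical type for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum NetworkMode {
    Bridged,
    Rootless,
}

/// Rejects `--setup` unless the run uses rootless networking.
/// Bridged runs are assumed to execute as root, which is where the
/// file ownership problem described above would arise.
pub fn check_setup_flag(network: NetworkMode, setup: bool) -> Result<(), String> {
    match (network, setup) {
        (NetworkMode::Bridged, true) => Err(
            "--setup requires --network rootless; run 'fcvm setup' first for bridged networking"
                .to_string(),
        ),
        _ => Ok(()),
    }
}
```

Rejecting the combination up front, rather than silently skipping setup, gives the user an actionable message before any VM is started.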

### Run a Container
```bash
# Run nginx in a Firecracker VM (using AWS ECR public registry to avoid Docker Hub rate limits)
11 changes: 9 additions & 2 deletions fuse-pipe/src/server/passthrough.rs
@@ -1355,7 +1355,10 @@ mod tests {
};

// Create hardlink
eprintln!("Calling link(source_ino={}, parent=1, name='link.txt')...", source_ino);
eprintln!(
"Calling link(source_ino={}, parent=1, name='link.txt')...",
source_ino
);
let resp = fs.link(source_ino, 1, "link.txt", uid, gid, 0);
let link_ino = match resp {
VolumeResponse::Entry { attr, .. } => {
@@ -1369,7 +1372,11 @@ mod tests {
let src_path = dir.path().join("source.txt");
let link_path = dir.path().join("link.txt");
eprintln!("=== link() FAILED ===");
eprintln!("errno: {} ({})", errno, std::io::Error::from_raw_os_error(errno));
eprintln!(
"errno: {} ({})",
errno,
std::io::Error::from_raw_os_error(errno)
);
eprintln!("source.txt exists: {}", src_path.exists());
eprintln!("link.txt exists: {}", link_path.exists());
eprintln!(
5 changes: 4 additions & 1 deletion fuse-pipe/tests/common/mod.rs
@@ -365,7 +365,10 @@ pub fn supports_at_empty_path(dir: &Path) -> bool {
eprintln!("AT_EMPTY_PATH: supported");
} else {
let err = std::io::Error::last_os_error();
eprintln!("AT_EMPTY_PATH: not supported ({}) - skipping hardlink test", err);
eprintln!(
"AT_EMPTY_PATH: not supported ({}) - skipping hardlink test",
err
);
}
supported
}
8 changes: 7 additions & 1 deletion fuse-pipe/tests/integration.rs
@@ -210,7 +210,13 @@ fn test_hardlink_survives_source_removal() {
eprintln!("=== Hardlink failed ===");
eprintln!("source: {:?} exists={}", source, source.exists());
eprintln!("link: {:?}", link);
eprintln!("mount contents: {:?}", fs::read_dir(mount).ok().map(|d| d.filter_map(|e| e.ok()).map(|e| e.file_name()).collect::<Vec<_>>()));
eprintln!(
"mount contents: {:?}",
fs::read_dir(mount).ok().map(|d| d
.filter_map(|e| e.ok())
.map(|e| e.file_name())
.collect::<Vec<_>>())
);
panic!("create hardlink failed: {}", e);
}

2 changes: 2 additions & 0 deletions rootfs-plan.toml
@@ -12,6 +12,8 @@
# Ubuntu 24.04 LTS (Noble Numbat) cloud images
# Using "current" for latest updates - URL changes trigger plan SHA change
version = "24.04"
# Codename used to download packages from correct Ubuntu release
codename = "noble"

[base.arm64]
url = "https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-arm64.img"