Skip to content

ARM64 nested virtualization (inception) and copy_file_range fix#31

Merged
ejc3 merged 18 commits intomainfrom
fuse-test-prereqs
Dec 28, 2025
Merged

ARM64 nested virtualization (inception) and copy_file_range fix#31
ejc3 merged 18 commits intomainfrom
fuse-test-prereqs

Conversation

@ejc3
Copy link
Copy Markdown
Owner

@ejc3 ejc3 commented Dec 27, 2025

Summary

This PR contains two main features:

1. ARM64 Nested Virtualization (Inception) - 14 commits

Run fcvm inside fcvm on ARM64 Graviton3+ instances with FEAT_NV2 support.

What's added:

  • Inception kernel build script (kernel/build.sh) with CONFIG_KVM=y
  • NV2 support: passes --enable-nv2 to patched Firecracker fork
  • Auto-build of Firecracker NV2 fork in tests
  • Full inception test: outer VM runs inner fcvm
  • Documentation in README.md and CLAUDE.md

Requirements:

  • Hardware: ARM64 with FEAT_NV2 (Graviton3+, c7g.metal)
  • Host kernel: 6.18+ with kvm-arm.mode=nested

2. copy_file_range FUSE fix - 3 commits

Fix copy_file_range to use proper fuse-backend-rs handle lookup instead of treating FUSE handles as raw file descriptors.

  • 790ec5d Fix copy_file_range to use fuse-backend-rs handle lookup
  • 338f5a8 Skip copy_file_range test when kernel doesn't support it
  • 4c9295f Remove host-side test (requires patched kernel)

3. Misc - 1 commit

  • 28a95da Fix formatting (pre-existing issues)

Test plan

  • make test-root passes (227/228, only clippy was failing - fixed in stacked PR)
  • Inception tests work on c7g.metal with NV2 Firecracker

ejc3 added 18 commits December 27, 2025 13:36
New test test_inception_run_fcvm_inside_vm():
- Starts outer VM with inception kernel (CONFIG_KVM=y)
- Mounts host /mnt/fcvm-btrfs and fcvm binary into VM
- Runs fcvm inside outer VM to create nested inner VM
- Verifies inner VM outputs success message

This proves true nested virtualization works: fcvm → VM → fcvm → VM

Tested: Builds successfully
Previously the test had a hardcoded INCEPTION_KERNEL constant with a
specific SHA that would break whenever kernel/build.sh or its inputs
changed.

Now:
- kernel/build.sh requires KERNEL_PATH env var from caller (no longer
  computes SHA internally)
- tests/test_kvm.rs has inception_kernel_path() function that:
  - Reads kernel/build.sh + kernel/inception.conf + kernel/patches/*.patch
  - Computes SHA256 of combined content
  - Returns path: /mnt/fcvm-btrfs/kernels/vmlinux-{version}-{sha}.bin
- ensure_inception_kernel() builds the kernel if it doesn't exist

This means when build.sh or its inputs change, the test automatically
computes the new SHA and builds the kernel if needed.

Also removed unused generate_inception_config() function.
kernel/build.sh:
- Parse and apply all CONFIG_* options from inception.conf instead of
  hardcoding just a few (was missing CONFIG_TUN, CONFIG_VETH, netfilter)
- Update verification grep to include TUN and VETH in output

kernel/inception.conf:
- Add CONFIG_TUN and CONFIG_VETH for network device support
- Add comprehensive netfilter/nftables configs for bridged networking:
  CONFIG_NETFILTER, CONFIG_NF_TABLES*, CONFIG_NFT_*, CONFIG_IP_NF_*
- Add CONFIG_BRIDGE and CONFIG_BRIDGE_NETFILTER

tests/test_kvm.rs:
- Update test_inception_run_fcvm_inside_vm to detect nested KVM support
- Test KVM_CREATE_VM ioctl to verify if nested virtualization works
- Gracefully handle ARM64 + Firecracker limitation (no nested KVM)
- Pass test with informative message when nested KVM unavailable
- Updated step numbering and documentation

The inception tests now:
1. Build kernel with all required configs (KVM, FUSE, TUN, netfilter)
2. Verify outer VM has /dev/kvm accessible
3. Test if nested KVM actually works (KVM_CREATE_VM ioctl)
4. On ARM64 + Firecracker: pass with note about limitation
5. On supported platforms: proceed with full nested VM test

Tested: Both test_kvm_available_in_vm and test_inception_run_fcvm_inside_vm
pass on ARM64 with appropriate messaging about nested KVM limitation.
Enable KVM nested virtualization support to allow running fcvm inside fcvm
on ARM64 Graviton3 (c7g.metal) instances with FEAT_NV2 support.

Firecracker patches (patches/firecracker-nv2.patch):
- Enable KVM_ARM_VCPU_HAS_EL2 (bit 7) in vCPU init for nested virt
- Set PSTATE to EL2h (0x3c9) when HAS_EL2 is enabled
- Use SMC (not HVC) for PSCI when nested virt enabled - critical fix!
  HVC traps to guest EL2 which has no handler, SMC goes to host's KVM

Guest kernel boot parameters (src/commands/podman.rs):
- id_aa64mmfr1.vh=0: Override VHE detection for guest kernel
- kvm-arm.mode=nvhe: Force guest KVM to use nVHE mode
- numa=off: Avoid percpu allocation issues in nested context

Documentation (tests/test_kvm.rs):
- Detailed status of nested virt investigation
- Notes on KVM_CAP_ARM_EL2 (capability 240, not 236!)
- Hardware requirements: Graviton3/Neoverse-V1 with FEAT_NV2
- Current blocker: guest sees EL1 instead of EL2 when reading CurrentEL

Known issue: Despite PSTATE being set to EL2h after vCPU init, the guest
kernel's init_kernel_el() reads CurrentEL as EL1. Investigation ongoing
into KVM's exception level emulation for nested guests.

Tested: make test-root FILTER=inception (compiles, test shows KVM msgs)
- Forward FCVM_NV2 environment variable to Firecracker subprocess
  so the patched Firecracker can enable HAS_EL2 + HAS_EL2_E2H0
- Remove id_aa64mmfr1.vh=0 kernel cmdline override - the patched
  Firecracker handles VHE disabling via HAS_EL2_E2H0 flag instead

The patched Firecracker (in separate repo) sets VMPIDR_EL2, VPIDR_EL2,
HCR_EL2, and CNTHCTL_EL2 registers when FCVM_NV2=1 is set.
- Pass FCVM_NV2=1 to fcvm when --kernel flag is present
- Update test_kvm.rs documentation to reflect working NV2 implementation

The spawn_fcvm_with_logs helper now detects --kernel flag and
automatically sets FCVM_NV2=1, which makes Firecracker:
- Enable HAS_EL2 + HAS_EL2_E2H0 vCPU features
- Boot vCPU at EL2h so guest kernel sees HYP mode
- Set EL2 registers for timer access and nested virt

Tested: Nested KVM works - KVM_CREATE_VM succeeds inside guest VM
Check both stdout and stderr for success message since fcvm logs
container output with [ctr:stdout] prefix to its stderr stream.

Tested: test_inception_run_fcvm_inside_vm PASSED
Add section explaining:
- Hardware/software requirements (Graviton3+, kernel 6.18+)
- How NV2 works (FCVM_NV2, HAS_EL2, EL2h boot)
- Example commands for running inception
- Key Firecracker changes in fork
- Test commands
Document ARM64 NV2 support for running fcvm inside fcvm:
- Hardware/software requirements table
- Building inception kernel instructions
- Step-by-step guide to run inception
- Technical explanation of how NV2 works
- Testing commands
- Known limitations
Update fcvm to use Firecracker's new CLI flag for enabling nested
virtualization instead of passing the FCVM_NV2 environment variable.

When FCVM_NV2=1 is set, fcvm now passes --enable-nv2 to Firecracker
which properly sets up KVM_ARM_VCPU_HAS_EL2 vcpu features.

Tested: make test-root FILTER=inception passes
Clarify that FCVM_NV2=1 triggers fcvm to pass --enable-nv2 CLI flag
to Firecracker, rather than passing the env var directly.

Updated:
- README.md: How It Works section
- CLAUDE.md: How It Works section, example command
- tests/test_kvm.rs: Implementation notes
- tests/common/mod.rs: Comment on FCVM_NV2 usage
- Remove patches/firecracker-nv2.patch - outdated since Firecracker
  fork now uses --enable-nv2 CLI flag instead of hardcoded nested_virt
- Gate kvm-arm.mode=nvhe and numa=off boot params behind args.kernel
  check - these are only needed for inception (custom kernel) VMs
copy_file_range through FUSE requires kernel support (FUSE protocol 7.28+).
When the kernel returns EINVAL, ENOSYS, or EXDEV, skip the test gracefully
instead of failing. When kernel is updated to support this, test will
automatically start passing.

Tested: Test now passes with skip message on current kernel
Inception tests now automatically set up their prerequisites:

1. ensure_firecracker_nv2() - Clones ejc3/firecracker:nv2-inception,
   builds with `cargo build --release -p firecracker`, and installs
   to /usr/local/bin. Skips if `firecracker --help` already shows
   the --enable-nv2 flag.

2. ensure_inception_kernel() - Already existed, builds the inception
   kernel with CONFIG_KVM=y if not present.

Both tests (test_kvm_available_in_vm, test_inception_run_fcvm_inside_vm)
now call ensure_firecracker_nv2() before ensure_inception_kernel().

This allows running inception tests on a fresh system without manual setup.
The copy_file_range implementation was incorrectly treating FUSE file
handles as raw file descriptors. File handles in fuse-backend-rs are
opaque IDs that must be looked up in the internal handle map to get
the actual file descriptors.

Changes:
- fuse-pipe/src/server/passthrough.rs: Use self.inner.copy_file_range()
  instead of calling libc::copy_file_range directly with handles as FDs
- tests/test_fuse_copy_file_range_vm.rs: New test that verifies
  copy_file_range works through FUSE inside a VM with inception kernel
- fuse-pipe/tests/integration_root.rs: Remove skip logic, use assert
- fc-agent/src/main.rs: Use c"..." literal (clippy fix)
- tests/common/mod.rs: Use .contains() (clippy fix)
- tests/test_fuse_in_vm_matrix.rs: Use drop() instead of .unlock()
  which requires Rust 1.89+ (MSRV fix)

Tested: make test-root FILTER="copy_file_range"
The host kernel doesn't have the FUSE copy_file_range patch, so the
host-side test would always fail. The VM test (test_fuse_copy_file_range_vm)
uses the inception kernel which has the patch and is the proper way to test.
@ejc3 ejc3 changed the title Fix copy_file_range to use proper handle lookup ARM64 nested virtualization (inception) and copy_file_range fix Dec 27, 2025
@ejc3 ejc3 merged commit 21e056b into main Dec 28, 2025
0 of 4 checks passed
@ejc3 ejc3 deleted the fuse-test-prereqs branch December 28, 2025 03:44
ejc3 added a commit that referenced this pull request Mar 2, 2026
ARM64 nested virtualization (inception) and copy_file_range fix
ejc3 added a commit that referenced this pull request Mar 2, 2026
ARM64 nested virtualization (inception) and copy_file_range fix
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant