fix: pdeathsig, namespace timeout, arping for remaining flaky tests#541
fix: pdeathsig, namespace timeout, arping for remaining flaky tests#541
Conversation
send_gratuitous_arp() used `ping -c 1 -W 3` to trigger ARP resolution, but ping requires ICMP raw sockets and always failed in rootless mode (pasta doesn't respond to ICMP), burning 3 seconds on every restore. Replace with `arping -c 1 -w 1 -I eth0 <gateway>` which sends an ARP request at layer 2 and waits for a reply. Pasta responds to ARP, so this completes in ~30ms and confirms bidirectional L2 connectivity. fix: use pre_exec setns for rootless baselines (PR_SET_PDEATHSIG) Rootless baselines used nsenter to enter the user+net namespace, but nsenter's internal setns(CLONE_NEWUSER) clears PR_SET_PDEATHSIG (kernel zeros task->pdeath_signal on credential changes). This left Firecracker orphaned when fcvm was SIGKILL'd. Switch to pre_exec setns path (already used by clones) which sets pdeathsig AFTER entering the user namespace, ensuring Firecracker receives SIGKILL when its parent dies. fix: increase user namespace timeout from 5s to 30s Under heavy parallel test load, `unshare --user --net` can take >5s to create the namespace. The 5s timeout caused test_clone_connection tests to fail intermittently on loaded CI runners.
🔍 Claude ReviewSEVERITY: medium Findings[MEDIUM] Misleading commit message: The single commit [MEDIUM] Dead nsenter code path in [MEDIUM] Stale comments in
[MEDIUM] DESIGN.md stale: Line 472 says "Run Firecracker via nsenter: [MEDIUM] PR description overclaims: The description lists 3 fixes, but the arping change (fix #3) is not part of this PR — it was already merged on main. The PR contains 2 actual changes: the pdeathsig fix and the timeout increase. SummaryThe core pdeathsig fix (switching rootless baselines from nsenter to pre_exec setns) is well-motivated and the implementation is correct. However, the PR leaves dead code (the unreachable nsenter branch), stale comments, and stale documentation that will mislead future readers about how rootless Firecracker launch works. The commit message also doesn't describe the actual changes. Review by Claude | Run |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 4843c836ff
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| .context("reading own user namespace inode")? | ||
| .ino(); | ||
| let ns_deadline = std::time::Instant::now() + std::time::Duration::from_secs(5); | ||
| let ns_deadline = std::time::Instant::now() + std::time::Duration::from_secs(30); |
There was a problem hiding this comment.
Keep namespace readiness deadline consistent
setup_namespace_mappings() now waits up to 30s for /proc/<pid>/ns/user, but spawn_namespace_holder() still computes its absolute deadline from HOLDER_RETRY_TIMEOUT (5s) and passes that same deadline into wait_for_namespace_ready(). If namespace creation takes longer than 5s (the case this commit targets), wait_for_namespace_ready() starts with an already-expired deadline and will time out after the first failed probe, so rootless startup can still fail under load instead of getting the intended extended grace period.
Useful? React with 👍 / 👎.
After PR #541 switched rootless baselines to use pre_exec setns, the nsenter code path in VmManager::start() became unreachable (user_namespace_path is always set when holder_pid is set). Remove the dead branch and update all stale comments and DESIGN.md to reflect that ALL rootless VMs now use pre_exec setns. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
🔧 Auto-Fix CreatedI found issues and created a fix PR: #542 Please review and merge the fix PR first, then this PR. |
Namespace holder processes (unshare --user --net -- sleep infinity) had no PR_SET_PDEATHSIG set. When fcvm was SIGKILL'd, holders orphaned to init and accumulated — 380 found on the dev host from previous test runs. These leaked namespaces degrade the network stack and cause pasta forwarding failures in subsequent tests. - Add PR_SET_PDEATHSIG(SIGKILL) to holder spawn in common.rs - Add find_holder_for_fcvm() to test_signal_cleanup.rs - Extend test_sigkill_kills_firecracker_rootless to verify holder dies - Add bench diagnostics: dump clone log on HTTP failure for root-causing - Kill stale holders in clean-test-data Makefile target Tested: test_sigkill_kills_firecracker_rootless passes, verifies both firecracker and holder die via pdeathsig after SIGKILL.
🔍 Claude ReviewSEVERITY: medium Findings[MEDIUM] PR description does not cover second commit: Commit [MEDIUM] Dead nsenter branch + stale comments/docs: As noted in previous review, the dead nsenter code path (vm.rs:165-183), stale struct comments (vm.rs:39-41), and stale DESIGN.md step 5 (line 472) remain unfixed. Fix PR #542 addresses these but has not been merged yet. [LOW] SummaryThe second commit ( Review by Claude | Run |
After force-pushes or toolchain changes, cargo's cached build script outputs can reference nonexistent files. Cargo silently uses stale fingerprints and the test binaries fail with "No such file or directory" at runtime. Detect this by checking for build script output files before building and clean the cache if they're missing.
🔍 Claude ReviewSEVERITY: medium Findings[MEDIUM] PR description still does not cover commits 2 and 3: As noted in previous review, the PR description only covers the first commit's changes. Now there is a third commit ( [MEDIUM] Dead nsenter branch + stale comments/docs remain unfixed: As noted in first review, the dead nsenter code path (vm.rs:165-183), stale struct comments (vm.rs:39-41), and stale DESIGN.md step 5 (line 472) remain. Fix PR #542 addresses these but has not been merged yet. New Commit Analysis:
|
7599bb9 to
348aaaf
Compare
🔍 Claude ReviewSEVERITY: medium Findings[MEDIUM] PR description still does not cover commits 2, 3, and 4: As noted in previous review, the PR description only covers the first commit's changes. There are now 4 total commits but only the first is described. [MEDIUM] Dead nsenter branch + stale comments/docs remain unfixed: As noted in first review, these issues remain. Fix PR #542 is still open. [MEDIUM] Commit New Commit Analysis:
|
- Update rust-toolchain.toml from 1.92.0 to 1.93.0 - CI: use dtolnay/rust-toolchain@master with toolchain="" so it reads rust-toolchain.toml instead of hardcoding @stable - Self-hosted runners: install rustup with --default-toolchain none and let rust-toolchain.toml drive the version + components + targets - Single source of truth: rust-toolchain.toml controls everything
348aaaf to
364759d
Compare
🔍 Claude ReviewSEVERITY: medium FindingsNo new commits since previous review. All previously reported issues remain:
SummaryNo changes since last review. The core implementation is correct. Outstanding issues are the incomplete PR description, stale code/docs (tracked in PR #542), and the misleading commit message on the CI commit. Review by Claude | Run |
🔍 Claude ReviewSEVERITY: medium Findings[MEDIUM] New commit [MEDIUM] Dead nsenter branch + stale comments/docs remain unfixed: As noted in first review, the unreachable nsenter code path (vm.rs:165-183), stale comments, and stale DESIGN.md step 5 remain. Fix PR #542 is still open. [MEDIUM] PR description does not cover commits 2-5: As noted in previous reviews, the description only describes the first commit's changes. Commits for PR_SET_PDEATHSIG on holders, stale cargo cache detection, Rust 1.93.0 pinning, and vsock exec timeout are not described. New Commit Analysis:
|
🔧 Auto-Fix CreatedI found issues and created a fix PR: #543 Please review and merge the fix PR first, then this PR. |
When the bench's clone HTTP request fails, dump: - Exact connect error (refused vs timeout vs succeeded-but-no-data) - Listening sockets on the loopback IP (is pasta listening?) - All pasta processes (any stale?) - Stale process counts (sleep/pasta/firecracker) - Last 30 lines of clone log (what did fcvm actually do?) This will show us exactly why pasta fails on loaded CI runners.
🔍 Claude ReviewSEVERITY: medium FindingsNew commit Previously reported issues still open:
SummaryThe new commit is a clean diagnostic improvement — no issues. The three previously reported medium issues remain tracked in PRs #542 and #543 (both still open). The core implementation across all 6 commits is correct. Review by Claude | Run |
Bridged mode creates named network namespaces (fcvm-vm-*) with bridges, veths, and iptables rules. When fcvm is SIGKILL'd, the cleanup handler doesn't run and the namespace persists in /var/run/netns/. Found 188 stale namespaces across 7 CI runners, each holding kernel network objects that degrade pasta forwarding for subsequent tests. Also fixed the sleep infinity pkill to use pgrep -x -P 1 instead of pkill -f which can match the shell process running the command.
🔍 Claude ReviewSEVERITY: medium FindingsNew commits since last review: Commit Commit [MEDIUM] Remaining stale comments/docs not addressed by
[MEDIUM] PR description does not cover commits 2-8: As noted in previous review, the description only describes the first commit. There are now 8 total commits. SummaryThe two new commits are correct and well-motivated. Commit Review by Claude | Run |
🔧 Auto-Fix CreatedI found issues and created a fix PR: #544 Please review and merge the fix PR first, then this PR. |
b45e17b to
aa0e7cb
Compare
🔍 Claude ReviewSEVERITY: medium FindingsNew commit [MEDIUM] Remaining stale comments/docs not addressed by
[MEDIUM] PR description does not cover commits 2-8: As noted in previous reviews, the description only covers the first commit. There are now 8 total commits (pdeathsig on holders, stale cargo cache, Rust 1.93.0, vsock exec timeout, bench diagnostics, stale namespace cleanup, stale comment fixes). SummaryThe new commit is correct and addresses two previously flagged issues. The remaining stale comments/docs are tracked in open fix PR #544. Core implementation across all 8 commits remains correct — no new issues found. Review by Claude | Run |
aa0e7cb to
d2d8fab
Compare
🔍 Claude ReviewSEVERITY: medium FindingsNo new commits since previous review. All previously reported issues remain:
SummaryNo changes since last review. The core implementation across all 8 commits is correct — the pdeathsig fix, holder cleanup, namespace cleanup, bench diagnostics, and CI improvements are all well-motivated and properly implemented. The outstanding issues are tracked in open fix PR #544. Review by Claude | Run |
d2d8fab to
bce94ff
Compare
🔍 Claude ReviewSEVERITY: medium FindingsNo new commits since previous review. All previously reported issues remain:
SummaryNo changes since last review. The core implementation across all 8 commits is correct. Outstanding stale comments/docs are tracked in open fix PR #544. Review by Claude | Run |
bce94ff to
61eb802
Compare
Summary
Fixes 5 of 6 flaky tests from the latest main CI run (22565260533).
PR_SET_PDEATHSIG fix: Switch rootless baselines from nsenter to pre_exec setns path. nsenter's internal
setns(CLONE_NEWUSER)clearsPR_SET_PDEATHSIG(kernel zerostask->pdeath_signalon credential changes), leaving Firecracker orphaned when fcvm is SIGKILL'd. The pre_exec setns path (already used by clones) sets pdeathsig AFTER entering the user namespace.test_sigkill_kills_firecracker_rootlessNamespace timeout 5s→30s:
unshare --user --netcan take >5s under heavy parallel test load on CI runners. Increased from 5s to 30s.test_clone_connection_reset_rootless,test_clone_connection_timing_rootlessarping replaces ping: Already on main from PR fc-agent: clock sync, ss -K fix, --no-swap cgroup isolation, --no-dirty-tracking #536.
send_gratuitous_arp()usedping -c 1 -W 3which always failed in rootless (3s timeout). Now usesarping -c 1 -w 1 -I eth0 <gateway>which works at L2.test_clone_port_forward_rootlessNot fixed in this PR:
test_snapshot_clone_stress_100_bridged— Firecracker EINVAL from mount namespace with many baseline_dirs (separate issue)test_egress_clone_bridged— container wget timeout to host test server (pre-existing, not clone-related)Test Results