
Add TTY and interactive mode support (-it flags)#108

Merged
ejc3 merged 6 commits into main from tty-interactive-mode on Jan 6, 2026

Conversation

@ejc3
Owner

@ejc3 ejc3 commented Jan 5, 2026

Summary

Implements full PTY/interactive mode support for fcvm podman run and fcvm exec, matching docker/podman -i and -t flag semantics.

Changes

New Features

  • -i, --interactive: Keep stdin open (forward user input)
  • -t, --tty: Allocate pseudo-TTY (colors, line editing, full-screen apps)
  • -it: Both flags together for interactive shell

Implementation

  • Host side (src/commands/tty.rs): Raw terminal mode, binary exec_proto framing over vsock
  • Guest side (fc-agent/src/tty.rs): PTY allocation via openpty(), fork with PTY as controlling terminal

CLI Changes

  • Add -i/-t flags to fcvm podman run
  • Change exec name from positional to --name flag (fixes argument conflict)

Documentation

  • Add TTY architecture section to DESIGN.md
  • Add Interactive Mode section to README.md
  • Update all CLI examples to use --name flag

Test Plan

  • make test-root FILTER=tty - 4 TTY tests pass
  • make test-root FILTER=exec - 6 exec tests pass
  • Manual verification of README examples:
    • fcvm exec --name my-vm -- cat /etc/os-release
    • fcvm exec --name my-vm --vm -- hostname
    • fcvm exec --pid <PID> -- hostname
    • fcvm podman run ... alpine:latest echo "hello" ✅ (trailing args)

@ejc3 ejc3 changed the base branch from ls-show-loopback-ip to main January 5, 2026 19:26
Implement full PTY support matching docker/podman -i and -t flags:
- `-i`: Keep stdin open (forward user input)
- `-t`: Allocate pseudo-TTY (colors, line editing, vim)
- `-it`: Both (interactive shell)

Host side (src/commands/tty.rs):
- Raw terminal mode via tcsetattr/cfmakeraw
- Binary exec_proto framing over vsock
- Async reader/writer tasks for stdin/stdout

Guest side (fc-agent/src/tty.rs):
- PTY allocation via openpty()
- Fork child with PTY as controlling terminal
- Relay between vsock and PTY master

CLI changes:
- Add -i/-t flags to `fcvm podman run`
- Change exec `name` from positional to `--name` flag
- Fix argument conflict between --pid and trailing command

Documentation:
- Add TTY architecture section to DESIGN.md
- Add Interactive Mode section to README.md
- Update CLI examples to use --name flag

Tested:
- 4 TTY tests pass (tty, interactive, parallel stress)
- 6 exec tests pass
- Manual verification of README examples
@ejc3 ejc3 force-pushed the tty-interactive-mode branch from 8eb5e56 to d24ef7a Compare January 5, 2026 19:29
ejc3 added 2 commits January 5, 2026 19:30
- Remove needless return statement in fc-agent exec TTY handler
- Fix formatting in tty.rs and test_exec.rs
@ejc3 ejc3 mentioned this pull request Jan 5, 2026
3 tasks
@claude-claude
Contributor

claude-claude bot commented Jan 5, 2026

🔍 Claude Review

SEVERITY: critical

Findings

[CRITICAL] Terminal state not restored on signals (SIGTERM/SIGQUIT)

Location:

When the user presses Ctrl+\ (SIGQUIT) or the process receives SIGTERM, the terminal is left in raw mode. The user's shell becomes unusable (no echo, no line editing) until they manually run reset. SIGKILL has the same effect, but it cannot be caught, so no handler can cover that case.

Issue: No signal handlers are installed. Terminal restoration only happens at normal function exit.

Fix: Install signal handlers using signal-hook or similar to restore terminal state before exit.


[CRITICAL] Writer thread not joined - potential zombie threads

Location:

drop(writer_thread);  // ← DOES NOT JOIN!

Issue: drop(Option<JoinHandle>) detaches the thread instead of joining it. If the writer is blocked in poll(), it becomes a zombie thread holding resources.

Fix:

if let Some(handle) = writer_thread {
    handle.join().ok();
}
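
The difference between dropping and joining a handle can be sketched with std::thread alone. The names join_writer and spawn_writer below are hypothetical and are not taken from src/commands/tty.rs; the worker simply counts bytes as a stand-in for a real writer loop.

```rust
use std::thread::{self, JoinHandle};

/// Hypothetical helper: join an optional worker thread and report how it ended.
/// Dropping the handle instead would detach the thread and discard its result.
fn join_writer(handle: Option<JoinHandle<u32>>) -> Result<Option<u32>, String> {
    match handle {
        // No writer was started (for example, stdin was not forwarded).
        None => Ok(None),
        Some(h) => match h.join() {
            Ok(bytes) => Ok(Some(bytes)),
            // join() returns Err only if the thread panicked.
            Err(_) => Err("writer thread panicked".to_string()),
        },
    }
}

/// Stand-in for a writer loop; returns the number of bytes it was given.
fn spawn_writer(chunks: Vec<Vec<u8>>) -> JoinHandle<u32> {
    thread::spawn(move || chunks.iter().map(|c| c.len() as u32).sum())
}
```

Note that join() only returns once the thread has exited, so a writer blocked in poll() still needs some way to notice shutdown, such as a poll timeout combined with a done flag.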

[CRITICAL] Guest-side FD leaks on error paths

Location: fc-agent/src/tty.rs:112-122, 127-134

Multiple error paths fail to close file descriptors:

  1. vsock clone failure (line 120): Returns without closing master_fd (TTY mode) or stdout_read/stdin_write (pipe mode)
  2. dup() failure (line 134): Returns without closing master_fd

Over time, repeated failures exhaust the file descriptor limit.

Fix: Add explicit FD cleanup in error handlers.
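
One alternative to explicit close() calls on every error path is to hand each raw descriptor to an owning type right away. The sketch below uses std's OwnedFd; setup_relay and its simulate_clone_failure parameter are hypothetical and only mimic the two failure points named above.

```rust
use std::io;
use std::os::fd::{FromRawFd, OwnedFd, RawFd};

/// Hypothetical setup step. Takes ownership of `master_fd` immediately so that
/// every early return below closes it without an explicit close() call.
fn setup_relay(
    master_fd: RawFd,
    simulate_clone_failure: bool,
) -> io::Result<(OwnedFd, OwnedFd)> {
    // SAFETY: the caller passes an open descriptor that nothing else owns.
    let master = unsafe { OwnedFd::from_raw_fd(master_fd) };

    if simulate_clone_failure {
        // Stand-in for a failed vsock clone: `master` is dropped (and closed) here.
        return Err(io::Error::new(io::ErrorKind::Other, "vsock clone failed"));
    }

    // try_clone() duplicates the descriptor. If it fails, `?` returns early
    // and `master` is still closed by its destructor.
    let master_for_writer = master.try_clone()?;
    Ok((master, master_for_writer))
}
```

With this shape, adding a new fallible step later does not require remembering which descriptors are open at that point.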


[CRITICAL] waitpid return value ignored - zombie processes

Location: fc-agent/src/tty.rs:398-399

unsafe {
    libc::waitpid(child_pid, &mut status, 0);  // Return value ignored!
}

Issue: If waitpid() fails (EINTR, ECHILD), the code proceeds with an uninitialized status variable and reports a garbage exit code. The child may also become a zombie.

Fix:

let ret = unsafe { libc::waitpid(child_pid, &mut status, 0) };
if ret < 0 {
    eprintln!("[fc-agent] waitpid failed: {}", std::io::Error::last_os_error());
    return 1;
}
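
Once the return value is checked, the raw status still has to be decoded. A minimal sketch using std's Unix extension trait follows; exit_code_from_wait is a hypothetical helper, and the 128 + signal convention is an assumption borrowed from shells, not something this PR specifies.

```rust
use std::os::unix::process::ExitStatusExt;
use std::process::ExitStatus;

/// Hypothetical helper: turn the return value and raw status of waitpid()
/// into the exit code that should be reported to the host.
fn exit_code_from_wait(ret: i32, raw_status: i32) -> i32 {
    if ret < 0 {
        // waitpid failed, so `raw_status` was never written; do not decode it.
        return 1;
    }
    let status = ExitStatus::from_raw(raw_status);
    match (status.code(), status.signal()) {
        // Normal exit: pass the child's code through.
        (Some(code), _) => code,
        // Killed by a signal: shell convention (an assumption here) is 128 + signo.
        (None, Some(sig)) => 128 + sig,
        // Neither exited nor signaled (for example, stopped): treat as failure.
        (None, None) => 1,
    }
}
```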

[CRITICAL] Buffer allocation DoS vulnerability

Location: exec-proto/src/lib.rs:112

let mut payload = vec![0u8; len];  // Allocates immediately!

Issue: A malicious sender can send large length prefixes (up to 16MB limit) then disconnect without sending data. Multiple connections can exhaust memory.

Fix: Use progressive allocation or read-in-chunks to avoid allocating full buffer upfront.
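
A rough sketch of the read-in-chunks idea follows. It assumes a 4-byte big-endian length prefix, which may not match the real exec_proto wire format; MAX_PAYLOAD, CHUNK, and read_frame are illustrative names. Memory grows only as bytes actually arrive, so a peer that announces a large frame and then disconnects costs at most one chunk.

```rust
use std::io::{self, Read};

/// Assumed limits for illustration only.
const MAX_PAYLOAD: usize = 1024 * 1024;
const CHUNK: usize = 8 * 1024;

/// Read one frame, assuming a 4-byte big-endian length prefix
/// (an assumption; the real wire format is not shown on this page).
fn read_frame<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    reader.read_exact(&mut len_buf)?;
    let len = u32::from_be_bytes(len_buf) as usize;
    if len > MAX_PAYLOAD {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }

    // Grow the buffer only as data arrives, one bounded chunk at a time.
    let mut payload = Vec::new();
    let mut chunk = [0u8; CHUNK];
    while payload.len() < len {
        let want = CHUNK.min(len - payload.len());
        let n = match reader.read(&mut chunk[..want]) {
            Ok(0) => {
                return Err(io::Error::new(
                    io::ErrorKind::UnexpectedEof,
                    "peer closed mid-frame",
                ))
            }
            Ok(n) => n,
            // A signal interrupted the read; try again.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        payload.extend_from_slice(&chunk[..n]);
    }
    Ok(payload)
}
```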


Medium Issues

[MEDIUM] Socket cleanup race

Location: src/commands/tty.rs:24-43

If the process is killed between binding the Unix socket and cleanup, the socket file is orphaned. Mitigated by cleanup at start of next run, but still resource pollution.

[MEDIUM] tcsetattr failure ignored

Location: src/commands/tty.rs:149

Terminal restoration failure is silently ignored. Should at least log a warning.

[MEDIUM] Escape sequences unfiltered

Location: src/commands/tty.rs:172

Raw terminal data from guest is written directly to stdout. A compromised guest can send malicious ANSI/OSC sequences to manipulate the terminal, change title, or exploit terminal emulator bugs.

Recommendation: Document the risk or add escape sequence filtering for high-security environments.
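
If filtering were added, one narrow starting point might be to drop OSC sequences (ESC ] ... terminated by BEL or ESC \), which cover title changes and hyperlinks, while passing everything else through. The OscFilter type below is a hypothetical sketch, not a complete sanitizer; it keeps state between calls because a sequence can be split across two reads.

```rust
/// Parser state carried across calls to `filter`.
#[derive(Default, Clone, Copy, PartialEq)]
enum State {
    #[default]
    Normal,
    /// Saw ESC, waiting to learn which kind of sequence follows.
    Esc,
    /// Inside an OSC sequence; bytes are discarded.
    Osc,
    /// Inside OSC and saw ESC, which may begin the ESC \ terminator.
    OscEsc,
}

/// Hypothetical filter that strips OSC sequences from guest output.
#[derive(Default)]
struct OscFilter {
    state: State,
}

impl OscFilter {
    /// Returns the bytes that are safe to forward. A trailing lone ESC is
    /// held back until the next call decides what it introduces.
    fn filter(&mut self, input: &[u8]) -> Vec<u8> {
        let mut out = Vec::with_capacity(input.len());
        for &b in input {
            self.state = match (self.state, b) {
                (State::Normal, 0x1b) => State::Esc,
                (State::Normal, _) => {
                    out.push(b);
                    State::Normal
                }
                (State::Esc, b']') => State::Osc,
                (State::Esc, 0x1b) => {
                    // The earlier ESC was not an OSC introducer; emit it.
                    out.push(0x1b);
                    State::Esc
                }
                (State::Esc, _) => {
                    // Some other escape sequence (for example CSI): pass through.
                    out.extend_from_slice(&[0x1b, b]);
                    State::Normal
                }
                (State::Osc, 0x07) => State::Normal,
                (State::Osc, 0x1b) => State::OscEsc,
                (State::Osc, _) => State::Osc,
                (State::OscEsc, b'\\') => State::Normal,
                (State::OscEsc, 0x1b) => State::OscEsc,
                (State::OscEsc, _) => State::Osc,
            };
        }
        out
    }
}
```

Color and cursor sequences (CSI) are deliberately left intact here, since full-screen applications depend on them.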

[MEDIUM] Vsock disconnect returns exit code 0

Location: src/commands/tty.rs:190-201

Protocol errors cause the reader loop to exit, but default to exit code 0 (success). Should return 1 to indicate error.
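
The intended behavior can be sketched as a function where only an explicit exit message yields the guest's code. The Message enum and reader_exit_code below are hypothetical; the real exec_proto message types are not shown on this page.

```rust
/// Hypothetical message type standing in for decoded protocol frames.
enum Message {
    Stdout(Vec<u8>),
    Exit(i32),
}

/// Consume decoded frames and decide the process exit code.
fn reader_exit_code<I>(messages: I) -> i32
where
    I: IntoIterator<Item = Result<Message, String>>,
{
    for msg in messages {
        match msg {
            // A real loop would write these bytes to stdout.
            Ok(Message::Stdout(_)) => {}
            // The only path that reports the guest's own exit code.
            Ok(Message::Exit(code)) => return code,
            // Protocol error: fail rather than default to success.
            Err(_) => return 1,
        }
    }
    // The stream ended (EOF or disconnect) without an Exit message.
    1
}
```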


Summary

The TTY implementation has good overall design with proper binary framing protocol and thread coordination. However, there are critical gaps in cleanup and signal handling:

  1. Terminal corruption: User terminal breaks on abnormal exit (Ctrl+\, SIGTERM)
  2. Resource leaks: FD leaks and zombie threads/processes accumulate over time
  3. Security: Buffer allocation DoS and unfiltered escape sequences

These issues will significantly impact user experience and system stability under error conditions.

Recommended Priority:

  1. Add signal handlers for terminal restoration (affects all users)
  2. Fix thread joining and FD leaks (long-running stability)
  3. Address buffer allocation DoS (security)

Review by Claude | Run

claude-claude bot pushed a commit that referenced this pull request Jan 5, 2026
Fixes critical issues identified in PR #108 review:

1. **Thread joining**: Writer thread is now properly joined instead of
   being dropped/detached. Prevents zombie threads when the reader
   exits while writer is blocked in poll().

2. **FD leak fixes (guest)**: Added explicit FD cleanup on error paths:
   - vsock clone failure: now closes master_fd/pipe FDs before return
   - dup() failure: now closes master_fd before return
   Prevents FD exhaustion after repeated failures.

3. **waitpid error handling**: Check return value and handle failures
   (EINTR, ECHILD). Prevents zombie processes and garbage exit codes.

4. **Terminal restoration**: Check tcsetattr return value and log
   warnings on failure.

5. **Protocol error handling**: Return exit code 1 instead of 0 when
   vsock protocol errors occur.

6. **Stdout write errors**: Detect and handle stdout write failures
   in reader loop.

These fixes address resource leaks and improve error handling that
could accumulate over time and degrade system stability.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fixes from Claude review (PR #108 comment):

Host-side (src/commands/tty.rs):
- Add signal handlers (SIGTERM, SIGQUIT, SIGHUP) to restore terminal
  on abnormal exit - prevents terminal corruption
- Join writer thread properly instead of drop() - prevents zombie threads
- Log warning on tcsetattr failure instead of silent ignore
- Return exit code 1 on protocol errors/EOF instead of 0

Guest-side (fc-agent/src/tty.rs):
- Fix FD leaks on vsock.try_clone() failure (now closes master_fd or
  stdout_read/stdin_write depending on mode)
- Fix FD leak on dup() failure (now closes master_fd)
- Check waitpid return value and log error on failure

Protocol (exec-proto/src/lib.rs):
- Reduce max message size from 16MB to 1MB (sufficient for TTY)
- Read payload progressively in 8KB chunks instead of allocating full
  buffer upfront - prevents memory exhaustion on disconnect

Tested: cargo fmt --check && cargo clippy --all-targets -D warnings
@claude-claude
Contributor

claude-claude bot commented Jan 5, 2026

🔍 Claude Review

SEVERITY: critical

Findings

[CRITICAL] Signal handler is not async-signal-safe

Location: src/commands/tty.rs:31-37

The signal handler calls Mutex::lock(), which is NOT async-signal-safe per POSIX. If a signal arrives while the main thread holds the ORIG_TERMIOS mutex, this causes a deadlock.

From POSIX signal-safety requirements:

A signal handler can only call async-signal-safe functions. pthread_mutex_lock is NOT on this list.

Impact: Process hangs instead of terminating when user presses Ctrl+\ during terminal restoration.

Fix: Use atomic operations or a lock-free approach:

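// Option 1: Use static mut with atomic flag (unsafe but signal-safe)
use std::sync::atomic::{AtomicBool, Ordering};

static mut ORIG_FD: i32 = -1;
static mut ORIG_TERMIOS: libc::termios = unsafe { std::mem::zeroed() };
// Store `true` with Ordering::Release only after ORIG_FD and ORIG_TERMIOS are written.
static TERMIOS_SAVED: AtomicBool = AtomicBool::new(false);

extern "C" fn signal_handler(sig: libc::c_int) {
    if TERMIOS_SAVED.load(Ordering::Acquire) {
        unsafe {
            // addr_of! avoids creating a reference to a mutable static.
            libc::tcsetattr(ORIG_FD, libc::TCSANOW, std::ptr::addr_of!(ORIG_TERMIOS));
        }
    }
    unsafe {
        // Restore the default disposition and re-raise so the process
        // still terminates with the original signal.
        libc::signal(sig, libc::SIG_DFL);
        libc::raise(sig);
    }
}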

[CRITICAL] Using deprecated signal() API instead of sigaction()

Location: src/commands/tty.rs:40, 47-49

The code uses libc::signal(), which has undefined behavior per POSIX:

  • signal() behavior varies across platforms (BSD vs SysV semantics)
  • Re-raising with signal(sig, SIG_DFL); raise(sig) may not work correctly on all systems
  • No control over signal mask or SA_RESTART flags

From POSIX.1-2008:

The use of signal() is unspecified in a multithreaded process. Use sigaction().

Fix: Use sigaction() with proper flags:

let mut sa: libc::sigaction = unsafe { std::mem::zeroed() };
sa.sa_sigaction = signal_handler as usize;
sa.sa_flags = libc::SA_RESETHAND; // Auto-reset to SIG_DFL after first invocation
unsafe {
    libc::sigaction(libc::SIGTERM, &sa, std::ptr::null_mut());
    libc::sigaction(libc::SIGQUIT, &sa, std::ptr::null_mut());
    libc::sigaction(libc::SIGHUP, &sa, std::ptr::null_mut());
}

[MEDIUM] Confusing control flow in reader_loop

Location: src/commands/tty.rs:256-267

Lines 256-257 return Some(1) on EOF, but line 267 also returns Some(1) with comment "Should not reach here normally". This is confusing:

  • EOF without Exit message should probably return error (current behavior is correct)
  • But the unreachable code at line 267 suggests the author wasn't confident about control flow

Recommendation: Add unreachable!() or explain why line 267 is reachable:

    }
    // Should not reach here - all branches above return
    unreachable!("reader_loop: exited without return")
}

Summary

The TTY implementation has excellent improvements addressing all previous critical issues (thread joining, FD leaks, waitpid errors, DoS mitigation). However, the signal handling implementation is fundamentally broken:

  1. Mutex in signal handler → deadlock risk
  2. Using deprecated signal() → undefined behavior on some platforms

These issues only manifest under specific timing (signal during mutex hold) or on certain platforms, which is why they may not show up in testing.

Good news: The previous [CRITICAL] issues are all fixed! 🎉

  • ✅ Thread joining
  • ✅ FD cleanup
  • ✅ waitpid error handling
  • ✅ Buffer allocation DoS

Recommended action: Fix the signal-safety issues before merging. The terminal restoration logic is important for user experience.


Review by Claude | Run

@claude-claude
Contributor

claude-claude bot commented Jan 5, 2026

🔧 Auto-Fix Created

I found issues and created a fix PR: #116

The fix addresses critical signal-safety issues in the TTY signal handler that could cause deadlocks or undefined behavior.

Fixed Issues

  • [CRITICAL] Signal handler deadlock from Mutex::lock() in signal-handler context
  • [CRITICAL] Undefined behavior from using deprecated signal() API

Please review and merge the fix PR first, then this PR.

View Claude Run

Address review feedback from second Claude review:

1. Replace Mutex with static mut + atomic flag
   - Mutex::lock() is not async-signal-safe per POSIX
   - Could deadlock if signal arrives while mutex is held
   - Now uses static mut ORIG_FD/ORIG_TERMIOS_STORAGE with
     TERMIOS_SAVED atomic flag for synchronization
   - Uses std::ptr::addr_of!() to avoid reference to mutable static

2. Replace signal() with sigaction()
   - signal() has undefined behavior in multithreaded programs
   - sigaction() provides well-defined POSIX semantics
   - Use SA_RESETHAND flag for auto-reset to SIG_DFL

3. Improve reader_loop comment
   - Clarify when post-loop code is reached (done flag set externally)
   - Add debug log for this case

Tested: cargo fmt --check && cargo clippy --all-targets -D warnings
@claude-claude
Contributor

claude-claude bot commented Jan 5, 2026

🔍 Claude Review

SEVERITY: medium

Findings

[MEDIUM] Incorrect sigaction field used for signal handler

Location: src/commands/tty.rs:54

sa.sa_sigaction = signal_handler as usize;  // WRONG FIELD!

Issue: The code sets sa_sigaction but doesn't use the SA_SIGINFO flag. When SA_SIGINFO is absent, you must use sa_handler instead.

  • sa_handler: Used when SA_SIGINFO is NOT set, expects fn(c_int)
  • sa_sigaction: Used when SA_SIGINFO IS set, expects fn(c_int, *mut siginfo_t, *mut c_void)

The signal handler has the correct signature for sa_handler (line 34):

extern "C" fn signal_handler(_sig: libc::c_int)  // ← sa_handler signature

Impact: This is undefined behavior - may work on some platforms due to memory layout overlap, but is not portable and violates POSIX.

Fix:

// In C, struct sigaction overlays sa_handler and sa_sigaction in a union.
// Without SA_SIGINFO, the handler should be installed through sa_handler.
// Caveat: the Rust libc crate on Linux exposes this slot only as the
// sa_sigaction field, so the line below applies only to bindings that
// actually provide an sa_handler field:
sa.sa_handler = signal_handler as usize;  // Correct field

Summary

The TTY implementation has addressed all previous critical issues:

  • ✅ Thread joining fixed
  • ✅ FD cleanup on error paths fixed
  • ✅ waitpid error handling fixed
  • ✅ Buffer allocation DoS mitigated (progressive reads)
  • ✅ Signal handler async-signal-safety fixed (atomic + static mut)
  • ✅ Using sigaction() instead of deprecated signal()

The remaining issue is medium severity - while the current code may work on most platforms due to union memory layout, it's technically undefined behavior and should be corrected for portability.

Good work overall! The implementation is solid and all critical safety issues have been resolved. 🎉


Review by Claude | Run

@claude-claude
Contributor

claude-claude bot commented Jan 5, 2026

🔧 Auto-Fix Created

I found a portability issue and created a fix PR: #117

Issue Fixed

  • [MEDIUM] Using wrong sigaction union field (sa_sigaction instead of sa_handler)

This is a simple one-line fix for undefined behavior. While it may work on most platforms due to union layout, it's not portable per POSIX.

Please review and merge the fix PR first, then this PR.

View Claude Run

Replace raw libc calls with nix crate for cleaner, more idiomatic Rust:

- Use nix::sys::signal::{sigaction, SigAction, SigHandler, Signal}
  instead of raw libc::sigaction
- Use nix::sys::termios::{Termios, tcgetattr, tcsetattr, cfmakeraw}
  instead of raw libc termios calls
- Use nix::unistd::isatty instead of libc::isatty
- Use nix::fcntl for O_NONBLOCK handling

The signal handler still uses unsafe for accessing static mut, but the
rest of the code is now using safe Rust wrappers. The async-signal-safety
properties are preserved - tcsetattr is still async-signal-safe per POSIX.

Tested: cargo fmt --check && cargo clippy --all-targets -D warnings
@claude-claude
Contributor

claude-claude bot commented Jan 5, 2026

🔍 Claude Review

SEVERITY: none

Findings

All previous critical and medium issues have been successfully resolved! 🎉

The implementation now has:

  • ✅ Proper signal handling using nix crate with sigaction()
  • ✅ Async-signal-safe signal handler with atomics
  • ✅ Thread joining to prevent zombie threads
  • ✅ FD cleanup on all error paths
  • ✅ waitpid error handling
  • ✅ DoS mitigation with progressive buffer allocation

Minor observation (not requiring a fix):

[LOW] poll() doesn't retry on EINTR (src/commands/tty.rs:324)

The poll() call in writer_loop breaks on any error, including EINTR (interrupted by signal). POSIX recommends retrying on EINTR. However, this is already mitigated by the 1-second timeout loop, so the impact is minimal (at most 1 second delay on signal).

Not worth fixing given the workaround is already in place.
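
Should a retry be wanted later, it can be expressed as a small wrapper. The sketch below is generic over any io::Result-returning operation; retry_on_eintr is a hypothetical name, and it relies on std mapping EINTR to ErrorKind::Interrupted.

```rust
use std::io;

/// Hypothetical helper: re-run `op` for as long as it fails with EINTR.
/// Any other error, or success, is returned to the caller unchanged.
fn retry_on_eintr<T, F>(mut op: F) -> io::Result<T>
where
    F: FnMut() -> io::Result<T>,
{
    loop {
        match op() {
            // Interrupted by a signal before any work was done: try again.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            other => return other,
        }
    }
}
```

A caller that must also observe a shutdown flag would check it inside the closure, so that a signal meant to stop the loop is not swallowed by the retry.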

Summary

Excellent work addressing all the feedback! The TTY implementation is production-ready. The signal-safety refactoring using the nix crate is clean and correct. All resource leaks and error handling gaps from earlier reviews have been fixed.

Recommendation: Ready to merge


Review by Claude | Run

@claude-claude
Contributor

claude-claude bot commented Jan 6, 2026

⚠️ CI Failure Analysis - Workflow Permission Issue

I analyzed CI #20727735934 and identified two critical issues:

Problems Identified

  1. Container Job Failure: Runner version 2.321.0 doesn't support node24 (required by actions/checkout@v6)

    'using: node24' is not supported, use 'docker', 'node12', 'node16' or 'node20' instead.
    
  2. Host Job Failure: cargo command not found during make setup-fcvm

    /bin/bash: line 1: cargo: command not found
    make: *** [Makefile:178: build] Error 127
    

Solution Required

The fixes require changes to .github/workflows/ci.yml:

  • Downgrade all actions/checkout@v6 → actions/checkout@v4 (to use node20 instead of node24)
  • Add . "$HOME/.cargo/env" after rustup installation in Host/Host-Root jobs to ensure cargo is in PATH

Why I Couldn't Fix This Automatically

GitHub security restrictions prevent GitHub Apps (including Claude CI) from modifying workflow files without the workflows permission. The push was rejected:

refusing to allow a GitHub App to create or update workflow `.github/workflows/ci.yml` without `workflows` permission

Manual Fix Required

I've prepared the fix in commit 37deb81 on branch claude/fix-20736425909, but it cannot be pushed. You'll need to either:

  1. Apply these changes manually to the tty-interactive-mode branch
  2. Grant the GitHub App workflows permission (if possible)
  3. Cherry-pick commit 37deb81 from the local branch

View Claude Run

@ejc3 ejc3 merged commit 39469b7 into main Jan 6, 2026
15 of 17 checks passed
@ejc3 ejc3 deleted the tty-interactive-mode branch January 6, 2026 04:11
ejc3 added a commit that referenced this pull request Jan 6, 2026
Tests that SIGINT sent via terminal interrupt character (^C/0x03) through
a PTY is properly received by fcvm's tokio signal handler and triggers
clean VM shutdown.

This verifies the fix from PR #108 (TTY handling) works correctly when
Ctrl+C is pressed in a real terminal context.

Tested: make test FILTER=ctrlc (passes in ~7s)
ejc3 added a commit that referenced this pull request Mar 2, 2026
Fixes from Claude review (PR #108 comment):

Host-side (src/commands/tty.rs):
- Add signal handlers (SIGTERM, SIGQUIT, SIGHUP) to restore terminal
  on abnormal exit - prevents terminal corruption
- Join writer thread properly instead of drop() - prevents zombie threads
- Log warning on tcsetattr failure instead of silent ignore
- Return exit code 1 on protocol errors/EOF instead of 0

Guest-side (fc-agent/src/tty.rs):
- Fix FD leaks on vsock.try_clone() failure (now closes master_fd or
  stdout_read/stdin_write depending on mode)
- Fix FD leak on dup() failure (now closes master_fd)
- Check waitpid return value and log error on failure

Protocol (exec-proto/src/lib.rs):
- Reduce max message size from 16MB to 1MB (sufficient for TTY)
- Read payload progressively in 8KB chunks instead of allocating full
  buffer upfront - prevents memory exhaustion on disconnect

Tested: cargo fmt --check && cargo clippy --all-targets -D warnings
ejc3 added a commit that referenced this pull request Mar 2, 2026
Add TTY and interactive mode support (-it flags)
ejc3 added a commit that referenced this pull request Mar 2, 2026
Tests that SIGINT sent via terminal interrupt character (^C/0x03) through
a PTY is properly received by fcvm's tokio signal handler and triggers
clean VM shutdown.

This verifies the fix from PR #108 (TTY handling) works correctly when
Ctrl+C is pressed in a real terminal context.

Tested: make test FILTER=ctrlc (passes in ~7s)