Feature Request: Add --terminate-on-abnormal-exit option to pv-adverb
Summary
Add a new option --terminate-on-abnormal-exit[=SECONDS] that enables automatic cleanup of descendant processes when the primary command exits abnormally (crash), while preserving the current behavior of waiting for all descendants on normal exit.
Problem Statement
Currently, pv-adverb with --subreaper waits indefinitely for all descendant processes to exit. This is intentional and documented as preserving "Steam's traditional behaviour for native Linux games."
However, when a game crashes (is terminated by a signal or exits with a signal-like status), the container continues waiting for orphaned Wine processes like winedevice.exe that may never exit on their own. This leads to:
- Resource consumption - Orphaned processes consume CPU/memory indefinitely
- User confusion - Steam may show the game as "stopped" while processes persist
- Inability to relaunch - Some games can't start while a previous instance's processes still exist
- Manual intervention required - Users must manually kill processes or reboot
Real-World Example
Warhammer 40,000: Darktide (AppId: 1361210) on Proton 9.0 exhibits this behavior. When the game crashes, winedevice.exe processes persist indefinitely because:
- pv-adverb --subreaper adopts them
- --terminate-timeout defaults to -1.0 (disabled)
- No signal is ever sent to clean them up
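For context, the adoption step can be reproduced outside pressure-vessel. The following is a minimal, Linux-only sketch, not pv-adverb's actual code: the function name `subreaper_adopts_orphan` and the exit code 42 are made up for illustration. It marks the calling process as a child subreaper via `prctl(PR_SET_CHILD_SUBREAPER)`, lets a stand-in "primary command" exit while its helper is still running, and then shows that the helper can only be reaped by the subreaper.

```c
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define HELPER_EXIT_CODE 42  /* arbitrary marker value, illustration only */

/* Hypothetical demo helper: returns 1 if an orphaned grandchild was
 * reparented to (and reaped by) the caller, 0 if not, -1 on setup failure. */
int subreaper_adopts_orphan(void)
{
    int status = 0;
    int adopted = 0;

    /* Ask the kernel to reparent orphaned descendants to this process. */
    if (prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) != 0)
        return -1;

    pid_t primary = fork();
    if (primary < 0)
        return -1;

    if (primary == 0) {
        /* Stand-in for the primary command: start a helper, then exit
         * right away while the helper is still alive. */
        pid_t helper = fork();
        if (helper == 0) {
            struct timespec delay = { 0, 200 * 1000 * 1000 };  /* 200 ms */
            nanosleep(&delay, NULL);
            _exit(HELPER_EXIT_CODE);
        }
        _exit(helper < 0 ? 1 : 0);
    }

    /* Reap the primary command first. */
    if (waitpid(primary, &status, 0) != primary)
        return -1;

    /* The helper is not our direct child, so this wait only succeeds
     * if it was reparented to us when the primary command exited. */
    pid_t reaped = waitpid(-1, &status, 0);
    if (reaped > 0 && WIFEXITED(status)
        && WEXITSTATUS(status) == HELPER_EXIT_CODE)
        adopted = 1;

    /* Restore the default so the caller is left unchanged. */
    prctl(PR_SET_CHILD_SUBREAPER, 0, 0, 0, 0);
    return adopted;
}
```

In the sketch the helper exits on its own after 200 ms. A process such as winedevice.exe that never exits would leave the second `waitpid()` blocked forever, which is the situation described above.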
Related Issues
This request is related to but distinct from:
- GitLab Issue #54: "Pressure Vessel doesn't exit when game launch is canceled" - addresses cancellation, not crash cleanup
- Bubblewrap #529: --die-with-parent limitations
- steam-runtime-supervisor (MR !666): Related process supervision work
Proposed Solution
New Option: --terminate-on-abnormal-exit[=SECONDS]
Add an option that triggers --terminate-timeout behavior only when the primary command exits abnormally.
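Definition of "abnormal exit" (loosely modelled on systemd's on-abnormal semantics, extended to cover the 128 + signal exit code convention, which systemd itself does not treat as abnormal):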
- Process terminated by signal (WIFSIGNALED(status) is true)
- Exit code >= 128 (indicates signal: 128 + signal_number)
- Core dump generated
Behavior:
--terminate-on-abnormal-exit=30
- If COMMAND exits with code 0: Wait indefinitely for descendants (current behavior)
- If COMMAND is terminated by signal: Enable terminate-timeout=30, clean up descendants
- If COMMAND exits with code >= 128: Enable terminate-timeout=30, clean up descendants
Technical Consideration: Exit Code Propagation
Verified: Official Valve Proton 9.0 properly propagates exit codes. The proton script captures the return value from run_proc() in a variable rc and calls sys.exit(rc) at the end, ensuring exit codes flow through correctly.
Exit code chain verification:
- Wine follows the 128+signal convention (e.g., SIGSEGV → 139, SIGABRT → 134)
- Proton captures and propagates Wine's exit code via subprocess.call() → sys.exit(rc)
- pv-adverb receives the exit status via waitpid() and can check both WIFSIGNALED() and WEXITSTATUS()
This means the proposed feature would work correctly for:
- Native Linux games (direct exit code propagation)
- Proton/Wine games (exit codes propagated through the chain)
- Direct signal detection (WIFSIGNALED) as an additional safety check
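To make the difference between the two detection paths concrete, here is a small standalone sketch using only POSIX calls. The names `exit_kind`, `classify_wait_status` and `run_child` are hypothetical and are not part of pv-adverb. One child is killed by a signal, so `WIFSIGNALED()` is true. Another exits normally with status 128 + SIGSEGV, which is what a wrapper reports when it only passes the code along.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical classification mirroring the proposal's definition. */
enum exit_kind {
    EXIT_KIND_NORMAL,       /* exited with a status below 128 */
    EXIT_KIND_SIGNALED,     /* killed by a signal: WIFSIGNALED() is true */
    EXIT_KIND_SIGNAL_LIKE   /* exited normally with status >= 128 */
};

enum exit_kind classify_wait_status(int wait_status)
{
    if (WIFSIGNALED(wait_status))
        return EXIT_KIND_SIGNALED;
    if (WIFEXITED(wait_status) && WEXITSTATUS(wait_status) >= 128)
        return EXIT_KIND_SIGNAL_LIKE;
    return EXIT_KIND_NORMAL;
}

/* Illustration-only helper: fork a child that either kills itself with
 * `sig` (when sig > 0) or exits with `code`. Stores the raw wait status
 * in *status_out and returns 0, or returns -1 if fork/waitpid failed. */
int run_child(int sig, int code, int *status_out)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        if (sig > 0) {
            signal(sig, SIG_DFL);  /* make sure the default action applies */
            raise(sig);
        }
        _exit(code);
    }

    if (waitpid(pid, status_out, 0) != pid)
        return -1;
    return 0;
}
```

For example, `run_child(SIGKILL, 0, &st)` yields a status that classifies as `EXIT_KIND_SIGNALED`, while `run_child(0, 128 + SIGSEGV, &st)` classifies as `EXIT_KIND_SIGNAL_LIKE`. Exit codes 0 and 1 both classify as `EXIT_KIND_NORMAL` under this definition.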
Implementation Sketch
In adverb.c, after the child process exits:
    // After waitpid() returns
    if (opt_terminate_on_abnormal_exit >= 0.0)
    {
        gboolean is_abnormal = FALSE;

        if (WIFSIGNALED(wait_status))
        {
            // Killed by signal (SIGSEGV, SIGABRT, etc.)
            is_abnormal = TRUE;
            g_info("Child terminated by signal %d, enabling cleanup", WTERMSIG(wait_status));
        }
        else if (WIFEXITED(wait_status) && WEXITSTATUS(wait_status) >= 128)
        {
            // Exit code indicates signal (128 + signal_number)
            is_abnormal = TRUE;
            g_info("Child exited with signal-like status %d, enabling cleanup", WEXITSTATUS(wait_status));
        }

        if (is_abnormal && process_manager_options.terminate_grace_usec < 0)
        {
            // Enable termination timeout for cleanup
            process_manager_options.terminate_grace_usec =
                opt_terminate_on_abnormal_exit * G_TIME_SPAN_SECOND;
        }
    }

Command-Line Interface
    --terminate-on-abnormal-exit=SECONDS
        If the primary command exits abnormally (killed by signal or
        exit code >= 128), send SIGTERM to remaining descendant processes
        after --terminate-idle-timeout seconds, then SIGKILL after SECONDS.
        If 0.0, use SIGKILL immediately. Implies --subreaper.
        This preserves the traditional behavior of waiting for descendants
        on normal exit (code 0) while ensuring cleanup after crashes.
        [Default: -1.0, meaning disabled]
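The SIGTERM-then-SIGKILL sequence described in that help text could look roughly like the sketch below. It is deliberately simplified and makes several assumptions: it handles a single direct child rather than every remaining descendant, it omits the --terminate-idle-timeout delay, and the names `terminate_with_grace` and `spawn_idle_child` are invented here rather than taken from pv-adverb.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical sketch: send SIGTERM, poll for up to grace_seconds, then
 * fall back to SIGKILL. A grace period of 0.0 skips SIGTERM entirely.
 * Stores the wait status in *status_out; returns 0 on success, -1 on error. */
int terminate_with_grace(pid_t pid, double grace_seconds, int *status_out)
{
    const struct timespec tick = { 0, 10 * 1000 * 1000 };  /* 10 ms */

    if (grace_seconds > 0.0) {
        if (kill(pid, SIGTERM) != 0)
            return -1;

        for (double waited = 0.0; waited < grace_seconds; waited += 0.010) {
            pid_t r = waitpid(pid, status_out, WNOHANG);

            if (r == pid)
                return 0;  /* exited within the grace period */
            if (r < 0)
                return -1;
            nanosleep(&tick, NULL);
        }
    }

    /* Grace period expired or disabled: SIGKILL cannot be ignored. */
    if (kill(pid, SIGKILL) != 0)
        return -1;

    while (waitpid(pid, status_out, 0) != pid) {
        if (errno != EINTR)
            return -1;
    }
    return 0;
}

/* Illustration-only helper: start a child that idles forever, optionally
 * ignoring SIGTERM. The pipe lets the parent wait until the child has
 * set its signal disposition. Returns the child's pid, or -1 on error. */
pid_t spawn_idle_child(int ignore_sigterm)
{
    int ready[2];
    char byte;

    if (pipe(ready) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }

    if (pid == 0) {
        close(ready[0]);
        signal(SIGTERM, ignore_sigterm ? SIG_IGN : SIG_DFL);
        close(ready[1]);  /* EOF on the pipe tells the parent we are ready */
        for (;;)
            pause();
    }

    close(ready[1]);
    while (read(ready[0], &byte, 1) > 0) {
        /* drain until EOF */
    }
    close(ready[0]);
    return pid;
}
```

A child that ignores SIGTERM, as a stuck helper might, ends up being killed by SIGKILL once the grace period runs out. A well-behaved child exits on SIGTERM and never receives SIGKILL.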
Alternatives Considered
1. Just use --terminate-timeout
Games could be launched with --terminate-timeout=30. However, this changes the semantics for ALL exits, including normal ones where background processes might legitimately need to finish (e.g., save game sync, cloud upload).
The proposed --terminate-on-abnormal-exit option is additive - it only triggers on crashes while preserving the documented behavior for normal exits.
2. Steam client enables --terminate-timeout by default
This would solve the immediate problem but changes established behavior for all games. The conditional approach respects the design decision documented in adverb.1.md that games should wait for descendants.
3. Fix exit code propagation in Proton first
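As noted under "Technical Consideration: Exit Code Propagation", Proton 9.0 already propagates exit codes, so this is not a blocker there. For any other wrapper that does not, fixing propagation is a separate concern. The WIFSIGNALED() check works regardless, and this feature benefits native Linux games immediately.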
Benefits
- Backward compatible - Default is disabled, no change to existing behavior
- Follows established patterns - Similar to systemd's Restart=on-abnormal
- Minimal code change - Only adds exit status check before existing terminate logic
- Solves real user pain - Addresses orphaned process issues on crash
- Per-game configurable - Can be enabled only for games known to crash
- Works with existing infrastructure - Leverages process_manager_options.terminate_grace_usec
Use Case
Steam client could enable this for Proton games:
    pv-adverb \
        --exit-with-parent \
        --subreaper \
        --terminate-on-abnormal-exit=30 \
        -- proton waitforexitandrun game.exe

Or, if a matching environment variable were also added, users could set it via launch options:

    PRESSURE_VESSEL_TERMINATE_ON_ABNORMAL=30 %command%
Environment
- OS: Linux 6.14.0-36-generic
- Steam Runtime: SteamLinuxRuntime_sniper
- Proton: 9.0 (Beta)
- Affected Game: Warhammer 40,000: Darktide (1361210)
System Diagnostics
Steam Runtime: sniper 3.0.20250929.168600
pressure-vessel: 0.20250926.0
steam-runtime-tools scripts: 0.20250926.0
References
- systemd Restart=on-abnormal - Similar pattern in systemd
- Docker restart on-failure - Container restart policies
- Exit code 128+signal convention - Linux signal exit codes
- pv-adverb documentation - Current pv-adverb behavior
- Wine exit code handling - Wine's get_unix_exit_code function
- Proton exit code propagation - Proton script properly propagates exit codes via sys.exit(rc)
- Process completion status - GNU C Library documentation on WIFSIGNALED/WIFEXITED
Acknowledgments
I appreciate the complexity of process lifecycle management in containerized environments and recognize that:
- The current behavior is intentional and documented
- steam-runtime-supervisor development shows active investment in process supervision