Feature Request: Add --terminate-on-abnormal-exit option to pv-adverb #790

@drewrad8

Summary

Add a new option --terminate-on-abnormal-exit[=SECONDS] that enables automatic cleanup of descendant processes when the primary command exits abnormally (crash), while preserving the current behavior of waiting for all descendants on normal exit.

Problem Statement

Currently, pv-adverb with --subreaper waits indefinitely for all descendant processes to exit. This is intentional and documented as preserving "Steam's traditional behaviour for native Linux games."

However, when a game crashes (is killed by a signal, or exits with a signal-like status of 128 or higher), the container keeps waiting for orphaned Wine processes such as winedevice.exe that may never exit on their own. This leads to:

  1. Resource consumption - Orphaned processes consume CPU/memory indefinitely
  2. User confusion - Steam may show game as "stopped" while processes persist
  3. Inability to relaunch - Some games can't start while previous instance's processes exist
  4. Manual intervention required - Users must manually kill processes or reboot

Real-World Example

Warhammer 40,000: Darktide (AppId: 1361210) on Proton 9.0 exhibits this behavior. When the game crashes, winedevice.exe processes persist indefinitely because:

  • pv-adverb --subreaper adopts them
  • --terminate-timeout defaults to -1.0 (disabled)
  • No signal is ever sent to clean them up
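
The adoption described above can be reproduced outside Steam with a small Linux-only sketch. It assumes the same kernel mechanism a subreaper relies on, prctl(PR_SET_CHILD_SUBREAPER); the function name count_reaped_descendants is hypothetical and nothing here is taken from pv-adverb's source. The "primary command" exits immediately, yet the caller still has to wait for the helper it left behind:

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: become a subreaper, start a child that leaves a
 * grandchild behind, and count how many processes we end up reaping.
 * Returns the count, or -1 on error. */
static int
count_reaped_descendants (unsigned int grandchild_sleep_usec)
{
  int reaped = 0;
  pid_t child;

  /* usleep() is only specified for intervals below one second */
  assert (grandchild_sleep_usec < 1000000);

  if (prctl (PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) != 0)
    return -1;

  child = fork ();

  if (child < 0)
    return -1;

  if (child == 0)
    {
      /* "Primary command": spawn a helper, then exit without waiting */
      pid_t grandchild = fork ();

      if (grandchild == 0)
        {
          usleep (grandchild_sleep_usec);
          _exit (0);
        }

      _exit (grandchild < 0 ? 1 : 0);
    }

  /* With no terminate timeout, all we can do is block until every
   * descendant (including the adopted grandchild) has exited. */
  for (;;)
    {
      pid_t pid = waitpid (-1, NULL, 0);

      if (pid > 0)
        reaped++;
      else if (errno == ECHILD)
        break;
      else if (errno != EINTR)
        return -1;
    }

  return reaped;
}
```

If the grandchild never exits, as with a stuck winedevice.exe, the loop never finishes, which is the hang described in this issue.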

Related Issues

This request is related to but distinct from:

Proposed Solution

New Option: --terminate-on-abnormal-exit[=SECONDS]

Add an option that triggers --terminate-timeout behavior only when the primary command exits abnormally.

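Definition of "abnormal exit" (modelled on systemd's Restart=on-abnormal, which covers termination by signal but not exit codes; the exit-code check is an extension for wrappers that report a signal as an exit status):

  • Process terminated by signal (WIFSIGNALED(status) is true)
  • Exit code >= 128 (by shell convention, 128 + signal_number)
  • Core dump generated (a special case of termination by signal)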

Behavior:

--terminate-on-abnormal-exit=30
  • If COMMAND exits with code 0 (or any other code below 128): Wait indefinitely for descendants (current behavior)
  • If COMMAND is terminated by signal: Enable terminate-timeout=30, clean up descendants
  • If COMMAND exits with code >= 128: Enable terminate-timeout=30, clean up descendants

Technical Consideration: Exit Code Propagation

Verified: Official Valve Proton 9.0 properly propagates exit codes. The proton script captures the return value from run_proc() in a variable rc and calls sys.exit(rc) at the end, ensuring exit codes flow through correctly.

Exit code chain verification:

  1. Wine follows the 128+signal convention (e.g., SIGSEGV → 139, SIGABRT → 134)
  2. Proton captures and propagates Wine's exit code via subprocess.call() → sys.exit(rc)
  3. pv-adverb receives the exit status via waitpid() and can check both WIFSIGNALED() and WEXITSTATUS()
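
To make the 128 + signal arithmetic in step 1 concrete, here is a minimal sketch of how a wrapper might turn a waitpid() status into an exit code. It illustrates the general shell convention only and is not Wine's or Proton's actual code; status_to_exit_code and run_and_translate are hypothetical names.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Translate a waitpid() status into an exit code the way shells do:
 * pass ordinary exit codes through, map death by signal N to 128 + N. */
static int
status_to_exit_code (int wait_status)
{
  if (WIFEXITED (wait_status))
    return WEXITSTATUS (wait_status);

  assert (WIFSIGNALED (wait_status));
  return 128 + WTERMSIG (wait_status);
}

/* Hypothetical helper: run a child that either raises signal_number
 * (if > 0) or exits with exit_code, and return the translated code.
 * Returns -1 if the child could not be started or waited for. */
static int
run_and_translate (int signal_number, int exit_code)
{
  int wait_status = 0;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;

  if (pid == 0)
    {
      if (signal_number > 0)
        {
          /* Make sure the signal is fatal even if we inherited SIG_IGN */
          signal (signal_number, SIG_DFL);
          raise (signal_number);
        }

      _exit (exit_code);
    }

  if (waitpid (pid, &wait_status, 0) != pid)
    return -1;

  return status_to_exit_code (wait_status);
}
```

On Linux, SIGSEGV is 11 and SIGABRT is 6, which gives the 139 and 134 quoted above. Once a wrapper has done this translation, the parent sees a normal exit with a high status rather than WIFSIGNALED(), which is why the proposal checks both.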

This means the proposed feature would work correctly for:

  • Native Linux games (direct exit code propagation)
  • Proton/Wine games (exit codes propagated through the chain)
  • Direct signal detection (WIFSIGNALED) as an additional safety check

Implementation Sketch

In adverb.c, after the child process exits:

// After waitpid() returns
if (opt_terminate_on_abnormal_exit >= 0.0)
{
    gboolean is_abnormal = FALSE;

    if (WIFSIGNALED(wait_status))
    {
        // Killed by signal (SIGSEGV, SIGABRT, etc.)
        is_abnormal = TRUE;
        g_info("Child terminated by signal %d, enabling cleanup", WTERMSIG(wait_status));
    }
    else if (WIFEXITED(wait_status) && WEXITSTATUS(wait_status) >= 128)
    {
        // Exit code indicates signal (128 + signal_number)
        is_abnormal = TRUE;
        g_info("Child exited with signal-like status %d, enabling cleanup", WEXITSTATUS(wait_status));
    }

    if (is_abnormal && process_manager_options.terminate_grace_usec < 0)
    {
        // Enable termination timeout for cleanup
        process_manager_options.terminate_grace_usec =
            opt_terminate_on_abnormal_exit * G_TIME_SPAN_SECOND;
    }
}
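
The classification in the sketch above can also be tried outside pv-adverb. The following GLib-free version is only a test harness for the same two checks; is_abnormal_exit and wait_status_of_child are hypothetical names, not existing functions in adverb.c.

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Same rule as the sketch: killed by a signal, or exited with a
 * status that conventionally encodes a signal (128 + N). */
static bool
is_abnormal_exit (int wait_status)
{
  if (WIFSIGNALED (wait_status))
    return true;

  return WIFEXITED (wait_status) && WEXITSTATUS (wait_status) >= 128;
}

/* Hypothetical helper: run a child that raises signal_number (if > 0)
 * or exits with exit_code, and return its raw wait status. */
static int
wait_status_of_child (int signal_number, int exit_code)
{
  int wait_status = 0;
  pid_t pid;
  pid_t reaped;

  assert (exit_code >= 0 && exit_code <= 255);

  pid = fork ();
  assert (pid >= 0);

  if (pid == 0)
    {
      if (signal_number > 0)
        raise (signal_number);

      _exit (exit_code);
    }

  reaped = waitpid (pid, &wait_status, 0);
  assert (reaped == pid);
  return wait_status;
}
```

Under this rule an ordinary failure such as exit code 1 still counts as a normal exit, so descendants are waited for exactly as they are today.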

Command-Line Interface

--terminate-on-abnormal-exit=SECONDS
    If the primary command exits abnormally (killed by signal or
    exit code >= 128), send SIGTERM to remaining descendant processes
    after --terminate-idle-timeout seconds, then SIGKILL after SECONDS.
    If 0.0, use SIGKILL immediately. Implies --subreaper.
    This preserves the traditional behavior of waiting for descendants
    on normal exit (code 0) while ensuring cleanup after crashes.
    [Default: -1.0, meaning disabled]
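
The SIGTERM-then-SIGKILL sequence in the help text can be sketched for a single process as follows. This is a simplified illustration under stated assumptions: it signals one known child rather than every descendant, it polls instead of using pv-adverb's existing machinery, and terminate_with_grace and spawn_sleeper are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Send SIGTERM to pid, wait up to grace_ms for it to exit, then send
 * SIGKILL. A grace period of 0 skips SIGTERM and kills immediately.
 * Returns the signal that ended the process, 0 if it exited by
 * itself, or -1 on error. */
static int
terminate_with_grace (pid_t pid, int grace_ms)
{
  const struct timespec tick = { 0, 10 * 1000 * 1000 };  /* 10 ms */
  int wait_status = 0;
  int waited_ms;

  if (pid <= 0)
    return -1;  /* never signal a process group or "everything" */

  if (grace_ms > 0)
    {
      if (kill (pid, SIGTERM) != 0)
        return -1;

      for (waited_ms = 0; waited_ms < grace_ms; waited_ms += 10)
        {
          pid_t reaped = waitpid (pid, &wait_status, WNOHANG);

          if (reaped == pid)
            return WIFSIGNALED (wait_status) ? WTERMSIG (wait_status) : 0;

          if (reaped < 0)
            return -1;

          nanosleep (&tick, NULL);
        }
    }

  if (kill (pid, SIGKILL) != 0 || waitpid (pid, &wait_status, 0) != pid)
    return -1;

  return WIFSIGNALED (wait_status) ? WTERMSIG (wait_status) : 0;
}

/* Hypothetical helper: start a child that sleeps forever, optionally
 * ignoring SIGTERM. The disposition is set before fork() so the child
 * inherits it without a race, then restored in the parent. */
static pid_t
spawn_sleeper (int ignore_sigterm)
{
  void (*previous) (int) = signal (SIGTERM, ignore_sigterm ? SIG_IGN : SIG_DFL);
  pid_t pid = fork ();

  if (pid == 0)
    {
      for (;;)
        pause ();
    }

  signal (SIGTERM, previous);
  return pid;
}
```

A cooperative process is ended by SIGTERM within the grace period, while one that ignores SIGTERM is only removed by the final SIGKILL.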

Alternatives Considered

1. Just use --terminate-timeout

Games could be launched with --terminate-timeout=30. However, this changes the semantics for ALL exits, including normal ones where background processes might legitimately need to finish (e.g., save game sync, cloud upload).

The proposed --terminate-on-abnormal-exit option is additive - it only triggers on crashes while preserving the documented behavior for normal exits.
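
One way to picture how the two options would combine is as a pure function, following the precedence in the implementation sketch: an explicit --terminate-timeout always wins, and the new option only fills in when that is disabled and the exit was abnormal. The function name effective_grace_usec and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define USEC_PER_SECOND INT64_C (1000000)

/* Returns the grace period in microseconds, or -1 if descendants
 * should not be signalled at all (the current default behavior).
 * Negative timeouts mean "option not set". */
static int64_t
effective_grace_usec (double terminate_timeout,
                      double abnormal_timeout,
                      bool exited_abnormally)
{
  /* An explicit --terminate-timeout applies to every exit */
  if (terminate_timeout >= 0.0)
    return (int64_t) (terminate_timeout * USEC_PER_SECOND);

  /* The proposed option applies only after an abnormal exit */
  if (exited_abnormally && abnormal_timeout >= 0.0)
    return (int64_t) (abnormal_timeout * USEC_PER_SECOND);

  return -1;
}
```

With --terminate-on-abnormal-exit=30 and no --terminate-timeout, a crash yields 30000000 microseconds and a normal exit yields -1, so normal exits are unaffected.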

2. Steam client enables --terminate-timeout by default

This would solve the immediate problem but changes established behavior for all games. The conditional approach respects the design decision documented in adverb.1.md that games should wait for descendants.

3. Fix exit code propagation in Proton first

Proton 9.0 already appears to propagate exit codes (see Exit Code Propagation above), so any gaps that remain there are a separate concern. The WIFSIGNALED() check works regardless, and this feature benefits native Linux games immediately.

Benefits

  1. Backward compatible - Default is disabled, no change to existing behavior
  2. Follows established patterns - Similar to systemd's Restart=on-abnormal
  3. Minimal code change - Only adds exit status check before existing terminate logic
  4. Solves real user pain - Addresses orphaned process issues on crash
  5. Per-game configurable - Can be enabled only for games known to crash
  6. Works with existing infrastructure - Leverages process_manager_options.terminate_grace_usec

Use Case

Steam client could enable this for Proton games:

pv-adverb \
    --exit-with-parent \
    --subreaper \
    --terminate-on-abnormal-exit=30 \
    -- proton waitforexitandrun game.exe

Or users could enable it via launch options, assuming a matching environment variable were added (the name below is illustrative):

PRESSURE_VESSEL_TERMINATE_ON_ABNORMAL=30 %command%

Environment

  • OS: Linux 6.14.0-36-generic
  • Steam Runtime: SteamLinuxRuntime_sniper
  • Proton: 9.0 (Beta)
  • Affected Game: Warhammer 40,000: Darktide (1361210)

System Diagnostics

Steam Runtime: sniper 3.0.20250929.168600
pressure-vessel: 0.20250926.0
steam-runtime-tools scripts: 0.20250926.0

References

Acknowledgments

I appreciate the complexity of process lifecycle management in containerized environments and recognize that:

  • The current behavior is intentional and documented
  • steam-runtime-supervisor development shows active investment in process supervision
  • This request aims to complement existing work, not replace it
