bmalph run can hang if output exceeds pipe buffer #170

@A9G-Data-Droid

Describe the bug

startRunDashboard() pipes stdio but never attaches data listeners, so any long-running loop eventually deadlocks once roughly 64 KiB of unread output accumulates in the kernel buffer behind the pipe (the exact threshold depends on the platform's pipe or socket buffer size).

spawnRalphLoop() in dist/run/ralph-process.js spawns ralph_loop.sh with stdio: ["ignore", "pipe", "pipe"] whenever dashboard mode is active (the default). run-dashboard.js then reads state exclusively from files (.ralph/progress.json, .ralph/live.log, etc.) and never attaches data / readable listeners to ralph.child.stdout or ralph.child.stderr, nor does it call .resume() on them. The only references to those streams are the .destroy() calls inside detach().

Because the shell writes a steady trickle (spinner lines via the log helper, colored echo -e ... >&2 messages, periodic claude JSON), the pipe buffers fill after roughly 10–30 minutes of runtime. At that point the next write(2) from bash blocks forever inside the kernel, and the whole loop freezes silently.

Observed symptoms:

  • Dashboard shows Progress: 0/0 (0%) with a spinner that never advances.
  • .ralph/progress.json is frozen on "status": "executing"; ralph.log stops getting new ⠸ Claude Code working... lines.
  • claude itself exits cleanly (result JSON is written to .ralph/logs/claude_output_*.log), but the next loop never starts.
  • ps shows the bash ./.ralph/ralph_loop.sh process in state S with no children, and its stdout/stderr fds both point to a socket: inode that the node parent isn't draining.
  • /proc/<pid>/wchan reports sock_alloc_send_pskb — the kernel is waiting for socket send-buffer space (node's child.stdout pipes on Linux are implemented as socketpairs, hence the "socket" fd).

To reproduce

Steps to reproduce the behavior:

  1. In any bmalph-initialized project, run bmalph run (dashboard mode — the default).
  2. Let the Ralph loop run long enough to emit more than ~64 KiB of cumulative stdout/stderr. In practice this takes roughly 10–30 minutes depending on spinner cadence and claude output volume.
  3. Observe that the dashboard freezes: progress stops advancing, live.log/ralph.log stop updating, and no new claude invocation is spawned even after the current one completes.
  4. ps -ef shows the ralph_loop bash process alive with no children; cat /proc/<pid>/wchan prints sock_alloc_send_pskb.

Expected behavior

The Ralph loop should continue advancing across iterations indefinitely regardless of how much output ralph_loop.sh emits. Piped stdio from the child should either be drained by the parent or not created in the first place.

Suggested fix — one of:

  • Minimal: change spawnRalphLoop() in dist/run/ralph-process.js to use stdio: options.inheritStdio ? "inherit" : ["ignore", "ignore", "ignore"]. The dashboard already reads everything it needs from files, and ralph_loop.sh mirrors its log helper to .ralph/logs/ralph.log, so nothing on disk is lost. The existing detach() path is safe because child.stdout.destroy() is guarded by if (child.stdout), and that object is null when stdio is "ignore".
  • Alternative: keep the pipes but attach drain listeners in startRunDashboard() (e.g. ralph.child.stdout?.on("data", () => {}); ralph.child.stderr?.on("data", () => {});) so the buffers keep flowing.
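Both options can be sketched side by side. This is an illustration under assumptions, not the actual bmalph source: stdioFor, drainChildOutput, and spawnLoop are hypothetical names, and only the inheritStdio option and the stdio values come from this report.

```javascript
const { spawn } = require("node:child_process");

// Minimal fix: never create pipes unless the caller wants inherited stdio.
function stdioFor(options = {}) {
  return options.inheritStdio ? "inherit" : ["ignore", "ignore", "ignore"];
}

// Alternative fix: if pipes exist, switch them to flowing mode so their data
// is read and discarded. The optional chaining makes this a no-op when the
// streams are null (stdio "ignore" or "inherit").
function drainChildOutput(child) {
  child.stdout?.resume();
  child.stderr?.resume();
}

// Hypothetical stand-in for spawnRalphLoop(): applies both safeguards, so a
// later change back to piped stdio would still not stall the child.
function spawnLoop(command, args, options = {}) {
  const child = spawn(command, args, { stdio: stdioFor(options) });
  drainChildOutput(child);
  return child;
}
```

For example, spawnLoop("bash", ["./.ralph/ralph_loop.sh"]) would start the loop with all three stdio slots ignored, and child.stdout and child.stderr would both be null. Applying both safeguards is a design choice for the sketch; either one alone addresses the hang described here.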

Environment

  • OS: Linux 6.12 (Docker sandbox on macOS host; also reproducible on bare Linux)
  • Node.js version: 20.19.4
  • bmalph version: 2.11.0
  • Shell: GNU bash 5.2.37

Additional context

Workaround until a release ships: patch the installed file in place.

--- a/dist/run/ralph-process.js
+++ b/dist/run/ralph-process.js
@@
-        stdio: options.inheritStdio ? "inherit" : ["ignore", "pipe", "pipe"],
+        stdio: options.inheritStdio ? "inherit" : ["ignore", "ignore", "ignore"],
