Problem
Agentic workflows using Copilot CLI (Node.js-based) and Codex fail at the very first step with "node: command not found" or "codex: command not found". The agent job is killed before any agent output is produced.
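Three workflows were affected in a single burst on 2026-04-27, 09:25–11:41 UTC:
- node: command not found (run §24986870660)
- node: command not found (run §24990655972)
- codex: command not found (run §24992928191)
Other Copilot CLI and Codex runs in the same 6-hour window succeeded, which indicates the failure is intermittent and specific to certain runner slots.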
Context
Original report: github/gh-aw#28726
The agent execution container (containers/agent/) uses selective bind mounts for host system directories (/usr, /bin, /sbin, /lib, /lib64, /opt), mounted read-only under /host/. The entrypoint.sh script chroots into /host before running the user command. If the runner slot that supplies these host paths does not have node or codex in a PATH directory (e.g., /usr/local/bin/node or /usr/bin/node), the binary is absent inside the chroot and the command fails immediately.
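To see which of the mounted directories actually provides an engine binary, a lookup along the following lines could be run inside the agent container before the chroot. This is a minimal sketch: the function name find_engine_under_root and the list of candidate directories are assumptions made for illustration, not code taken from containers/agent/entrypoint.sh.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): look for an engine
# binary under a root directory such as /host, checking the usual system
# bin directories that the bind mounts would expose.
find_engine_under_root() {
  root="$1"   # e.g. /host
  name="$2"   # e.g. node or codex
  # The candidate directories are an assumption based on the mounted paths.
  for dir in usr/local/bin usr/bin bin usr/sbin sbin; do
    candidate="$root/$dir/$name"
    # Before the chroot, an absolute symlink resolves against the current
    # root rather than $root, so a symlink is accepted as "present" as is.
    if [ -L "$candidate" ] || { [ -f "$candidate" ] && [ -x "$candidate" ]; }; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  return 1
}
```

For example, find_engine_under_root /host node prints the first matching path and returns 0, or prints nothing and returns 1 when the runner slot did not supply the binary.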
Root Cause
Runner slot(s) were provisioned without the engine binary present on the host filesystem during the 09:25–11:41 UTC window. Because entrypoint.sh in containers/agent/ chroots to /host and runs the user command directly, any binary missing from the host's PATH will also be missing inside the container. There is no pre-flight check in containers/agent/entrypoint.sh to verify that the resolved command binary actually exists before executing.
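The failure mode can be reproduced outside the container. The sketch below uses a temporary empty directory as a stand-in for a runner slot without the engine binary and records the exit status the shell reports when node cannot be resolved on PATH. The directory and variable names are illustrative only.

```shell
#!/bin/sh
# Stand-in for a runner slot whose PATH directories hold no engine binary.
empty_path_dir="$(mktemp -d)"

# Run the command with PATH restricted to the empty directory.
# /bin/sh is invoked by absolute path so the shell itself is still found.
status=0
PATH="$empty_path_dir" /bin/sh -c 'node --version' 2>/dev/null || status=$?

# POSIX shells report exit status 127 when a command is not found.
echo "exit status: $status"

rmdir "$empty_path_dir"
```

Because the only signal is status 127 plus the shell's generic message, nothing in the job output points at the missing host binary as the cause.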
Relevant files:
- containers/agent/entrypoint.sh: chroot and command execution logic
- src/docker-manager.ts: generates the Docker Compose config and bind mounts
- src/cli.ts: orchestrates container startup and command execution
Proposed Solution
- Add a pre-flight binary check in containers/agent/entrypoint.sh: Before executing the user command, resolve the binary with command -v (or which) inside the chroot and emit a clear diagnostic message if it is not found. Fail fast with exit code 127 and a human-readable error that points to the missing PATH setup, rather than letting the shell print a cryptic "command not found".
- Add a node --version / engine health-check step to the AWF wrapper CLI (src/cli.ts): After startContainers() and before runAgentCommand(), optionally run a lightweight check that the target binary is resolvable inside the agent container (e.g., docker exec awf-agent sh -c 'command -v node'; the sh -c wrapper is needed because command is typically a shell builtin rather than a standalone executable). If the check fails, abort with a clear error before wasting runner time.
- Document the runner requirement: Update README / docs/environment.md to state explicitly that the host runner must have the engine binary (e.g., node, codex) in a system PATH directory that is bind-mounted into the agent container (/usr, /bin, /sbin, /opt).
- Investigate intermittent provisioning: Work with the runner infrastructure team to determine whether a specific runner pool/label intermittently lacks node or codex on the host filesystem.
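The first two items could look roughly like the following. This is a sketch under stated assumptions, not the actual entrypoint.sh or cli.ts logic: the function names preflight_check and check_engine_in_container and the message wording are hypothetical, and the container name awf-agent is taken from the example in this issue.

```shell
#!/bin/sh
# Hypothetical pre-flight check (names are illustrative, not from the repo).
# Resolves a command name on PATH and returns 127 with a descriptive
# message if it cannot be found.
preflight_check() {
  bin="$1"
  if resolved="$(command -v "$bin" 2>/dev/null)"; then
    echo "[preflight] $bin resolved to $resolved" >&2
    return 0
  fi
  echo "[preflight] ERROR: '$bin' not found on PATH ($PATH)." >&2
  echo "[preflight] The host runner must provide it in a bind-mounted directory such as /usr or /opt." >&2
  return 127
}

# Hypothetical host-side variant: run the same lookup inside the agent
# container. The lookup is wrapped in sh -c because command is usually a
# shell builtin, and the binary name is passed as a positional argument.
check_engine_in_container() {
  container="$1"   # e.g. awf-agent
  bin="$2"         # e.g. node or codex
  docker exec "$container" sh -c 'command -v "$1"' sh "$bin"
}
```

Inside the entrypoint, the check might be invoked as preflight_check "$1" || exit $? immediately before the user command is executed, so the job log carries the diagnostic instead of a bare "command not found". The host-side variant only gives a meaningful answer if it runs with the same root and PATH that the entrypoint uses after the chroot, which would need to be confirmed against the real container setup.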
Generated by Firewall Issue Dispatcher