Executive Summary
7 failures in the 6h window (2026-05-01T11:32–12:28Z) across 6 workflows. Three distinct root-cause clusters identified:
- Cluster A (GitHub API rate limiting, 4/7 runs): Concurrent safe-output writes exhausted the installation-token quota; create_issue, add_labels, and lock issue all failed after retries.
- Cluster B (missing CLI binaries in AWF chroot, 2/7 runs): codex not on PATH (Codex engine) and Node.js unavailable (Copilot CLI engine), both resulting in exit 127.
- Cluster C (Python pip dependency failure, 1/7 runs): scipy failed to generate package metadata; the follow-on pip install pandas matplotlib seaborn timed out after 2× 120 s waits.
One sub-issue for Cluster A (P0) is linked below. Clusters B and C are documented here as P1 and P2 for follow-up.
Failure Cluster Table
| Run ID | Workflow | Engine | Cluster | Conclusion | Run URL |
|---|---|---|---|---|---|
| 25212818396 | Daily Fact About gh-aw | Codex | B | exit 127 (codex not found) | §25212818396 |
| 25213299148 | Daily Skill Optimizer Improvements | Copilot | A | create_issue rate-limit after 3 retries | §25213299148 |
| 25213666728 | AI Moderator | Codex | A | add_labels rate-limit | §25213666728 |
| 25213669352 | Step Name Alignment | Claude | A | create_issue rate-limit after 3 retries | §25213669352 |
| 25213746690 | Daily Issues Report Generator | Copilot | B | exit 127 (Node.js not in chroot) | §25213746690 |
| 25213787885 | GitHub MCP Structural Analysis | Claude | C | scipy install error → pandas timeout | §25213787885 |
| 25214243935 | AI Moderator | Codex | A | lock issue rate-limit | §25214243935 |
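The cluster assignments in the table can be reproduced mechanically from the log signatures quoted under Evidence. The sketch below is one possible way to do that; the `SIGNATURES` table and the `classify` and `tally` helpers are hypothetical names introduced for illustration, and the patterns are copied from the log excerpts in this report rather than from any existing triage tool.

```python
import re
from collections import Counter

# Hypothetical signature table. Each pattern is a substring of a log line
# quoted in this report; the letters follow the cluster names used here.
SIGNATURES = [
    ("A", re.compile(r"API rate limit exceeded for installation")),
    ("B", re.compile(r"command not found|is not available inside AWF chroot")),
    ("C", re.compile(r"Encountered error while generating package metadata")),
]


def classify(log_text: str) -> str:
    """Return the first matching cluster letter, or "unknown"."""
    for cluster, pattern in SIGNATURES:
        if pattern.search(log_text):
            return cluster
    return "unknown"


def tally(logs: dict) -> Counter:
    """Count runs per cluster; `logs` maps a run ID to its log text."""
    return Counter(classify(text) for text in logs.values())
```

Runs that match no signature come back as "unknown", so a new failure mode shows up in the tally instead of being folded into an existing cluster.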
Evidence
Cluster A — GitHub API Rate Limiting (4 runs)
All four failures share the same error pattern from the safe_outputs workflow step:
API rate limit exceeded for installation. request ID ...
timestamp 2026-05-01 12:09:07 UTC
Timeline: burst of concurrent runs started between 12:05–12:10 UTC; by 12:09 UTC the installation token was exhausted. A second isolated hit at 12:28 UTC (AI Moderator on issue_comment event) shows the token hadn't fully recovered.
Affected operations:
- create_issue (Daily Skill Optimizer, Step Name Alignment): 3 retries each, ~90 s total wait, still failed
- add_labels (AI Moderator): failed on the first attempt, and no retry succeeded
- lock issue (AI Moderator lock workflow): pre-activation step failed
All failures are in the safe-outputs processing layer, not the agent itself. Agent work was completed successfully in all four cases.
Comparator: Copilot CLI Deep Research Agent (§25213682014) started at 12:06Z and succeeded — it used the noop safe output which requires no write API calls, so it was unaffected.
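Since roughly 90 s of retries was not enough for the quota to recover, one option is to let the server's own hints drive the wait. The sketch below is an illustration only, not the safe-outputs handler's actual logic: `wait_seconds` is a hypothetical helper, and it assumes the handler can see the response headers with lowercased names. It relies on the `retry-after` and `x-ratelimit-reset` headers that GitHub documents for rate-limited REST responses and falls back to exponential backoff with full jitter when neither is present.

```python
import random
import time


def wait_seconds(headers, attempt, now=None, base=2.0, cap=300.0):
    """Pick a delay (in seconds) before retrying a rate-limited write.

    headers: response headers with lowercased names (an assumption).
    attempt: zero-based retry counter.
    """
    now = time.time() if now is None else now

    # Server-directed wait: honor it as-is.
    if "retry-after" in headers:
        return float(headers["retry-after"])

    # Quota exhausted: wait until the advertised reset time (epoch seconds).
    if headers.get("x-ratelimit-remaining") == "0" and "x-ratelimit-reset" in headers:
        return max(0.0, float(headers["x-ratelimit-reset"]) - now)

    # No hint: exponential backoff with full jitter, capped at `cap`.
    return random.uniform(0.0, min(cap, base * 2 ** attempt))
```

The jitter matters for the burst described above: concurrent runs that all retry on the same fixed schedule would hit the API together again. A caller would still need its own overall time budget, because the reset time can be far longer than a job should wait.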
Cluster B — Missing CLI binaries in AWF chroot (2 runs)
Daily Fact About gh-aw (Codex engine, §25212818396):
/bin/bash: line 1: codex: command not found
Process exiting with code: 127
The entrypoint tries to run codex exec but the binary is absent from PATH inside the chroot. The run never reached the agent.
Daily Issues Report Generator (Copilot CLI engine, §25213746690):
[entrypoint][ERROR] Copilot CLI requires Node.js, but 'node' is not available inside AWF chroot.
[entrypoint][ERROR] Ensure Node.js is installed on the runner and reachable from PATH inside the chroot.
Process exiting with code: 127
The Copilot CLI harness detects Node.js is missing and exits cleanly with code 127. The runner likely uses a different image or lost a cached tool.
Both failures are pre-agent (harness launch failures) with zero tokens consumed.
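Both failures could be caught by one check that runs before the engine is launched. The following is a minimal sketch under stated assumptions: the `REQUIRED` mapping, the engine keys, and the `[preflight]` message format are hypothetical and do not reflect the real entrypoint. It only uses `shutil.which` from the Python standard library to test whether a name resolves on PATH.

```python
import shutil

# Assumed mapping of engine name to the executables it needs on PATH.
REQUIRED = {
    "codex": ["codex"],
    "copilot": ["node"],
}


def missing_binaries(engine, which=shutil.which):
    """Return the required executables that do not resolve on PATH."""
    return [name for name in REQUIRED.get(engine, []) if which(name) is None]


def preflight(engine, which=shutil.which):
    """Return (exit_code, message); 127 mirrors the shell's 'not found' code."""
    missing = missing_binaries(engine, which)
    if missing:
        names = ", ".join(missing)
        return 127, f"[preflight][ERROR] engine '{engine}' needs {names} on PATH inside the chroot"
    return 0, "[preflight] ok"
```

Passing `which` as a parameter keeps the check testable without touching the real PATH, and a single check for every engine would give the Codex failure the same actionable message the Copilot CLI harness already prints.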
Cluster C — Python pip dependency failure (1 run)
GitHub MCP Structural Analysis (Claude engine, §25213787885):
The workflow installs Python packages at runtime. scipy fails first:
× Encountered error while generating package metadata.
╰─> scipy
note: This is an issue with the package mentioned above, not pip.
The agent recovers and tries pip install pandas matplotlib seaborn. This runs in the background and the agent polls it twice with a 120 s timeout each time, eventually timing out. Total runtime: 17.4 min before the workflow failed.
Root cause: the scipy build depends on native compilation (Fortran/C) tools not available in the sandbox. The pandas/matplotlib/seaborn install did not finish within the two 120 s polls (240 s total), which points to either a slow download or missing binary wheels forcing source builds.
Remediation: pre-install or pin dependencies in the workflow setup, or use a requirements file with only binary-wheel packages.
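The wheel-only option can be sketched as a setup-step helper. This is an illustration, not the workflow's actual setup: `pip_command` and `install` are hypothetical names and the 600 s timeout is an arbitrary placeholder. It relies on pip's `--only-binary=:all:` option, which makes pip refuse source distributions, so a package without a matching wheel fails quickly instead of starting a native build.

```python
import subprocess
import sys


def pip_command(packages, wheels_only=True):
    """Build a pip install command for the current interpreter."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if wheels_only:
        # Refuse sdists so nothing tries to compile Fortran/C in the sandbox.
        cmd.append("--only-binary=:all:")
    return cmd + list(packages)


def install(packages, timeout_s=600):
    """Run the install in the foreground; return True on success."""
    try:
        result = subprocess.run(
            pip_command(packages),
            timeout=timeout_s,
            capture_output=True,
            text=True,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0
```

Running the install in a setup step with a single explicit timeout, rather than in the background with repeated polls from the agent, keeps dependency failures out of the agent's token budget.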
Existing Issue Correlation
Unable to read existing open issues (GitHub API not authenticated in this context). Sub-issues are created de novo. Reviewers should check for duplicates against existing agentic-workflows issues for Clusters B and C.
Proposed Fix Roadmap
| Priority | Cluster | Fix |
|---|---|---|
| P0 | A: Rate limiting | Stagger concurrent workflow trigger times to avoid burst; add rate-limit backoff/retry logic in safe-outputs handler; see sub-issue #29541 |
| P1 | B: Missing CLI binaries | Add pre-flight binary checks to entrypoint with actionable error messages and runner image pinning; verify codex binary deploy pipeline and Node.js bind-mount |
| P2 | C: pip deps | Pre-install required Python packages in workflow setup step or use a pre-built requirements image for GitHub MCP Structural Analysis |
Sub-issues Created
References: