Overview
The Smoke CI batch on branch copilot/add-edit-wiki-support (PR #29626) at ~01:00Z on 2026-05-02 produced 4 failures, one in each non-Copilot engine. Two failures are P0 (the agent crashed before producing any output); two are P1 (the agent succeeded but the safe_outputs job failed). No prior tracking issue could be identified for these failure signatures; the issues API was unreachable during the investigation (see Existing Issue Correlation).
Failure Clusters
| Priority | Run ID | Workflow | Failing Job | Root Cause |
| --- | --- | --- | --- | --- |
| P0 | §25239718625 | Smoke Crush | agent | EROFS: read-only filesystem blocks Crush binary install into node_modules |
| P0 | §25239718609 | Smoke Gemini | agent | GEMINI_API_KEY invalid: HTTP 400 API_KEY_INVALID on first API call |
| P1 | §25239718599 | Smoke Claude | safe_outputs | resolve_pull_request_review_thread → Resource not accessible by integration |
| P1 | §25239718605 | Smoke Codex | safe_outputs | edit_wiki HTTPS push: fatal: could not read Username (no git credentials) |
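The priority split used in the table follows the rule stated in the Overview: a failure in the agent job is P0, a failure confined to safe_outputs is P1. A minimal sketch of that rule is shown here; the `RunResult` type and `priority` function are hypothetical names for illustration, not part of the investigator's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunResult:
    """Hypothetical record for one smoke run; field names are illustrative."""
    workflow: str
    failing_job: str  # "agent" or "safe_outputs", as in the table above


def priority(run: RunResult) -> str:
    """Apply the report's triage rule to a single failed run."""
    if run.failing_job == "agent":
        return "P0"  # agent crashed before producing any output
    if run.failing_job == "safe_outputs":
        return "P1"  # agent succeeded, output processing failed
    return "unclassified"


runs = [
    RunResult("Smoke Crush", "agent"),
    RunResult("Smoke Gemini", "agent"),
    RunResult("Smoke Claude", "safe_outputs"),
    RunResult("Smoke Codex", "safe_outputs"),
]
for run in runs:
    print(priority(run), run.workflow)
```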
Evidence
P0 — Crush: EROFS crash details
From agent-stdio.log for run 25239718625:
EROFS: read-only file system, mkdir
'/opt/hostedtoolcache/node/24.14.1/x64/lib/node_modules/@charmland/crush/bin'
Crush v0.59.0 downloaded successfully but the npm post-install script tries to copy the extracted binary into the node module's bin/ subdirectory inside the hosted tool cache, which is mounted read-only inside the chroot/sandbox. A secondary error also occurred: Failed to transfer /host/home/runner/work/_temp/gh-aw/safeoutputs ownership to chroot user. Zero turns, zero tool calls — total agent runtime ~1m.
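One way to avoid this class of failure is to probe the target directory for writability before installing and fall back to a temporary directory when the probe fails. The sketch below illustrates that idea only; `pick_install_dir` and the `crush-bin-` prefix are hypothetical, and the actual Crush post-install script may need a different fix.

```python
import tempfile
from pathlib import Path
from typing import Optional


def pick_install_dir(preferred: Path, fallback_root: Optional[Path] = None) -> Path:
    """Return `preferred` if it is writable, otherwise a fresh temp directory."""
    try:
        preferred.mkdir(parents=True, exist_ok=True)
        # mkdir can succeed for an existing directory on a read-only mount,
        # so confirm with a real write before trusting the location.
        probe = preferred / ".write-probe"
        probe.touch()
        probe.unlink()
        return preferred
    except OSError:
        # Covers EROFS (read-only file system), EACCES, and similar errors.
        root = fallback_root if fallback_root is not None else Path(tempfile.gettempdir())
        return Path(tempfile.mkdtemp(prefix="crush-bin-", dir=root))
```

A caller would then have to put the returned directory on PATH, or invoke the binary by absolute path, since it no longer lives under node_modules.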
P0 — Gemini: API key invalid
From gemini-client-error-generateJson-api-2026-05-02T01-00-16-489Z.json and gemini-client-error-Turn.run-sendMessageStream-2026-05-02T01-00-16-721Z.json for run 25239718609:
{"error": {"code": 400, "status": "INVALID_ARGUMENT",
"message": "API key not valid. Please pass a valid API key.",
"details": [{"reason": "API_KEY_INVALID", "domain": "googleapis.com"}]}}
Both generateJson (routing/classification) and sendMessageStream (main conversation stream) hit the error within the same second, at 01:00:16Z. The Crush run's stdio log also warned GEMINI_API_KEY is not set at lines 15-16, which points to the key being absent or expired in the CI secret store.
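Because the failure is fully identified by the error body, a pre-flight step could classify it and fail fast with a clear message instead of letting the agent start. The following is a sketch under that assumption; `is_invalid_api_key` is a hypothetical helper, and it only inspects the fields visible in the error payload quoted above.

```python
import json


def is_invalid_api_key(body: str) -> bool:
    """Return True if `body` matches the API_KEY_INVALID error shape above."""
    try:
        err = json.loads(body).get("error", {})
    except (json.JSONDecodeError, AttributeError):
        return False  # not JSON, or not a JSON object
    if not isinstance(err, dict):
        return False
    details = err.get("details") or []
    reasons = {d.get("reason") for d in details if isinstance(d, dict)}
    return err.get("code") == 400 and "API_KEY_INVALID" in reasons


sample = (
    '{"error": {"code": 400, "status": "INVALID_ARGUMENT", '
    '"message": "API key not valid. Please pass a valid API key.", '
    '"details": [{"reason": "API_KEY_INVALID", "domain": "googleapis.com"}]}}'
)
print(is_invalid_api_key(sample))
```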
P1 — Smoke Claude: safe_outputs permission failure
From workflow-logs/safe_outputs/8_Process Safe Outputs.txt for run 25239718599:
- Agent ran 39 turns, 27 tool types, 1.44M tokens, ~$1.14
- 12 of 13 safe output messages succeeded
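- Message 10/13 failed: resolve_pull_request_review_thread on thread PRRT_kwDOPc1QR85_EWVj (PR #29626), with "Request failed due to following response errors: Resource not accessible by integration"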
The GitHub App/token used by the safe_outputs job lacks the pull_request_review write permission needed to resolve review threads. This permission is not required for most safe output operations, but the Smoke Claude task exercises it.
Additional finding: the agent was resource-heavy for a Triage-domain task — 39 turns vs. typical baseline, classified resource_heavy_for_domain (high) and poor_agentic_control (medium).
P1 — Smoke Codex: edit_wiki credential failure + missing MCP
From workflow-logs/safe_outputs/9_Process Safe Outputs.txt for run 25239718605:
fatal: could not read Username for 'https://github.com': No such device or address
The safe_outputs job clones the wiki over HTTPS and applies a patch, but no git credential helper is configured in that job context (unlike the main agent job). 3 of 4 outputs succeeded (created issue #29655, closed #29638, added comment on PR #29626); the wiki edit failure caused the overall job to exit as failure.
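One possible way to supply credentials for the HTTPS clone is to pass an authorization header through git's `http.<url>.extraheader` setting. The sketch below only builds the command line; `git_auth_config` and `wiki_clone_command` are hypothetical helpers, the owner and repository names are placeholders, and whether the safe_outputs job should use this mechanism or a credential helper is an assumption left to the maintainers.

```python
import base64
from typing import List


def git_auth_config(token: str, host: str = "github.com") -> List[str]:
    """Build `-c` arguments that attach a basic-auth header for `host`."""
    basic = base64.b64encode(f"x-access-token:{token}".encode()).decode()
    return ["-c", f"http.https://{host}/.extraheader=AUTHORIZATION: basic {basic}"]


def wiki_clone_command(owner: str, repo: str, token: str, dest: str) -> List[str]:
    """Return an argv list that clones a repository wiki over HTTPS."""
    url = f"https://github.com/{owner}/{repo}.wiki.git"
    return ["git", *git_auth_config(token), "clone", url, dest]


# Placeholder values only; a real job would read the token from its environment.
cmd = wiki_clone_command("example-org", "example-repo", "TOKEN", "/tmp/wiki")
print(cmd[-2])
```

Passing the header on the command line exposes it to the process list, so a real job might prefer writing it to the local git config or using a credential helper.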
Additional finding: Codex does not have the web-fetch MCP tool configured — agent called safeoutputs.missing_tool to signal this. Codex fell back to playwright.browser_navigate targeting github.com. Firewall also blocked 5 requests to ab.chatgpt.com:443 (4) and chatgpt.com:443 (1).
Existing Issue Correlation
GitHub issues API was not reachable during this investigation (HTTP 403 on localhost proxy). No de-duplication against existing tracking was possible. Smoke test result issues #29655 and #29658 were created by the agents themselves as part of normal operation.
Proposed Fix Roadmap
| Priority | Item | Owner Area |
| --- | --- | --- |
| P0 | Crush install path: write binary to writable temp dir instead of node_modules | Crush integration / workflow setup |
| P0 | Rotate or re-configure GEMINI_API_KEY in CI secrets | Secrets / infra |
| P1 | Grant resolve PR review thread permission to safe_outputs job token | Workflow permissions |
| P1 | Configure git credentials for wiki HTTPS push in safe_outputs job | safe_outputs infra |
| P2 | Add web-fetch MCP to Codex engine config | Engine config |
Sub-Issues Created
- #aw_p0fix — P0 crashes: Crush EROFS install + Gemini API key invalid
References:
Generated by [aw] Failure Investigator (6h) · ● 639.1K · ◷