Two scheduled workflows failed in the 6-hour window ending 2026-04-22T01:11Z, both due to 401 Unauthorized from api.openai.com/v1/responses. Root cause identified: lock files are running without the openai-proxy provider config introduced in PR #27711. Workflows attempt to reach api.openai.com directly, but the OPENAI_API_KEY only works via the internal proxy (172.30.0.30:10000). The fix is already tracked in #27724 (recompile lock files). This report connects the dots.
Reconnecting... 2/5
Reconnecting... 3/5
Reconnecting... 4/5
Reconnecting... 5/5
Reconnecting... 1/5
unexpected status 401 Unauthorized: Missing bearer or basic authentication in header,
url: (api.openai.com/redacted), cf-ray: ..., request id: req_...
Root Cause Analysis
PR #27711 (commit 28d8df1) added logic to inject an openai-proxy provider into the generated codex config, routing codex through (172.30.0.30/redacted) instead of api.openai.com directly.
But the lock files (.lock.yml) have not been recompiled since this change, as tracked in #27724. This means currently-running codex workflows use OLD lock files that send requests directly to api.openai.com with an API key that is only valid when routed through the internal AWF proxy.
Chain: PR #27711 merged → lock files stale → codex workflows use old config → 401 at api.openai.com
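The chain above suggests a quick way to spot affected workflows: look for lock files that reference codex but never mention the proxy provider. The following is a minimal sketch, not the project's own tooling. It assumes that a recompiled lock file contains the literal string openai-proxy and that affected files mention codex somewhere in their text; the directory .github/workflows and the helper names are hypothetical.

```python
from pathlib import Path

# Assumption: a recompiled lock file mentions the provider name in its text.
PROXY_MARKER = "openai-proxy"
# Assumption: only lock files that reference the codex engine are affected.
ENGINE_MARKER = "codex"


def is_stale(lock_text: str) -> bool:
    """Return True if the text references codex but never the proxy provider."""
    lowered = lock_text.lower()
    return ENGINE_MARKER in lowered and PROXY_MARKER not in lowered


def find_stale_lock_files(root: str) -> list[str]:
    """Collect *.lock.yml files under root that look stale by the rule above."""
    stale = []
    for path in sorted(Path(root).rglob("*.lock.yml")):
        text = path.read_text(encoding="utf-8", errors="replace")
        if is_stale(text):
            stale.append(str(path))
    return stale


if __name__ == "__main__":
    # Hypothetical location of the compiled workflows.
    for name in find_stale_lock_files(".github/workflows"):
        print(name)
```

A plain substring check is deliberately crude. It only flags candidates for recompilation and does not validate the generated config itself.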
Evidence — AI Moderator (06:00 UTC, post-previous-report)
Same error signature as prior window — sk-place***roxy key sent directly to api.openai.com:
startup websocket prewarm setup failed: unexpected status 401 Unauthorized:
Incorrect API key provided: sk-place****************roxy.
url: wss://api.openai.com/v1/responses
ERROR: Reconnecting... 2/5 ... 3/5 ... (agent exits, safe_outputs: no output)
Note: PR #27762 (container image digest pins, merged in same window) does not fix the openai-proxy lock-file issue. The Codex 401 root cause from #27731 remains outstanding.
Smoke Gemini — separate failure (awf-api-proxy)
Different root cause from Codex 401. The Gemini API proxy sidecar container failed its health check on the PR branch:
Container awf-api-proxy Error
dependency failed to start: container awf-api-proxy is unhealthy
Tracked by #27688 (smoke test on now-merged PR branch — lower priority).
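Because the two failures above have different root causes, it can help to bucket run logs by signature before counting them. The sketch below is illustrative only: the labels codex-401 and awf-api-proxy-unhealthy, the function names, and the choice of substrings are assumptions drawn from the log excerpts in this report, not from any existing tool.

```python
import re

# Assumption: these substrings, taken from the excerpts in this report,
# are distinctive enough to tell the two failure modes apart.
SIGNATURES = [
    ("codex-401", re.compile(r"401 Unauthorized")),
    ("awf-api-proxy-unhealthy", re.compile(r"container awf-api-proxy is unhealthy")),
]

# Matches lines such as "Reconnecting... 3/5" and captures the attempt number.
RECONNECT = re.compile(r"Reconnecting\.\.\. (\d+)/(\d+)")


def classify(log_text: str) -> str:
    """Return the label of the first matching signature, or "unknown"."""
    for label, pattern in SIGNATURES:
        if pattern.search(log_text):
            return label
    return "unknown"


def max_reconnect_attempt(log_text: str) -> int:
    """Return the highest reconnect attempt seen in the log, or 0 if none."""
    attempts = [int(m.group(1)) for m in RECONNECT.finditer(log_text)]
    return max(attempts, default=0)
```

For example, classify("dependency failed to start: container awf-api-proxy is unhealthy") falls into the second bucket, while a log containing the 401 line falls into the first regardless of how many reconnect attempts follow.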
Assessment
No new P0 failure clusters. #27731 (recompile lock files) remains the single highest-priority fix. Codex workflows running on main will continue failing at every scheduled trigger until the proxy config is added to the lock files.
Overview
Two scheduled workflows failed in the 6-hour window ending 2026-04-22T01:11Z, both due to 401 Unauthorized from api.openai.com/v1/responses. Root cause identified: lock files are running without the openai-proxy provider config introduced in PR #27711. Workflows attempt to reach api.openai.com directly, but the OPENAI_API_KEY only works via the internal proxy (172.30.0.30:10000). The fix is already tracked in #27724 (recompile lock files). This report connects the dots.
Failure Clusters
Evidence
Audit-diff: failed AI Moderator (24752310887) vs successful Design Decision Gate (24752301186)
… api.openai.com:443 and 1 blocked to chatgpt.com:443; successful run only contacted api.anthropic.com:443
… Reconnecting... 1/5 through 5/5) before giving up
Error signature (from issues #27678, #27716)
Root Cause Analysis
PR #27711 (commit 28d8df1) added logic to inject an openai-proxy provider into the generated codex config, routing codex through (172.30.0.30/redacted) instead of api.openai.com directly.
But the lock files (.lock.yml) have not been recompiled since this change, as tracked in #27724. This means currently-running codex workflows use OLD lock files that send requests directly to api.openai.com with an API key that is only valid when routed through the internal AWF proxy.
Chain: PR #27711 merged → lock files stale → codex workflows use old config → 401 at api.openai.com
Existing Issue Correlation
Proposed Fix Roadmap
Sub-Issues Created
References:
6h Follow-Up Window: 2026-04-22T01:11Z → 07:14Z
P0 fix (#27731) still unresolved: Codex 401 failures continue on main.
New failure in this window
Evidence — AI Moderator (06:00 UTC, post-previous-report)
Same error signature as prior window, with the sk-place***roxy key sent directly to api.openai.com.
Note: PR #27762 (container image digest pins, merged in same window) does not fix the openai-proxy lock-file issue. The Codex 401 root cause from #27731 remains outstanding.
Smoke Gemini — separate failure (awf-api-proxy)
Different root cause from Codex 401. The Gemini API proxy sidecar container failed its health check on the PR branch:
Tracked by #27688 (smoke test on now-merged PR branch — lower priority).
Assessment
No new P0 failure clusters. #27731 (recompile lock files) remains the single highest-priority fix. Codex workflows running on main will continue failing at every scheduled trigger until the proxy config is added to the lock files.
References:
6h Window: 2026-04-22T07:14Z → 13:14Z
P0 fix (#27731, recompile lock files) remains unresolved. Codex 401 failures continue. A new untracked P0 cluster identified: Copilot node: command not found.
Failure Clusters (8 runs, 100% error rate)
Key Evidence from Audit
Codex 401 (still root-caused to stale lock files — #27724/#27731):
sk-place***roxy key sent directly to api.openai.com/v1/responses → 401, retries 1–5, engine crash
… chatgpt.com:443 (blocked by firewall), secondary concern
… api.github.com:443 and github.com:443 from allow-list
Copilot node not found (NEW, no prior tracking):
Daily Documentation Updater (post-inference failure, $2.68 spent):
create_pull_request blocked by protected files .github/aw/create-agentic-workflow.md, .github/aw/github-agentic-workflows.md
Fix: add protected-files: fallback-to-issue to workflow frontmatter
Assessment
No new P0 cluster was added to the Codex 401 story; #27731 remains the single highest-priority fix. All Codex workflows on main will continue failing at every scheduled trigger until lock files are recompiled with the openai-proxy provider config.
The node not found Copilot pattern is a newly identified P0 (see sub-issue).
References:
6h Window: 2026-04-22T13:14Z → 19:14Z
P0 #27731 (Codex 401 / lock file recompile) remains unresolved; AI Moderator failed 6 more times. New P0 identified: awf-api-proxy sidecar now unhealthy on main-branch workflows.
Failure Clusters (18 runs)
Key Evidence
Codex 401 (unchanged root cause — stale lock files, #27731):
wss://api.openai.com/v1/responses called directly; sk-place***roxy key rejected with 401
… copilot/disable-shell-history-expansion hit same pattern (Codex 0.121.0)
awf-api-proxy unhealthy, NEW P0 (previously PR-branch only via #27688):
Assessment
Two active P0 root causes in this window:
awf-api-proxy unhealthy: escalated from PR-branch issue to main-branch; sub-issue created below for actionable follow-up
References:
Note
🔒 Integrity filter blocked 5 items
The following items were blocked because they don't meet the GitHub integrity level.
list_issues: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
issue_read: has lower integrity than agent requires. The agent cannot read data with integrity below "approved".
To allow these resources, lower min-integrity in your GitHub frontmatter: