You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
P1: The gh-aw-firewall binary download (v0.25.28) returns intermittent HTTP 502 from GitHub releases CDN, causing cascading failures in both agent and detection jobs across multiple workflows.
Problem Statement
In the 19:14–01:14 UTC window (2026-04-25/26), the AWF firewall binary install step failed in two workflows ~2 hours apart:
Installing awf with checksum verification (version: v0.25.28, ...)
Downloading checksums from 'https://github.com/github/gh-aw-firewall/releases/download/v0.25.28/checksums.txt'...
curl: (22) The requested URL returned error: 502
[... 5 more retries at 10s intervals ...]
##[error]Process completed with exit code 22.
Root Cause
Intermittent HTTP 502 from GitHub releases CDN serving gh-aw-firewall v0.25.28. The failure window is short: runs at 21:43Z, 23:04Z, 23:16Z, and 00:14Z all succeeded, confirming transient CDN outage rather than a broken release.
Impact
Smoke CI — agent job failed (exit 22) → run marked failure → auto-issue [aw] Smoke CI failed #28521 opened, PR CI coverage blocked for that push
Design Decision Gate — detection job failed (AWF unavailable → no threat-detection output → ERR_PARSE) → run marked failuredespite agent completing successfully with noop → $0.25 wasted, spurious failure
Proposed Remediation
Cache the AWF binary between runs keyed on version — eliminates CDN dependency on every run (same pattern already used for agent memory cache)
Fallback to prior cached version if current download returns non-2xx — a stale firewall binary is far better than no firewall at all
Increase retry budget beyond the current 60s (6×10s) for release CDN blips
Success Criteria
AWF binary install step tolerates short CDN outages without failing the workflow
Design Decision Gate noop-success runs are not classified as failure due to detection infrastructure failures
Closing: Smoke CI has passed in 5+ consecutive runs across the 01:10–07:10 UTC and 07:07–13:07 UTC 2026-04-26 windows. The transient GitHub releases CDN outage (HTTP 502) has resolved. No recurrence observed.
P1: The
gh-aw-firewallbinary download (v0.25.28) returns intermittent HTTP 502 from GitHub releases CDN, causing cascading failures in bothagentanddetectionjobs across multiple workflows.Problem Statement
In the 19:14–01:14 UTC window (2026-04-25/26), the AWF firewall binary install step failed in two workflows ~2 hours apart:
agentdetectionBoth failures share the identical error:
Root Cause
Intermittent HTTP 502 from GitHub releases CDN serving
gh-aw-firewallv0.25.28. The failure window is short: runs at 21:43Z, 23:04Z, 23:16Z, and 00:14Z all succeeded, confirming transient CDN outage rather than a broken release.Impact
failure→ auto-issue [aw] Smoke CI failed #28521 opened, PR CI coverage blocked for that pushERR_PARSE) → run markedfailuredespite agent completing successfully withnoop→ $0.25 wasted, spurious failureProposed Remediation
Success Criteria
failuredue to detection infrastructure failuresRelated to #28268
Related to #28268
Closing: Smoke CI has passed in 5+ consecutive runs across the 01:10–07:10 UTC and 07:07–13:07 UTC 2026-04-26 windows. The transient GitHub releases CDN outage (HTTP 502) has resolved. No recurrence observed.
Closed by [aw] Failure Investigator (6h) — 07:07–13:07 UTC 2026-04-26 window.