Problem Statement
Run §24957487711 (Documentation Noob Tester) failed at the agent job with exit code 7 in 43 seconds. Activation (18s) and conclusion (19s) jobs both succeeded; only the agent job failed.
Root Cause
Exit code 7 is CURLE_COULDNT_CONNECT — curl could not connect to the target host. The static analysis report (#28489) identified that docs-noob-tester.lock.yml:445 contains a "Wait for server readiness" step that runs curl localhost:4321. This is a blocking readiness check for a local documentation server (likely Astro). If that server fails to start or is not listening on port 4321 when the check fires, curl exits with code 7 and the entire agent job fails.
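To make the exit status easier to read in logs, a step could translate it into the corresponding libcurl error name. The sketch below is only an illustration: the function name `describe_curl_exit` is hypothetical and not part of the workflow. The numeric codes are the ones curl documents.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of docs-noob-tester.lock.yml): translate a
# curl exit status into the libcurl error name so logs are self-explanatory.
describe_curl_exit() {
  case "$1" in
    0)  echo "OK: request succeeded" ;;
    6)  echo "CURLE_COULDNT_RESOLVE_HOST: host name could not be resolved" ;;
    7)  echo "CURLE_COULDNT_CONNECT: nothing accepted the connection" ;;
    22) echo "CURLE_HTTP_RETURNED_ERROR: HTTP status >= 400 (only with -f/--fail)" ;;
    28) echo "CURLE_OPERATION_TIMEDOUT: request timed out" ;;
    *)  echo "curl exit code $1: see the EXIT CODES section of the curl man page" ;;
  esac
}
```

For example, running `curl -sf localhost:4321; describe_curl_exit $?` while nothing listens on port 4321 would be expected to report the code 7 case, which matches the failure seen in this run.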
Evidence
| Metric | This Run (24957487711) | Baseline (24891429625) |
|---|---|---|
| Agent job conclusion | failure (exit code 7) | success |
| Agent job duration | 43s | n/a |
| blocked_requests | 0 | 79 |
| Engine | Copilot CLI v1.0.36, model: auto | Copilot CLI |
| Classification | blocked_requests_decrease (changed) | n/a |
The blocked_requests drop from 79 → 0 is a strong corroborating signal: if the agent job fails at the server readiness step (before the Copilot CLI process even executes), zero firewall-blocked requests would be recorded because no agent network activity occurs.
Probable Failure Scenario
- Agent job starts
- Workflow attempts to start Astro/docs server at port 4321
- Server startup is slow or fails (npm dependency issue, port conflict, build error)
- curl localhost:4321 readiness check fires too early
- curl exits with code 7 (ECONNREFUSED)
- Agent job exits with code 7, skipping safe_outputs and detection jobs
Proposed Remediation
- Check server startup logs in agent-stdio.log at /tmp/gh-aw/aw-mcp/logs/run-24957487711/ for the specific failure (npm error, port already in use, build failure)
- Add retry with backoff to the curl localhost:4321 readiness check (e.g., `until curl -sf localhost:4321; do sleep 2; done`, with a max-wait timeout)
- Capture server startup exit code before the readiness check to distinguish server crash from timeout
- Review whether localhost server is necessary — if this is a static site, consider building and checking the output directory instead
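The remediation items above could be sketched roughly as follows. This is a minimal illustration, not the workflow's actual step: the function names `wait_for_server` and `check_build_output`, the default 60-second limit, the 8-second backoff cap, the return codes, and the `dist` output directory are all assumptions to adapt to the real setup.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names, defaults, and return codes are illustrative.

# wait_for_server URL [MAX_WAIT_SECONDS] [SERVER_PID]
# Returns 0 when URL answers, 1 on timeout, 2 if SERVER_PID has exited.
wait_for_server() {
  local url="$1" max_wait="${2:-60}" server_pid="${3:-}"
  local delay=1 start now
  start=$(date +%s)

  until curl -sf -o /dev/null --max-time 5 "$url"; do
    # A dead server process will never become ready: report a crash, not a timeout.
    if [ -n "$server_pid" ] && ! kill -0 "$server_pid" 2>/dev/null; then
      echo "server process $server_pid exited before becoming ready" >&2
      return 2
    fi
    now=$(date +%s)
    if [ $((now - start)) -ge "$max_wait" ]; then
      echo "gave up waiting for $url after ${max_wait}s" >&2
      return 1
    fi
    sleep "$delay"
    # Exponential backoff, capped at 8 seconds between probes.
    if [ "$delay" -lt 8 ]; then delay=$((delay * 2)); fi
  done
  return 0
}

# check_build_output [DIR]: succeed only if DIR/index.html exists and is non-empty.
# "dist" is assumed as the output directory; adjust it to the project's config.
check_build_output() {
  local dir="${1:-dist}"
  [ -s "$dir/index.html" ]
}
```

A step might start the server in the background and then call `wait_for_server http://localhost:4321 120 $!`, so that a crash (return code 2) can be told apart from a slow start that hits the limit (return code 1). If the site turns out to be static, `check_build_output` after the build command would avoid the local server entirely.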
Success Criteria
- Documentation Noob Tester agent job completes with exit code 0
- blocked_requests returns to baseline (≥1 requests processed)
- Server readiness check does not cause false-negative failures due to slow startup
Related to #28602 (auto-triage failure issue for run 24957487711)
Generated by [aw] Failure Investigator (6h) · ● 415.2K · ◷