[aw-failures] docs-noob-tester: exit code 7 — server readiness curl to localhost:4321 fails before Copilot CLI starts #28623

@github-actions

Problem Statement

Run §24957487711 (Documentation Noob Tester) failed at the agent job with exit code 7 in 43 seconds. Activation (18s) and conclusion (19s) jobs both succeeded; only the agent job failed.

Root Cause

Exit code 7 is CURLE_COULDNT_CONNECT — curl could not connect to the target host. The static analysis report (#28489) identified that docs-noob-tester.lock.yml:445 contains a "Wait for server readiness" step that runs curl localhost:4321. This is a blocking readiness check for a local documentation server (likely Astro). If that server fails to start or is not listening on port 4321 when the check fires, curl exits with code 7 and the entire agent job fails.
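The exit-code mapping can be checked locally. The following is a minimal sketch, assuming curl is installed and that nothing is listening on 127.0.0.1 port 1 (the port and the `probe` helper are illustrative choices, not taken from the workflow). A refused connection should surface as exit status 7.

```shell
# Probe a URL and print curl's exit status instead of the response body.
# Assumption: no process is listening on 127.0.0.1:1 on this machine.
probe() {
  # -s silences progress output, -o /dev/null discards any body,
  # --noproxy '*' avoids routing the loopback request through a proxy.
  curl -s -o /dev/null --noproxy '*' --max-time 2 "$1"
  echo "$?"
}

probe "http://127.0.0.1:1/"
```

Running the same probe against localhost:4321 on the runner, right after the server start step, would show whether the readiness check is seeing the same status.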

Evidence

| Metric | This Run (24957487711) | Baseline (24891429625) |
|---|---|---|
| Agent job conclusion | failure (exit code 7) | success |
| Agent job duration | 43s | |
| blocked_requests | 0 | 79 |
| Engine | Copilot CLI v1.0.36, model: auto | Copilot CLI |
| Classification | blocked_requests_decrease (changed) | |

The blocked_requests drop from 79 → 0 is a strong corroborating signal: if the agent job fails at the server readiness step (before the Copilot CLI process even executes), zero firewall-blocked requests would be recorded because no agent network activity occurs.

Probable Failure Scenario

  1. Agent job starts
  2. Workflow attempts to start Astro/docs server at port 4321
  3. Server startup is slow or fails (npm dependency issue, port conflict, build error)
  4. curl localhost:4321 readiness check fires too early
  5. curl exits with code 7 (CURLE_COULDNT_CONNECT; the underlying socket error is typically ECONNREFUSED)
  6. Agent job exits with code 7, skipping safe_outputs and detection jobs

Proposed Remediation

  1. Check server startup logs in agent-stdio.log at /tmp/gh-aw/aw-mcp/logs/run-24957487711/ for the specific failure (npm error, port already in use, build failure)
  2. Add retry with backoff to the curl localhost:4321 readiness check (e.g., until curl -sf localhost:4321; do sleep 2; done with a max-wait timeout)
  3. Capture server startup exit code before the readiness check to distinguish server crash from timeout
  4. Review whether localhost server is necessary — if this is a static site, consider building and checking the output directory instead
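Remediation items 2 and 3 could be combined along the following lines. This is a sketch under stated assumptions, not the workflow's actual step: the `wait_for_server` function, its return codes, the 8-second backoff cap, and the `npm run dev` start command in the usage comment are all hypothetical.

```shell
#!/usr/bin/env bash
# wait_for_server URL [MAX_WAIT_SECONDS] [SERVER_PID]
# Polls URL with capped exponential backoff.
# Returns 0 when the server answers, 1 on timeout,
# and 2 if the server process (when a PID is given) has already exited.
wait_for_server() {
  local url="$1" max_wait="${2:-60}" pid="${3:-}"
  local delay=1 waited=0

  while :; do
    # -f makes curl fail on HTTP errors; exit 7 means nothing is listening yet.
    if curl -sf -o /dev/null --max-time 5 "$url"; then
      echo "ready after ${waited}s"
      return 0
    fi

    # Distinguish a crashed server from a slow one.
    if [ -n "$pid" ] && ! kill -0 "$pid" 2>/dev/null; then
      echo "server process $pid exited before becoming ready" >&2
      return 2
    fi

    # waited counts sleep time only, so the real elapsed time is slightly higher.
    if [ "$waited" -ge "$max_wait" ]; then
      echo "timed out after ${waited}s waiting for $url" >&2
      return 1
    fi

    sleep "$delay"
    waited=$((waited + delay))

    # Double the delay each round, capped at 8 seconds.
    delay=$((delay * 2))
    if [ "$delay" -gt 8 ]; then
      delay=8
    fi
  done
}

# Hypothetical usage in the workflow step (start command is an assumption):
#   npm run dev > server.log 2>&1 &
#   wait_for_server http://localhost:4321 60 "$!"
```

Returning distinct codes for timeout and crash lets the step log which case occurred, so a slow start and a failed start are no longer both reported as exit code 7.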

Success Criteria

  • Documentation Noob Tester agent job completes with exit code 0
  • blocked_requests returns to baseline (≥1 requests processed)
  • Server readiness check does not cause false-negative failures due to slow startup

Related to #28602 (auto-triage failure issue for run 24957487711)

Generated by [aw] Failure Investigator (6h) · ● 415.2K ·

  • expires on May 3, 2026, 7:20 PM UTC