fix: Multi-Device Docs Tester hits max-turns without producing safe outputs #21327

Merged
pelikhan merged 2 commits into main from copilot/aw-debug-multi-device-docs-tester on Mar 17, 2026

Copilot AI commented Mar 17, 2026

The workflow consistently exhausted its 50-turn budget debugging server/Playwright issues rather than testing devices, terminating with error_max_turns and no safe outputs.

Three compounding root causes (from run #23175727517):

  1. npm run preview returned HTTP 404 for /gh-aw/ — Astro Starlight's hybrid routing isn't served correctly by the static preview server
  2. browser_navigate / browser_run_code timed out at 60s — Vite dev mode emits dozens of ES module requests per page; waitUntil: 'load' (and especially waitForLoadState('networkidle')) never fires within the timeout
  3. 50 turns is too few for 10-device testing once setup overhead burns the first ~10

shared/docs-server-lifecycle.md

  • Switch from npm run preview → npm run dev (the dev server correctly handles hybrid routing at /gh-aw/)
  • Readiness check now verifies /gh-aw/ returns HTTP 200 (was just checking the port was open), with 45×3s=135s budget for Vite startup
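
The readiness check described above can be sketched as a small polling helper. This is a minimal sketch, not the script from the PR: the names `wait_for_docs`, `probe_http`, `PROBE`, and `SLEEP` are hypothetical, and the 5-second `--max-time` is an assumption. Only the URL, the 45 attempts, and the 3-second delay come from the description.

```shell
# Hypothetical helper: poll a URL until it returns HTTP 200 or attempts run out.
# PROBE and SLEEP can be overridden so the loop can be exercised without a server.
PROBE="${PROBE:-probe_http}"
SLEEP="${SLEEP:-sleep}"

probe_http() {
  # Print only the HTTP status code; curl prints 000 if the connection fails.
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1"
}

wait_for_docs() {
  local url="$1" attempts="${2:-45}" delay="${3:-3}" status i
  for i in $(seq 1 "$attempts"); do
    # "|| true" keeps a failed probe from aborting the loop under "set -e".
    status="$("$PROBE" "$url" || true)"
    if [ "$status" = "200" ]; then
      echo "Server ready at $url"
      return 0
    fi
    echo "Waiting for server... ($i/$attempts) (status: $status)"
    "$SLEEP" "$delay"
  done
  echo "Server did not return 200 within $((attempts * delay))s" >&2
  return 1
}
```

Calling `wait_for_docs http://localhost:4321/gh-aw/` uses the defaults of 45 attempts and a 3-second delay, which gives the 135 s budget. Checking the status code rather than the open port is what catches the 404 case, because a server can accept connections long before it serves the route.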

daily-multi-device-docs-tester.md

  • max-turns 50 → 80
  • tools.timeout: 120 — raises MCP_TOOL_TIMEOUT from 60 s to 120 s
  • Remove npm run build step; swap allowed bash tools from npm run build*/npm run preview* to npm run dev*/npx astro*
  • Playwright navigation example now uses waitUntil: 'domcontentloaded'; explicit warnings added against waitForLoadState('networkidle') and bare browser_navigate for first load:
// Before — guaranteed timeout on Vite dev server
await page.goto(url);
await page.waitForLoadState('networkidle');

// After — only waits for HTML parse, not full module graph
await page.goto(url, { waitUntil: 'domcontentloaded', timeout: 30000 });

Warning: firewall rules blocked me from connecting to the following addresses:

  • https://api.github.com/graphql
    • Triggering command: /usr/bin/gh /usr/bin/gh api graphql -f query=query($owner: String!, $name: String!) { repository(owner: $owner, name: $name) { hasDiscussionsEnabled } } -f owner=github -f name=gh-aw (http block)
  • https://api.github.com/repos/astral-sh/setup-uv/git/ref/tags/eac588ad8def6316056a12d4907a9d4d84ff7a3b
    • Triggering command: /usr/bin/gh gh api /repos/astral-sh/setup-uv/git/ref/tags/eac588ad8def6316056a12d4907a9d4d84ff7a3b --jq .object.sha (http block)
  • https://api.github.com/repos/github/gh-aw
    • Triggering command: /usr/bin/gh gh api /repos/github/gh-aw --jq .visibility (http block)
  • https://api.github.com/repos/githubnext/agentics/git/ref/tags/
    • Triggering command: /usr/bin/gh gh api /repos/githubnext/agentics/git/ref/tags/# --jq .object.sha (http block)

Copilot AI linked an issue Mar 17, 2026 that may be closed by this pull request
Root causes from run #23175727517:
1. `npm run preview` returned 404 for `/gh-aw/` (Astro Starlight hybrid routing)
2. Playwright timed out at 60s due to Vite dev server loading many ES modules
3. Max-turns (50) too low for 10-device testing after debug overhead

Fixes:
- shared/docs-server-lifecycle.md: switch to `npm run dev`, verify /gh-aw/ returns 200
- daily-multi-device-docs-tester.md: max-turns 50→80, tools.timeout 120s,
  remove build step, add domcontentloaded guidance

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] [aw] Debug multi-device docs tester workflow failure" to "fix: Multi-Device Docs Tester hits max-turns without producing safe outputs" Mar 17, 2026
Copilot AI requested a review from pelikhan March 17, 2026 03:27
pelikhan marked this pull request as ready for review March 17, 2026 03:28
Copilot AI review requested due to automatic review settings March 17, 2026 03:28
pelikhan merged commit 3945fa5 into main Mar 17, 2026
pelikhan deleted the copilot/aw-debug-multi-device-docs-tester branch March 17, 2026 03:28
Copilot AI left a comment

Pull request overview

This PR updates the multi-device docs testing workflow to avoid exhausting the agent’s turn budget and timing out during Astro/Vite dev-server navigation, while ensuring the docs server is started in a way that correctly serves /gh-aw/.

Changes:

  • Switch docs server instructions from npm run preview to npm run dev and strengthen readiness checking to require HTTP 200 on /gh-aw/.
  • Increase agent runtime budget (max turns 50 → 80) and extend tool timeouts (60s → 120s) for Playwright/MCP operations.
  • Update docs tester workflow instructions to avoid Playwright networkidle/default navigation waits and use domcontentloaded instead.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| .github/workflows/unbloat-docs.lock.yml | Updates workflow metadata hash to reflect upstream instruction changes. |
| .github/workflows/shared/docs-server-lifecycle.md | Moves lifecycle guidance to Astro dev server + readiness check on /gh-aw/. |
| .github/workflows/docs-noob-tester.lock.yml | Updates workflow metadata hash to reflect shared doc changes. |
| .github/workflows/daily-multi-device-docs-tester.md | Raises max-turns, tool timeout, switches to dev server, and updates Playwright navigation guidance. |
| .github/workflows/daily-multi-device-docs-tester.lock.yml | Compiles the workflow changes into the lock, including updated allowlisted bash commands and timeouts. |
Comments suppressed due to low confidence (1)

.github/workflows/daily-multi-device-docs-tester.md:36

  • The workflow’s bash allowlist doesn’t include commands that this workflow (and the imported shared lifecycle instructions) require, e.g. ip, awk, and hostname for computing SERVER_IP (and likely rm for cleanup). With the current tools.bash list, the agent won’t be able to run the provided SERVER_IP=$(ip ... | awk ...) snippet, which blocks Playwright navigation. Add the required command patterns to tools.bash (or adjust the instructions to only use already-allowed commands).
  bash:
    - "npm install*"
    - "npm run dev*"
    - "npx astro*"
    - "npx playwright*"
    - "curl*"
    - "kill*"
    - "lsof*"
    - "ls*"      # List files for directory navigation
    - "pwd*"     # Print working directory
    - "cd*"      # Change directory
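
To illustrate why the reviewer flags `ip`, `awk`, and `hostname`, here is a rough model of the allowlist. It assumes each `tools.bash` entry behaves like a shell glob matched against the whole command line; that is an assumption about gh-aw, not something this PR confirms, and `is_allowed` and `check` are hypothetical names.

```shell
# Hypothetical model of the allowlist: each entry is a shell glob matched
# against the full command line. gh-aw's real matching rules may differ.
is_allowed() {
  local cmd="$1" pattern
  shift
  for pattern in "$@"; do
    # An unquoted pattern in `case` is treated as a glob, so "curl*" matches "curl -s ...".
    case "$cmd" in
      $pattern) return 0 ;;
    esac
  done
  return 1
}

check() {
  # The patterns below are copied from the tools.bash list quoted above.
  if is_allowed "$1" "npm install*" "npm run dev*" "npx astro*" "npx playwright*" \
      "curl*" "kill*" "lsof*" "ls*" "pwd*" "cd*"; then
    echo "allowed: $1"
  else
    echo "blocked: $1"
  fi
}

check "npm run dev -- --host 0.0.0.0 --port 4321"   # prints "allowed: ..."
check "hostname -I"                                 # prints "blocked: hostname -I"
```

Under this model, any command that starts with `ip`, `awk`, or `hostname` matches none of the patterns, which is the gap the comment describes.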

- # - Documentation must be built first (npm run build in docs/ directory)
- # - Bash permissions: npm *, curl *, kill *, echo *, sleep *
+ # - npm install must have been run in docs/ directory
+ # - Bash permissions: npm *, npx *, curl *, kill *, echo *, sleep *
Comment on lines 13 to 20
## Starting the Documentation Preview Server

**Context**: The documentation has been pre-built using `npm run build`. Use the preview server to serve the static build.

- Navigate to the docs directory and start the preview server in the background, binding to all network interfaces on a fixed port:
+ Navigate to the docs directory and start the development server in the background, binding to all network interfaces on a fixed port:

```bash
cd docs
# Before: npm run preview -- --host 0.0.0.0 --port 4321 > /tmp/preview.log 2>&1 &
npm run dev -- --host 0.0.0.0 --port 4321 > /tmp/preview.log 2>&1 &
# Record the background server's PID so it can be stopped later
echo $! > /tmp/server.pid
```
Comment on lines +41 to 43
[ "$STATUS" = "200" ] && echo "Server ready at http://localhost:4321/gh-aw/!" && break
echo "Waiting for server... ($i/45) (status: $STATUS)" && sleep 3
done

Follow the shared **Documentation Server Lifecycle Management** instructions:
- 1. Start the preview server (section "Starting the Documentation Preview Server")
+ 1. Start the dev server (section "Starting the Documentation Preview Server")

Development

Successfully merging this pull request may close these issues.

[aw] Multi-Device Docs Tester failed
