[workflow-analysis] Weekly Workflow Analysis — Apr 20–27, 2026 #28687
This discussion was automatically closed because it expired on 2026-04-28T10:07:05.002Z.
Weekly analysis of 43 workflow runs from Apr 20–27, 2026. Overall health is moderate: 67% success rate with two recurring infrastructure patterns driving the majority of failures, and widespread resource-heaviness that inflates cost.
Summary
Critical Issues
❌ Pattern 1: `node: command not found` in chroot mode (exit code 127)
Affected: Daily News (§24986870660)
The copilot-driver launches inside a chroot container, but `node` is not on the PATH after the chroot pivot. The agent never starts; the run fails immediately with `node: command not found` (exit code 127).
This is a hard infrastructure failure: zero agent turns, zero tokens, no useful output. The same root cause has been observed in other copilot-engine chroot runs. The fix requires ensuring Node.js is resolvable inside the chroot (via `GH_AW_NODE_BIN`, a toolcache symlink, or explicit PATH injection before the chroot pivot).
❌ Pattern 2: Visual Regression Checker docs-build step failure on `add-shadow-ops` PR
Affected: 4 runs across 2 PR pushes on branch `add-shadow-ops` (#28676)
The `Build documentation` pre-agent step consistently exits non-zero on this branch. The agent is never reached (no tokens, no turns). The two failure+cancel pairs reflect the PR receiving a new push while a run was in flight, which made the earlier run redundant.
Resolution: Fix the docs build on the `add-shadow-ops` branch, or scope the Visual Regression Checker trigger to skip the docs-build step when the docs paths haven't changed.
❌ Safe Outputs Step Failures (post-agent)
[aw] Failure Investigator (§24982458545): The agent succeeded (85 turns, issued a parent report plus sub-issues #28673 and #28674, cost $2.95) but the `safe_outputs` job failed. The agent itself ran correctly.
Schema Feature Coverage Checker (§24981796377): The agent completed but `safeoutputs.create_pull_request` was rejected. Cause: the `protected-files` config blocks PRs targeting `schema-demo-*.md` paths. The Failure Investigator already identified this as P0 and recommended adding `allowed-files: [".github/workflows/schema-demo-*.md"]` to the workflow's `safe-outputs.create-pull-request` config. See auto-issue #28671 for details.
Performance Issues
Top 10 Token-Consuming Runs
Schema Consistency Checker stands out: 8.1M tokens for 138 turns in 13.1 minutes, with only ~14.9% cache efficiency. This is the most expensive single run this week and warrants a dedicated efficiency review.
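To put those figures in per-turn terms, the short calculation below derives averages from the rounded numbers quoted above. The cached-token estimate assumes that "cache efficiency" means cached tokens divided by total tokens; the report does not define the metric, so treat that line as illustrative only.

```python
# Figures quoted in the report (rounded, so results are approximate).
TOTAL_TOKENS = 8_100_000
TURNS = 138
DURATION_MIN = 13.1
CACHE_EFFICIENCY = 0.149  # ASSUMPTION: cached_tokens / total_tokens


def run_averages(total_tokens, turns, minutes, cache_efficiency):
    """Return average tokens per turn, tokens per minute, and estimated cached tokens."""
    return {
        "tokens_per_turn": total_tokens / turns,
        "tokens_per_minute": total_tokens / minutes,
        "estimated_cached_tokens": total_tokens * cache_efficiency,
    }


if __name__ == "__main__":
    stats = run_averages(TOTAL_TOKENS, TURNS, DURATION_MIN, CACHE_EFFICIENCY)
    for name, value in stats.items():
        print(f"{name}: {value:,.0f}")
```

Under these assumptions the run averages roughly 59k tokens per turn, which is the figure a dedicated efficiency review would most likely try to bring down.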
Slow Runs (>10 minutes)
Three of the four slowest runs either failed or consumed very high tokens. Documentation Unbloat at 34.8 minutes is an outlier worth investigating — it ran Playwright and safeoutputs but still failed.
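One way to keep an outlier like this from running unbounded is a timeout guard around the slow step. The sketch below is a hypothetical wrapper, not part of the workflows analyzed here: `run_with_timeout`, the placeholder command, and the limit are all illustrative names and values.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd and return (exit_code, timed_out).

    On timeout, subprocess.run kills the child and waits for it
    before raising TimeoutExpired, so no process is left behind.
    """
    try:
        completed = subprocess.run(cmd, timeout=timeout_s)
        return completed.returncode, False
    except subprocess.TimeoutExpired:
        return None, True


if __name__ == "__main__":
    # Placeholder command that sleeps longer than the limit.
    slow_cmd = [sys.executable, "-c", "import time; time.sleep(5)"]
    code, timed_out = run_with_timeout(slow_cmd, timeout_s=1)
    print("timed out" if timed_out else f"exit code {code}")
```

Returning a flag instead of raising lets the caller decide whether a timeout should fail the run or merely be recorded.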
Resource Optimization Opportunities
18 workflows were flagged HIGH severity `resource_heavy_for_domain` by the agentic profiler, the most frequent issue class this week. Notable examples:
Workflows with `agentic_fraction=0.50` have approximately half their turns doing pure data-gathering. The recommended fix in all cases is to move data-fetching to deterministic pre-agent steps (writing to `/tmp/gh-aw/agent/`) rather than having the agent issue repeated API calls itself. This reduces inference cost without changing task quality.
Recommendations
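The pre-agent data-fetching pattern described under Resource Optimization Opportunities could be sketched as follows. This is a minimal illustration rather than the project's actual step: `prefetch_to_dir` and the `open_issues` fetcher are hypothetical, and the demo writes to a temporary directory instead of `/tmp/gh-aw/agent/`.

```python
import json
import tempfile
from pathlib import Path


def prefetch_to_dir(fetchers, out_dir):
    """Call each fetcher once and write its result to <out_dir>/<name>.json."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, fetch in fetchers.items():
        path = out / f"{name}.json"
        path.write_text(json.dumps(fetch(), indent=2))
        written.append(path)
    return written


if __name__ == "__main__":
    # Hypothetical fetcher; a real pre-agent step would call the relevant API here.
    fetchers = {"open_issues": lambda: [{"number": 1, "title": "example"}]}
    # The report names /tmp/gh-aw/agent/ as the target; a temp dir keeps the demo side-effect free.
    with tempfile.TemporaryDirectory() as tmp:
        for path in prefetch_to_dir(fetchers, tmp):
            print(path.name)
```

The agent would then read the files rather than issuing the same calls itself, so the fetch happens once and deterministically.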
Fix Node.js PATH in chroot mode: Daily News and potentially other copilot-engine chroot workflows will continue failing until Node is resolvable post-pivot. Set `GH_AW_NODE_BIN` explicitly or add a toolcache Node to the chroot PATH. Severity: High (workflow completely non-functional).
Investigate `add-shadow-ops` docs build: The `Build documentation` step fails consistently on PR "docs: Add CoprrectionOps" #28676. Either fix the docs build on that branch or add a path filter to skip the step when docs sources haven't changed. Severity: Medium (PR-scoped, recovers when PR merges/closes).
Apply `allowed-files` fix to Schema Feature Coverage Checker: The agent succeeds but the PR is blocked by `protected-files`. Add `allowed-files: [".github/workflows/schema-demo-*.md"]` per the Failure Investigator's recommendation in "[aw] Schema Feature Coverage Checker failed" #28671. Severity: Medium (blocker for workflow's core output).
Reduce Schema Consistency Checker token consumption: At 8.1M tokens and 138 turns per run with 14.9% cache efficiency, this workflow's weekly cost compounds quickly. Consider splitting into smaller batch jobs or pre-fetching schema data deterministically. Severity: Medium (cost).
Audit Documentation Unbloat: 34.8 minutes, 4.8M tokens, and still failing is an anomalous profile. The Playwright usage suggests it's loading external content. Add a timeout guard and review whether the Playwright step is necessary or can be replaced with a direct fetch. Severity: Low (single failed run, but pattern warrants review).
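For the Node.js PATH recommendation, a preflight check along the following lines could surface the problem before the agent launches. This is a sketch under assumptions: the report does not say how `GH_AW_NODE_BIN` is consumed, so the precedence shown here (explicit variable first, then a PATH lookup) and the helper name `resolve_node` are hypothetical. The check is only meaningful when run with the environment the chroot actually sees.

```python
import os
import shutil


def resolve_node(env):
    """Return a usable Node.js path from the given environment, or None.

    ASSUMPTION: GH_AW_NODE_BIN, when set, holds the full path to the binary.
    """
    explicit = env.get("GH_AW_NODE_BIN")
    if explicit and os.path.isfile(explicit) and os.access(explicit, os.X_OK):
        return explicit
    # Fall back to looking up "node" on the PATH from the given environment.
    return shutil.which("node", path=env.get("PATH", ""))


if __name__ == "__main__":
    node = resolve_node(dict(os.environ))
    print(node or "node is not resolvable; the run would fail with exit code 127")
```

Failing fast on a `None` result would turn the silent zero-turn failure into an explicit configuration error.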