
[workflow-analysis] Weekly Workflow Analysis: Performance Issues and Optimization Opportunities (Oct 19-26, 2025) #2509

@github-actions


Executive Summary

Analysis of GitHub Actions workflow runs from October 19-26, 2025 reveals several critical issues affecting workflow reliability and performance. Key findings include high failure rates for specific workflows, excessive token consumption, API permission issues, and configuration problems.

📊 Key Metrics

Overall Statistics

  • Total Workflows Analyzed: 54 workflows in repository
  • Sample Period: Last 7 days
  • Workflows with Data: ~15 workflows examined in detail
  • Total Token Usage: 1,915,066 tokens analyzed
  • Total Cost: ~$1.90 USD
  • Total Turns: 293 turns
  • Total Errors: 132 errors
  • Total Warnings: 52 warnings

🔴 Critical Issues

1. Duplicate Code Detector - 100% Failure Rate

Status: 4/4 runs failed in the past week
Engine: Codex

Primary Error: Project activation failure

Project '/workspace' not found: Not a valid project name or directory. Existing project names: []

Impact:

  • Workflow completely non-functional
  • Average duration: 11.3 minutes wasted per run
  • 135 turns consumed across the four runs before failure (33.75 per run)

Root Cause: The Codex engine's serena_activate_project tool cannot find the workspace directory, suggesting a misconfiguration in how the project path is initialized.

Recommendation:

  • Fix project initialization in workflow configuration
  • Verify workspace path mapping for Codex engine
  • Consider adding pre-flight checks before expensive operations
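
A pre-flight check of the kind suggested above could be as small as verifying the workspace directory before the agent starts. The following is a minimal sketch, not the Codex or Serena API: `preflight_workspace` is a hypothetical helper, and `/workspace` is simply the path quoted in the error message.

```python
import os


def preflight_workspace(path: str) -> list[str]:
    """Return a list of problems found with the workspace path (empty if OK)."""
    problems = []
    if not os.path.isdir(path):
        problems.append(f"workspace directory not found: {path}")
    elif not os.access(path, os.R_OK | os.X_OK):
        problems.append(f"workspace directory not readable: {path}")
    elif not os.listdir(path):
        # An empty checkout is treated as a failure (assumption: the agent
        # needs at least one file to activate a project).
        problems.append(f"workspace directory is empty: {path}")
    return problems


# Example: run before any expensive step and stop early on problems.
for problem in preflight_workspace("/workspace"):
    print(f"preflight: {problem}")
```

Failing on a non-empty result before the agent job runs would avoid spending turns on a run that cannot succeed.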

2. Smoke Copilot - 50% Failure Rate

Status: 1/2 runs failed in the past week
Engine: Copilot

Issue: Run 18810304059 failed with no detailed error messages captured in audit data. The agent job failed but error details were not propagated.

Recommendation:

  • Improve error reporting for Copilot engine failures
  • Add diagnostic logging to capture failure context

3. Daily News - High Error Count

Status: 1/4 runs failed, but successful runs have excessive errors
Engine: Copilot

Issues Identified:

  1. MCP Config Warning (20 occurrences in a single run):

    not found: /tmp/gh-aw/mcp-config/mcp-servers.json
    
    • Non-critical but clutters logs
  2. Detection Job Failures (20 occurrences):

    Cannot read properties of undefined (reading 'text')
    
    • TypeErrors in detection phase
  3. API Permission Issues:

    Permission denied and could not request permission from user
    

Metrics:

  • Run 18775038124: 62 errors, 19 warnings (worst performer)
  • Average duration: 4.0 minutes (excluding the failed run)

Recommendation:

  • Create MCP config file or suppress warning if optional
  • Fix undefined property access in detection logic
  • Review and expand API permission scopes
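
The detection code itself is not shown in this report, and the TypeError comes from JavaScript, so the following Python sketch only illustrates the general fix: validate the shape of a value before reading `text` from it. The message layout (a `content` list of items with a `text` field) is an assumption, and `extract_text` is a hypothetical name.

```python
from typing import Any, Optional


def extract_text(message: Any) -> Optional[str]:
    """Return the first text field of a message, or None if the shape is unexpected."""
    if not isinstance(message, dict):
        return None
    content = message.get("content")
    if not isinstance(content, list):
        return None
    for item in content:
        # Skip items that are missing, not objects, or lack a string "text".
        if isinstance(item, dict) and isinstance(item.get("text"), str):
            return item["text"]
    return None
```

Returning `None` lets the caller decide whether a missing field is an error or can be skipped, instead of crashing the detection job.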

4. Artifacts Summary - Failure

Status: Failed run 18813782251
Engine: Copilot

Errors:

  1. Invalid rule format parsing:

    Invalid rule format: github\(list_workflow_run_artifacts\)
    
  2. Log path issues:

    not found: /tmp/gh-aw/.copilot/logs/
    
  3. Escaping issues with shell commands (branch: fix-shell-parentheses-escaping-*)

Recommendation:

  • Fix regex/rule parsing for GitHub tool names
  • Ensure log directories are created before use
  • Complete shell escaping fixes on the branch
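
The rule in the error message arrives with backslash-escaped parentheses, which suggests shell escaping was applied before parsing. One possible approach is to normalize those escapes before matching. This is a sketch under assumptions: the real rule grammar is not shown in this report, the `tool(argument)` shape is inferred from the error text, and `parse_rule` is a hypothetical helper.

```python
import re

# Assumed rule shape: tool(argument), e.g. github(list_workflow_run_artifacts).
_RULE = re.compile(r"^(?P<tool>[A-Za-z_][\w-]*)\((?P<arg>[^()]*)\)$")


def parse_rule(raw: str) -> tuple[str, str]:
    """Parse 'tool(arg)', tolerating backslash-escaped parentheses."""
    normalized = raw.strip().replace("\\(", "(").replace("\\)", ")")
    match = _RULE.match(normalized)
    if match is None:
        raise ValueError(f"Invalid rule format: {raw}")
    return match.group("tool"), match.group("arg")
```

Whether the escapes should be tolerated at parse time or removed where the rule is generated depends on code outside this report.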

5. Custom Agent Loading Failures

Affected Workflows: Smoke Copilot, Tidy
Engine: Copilot

Recurring Error:

Request to GitHub API at (redacted) failed with status 404

Warning:

Failed to load custom agents for githubnext/gh-aw: Not Found

Impact: Non-blocking but indicates missing custom agent configuration

Recommendation:

  • Either create custom agent configuration or remove the loading attempt
  • Document whether custom agents are expected for this repository

⚡ Performance Issues

1. Token Response Size Limits

Workflows Affected: Smoke Claude

Issue: Multiple tool calls exceed the 25,000-token response limit:

MCP tool "list_pull_requests" response (78309 tokens) exceeds maximum allowed tokens (25000)
MCP tool "list_pull_requests" response (41833 tokens) exceeds maximum allowed tokens (25000)

Impact:

  • Workflows must retry with pagination
  • Increases turn count and duration
  • Wastes tokens on failed calls

Recommendation:

  • Default to smaller page sizes for list operations
  • Use perPage parameter (e.g., 10-20 items per call)
  • Implement automatic pagination in workflow logic
  • Consider using minimal_output: true for GitHub search tools
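
The pagination recommendation can be sketched as a small generator that requests modest pages and stops once enough items are collected. Here `fetch_page` is a hypothetical stand-in for whatever wraps the list_pull_requests tool call. The defaults of 20 items per page and 100 items overall are assumptions, not measured values.

```python
from typing import Callable, Iterator


def paginate(
    fetch_page: Callable[[int, int], list[dict]],
    per_page: int = 20,
    max_items: int = 100,
) -> Iterator[dict]:
    """Yield items page by page, stopping at max_items or the first short page."""
    page, yielded = 1, 0
    while yielded < max_items:
        items = fetch_page(page, per_page)
        for item in items:
            if yielded >= max_items:
                return
            yield item
            yielded += 1
        if len(items) < per_page:
            return  # a short (or empty) page means there is nothing left
        page += 1
```

Because it is a generator, a caller that only needs the first few pull requests never triggers the later requests.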

2. GitHub API Permission Issues

Workflows Affected: Smoke Claude

Error:

failed to get user: GET (redacted) 403 Resource not accessible by integration

Impact: github_get_me calls fail, requiring workflows to adapt

Recommendation:

  • Review GITHUB_TOKEN permissions in workflow files
  • Add contents: read and metadata: read permissions where needed
  • Consider using personal access tokens for user-specific operations

3. Long-Running Workflows

Workflow: Documentation Unbloat (run 18809696174)

Metrics:

  • Duration: 5.8 minutes
  • Token usage: 1,018,212 tokens
  • Cost: $0.68
  • Turns: 90 turns
  • Errors: 18 errors (but succeeded)

Issues:

  • Write tool used without prior Read
  • Git operations on files outside repository
  • Excessive turn count suggests workflow complexity

Recommendation:

  • Add pre-flight Read calls before Write operations
  • Review git operations for path correctness
  • Consider breaking down into smaller sub-workflows

📈 Reliability Metrics by Engine

Claude Engine

  • Workflows Analyzed: Smoke Claude, Documentation Unbloat, Audit Workflows
  • Success Rate: ~100% (2/2 smoke tests passed)
  • Average Token Usage: 448,427 tokens/run
  • Average Cost: $0.61/run
  • Average Turns: 33.5 turns/run
  • Common Issues:
    • Token response size limits (pagination needed)
    • GitHub API permission errors (non-blocking)

Performance: ✅ Good - Most reliable engine, high success rate


Copilot Engine

  • Workflows Analyzed: Smoke Copilot, Daily News, Artifacts Summary, Tidy
  • Success Rate: ~60% (mixed results)
  • Average Errors: 19.5 errors/run (high variance)
  • Common Issues:
    • Custom agent loading failures (404)
    • Detection job TypeErrors
    • Permission denied warnings
    • MCP config warnings

Performance: ⚠️ Moderate - Needs error handling improvements


Codex Engine

  • Workflows Analyzed: Duplicate Code Detector
  • Success Rate: 0% (0/4 runs passed)
  • Average Duration: 11.3 minutes wasted/run
  • Average Turns: 33.75 turns before failure
  • Common Issues:
    • Project activation failures (critical)
    • Stream disconnection warnings

Performance: 🔴 Poor - Completely non-functional for this workflow


💡 Optimization Opportunities

1. Reduce Token Waste

Target: Smoke Claude and other high-token workflows

Actions:

  • Implement minimal_output: true for GitHub API calls
  • Use smaller pagination sizes (10-20 per page)
  • Cache frequently accessed data between jobs
  • Add jq filters to trim unnecessary data

Estimated Savings: 30-50% token reduction


2. Improve Error Handling

Target: All workflows

Actions:

  • Add try-catch wrappers around tool calls
  • Implement exponential backoff for retries
  • Fail fast on unrecoverable errors
  • Add pre-flight validation checks

Expected Impact: Reduce failed turn attempts by 40%
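
As a rough illustration of the retry and fail-fast actions above, the sketch below retries transient failures with doubling delays and re-raises immediately on errors marked unrecoverable. `UnrecoverableError` and `call_with_backoff` are hypothetical names, and which errors count as unrecoverable is a decision for each workflow.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class UnrecoverableError(Exception):
    """Raised for errors that retrying cannot fix (e.g. a missing project)."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying transient failures with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except UnrecoverableError:
            raise  # fail fast: retrying would only waste turns
        except Exception:
            if attempt == max_attempts - 1:
                raise  # retries exhausted
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise RuntimeError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the delay schedule can be checked in tests without waiting.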


3. Fix Configuration Issues

Priority: High
Target: All Copilot workflows

Actions:

  • Create /tmp/gh-aw/mcp-config/mcp-servers.json or suppress warning
  • Ensure log directories exist before use
  • Document required vs optional configuration files
  • Add workflow initialization step to create required directories

Expected Impact: Reduce warning noise by 60%
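
An initialization step along these lines might look like the following sketch. The two directories are taken from the warnings quoted earlier in this report, and `ensure_dirs` is a hypothetical helper. Creating the directory alone may not silence the MCP warning if the tool also expects the mcp-servers.json file to exist.

```python
from pathlib import Path

# Directories taken from the "not found" warnings quoted in this report.
REQUIRED_DIRS = [
    "/tmp/gh-aw/mcp-config",
    "/tmp/gh-aw/.copilot/logs",
]


def ensure_dirs(paths: list[str]) -> list[Path]:
    """Create each directory (and parents) if missing; return the ones created."""
    created = []
    for raw in paths:
        path = Path(raw)
        if not path.is_dir():
            path.mkdir(parents=True, exist_ok=True)
            created.append(path)
    return created
```

Calling `ensure_dirs(REQUIRED_DIRS)` at the start of a job is idempotent, so it is safe to run on every workflow.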


4. Standardize API Permissions

Target: All workflows using GitHub API

Actions:

  • Audit all workflows for required permissions
  • Add standard permission block to workflow templates:
    permissions:
      contents: read
      issues: write
      pull-requests: write
      metadata: read
  • Document minimum permissions per workflow type

Expected Impact: Eliminate 403 permission errors


5. Fix Critical Blockers

Priority: Critical
Target: Duplicate Code Detector

Immediate Actions:

  1. Debug Codex workspace initialization
  2. Add diagnostic logging for project activation
  3. Consider switching to alternative engine if Codex issues persist
  4. Disable workflow until fixed to avoid wasted resources

Expected Impact: Save ~45 minutes/week of failed runs


📋 Action Items by Priority

🔥 Critical (Week 1)

  1. Fix Duplicate Code Detector project activation
  2. Resolve Artifacts Summary rule parsing error
  3. Create or suppress MCP config warnings

⚠️ High (Week 2)

  1. Implement pagination defaults for GitHub API calls
  2. Add permission blocks to all workflows
  3. Fix Daily News detection TypeError
  4. Improve error reporting for Copilot failures

📊 Medium (Week 3-4)

  1. Optimize token usage with minimal_output flags
  2. Add pre-flight validation to workflows
  3. Document custom agent configuration expectations
  4. Review and optimize Documentation Unbloat workflow

🔧 Low (Backlog)

  1. Implement workflow-level caching
  2. Create monitoring dashboard for workflow health
  3. Add automated alerts for repeated failures

🎯 Success Metrics

To track improvement, monitor these KPIs weekly:

  1. Overall Success Rate: Target >90% (current: ~70%)
  2. Average Token/Run: Target <300K tokens (current: 448K for Claude)
  3. Average Cost/Run: Target <$0.40 (current: $0.61 for Claude)
  4. Error Count/Run: Target <5 errors (current: 19.5 for Copilot)
  5. Failed Runs/Week: Target <2 failures (current: 7+ failures)
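
For weekly tracking, the targets above could be checked with a few lines of code. This is a sketch only: the current values are copied from the list above (with "~70%" and "7+" taken as 70 and 7), and `kpi_status` and `report` are hypothetical helpers.

```python
# (name, target, current, higher_is_better): values copied from the list above.
KPIS = [
    ("Overall success rate (%)", 90, 70, True),
    ("Average tokens/run (Claude)", 300_000, 448_427, False),
    ("Average cost/run (Claude, USD)", 0.40, 0.61, False),
    ("Errors/run (Copilot)", 5, 19.5, False),
    ("Failed runs/week", 2, 7, False),
]


def kpi_status(target: float, current: float, higher_is_better: bool) -> bool:
    """Return True when the current value meets its target (strict bound)."""
    return current > target if higher_is_better else current < target


def report(kpis) -> list[str]:
    """Format one status line per KPI, marked OK or MISS."""
    lines = []
    for name, target, current, higher in kpis:
        mark = "OK" if kpi_status(target, current, higher) else "MISS"
        lines.append(f"{mark:4} {name}: current={current}, target={target}")
    return lines


for line in report(KPIS):
    print(line)
```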

📝 Methodology

Data Collection:

  • Used mcp__agentic_workflows__logs tool with 7-day lookback
  • Analyzed 15+ workflows across 3 engines (Claude, Copilot, Codex)
  • Examined audit data for failed runs

Workflows Analyzed:

  • smoke-claude, smoke-copilot, smoke-codex
  • daily-news, duplicate-code-detector
  • audit-workflows, documentation-unbloat
  • artifacts-summary, tidy
  • And more

Tools Used:

  • status - Workflow inventory
  • logs - Historical run data
  • audit - Detailed failure analysis

🔗 References

  • Run 18810304059 (Smoke Copilot failure): (redacted)
  • Run 18813782251 (Artifacts Summary failure): (redacted)
  • Run 18808429389 (Duplicate Code Detector failure): (redacted)
  • Run 18775038124 (Daily News high errors): (redacted)
  • Run 18809696174 (Documentation Unbloat success): (redacted)

Report Generated: 2025-10-26
Analysis Period: 2025-10-19 to 2025-10-26
Next Review: 2025-11-02

AI generated by Weekly Workflow Analysis
