[workflow-analysis] Weekly Workflow Analysis: Performance Issues and Optimization Opportunities (Oct 19-26, 2025)

## Executive Summary

Analysis of GitHub Actions workflow runs from October 19-26, 2025 reveals several critical issues affecting workflow reliability and performance. Key findings include high failure rates for specific workflows, excessive token consumption, API permission issues, and configuration problems.

## 📊 Key Metrics

### Overall Statistics
- **Total Workflows in Repository**: 54
- **Sample Period**: Last 7 days
- **Workflows with Data**: ~15 workflows examined in detail
- **Total Token Usage**: 1,915,066 tokens analyzed
- **Total Cost**: ~$1.90 USD
- **Total Turns**: 293 turns
- **Total Errors**: 132 errors
- **Total Warnings**: 52 warnings

---

## 🔴 Critical Issues

### 1. Duplicate Code Detector - 100% Failure Rate
**Status**: 4/4 runs failed in the past week  
**Engine**: Codex

**Primary Error**: Project activation failure
```
Project '/workspace' not found: Not a valid project name or directory. Existing project names: []
```

**Impact**: 
- Workflow completely non-functional
- Average duration: 11.3 minutes wasted per run
- 135 turns consumed across the 4 failed runs (33.75 per run on average)

**Root Cause**: The `serena_activate_project` tool invoked by the Codex engine cannot find the `/workspace` directory, and the list of known projects is empty. This suggests a misconfiguration in how the project path is initialized.

**Recommendation**: 
- Fix project initialization in workflow configuration
- Verify workspace path mapping for Codex engine
- Consider adding pre-flight checks before expensive operations
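
A pre-flight check of the kind recommended above could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the workflow's actual implementation: the function name `preflight_check_workspace` is hypothetical, and the only path taken from the report is `/workspace`, as it appears in the error message.

```python
import os
from typing import List


def preflight_check_workspace(path: str) -> List[str]:
    """Return a list of problems with the workspace path (empty means OK)."""
    problems = []
    if not os.path.isdir(path):
        problems.append(f"workspace directory does not exist: {path}")
    elif not os.access(path, os.R_OK | os.X_OK):
        problems.append(f"workspace directory is not readable: {path}")
    return problems


def should_run_agent(path: str) -> bool:
    """Decide whether to start the expensive agent step at all."""
    problems = preflight_check_workspace(path)
    for problem in problems:
        print(f"pre-flight: {problem}")
    return not problems
```

Calling `should_run_agent("/workspace")` before project activation would let the job stop in seconds rather than after an 11-minute run.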

---

### 2. Smoke Copilot - 50% Failure Rate
**Status**: 1/2 runs failed in the past week  
**Engine**: Copilot

**Issue**: Run 18810304059 failed with no detailed error messages captured in audit data. The agent job failed but error details were not propagated.

**Recommendation**:
- Improve error reporting for Copilot engine failures
- Add diagnostic logging to capture failure context

---

### 3. Daily News - High Error Count
**Status**: 1/4 runs failed, but successful runs have excessive errors  
**Engine**: Copilot

**Issues Identified**:
1. **MCP Config Warning** (20 occurrences in a single run):
   ```
   not found: /tmp/gh-aw/mcp-config/mcp-servers.json
   ```
   - Non-critical but clutters logs
   
2. **Detection Job Failures** (20 occurrences):
   ```
   Cannot read properties of undefined (reading 'text')
   ```
   - TypeErrors in detection phase
   
3. **API Permission Issues**:
   ```
   Permission denied and could not request permission from user
   ```

**Metrics**:
- Run 18775038124: 62 errors, 19 warnings (worst performer)
- Average duration: 4.0 minutes (excluding failure)

**Recommendation**:
- Create MCP config file or suppress warning if optional
- Fix undefined property access in detection logic
- Review and expand API permission scopes
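
The `Cannot read properties of undefined (reading 'text')` message is a JavaScript `TypeError`, so the real fix belongs in the detection job's own code, which is not shown in this report. The guard pattern itself can be sketched in Python as below. The entry shape (a list of dictionaries with an optional `text` field) and the name `collect_texts` are assumptions made for illustration only.

```python
from typing import Any, Iterable, List


def collect_texts(entries: Iterable[Any]) -> List[str]:
    """Collect 'text' fields, skipping entries that are missing or malformed."""
    texts = []
    for entry in entries:
        if not isinstance(entry, dict):
            # Covers None and other unexpected values instead of raising.
            continue
        text = entry.get("text")
        if isinstance(text, str):
            texts.append(text)
    return texts
```

Skipping malformed entries keeps the run going; logging each skipped entry once would still leave a trace for debugging.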

---

### 4. Artifacts Summary - Failure
**Status**: Failed run 18813782251  
**Engine**: Copilot

**Errors**:
1. Invalid rule format parsing:
   ```
   Invalid rule format: github$list_workflow_run_artifacts$
   ```
   
2. Log path issues:
   ```
   not found: /tmp/gh-aw/.copilot/logs/
   ```

3. Escaping issues with shell commands (branch: fix-shell-parentheses-escaping-*)

**Recommendation**:
- Fix regex/rule parsing for GitHub tool names
- Ensure log directories are created before use
- Complete shell escaping fixes on the branch

---

### 5. Custom Agent Loading Failures
**Affected Workflows**: Smoke Copilot, Tidy  
**Engine**: Copilot

**Recurring Error**:
```
Request to GitHub API at (redacted) failed with status 404
```

**Warning**:
```
Failed to load custom agents for githubnext/gh-aw: Not Found
```

**Impact**: Non-blocking but indicates missing custom agent configuration

**Recommendation**:
- Either create custom agent configuration or remove the loading attempt
- Document whether custom agents are expected for this repository

---

## ⚡ Performance Issues

### 1. Token Response Size Limits
**Workflows Affected**: Smoke Claude

**Issue**: Multiple tool calls exceed the 25,000 token response limit:
```
MCP tool "list_pull_requests" response (78309 tokens) exceeds maximum allowed tokens (25000)
MCP tool "list_pull_requests" response (41833 tokens) exceeds maximum allowed tokens (25000)
```

**Impact**: 
- Workflows must retry with pagination
- Increases turn count and duration
- Wastes tokens on failed calls

**Recommendation**:
- Default to smaller page sizes for list operations
- Use `perPage` parameter (e.g., 10-20 items per call)
- Implement automatic pagination in workflow logic
- Consider using `minimal_output: true` for GitHub search tools
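
The pagination recommendation can be sketched as a small generator that requests modest pages and stops at the first short page. This is a minimal Python sketch, not the workflows' actual logic: `fetch_page(page, per_page)` is a hypothetical callable standing in for whatever list call is used, and `make_fake_fetcher` exists only so the example runs without network access.

```python
from typing import Callable, Dict, Iterator, List

# Hypothetical signature: fetch_page(page, per_page) -> list of items.
FetchPage = Callable[[int, int], List[Dict]]


def iter_items(fetch_page: FetchPage, per_page: int = 20, max_pages: int = 50) -> Iterator[Dict]:
    """Yield items one page at a time, stopping at the first short page."""
    for page in range(1, max_pages + 1):
        items = fetch_page(page, per_page)
        yield from items
        if len(items) < per_page:
            # A short (or empty) page means there is nothing left to fetch.
            break


def make_fake_fetcher(total: int) -> FetchPage:
    """Stand-in for a real list call so the sketch runs offline."""
    data = [{"number": n} for n in range(1, total + 1)]

    def fetch_page(page: int, per_page: int) -> List[Dict]:
        start = (page - 1) * per_page
        return data[start:start + per_page]

    return fetch_page
```

The `max_pages` cap bounds the worst case, so a very large repository cannot turn one list operation into an unbounded number of calls.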

---

### 2. GitHub API Permission Issues
**Workflows Affected**: Smoke Claude

**Error**:
```
failed to get user: GET (redacted) 403 Resource not accessible by integration
```

**Impact**: `github_get_me` calls fail, requiring workflows to adapt

**Recommendation**:
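- Review `GITHUB_TOKEN` permissions in workflow files
- Add `contents: read` where needed (`metadata` read access is granted to the token implicitly and is not a key accepted by the workflow `permissions:` block)
- Use a personal access token for user-specific operations such as `github_get_me`; the `Resource not accessible by integration` message suggests the installation-scoped `GITHUB_TOKEN` cannot serve this call regardless of the scopes declared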

---

### 3. Long-Running Workflows
**Workflow**: Documentation Unbloat (run 18809696174)

**Metrics**:
- Duration: 5.8 minutes
- Token usage: 1,018,212 tokens
- Cost: $0.68
- Turns: 90 turns
- Errors: 18 errors (but succeeded)

**Issues**:
- Write tool used without prior Read
- Git operations on files outside repository
- Excessive turn count suggests workflow complexity

**Recommendation**:
- Add pre-flight Read calls before Write operations
- Review git operations for path correctness
- Consider breaking down into smaller sub-workflows

---

## 📈 Reliability Metrics by Engine

### Claude Engine
- **Workflows Analyzed**: Smoke Claude, Documentation Unbloat, Audit Workflows
- **Success Rate**: ~100% (2/2 smoke tests passed)
- **Average Token Usage**: 448,427 tokens/run
- **Average Cost**: $0.61/run
- **Average Turns**: 33.5 turns/run
- **Common Issues**: 
  - Token response size limits (pagination needed)
  - GitHub API permission errors (non-blocking)

**Performance**: ✅ **Good** - Most reliable engine, high success rate

---

### Copilot Engine
- **Workflows Analyzed**: Smoke Copilot, Daily News, Artifacts Summary, Tidy
- **Success Rate**: ~60% (mixed results)
- **Average Errors**: 19.5 errors/run (high variance)
- **Common Issues**:
  - Custom agent loading failures (404)
  - Detection job TypeErrors
  - Permission denied warnings
  - MCP config warnings

**Performance**: ⚠️ **Moderate** - Needs error handling improvements

---

### Codex Engine
- **Workflows Analyzed**: Duplicate Code Detector
- **Success Rate**: 0% (0/4 runs passed)
- **Average Duration**: 11.3 minutes wasted/run
- **Average Turns**: 33.75 turns before failure
- **Common Issues**:
  - Project activation failures (critical)
  - Stream disconnection warnings

**Performance**: 🔴 **Poor** - Completely non-functional for this workflow

---

## 💡 Optimization Opportunities

### 1. Reduce Token Waste
**Target**: Smoke Claude and other high-token workflows

**Actions**:
- Implement `minimal_output: true` for GitHub API calls
- Use smaller pagination sizes (10-20 per page)
- Cache frequently accessed data between jobs
- Add jq filters to trim unnecessary data

**Estimated Savings**: 30-50% token reduction

---

### 2. Improve Error Handling
**Target**: All workflows

**Actions**:
- Add try-catch wrappers around tool calls
- Implement exponential backoff for retries
- Fail fast on unrecoverable errors
- Add pre-flight validation checks

**Expected Impact**: Reduce failed turn attempts by 40%
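
Retrying with exponential backoff while failing fast on unrecoverable errors could be combined roughly as follows. This is a hedged Python sketch: `UnrecoverableError` and `call_with_backoff` are hypothetical names, and deciding which real errors count as unrecoverable (for example, the project activation failure above) is left to the workflow author.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class UnrecoverableError(Exception):
    """Raised for errors that retrying cannot fix."""


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func, retrying transient failures with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return func()
        except UnrecoverableError:
            # Fail fast: retrying would only burn turns and tokens.
            raise
        except Exception:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable so the delay schedule (1 s, 2 s, 4 s with the defaults) can be checked in tests without waiting.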

---

### 3. Fix Configuration Issues
**Priority**: High  
**Target**: All Copilot workflows

**Actions**:
- Create `/tmp/gh-aw/mcp-config/mcp-servers.json` or suppress warning
- Ensure log directories exist before use
- Document required vs optional configuration files
- Add workflow initialization step to create required directories

**Expected Impact**: Reduce warning noise by 60%
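
An initialization step that creates the required directories might be sketched as below. The two default paths are the parent directories of the locations named in the warnings in this report; the function name `ensure_directories` is hypothetical, and whether these directories should be created by the workflow or by the engine itself is an open question this sketch does not settle.

```python
import os
from typing import Iterable, List

# Directories derived from the "not found" messages in this report.
DEFAULT_DIRS = [
    "/tmp/gh-aw/mcp-config",
    "/tmp/gh-aw/.copilot/logs",
]


def ensure_directories(paths: Iterable[str]) -> List[str]:
    """Create each directory if needed and return the ones that now exist."""
    created = []
    for path in paths:
        # exist_ok=True makes the step safe to run more than once.
        os.makedirs(path, exist_ok=True)
        if os.path.isdir(path):
            created.append(path)
    return created
```

Running `ensure_directories(DEFAULT_DIRS)` at the start of a job is idempotent, so it can be added to every Copilot workflow without coordination between jobs.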

---

### 4. Standardize API Permissions
**Target**: All workflows using GitHub API

**Actions**:
- Audit all workflows for required permissions
- Add standard permission block to workflow templates:
  ```yaml
  # metadata read access is implicit and is not a valid key here
  permissions:
    contents: read
    issues: write
    pull-requests: write
  ```
- Document minimum permissions per workflow type

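**Expected Impact**: Eliminate 403 errors caused by missing permission scopes (calls the `GITHUB_TOKEN` cannot serve at all, such as `github_get_me`, need a different token)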

---

### 5. Fix Critical Blockers
**Priority**: Critical  
**Target**: Duplicate Code Detector

**Immediate Actions**:
1. Debug Codex workspace initialization
2. Add diagnostic logging for project activation
3. Consider switching to alternative engine if Codex issues persist
4. Disable workflow until fixed to avoid wasted resources

**Expected Impact**: Save ~45 minutes/week of failed runs (4 runs × 11.3 minutes)

---

## 📋 Action Items by Priority

### 🔥 Critical (Week 1)
1. [ ] Fix Duplicate Code Detector project activation
2. [ ] Resolve Artifacts Summary rule parsing error
3. [ ] Create or suppress MCP config warnings

### ⚠️ High (Week 2)
4. [ ] Implement pagination defaults for GitHub API calls
5. [ ] Add permission blocks to all workflows
6. [ ] Fix Daily News detection TypeError
7. [ ] Improve error reporting for Copilot failures

### 📊 Medium (Week 3-4)
8. [ ] Optimize token usage with minimal_output flags
9. [ ] Add pre-flight validation to workflows
10. [ ] Document custom agent configuration expectations
11. [ ] Review and optimize Documentation Unbloat workflow

### 🔧 Low (Backlog)
12. [ ] Implement workflow-level caching
13. [ ] Create monitoring dashboard for workflow health
14. [ ] Add automated alerts for repeated failures

---

## 🎯 Success Metrics

To track improvement, monitor these KPIs weekly:

1. **Overall Success Rate**: Target >90% (current: ~70%)
2. **Average Token/Run**: Target <300K tokens (current: 448K for Claude)
3. **Average Cost/Run**: Target <$0.40 (current: $0.61 for Claude)
4. **Error Count/Run**: Target <5 errors (current: 19.5 for Copilot)
5. **Failed Runs/Week**: Target <2 failures (current: 7+ failures)
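
The weekly comparison against these targets can be automated with a few lines of Python. This is only a sketch of the bookkeeping: the `Kpi` class and `unmet` helper are hypothetical, the values are copied from the list above, and the approximate figures "~70%" and "7+" are taken as 70 and 7.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Kpi:
    name: str
    current: float
    target: float
    higher_is_better: bool

    def met(self) -> bool:
        """True when the current value is strictly on the right side of the target."""
        if self.higher_is_better:
            return self.current > self.target
        return self.current < self.target


# Current values and targets as listed in this report.
KPIS: List[Kpi] = [
    Kpi("Overall success rate (%)", 70, 90, higher_is_better=True),
    Kpi("Average tokens/run, Claude (thousands)", 448, 300, higher_is_better=False),
    Kpi("Average cost/run, Claude (USD)", 0.61, 0.40, higher_is_better=False),
    Kpi("Errors/run, Copilot", 19.5, 5, higher_is_better=False),
    Kpi("Failed runs/week", 7, 2, higher_is_better=False),
]


def unmet(kpis: List[Kpi]) -> List[str]:
    """Return the names of KPIs that have not reached their target."""
    return [kpi.name for kpi in kpis if not kpi.met()]
```

With the figures from this report, `unmet(KPIS)` returns all five names, which matches the assessment that every target is currently missed.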

---

## 📝 Methodology

**Data Collection**:
- Used `mcp__agentic_workflows__logs` tool with 7-day lookback
- Analyzed 15+ workflows across 3 engines (Claude, Copilot, Codex)
- Examined audit data for failed runs

**Workflows Analyzed**:
- smoke-claude, smoke-copilot, smoke-codex
- daily-news, duplicate-code-detector
- audit-workflows, documentation-unbloat
- artifacts-summary, tidy
- And more

**Tools Used**:
- `status` - Workflow inventory
- `logs` - Historical run data
- `audit` - Detailed failure analysis

---

## 🔗 References

- Run 18810304059 (Smoke Copilot failure): (redacted)
- Run 18813782251 (Artifacts Summary failure): (redacted)
- Run 18808429389 (Duplicate Code Detector failure): (redacted)
- Run 18775038124 (Daily News high errors): (redacted)
- Run 18809696174 (Documentation Unbloat success): (redacted)

---

**Report Generated**: 2025-10-26  
**Analysis Period**: 2025-10-19 to 2025-10-26  
**Next Review**: 2025-11-02




> AI generated by [Weekly Workflow Analysis](https://github.com/githubnext/gh-aw/actions/runs/18813809064)
