Workflow Health Dashboard - 2026-01-04
Overview
- Total Workflows: 128
- Healthy: Unable to determine (no metrics data) 🔴
- Warning: 10 (7.8%) - Outdated lock files ⚠️
- Critical: 2 issues identified 🚨
- Inactive: Unknown (no metrics data)
Critical Issues 🚨
Issue 1: Metrics Collection System Down
- Status: No execution metrics available
- Error: /tmp/gh-aw/repo-memory-default/memory/default/metrics/latest.json does not exist
- Impact: Cannot monitor workflow health, success rates, or failure patterns
- Root Cause: metrics-collector.md workflow has outdated lock file
- Action: Issue created for enabling metrics collection
- Priority: P0
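The missing-file condition reported above can be checked with a minimal sketch like the following. The helper names `load_latest_metrics` and `describe_metrics_status` are hypothetical, and the schema of latest.json is not described in this report, so the code only assumes the file holds a JSON object.

```python
import json
from pathlib import Path
from typing import Optional

# Path copied from the error message in this report.
METRICS_PATH = Path("/tmp/gh-aw/repo-memory-default/memory/default/metrics/latest.json")


def load_latest_metrics(path: Path = METRICS_PATH) -> Optional[dict]:
    """Return the parsed metrics snapshot, or None if it is missing or unreadable.

    Hypothetical helper: the only assumption is that the file contains a JSON object.
    """
    try:
        with path.open(encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError):
        # OSError covers a missing file; ValueError covers invalid JSON or encoding.
        return None
    return data if isinstance(data, dict) else None


def describe_metrics_status(path: Path = METRICS_PATH) -> str:
    """Summarize whether execution metrics can be read from the given path."""
    metrics = load_latest_metrics(path)
    if metrics is None:
        return f"No execution metrics available: {path} is missing or not valid JSON"
    return f"Metrics available with {len(metrics)} top-level keys"
```

Returning None rather than raising lets a dashboard run fall back to structural checks only, which is how this report proceeds.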
Issue 2: 10 Workflows with Outdated Lock Files (7.8%)
- Status: Source .md files modified after .lock.yml compilation
- Impact: Runtime behavior may not match source code
- Affected Workflows:
  - smoke-copilot-playwright.md
  - go-fan.md
  - stale-repo-identifier.md
  - duplicate-code-detector.md
  - copilot-pr-nlp-analysis.md
  - smoke-srt.md
  - github-mcp-structural-analysis.md
  - metrics-collector.md ⚠️ Critical
  - incident-response.md
  - layout-spec-maintainer.md
- Action: Issue created for recompilation
- Priority: P0
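One way the "source modified after compilation" check could be approximated is sketched below. It rests on two assumptions not confirmed by this report: that foo.md compiles to foo.lock.yml in the same directory, and that file modification times reflect edit order. The function name `find_outdated_lock_files` is hypothetical.

```python
from pathlib import Path
from typing import List


def find_outdated_lock_files(workflow_dir: Path) -> List[str]:
    """List source .md files that are newer than their compiled .lock.yml.

    Assumptions: foo.md pairs with foo.lock.yml in the same directory,
    and modification times reflect the order of edits.
    """
    outdated = []
    for source in sorted(workflow_dir.glob("*.md")):
        lock = source.with_name(source.stem + ".lock.yml")
        if not lock.exists():
            # A missing lock file is a coverage problem, not a staleness one.
            continue
        if source.stat().st_mtime > lock.stat().st_mtime:
            outdated.append(source.name)
    return outdated
```

Git does not preserve modification times on checkout, so in a fresh clone a comparison based on commit timestamps or content hashes would likely be more reliable than this mtime check.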
Structural Health Analysis ✅
Since execution metrics are unavailable, this assessment focuses on structural health:
Compilation Coverage
- ✅ 100% coverage: All 128 workflows have .lock.yml files
- ⚠️ 7.8% outdated: 10 workflows need recompilation
Engine Distribution
| Engine | Count | Percentage | Status |
| --- | --- | --- | --- |
| Copilot | 69 | 53.9% | ✅ Healthy diversity |
| Claude | 25 | 19.5% | ✅ Good alternative |
| Codex | 7 | 5.5% | ✅ Specialized use |
| Other | 27 | 21.1% | ⚠️ Needs classification |
Analysis: Spreading workflows across several engines reduces the risk of a single point of failure. Copilot as primary engine is appropriate for GitHub integration.
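The percentages in the engine table can be reproduced from the counts with a short sketch. The counts and the total of 128 come from this report; the function name `engine_shares` is hypothetical.

```python
from typing import Dict

# Counts copied from the Engine Distribution table in this report.
ENGINE_COUNTS: Dict[str, int] = {"Copilot": 69, "Claude": 25, "Codex": 7, "Other": 27}
TOTAL_WORKFLOWS = 128


def engine_shares(counts: Dict[str, int], total: int) -> Dict[str, float]:
    """Return each engine's share of workflows as a percentage with one decimal."""
    if sum(counts.values()) != total:
        raise ValueError("engine counts do not add up to the workflow total")
    return {engine: round(100 * n / total, 1) for engine, n in counts.items()}


shares = engine_shares(ENGINE_COUNTS, TOTAL_WORKFLOWS)
for engine, share in shares.items():
    print(f"{engine}: {share}%")
```

The sum check guards against a workflow being counted under two engines or under none, which matters here because 27 workflows still sit in the unclassified "Other" bucket.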
Workflow Categories
| Category | Count | Notes |
| --- | --- | --- |
| Campaign Workflows | 2 | Campaign orchestration |
| Smoke Tests | 10 | Testing infrastructure |
| Daily Scheduled | 17 | Regular maintenance |
| Weekly Scheduled | 1 | Long-term analysis |
| Hourly Scheduled | 1 | High-frequency monitoring |
| Event-Triggered | ~97 | Majority of workflows |
Analysis: Good balance of scheduled vs. event-triggered workflows. Scheduling spread reduces resource contention.
Tool Usage
| Tool | Workflows | Coverage | Status |
| --- | --- | --- | --- |
| GitHub MCP | 94 | 73% | ✅ Excellent adoption |
| Playwright | 11 | 9% | ✅ Appropriate for UI testing |
| Fetch | 8 | 6% | ✅ Web content retrieval |
Analysis: Heavy GitHub MCP usage is expected and healthy for repository operations.
Systemic Patterns
Positive Indicators ✅
- Complete compilation coverage: All workflows have lock files
- Strong naming conventions: Clear categorization (daily-, smoke-, etc.)
- Engine diversity: Multiple engines prevent vendor lock-in
- Standardized tooling: Widespread GitHub MCP adoption
- No orphaned lock files: Clean 1:1 mapping between source and compiled files
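The 1:1 mapping between source and compiled files could be verified with a set comparison such as the sketch below. It assumes, without confirmation from this report, that foo.md pairs with foo.lock.yml in the same directory; `check_lock_mapping` is a hypothetical name.

```python
from pathlib import Path
from typing import List, Tuple


def check_lock_mapping(workflow_dir: Path) -> Tuple[List[str], List[str]]:
    """Return (sources without a lock file, lock files without a source).

    Assumption: foo.md pairs with foo.lock.yml in the same directory.
    """
    sources = {p.name[: -len(".md")] for p in workflow_dir.glob("*.md")}
    locks = {p.name[: -len(".lock.yml")] for p in workflow_dir.glob("*.lock.yml")}
    return sorted(sources - locks), sorted(locks - sources)
```

Two empty lists correspond to the state described above: complete compilation coverage and no orphaned lock files.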
Areas of Concern ⚠️
- Meta-monitoring gap: The metrics collector's own lock file is outdated
- No execution visibility: Cannot assess runtime health
- Missing metrics infrastructure: Need 7 days of data for trends
- Safe outputs visibility: Frontmatter declarations appear missing
Data Limitations 🔴
Current Analysis Limited By:
- ❌ No workflow execution metrics
- ❌ No failure rate data
- ❌ No runtime performance data
- ❌ No error pattern analysis
- ❌ Cannot calculate MTBF
- ❌ Cannot identify failing workflows
Reason: The metrics-collector.md workflow has an outdated lock file, and metrics storage is not populated.
Impact: This assessment can only evaluate structural health (compilation, configuration, categorization). Runtime health monitoring requires metrics data.
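Once failure timestamps exist, MTBF could be estimated along these lines. This is a simplified sketch: it divides the observation window by the number of failures and ignores repair time, and the function name and input shape are assumptions rather than the format of the actual metrics data.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def mean_time_between_failures(
    window_start: datetime, window_end: datetime, failures: List[datetime]
) -> Optional[timedelta]:
    """Observation window divided by the number of failures inside it.

    Returns None when no failure was observed, since MTBF is undefined then.
    Simplification: repair time is ignored, so the whole window counts as uptime.
    """
    if window_end <= window_start:
        raise ValueError("window_end must be after window_start")
    in_window = [t for t in failures if window_start <= t <= window_end]
    if not in_window:
        return None
    return (window_end - window_start) / len(in_window)
```

For example, a 28-day window containing two failures yields 14 days. With no metrics data at all, as in this run, there are no timestamps to pass in, which is why MTBF is listed as not calculable.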
Recommendations
Immediate Actions (P0)
- ✅ Recompile outdated workflows - Issue created
- ✅ Enable metrics collection - Issue created
- ⏳ Verify metrics collection - Pending workflow fix
- ⏳ Wait for baseline data - Need 7 days of metrics
High Priority (P1)
- Establish monitoring alerts - Set up notifications for workflow failures
- Document workflow dependencies - Map inter-workflow relationships
- Verify safe outputs usage - Deep dive into workflow bodies
Medium Priority (P2)
- Analyze execution patterns - Once metrics available
- Optimize scheduling - Prevent resource contention
- Review smoke test coverage - Ensure critical paths tested
Low Priority (P3)
- Standardize frontmatter - Consistent metadata across workflows
- Add workflow descriptions - Improve discoverability
- Document engine selection - Guidelines for choosing engines
Actions Taken This Run
- ✅ Scanned 128 executable workflows
- ✅ Verified 100% compilation coverage
- ✅ Identified 10 outdated lock files
- ✅ Created 2 P0 issues for critical problems
- ✅ Saved analysis to shared repo memory
- ✅ Created coordination alerts for other meta-orchestrators
Trends
- Overall health score: Unable to calculate (no metrics data)
- Compilation health: 92.2% (118/128 up-to-date)
- New failures this week: Unknown (no metrics)
- Fixed issues this week: Unknown (no metrics)
- Average success rate: Unknown (no metrics)
Next Steps
- ⏳ Monitor recompilation issue resolution
- ⏳ Monitor metrics collection enablement
- ⏳ Wait 7 days for metrics baseline
- 🔄 Re-run comprehensive health analysis with execution data
- 🔄 Establish ongoing monitoring and alerting
Success Metrics Target
Once metrics are available, track:
- Overall workflow health score > 80/100
- Workflow success rate > 90%
- Mean time between failures (MTBF) > 7 days
- Outdated lock files < 5%
- Failed workflows detected within 24 hours
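The targets above can be expressed as a simple threshold check. In this sketch the `HealthSnapshot` fields and the `evaluate_targets` function are hypothetical names, and "within 24 hours" is interpreted as at most 24 hours; only the thresholds themselves come from this report.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class HealthSnapshot:
    """Hypothetical container for the tracked values; not a real metrics schema."""
    health_score: float        # 0 to 100
    success_rate_pct: float    # percentage of successful runs
    mtbf_days: float           # mean time between failures, in days
    outdated_lock_pct: float   # percentage of workflows with outdated lock files
    detection_hours: float     # time taken to detect a failed workflow


def evaluate_targets(s: HealthSnapshot) -> Dict[str, bool]:
    """Map each target from this report to whether the snapshot meets it."""
    return {
        "health score > 80": s.health_score > 80,
        "success rate > 90%": s.success_rate_pct > 90,
        "MTBF > 7 days": s.mtbf_days > 7,
        "outdated lock files < 5%": s.outdated_lock_pct < 5,
        "failures detected within 24 hours": s.detection_hours <= 24,
    }
```

Of the five values, only the outdated lock file rate is known today: at 7.8% it does not meet the < 5% target.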
Last updated: 2026-01-04T02:59:53Z
Next check: After metrics collection enabled (7 days minimum for baseline)
Dashboard maintained by: Workflow Health Manager
Shared memory: /tmp/gh-aw/repo-memory/default/workflow-health-latest.md
AI generated by Workflow Health Manager - Meta-Orchestrator