Overview
| Metric | Value |
|---|---|
| Total executable workflows | 166 (stable) |
| Compiled with lock files | 166/166 (100% ✅) |
| Outdated lock files | 0 ✅ (11 with 1ms diffs are checkout artifacts, not real) |
| Healthy | ~160 (96%) |
| Critical/Failing (P1) | 5 workflows |
| Overall health score | 76/100 (↑2 from 74 — false-positive lock file flags corrected) |
⚠️ DEGRADED — Lockdown failures ongoing (Day 8+ week) + AI Moderator Day 11 OpenAI restriction. Previous dashboard #19935 expired. Tracking issues for Issue Monster, PR Triage Agent, and Daily Issues Report all expired overnight and were auto-closed.
Critical Issues 🚨
P1: Lockdown Token Missing (4 workflows)
All 4 workflows require the GH_AW_GITHUB_TOKEN secret, which is not provisioned. Both proposed fixes (#17414, #17807) were CLOSED as "not_planned", so there is no current fix path; manual admin intervention is required.
- Issue Monster: run #2568 failed (2026-03-08T07:17Z); ~48 failures/day; old issue [aw] Issue Monster failed #18919 CLOSED Mar 7 (auto-expired)
- PR Triage Agent: run #180 failed (2026-03-08T06:18Z); old issue [aw] PR Triage Agent failed #18952 CLOSED Mar 8 (auto-expired)
- Daily Issues Report: run #126 failed (2026-03-08T01:59Z); old issue [aw] Daily Issues Report Generator failed #18967 CLOSED Mar 8 (auto-expired); 126+ consecutive failures
- Org Health Report: run #27 failed (last run 2026-03-02T08:26Z); no active tracking issue
P1: AI Moderator — Day 11 OpenAI Restriction
- Status: Still failing consistently; run #9727 failed (2026-03-08T06:46Z)
- Error: OpenAI cybersecurity restriction on the `gpt-5.3-codex` model
- Tracking: [aw] AI Moderator failed (pre-agent) #19551 OPEN (expires Mar 11, 2026; 44 comments)
- Impact: Content moderation disabled for all issue events
P1: Smoke Codex — OpenAI Restriction (Day 11)
- Status: Still failing; run #2175 failed (2026-03-08T01:38Z)
- Tracking: [aw] Smoke Codex failed (pre-agent) #19514 OPEN (expires Mar 11, 2026)
- Note: Failures on PR branches (schedule on main passes occasionally)
Issue Tracking Status
| Workflow | Status | Tracking Issue | Notes |
|---|---|---|---|
| Issue Monster | ❌ Failing | None (expired) | #18919 closed Mar 7 |
| PR Triage Agent | ❌ Failing | None (expired) | #18952 closed Mar 8 |
| Daily Issues Report | ❌ Failing | None (expired) | #18967 closed Mar 8 |
| Org Health Report | ❌ Failing | None (never had one) | Lockdown |
| AI Moderator | ❌ Failing | #19551 ✅ (Mar 11) | Day 11 OpenAI |
| Smoke Codex | ❌ Failing | #19514 ✅ (Mar 11) | Day 11 OpenAI |
| Smoke Copilot | ✅ Passing | N/A | run #2277 success 2026-03-08T01:12Z |
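The tracking gaps in the table above can be found mechanically. The following is a minimal sketch, not the health manager's actual logic: the `WorkflowStatus` type and `untracked_failures` function are hypothetical names introduced for illustration, and the rows are copied from the table with today taken as 2026-03-08.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class WorkflowStatus:
    # Hypothetical record mirroring one row of the tracking table.
    name: str
    failing: bool
    tracking_issue: Optional[int] = None   # open issue number, if any
    issue_expires: Optional[date] = None   # expiry date of that issue


def untracked_failures(statuses: List[WorkflowStatus], today: date) -> List[str]:
    """Return names of failing workflows with no open, unexpired tracking issue."""
    gaps = []
    for s in statuses:
        if not s.failing:
            continue
        has_live_issue = s.tracking_issue is not None and (
            s.issue_expires is None or s.issue_expires >= today
        )
        if not has_live_issue:
            gaps.append(s.name)
    return gaps


# Rows taken from the table above.
statuses = [
    WorkflowStatus("Issue Monster", True),
    WorkflowStatus("PR Triage Agent", True),
    WorkflowStatus("Daily Issues Report", True),
    WorkflowStatus("Org Health Report", True),
    WorkflowStatus("AI Moderator", True, 19551, date(2026, 3, 11)),
    WorkflowStatus("Smoke Codex", True, 19514, date(2026, 3, 11)),
    WorkflowStatus("Smoke Copilot", False),
]

print(untracked_failures(statuses, date(2026, 3, 8)))
```

Run against these rows, the sketch lists the four lockdown workflows, which matches the "None" entries in the Tracking Issue column.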
Compilation Health
Lock File Details
All 166 workflows have corresponding .lock.yml files. The check detected 11 files with 0.001s (1 millisecond) timestamp differences — these are filesystem checkout artifacts, not actual outdated lock files. The specific affected workflows change each run depending on checkout ordering.
Yesterday's report incorrectly flagged 12 "outdated" lock files using the same artifact. Those files were all different from today's set, confirming this is a false positive.
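One way to suppress this class of false positive is to compare timestamps with a tolerance rather than strictly. The sketch below is illustrative only and is not the checker's real implementation: `WorkflowFiles`, `is_outdated`, and the one-second `MTIME_TOLERANCE_S` threshold are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List

# Assumed threshold: differences at or below this are treated as checkout noise.
MTIME_TOLERANCE_S = 1.0


@dataclass
class WorkflowFiles:
    # Hypothetical record pairing a workflow source with its compiled lock file.
    name: str
    source_mtime: float  # modification time of the workflow source, in seconds
    lock_mtime: float    # modification time of the .lock.yml file, in seconds


def is_outdated(wf: WorkflowFiles, tolerance_s: float = MTIME_TOLERANCE_S) -> bool:
    """True only when the source is newer than the lock file by more than the tolerance."""
    return (wf.source_mtime - wf.lock_mtime) > tolerance_s


def outdated_workflows(workflows: List[WorkflowFiles],
                       tolerance_s: float = MTIME_TOLERANCE_S) -> List[str]:
    """Names of workflows whose lock file is genuinely older than its source."""
    return [wf.name for wf in workflows if is_outdated(wf, tolerance_s)]


# Made-up timestamps: one 1 ms checkout artifact, one source edited an hour later.
examples = [
    WorkflowFiles("checkout-artifact", source_mtime=1000.001, lock_mtime=1000.000),
    WorkflowFiles("really-stale", source_mtime=4600.0, lock_mtime=1000.0),
]
print(outdated_workflows(examples))
```

A tolerance still depends on checkout timing. Comparing a content hash recorded at compile time would avoid timestamps entirely, at the cost of storing that hash somewhere.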
Healthy Workflows ✅
Healthy highlights
- Smoke Copilot: ✅ run #2277 success (2026-03-08T01:12Z on main schedule)
- Smoke Claude: passing (recent schedule runs)
- Metrics Collector: operational
- All other ~160 workflows operating normally
Systemic Issues
Lockdown Token (GH_AW_GITHUB_TOKEN) — Week 5+
- Affected: 4 workflows (Issue Monster, PR Triage Agent, Daily Issues Report, Org Health Report)
- Pattern: All require `lockdown: true`, which needs a special token
- Fix path: CLOSED; both fix options declined as "not_planned"
- Status: Chronic, accepted failure
OpenAI Cybersecurity Restriction — Day 11
- Affected: AI Moderator, Smoke Codex
- Engine: `gpt-5.3-codex` blocked by OpenAI content policy
- Fix path: Switch to a different engine (claude/copilot); tracked in [aw] AI Moderator failed (pre-agent) #19551
Health Trends
| Date | Score | Workflows | Key Change |
|---|---|---|---|
| 2026-03-01 | 73/100 | 162 | Metrics Collector regression |
| 2026-03-03 | 76/100 | 165 | Metrics Collector recovered |
| 2026-03-07 | 74/100 | 166 | False positive: 12 "outdated" locks |
| 2026-03-08 | 76/100 | 166 | Corrected false positive; all locks current |
Recommendations
High Priority
- Lockdown workflows: 3 tracking issues expired overnight. Each workflow auto-generates a new failure issue on its next run, so no manual action is needed from the health manager's perspective
- OpenAI restriction: AI Moderator Day 11 — consider escalating model switch in [aw] AI Moderator failed (pre-agent) #19551
- Org Health Report: Still no dedicated tracking issue; create one or accept the failure as untracked
Medium Priority
- Review the lock file timestamp comparison logic; the 1ms filesystem checkout artifact causes false positives on every run
Actions Taken This Run
- ✅ Verified 166/166 workflows compiled (0 missing lock files)
- ✅ Identified 11 false-positive "outdated" lock files (1ms checkout artifact)
- ✅ Confirmed Issue Monster still failing (run #2568, 2026-03-08T07:17Z)
- ✅ Confirmed Smoke Copilot: PASSING (run #2277 schedule on main)
- ✅ Confirmed AI Moderator: still failing (run #9727, issue [aw] AI Moderator failed (pre-agent) #19551 active)
- ✅ Confirmed Smoke Codex: still failing (issue [aw] Smoke Codex failed (pre-agent) #19514 active)
- ✅ Created this dashboard (replacing Workflow Health Dashboard - 2026-03-07 #19935 which expires at 07:25Z today)
- ✅ Updated shared memory
References:
- §22816368749 — This run
- §22794578625 — Previous run (dashboard Workflow Health Dashboard - 2026-03-07 #19935)
- Previous dashboard: Workflow Health Dashboard - 2026-03-07 #19935 (expired 2026-03-08T07:25Z)
Generated by Workflow Health Manager - Meta-Orchestrator · expires on Mar 9, 2026, 7:29 AM UTC