Status: Closed
Labels: cookie (Issue Monster Loves Cookies!)
Overview
Workflow health assessment for 2026-03-18. 174 workflows monitored, 7 stale lock files (down from 16 last run). Score: 62/100 (↓6 from 68).
P0 failures persist (GH_AW_GITHUB_TOKEN still missing). New P1 escalation: Daily Workflow Updater failing for 9 consecutive days.
Critical Issues 🚨
P0: Issue Monster / PR Triage Agent / Issue Triage Agent
- Status: 100% failure rate (all recent runs failing)
- Error: `GH_AW_GITHUB_TOKEN` secret missing; the `pre_activation` step fails to generate a GitHub App token for skip-if checks
- Duration: Ongoing since March 15
- Impact: Issue management, PR triage, and issue triage workflows completely non-functional
- Action Required: Configure the `GH_AW_GITHUB_TOKEN` repository secret (GitHub App token)
Escalated Issues ⬆️
P1: Daily Workflow Updater — 9 consecutive failures (NEW ESCALATION)
- Status: Failing every day since March 9 (9 failures). Last success: March 8.
- Pattern: `schedule` event at 09:xx UTC; each run lasts ~10 min, then fails
- Impact: GitHub Actions version updates no longer being applied automatically
- Recent runs: §23187578561 (Mar 17); every run since run #110 has failed
- Action: See issue #21538 (P1: Daily Workflow Updater failing for 9 consecutive days (since March 9)) for investigation
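The consecutive-failure count behind this escalation can be sketched as below. This is a minimal illustration, not the monitor's actual logic: the function name `trailing_failures` and the `history` list are hypothetical, with the list built to mirror the dates reported above (last success March 8, failures March 9 through 17).

```python
from typing import Sequence


def trailing_failures(conclusions: Sequence[str]) -> int:
    """Count consecutive failures at the end of a run history (oldest first)."""
    streak = 0
    for conclusion in reversed(conclusions):
        if conclusion != "failure":
            break
        streak += 1
    return streak


# Hypothetical history: one success (March 8), then nine daily failures.
history = ["success"] + ["failure"] * 9
print(trailing_failures(history))  # prints 9
```

Counting from the most recent run backwards means a single success resets the streak, which matches how the Bot Detection recovery is treated in this report.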
Recoveries ✅
Bot Detection — RECOVERED (was P1)
- 2 consecutive successes today (runs at 00:24 and 06:24 UTC)
- After cluster of failures Mar 15-17, now healthy
- Status: Downgraded from P1 to Healthy
Warnings ⚠️
P2: Smoke Gemini — Intermittent failures (50% rate)
- Alternating success/failure pattern: success Mar 14-15, failure Mar 16, success Mar 17T00:51, failure Mar 18T00:54
- May indicate intermittent Gemini API availability issues
- Monitoring recommended
P2: Stale Lock Files (7 files)
- Files: `daily-architecture-diagram.md`, `daily-compiler-quality.md`, `daily-mcp-concurrency-analysis.md`, `daily-secrets-analysis.md`, `github-mcp-structural-analysis.md`, `repo-audit-analyzer.md`, `smoke-call-workflow.md`
- Action: Run `make recompile` to rebuild
Healthy Workflows ✅
Core infrastructure healthy:
- Smoke Copilot ✅ | Smoke Claude ✅ | Smoke Codex ✅
- Auto-Triage Issues ✅ | Contribution Check ✅ | Metrics Collector ✅
- AI Moderator ✅
Systemic Patterns
Systemic GitHub Actions disruption (Mar 17 15:00–22:00 UTC):
- Most workflows show failures in this window, then recovery after 22:54 UTC
- Auto-triage, Smoke Copilot, Contribution Check, WHM itself all affected
- Not a workflow bug — infrastructure disruption
Metrics Summary
| Category | Count | % |
|---|---|---|
| Healthy (≥80) | ~165 | ~95% |
| Warning (60-79) | ~3 | ~2% |
| Critical (<60) | ~3 | ~2% |
| Stale locks | 7 | 4% |
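The score bands in the table can be expressed as a small classifier. This is a sketch under the thresholds shown in the table headers only; the function names `classify` and `share` are hypothetical and are not taken from the health monitor itself.

```python
def classify(score: int) -> str:
    """Map a 0-100 health score to the bands used in the metrics table."""
    if score >= 80:
        return "Healthy"
    if score >= 60:
        return "Warning"
    return "Critical"


def share(count: int, total: int) -> int:
    """Percentage of the total, rounded to the nearest whole number."""
    return round(count / total * 100)


print(classify(62))   # overall score from this run: prints Warning
print(share(7, 174))  # stale locks out of 174 workflows: prints 4
```

Under these bands, the overall score of 62/100 for this run falls in the Warning range.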
Actions Taken This Run
- Created P1 issue for Daily Workflow Updater (9 days failing)
- Bot Detection downgraded from P1 → Healthy
- Updated shared memory with current state
- Stale lock count: 7 (↓ from 16)
Run: §23233873324
Timestamp: 2026-03-18T07:32Z
Next check: 2026-03-19 ~07:30Z