[audit-workflows] Daily Audit Report — 2026-04-26 #28637

2026-04-26T21:19:39Z

github-actions[bot]
Bot Apr 26, 2026

Audit of the last 24 hours of agentic workflow runs across the github/gh-aw repository. Of 53 total runs, 50 have concluded and 3 are still in progress; the success rate is 90% (45 of the 50 concluded runs). No missing tools or MCP failures were detected. Total estimated cost for Claude-engine runs: $12.59.

Summary

Metric	Value
Total Runs	53
Successful	45 (90%)
Cancelled	5 (10%)
In Progress	3
Total Tokens	23.6M (10.9M effective)
Claude Est. Cost	$12.59
Total Errors	10
Missing Tools	0
MCP Failures	0
Total Turns	443
GitHub API Calls	195

Engine Mix: Claude=16 (30%) · Copilot=34 (64%) · Codex=1 (2%) · Unknown=2 (4%)
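The percentages above use two different denominators: the success and cancellation rates are taken over the 50 concluded runs, while the engine mix is taken over all 53 runs. A minimal Python sketch that reproduces the reported figures follows; the variable and function names are illustrative and not part of the audit tooling.

```python
# Figures copied from the Summary table; names are illustrative only.
total_runs = 53
in_progress = 3
successful = 45
cancelled = 5

concluded = total_runs - in_progress  # runs that have a final conclusion


def pct(part: int, whole: int) -> int:
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)


# Success and cancellation rates use concluded runs as the denominator.
success_rate = pct(successful, concluded)
cancel_rate = pct(cancelled, concluded)

# Engine mix is reported against all runs, including in-progress ones.
engine_runs = {"Claude": 16, "Copilot": 34, "Codex": 1, "Unknown": 2}
engine_share = {name: pct(n, total_runs) for name, n in engine_runs.items()}

print(success_rate, cancel_rate, engine_share)
```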

Workflow Health

Overall health is strong, with a 90% success rate. The 5 cancellations are concentrated in Smoke CI (4 runs) and Deployment Incident Monitor (1 run). Because this is the first audit with repo memory initialized, no historical trend comparison is available; this run establishes the baseline.

Token Usage & Cost

Sergo - Serena Go Expert consumed the most tokens in a single run (6.58M tokens, 99 turns, $3.06), while Design Decision Gate accumulated the highest total cost ($3.79 across 10 runs). Copilot-engine runs show $0 cost because billing data is not reported for them. A 7-day moving-average baseline will be established in future audits.

Cancellations & Failures

View Cancelled Runs (5)

Run ID	Workflow	Errors	Notes
§24966690457	Smoke CI	4	Cancelled with errors
§24963334613	Smoke CI	3	Cancelled with errors
§24961125853	Smoke CI	3	Cancelled with errors
§24961118049	Smoke CI	0	Cancelled, no engine assigned
§24964506933	Deployment Incident Monitor	0	Cancelled, no engine assigned

Pattern: Smoke CI was cancelled 4 times in 24h, and 3 of those runs logged 3–4 errors each. This recurring pattern may indicate a race condition, resource contention, or a flaky test environment. Deployment Incident Monitor was cancelled without an engine assignment, suggesting it was stopped before agent initialization.
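The tally behind this pattern can be checked directly from the table. The sketch below is only an illustration of that arithmetic, using the rows as reported. The error counts of the cancelled runs sum to 10, which equals the Total Errors figure in the Summary.

```python
from collections import Counter

# (run_id, workflow, errors) rows copied from the cancelled-runs table.
cancelled_runs = [
    ("24966690457", "Smoke CI", 4),
    ("24963334613", "Smoke CI", 3),
    ("24961125853", "Smoke CI", 3),
    ("24961118049", "Smoke CI", 0),
    ("24964506933", "Deployment Incident Monitor", 0),
]

# Cancellations per workflow.
per_workflow = Counter(workflow for _, workflow, _ in cancelled_runs)

# Total errors logged by cancelled runs, and which runs logged any.
total_errors = sum(errors for _, _, errors in cancelled_runs)
runs_with_errors = [run_id for run_id, _, errors in cancelled_runs if errors > 0]

print(per_workflow, total_errors, runs_with_errors)
```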

Observability Insights

Three notable findings from cross-run analysis:

Severity	Category	Finding
🔴 High	Reliability	16 high-anomaly events (score >0.6) across 53 runs — unusual patterns relative to learned templates (stage: `tool_result`)
🔴 High	Network	Daily DIFC Analyzer: 59/112 requests blocked (53% block rate) — highest network friction of any workflow
🟡 Medium	Drift	Contribution Check execution variance: 0–14 turns (avg 7.0) — unstable prompt or variable task shape

Firewall Analysis

View Firewall Details (649 total requests)

Total: 649 requests | Allowed: 582 (89.7%) | Blocked: 67 (10.3%)
Top blocked domains: ab.chatgpt.com:443, chatgpt.com:443
Worst workflow: Daily DIFC Integrity-Filtered Events Analyzer (59 blocked / 112 total, 53%)

Requests to the ChatGPT domains are consistently blocked by the firewall; workflows should not be attempting to reach these endpoints unless they are intentionally testing network restrictions. The DIFC Analyzer's high block rate is likely expected by design, given that its purpose is to analyze integrity-filtered events.
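To put the DIFC Analyzer's contribution in context, the block rates can be recomputed with and without that workflow. This is a rough sketch that assumes the analyzer's 112 requests are a subset of the 649 total, which the report implies but does not state outright.

```python
# Firewall totals as reported.
total_requests = 649
allowed = 582
blocked = 67

# Daily DIFC Analyzer figures as reported.
difc_blocked, difc_total = 59, 112

overall_block_rate = 100 * blocked / total_requests
difc_block_rate = 100 * difc_blocked / difc_total

# Share of all blocked requests that came from the DIFC Analyzer.
difc_share_of_blocks = 100 * difc_blocked / blocked

# Block rate for every other workflow combined
# (assumes the analyzer's requests are included in the totals above).
other_blocked = blocked - difc_blocked
other_total = total_requests - difc_total
other_block_rate = 100 * other_blocked / other_total

print(f"overall {overall_block_rate:.1f}%, DIFC {difc_block_rate:.1f}%, "
      f"others {other_block_rate:.1f}%")
```

Under that assumption, the analyzer accounts for 59 of the 67 blocked requests, and the remaining workflows together see 8 blocks out of 537 requests.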

High-Cost & High-Token Runs

View Top Runs by Cost/Tokens

Workflow	Runs	Tokens	Est. Cost	Turns
Design Decision Gate 🏗️	10	3.04M	$3.79	~6/run
Sergo - Serena Go Expert	1	6.58M	$3.06	99
Step Name Alignment	1	2.13M	$1.70	42
[aw] Failure Investigator (6h)	1	946K	$1.59	17
Static Analysis Report	1	1.85M	$1.44	29
Copilot Prompt Clustering	1	1.23M	$1.01	23

Sergo at 99 turns warrants attention — long interactive sessions with Serena (LSP-based Go analysis) can accumulate significant tokens. This appears to be a one-off extended session rather than a recurring cost driver.
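Normalizing the table by tokens and by run gives a different view of which workflows are expensive. The following sketch assumes the Tokens and Est. Cost columns are totals across all runs of each workflow; the derived metrics are illustrative and are not produced by the audit itself.

```python
# (workflow, runs, tokens in millions, estimated cost in USD), from the table.
rows = [
    ("Design Decision Gate", 10, 3.04, 3.79),
    ("Sergo - Serena Go Expert", 1, 6.58, 3.06),
    ("Step Name Alignment", 1, 2.13, 1.70),
    ("[aw] Failure Investigator (6h)", 1, 0.946, 1.59),
    ("Static Analysis Report", 1, 1.85, 1.44),
    ("Copilot Prompt Clustering", 1, 1.23, 1.01),
]

# Sum of the listed costs, rounded to cents.
total_cost = round(sum(cost for _, _, _, cost in rows), 2)

# Derived metrics: USD per million tokens and USD per run.
usd_per_mtok = {name: cost / mtok for name, _, mtok, cost in rows}
usd_per_run = {name: cost / runs for name, runs, _, cost in rows}

cheapest_per_token = min(usd_per_mtok, key=usd_per_mtok.get)
priciest_per_token = max(usd_per_mtok, key=usd_per_mtok.get)

for name in usd_per_mtok:
    print(f"{name}: ${usd_per_mtok[name]:.2f}/M tokens, ${usd_per_run[name]:.2f}/run")
```

On these figures the listed costs sum to $12.59, matching the Claude estimated cost in the Summary. Sergo has the lowest cost per million tokens (about $0.47) despite the highest single-run cost, while [aw] Failure Investigator (6h) has the highest (about $1.68).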

DIFC Integrity-Filtered Events

1 filtered event detected:

github/pull_request_read on pr:github/gh-aw#28622 — resource has lower integrity than the agent requires (integrity below "approved"). This is expected DIFC enforcement behavior.

MCP Server Activity

Server	Requests
github	80
safeoutputs	69
agenticworkflows	62
serena	32

No MCP failures recorded. All servers responded successfully.

Recommendations

Smoke CI cancellations — Investigate the recurring error pattern in cancelled Smoke CI runs. 4 cancellations in 24h, 3 of them with 3–4 errors each, suggest a systemic issue (flaky runner, dependency, or timeout). Consider adding retry logic or alerting.
Sergo session length — The 99-turn / $3.06 run is an outlier. If this is interactive, consider adding turn limits or cost guardrails to long-running sessions.
Contribution Check drift — 0–14 turn variance suggests the workflow prompt may be responding differently to varying issue/PR content. Review whether the prompt needs tightening.
ChatGPT domain blocks — Verify that no production workflows are intentionally calling ab.chatgpt.com or chatgpt.com. If none should, this may indicate a misconfigured tool or test.
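As one way to picture the turn-limit and cost-guardrail recommendation, here is a minimal sketch of a per-run budget check. The `RunBudget` class and the thresholds are hypothetical and chosen only for illustration; this is not a gh-aw API, and the real mechanism for enforcing limits may look quite different.

```python
from dataclasses import dataclass


@dataclass
class RunBudget:
    """Hypothetical per-run guardrail; not an existing gh-aw feature."""

    max_turns: int
    max_cost_usd: float

    def violations(self, turns: int, cost_usd: float) -> list[str]:
        """Return a description of each limit the run exceeded."""
        problems = []
        if turns > self.max_turns:
            problems.append(f"turns {turns} > limit {self.max_turns}")
        if cost_usd > self.max_cost_usd:
            problems.append(
                f"cost ${cost_usd:.2f} > limit ${self.max_cost_usd:.2f}"
            )
        return problems


# Example thresholds only; suitable values would need to be tuned per workflow.
budget = RunBudget(max_turns=50, max_cost_usd=2.00)

sergo = budget.violations(turns=99, cost_usd=3.06)       # Sergo run from this audit
step_align = budget.violations(turns=42, cost_usd=1.70)  # Step Name Alignment run

print(sergo)
print(step_align)
```

With these example thresholds the Sergo run would be flagged on both turns and cost, while the Step Name Alignment run would pass.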

References:

§24966690457 — Smoke CI (cancelled, 4 errors)
§24966166774 — Sergo Go Expert (99 turns, $3.06)
§24965818195 — Daily DIFC Analyzer (53% firewall block rate)

Generated by Agentic Workflow Audit Agent · ● 362K · ◷

expires on Apr 27, 2026, 9:19 PM UTC

2026-04-27T21:46:45Z

github-actions[bot]
Bot Apr 27, 2026
Author

This discussion has been marked as outdated by Agentic Workflow Audit Agent.

A newer discussion is available at Discussion #28804.
