[cli-tools-test] safe-output-items.jsonl always empty and SafeItemsCount always 0 in metrics despite safe outputs being executed #20877

@github-actions

Exploratory testing of the audit and logs MCP tools on 2026-03-13 revealed a consistent metric accuracy bug: safe-output-items.jsonl is always empty and SafeItemsCount is always 0 even when safe output actions (e.g., add_labels) are clearly executed and recorded in safeoutputs.jsonl.

Evidence

Across 10 workflow runs checked today, every run with safe output activity had:

  • safeoutputs.jsonl - populated (contains actual safe output actions)
  • safe-output-items.jsonl - empty (0B)
  • SafeItemsCount in run_summary.json - always 0

Specific reproduction case

Run: §23074558467 (Auto-Triage Issues, success)

safeoutputs.jsonl (populated):

{"integrity":"high","item_number":20875,"labels":["bug","cli"],"secrecy":"public","type":"add_labels"}

safe-output-items.jsonl (empty; this is the bug):

(no content)

run_summary.json metrics (SafeItemsCount wrong):

{
  "SafeItemsCount": 0,
  "Turns": 0,
  "ToolCalls": null
}

The agent_output.json file correctly contains the validated item — confirming the safe output was executed:

{"items":[{"integrity":"high","item_number":20875,"labels":["bug","cli"],"secrecy":"public","type":"add_labels"}],"errors":[]}
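
As a rough cross-check of this reproduction case, the sketch below counts the records in the payloads quoted above and compares them with the reported metric. The helper names (`count_jsonl_records`, `count_agent_items`) are hypothetical and not part of gh-aw. The strings are copied from the excerpts in this issue, and the `run_summary.json` excerpt is treated as a flat object only because that is how it is quoted here.

```python
import json


def count_jsonl_records(text: str) -> int:
    """Count non-blank lines that parse as JSON objects."""
    count = 0
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)  # raises ValueError on a malformed line
        if isinstance(record, dict):
            count += 1
    return count


def count_agent_items(text: str) -> int:
    """Count entries in the "items" array of an agent_output.json payload."""
    return len(json.loads(text).get("items", []))


# Payloads copied from the excerpts quoted in this issue.
ITEM = '{"integrity":"high","item_number":20875,"labels":["bug","cli"],"secrecy":"public","type":"add_labels"}'
SAFEOUTPUTS_JSONL = ITEM + "\n"
SAFE_OUTPUT_ITEMS_JSONL = ""  # the file is 0B
AGENT_OUTPUT_JSON = '{"items":[' + ITEM + '],"errors":[]}'
RUN_SUMMARY_EXCERPT = '{"SafeItemsCount": 0, "Turns": 0, "ToolCalls": null}'

recorded = count_jsonl_records(SAFEOUTPUTS_JSONL)
validated = count_agent_items(AGENT_OUTPUT_JSON)
in_items_file = count_jsonl_records(SAFE_OUTPUT_ITEMS_JSONL)
reported = json.loads(RUN_SUMMARY_EXCERPT)["SafeItemsCount"]

print(f"recorded={recorded} validated={validated} "
      f"items_file={in_items_file} reported={reported}")
```

Both input-side counts agree with each other, while the items file and the metric disagree with them, which is the mismatch this issue describes.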

All Affected Runs

Run              safeoutputs.jsonl  safe-output-items.jsonl  SafeItemsCount
run-23074558467  103B               0B                       0
run-23072165186  219B               0B                       0
run-23059678935  318B               0B                       0
run-23064798754  258B               0B                       0
run-23059960694  330B               0B                       0
run-23073052777  106B               0B                       0
run-23071864635  106B               0B                       0
run-23070912219  106B               0B                       0
run-23051588927  131B               0B                       0
run-23070335592  106B               0B                       0

Related Metrics Also Broken

  • Turns: always 0. The audit tool reports "Completed in 0 turns" even when 102k tokens were used. This may be specific to the Copilot engine, but it is misleading either way.
  • ToolCalls: always null. Individual MCP tool call details are never stored in run_summary.json; only aggregate summaries appear in mcp_tool_usage.

Impact

  • gh aw audit reports incorrect safe output activity (shows no safe outputs when items were created)
  • gh aw logs summary reports total_safe_items: 0 even for runs that executed safe outputs
  • Audit trail completeness is compromised: users and monitoring tools cannot rely on safe output counts
  • Severity: High — core observability/compliance feature

Steps to Reproduce

  1. Trigger any workflow that uses safeoutputs (e.g., Auto-Triage Issues)
  2. After completion, run gh aw audit (run_id)
  3. Observe SafeItemsCount: 0 in metrics
  4. Check safe-output-items.jsonl in logs — it will be empty despite safeoutputs.jsonl containing entries
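
The check in steps 3 and 4 can be sketched as a small script run against a downloaded log directory. This is a minimal sketch under stated assumptions: the three files are assumed to sit directly in one run directory, the location of `SafeItemsCount` inside `run_summary.json` is not known from this issue (so the sketch searches for the key recursively), and `check_run_dir` and `find_key` are hypothetical helpers, not gh-aw functions.

```python
import json
import sys
from pathlib import Path


def find_key(node, key):
    """Return the first value stored under `key` anywhere in nested JSON, else None."""
    if isinstance(node, dict):
        if key in node:
            return node[key]
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, key)
        if found is not None:
            return found
    return None


def count_lines(path: Path) -> int:
    """Count non-blank lines; a missing file counts as zero."""
    if not path.is_file():
        return 0
    with path.open(encoding="utf-8") as fh:
        return sum(1 for line in fh if line.strip())


def check_run_dir(run_dir) -> dict:
    """Compare safe output records against the items file and the metric."""
    run_dir = Path(run_dir)  # assumption: all three files live directly here
    recorded = count_lines(run_dir / "safeoutputs.jsonl")
    items = count_lines(run_dir / "safe-output-items.jsonl")
    reported = None
    summary_path = run_dir / "run_summary.json"
    if summary_path.is_file():
        summary = json.loads(summary_path.read_text(encoding="utf-8"))
        reported = find_key(summary, "SafeItemsCount")
    return {
        "recorded": recorded,
        "items": items,
        "reported": reported,
        # Flag runs that recorded safe outputs but show none downstream.
        "mismatch": recorded > 0 and (items == 0 or not reported),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    for arg in sys.argv[1:]:
        print(arg, check_run_dir(arg))
```

Passing one or more run directories on the command line prints one result per directory; a run affected by this bug would be expected to show `mismatch` as true.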

Expected Behavior

  • safe-output-items.jsonl should be populated with safe output items that were executed
  • SafeItemsCount should reflect the actual number of safe output actions taken
  • audit should report these items in its analysis

Environment

  • Repository: github/gh-aw
  • Workflow Run ID: §23075154599
  • Testing Date: 2026-03-13
  • CLI Version: 2a255d2 (from run_summary.json)

Generated by Daily CLI Tools Exploratory Tester

  • expires on Mar 21, 2026, 12:04 AM UTC
