[cli-tools-test] `safe-output-items.jsonl` always empty and `SafeItemsCount` always 0 in metrics despite safe outputs being executed #20877

Exploratory testing of the `audit` and `logs` MCP tools on 2026-03-13 revealed a consistent metric accuracy bug: `safe-output-items.jsonl` is always empty and `SafeItemsCount` is always `0` even when safe output actions (e.g., `add_labels`) are clearly executed and recorded in `safeoutputs.jsonl`.

### Evidence

Across **10 workflow runs** checked on 2026-03-13, every run with safe output activity had:
- `safeoutputs.jsonl` - **populated** (contains actual safe output actions)
- `safe-output-items.jsonl` - **empty (0B)**  
- `SafeItemsCount` in `run_summary.json` - **always 0**
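
The three observations above can be checked mechanically for a single run. The following is a minimal sketch, assuming the three files sit side by side in one downloaded run directory and that `SafeItemsCount` is a top-level key of `run_summary.json` (as in the excerpt shown in this report). The helper names `count_jsonl_records` and `check_run` are hypothetical and not part of `gh aw`.

```python
import json
from pathlib import Path


def count_jsonl_records(path: Path) -> int:
    """Count non-blank lines in a JSONL file; a missing file counts as 0."""
    if not path.exists():
        return 0
    with path.open(encoding="utf-8") as fh:
        return sum(1 for line in fh if line.strip())


def check_run(run_dir: Path) -> dict:
    """Compare recorded safe outputs with the items file and the metric.

    Assumption: all three files live directly in run_dir, and
    SafeItemsCount is a top-level key of run_summary.json.
    """
    summary_path = run_dir / "run_summary.json"
    summary = {}
    if summary_path.exists():
        summary = json.loads(summary_path.read_text(encoding="utf-8"))

    recorded = count_jsonl_records(run_dir / "safeoutputs.jsonl")
    items = count_jsonl_records(run_dir / "safe-output-items.jsonl")
    reported = summary.get("SafeItemsCount")

    return {
        "recorded": recorded,   # entries in safeoutputs.jsonl
        "items": items,         # entries in safe-output-items.jsonl
        "reported": reported,   # SafeItemsCount from run_summary.json
        "consistent": recorded == items == reported,
    }
```

For the runs described here, such a check would be expected to return a non-zero `recorded` value with `items` and `reported` both `0`, so `consistent` would be `False`.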

#### Specific reproduction case

**Run**: [§23074558467](https://github.com/github/gh-aw/actions/runs/23074558467) (Auto-Triage Issues, success)

`safeoutputs.jsonl` (**populated**):
```json
{"integrity":"high","item_number":20875,"labels":["bug","cli"],"secrecy":"public","type":"add_labels"}
```

`safe-output-items.jsonl` (**empty - bug**):
```
(no content)
```

`run_summary.json` metrics (**SafeItemsCount wrong**):
```json
{
  "SafeItemsCount": 0,
  "Turns": 0,
  "ToolCalls": null
}
```

The `agent_output.json` file correctly contains the validated item — confirming the safe output was executed:
```json
{"items":[{"integrity":"high","item_number":20875,"labels":["bug","cli"],"secrecy":"public","type":"add_labels"}],"errors":[]}
```
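
As an illustration, the count one would expect for this run can be read directly from the `agent_output.json` payload quoted above. This is a sketch only: the function name `expected_safe_items` is hypothetical, and whether `SafeItemsCount` is intended to equal the length of `items` is an assumption of this report, not confirmed behavior.

```python
import json

# The agent_output.json payload quoted above, verbatim.
AGENT_OUTPUT = (
    '{"items":[{"integrity":"high","item_number":20875,'
    '"labels":["bug","cli"],"secrecy":"public","type":"add_labels"}],'
    '"errors":[]}'
)


def expected_safe_items(agent_output_text: str) -> int:
    """Return the number of validated items in an agent_output.json payload."""
    payload = json.loads(agent_output_text)
    # Treat a missing or null "items" field as zero items.
    return len(payload.get("items") or [])


print(expected_safe_items(AGENT_OUTPUT))  # prints 1
```

The payload yields one validated item, whereas `run_summary.json` for the same run reports `SafeItemsCount: 0`.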

### All Affected Runs

| Run | safeoutputs.jsonl | safe-output-items.jsonl | SafeItemsCount |
|-----|-------------------|------------------------|----------------|
| run-23074558467 | 103B | 0B | 0 |
| run-23072165186 | 219B | 0B | 0 |
| run-23059678935 | 318B | 0B | 0 |
| run-23064798754 | 258B | 0B | 0 |
| run-23059960694 | 330B | 0B | 0 |
| run-23073052777 | 106B | 0B | 0 |
| run-23071864635 | 106B | 0B | 0 |
| run-23070912219 | 106B | 0B | 0 |
| run-23051588927 | 131B | 0B | 0 |
| run-23070335592 | 106B | 0B | 0 |
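
A size table like the one above could be regenerated with a short script. This sketch assumes the downloaded logs are laid out as one `run-<id>` directory per run under a common root, with both files directly inside each directory. That layout and the helper name `size_rows` are assumptions for illustration. It emits only the two file-size columns, not `SafeItemsCount`.

```python
from pathlib import Path

# File names taken from this report.
FILES = ("safeoutputs.jsonl", "safe-output-items.jsonl")


def size_rows(logs_root: Path) -> list[str]:
    """Build one Markdown table row per run-* directory with file sizes in bytes."""
    rows = []
    for run_dir in sorted(logs_root.glob("run-*")):
        if not run_dir.is_dir():
            continue
        sizes = []
        for name in FILES:
            path = run_dir / name
            # Report the byte size, or flag the file as missing.
            sizes.append(f"{path.stat().st_size}B" if path.exists() else "missing")
        rows.append(f"| {run_dir.name} | {' | '.join(sizes)} |")
    return rows
```

Calling `print("\n".join(size_rows(Path("logs"))))` would print the rows for every run directory under a hypothetical `logs` folder.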

### Related Metrics Also Broken

- **`Turns`: always 0.** The `audit` tool reports "Completed in 0 turns" even when 102k tokens were used. This may be specific to the Copilot engine, but it is misleading.
- **`ToolCalls`: always null.** Individual MCP tool call details are never stored in `run_summary.json`; only aggregate summaries appear in `mcp_tool_usage`.

### Impact

- **`gh aw audit`** reports incorrect safe output activity (shows no safe outputs when items were created)
- **`gh aw logs`** summary `total_safe_items: 0` is always wrong
- Audit trail completeness is compromised — users and monitoring tools cannot rely on safe output counts
- Severity: **High** — core observability/compliance feature

### Steps to Reproduce

1. Trigger any workflow that uses safeoutputs (e.g., Auto-Triage Issues)
2. After completion, run `gh aw audit <run_id>`
3. Observe `SafeItemsCount: 0` in metrics
4. Check `safe-output-items.jsonl` in logs — it will be empty despite `safeoutputs.jsonl` containing entries

### Expected Behavior

- `safe-output-items.jsonl` should be populated with safe output items that were executed
- `SafeItemsCount` should reflect the actual number of safe output actions taken
- `audit` should report these items in its analysis

### Environment

- Repository: github/gh-aw
- Workflow Run ID: [§23075154599](https://github.com/github/gh-aw/actions/runs/23075154599)
- Testing Date: 2026-03-13
- CLI Version: `2a255d2` (from `run_summary.json`)




> Generated by [Daily CLI Tools Exploratory Tester](https://github.com/github/gh-aw/actions/runs/23075154599) · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Fdaily-cli-tools-tester%22&type=issues)
> - [x] expires on Mar 21, 2026, 12:04 AM UTC
