[cli-tools-test] Behavior fingerprint data inconsistency between `logs` and `audit` tools for same run #23418

Exploratory testing found that the `logs` and `audit` tools return different `behavior_fingerprint` values for the same workflow run, which points to a data consistency issue.

### Problem Description

Comparing the `behavior_fingerprint` for the same run ID (`23701814578`) across the `logs` and `audit` tools shows different values:

| Field | `logs` tool output | `audit` tool output |
|---|---|---|
| `execution_style` | `"directed"` | `"exploratory"` |
| `resource_profile` | `"lean"` | `"heavy"` |
| `agentic_fraction` | `0` | `0.5` |
| `tool_breadth` | `"narrow"` | `"narrow"` ✅ |
| `actuation_style` | `"read_only"` | `"read_only"` ✅ |
| `dispatch_mode` | `"standalone"` | `"standalone"` ✅ |

Three out of six fields differ for the same run. This suggests the fingerprint computation is not deterministic or uses different data sources depending on which tool is called.

### Tool

- **Tool**: `audit` and `logs` (both affected)
- **Run**: [§23701814578](https://github.com/github/gh-aw/actions/runs/23701814578) — "GPL Dependency Cleaner (gpclean)"

### Steps to Reproduce

1. Call `logs` tool with `workflow_name: "GPL Dependency Cleaner (gpclean)"` and `count: 1`
2. Observe `behavior_fingerprint` for run `23701814578`:
   ```json
   {"execution_style":"directed","tool_breadth":"narrow","actuation_style":"read_only","resource_profile":"lean","dispatch_mode":"standalone","agentic_fraction":0}
   ```
3. Call `audit` tool with `run_id_or_url: "23701814578"`
4. Observe `behavior_fingerprint` for the same run:
   ```json
   {"execution_style":"exploratory","tool_breadth":"narrow","actuation_style":"read_only","resource_profile":"heavy","dispatch_mode":"standalone","agentic_fraction":0.5}
   ```
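The comparison in the steps above can be scripted. The following is a minimal sketch that diffs the two fingerprints exactly as they were observed; `diff_fingerprints` is a hypothetical helper written for this report, not part of gh-aw.

```python
import json

# Fingerprints copied verbatim from the two tool outputs for run 23701814578.
LOGS_FP = json.loads(
    '{"execution_style":"directed","tool_breadth":"narrow",'
    '"actuation_style":"read_only","resource_profile":"lean",'
    '"dispatch_mode":"standalone","agentic_fraction":0}'
)
AUDIT_FP = json.loads(
    '{"execution_style":"exploratory","tool_breadth":"narrow",'
    '"actuation_style":"read_only","resource_profile":"heavy",'
    '"dispatch_mode":"standalone","agentic_fraction":0.5}'
)


def diff_fingerprints(a, b):
    """Return {field: (a_value, b_value)} for every field whose values differ.

    Hypothetical helper: fields missing on one side are reported with None.
    """
    fields = sorted(set(a) | set(b))
    return {f: (a.get(f), b.get(f)) for f in fields if a.get(f) != b.get(f)}


if __name__ == "__main__":
    for field, (logs_value, audit_value) in diff_fingerprints(LOGS_FP, AUDIT_FP).items():
        print(f"{field}: logs={logs_value!r} audit={audit_value!r}")
```

An empty result from `diff_fingerprints` would mean the two tools agree, so the same check could serve as a regression test once the inconsistency is fixed.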

### Expected Behavior

Both tools should return **identical** `behavior_fingerprint` values for the same run, since the fingerprint should be a deterministic function of the run's actual execution data.

### Actual Behavior

Three fields (`execution_style`, `resource_profile`, `agentic_fraction`) differ between the two tools for the same run.

### Impact

- **Severity**: High — users who use `logs` for quick scanning and `audit` for deep-dives will see conflicting signals about a run's execution behavior
- **Frequency**: Observed consistently for the same run ID
- **Affected**: Any analysis or alerting built on top of `behavior_fingerprint` data
- **Workaround**: None — the values differ per tool call

### Hypothesis

The `logs` tool may be computing the fingerprint from a cached/lightweight summary, while the `audit` tool recomputes from raw log data. The discrepancy could be due to:
- Different data sources (cached vs. live computation)
- Different algorithms or weighting for field calculations
- A caching issue where the `logs` summary was generated before full log processing completed
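To illustrate the stale-summary hypothesis, here is a toy sketch showing how one deterministic function can still yield two different fingerprints when it is fed two different snapshots of the same run. Everything in it is an assumption made for illustration: `toy_fingerprint`, the 0.5 threshold, and the turn and tool-call counts are invented, and the real computation in gh-aw is not shown in this issue.

```python
def toy_fingerprint(turns, tool_calls):
    """Hypothetical stand-in for the real fingerprint computation.

    The rule and threshold below are invented purely to show the effect
    of differing inputs; they are not taken from gh-aw.
    """
    agentic_fraction = tool_calls / turns if turns else 0
    return {
        "agentic_fraction": agentic_fraction,
        "execution_style": "exploratory" if agentic_fraction >= 0.5 else "directed",
    }


# Invented inputs: a summary written before log processing finished
# versus a recomputation from the complete logs of the same run.
early_summary = toy_fingerprint(turns=0, tool_calls=0)
full_recompute = toy_fingerprint(turns=4, tool_calls=2)

if __name__ == "__main__":
    print("early summary: ", early_summary)
    print("full recompute:", full_recompute)
```

If something like this is happening, the fix would lie in making both tools read the same completed data rather than in the fingerprint logic itself.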

### Additional Observations from Testing Session

Other observations from the same testing session (Run ID: 23702146707):

1. **`audit` with invalid run ID** returns a raw MCP error `-32603: failed to fetch run metadata` instead of a user-friendly error message
2. **`logs` with non-existent workflow name** returns MCP error `-32602` instead of a structured empty response with an informative message
3. **`compile` tool requires `.md` suffix** (e.g., `ace-editor.md`) while `logs` and `status` tools use display names or IDs without extension — inconsistent naming convention across tools
4. **Failed run audit** (run `23701844529`) does not surface the specific error message (`Authentication failed`) in the `errors` field, even though it's visible in the downloaded `detection.log`
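For observations 1 and 2, the codes involved are standard JSON-RPC 2.0 error codes (`-32602` is "Invalid params", `-32603` is "Internal error"). The sketch below shows one way a raw code could be wrapped in a friendlier message; `friendly_error` and the hint wording are hypothetical and do not reflect how gh-aw handles errors today.

```python
# Standard JSON-RPC 2.0 error codes; the hint text for each is illustrative only.
FRIENDLY_HINTS = {
    -32602: "Invalid parameters: check that the workflow name or run ID exists.",
    -32603: "Internal error: the server could not complete the request.",
}


def friendly_error(code, raw_message):
    """Hypothetical helper: prefix a raw MCP error with a readable hint.

    The raw code and message are kept in parentheses so nothing is lost
    for debugging.
    """
    hint = FRIENDLY_HINTS.get(code, "Unexpected error.")
    return f"{hint} (code {code}: {raw_message})"


if __name__ == "__main__":
    print(friendly_error(-32603, "failed to fetch run metadata"))
```

Keeping the original code and message in the output preserves the detail needed for debugging while giving users an actionable first line.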

### Environment

- **Repository**: github/gh-aw
- **Testing Run ID**: 23702146707
- **Date**: 2026-03-29
- **Affected Run**: [§23701814578](https://github.com/github/gh-aw/actions/runs/23701814578)




> Generated by [Daily CLI Tools Exploratory Tester](https://github.com/github/gh-aw/actions/runs/23702146707/agentic_workflow) · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Fdaily-cli-tools-tester%22&type=issues)
> - [x] expires on Apr 5, 2026, 5:26 AM UTC
