audit-skills v0: substantive-depth limitations in primary-adopter cycle (Sentinel CHARTER-07) + prompt resolver HTML-comment bug

## Context

This issue consolidates feedback from the **first primary-adopter run** of the `audit-skills` flow shipped in `fw-4.8.0` / `cli-3.9.0` (3 May). Sentinel is executing CommsHub Etapa 2 as 9 Charters per the meta-plan; CHARTER-07 (foundation: setup + ports + 7 migrations + SQLC + scaffolding + Wire stub) just completed its checkpoint-driven audit cycle — the first such cycle in any adopter project.

This is exactly the data point Etapa 2 telemetry was set up to surface. We chose to file this **before** the consolidated snapshot at the end of Etapa 2 because the issues identified are blockers for substantive audit value and merit upstream iteration sooner, not later. Sentinel has paused CHARTER-07 close (Charter status: `in-progress`) so the audit can be re-run after upstream fixes, instead of polluting the telemetry with a structurally-limited cycle.

There are two distinct findings (R10 mechanical bug, R11 substantive-depth limitations) plus a forward-looking proposal that builds on the methodology Sentinel had pre-DevTrail. All evidence lives in the Sentinel repo paths cited at the bottom of this issue and is reproducible.

---

## R10 — Prompt resolver duplicates content by ignoring HTML comment boundaries

### Symptom

After `devtrail charter audit CHARTER-07` (Step 1/3 PREPARE), `audit/charters/CHARTER-07/prompts/auditor-primary.prompt.md` was 1300 lines with the **Charter content + AILOG + diff appearing twice** — once inside the template's leading HTML comment block (lines 1-605), and again in the prompt body proper (lines 607+). The operator detected the duplication visually before pasting into the auditor LLM.

### Root cause

`.devtrail/audit-prompts/auditor-primary.md` (the template) has an HTML comment header (lines 1-32) that documents available placeholders. Lines 21-31 list each placeholder with a description using literal `{{placeholder}} — description` syntax:

```
Placeholders supported by `devtrail charter audit`:
  {{charter_id}}        — e.g., CHARTER-05
  {{charter_content}}   — full body of the Charter doc
  {{git_diff}}          — output of `git diff <git_range>`
  ...
```

The CLI resolver does **string-replace globally** of each `{{placeholder}}` → resolved content, **without distinguishing whether the placeholder is inside or outside an `` block**. Result: the documentation-only lines (e.g., `{{charter_content}} — full body of the Charter doc`) expand to `(entire Charter body) — full body of the Charter doc`, inflating the comment block from 32 to 605 lines and duplicating every payload (Charter + AILOG + diff) that already appears in its proper place outside the comment.

### Scope

Only affects `auditor-primary.md` template. The `auditor-secondary.md` template has a short comment (16 lines) without `{{placeholders}}` — no duplication there.

### Operational impact

- ~30k tokens of duplicated payload sent to the auditor LLM (cost overhead in gpt-4o / gemini chat).
- Risk that the auditor interprets the duplication as a delta between two versions of the Charter, degrading finding quality.
- Documentation block becomes noise (no longer useful as placeholder reference).

### Workaround applied locally

```bash
cd audit/charters/CHARTER-07/prompts/
tail -n +607 auditor-primary.prompt.md > tmp && mv tmp auditor-primary.prompt.md
```

Reduces primary from 1300 → 694 lines (comparable with secondary at 704). Workaround is non-destructive (touches only the resolved prompt, not the template) and reversible (re-running `devtrail charter audit` regenerates the buggy version).

Sentinel commit applying the workaround + AILOG R10 documentation: see references at bottom.

### Proposed fix

Two complementary options for the resolver:

1. **Respect HTML comment boundaries**: skip placeholder replacement inside `` ranges. Conservative; preserves comment block as documentation.
2. **Detect documentation-only lines**: if a placeholder appears as `  {{placeholder}} — description` (indented, followed by ` — `), skip the replacement on that specific line. Less conservative but doesn't require markdown comment parsing.

Either fix unblocks the template's documentation header from being weaponized as a content duplicator.

---

## R11 — Audit cycle v0 produces audits of structurally limited substantive reach

The CHARTER-07 audit cycle completed with `findings_total: 1` (one self-categorized `false_positive` from auditor-primary; auditor-secondary reported zero). On surface that looks clean. **It is not — it is mechanically true for the audit window the auditors received, but vacuously so relative to the Charter's actual scope.** Two structural causes.

### Root cause A — `git_range` defaults to `HEAD~1..HEAD`

CHARTER-07 has **8 commits** in the feature branch `charter-07-commshub-foundation`:

| Commit | Scope |
|---|---|
| `2ee3352` | Setup: module skeleton + Preference Center embed assets |
| `cee74f3` | Migrations (T004-T010): 7 migrations 012-018 with RLS + partitioning |
| `a6efa9c` | SQLC (T011): 12 query files + generated code + 2 migration fixes |
| `824ad76` | Scaffolding: 4 ports + 28 errors + 21 models + 16 events + PII guard test + 8 OTel metrics |
| `6d68392` | Wire (T018) + compile gate (T019) + middleware verify (T019a) |
| `7486b64` | AILOG-035 + atomic Charter §Closing notes |
| `c68702d` | Charter `originating_ailogs` + canonical R<N> drift suppression format |
| `aa5e948` | R10 workaround (the only commit the auditors saw) |

The CLI defaults `git_range` to `HEAD~1..HEAD`, so the auditors processed **only the final atomic-update commit** — a metadata-only delta that updates Charter frontmatter and reorganizes AILOG R7-R9 into canonical drift-suppression format. They did not see migrations, SQLC, scaffolding, Wire, or the PII guard test — i.e., they did not audit the ~4150 lines that constitute CHARTER-07's actual implementation.

Both auditors performed correctly given what they were handed. But the convergence on "no substantive findings" carries no weight as evidence that the foundation is correct — they were not given the foundation to audit.

### Root cause B — Paste-based mode without filesystem tool access

Even if `git_range` had been `main..HEAD` (the full branch), the auditors operated in paste-based mode. They could not:

- Open files outside the diff to verify cross-references (e.g., does function X invoked here actually exist in file Y).
- Consult `data-model.md` to validate migrations against documented schema intent.
- Confirm the `eventTypeToPayload` map covers the full set declared in `EmittedEventTypes`.
- Inspect spec-001 RLS patterns to validate the new `templates_tenant_or_system` policy doesn't leak system rows into spec-001 tests.

Without tool use, auditors extend to training data and assumptions. Sentinel's pre-DevTrail audit methodology had an iteratively-refined prompt that explicitly enforced **"do not opine on files you have not opened"** — a discipline that catches a known failure mode where Gemini, in particular, has a tendency to assume class/interface correctness from naming alone without reading the file. That discipline isn't representable in a paste-based prompt.

### Calibrator compensation (this cycle only)

The calibrator (claude-opus-4-7[1m] running in the operator's IDE) has filesystem access and read both the diff and the 5 implementation commits not in the audit window. The calibrator output (`audit/charters/CHARTER-07/calibrator-reconciler.md` §Reconciliation summary) documents the structural limitations explicitly — as critical observations, NOT as fresh `C<N>` findings (the calibrator role explicitly forbids introducing fresh findings: "that's what the next audit cycle is for").

This compensation works for one Charter audited by an operator using a calibrator-class tool with filesystem access. It does not generalize: most adopters won't have a calibrator with full repo read.

---

## Forward-looking proposal (not a regression)

The audit-skills v0 architecture has real strengths worth preserving:

- **Cross-family heterogeneity** as the discovery mechanism (different model family per auditor → different blind spots → convergence is high signal).
- **Calibrator pattern** (definitional, not discovery).
- **Opt-in checkpoint** per AGENT-RULES §12 heuristic.
- **Telemetry capture as a side effect** of the workflow (zero extra operator burden).

The proposal below preserves all four and addresses the substantive depth gap:

### Audits, when run, should run via CLI with read-only filesystem access

The `auditor-primary` and `auditor-secondary` roles should be invoked via auditor-side CLI tooling (gemini-cli, claude-cli, copilot-cli, codex-cli — whatever the operator has) configured with **read-only access to the repo**. The DevTrail prompts then enforce the principle that gives the audit its substance:

> *"You may only opine on files you have read via tool call. Any finding you produce must cite the specific files (paths and lines) you opened. If you have not opened a file, you may not infer behavior, structure, or correctness about it."*

This shifts the audit from "extend training data + diff" to "verify diff + read context + cite evidence." It is the discipline that distinguishes substantive code review from pattern matching.

The Sentinel pre-DevTrail audit methodology used exactly this approach across the spec-001 implementation cycle. The prompt was iteratively refined as specific resbalones (slip-ups) were observed per model — Gemini's class-by-name assumption being the canonical example we polished out. That mature prompt is shareable; we can contribute it as a starting point if useful.

### `git_range` default should match the Charter's commit set, not the last commit

For Charters implemented as multiple commits on a feature branch (the common case for L Charters), the default `HEAD~1..HEAD` is the wrong unit. Sensible alternatives:

- `git merge-base origin/main HEAD..HEAD` — captures all commits unique to the branch since divergence from main.
- `origin/main..HEAD` — same effect, simpler to express.
- A `--range` flag the operator overrides per invocation, but with the default biased toward "the full branch since main" rather than "the last commit."

This applies regardless of whether audits move to CLI mode or stay paste-based.

### Optional: tighten the resolver (R10 fix)

Independent of the larger architectural shift — the R10 fix above is small, mechanical, and worth shipping as a patch independent of any larger v1 redesign.

---

## Evidence in Sentinel repo

All evidence is reproducible from the Sentinel branch `charter-07-commshub-foundation` (commits `2ee3352..f0f0cee`). Branch is currently local; will push if the upstream maintainers want to inspect concretely — say the word.

Substantive references inside the Sentinel branch:

- `.devtrail/07-ai-audit/agent-logs/AILOG-2026-05-04-035-charter-07-commshub-foundation.md` §Risk:
  - **R10** (~lines 286-352): full root-cause analysis, scope, workaround applied, fix proposal.
  - **R11** (~lines 354-440): structural-depth limitations, both root causes elaborated, calibrator compensation applied, forward-looking proposal.
- `audit/charters/CHARTER-07/auditor-primary.md`: copilot-v1.0.40 (gpt-5.3-codex) actual response (1 finding, self-categorized `false_positive`).
- `audit/charters/CHARTER-07/auditor-secondary.md`: gemini-2.5-pro-cli actual response (0 findings, "execution and accompanying documentation are pristine").
- `audit/charters/CHARTER-07/calibrator-reconciler.md` §Reconciliation summary: detailed analysis of why the convergent zero-substantive-findings does not constitute evidence of foundation correctness.
- `audit/charters/CHARTER-07/external-audit-pending.yaml`: the `external_audit:` block ready to merge into `.devtrail/charters/CHARTER-07.telemetry.yaml` once the Charter closes (currently held back; see operational decision below).

Six commits granular for the implementation + 3 for AILOG/atomic update/audit cycle. The branch is paused at the audit-cycle close commit (`f0f0cee`) — no `devtrail charter close` invoked, no PR opened.

## Operational decision in Sentinel

Per the operator's call: **pause CHARTER-07 close until upstream iterates on R10 + R11 fixes**. Once a new `cli` version ships with the resolver fix (R10) + a sensible `git_range` default (R11 part A) — and ideally with the CLI-based audit pattern surfaced (R11 part B) — Sentinel will:

1. `devtrail update-cli` + `devtrail update-framework`.
2. Re-run the audit cycle on CHARTER-07 with the new flow.
3. `devtrail charter close CHARTER-07` with the substantive audit findings merged into telemetry.
4. Continue with CHARTER-08 onward under the improved methodology.

The current audit cycle outputs (auditor responses + calibrator) stay in `audit/charters/CHARTER-07/` as historical evidence of what the v0 produced; the re-run will overwrite or coexist as a separate cycle (decision pending upstream iteration).

This issue is the formal feedback channel from the first primary adopter. Sentinel will continue Etapa 2 telemetry per the agreed `devtrail-telemetria-etapa-2.md` v0.1 conventions and report a consolidated snapshot at the end (CHARTER-13 close), but R10 + R11 + the forward-looking proposal merit upstream attention now rather than at the end of a multi-week cycle.

Happy to provide more concrete material (the Sentinel pre-DevTrail audit prompt, full auditor responses, additional Charter execution data) if useful.

---

🤖 Filed by Sentinel operator at the close of CHARTER-07 audit cycle, with calibrator analysis from claude-opus-4-7[1m]. All assertions verifiable against the cited commits and file paths.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

audit-skills v0: substantive-depth limitations in primary-adopter cycle (Sentinel CHARTER-07) + prompt resolver HTML-comment bug #102

Context

R10 — Prompt resolver duplicates content by ignoring HTML comment boundaries

Symptom

Root cause

Scope

Operational impact

Workaround applied locally

Proposed fix

R11 — Audit cycle v0 produces audits of structurally limited substantive reach

Root cause A — `git_range` defaults to `HEAD~1..HEAD`

Root cause B — Paste-based mode without filesystem tool access

Calibrator compensation (this cycle only)

Forward-looking proposal (not a regression)

Audits, when run, should run via CLI with read-only filesystem access

`git_range` default should match the Charter's commit set, not the last commit

Optional: tighten the resolver (R10 fix)

Evidence in Sentinel repo

Operational decision in Sentinel

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Commit	Scope
`2ee3352`	Setup: module skeleton + Preference Center embed assets
`cee74f3`	Migrations (T004-T010): 7 migrations 012-018 with RLS + partitioning
`a6efa9c`	SQLC (T011): 12 query files + generated code + 2 migration fixes
`824ad76`	Scaffolding: 4 ports + 28 errors + 21 models + 16 events + PII guard test + 8 OTel metrics
`6d68392`	Wire (T018) + compile gate (T019) + middleware verify (T019a)
`7486b64`	AILOG-035 + atomic Charter §Closing notes
`c68702d`	Charter `originating_ailogs` + canonical R drift suppression format
`aa5e948`	R10 workaround (the only commit the auditors saw)

audit-skills v0: substantive-depth limitations in primary-adopter cycle (Sentinel CHARTER-07) + prompt resolver HTML-comment bug #102

Description

Context

R10 — Prompt resolver duplicates content by ignoring HTML comment boundaries

Symptom

Root cause

Scope

Operational impact

Workaround applied locally

Proposed fix

R11 — Audit cycle v0 produces audits of structurally limited substantive reach

Root cause A — git_range defaults to HEAD~1..HEAD

Root cause B — Paste-based mode without filesystem tool access

Calibrator compensation (this cycle only)

Forward-looking proposal (not a regression)

Audits, when run, should run via CLI with read-only filesystem access

git_range default should match the Charter's commit set, not the last commit

Optional: tighten the resolver (R10 fix)

Evidence in Sentinel repo

Operational decision in Sentinel

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Root cause A — `git_range` defaults to `HEAD~1..HEAD`

`git_range` default should match the Charter's commit set, not the last commit