Skip to content

Charter flow telemetry from Sentinel CHARTER-02 — 8 frictions (F1-F8) + 3 observations (O1-O3) on fw-4.6.0/cli-3.7.0 #81

@montfort

Description

@montfort

Summary

Empirical validation of the full Charter flow shipped in fw-4.6.0 / cli-3.7.0,
executed end-to-end on a real backend Charter in the Sentinel repo
(CHARTER-02 — GIN index sobre heartbeats.checks).
The flow worked, but produced 8 reproducible frictions (F1-F8) and
3 design observations (O1-O3) worth addressing before more adopters
hit the same papercuts.

This report is the output of a deliberate validation exercise, not an
opportunistic post-mortem — instrumentation was added inline so the
metrics are honest. Quantitative timing of every CLI invocation is in
CHARTER-02.telemetry.yaml.

Test methodology

  • Surface tested: devtrail charter new, devtrail charter drift
    (with and without --no-ailog-suppress), devtrail charter close
    (--from-template --non-interactive path; interactive path not
    tested
    in this cycle, requires a real human session), and
    devtrail approve (flag-driven path on 3 documents).
  • Test workload: a real backend Charter (S effort estimate, ~30-45
    min target) that ships an SQL migration + integration test. Not a toy
    charter; exercises real code paths.
  • Bonus workload: 3 retroactive devtrail approve calls against
    documents with review_required: true that had been pending since
    before fw-4.6.0 canonized the approval workflow. One of them
    (already-approved AIDEC) revealed F4.
  • Reproducibility: every command timing was wrapped with
    START=$(date +%s%N); <cmd>; END=$(date +%s%N) so wall-clock numbers
    are honest. Frictions are described with the input that triggered
    them so they can be reproduced from a clean repo.
  • Out of scope: interactive devtrail charter close UX (needs
    human-in-the-loop session); WIF/CI integration of the pre-pr.sh hook
    (Sentinel only has 1 active Charter).

Frictions (recommended fixes)

ID Surface Friction Severity Recommended fix
F3 check-charter-drift.sh regex The regex extracts paths from any column of the ## Files to modify markdown table, not just the "File" column. A path mentioned as a textual reference in the "Change" column (e.g. `docs/plans/README.md` as "follows the pattern of ...") is parsed as a declared deliverable → false-positive omission warning. Medium (real false positive — degrades signal of a script whose value is precision) Tighten the regex: only extract from column 1, or require an explicit syntactic marker for declared paths.
F4 devtrail approve re-application Approving a doc that already has reviewed_by / reviewed_at / review_outcome in frontmatter and a ## Approval section in the body does not warn. The frontmatter is overwritten silently, and a second ## Approval section is appended to the body (resulting in two blocks with the same date but potentially different --notes). Medium (silent doc corruption) Detect existing approval; default behavior: skip body section append + print "already approved by <reviewer> on <date> — pass --force to re-approve". --force is the right shape for the revisions_requested → approved cycle.
F6 devtrail charter close body sync The command updates frontmatter status: closed correctly, but does not update the "Status (mirrored from frontmatter — source of truth is above)" line in the Charter body (typically line 12 of the template). After close, frontmatter says closed, body still says in-progress. Silent doc drift. Medium (the template's own promise — "mirrored from frontmatter" — is broken by the CLI) Either (a) the CLI syncs both representations on close, or (b) the template stops claiming the line is mirrored from frontmatter (i.e., remove the line from the scaffold). (a) is cleaner.
F1 devtrail charter new --title The --title produces an aggressive slug. CHARTER-02's title "GIN index sobre heartbeats.checks (Plan 04 F5)" became filename 02-gin-index-sobre-heartbeats-checks-plan-04-f5.md (47 chars). No truncation, no --slug flag override. Low (cosmetic / UX) Add --slug flag (operator override); default slug truncation at e.g. 40 chars.
F2 devtrail charter new --from-ailog The --from-ailog AILOG-X flag is honored in the originating_ailogs frontmatter list, but the body header line ("Origin: Follow-up of AILOG-X. [Add 1-line context about why this Charter exists now.]") keeps the textual placeholder. Low (UX / scaffold quality) Auto-extract the first 1-2 sentences of the referenced AILOG's ## Summary (or ## Context) and inject them where the placeholder lives.
F5 devtrail approve risk_level Approving a risk_level: high or critical document in flag-driven mode produces no warning. Per AGENT-RULES.md §4, those triggers require thorough human review. Low-Medium (defense-in-depth) Print "WARNING: high risk doc — ensure thorough review" before frontmatter write. Optionally require --confirm-high-risk flag for those cases.
F7 devtrail charter close output The output is identical for first-run (creating telemetry from template) and subsequent-runs (revalidating filled telemetry). Both print next: edit the telemetry YAML and re-run.... Low (UX — operator can't tell whether they're done) Differentiate output: first run → "telemetry template created — fill in and re-run"; subsequent runs → "telemetry validated, N filled / M total fields — Charter close finalized" or similar.
F8 devtrail charter close closed_at The Charter Closure section of the template (step 4) mentions closed_at: YYYY-MM-DD as "optional", but the CLI never writes it (neither in --from-template nor implied for the interactive path I didn't test). The operator has to add it manually. Low (data completeness — every closed Charter should be timestamped) Auto-add closed_at: <today> to frontmatter on charter close execution. The schema already permits arbitrary additional fields.

Observations (intentional design / no bugs, worth evaluating)

ID Surface Observation Discussion
O1 check-charter-drift.sh (lines 101-105) The script considers all paths under docs/charters/* and .devtrail/07-ai-audit/* as "always in scope" and silently filters them from the scope-expansion warning. In CHARTER-02 this hid 3 legitimate scope-expansion entries (the 3 retroactive approves on AILOG/AIDEC files). Trade-off: simplicity vs precision. For technical Charters this rule is reasonable (governance docs are always legitimate). But a Charter that opportunistically grows scope to cover massive governance work would be invisible to the drift signal. Consider an opt-in --strict-scope flag that disables the rule.
O2 CLI performance All discrete CLI commands run sub-20ms wall-clock. Total wall time of the full Charter flow (excluding the interactive close and the human edit time): ~50ms across 6 commands. The original 5-10 min target for charter close was conservative; the bottleneck is unequivocally the operator + their editor, not the CLI. Excellent. Nothing to fix. Worth communicating to adopters: the CLI itself is essentially free, perceived friction is human authoring.
O3 --no-ailog-suppress flag Output is byte-identical to the default (suppress-on) mode. Either the AILOG-suppression logic only kicks in when originating_ailogs reference AILOGs containing explicit R<N> sections that would have been suppressed, or the flag is wired in the CLI wrapper but not in the underlying script, or there's a dispatch bug. Documented behavior in --help is "Disable AILOG-aware suppression (show every declared-omitted path even if documented in an AILOG)". Even when no suppression actually fires, it would help to see e.g. INFO: 0 paths suppressed by AILOG-aware filter so the operator knows the flag had a chance to do something.

Quantitative data

Command                                                  Wall time
-------------------------------------------------------- ---------
devtrail charter new -t S --from-ailog X --title "..."   2 ms
devtrail charter drift CHARTER-02 --range main..HEAD     17 ms
devtrail charter drift ... --no-ailog-suppress           15 ms
devtrail approve <id> --outcome approved --reviewer ... ─┐
                                              (×3)       ├ 3 ms each
                                                        ─┘
bash .devtrail/scripts/check-charter-drift.sh ...        ~16 ms (direct script)
devtrail charter close CHARTER-02 --from-template ...    2 ms (×2 — first + revalidation)

Total CLI wall time across ~10 invocations:              ~50 ms

Charter authoring (manual editing):                      ~30-45 min
Telemetry instrumentation overhead:                      ~15 min
F4 cleanup (manual):                                     ~5 min
F6 manual sync (Status mirror):                          ~3 min

Effort estimated (Charter):  S (~30-45 min)
Effort actual:               M (~75 min) — 1.8x drift
Effort actual w/o telemetry: ~40 min — within S bucket

Recommendations

  1. Priority for cli-3.7.x patches (real bugs in commands the docs imply are clean): F3, F4, F6.
  2. Polish for cli-3.8.0 (UX improvements that don't block adopters): F1, F2, F7, F8.
  3. Defense-in-depth for the next minor release: F5.
  4. Worth a design discussion before changing: O1 (the "always in scope" rule is opinionated — adopters may want a --strict-scope flag), O3 (verify whether --no-ailog-suppress actually toggles behavior in any reproducible scenario).

What worked exceptionally well

  • The format v4 atomic-update pattern caught a placeholder drift (AILOG-NNN-...mdAILOG-030-...md) in-session via devtrail charter drift. Without atomic update, this would have surfaced in a post-merge housekeeping PR (which is exactly the failure mode v4 was designed to prevent).
  • The reviewed_by / reviewed_at / review_outcome fields canonized in fw-4.6.0 (RFC RFC: Canonical approval workflow for review_required documents (AIDEC, ETH, MCARD, ADR) #67) worked verbatim with the local extension Sentinel had adopted before canonization. Zero migration, zero rename, zero conflict. This is the right shape of upstream adoption — local validation generates the canonical names.
  • The scaffold from devtrail charter new included all 6 canonical conventions immediately. A new adopter doesn't need to read the template to write a correct Charter; they edit the scaffold and the conventions are already embedded.
  • The drift script's originating_ailogs-aware suppression (the wrapped-by-CLI part, not the bash script) is the right architectural choice: it lets the Charter declare its risks once, in the AILOG, and then the drift signal is automatically denoised against that declaration.

References

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingenhancementNew feature or request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions