Skip to content

feat(ce-compound): add frontmatter parser-safety validator#697

Merged
tmchow merged 6 commits intomainfrom
tmchow/frontmatter-validator
Apr 26, 2026
Merged

feat(ce-compound): add frontmatter parser-safety validator#697
tmchow merged 6 commits intomainfrom
tmchow/frontmatter-validator

Conversation

@tmchow
Copy link
Copy Markdown
Collaborator

@tmchow tmchow commented Apr 26, 2026

Summary

Frontmatter that strict YAML parsers would silently misread is now caught at write-time. After /ce-compound or /ce-compound-refresh writes a doc, a stdlib-only Python script scans the frontmatter for the bug class that prose rules miss; bad output blocks success with a concrete diagnostic ("related_pr value contains ' #' — quote it") so the agent can fix and retry until the doc is parser-safe.

The original surfacing was Codex on PR #695: related_pr: PR #685 ... parsed as just 'PR' because YAML treats unquoted # as a comment delimiter. A 2026-04-16 plan doc had the same family of bug with unquoted : in a title: field. Both are silent — parsers accept them, downstream code reads garbage, no error fires. The validator targets exactly that bug class.

What the validator catches

Check Bug class Evidence
Frontmatter delimiter lines (--- matched as full lines, not substrings) Malformed structure (----, ---extra, missing closer) silently parses as body-only Codex catch on PR #697 itself
# substring in unquoted scalar Silent comment truncation Codex catch on PR #695
: substring in unquoted scalar Silent mapping confusion 2026-04-16 plan doc title: field

The validator does not flag values starting with leading reserved indicators (` / * / & / ! / \| / > / % / @) because those produce loud parser errors downstream rather than silent corruption — out of stated scope. The two YAML 1.2 indicators that need-trailing-whitespace (-, ?) are also not flagged because bare -foo and ?foo are valid plain scalars.

Design decisions

Stdlib only, no PyYAML dependency. First draft required PyYAML for parser-fidelity validation. Dropped after considering distribution: PyYAML is not Python stdlib, macOS system Python doesn't ship it, mise/asdf-installed Pythons don't ship it. Many user environments would need pip install pyyaml before /ce-compound works end-to-end. The bug class the validator targets is purely regex-detectable, so dropping the YAML parser dependency costs nothing.

Scope: silent corruption only, not lint or schema validation. The validator catches issues YAML parsers won't surface as errors. It does NOT enforce schema.yaml's required-field or enum rules, and does NOT flag YAML reserved indicators that produce loud failures (those are caught by the next tool in the pipeline). A repo-wide sweep found 17+ existing docs with stale schema mismatches; tightening here would block valid new docs because the schema is out of date relative to actual /ce-compound usage. Schema alignment is a separate concern.

Two skill copies, one script. Per AGENTS.md, skills are self-contained — no cross-skill file references. Both ce-compound/scripts/ and ce-compound-refresh/scripts/ get identical copies of validate-frontmatter.py. A test asserts byte-identity to keep them from drifting. SKILL.md in each skill invokes python3 scripts/validate-frontmatter.py <doc-path> directly, matching the existing repo idiom for Python skill scripts (ce-session-extract, ce-session-inventory, ce-release-notes).

Catch radius

Verified against three surfaces:

Test plan

bun test (939/939) and bun run release:validate clean.

References


Compound Engineering
Claude Code

…validator

Add a stdlib-only Python script that ce-compound and ce-compound-refresh
run after writing a doc, to catch the YAML parser-safety bug class that
prose rules miss. PR #695 had a Codex review catch a real instance of this
on safe-auto-rubric-calibration-2026-04-25.md: `related_pr: PR #685 ...`
silently truncated to `'PR'` because YAML treats unquoted ' #' as a comment
delimiter. The prose rules in references/yaml-schema.md cover this for
array items but not for top-level scalars; the validator closes that gap.

## What changed

- `plugins/compound-engineering/skills/ce-compound/scripts/validate-frontmatter` —
  bash wrapper, executable
- `plugins/compound-engineering/skills/ce-compound/scripts/validate-frontmatter.py` —
  Python implementation, no third-party deps
- Same pair duplicated under `ce-compound-refresh/scripts/` (per AGENTS.md
  no-cross-skill-shared-files rule). Tests assert content identity.
- SKILL.md updates wire `bash scripts/validate-frontmatter <doc-path>`
  into the post-write step in both skills, with retry-on-fail loop.
- `tests/frontmatter-validator.test.ts` — 24 cases covering valid frontmatter,
  each failure mode, usage errors, and identity across the two skill copies.

## Why stdlib-only (no PyYAML)

Original draft required PyYAML for parser-fidelity validation. Dropped after
considering distribution: PyYAML is not Python stdlib; macOS system Python
doesn't include it, mise/asdf-installed Pythons don't, and many user
environments would need `pip install pyyaml` before /ce-compound works
end-to-end. The truncation/quoting bug class is purely regex-detectable —
` #` substring, `: ` substring, leading `-`/`*`/`&`/`!`/etc. — so dropping
the YAML parser dependency costs nothing for the bug class we care about.

## Scope: parser-safety only, not full schema validation

The validator catches *parser-safety* issues (silent data loss from YAML's
quoting rules). It does NOT validate against schema.yaml's required-field or
enum-value rules. A repo-wide sweep of existing docs/solutions/ found that
17+ docs have stale schema mismatches (component values like `subagent-template`
or `installer` that aren't in the enum, missing required fields, etc.).
Fixing those is a separate concern that depends on schema.yaml being brought
current relative to ce-compound's actual usage. Filed as a follow-up; this
PR ships only what catches the Codex bug class.

## Catch radius

Would have caught the safe-auto-rubric-calibration-2026-04-25.md and the
plan doc's title issue from PR #695 #696 — both top-level scalars with
unquoted ' #' or ': '. Verified by running on synthetic broken docs in the
test suite and on every existing doc in docs/solutions/ (30/30 pass on
docs that have any frontmatter; 1 legit fail on a doc that has none, which
is correct behavior).

Tests: 934/934 pass (+24 from the new test file). release:validate clean.
@tmchow tmchow changed the title feat(ce-compound,ce-compound-refresh): add frontmatter parser-safety validator feat(ce-compound): add frontmatter parser-safety validator Apr 26, 2026
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 816b251907

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-compound/scripts/validate-frontmatter.py Outdated
Comment thread plugins/compound-engineering/skills/ce-compound/scripts/validate-frontmatter.py Outdated
…python3 directly

The wrapper script (validate-frontmatter that exec'd python3
validate-frontmatter.py) didn't earn its keep. It added 4 files to
maintain (2 skills × 2 wrappers), a byte-identity test, an extra
subprocess per call, and an indirection layer where bash quoting/path
issues would surface as Python failures — all to save typing 'python3 '
and '.py' in the SKILL.md invocation.

The existing repo idiom for Python skill scripts is direct invocation:
ce-session-extract calls 'python3 scripts/extract-skeleton.py' in its
own SKILL.md; ce-session-inventory and ce-release-notes do the same.
There's no wrapper convention to consistency-match against.

Drop the wrappers. Update SKILL.md in both skills to invoke
'python3 scripts/validate-frontmatter.py' directly. Drop the
bash-wrapper-identity test (no wrappers left to compare).

Tests: 933/933 pass (-1, the dropped wrapper-identity test).
release:validate clean.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1136c4e7df

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-compound/scripts/validate-frontmatter.py Outdated
tmchow added 3 commits April 25, 2026 21:21
…dicator checks

Codex on PR #697 caught two false-positive vectors in the leading-reserved-
indicator check:

1. `-` is a list/sequence marker in YAML only when followed by whitespace.
   Bare `-foo` (no space after `-`) is a valid plain scalar that strict
   parsers accept without complaint. The unconditional check would reject
   `title: -foo` and force /ce-compound to retry on frontmatter that's
   actually fine.

2. `?` is a complex-mapping-key marker only when followed by whitespace.
   Bare `?foo` is a valid plain scalar with the same story.

Drop `-` from the leading-indicator check entirely. Drop `?` from the
reserved tuple. The cases where they ARE problematic (`- foo` / `? foo`
with whitespace) are loud parser failures, not silent misreads, so they're
out of scope for this script (which targets silent-corruption bugs).

The remaining reserved indicators kept in the check (`` ` ``, `*`, `&`,
`!`, `|`, `>`, `%`, `@`) are special at first character without needing
trailing whitespace, so the unconditional check is correct for them.

Tests:
- Removed the `-starts-with-dash` rejection test (was asserting wrong
  behavior all along — Codex's catch).
- Added two positive tests that lock in the fix: valid plain scalars
  starting with `-` and `?` must now pass validation.

Verified the YAML semantics with PyYAML: `title: -foo` parses as
`{'title': '-foo'}`; `title: - foo` raises `ScannerError`. The validator's
job is to catch the silent-misread bugs, not the loud-parser-error ones.

Tests: 935/935 pass (+1 net from new positive tests). release:validate clean.
… full lines

Codex on PR #697 caught that `text.find("\n---", 4)` accepts any line
starting with `---` as a valid closing delimiter — including `----` or
`---extra`. The validator returns OK on those, but downstream frontmatter
readers in this repo require a delimiter line whose stripped content is
exactly `---`, so malformed docs slip through and parse as body-only
later.

Replaced the substring search with line-based matching: split the file
into lines, require both delimiters to be lines whose `rstrip()` is
exactly `---`. This rejects `----` and `---extra` while still permitting
trailing whitespace on a valid `---` line. The same fix applies to the
opening delimiter check (which was less vulnerable but inconsistently
strict — `text.startswith("---\\n")` rejected `---  \\n`).

Same change duplicated to ce-compound-refresh's copy per AGENTS.md
no-cross-skill-shared-files; the byte-identity test catches drift.

Tests:
- New: rejects '----' as closing delimiter
- New: rejects '---extra' as closing delimiter
- New: accepts '---   ' (with trailing whitespace) on either delimiter
- Existing tests still pass

Tests: 941/941 pass (+6 from new delimiter cases). release:validate clean.
…eck (out of stated scope)

Per the validator's own docstring its scope is silent corruption — values
that strict YAML parsers misread without raising — but the leading
reserved-indicator check (`*`, `&`, `!`, `|`, `>`, `%`, `@`, `` ` ``) was
catching the loud-error class instead. Those characters either parse with
a different intent than the author meant (anchor declaration, alias
reference, tag) or trigger explicit parser failures, which downstream
parsers already surface. The validator's job is the bug class downstream
parsers DON'T surface (silent truncation, silent restructuring), not lint.

Background: this check was added speculatively, by extrapolating from the
YAML 1.2 reserved-indicator list rather than from observed bugs. Subsequent
review rounds had to walk back the `-` and `?` cases (Codex flagged them
as YAML-spec-incorrect — both only act as markers when followed by
whitespace). The exercise shows the check was reaching beyond the
evidence-based scope.

What's gone:
- The unconditional first-character-against-tuple check
- Its dedicated test case
- Mention of reserved indicators in the docstring

What stays (the actual evidence-based bug class):
- Frontmatter delimiter line matching (catches `----` / `---extra` /
  missing closer / missing opener)
- ` #` substring → silent comment truncation (the original Codex bug)
- `: ` substring → silent mapping confusion (the plan-doc title bug)

Tests: 939/939 pass. release:validate clean.
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f848177b8f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread plugins/compound-engineering/skills/ce-compound/SKILL.md Outdated
…tor's actual scope

Codex on PR #697 caught that the SKILL.md instruction for invoking
validate-frontmatter.py still claimed it catches "leading reserved
indicators" — but the previous commit (f848177) deliberately removed
that check as out-of-stated-scope (loud parser errors, not silent
corruption). The doc-vs-code drift creates false confidence.

Fix: reword the validator-invocation step in both skills to describe what
the script actually checks (delimiter lines + ' #' truncation + ': '
mapping confusion) and explicitly call out what it does NOT check
(schema rules, reserved indicators) so the boundary is clear.

Tests: 939/939 pass. release:validate clean.
@tmchow tmchow merged commit 7eea2d1 into main Apr 26, 2026
2 checks passed
@github-actions github-actions Bot mentioned this pull request Apr 26, 2026
michaelvolz pushed a commit to michaelvolz/compound-engineering-plugin-windows-version that referenced this pull request Apr 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant