feat(ce-compound): add frontmatter parser-safety validator by tmchow · Pull Request #697 · EveryInc/compound-engineering-plugin

tmchow · 2026-04-26T04:10:38Z

Summary

Frontmatter that strict YAML parsers would silently misread is now caught at write-time. After /ce-compound or /ce-compound-refresh writes a doc, a stdlib-only Python script scans the frontmatter for the bug class that prose rules miss; bad output blocks success with a concrete diagnostic ("related_pr value contains ' #' — quote it") so the agent can fix and retry until the doc is parser-safe.

The original surfacing was Codex on PR #695: related_pr: PR #685 ... parsed as just 'PR' because YAML treats unquoted # as a comment delimiter. A 2026-04-16 plan doc had the same family of bug with unquoted : in a title: field. Both are silent — parsers accept them, downstream code reads garbage, no error fires. The validator targets exactly that bug class.

What the validator catches

Check	Bug class	Evidence
Frontmatter delimiter lines (`---` matched as full lines, not substrings)	Malformed structure (`----`, `---extra`, missing closer) silently parses as body-only	Codex catch on PR #697 itself
`#` substring in unquoted scalar	Silent comment truncation	Codex catch on PR #695
`:` substring in unquoted scalar	Silent mapping confusion	2026-04-16 plan doc `title:` field

The validator does not flag values starting with leading reserved indicators (` / * / & / ! / \| / > / % / @) because those produce loud parser errors downstream rather than silent corruption — out of stated scope. The two YAML 1.2 indicators that need-trailing-whitespace (-, ?) are also not flagged because bare -foo and ?foo are valid plain scalars.

Design decisions

Stdlib only, no PyYAML dependency. First draft required PyYAML for parser-fidelity validation. Dropped after considering distribution: PyYAML is not Python stdlib, macOS system Python doesn't ship it, mise/asdf-installed Pythons don't ship it. Many user environments would need pip install pyyaml before /ce-compound works end-to-end. The bug class the validator targets is purely regex-detectable, so dropping the YAML parser dependency costs nothing.

Scope: silent corruption only, not lint or schema validation. The validator catches issues YAML parsers won't surface as errors. It does NOT enforce schema.yaml's required-field or enum rules, and does NOT flag YAML reserved indicators that produce loud failures (those are caught by the next tool in the pipeline). A repo-wide sweep found 17+ existing docs with stale schema mismatches; tightening here would block valid new docs because the schema is out of date relative to actual /ce-compound usage. Schema alignment is a separate concern.

Two skill copies, one script. Per AGENTS.md, skills are self-contained — no cross-skill file references. Both ce-compound/scripts/ and ce-compound-refresh/scripts/ get identical copies of validate-frontmatter.py. A test asserts byte-identity to keep them from drifting. SKILL.md in each skill invokes python3 scripts/validate-frontmatter.py <doc-path> directly, matching the existing repo idiom for Python skill scripts (ce-session-extract, ce-session-inventory, ce-release-notes).

Catch radius

Verified against three surfaces:

25 test cases in tests/frontmatter-validator.test.ts covering valid frontmatter, each silent-corruption failure mode, malformed-delimiter cases (----, ---extra, trailing whitespace on ---), valid plain scalars starting with - and ?, and identity across the two skill copies
All 32 existing docs/solutions/ files: 30 pass on docs that have frontmatter, 1 legit FAIL on a doc that has no frontmatter (correct behavior — /ce-compound would never have written it that way)
The two YAML issues from PR fix(ce-code-review): tighten autofix_class rubric for safe_auto/gated_auto boundary #695 / ce-session-historian inefficient on sparse-history dispatches (17min wall, 33 tool calls) #696 (related_pr truncation + plan doc title: confusion): both would have been caught before merge

Test plan

bun test (939/939) and bun run release:validate clean.

References

PR fix(ce-code-review): tighten autofix_class rubric for safe_auto/gated_auto boundary #695 — Codex caught the original bug
ce-session-historian inefficient on sparse-history dispatches (17min wall, 33 tool calls) #696 — related ce-session-historian inefficiency notes from the same session

…validator Add a stdlib-only Python script that ce-compound and ce-compound-refresh run after writing a doc, to catch the YAML parser-safety bug class that prose rules miss. PR #695 had a Codex review catch a real instance of this on safe-auto-rubric-calibration-2026-04-25.md: `related_pr: PR #685 ...` silently truncated to `'PR'` because YAML treats unquoted ' #' as a comment delimiter. The prose rules in references/yaml-schema.md cover this for array items but not for top-level scalars; the validator closes that gap. ## What changed - `plugins/compound-engineering/skills/ce-compound/scripts/validate-frontmatter` — bash wrapper, executable - `plugins/compound-engineering/skills/ce-compound/scripts/validate-frontmatter.py` — Python implementation, no third-party deps - Same pair duplicated under `ce-compound-refresh/scripts/` (per AGENTS.md no-cross-skill-shared-files rule). Tests assert content identity. - SKILL.md updates wire `bash scripts/validate-frontmatter <doc-path>` into the post-write step in both skills, with retry-on-fail loop. - `tests/frontmatter-validator.test.ts` — 24 cases covering valid frontmatter, each failure mode, usage errors, and identity across the two skill copies. ## Why stdlib-only (no PyYAML) Original draft required PyYAML for parser-fidelity validation. Dropped after considering distribution: PyYAML is not Python stdlib; macOS system Python doesn't include it, mise/asdf-installed Pythons don't, and many user environments would need `pip install pyyaml` before /ce-compound works end-to-end. The truncation/quoting bug class is purely regex-detectable — ` #` substring, `: ` substring, leading `-`/`*`/`&`/`!`/etc. — so dropping the YAML parser dependency costs nothing for the bug class we care about. ## Scope: parser-safety only, not full schema validation The validator catches *parser-safety* issues (silent data loss from YAML's quoting rules). It does NOT validate against schema.yaml's required-field or enum-value rules. A repo-wide sweep of existing docs/solutions/ found that 17+ docs have stale schema mismatches (component values like `subagent-template` or `installer` that aren't in the enum, missing required fields, etc.). Fixing those is a separate concern that depends on schema.yaml being brought current relative to ce-compound's actual usage. Filed as a follow-up; this PR ships only what catches the Codex bug class. ## Catch radius Would have caught the safe-auto-rubric-calibration-2026-04-25.md and the plan doc's title issue from PR #695 #696 — both top-level scalars with unquoted ' #' or ': '. Verified by running on synthetic broken docs in the test suite and on every existing doc in docs/solutions/ (30/30 pass on docs that have any frontmatter; 1 legit fail on a doc that has none, which is correct behavior). Tests: 934/934 pass (+24 from the new test file). release:validate clean.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 816b251907

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…python3 directly The wrapper script (validate-frontmatter that exec'd python3 validate-frontmatter.py) didn't earn its keep. It added 4 files to maintain (2 skills × 2 wrappers), a byte-identity test, an extra subprocess per call, and an indirection layer where bash quoting/path issues would surface as Python failures — all to save typing 'python3 ' and '.py' in the SKILL.md invocation. The existing repo idiom for Python skill scripts is direct invocation: ce-session-extract calls 'python3 scripts/extract-skeleton.py' in its own SKILL.md; ce-session-inventory and ce-release-notes do the same. There's no wrapper convention to consistency-match against. Drop the wrappers. Update SKILL.md in both skills to invoke 'python3 scripts/validate-frontmatter.py' directly. Drop the bash-wrapper-identity test (no wrappers left to compare). Tests: 933/933 pass (-1, the dropped wrapper-identity test). release:validate clean.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 1136c4e7df

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…dicator checks Codex on PR #697 caught two false-positive vectors in the leading-reserved- indicator check: 1. `-` is a list/sequence marker in YAML only when followed by whitespace. Bare `-foo` (no space after `-`) is a valid plain scalar that strict parsers accept without complaint. The unconditional check would reject `title: -foo` and force /ce-compound to retry on frontmatter that's actually fine. 2. `?` is a complex-mapping-key marker only when followed by whitespace. Bare `?foo` is a valid plain scalar with the same story. Drop `-` from the leading-indicator check entirely. Drop `?` from the reserved tuple. The cases where they ARE problematic (`- foo` / `? foo` with whitespace) are loud parser failures, not silent misreads, so they're out of scope for this script (which targets silent-corruption bugs). The remaining reserved indicators kept in the check (`` ` ``, `*`, `&`, `!`, `|`, `>`, `%`, `@`) are special at first character without needing trailing whitespace, so the unconditional check is correct for them. Tests: - Removed the `-starts-with-dash` rejection test (was asserting wrong behavior all along — Codex's catch). - Added two positive tests that lock in the fix: valid plain scalars starting with `-` and `?` must now pass validation. Verified the YAML semantics with PyYAML: `title: -foo` parses as `{'title': '-foo'}`; `title: - foo` raises `ScannerError`. The validator's job is to catch the silent-misread bugs, not the loud-parser-error ones. Tests: 935/935 pass (+1 net from new positive tests). release:validate clean.

… full lines Codex on PR #697 caught that `text.find("\n---", 4)` accepts any line starting with `---` as a valid closing delimiter — including `----` or `---extra`. The validator returns OK on those, but downstream frontmatter readers in this repo require a delimiter line whose stripped content is exactly `---`, so malformed docs slip through and parse as body-only later. Replaced the substring search with line-based matching: split the file into lines, require both delimiters to be lines whose `rstrip()` is exactly `---`. This rejects `----` and `---extra` while still permitting trailing whitespace on a valid `---` line. The same fix applies to the opening delimiter check (which was less vulnerable but inconsistently strict — `text.startswith("---\\n")` rejected `--- \\n`). Same change duplicated to ce-compound-refresh's copy per AGENTS.md no-cross-skill-shared-files; the byte-identity test catches drift. Tests: - New: rejects '----' as closing delimiter - New: rejects '---extra' as closing delimiter - New: accepts '--- ' (with trailing whitespace) on either delimiter - Existing tests still pass Tests: 941/941 pass (+6 from new delimiter cases). release:validate clean.

…eck (out of stated scope) Per the validator's own docstring its scope is silent corruption — values that strict YAML parsers misread without raising — but the leading reserved-indicator check (`*`, `&`, `!`, `|`, `>`, `%`, `@`, `` ` ``) was catching the loud-error class instead. Those characters either parse with a different intent than the author meant (anchor declaration, alias reference, tag) or trigger explicit parser failures, which downstream parsers already surface. The validator's job is the bug class downstream parsers DON'T surface (silent truncation, silent restructuring), not lint. Background: this check was added speculatively, by extrapolating from the YAML 1.2 reserved-indicator list rather than from observed bugs. Subsequent review rounds had to walk back the `-` and `?` cases (Codex flagged them as YAML-spec-incorrect — both only act as markers when followed by whitespace). The exercise shows the check was reaching beyond the evidence-based scope. What's gone: - The unconditional first-character-against-tuple check - Its dedicated test case - Mention of reserved indicators in the docstring What stays (the actual evidence-based bug class): - Frontmatter delimiter line matching (catches `----` / `---extra` / missing closer / missing opener) - ` #` substring → silent comment truncation (the original Codex bug) - `: ` substring → silent mapping confusion (the plan-doc title bug) Tests: 939/939 pass. release:validate clean.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f848177b8f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

…tor's actual scope Codex on PR #697 caught that the SKILL.md instruction for invoking validate-frontmatter.py still claimed it catches "leading reserved indicators" — but the previous commit (f848177) deliberately removed that check as out-of-stated-scope (loud parser errors, not silent corruption). The doc-vs-code drift creates false confidence. Fix: reword the validator-invocation step in both skills to describe what the script actually checks (delimiter lines + ' #' truncation + ': ' mapping confusion) and explicitly call out what it does NOT check (schema rules, reserved indicators) so the boundary is clear. Tests: 939/939 pass. release:validate clean.

)

tmchow changed the title ~~feat(ce-compound,ce-compound-refresh): add frontmatter parser-safety validator~~ feat(ce-compound): add frontmatter parser-safety validator Apr 26, 2026

chatgpt-codex-connector Bot reviewed Apr 26, 2026

View reviewed changes

Comment thread plugins/compound-engineering/skills/ce-compound/scripts/validate-frontmatter.py Outdated

Comment thread plugins/compound-engineering/skills/ce-compound/scripts/validate-frontmatter.py Outdated

chatgpt-codex-connector Bot reviewed Apr 26, 2026

View reviewed changes

Comment thread plugins/compound-engineering/skills/ce-compound/scripts/validate-frontmatter.py Outdated

tmchow added 3 commits April 25, 2026 21:21

chatgpt-codex-connector Bot reviewed Apr 26, 2026

View reviewed changes

Comment thread plugins/compound-engineering/skills/ce-compound/SKILL.md Outdated

tmchow merged commit 7eea2d1 into main Apr 26, 2026
2 checks passed

github-actions Bot mentioned this pull request Apr 26, 2026

chore: release main #684

Merged

michaelvolz pushed a commit to michaelvolz/compound-engineering-plugin-windows-version that referenced this pull request Apr 28, 2026

feat(ce-compound): add frontmatter parser-safety validator (EveryInc#697

4b93faa

)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(ce-compound): add frontmatter parser-safety validator#697

feat(ce-compound): add frontmatter parser-safety validator#697
tmchow merged 6 commits intomainfrom
tmchow/frontmatter-validator

tmchow commented Apr 26, 2026 •

edited

Loading

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

tmchow commented Apr 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What the validator catches

Design decisions

Catch radius

Test plan

References

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

tmchow commented Apr 26, 2026 •

edited

Loading