test+ci: enforce hook+workflow invariants instead of brittle counts by wtthornton · Pull Request #9 · wtthornton/ralph-claude-code

wtthornton · 2026-05-03T18:15:26Z

Summary

Greens the Test Suite by addressing the 5 failures PR #8 exposed (the count mismatch was masking them).

Workflow gap (one real fix):

.github/workflows/codeql-analysis.yml was the only hand-authored workflow without defaults.run.shell: bash. Pinned it per TAP-667. Closes failures fix(sandbox): TAP-668 — HEALTHCHECK readability + HOME env for ralph user #3 + fix(monitor): TAP-1201 — mid-loop visibility + accurate liveness detection #6 (same root cause).

Test invariants (three brittle assertions rewritten):

Test	Before	After
`HOOKS-2: hook scripts reference .ralph/hooks/`	rejected `.claude/hooks/` entries	accepts EITHER `.ralph/hooks/` OR `.claude/hooks/`; still catches garbage paths
`all hook commands start with 'bash '`	rejected bare `.sh` paths	accepts bare `.claude/hooks/*.sh` (tapps-mcp convention) AND `bash <path>` (Ralph convention); still catches tool names like `Write`/`Edit` landing in command fields
`PreToolUse has exactly two entries`	locked count to 2	verifies Ralph's Bash hook → validate-command.sh AND Edit\|Write hook → protect-ralph-files.sh are present and wired correctly; allows additional plugin-injected hooks

The original tests assumed Ralph was the only source of .claude/settings.json hooks — that broke as soon as tapps-mcp / linear-MCP started registering their own under .claude/hooks/. The rewrite encodes what actually matters: Ralph's safety hooks must remain present and correctly wired; everything beyond that is allowed.

Test plan

Local: npm run test:unit → 1455/1455 passing, no warnings
Local: targeted run on test_settings_json.bats + test_hooks.bats + test_ci_workflows.bats → 93/93 passing
CI Test Suite green

🤖 Generated with Claude Code

Five tests were exposed once the bats count mismatch was fixed (PR #8). Three of the five were brittle assertions that locked out legitimate plugin ecosystems; two were a real workflow gap. Long-term right fix is to encode the actual invariants and close the gap, not silence the tests. Workflow gap fixed: - .github/workflows/codeql-analysis.yml now pins `defaults.run.shell: bash` per the TAP-667 standard. Previously the only hand-authored workflow without this. Two tests (#3, #6) were really one root cause. Test invariants tightened (instead of "exactly N" counts): - HOOKS-2: `bash <path>` commands must reference EITHER .ralph/hooks/ OR .claude/hooks/. The original test rejected .claude/hooks/ entries and broke as soon as tapps-mcp registered hooks there. - "all hook commands start with bash": now also accepts the bare .claude/hooks/<name>.sh form that tapps-mcp emits when registering Linear MCP governance hooks. Catches garbage paths / tool names (Write/Edit) without policing plugin command-emission style. - "PreToolUse has exactly two entries": rewritten to verify Ralph's two safety hooks (Bash → validate-command.sh, Edit|Write → protect-ralph-files.sh) are present AND wired to the right scripts. Plugin-injected entries are allowed; what's protected is removal or rewiring of Ralph's own defenses. Local: 1455/1455 unit tests passing, no warnings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…nvariant Two integration tests asserted that setup.sh-generated .ralphrc shipped with `ALLOWED_TOOLS="..."` containing Bash(npm *)/Bash(pytest), but ADR-0006 deleted the legacy `-p` mode and the ALLOWED_TOOLS allowlist along with it — tool surface now lives in .claude/agents/ralph.md (`tools:` allowlist + `disallowedTools:` blocklist). Replaced both with a single negative assertion: if a future change re-introduces `ALLOWED_TOOLS=` to .ralphrc, this test fires so we don't silently split the tool-surface contract across two files again. The positive invariant (tool surface defined in agent file) is already covered by HOOKS-6 in tests/unit/test_hooks.bats. Local: full integration suite (203 tests) passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ocked The eval test "FILE PROTECTION: blocks edit to .ralphrc" asserted the hook returns exit 2 when no .ralphrc exists in the test fixture dir. But the hook's actual contract (HOOKS-5 in tests/unit/test_hooks.bats) is: ALLOW creating a new .ralphrc when absent, BLOCK editing once it exists. The test was asserting the wrong half of the contract — it went red the moment the eval suite started running end-to-end (post PR #8 / #9 fixes that unmasked the eval step). Split into two tests that match the real invariant: 1. Edit on EXISTING .ralphrc → blocked (touch then test) 2. Create on ABSENT .ralphrc → allowed (HOOKS-5 already covers this for the hook script directly; this is the eval-level mirror) Local: 69/69 deterministic evals passing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

wtthornton and others added 3 commits May 3, 2026 11:15

wtthornton merged commit 5a88b02 into main May 4, 2026
5 checks passed

wtthornton deleted the fix-ci-failures-postcount branch May 4, 2026 19:54

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test+ci: enforce hook+workflow invariants instead of brittle counts#9

test+ci: enforce hook+workflow invariants instead of brittle counts#9
wtthornton merged 3 commits into
mainfrom
fix-ci-failures-postcount

wtthornton commented May 3, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

wtthornton commented May 3, 2026

Summary

Test plan

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant