Skip to content

test+ci: enforce hook+workflow invariants instead of brittle counts#9

Merged
wtthornton merged 3 commits into
mainfrom
fix-ci-failures-postcount
May 4, 2026
Merged

test+ci: enforce hook+workflow invariants instead of brittle counts#9
wtthornton merged 3 commits into
mainfrom
fix-ci-failures-postcount

Conversation

@wtthornton
Copy link
Copy Markdown
Owner

Summary

Greens the Test Suite by addressing the 5 failures PR #8 exposed (the count mismatch was masking them).

Workflow gap (one real fix):

Test invariants (three brittle assertions rewritten):

Test Before After
HOOKS-2: hook scripts reference .ralph/hooks/ rejected .claude/hooks/ entries accepts EITHER .ralph/hooks/ OR .claude/hooks/; still catches garbage paths
all hook commands start with 'bash ' rejected bare .sh paths accepts bare .claude/hooks/*.sh (tapps-mcp convention) AND bash <path> (Ralph convention); still catches tool names like Write/Edit landing in command fields
PreToolUse has exactly two entries locked count to 2 verifies Ralph's Bash hook → validate-command.sh AND Edit|Write hook → protect-ralph-files.sh are present and wired correctly; allows additional plugin-injected hooks

The original tests assumed Ralph was the only source of .claude/settings.json hooks — that broke as soon as tapps-mcp / linear-MCP started registering their own under .claude/hooks/. The rewrite encodes what actually matters: Ralph's safety hooks must remain present and correctly wired; everything beyond that is allowed.

Test plan

  • Local: npm run test:unit → 1455/1455 passing, no warnings
  • Local: targeted run on test_settings_json.bats + test_hooks.bats + test_ci_workflows.bats → 93/93 passing
  • CI Test Suite green

🤖 Generated with Claude Code

wtthornton and others added 3 commits May 3, 2026 11:15
Five tests were exposed once the bats count mismatch was fixed (PR #8).
Three of the five were brittle assertions that locked out legitimate
plugin ecosystems; two were a real workflow gap. Long-term right fix
is to encode the actual invariants and close the gap, not silence
the tests.

Workflow gap fixed:
- .github/workflows/codeql-analysis.yml now pins `defaults.run.shell:
  bash` per the TAP-667 standard. Previously the only hand-authored
  workflow without this. Two tests (#3, #6) were really one root cause.

Test invariants tightened (instead of "exactly N" counts):
- HOOKS-2: `bash <path>` commands must reference EITHER .ralph/hooks/
  OR .claude/hooks/. The original test rejected .claude/hooks/ entries
  and broke as soon as tapps-mcp registered hooks there.
- "all hook commands start with bash": now also accepts the bare
  .claude/hooks/<name>.sh form that tapps-mcp emits when registering
  Linear MCP governance hooks. Catches garbage paths / tool names
  (Write/Edit) without policing plugin command-emission style.
- "PreToolUse has exactly two entries": rewritten to verify Ralph's
  two safety hooks (Bash → validate-command.sh, Edit|Write →
  protect-ralph-files.sh) are present AND wired to the right scripts.
  Plugin-injected entries are allowed; what's protected is removal
  or rewiring of Ralph's own defenses.

Local: 1455/1455 unit tests passing, no warnings.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nvariant

Two integration tests asserted that setup.sh-generated .ralphrc shipped
with `ALLOWED_TOOLS="..."` containing Bash(npm *)/Bash(pytest), but
ADR-0006 deleted the legacy `-p` mode and the ALLOWED_TOOLS allowlist
along with it — tool surface now lives in .claude/agents/ralph.md
(`tools:` allowlist + `disallowedTools:` blocklist).

Replaced both with a single negative assertion: if a future change
re-introduces `ALLOWED_TOOLS=` to .ralphrc, this test fires so we
don't silently split the tool-surface contract across two files
again. The positive invariant (tool surface defined in agent file)
is already covered by HOOKS-6 in tests/unit/test_hooks.bats.

Local: full integration suite (203 tests) passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ocked

The eval test "FILE PROTECTION: blocks edit to .ralphrc" asserted the
hook returns exit 2 when no .ralphrc exists in the test fixture dir.
But the hook's actual contract (HOOKS-5 in tests/unit/test_hooks.bats)
is: ALLOW creating a new .ralphrc when absent, BLOCK editing once it
exists. The test was asserting the wrong half of the contract — it
went red the moment the eval suite started running end-to-end (post
PR #8 / #9 fixes that unmasked the eval step).

Split into two tests that match the real invariant:
  1. Edit on EXISTING .ralphrc → blocked (touch then test)
  2. Create on ABSENT .ralphrc → allowed (HOOKS-5 already covers this
     for the hook script directly; this is the eval-level mirror)

Local: 69/69 deterministic evals passing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@wtthornton wtthornton merged commit 5a88b02 into main May 4, 2026
5 checks passed
@wtthornton wtthornton deleted the fix-ci-failures-postcount branch May 4, 2026 19:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant