Add setup eval suite, auto-doc pipeline, and verification phase by jrenaldi79 · Pull Request #10 · jrenaldi79/harness-engineering

jrenaldi79 · 2026-03-24T21:51:19Z

Summary

- Setup skill eval suite: New setup-eval-config.json, setup-grader.js, and 2 fixtures (setup-bare, setup-existing-node) that test the /setup skill's file creation, JSON validity, hook executability, CLAUDE.md sections, and agent config structure. Extended run-evals.sh with a --config flag and per-test-case prompt overrides.
- Auto-documentation pipeline: New scripts/repo-generate-docs.js regenerates the AUTO:tree and AUTO:modules markers in CLAUDE.md on every commit via the pre-commit hook. Fixed extractJSDocDescription to prefer multi-line JSDoc over single-line (it was picking up @param lines instead of file-level descriptions). Added results to SKIP_DIRS.
- Setup skill verification phase: New Phase 6 in SKILL.md runs 6 smoke checks after installation (hook executability, enforcement scripts, CLAUDE.md sections, settings validity, auto-doc pipeline, linter) and fixes failures inline before the summary.
- Bug fixes: The pre-push hook now exits with 0 on a test cache hit. Added safety patterns to .gitignore (.env, coverage/, dist/, build/).

Test plan

- npx jest --config '{}' tests/scripts/: all 85 unit tests pass (including 7 new repo-generate-docs tests)
- bash tests/evals/run-evals.sh --dry-run: readiness evals still work with default config
- bash tests/evals/run-evals.sh --config setup-eval-config.json --dry-run: setup evals resolve correctly
- node tests/evals/setup-grader.js validates file existence, JSON validity, hooks, CLAUDE.md sections, settings structure, rule frontmatter
- Pre-commit hook auto-regenerates CLAUDE.md markers on every commit
- Pre-push hook exits early on test cache hit

https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw

Summary by CodeRabbit

Release Notes

Documentation
- Enhanced project documentation with setup guides and code quality standards.
- Added comprehensive skill reference materials and quality gate definitions.
Tests
- Expanded test suite to include setup skill evaluation scenarios.
- Added new test fixtures for existing Node.js projects.
Chores
- Added configuration for project enforcement rules and settings.
- Integrated git hooks for automated checks and documentation generation.
- Updated gitignore patterns for common build and environment artifacts.

- Replace all em dashes with commas, periods, or colons
- Remove bold from marketing-style callout sentences
- Replace banned/sloppy words (field guide → reference, surgical → specific/targeted, batteries-included → comprehensive, non-negotiable → required, going off the rails)
- Vary "mechanical enforcement" repetition (use "automated checks", "enforcement", etc.)
- Tighten formulaic structures ("This repo is two things" → "This repo contains", colon-then-parallel-clauses → two sentences)
- Reduce redundant Karpathy quote paraphrases in mapping table
- Remove unnecessary bold emphasis on imperative sentences

https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw

Made small edits

…sized test

- Create CLAUDE.md with Commands, Architecture, Quality Gates, Code Review Checklist, Critical Gotchas, and Docs Map sections
- Add .claude/settings.json with allow/deny lists for this repo's commands
- Add .claude/rules/code-quality.md and tdd.md path-scoped rules
- Split 491-line generate-docs.test.js into two files under 300 lines:
  - generate-docs.test.js (243 lines) for marker/link/plan tests
  - generate-docs-helpers.test.js (262 lines) for tree/module/jsdoc tests

Addresses gaps found by /readiness analysis: missing CLAUDE.md (Pillar 4), no agent config or rules (Pillar 5), and file size violation (Pillar 6).

https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw

This repo has no package.json at root (it's a plugin, not an npm package). The lockfile is an artifact from npx invocations and should not be tracked. https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw

- Add scripts/hooks/pre-commit: secret scanning + 300-line file size check
- Add scripts/hooks/pre-push: doc drift detection + smart SHA-based caching
- Add scripts/install-hooks.sh to wire hooks into .git/hooks/
- Wrap CLAUDE.md Architecture and Key Modules sections with AUTO markers
- Fix module paths in Key Modules table (use full paths from repo root)
- Add SessionStart hook to .claude/settings.json for drift detection
- Add gotcha about two sets of hooks (repo's own vs. templates for users)

Addresses remaining Level 4+ gaps: active git hooks (Pillar 3), AUTO markers (Pillar 4), enforcement hierarchy (Pillar 5), and session-start validation (Pillar 8).

https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw
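The 300-line file size check can be pictured roughly as follows. This is a minimal sketch, not the hook's actual code: the names `MAX_LINES`, `countLines`, and `findOversized` are hypothetical, and the files are passed in as in-memory objects rather than read from the staging area.

```javascript
// Hypothetical sketch of a 300-line limit check; all names are illustrative.
const MAX_LINES = 300;

// Count lines the way an editor would: a trailing newline does not add a line.
function countLines(text) {
  if (text === '') return 0;
  const lines = text.split('\n');
  return text.endsWith('\n') ? lines.length - 1 : lines.length;
}

// files: array of { path, content }. Returns the entries that exceed the limit.
function findOversized(files, limit = MAX_LINES) {
  return files
    .filter(({ content }) => countLines(content) > limit)
    .map(({ path, content }) => ({ path, lines: countLines(content) }));
}

// Usage: a hook would exit non-zero when the returned list is non-empty.
const violations = findOversized([
  { path: 'scripts/small.js', content: 'const a = 1;\n' },
  { path: 'scripts/big.js', content: 'x;\n'.repeat(301) },
]);
console.log(violations);
```

A real hook would need to read the staged version of each file (not the working copy) so that the check matches what is actually committed.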

validate-docs.js is designed for target projects (expects src/ directory and "Directory Structure" section). Replace with file size re-check across this repo's actual JS directories. Fix SessionStart hook to check repo state and hook installation instead of running target-project scripts. https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw

Distinguishes scripts/ (internal repo hooks) from skills/setup/scripts/ (templates shipped to user projects by /setup). https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw

Extends the existing eval infrastructure to cover the /setup skill:

- setup-grader.js: validates file creation, JSON validity, hook executability, CLAUDE.md sections, settings structure, rule frontmatter
- setup-eval-config.json: 2 test cases (greenfield + existing project)
- Fixtures: setup-bare (empty dir) and setup-existing-node (Express app)
- run-evals.sh: --config flag, per-test-case prompts, setup artifact capture

https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw
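To make the grader's categories of checks concrete, here is a rough sketch of how such checks might be accumulated into a score. It is not setup-grader.js itself: the `gradeSetup` function, the in-memory `artifacts` shape, the `## ` heading convention, and the `.git/hooks/pre-commit` path are all assumptions made for illustration.

```javascript
// Hypothetical sketch of a setup grader; it does not mirror setup-grader.js.
// artifacts: map of relative path -> { content: string, mode: number }.
function gradeSetup(artifacts, requiredSections) {
  const checks = [];
  const add = (name, pass) => checks.push({ name, pass });

  // File existence and required sections (assumes "## Name" headings).
  const claudeMd = artifacts['CLAUDE.md'];
  add('CLAUDE.md exists', claudeMd !== undefined);
  for (const section of requiredSections) {
    const present =
      claudeMd !== undefined && claudeMd.content.includes(`## ${section}`);
    add(`CLAUDE.md has "${section}" section`, present);
  }

  // JSON validity: parse and treat any failure as a failed check.
  const settings = artifacts['.claude/settings.json'];
  let parsed = null;
  try {
    parsed = settings ? JSON.parse(settings.content) : null;
  } catch {
    parsed = null;
  }
  add('settings.json is valid JSON', parsed !== null);

  // Hook executability: any execute bit set (assumed install location).
  const hook = artifacts['.git/hooks/pre-commit'];
  add(
    'pre-commit hook is executable',
    hook !== undefined && (hook.mode & 0o111) !== 0
  );

  const passed = checks.filter((c) => c.pass).length;
  return { passed, total: checks.length, checks };
}
```

Recording every check, rather than stopping at the first failure, lets an eval report a partial score and list exactly which artifacts were missing or malformed.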

Tree and modules sections now reflect the setup eval infrastructure: setup-grader.js, setup-eval-config.json, run-evals.sh, and fixtures. https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw

@param

The AUTO:tree and AUTO:modules markers were hand-maintained because generate-docs.js targets user projects, not this repo. Added:

- scripts/repo-generate-docs.js: repo-specific wrapper that scans skills/, scripts/, tests/ and builds tree + module index
- Pre-commit hook now runs repo-generate-docs.js to auto-regenerate
- Fixed extractJSDocDescription to try multi-line before single-line (was picking up @param lines instead of file-level descriptions)
- Added 'results' to SKIP_DIRS so eval output dirs are excluded

https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw
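The ordering fix for JSDoc extraction can be illustrated with a small sketch. This is not the repo's extractJSDocDescription: the function name `extractDescription` and both regular expressions are assumptions, chosen only to show why trying the multi-line form first avoids returning a tag line such as `@param`.

```javascript
// Hypothetical sketch; the real extractJSDocDescription may differ.
function extractDescription(source) {
  // 1. Try a multi-line block first: "/**" alone on a line, then " * text" lines.
  const multi = source.match(/\/\*\*\s*\n([\s\S]*?)\*\//);
  if (multi) {
    const lines = multi[1]
      .split('\n')
      .map((line) => line.replace(/^\s*\*\s?/, '').trim());
    // The description is the first non-empty line that is not a tag.
    const desc = lines.find((line) => line !== '' && !line.startsWith('@'));
    if (desc) return desc;
  }
  // 2. Fall back to a single-line block such as "/** Short summary. */".
  const single = source.match(/\/\*\*[ \t]*([^\n]*?)[ \t]*\*\//);
  if (single && !single[1].startsWith('@')) return single[1];
  return '';
}

// Usage: the file-level description is returned, not the @param tag.
const example = [
  '/**',
  ' * Regenerates AUTO markers in CLAUDE.md.',
  ' */',
  '',
  '/** @param {string} dir */',
  'function scan(dir) {}',
].join('\n');
console.log(extractDescription(example));
```

Rejecting tag-only matches in the single-line fallback is a second guard: even when no multi-line block exists, a lone `/** @param ... */` comment is not treated as a description.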

The test cache check printed "skipping" but never exited, so tests ran anyway. Added exit 0 after the cache hit message. Added .env, .env.local, coverage/, dist/, build/ to .gitignore to match the patterns /setup installs for user projects. https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw
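The cache bug described above is a missing early exit. The real hook is a shell script; the sketch below restates the control flow in JavaScript purely for illustration. The `prePush` function, its parameters, and the idea of comparing a cached SHA against the current one are assumptions, since the source does not say exactly what the cache key covers.

```javascript
// Hypothetical restatement of the pre-push cache logic; the real hook is shell.
// Returns the exit code the hook would use.
function prePush({ currentSha, cachedSha, runTests, log }) {
  if (cachedSha !== null && cachedSha === currentSha) {
    log(`Tests already passed for ${currentSha}, skipping`);
    // Without this early return (the added `exit 0`), execution would
    // fall through and run the tests despite the "skipping" message.
    return 0;
  }
  return runTests() ? 0 : 1;
}

// Usage: on a cache hit the test runner is never invoked.
let ran = false;
const code = prePush({
  currentSha: 'abc123',
  cachedSha: 'abc123',
  runTests: () => {
    ran = true;
    return true;
  },
  log: console.log,
});
console.log({ code, ran });
```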

The skill now runs 6 smoke checks after installation before presenting the summary: hook executability, enforcement scripts, CLAUDE.md sections, agent config validity, auto-documentation pipeline, and linter. Failures are fixed inline before proceeding. Summary renumbered to Phase 7 and now includes verification results. https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw

coderabbitai · 2026-03-24T21:51:31Z

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3093e200-4355-4332-9950-0508ee36ef33

📥 Commits

Reviewing files that changed from the base of the PR and between d241f82 and b962bfa.

📒 Files selected for processing (25)

.claude/rules/code-quality.md
.claude/rules/tdd.md
.claude/settings.json
.gitignore
CLAUDE.md
README.md
scripts/README.md
scripts/hooks/pre-commit
scripts/hooks/pre-push
scripts/install-hooks.sh
scripts/repo-generate-docs.js
skills/setup/SKILL.md
skills/setup/scripts/lib/generate-docs-helpers.js
tests/evals/README.md
tests/evals/fixtures/setup-bare/.gitkeep
tests/evals/fixtures/setup-existing-node/package.json
tests/evals/fixtures/setup-existing-node/src/index.ts
tests/evals/fixtures/setup-existing-node/src/utils.ts
tests/evals/fixtures/setup-existing-node/tsconfig.json
tests/evals/run-evals.sh
tests/evals/setup-eval-config.json
tests/evals/setup-grader.js
tests/scripts/generate-docs-helpers.test.js
tests/scripts/generate-docs.test.js
tests/scripts/repo-generate-docs.test.js

📝 Walkthrough

Walkthrough

The PR establishes comprehensive enforcement and evaluation infrastructure for the project. It introduces code quality and TDD rules, adds git hooks for secret scanning and file-size validation, creates a documentation generation system, extends the evaluation framework to support the /setup skill with test fixtures and graders, and provides Claude-specific operational guidance for development workflow.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Code Quality & TDD Rules<br/>`.claude/rules/code-quality.md`, `.claude/rules/tdd.md` | Define mandatory code constraints: 300-line file limit, 50-line function guideline, and TDD workflow with pre-implementation testing requirements. Specify refactoring red flags and enforcement script isolation rules. |
| Claude Configuration<br/>`.claude/settings.json` | Introduce session startup hook to display git status and hook installation state; establish command allowlist for permitted dev/test operations and denylist for destructive/high-risk commands. |
| Documentation & Guidance<br/>`CLAUDE.md`, `scripts/README.md`, `tests/evals/README.md` | Add comprehensive Claude operational playbook with quality gates, pre-merge checklist, and critical gotchas; document internal maintenance scripts and expand eval documentation to cover both `/readiness` and `/setup` skills with separate configuration and fixture descriptions. |
| README & Config Updates<br/>`README.md`, `.gitignore` | Revise project description from "field guide" to "reference guide" with more measured phrasing throughout; expand gitignore to exclude `package-lock.json`, environment files, and build/coverage artifacts. |
| Git Hooks & Installation<br/>`scripts/hooks/pre-commit`, `scripts/hooks/pre-push`, `scripts/install-hooks.sh` | Add pre-commit hook for secret scanning, file-size validation, and doc regeneration; add pre-push hook for file-size re-check and SHA-based smart test caching; provide hook installation script. |
| Documentation Generation<br/>`scripts/repo-generate-docs.js` | Create Node.js script to auto-regenerate `AUTO:tree` and `AUTO:modules` markers in `CLAUDE.md` by scanning repository structure and module metadata; support `--check` mode for CI validation. |
| Documentation Helper Updates<br/>`skills/setup/scripts/lib/generate-docs-helpers.js` | Enhance JSDoc parsing to prioritize multi-line blocks over single-line comments; add `results` directory to skip list during tree/module scanning; refine comment extraction logic. |
| Test Evaluation Framework<br/>`tests/evals/run-evals.sh` | Extend evaluation runner with `--config` CLI support for multi-skill evaluation; add `/setup` artifact capture (CLAUDE.md, settings, rules files); implement skill-specific grader resolution and prompt selection; narrow marketplace install test to readiness skill only. |
| Setup Skill Evaluation<br/>`tests/evals/setup-eval-config.json`, `tests/evals/setup-grader.js` | Define setup skill test cases (`setup-bare`, `setup-existing-node`) with comprehensive output validation: file presence/executability, JSON validity, required sections, enforcement configuration, and conversation term checks. Implement setup-specific grader script. |
| Setup Skill Verification<br/>`skills/setup/SKILL.md` | Add explicit Phase 6 "Verify Setup" requiring six smoke-test checks (hooks, enforcement scripts, CLAUDE.md sections, settings/rules validity) with re-run requirement until all pass; update Phase 7 summary and next steps. |
| Setup Test Fixture<br/>`tests/evals/fixtures/setup-existing-node/*` | Add Express/TypeScript project fixture: `package.json` (with build/dev/start scripts), `src/index.ts` (health/greeting routes), `src/utils.ts` (helper functions), and `tsconfig.json` for TypeScript configuration. |
| Test Suites<br/>`tests/scripts/generate-docs-helpers.test.js`, `tests/scripts/repo-generate-docs.test.js` | Add comprehensive test suites for documentation generation helpers (JSDoc parsing, export extraction, tree/module building) and repo-level doc generation (marker updates, `--check` validation). Refactor existing `tests/scripts/generate-docs.test.js` to remove duplicated test coverage now in helpers suite. |
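
The marker regeneration and `--check` behavior summarized in the table can be sketched as a pure string transformation. This is an illustration under stated assumptions, not the code in repo-generate-docs.js: the HTML-comment marker syntax (`<!-- AUTO:tree -->` and `<!-- /AUTO:tree -->`) and the function names `replaceMarker` and `isStale` are hypothetical.

```javascript
// Hypothetical sketch; the marker syntax below is an assumption.
function replaceMarker(doc, name, body) {
  const open = `<!-- AUTO:${name} -->`;
  const close = `<!-- /AUTO:${name} -->`;
  const start = doc.indexOf(open);
  const end = doc.indexOf(close);
  // If either marker is missing or out of order, leave the document untouched.
  if (start === -1 || end === -1 || end < start) return doc;
  return doc.slice(0, start + open.length) + '\n' + body + '\n' + doc.slice(end);
}

// --check style: regenerate in memory and report whether anything would change.
function isStale(doc, sections) {
  let next = doc;
  for (const [name, body] of Object.entries(sections)) {
    next = replaceMarker(next, name, body);
  }
  return next !== doc;
}

// Usage: a CI job would fail when isStale(...) returns true.
const claudeMd = '# Project\n<!-- AUTO:tree -->\nold tree\n<!-- /AUTO:tree -->\n';
console.log(replaceMarker(claudeMd, 'tree', 'new tree'));
console.log(isStale(claudeMd, { tree: 'old tree' }));
```

Keeping generation separate from writing means the same function serves both modes: the pre-commit path writes the result back, while a check-only path just compares.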

Sequence Diagram(s)

sequenceDiagram
    participant Developer as Developer<br/>(git commit)
    participant PreCommit as Pre-commit Hook<br/>(scripts/hooks/pre-commit)
    participant SecretChk as Secret Checker<br/>(check-secrets.js)
    participant SizeChk as Size Checker<br/>(line count validation)
    participant DocGen as Doc Generator<br/>(repo-generate-docs.js)
    participant Git as Git<br/>(staging area)

    Developer->>PreCommit: Trigger on staged files
    PreCommit->>SecretChk: Run secret scanning
    SecretChk-->>PreCommit: Pass/Fail
    alt secrets detected
        PreCommit-->>Developer: Abort commit
    end
    PreCommit->>SizeChk: Check staged JS files<br/>for > 300 lines
    SizeChk-->>PreCommit: Violations list
    alt violations found
        PreCommit-->>Developer: Abort commit
    end
    PreCommit->>DocGen: Regenerate markers<br/>(AUTO:tree, AUTO:modules)
    DocGen->>Git: Update CLAUDE.md
    DocGen-->>PreCommit: Success
    PreCommit-->>Developer: Allow commit

sequenceDiagram
    participant User as Claude Agent<br/>(/setup skill)
    participant Skill as Setup Skill<br/>(SKILL.md flow)
    participant Generator as Generate Docs<br/>(repo-generate-docs.js)
    participant Hooks as Install Hooks<br/>(install-hooks.sh)
    participant Grader as Setup Grader<br/>(setup-grader.js)
    participant Validator as Validation Engine<br/>(checks: files, JSON, sections)

    User->>Skill: Request project setup
    Skill->>Generator: Auto-generate documentation
    Generator-->>Skill: Update CLAUDE.md markers
    Skill->>Hooks: Install git hooks
    Hooks-->>Skill: Hooks installed & executable
    Skill->>Skill: Phase 6: Verify Setup<br/>(6 smoke tests)
    alt verification fails
        Skill-->>User: Fix issues & re-run
    end
    Skill->>Grader: Provide artifact directory
    Grader->>Validator: Run file/JSON/section checks
    Validator-->>Grader: Accumulate check results
    Grader-->>User: Return pass/fail score<br/>& details

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

- Add path-scoped rules, mechanical enforcement, and React no-useEffect rule #2: Adds .claude/rules/ files and git-hook enforcement mechanisms with identical file size and TDD rule definitions.
- feat: add /readiness skill with report card and mobile-friendly image #3: Extends enforcement and evaluation tooling including overlapping rule files, hook infrastructure, and evaluation framework modifications.
- feat: add defensive settings.json with allow/deny permission lists #1: Introduces .claude/settings.json with command allowlist and denylist permissions configuration.

Poem

🐰 Hooks and rules now guard the way,
Docs regenerate every day,
Tests verify what you create,
Setup runs smoothly—oh, how great!
Quality gates keep chaos at bay,
Enforcement magic—hip-hip-hooray! 🎉


claude and others added 12 commits March 24, 2026 19:25

Update README.md

52e229d

Made small edits

Add package-lock.json to .gitignore

b806ed3

This repo has no package.json at root (it's a plugin, not an npm package). The lockfile is an artifact from npx invocations and should not be tracked. https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw

Add README to scripts/ clarifying it's for repo maintenance only

59d7f2b

Distinguishes scripts/ (internal repo hooks) from skills/setup/scripts/ (templates shipped to user projects by /setup). https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw

Update CLAUDE.md auto-generated sections with new eval files

6a0cfdb

Tree and modules sections now reflect the setup eval infrastructure: setup-grader.js, setup-eval-config.json, run-evals.sh, and fixtures. https://claude.ai/code/session_01Hbxy31TkbujzukGFSxLcPw

jrenaldi79 merged commit f697012 into main Mar 24, 2026
1 of 2 checks passed

jrenaldi79 merged 12 commits into main from claude/cleanup-readme-slop-gRGnp
