Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 7 additions & 1 deletion .claude/CLAUDE.md
Original file line number Diff line number Diff line change
Expand Up @@ -56,9 +56,11 @@ for the full developer guide.
| `pre-bash-block-no-verify.sh` | PreToolUse | Bash | Block `--no-verify` in git commands |
| `pre-bash-git-push-reminder.sh` | PreToolUse | Bash | Warn to run `just review` before push |
| `pre-bash-commit-quality.sh` | PreToolUse | Bash | Scan staged `.py` files for secrets/debug markers |
| `pre-bash-coverage-gate.sh` | PreToolUse | Bash | Warn before `git commit` if coverage below threshold |
| `pre-config-protection.sh` | PreToolUse | Write\|Edit\|MultiEdit | Block weakening ruff/pyright config edits |
| `pre-protect-uv-lock.sh` | PreToolUse | Write\|Edit | Block direct edits to `uv.lock` |
| `pre-write-src-test-reminder.sh` | PreToolUse | Write\|Edit | Warn if test file missing for new source module |
| `pre-write-src-require-test.sh` | PreToolUse | Write\|Edit | Block if the matching test file does not exist (strict TDD) |
| `pre-write-src-test-reminder.sh` | PreToolUse | Write\|Edit | Warn if test file missing for new source module (optional alternative to strict TDD) |
| `pre-write-doc-file-warning.sh` | PreToolUse | Write | Block `.md` files written outside `docs/` |
| `pre-write-jinja-syntax.sh` | PreToolUse | Write | Validate Jinja2 syntax before writing `.jinja` files |
| `pre-suggest-compact.sh` | PreToolUse | Edit\|Write | Suggest `/compact` every ~50 operations |
Expand All @@ -68,6 +70,7 @@ for the full developer guide.
| `post-edit-markdown.sh` | PostToolUse | Edit | Warn if existing `.md` edited outside `docs/` |
| `post-edit-copier-migration.sh` | PostToolUse | Edit\|Write | Migration checklist after `copier.yml` edits |
| `post-edit-template-mirror.sh` | PostToolUse | Edit\|Write | Remind to mirror `template/.claude/` ↔ root |
| `post-edit-refactor-test-guard.sh` | PostToolUse | Edit\|Write | Remind to run tests after several `src/` or `scripts/` edits |
| `post-bash-pr-created.sh` | PostToolUse | Bash | Log PR URL after `gh pr create` |
| `stop-session-end.sh` | Stop | * | Persist session state JSON |
| `stop-evaluate-session.sh` | Stop | * | Extract reusable patterns from transcript |
Expand Down Expand Up @@ -106,6 +109,9 @@ One Markdown file per slash command. Claude Code reads these files when you invo
| `/ci` | Run `just ci` and report results |
| `/test` | Run `just test` and summarise failures |
| `/dependency-check` | Validate `uv.lock` is committed, in sync, and not stale |
| `/tdd-red` | Validate RED phase: confirm a test fails for the right reason |
| `/tdd-green` | Validate GREEN phase: confirm the test passes with no regressions |
| `/ci-fix` | Autonomous CI fixer: diagnose failures, apply fixes, re-run until green |

## rules/

Expand Down
88 changes: 88 additions & 0 deletions .claude/commands/ci-fix.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,88 @@
Diagnose and fix all CI failures in this project. You are an autonomous CI-fixing agent.

## Step 1 — Run CI and capture output

```bash
just ci 2>&1
```

If CI passes (exit 0), report success and stop.

## Step 2 — Classify failures

Parse the output and classify each failure into one of these categories:

| Category | Tool | Typical pattern | Fix strategy |
|---|---|---|---|
| **Format** | ruff format | "would reformat" | Run `just fmt` — fully automatic |
| **Lint** | ruff check | `E`, `F`, `I`, `UP`, `B`, `SIM`, `C4`, `RUF`, `PERF`, `T20` | Run `just fix` for auto-fixable; manual fix for others |
| **Docstring** | ruff `D` rules | `D1xx`, `D2xx`, `D3xx`, `D4xx` | Add or fix Google-style docstrings |
| **Type** | basedpyright | "error: Type X cannot be assigned to Y" | Fix type annotations, add casts, narrow types |
| **Test** | pytest | "FAILED tests/..." | Fix test or implementation logic |
| **Coverage** | pytest-cov | "FAIL Required test coverage" | Add missing tests |
| **Import** | ruff `I` rules | `I001` | Run `just fix` — auto-fixable |

## Step 3 — Fix in priority order

Fix failures in this order (cheapest and most impactful first):

1. **Format + Import** — `just fix && just fmt` (fully automatic)
2. **Lint** — `just fix` for auto-fixable, then manual fixes
3. **Docstrings** — add missing Google-style docstrings. Read the `python-docstrings`
skill (`skills/python-docstrings/SKILL.md`) for guidance if available.
4. **Type errors** — fix annotations. Read the `python-code-quality` skill
(`skills/python-code-quality/SKILL.md`) for basedpyright guidance if available.
5. **Test failures** — read the failing test, understand what it asserts, fix the
implementation or the test. Read the `pytest` skill (`skills/pytest/SKILL.md`)
for patterns if available.
6. **Coverage** — identify uncovered lines with `just coverage`, write tests.

## Step 4 — Iterate

After fixing each category:

1. Re-run `just ci 2>&1`
2. If new failures appear, classify and fix them
3. Repeat until CI passes (exit 0)

**Hard limit:** Do not attempt more than 5 full CI iterations. If still failing after 5
rounds, report the remaining failures and ask the user for guidance.

## Step 5 — Report

When CI passes (or after hitting the iteration limit), produce a summary:

```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
CI Fix Report
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Status: ✅ ALL GREEN (or ❌ FAILURES REMAIN)
Iterations: N
Duration: ~Xm

Fixes applied:
- [Format] Reformatted 3 files
- [Lint] Fixed 5 ruff violations (2 auto, 3 manual)
- [Docstring] Added docstrings to 2 functions
- [Type] Fixed 1 type annotation in core.py
- [Test] Fixed assertion in test_core.py

Files changed:
- src/mypackage/core.py
- tests/test_core.py

Remaining issues: (if any)
- <description>
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```

## Rules

- Fix the root cause, not the symptom. A `# type: ignore` is not a fix.
- Do not weaken linter or type checker configuration.
- Do not delete or skip tests to make CI pass.
- Do not use `--no-verify` on any git command.
- Every fix must preserve existing test behaviour — run tests after each change.
- If a fix requires architectural changes beyond the scope of the failing code,
stop and ask the user instead of making sweeping changes.
48 changes: 48 additions & 0 deletions .claude/commands/tdd-green.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,48 @@
Validate the GREEN phase of a TDD cycle: confirm that the previously-failing test now PASSES and no regressions were introduced.

Run the full test suite with `just test`.

Then analyse the output:

1. **Identify the target test** — the test that was RED in the previous phase.

2. **Check three conditions:**
- The target test now PASSES
- ALL other tests still pass (no regressions)
- No new warnings or errors were introduced

3. **Report the result** in this format:

```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TDD GREEN Validation
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Target test: <test name>
Status: PASS ✅ (or FAIL ❌)

Full suite: X passed, Y failed, Z skipped
Regressions: None (or: <list of newly failing tests>)

Verdict: GREEN CONFIRMED — ready for REFACTOR
(or: NOT GREEN — <what to fix>)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```

4. If GREEN is confirmed, also run a quick coverage check:
```bash
uv run pytest --cov=src --cov-report=term-missing -q
```
Report coverage on the module that was changed. Flag any uncovered lines.

5. If GREEN is confirmed, remind the user: "Now review and refactor. Tests must stay green throughout. Run `just test` after every change."

6. If GREEN is not confirmed:
- If the target test still fails: the implementation is incomplete. Show the error.
- If other tests regressed: the implementation broke existing behaviour. List the regressions.
- Advise fixing without over-engineering — GREEN means minimal, not perfect.

7. Reset the refactor edit counter if it exists:
```bash
echo "0" > .claude/.refactor-edit-count 2>/dev/null || true
```
41 changes: 41 additions & 0 deletions .claude/commands/tdd-red.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
Validate the RED phase of a TDD cycle: confirm that a specific test FAILS for the right reason.

Run the test suite with `just test` (or the specific test file if the user provides one).

Then analyse the output:

1. **Identify the new/target test** — the user should tell you which test they expect to fail, or you should identify the most recently written test.

2. **Classify the failure** using this table:

| Failure type | Meaning | Verdict |
|---|---|---|
| `AssertionError` | Function exists, wrong behaviour | ✅ Ideal RED — proceed to GREEN |
| `AttributeError` / `ImportError` | Module or function doesn't exist yet | ✅ Good RED — proceed to GREEN |
| `NameError` | Name not defined | ✅ Good RED — proceed to GREEN |
| `SyntaxError` / `IndentationError` | Test itself is broken | ❌ Fix the test first |
| `TypeError` (wrong signature) | Signature mismatch in test | ❌ Fix the test first |
| Test passes unexpectedly | Implementation already exists or test is wrong | ❌ Review the test |
| Unrelated test fails | Regression in existing code | ❌ Investigate before continuing |

3. **Report the result** in this format:

```
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
TDD RED Validation
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Test: <test name>
File: <test file path>
Status: FAIL ✅ (or PASS ❌ or ERROR ❌)
Type: <failure type from table>
Message: <first line of error message>

Verdict: RED CONFIRMED — ready for GREEN
(or: FIX NEEDED — <what to fix>)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
```

4. If RED is confirmed, remind the user: "Now write the minimal implementation to make this test pass. No more, no less."

5. If RED is not confirmed, explain exactly what needs fixing before the TDD cycle can continue.
5 changes: 4 additions & 1 deletion .claude/hooks/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -270,9 +270,11 @@ exit 0
| `pre-bash-block-no-verify.sh` | PreToolUse | Bash | Block `git --no-verify` |
| `pre-bash-git-push-reminder.sh` | PreToolUse | Bash | Warn to run `just review` before push |
| `pre-bash-commit-quality.sh` | PreToolUse | Bash | Secret/debug scan before `git commit` |
| `pre-bash-coverage-gate.sh` | PreToolUse | Bash | Warn before `git commit` if coverage below threshold |
| `pre-config-protection.sh` | PreToolUse | Write\|Edit\|MultiEdit | Block weakening ruff/pyright config edits |
| `pre-protect-uv-lock.sh` | PreToolUse | Write\|Edit | Block direct edits to `uv.lock` |
| `pre-write-src-test-reminder.sh` | PreToolUse | Write\|Edit | Warn if `tests/<pkg>/test_<module>.py` missing for top-level `src/<pkg>/<module>.py` |
| `pre-write-src-require-test.sh` | PreToolUse | Write\|Edit | Block if `tests/<pkg>/test_<module>.py` missing for top-level `src/<pkg>/<module>.py` (strict TDD) |
| `pre-write-src-test-reminder.sh` | (optional) | Write\|Edit | Non-blocking alternative to `pre-write-src-require-test.sh` — **do not register both** |
| `pre-write-doc-file-warning.sh` | PreToolUse | Write | Block `.md` files outside `docs/` |
| `pre-write-jinja-syntax.sh` | PreToolUse | Write | Validate Jinja2 syntax before writing |
| `pre-suggest-compact.sh` | PreToolUse | Edit\|Write | Suggest `/compact` every 50 operations |
Expand All @@ -283,6 +285,7 @@ exit 0
| `post-edit-copier-migration.sh` | PostToolUse | Edit\|Write | Migration checklist after `copier.yml` edits |
| `post-edit-template-mirror.sh` | PostToolUse | Edit\|Write | Remind to mirror `template/.claude/` ↔ root |
| `post-bash-pr-created.sh` | PostToolUse | Bash | Log PR URL and suggest review commands |
| `post-edit-refactor-test-guard.sh` | PostToolUse | Edit\|Write | Remind to run tests after several `src/` or `scripts/` edits |
| `stop-session-end.sh` | Stop | * | Persist session state JSON |
| `stop-evaluate-session.sh` | Stop | * | Extract reusable patterns from transcript |
| `stop-cost-tracker.sh` | Stop | * | Track and accumulate session costs |
Expand Down
63 changes: 63 additions & 0 deletions .claude/hooks/post-edit-refactor-test-guard.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,63 @@
#!/usr/bin/env bash
# Claude PostToolUse hook — Edit|Write
# Refactor safety net: after editing a src/**/*.py or scripts/**/*.py file, remind the developer to
# re-run tests if they haven't done so recently.
#
# This hook tracks the last test run via a timestamp file (.last-test-run). If
# more than 3 source edits have occurred since the last test run, it surfaces a
# reminder. This prevents silent regressions during the REFACTOR stage of TDD.
#
# Exits: 0 always (PostToolUse hooks cannot block)

set -euo pipefail

INPUT=$(cat)

FILE_PATH=$(printf '%s' "$INPUT" | python3 -c '
import json, sys

data = json.load(sys.stdin)
print(data.get("tool_input", {}).get("file_path", ""))
') || {
echo "$INPUT"
exit 0
}

# Only care about Python source files (not tests).
if [[ "$FILE_PATH" != *.py ]] || [[ -z "$FILE_PATH" ]] || [[ ! -f "$FILE_PATH" ]]; then
exit 0
fi

# Normalise path.
FP="${FILE_PATH//\\//}"
FP="${FP#./}"

# Only fire for src/ or scripts/ files.
if [[ ! "$FP" =~ ^(.*/)?(src|scripts)/ ]]; then
exit 0
fi

# Track edit count since last test run.
COUNTER_FILE=".claude/.refactor-edit-count"
mkdir -p .claude

if [[ -f "$COUNTER_FILE" ]]; then
COUNT=$(cat "$COUNTER_FILE" 2>/dev/null || echo "0")
COUNT=$((COUNT + 1))
else
COUNT=1
fi

echo "$COUNT" > "$COUNTER_FILE"

# Remind after every 3 edits.
if [[ $((COUNT % 3)) -eq 0 ]]; then
echo "┌─ Refactor test guard"
echo "│"
echo "│ ${COUNT} source edits since last test run."
echo "│ Run \`just test\` to catch regressions early."
echo "│ In TDD: tests must stay green throughout REFACTOR."
echo "└─ ⚠ Consider running tests"
fi

exit 0
75 changes: 75 additions & 0 deletions .claude/hooks/pre-bash-coverage-gate.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
#!/usr/bin/env bash
# Claude PreToolUse hook — Bash
# Coverage gate: before git commit, run a quick coverage check and warn if
# coverage has dropped below the project threshold (default 85%).
#
# This is a non-blocking warning. The pre-commit hook and CI will enforce the
# hard gate — this hook gives earlier feedback so the developer can add missing
# tests before committing.
#
# Exits: 0 always (non-blocking; warnings on stderr only)

set -uo pipefail

INPUT=$(cat)

COMMAND=$(python3 - <<'PYEOF'
import json, sys

data = json.loads(sys.stdin.read())
print(data.get("tool_input", {}).get("command", ""))
PYEOF
<<<"$INPUT") || { echo "$INPUT"; exit 0; }

# Only fire for git commit commands.
if ! echo "$COMMAND" | grep -qE '^\s*git\s+commit\b'; then
echo "$INPUT"
exit 0
fi

# Check if any staged Python files exist.
STAGED_PY=$(git diff --cached --name-only --diff-filter=ACMR 2>/dev/null \
| grep '\.py$' || true)

if [[ -z "$STAGED_PY" ]]; then
echo "$INPUT"
exit 0
fi

# Run the test suite with coverage (collect-only would not execute tests, so no
# coverage data would be produced). Non-zero exit from pytest is ignored; we
# parse the terminal report for TOTAL coverage.
echo "┌─ Coverage gate" >&2

# Note: do not hardcode the package path here; let the project's coverage config
# decide what is measured (e.g. src/ for generated projects, tests/ for this repo).
COV_OUTPUT=$(uv run --active pytest -q --cov --cov-report=term-missing \
--cov-fail-under=85 2>&1 || true)

# Extract the total coverage percentage.
TOTAL_COV=$(echo "$COV_OUTPUT" | grep -oE 'TOTAL\s+[0-9]+\s+[0-9]+\s+([0-9]+)%' \
| grep -oE '[0-9]+%$' || echo "")

if [[ -z "$TOTAL_COV" ]]; then
# Could not determine coverage — skip silently.
echo "│ Could not determine coverage — skipping check" >&2
echo "└─" >&2
echo "$INPUT"
exit 0
fi

COV_NUM="${TOTAL_COV%\%}"

if [[ "$COV_NUM" -lt 85 ]]; then
echo "│" >&2
echo "│ ⚠ Coverage is ${TOTAL_COV} (threshold: 85%)" >&2
echo "│ Consider adding tests before committing." >&2
echo "│ Run \`just coverage\` to see which lines are uncovered." >&2
echo "└─ ⚠ Coverage below threshold" >&2
else
echo "│ Coverage: ${TOTAL_COV} ✓" >&2
echo "└─ ✓ Coverage gate passed" >&2
fi

echo "$INPUT"
exit 0
Loading
Loading