chore(test): install audit-harness v0.1.0 + author CLAUDE.md (P6 batch) #4
Two-in-one P6 fan-out (executive-intent had no CLAUDE.md):
1. CLAUDE.md authored from scratch covering stack, dev commands, source layout, secrets guidance, testing baseline, filing standard
2. @intentsolutions/audit-harness v0.1.0 vendored + wrapper at scripts/audit-harness
3. 000-docs/009-OD-SOPS-audit-harness-baseline-2026-05-01.md filed

Resolves the "no CLAUDE.md" prereq listed at OPS-e1s for executive-intent.
Refs jeremylongshore/intentsolutions-vps-runbook#2

jeremylongshore.com made me do it
-claude
intentsolutions.io
Review Summary by Qodo
Install audit-harness v0.1.0 + author CLAUDE.md for testing baseline
Walkthroughs
Description
• Install audit-harness v0.1.0 vendored at .audit-harness/ with shell wrapper
• Author CLAUDE.md covering stack, dev commands, testing baseline, secrets guidance
• Add 6 deterministic enforcement scripts: hash-pinning, escape-scan, architecture checks, bias counting, Gherkin linting, CRAP scoring
• File SOP document (009-OD-SOPS-audit-harness-baseline-2026-05-01.md) for P6 fan-out batch
Diagram
flowchart LR
A["audit-harness v0.1.0"] -->|vendored| B[".audit-harness/scripts/"]
B -->|contains| C["arch-check.sh"]
B -->|contains| D["escape-scan.sh"]
B -->|contains| E["bias-count.sh"]
B -->|contains| F["gherkin-lint.sh"]
B -->|contains| G["harness-hash.sh"]
B -->|contains| H["crap-score.py"]
I["scripts/audit-harness"] -->|wrapper| B
J["CLAUDE.md"] -->|documents| K["Stack & Dev Commands"]
J -->|documents| L["Testing Baseline"]
J -->|references| I
M["000-docs/009-OD-SOPS"] -->|files| N["Installation & Deferred Tasks"]
File Changes
1. .audit-harness/scripts/arch-check.sh
Code Review by Qodo
1. ~/.claude paths referenced
Code Review
This pull request introduces the @intentsolutions/audit-harness toolkit, which provides a suite of scripts for deterministic test enforcement, architecture validation, and AI escape detection. Key features include a multi-language CRAP scorer, a test bias pattern counter, and a SHA-256 manifest system for pinning critical policy files. Feedback focuses on improving the portability of the shell scripts by replacing the bc dependency with awk, adding fallback support for macOS hashing utilities, and ensuring proper cleanup of temporary files created during diff scanning.
  DENSITY=$(echo "scale=2; $ASSERT_COUNT / $TEST_COUNT" | bc)
else
  DENSITY="0"
fi

# Per-100 bias rate
if [ "$TEST_COUNT" -gt 0 ]; then
  RATE=$(echo "scale=1; $TOTAL_BIAS * 100 / $TEST_COUNT" | bc)
The script depends on bc for floating-point arithmetic, which is often not installed in minimal CI environments (e.g., Alpine-based images). Using awk is more portable as it is a standard POSIX utility usually present in environments where bash is available.
-  DENSITY=$(echo "scale=2; $ASSERT_COUNT / $TEST_COUNT" | bc)
-else
-  DENSITY="0"
-fi
-# Per-100 bias rate
-if [ "$TEST_COUNT" -gt 0 ]; then
-  RATE=$(echo "scale=1; $TOTAL_BIAS * 100 / $TEST_COUNT" | bc)
+  DENSITY=$(awk "BEGIN {printf \"%.2f\", $ASSERT_COUNT / $TEST_COUNT}")
+else
+  DENSITY="0"
+fi
+# Per-100 bias rate
+if [ "$TEST_COUNT" -gt 0 ]; then
+  RATE=$(awk "BEGIN {printf \"%.1f\", $TOTAL_BIAS * 100 / $TEST_COUNT}")
if [ "$(echo "$RATE <= 5" | bc)" -eq 1 ]; then
  echo " Grade: LOW — no action needed"
elif [ "$(echo "$RATE <= 15" | bc)" -eq 1 ]; then
  echo " Grade: MODERATE — review flagged tests"
elif [ "$(echo "$RATE <= 30" | bc)" -eq 1 ]; then
  echo " Grade: HIGH — systematic remediation needed"
else
  echo " Grade: CRITICAL — full rewrite of flagged tests"
fi
Continuing the removal of the bc dependency for better portability, the grading logic can also be implemented using awk.
-if [ "$(echo "$RATE <= 5" | bc)" -eq 1 ]; then
-  echo " Grade: LOW — no action needed"
-elif [ "$(echo "$RATE <= 15" | bc)" -eq 1 ]; then
-  echo " Grade: MODERATE — review flagged tests"
-elif [ "$(echo "$RATE <= 30" | bc)" -eq 1 ]; then
-  echo " Grade: HIGH — systematic remediation needed"
-else
-  echo " Grade: CRITICAL — full rewrite of flagged tests"
-fi
+if awk "BEGIN {exit !($RATE <= 5)}"; then
+  echo " Grade: LOW — no action needed"
+elif awk "BEGIN {exit !($RATE <= 15)}"; then
+  echo " Grade: MODERATE — review flagged tests"
+elif awk "BEGIN {exit !($RATE <= 30)}"; then
+  echo " Grade: HIGH — systematic remediation needed"
+else
+  echo " Grade: CRITICAL — full rewrite of flagged tests"
+fi
--staged) DIFF_SRC=$(mktemp); git diff --cached > "$DIFF_SRC" ;;
--range) DIFF_SRC=$(mktemp); git diff "$2" > "$DIFF_SRC"; shift ;;
Temporary files created with mktemp are not cleaned up. Adding a trap ensures these files are removed when the script exits, preventing clutter in the temporary directory.
---staged) DIFF_SRC=$(mktemp); git diff --cached > "$DIFF_SRC" ;;
---range) DIFF_SRC=$(mktemp); git diff "$2" > "$DIFF_SRC"; shift ;;
+--staged) DIFF_SRC=$(mktemp); trap 'rm -f "$DIFF_SRC"' EXIT; git diff --cached > "$DIFF_SRC" ;;
+--range) DIFF_SRC=$(mktemp); trap 'rm -f "$DIFF_SRC"' EXIT; git diff "$2" > "$DIFF_SRC"; shift ;;
  return 0
fi
while IFS= read -r f; do
  printf '%s %s\n' "$(sha256sum "$f" | awk '{print $1}')" "$f"
sha256sum is not available by default on macOS (which uses shasum -a 256). Using a fallback mechanism makes the hashing script portable across Linux and macOS environments.
-  printf '%s %s\n' "$(sha256sum "$f" | awk '{print $1}')" "$f"
+  printf '%s %s\n' "$( (sha256sum "$f" 2>/dev/null || shasum -a 256 "$f") | awk '{print $1}')" "$f"
This repo currently uses environment variables (Firebase Hosting + GCP service account credentials, Supabase keys, Nightfall API key, Inngest signing key). Per `~/.claude/CLAUDE.md` § "SOPS + age secrets standard", this repo should adopt the canonical 4-file pattern:

```bash
cd ~/000-projects/executive-intent
sops-init
```

Tracked separately under VPS-as-the-home Priority 6 (`OPS-z9b`).

## Testing baseline (2026-05-01 — Intent Solutions Testing SOP)

This repo participates in the **Intent Solutions Testing SOP** per `~/.claude/CLAUDE.md` § "Intent Solutions Testing SOP" and the VPS-as-the-home program (`OPS-5nm`, Priority 6).

**Installed**: `@intentsolutions/audit-harness v0.1.0` vendored at `.audit-harness/` with wrapper at `scripts/audit-harness`. Hash-pinning + escape-scan ride along the in-repo install — never reference `~/.claude/` paths from hooks or CI.
1. ~/.claude paths referenced 📘 Rule violation ☼ Reliability
New docs reference developer-local ~/.claude/ paths, which breaks the in-repo/reproducible audit-harness baseline requirement. This violates the rule that repo guidance should not depend on ~/.claude-local assets.
Agent Prompt
## Issue description
Repo docs currently reference developer-local `~/.claude/` paths, which violates the requirement that the audit harness baseline be in-repo and not depend on `~/.claude`.
## Issue Context
Compliance requires hooks/CI and repo guidance to avoid `~/.claude` path dependencies and instead point to the vendored `.audit-harness/` and `scripts/audit-harness` wrapper (or to repo-contained documentation).
## Fix Focus Areas
- CLAUDE.md[47-61]
- 000-docs/009-OD-SOPS-audit-harness-baseline-2026-05-01.md[38-40]
def score_go(root: Path, kind: str) -> list[MethodScore]:
    if which_or_none("gocyclo") is None:
        print("[crap-score] gocyclo not installed", file=sys.stderr)
        return []

    rc, out, _ = run(["gocyclo", "-ignore", "_test.go" if kind == "src" else ".*\\.go$", "."], root)
    complexity: list[tuple[str, str, int]] = []
    for line in out.splitlines():
        parts = line.strip().split()
        if len(parts) < 4:
            continue
        try:
            c = int(parts[0])
        except ValueError:
            continue
        pkg = parts[1]
        func = parts[2]
        fpath = parts[3].split(":", 1)[0]
        include = fpath.endswith("_test.go") if kind == "test" else not fpath.endswith("_test.go")
        if include:
            complexity.append((fpath, f"{pkg}.{func}", c))
2. Go test crap ignored 🐞 Bug ≡ Correctness
score_go() passes -ignore '.*\.go$' when scoring tests, which makes gocyclo ignore all Go files and return no complexity output. Test-method CRAP scores (and thus test CRAP threshold enforcement) never run.
Agent Prompt
## Issue description
`score_go()` uses `gocyclo -ignore '.*\\.go$'` when `kind == "test"`, which effectively ignores every Go source file (including `_test.go`). This produces no complexity output, so the test CRAP report is always empty.
## Issue Context
The function already has logic to include/exclude `_test.go` based on `kind`, so `-ignore` does not need to try to pre-filter test vs src and should not accidentally exclude everything.
## Fix Focus Areas
- .audit-harness/scripts/crap-score.py[150-171]
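One way to read the fix described above: drop `-ignore` entirely and rely on the existing `_test.go` suffix filter. The sketch below is not the harness's code; `gocyclo_command`, `parse_gocyclo`, and the `SAMPLE` output are hypothetical names and data, and the line format assumed is gocyclo's usual `<complexity> <package> <function> <file:line:col>`.

```python
from __future__ import annotations


def gocyclo_command() -> list[str]:
    """Build the gocyclo invocation without -ignore, so every Go file is scanned."""
    return ["gocyclo", "."]


def parse_gocyclo(out: str, kind: str) -> list[tuple[str, str, int]]:
    """Parse `<complexity> <package> <function> <file:line:col>` lines.

    kind == "test" keeps only *_test.go files; any other kind keeps the rest.
    """
    rows: list[tuple[str, str, int]] = []
    for line in out.splitlines():
        parts = line.strip().split()
        if len(parts) < 4:
            continue
        try:
            complexity = int(parts[0])
        except ValueError:
            continue
        pkg, func = parts[1], parts[2]
        fpath = parts[3].split(":", 1)[0]
        is_test = fpath.endswith("_test.go")
        # Keep the row only when the file type matches the requested kind.
        if is_test == (kind == "test"):
            rows.append((fpath, f"{pkg}.{func}", complexity))
    return rows


# Hypothetical gocyclo output used only for illustration.
SAMPLE = """\
7 billing Charge billing/charge.go:12:1
3 billing TestCharge billing/charge_test.go:9:1
"""

print(parse_gocyclo(SAMPLE, "src"))
print(parse_gocyclo(SAMPLE, "test"))
```

With the pre-filter gone, the same gocyclo run serves both `kind` values, and the test-side list is no longer empty by construction.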
    src_scores = [s for s in all_scores if s.kind == "src"]
    test_scores = [s for s in all_scores if s.kind == "test"]
    prod_max = max((s.crap for s in src_scores), default=0.0)
    test_max = max((s.crap for s in test_scores), default=0.0)
    prod_avg = (sum(s.crap for s in src_scores) / len(src_scores)) if src_scores else 0.0

    prod_blockers = [asdict(s) for s in src_scores if s.crap > args.threshold_prod]
    test_blockers = [asdict(s) for s in test_scores if s.crap > args.threshold_test]
    avg_fail = prod_avg > args.threshold_avg

    pass_ = not (prod_blockers or test_blockers or avg_fail)

    summary = {
        "language": lang,
        "thresholds": {
            "production_max": args.threshold_prod,
            "test_max": args.threshold_test,
            "project_avg_max": args.threshold_avg,
        },
        "production": {
            "methods_scored": len(src_scores),
            "max_crap": round(prod_max, 2),
            "avg_crap": round(prod_avg, 2),
            "blockers": prod_blockers,
        },
        "test": {
            "methods_scored": len(test_scores),
            "max_crap": round(test_max, 2),
            "blockers": test_blockers,
        },
        "pass": pass_,
    }

    if args.format in ("json", "both"):
        (out_dir / "summary.json").write_text(json.dumps(summary, indent=2))

    print(json.dumps({"pass": pass_, "summary_path": str(out_dir / "summary.json")}))
    return 0 if pass_ else 1
3. Crap tool missing false-pass 🐞 Bug ☼ Reliability
When required analyzers (radon/gocyclo/complexity-report) are missing, the scorer returns an empty score set, but main() still reports pass=true and exits 0 because it only checks for blockers/avg-fail. This can make gates succeed even though no CRAP analysis ran.
Agent Prompt
## Issue description
`crap-score.py` returns exit code 0 (pass) when no scoring occurs due to missing analyzers, because an empty score list produces no blockers and no avg-fail.
## Issue Context
This is especially likely for JS/TS in this repo because `complexity-report` is not listed in `devDependencies`. A deterministic quality gate should not silently pass when it cannot execute.
## Fix Focus Areas
- .audit-harness/scripts/crap-score.py[99-102]
- .audit-harness/scripts/crap-score.py[151-153]
- .audit-harness/scripts/crap-score.py[203-207]
- .audit-harness/scripts/crap-score.py[344-381]
- package.json[1-1]
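A minimal sketch of a fail-closed gate, assuming the scorer can report whether its analyzer was found. `GateResult`, `evaluate_gate`, and the threshold of 30.0 are hypothetical and not taken from `crap-score.py`. Treating an empty score set as a failure is one possible policy; a repo with genuinely nothing to score might need an explicit opt-out.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class GateResult:
    passed: bool
    reason: str


def evaluate_gate(
    analyzer_available: bool,
    crap_scores: list[float],
    threshold: float,
) -> GateResult:
    """Fail closed: a missing analyzer or an empty score set is never a pass."""
    if not analyzer_available:
        return GateResult(False, "analyzer not installed; no analysis ran")
    if not crap_scores:
        return GateResult(False, "no methods scored")
    worst = max(crap_scores)
    if worst > threshold:
        return GateResult(False, f"max CRAP {worst:.2f} exceeds {threshold:.2f}")
    return GateResult(True, "all scores within threshold")


# Hypothetical inputs: the first call models a missing analyzer.
print(evaluate_gate(False, [], 30.0))
print(evaluate_gate(True, [4.0, 12.5], 30.0))
```

The key difference from the reviewed logic is that "nothing ran" and "nothing exceeded the threshold" are separate outcomes, so a missing tool surfaces as a failure with a reason instead of exit code 0.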
WARN_COUNT=0
ERROR_COUNT=0

warn() { echo "WARN $1:$2 $3"; WARN_COUNT=$((WARN_COUNT + 1)); }
err() { echo "ERROR $1:$2 $3"; ERROR_COUNT=$((ERROR_COUNT + 1)); }

# 1. Prefer official gherkin-lint if available
if command -v gherkin-lint >/dev/null 2>&1; then
  echo "gherkin-lint: using installed linter"
  if ! gherkin-lint "$PATH_ARG"; then
    ERROR_COUNT=1
  fi
else
  echo "gherkin-lint: falling back to awk rubric (install gherkin-lint for full rules)"

  while IFS= read -r -d '' feature; do
    # Imperative verbs / CSS selectors in steps (declarative warning)
    awk -v file="$feature" '
      /^[[:space:]]*(Given|When|Then|And|But)/ {
        line = $0
        if (line ~ /click|type|fill[ _]in|press|select.*from[ _]dropdown/) {
          printf "WARN %s:%d imperative verb in step (prefer declarative)\n", file, NR
        }
        if (line ~ /#[a-zA-Z][-a-zA-Z0-9_]*|\.[a-zA-Z][-a-zA-Z0-9_]*[[:space:]]|xpath/) {
          printf "WARN %s:%d CSS selector / xpath in step (prefer business language)\n", file, NR
        }
      }
    ' "$feature"

    # Scenario length (> 10 steps)
    awk -v file="$feature" '
      /^[[:space:]]*Scenario/ { sc = NR; steps = 0; sn = $0; next }
      /^[[:space:]]*(Given|When|Then|And|But)/ { if (sc) steps++ }
      /^[[:space:]]*Scenario|^[[:space:]]*Feature|^$/ {
        if (sc && steps > 10) {
          printf "WARN %s:%d scenario has %d steps (>10 is too long)\n", file, sc, steps
        }
        if (NR != sc) { sc = 0; steps = 0 }
      }
      END {
        if (sc && steps > 10) {
          printf "WARN %s:%d scenario has %d steps (>10 is too long)\n", file, sc, steps
        }
      }
    ' "$feature"

    # Repeated Givens without Background (3+ identical Given lines)
    dupe=$(awk '/^[[:space:]]*Given/ { print }' "$feature" | sort | uniq -c | awk '$1 >= 3 { print }')
    if [[ -n "$dupe" ]] && ! grep -q "^[[:space:]]*Background:" "$feature"; then
      warn "$feature" 0 "repeated Given lines without Background block"
    fi

    # "And" at scenario start (grammar error)
    awk -v file="$feature" '
      prev_blank = 1
      /^[[:space:]]*$/ { prev_blank = 1; next }
      /^[[:space:]]*Scenario/ { in_scenario = 1; step_count = 0; next }
      /^[[:space:]]*(Given|When|Then|And|But)/ {
        if (in_scenario && step_count == 0 && /^[[:space:]]*And/) {
          printf "ERROR %s:%d scenario starts with And (use Given/When/Then)\n", file, NR
        }
        step_count++
      }
    ' "$feature"

  done < <(find "$PATH_ARG" -name "*.feature" -print0)
fi

echo ""
echo "gherkin-lint summary: $WARN_COUNT warning(s), $ERROR_COUNT error(s)"

if [[ "$ERROR_COUNT" -gt 0 ]]; then
  exit 1
fi
if [[ "$STRICT" -eq 1 && "$WARN_COUNT" -gt 0 ]]; then
  exit 1
fi
exit 0
4. Strict gherkin-lint never fails 🐞 Bug ≡ Correctness
In the awk fallback, most WARN/ERROR lines are printed directly by awk and do not increment WARN_COUNT/ERROR_COUNT, so --strict and error handling often still exit 0. This defeats enforcement when gherkin-lint isn’t installed.
Agent Prompt
## Issue description
When falling back to awk, WARN/ERROR output is not reflected in `WARN_COUNT`/`ERROR_COUNT`, so `--strict` and error exits do not reliably trigger.
## Issue Context
`warn()`/`err()` update counters, but the awk blocks currently `printf` directly. The exit behavior must be based on the same signals being printed.
## Fix Focus Areas
- .audit-harness/scripts/gherkin-lint.sh[34-38]
- .audit-harness/scripts/gherkin-lint.sh[49-100]
- .audit-harness/scripts/gherkin-lint.sh[102-110]
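The principle in this finding, that the exit status should be derived from the same lines that get printed, can be illustrated independently of the shell script. The sketch below is in Python rather than bash and is not a patch for `gherkin-lint.sh`; `tally`, `exit_code`, and the `login.feature` sample lines are hypothetical.

```python
from __future__ import annotations


def tally(output: str) -> tuple[int, int]:
    """Count lines beginning with WARN / ERROR in captured linter output."""
    warns = errors = 0
    for line in output.splitlines():
        if line.startswith("WARN "):
            warns += 1
        elif line.startswith("ERROR "):
            errors += 1
    return warns, errors


def exit_code(output: str, strict: bool) -> int:
    """Errors always fail; warnings fail only in strict mode."""
    warns, errors = tally(output)
    if errors > 0:
        return 1
    if strict and warns > 0:
        return 1
    return 0


# Hypothetical awk-fallback output with two warnings and no errors.
SAMPLE = (
    "WARN login.feature:4 imperative verb in step (prefer declarative)\n"
    "WARN login.feature:9 scenario has 12 steps (>10 is too long)\n"
)
print(tally(SAMPLE), exit_code(SAMPLE, strict=False), exit_code(SAMPLE, strict=True))
```

In the shell script, a comparable effect might be reached by capturing each awk block's output and counting its `WARN`/`ERROR` lines before echoing them, so the counters and the printed report cannot drift apart.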
P6 fan-out for executive-intent. CLAUDE.md authored from scratch + harness vendored + 000-docs entry. Resolves the 'no CLAUDE.md' P6 prereq for this repo.
Refs jeremylongshore/intentsolutions-vps-runbook#2
jeremylongshore.com made me do it
-claude
intentsolutions.io