chore(test): install audit-harness v0.1.0 + author CLAUDE.md (P6 batch) by jeremylongshore · Pull Request #4 · intent-solutions-io/executive-intent

jeremylongshore · 2026-05-02T00:06:12Z

P6 fan-out for executive-intent. CLAUDE.md authored from scratch + harness vendored + 000-docs entry. Resolves the 'no CLAUDE.md' P6 prereq for this repo.

Refs jeremylongshore/intentsolutions-vps-runbook#2

jeremylongshore.com made me do it
-claude
intentsolutions.io

Two-in-one P6 fan-out (executive-intent had no CLAUDE.md):

1. CLAUDE.md authored from scratch covering stack, dev commands, source layout, secrets guidance, testing baseline, filing standard
2. @intentsolutions/audit-harness v0.1.0 vendored + wrapper at scripts/audit-harness
3. 000-docs/009-OD-SOPS-audit-harness-baseline-2026-05-01.md filed

Resolves the "no CLAUDE.md" prereq listed at OPS-e1s for executive-intent.

Refs jeremylongshore/intentsolutions-vps-runbook#2

jeremylongshore.com made me do it
-claude
intentsolutions.io



qodo-code-review · 2026-05-02T00:06:27Z

Review Summary by Qodo

Install audit-harness v0.1.0 + author CLAUDE.md for testing baseline

✨ Enhancement 🧪 Tests

Walkthroughs

Description

• Install audit-harness v0.1.0 vendored at .audit-harness/ with shell wrapper
• Author CLAUDE.md covering stack, dev commands, testing baseline, secrets guidance
• Add 6 deterministic enforcement scripts: hash-pinning, escape-scan, architecture checks, bias
  counting, Gherkin linting, CRAP scoring
• File SOP document (009-OD-SOPS-audit-harness-baseline-2026-05-01.md) for P6 fan-out batch

Diagram

flowchart LR
  A["audit-harness v0.1.0"] -->|vendored| B[".audit-harness/scripts/"]
  B -->|contains| C["arch-check.sh"]
  B -->|contains| D["escape-scan.sh"]
  B -->|contains| E["bias-count.sh"]
  B -->|contains| F["gherkin-lint.sh"]
  B -->|contains| G["harness-hash.sh"]
  B -->|contains| H["crap-score.py"]
  I["scripts/audit-harness"] -->|wrapper| B
  J["CLAUDE.md"] -->|documents| K["Stack & Dev Commands"]
  J -->|documents| L["Testing Baseline"]
  J -->|references| I
  M["000-docs/009-OD-SOPS"] -->|files| N["Installation & Deferred Tasks"]

File Changes

1. .audit-harness/scripts/arch-check.sh ✨ Enhancement +143/-0

Language-aware architecture constraint dispatcher

.audit-harness/scripts/arch-check.sh

2. .audit-harness/scripts/bias-count.sh ✨ Enhancement +88/-0

Test bias pattern counter and quality grader

.audit-harness/scripts/bias-count.sh

3. .audit-harness/scripts/escape-scan.sh ✨ Enhancement +171/-0

AI escape attempt detector in diffs

.audit-harness/scripts/escape-scan.sh


4. .audit-harness/scripts/gherkin-lint.sh ✨ Enhancement +111/-0

Advisory Gherkin quality checker with fallback

.audit-harness/scripts/gherkin-lint.sh

5. .audit-harness/scripts/harness-hash.sh ✨ Enhancement +116/-0

SHA-256 manifest for pinned policy files

.audit-harness/scripts/harness-hash.sh

6. .audit-harness/scripts/crap-score.py ✨ Enhancement +385/-0

Multi-language CRAP complexity-coverage scorer

.audit-harness/scripts/crap-score.py

7. .audit-harness/CHANGELOG.md 📝 Documentation +28/-0

Version 0.1.0 release notes and design decisions

.audit-harness/CHANGELOG.md

8. .audit-harness/LICENSE 📝 Documentation +21/-0

MIT license for audit-harness package

.audit-harness/LICENSE

9. .audit-harness/README.md 📝 Documentation +135/-0

Comprehensive usage guide and taxonomy documentation

.audit-harness/README.md

10. .audit-harness/VERSION ⚙️ Configuration changes +1/-0

Version identifier v0.1.0

.audit-harness/VERSION

11. 000-docs/009-OD-SOPS-audit-harness-baseline-2026-05-01.md 📝 Documentation +41/-0

SOP document for harness installation and baseline

000-docs/009-OD-SOPS-audit-harness-baseline-2026-05-01.md

12. CLAUDE.md 📝 Documentation +85/-0

Repository guidance for Claude Code with testing baseline

CLAUDE.md

13. scripts/audit-harness ✨ Enhancement +57/-0

Shell wrapper dispatcher for vendored harness commands

scripts/audit-harness

qodo-code-review · 2026-05-02T00:06:29Z

Code Review by Qodo

🐞 Bugs (5) 📘 Rule violations (1)

1. ~/.claude paths referenced 📘 Rule violation ☼ Reliability

Description

New docs reference developer-local ~/.claude/ paths, which breaks the in-repo/reproducible
audit-harness baseline requirement. This violates the rule that repo guidance should not depend on
~/.claude-local assets.

Code

CLAUDE.md[R47-61]

+This repo currently uses environment variables (Firebase Hosting + GCP service account credentials, Supabase keys, Nightfall API key, Inngest signing key). Per `~/.claude/CLAUDE.md` § "SOPS + age secrets standard", this repo should adopt the canonical 4-file pattern:
+
+```bash
+cd ~/000-projects/executive-intent
+sops-init
+```
+
+Tracked separately under VPS-as-the-home Priority 6 (`OPS-z9b`).
+
+## Testing baseline (2026-05-01 — Intent Solutions Testing SOP)
+
+This repo participates in the **Intent Solutions Testing SOP** per `~/.claude/CLAUDE.md` § "Intent Solutions Testing SOP" and the VPS-as-the-home program (`OPS-5nm`, Priority 6).
+
+**Installed**: `@intentsolutions/audit-harness v0.1.0` vendored at `.audit-harness/` with wrapper at `scripts/audit-harness`. Hash-pinning + escape-scan ride along the in-repo install — never reference `~/.claude/` paths from hooks or CI.
+

Evidence
PR Compliance ID 8 forbids referencing ~/.claude/ paths and requires using the in-repo vendored
harness. The newly added repository documentation includes ~/.claude/CLAUDE.md references.
CLAUDE.md
CLAUDE.md[47-61]
000-docs/009-OD-SOPS-audit-harness-baseline-2026-05-01.md[38-40]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
Repo docs currently reference developer-local `~/.claude/` paths, which violates the requirement that the audit harness baseline be in-repo and not depend on `~/.claude`.

## Issue Context
Compliance requires hooks/CI and repo guidance to avoid `~/.claude` path dependencies and instead point to the vendored `.audit-harness/` and `scripts/audit-harness` wrapper (or to repo-contained documentation).

## Fix Focus Areas
- CLAUDE.md[47-61]
- 000-docs/009-OD-SOPS-audit-harness-baseline-2026-05-01.md[38-40]


2. Go test CRAP ignored 🐞 Bug ≡ Correctness

Description

score_go() passes -ignore '.*\.go$' when scoring tests, which makes gocyclo ignore all Go
files and return no complexity output. Test-method CRAP scores (and thus test CRAP threshold
enforcement) never run.

Code

.audit-harness/scripts/crap-score.py[R150-170]

+def score_go(root: Path, kind: str) -> list[MethodScore]:
+    if which_or_none("gocyclo") is None:
+        print("[crap-score] gocyclo not installed", file=sys.stderr)
+        return []
+
+    rc, out, _ = run(["gocyclo", "-ignore", "_test.go" if kind == "src" else ".*\\.go$", "."], root)
+    complexity: list[tuple[str, str, int]] = []
+    for line in out.splitlines():
+        parts = line.strip().split()
+        if len(parts) < 4:
+            continue
+        try:
+            c = int(parts[0])
+        except ValueError:
+            continue
+        pkg = parts[1]
+        func = parts[2]
+        fpath = parts[3].split(":", 1)[0]
+        include = fpath.endswith("_test.go") if kind == "test" else not fpath.endswith("_test.go")
+        if include:
+            complexity.append((fpath, f"{pkg}.{func}", c))

Evidence

For kind == "test", the ignore regex matches every .go file, so out from gocyclo is empty
and the complexity list remains empty; since test blockers are derived from these scores, test
CRAP checks cannot fail.

.audit-harness/scripts/crap-score.py[150-170]
.audit-harness/scripts/crap-score.py[344-355]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`score_go()` uses `gocyclo -ignore '.*\\.go$'` when `kind == "test"`, which effectively ignores every Go source file (including `_test.go`). This produces no complexity output, so the test CRAP report is always empty.

## Issue Context
The function already has logic to include/exclude `_test.go` based on `kind`, so `-ignore` does not need to try to pre-filter test vs src and should not accidentally exclude everything.

## Fix Focus Areas
- .audit-harness/scripts/crap-score.py[150-171]
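
A minimal sketch of why the test-side pattern excludes everything, and of one possible argument builder that avoids it. It assumes gocyclo applies the `-ignore` regular expression to each file path, and it uses Python's `re` as a stand-in for Go's regexp engine (the two agree for patterns this simple). `gocyclo_args` and `keep` are hypothetical helpers, not names from the PR.

```python
import re

# Pattern the reviewed code passes to `gocyclo -ignore` when kind == "test".
TEST_IGNORE = r".*\.go$"


def is_ignored(pattern: str, path: str) -> bool:
    """Approximate gocyclo's -ignore check with Python's re (an assumption)."""
    return re.search(pattern, path) is not None


def gocyclo_args(kind: str) -> list[str]:
    """Hypothetical helper: pre-filter only when scoring production code."""
    args = ["gocyclo"]
    if kind == "src":
        args += ["-ignore", r"_test\.go$"]
    args.append(".")
    return args


def keep(fpath: str, kind: str) -> bool:
    """Same post-filter idea as the reviewed score_go(): split on the _test.go suffix."""
    is_test = fpath.endswith("_test.go")
    return is_test if kind == "test" else not is_test


if __name__ == "__main__":
    for path in ("pkg/foo.go", "pkg/foo_test.go"):
        print(path, "ignored by test pattern:", is_ignored(TEST_IGNORE, path))
    print(gocyclo_args("test"))
```

With no `-ignore` on the test path, the existing suffix check in the parsing loop is what separates test functions from production functions.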


3. CRAP tool missing false-pass 🐞 Bug ☼ Reliability

Description

When required analyzers (radon/gocyclo/complexity-report) are missing, the scorer returns an empty
score set, but main() still reports pass=true and exits 0 because it only checks for
blockers/avg-fail. This can make gates succeed even though no CRAP analysis ran.

Code

.audit-harness/scripts/crap-score.py[R344-381]

+    src_scores = [s for s in all_scores if s.kind == "src"]
+    test_scores = [s for s in all_scores if s.kind == "test"]
+    prod_max = max((s.crap for s in src_scores), default=0.0)
+    test_max = max((s.crap for s in test_scores), default=0.0)
+    prod_avg = (sum(s.crap for s in src_scores) / len(src_scores)) if src_scores else 0.0
+
+    prod_blockers = [asdict(s) for s in src_scores if s.crap > args.threshold_prod]
+    test_blockers = [asdict(s) for s in test_scores if s.crap > args.threshold_test]
+    avg_fail = prod_avg > args.threshold_avg
+
+    pass_ = not (prod_blockers or test_blockers or avg_fail)
+
+    summary = {
+        "language": lang,
+        "thresholds": {
+            "production_max": args.threshold_prod,
+            "test_max": args.threshold_test,
+            "project_avg_max": args.threshold_avg,
+        },
+        "production": {
+            "methods_scored": len(src_scores),
+            "max_crap": round(prod_max, 2),
+            "avg_crap": round(prod_avg, 2),
+            "blockers": prod_blockers,
+        },
+        "test": {
+            "methods_scored": len(test_scores),
+            "max_crap": round(test_max, 2),
+            "blockers": test_blockers,
+        },
+        "pass": pass_,
+    }
+
+    if args.format in ("json", "both"):
+        (out_dir / "summary.json").write_text(json.dumps(summary, indent=2))
+
+    print(json.dumps({"pass": pass_, "summary_path": str(out_dir / "summary.json")}))
+    return 0 if pass_ else 1

Evidence
Multiple score_* functions return [] when tooling is absent, but main() treats an empty result
set as passing because prod_blockers, test_blockers, and avg_fail all evaluate false when no
scores exist; additionally, this repo’s package.json does not declare complexity-report, making
the JS path likely to hit the missing-tool branch.
.audit-harness/scripts/crap-score.py[99-102]
.audit-harness/scripts/crap-score.py[151-153]
.audit-harness/scripts/crap-score.py[203-207]
.audit-harness/scripts/crap-score.py[344-381]
package.json[1-1]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`crap-score.py` returns exit code 0 (pass) when no scoring occurs due to missing analyzers, because an empty score list produces no blockers and no avg-fail.

## Issue Context
This is especially likely for JS/TS in this repo because `complexity-report` is not listed in `devDependencies`. A deterministic quality gate should not silently pass when it cannot execute.

## Fix Focus Areas
- .audit-harness/scripts/crap-score.py[99-102]
- .audit-harness/scripts/crap-score.py[151-153]
- .audit-harness/scripts/crap-score.py[203-207]
- .audit-harness/scripts/crap-score.py[344-381]
- package.json[1-1]
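
One way to make the gate fail closed is to treat "nothing was scored" as its own outcome rather than as a pass. The sketch below is an illustration under assumptions: `MethodScore` is a simplified stand-in for the PR's dataclass, the `gate` function and the exit code `2` for "analysis did not run" are hypothetical choices, and the thresholds are passed in by the caller.

```python
from dataclasses import dataclass

EXIT_PASS = 0
EXIT_FAIL = 1
EXIT_NOT_RUN = 2  # assumed code for "no analyzer output"; not from the PR


@dataclass
class MethodScore:
    """Simplified stand-in for the scorer's per-method record."""
    name: str
    kind: str  # "src" or "test"
    crap: float


def gate(scores, threshold_prod, threshold_test, threshold_avg):
    """Return an exit code; an empty score set never counts as a pass."""
    if not scores:
        return EXIT_NOT_RUN
    src = [s for s in scores if s.kind == "src"]
    test = [s for s in scores if s.kind == "test"]
    prod_avg = sum(s.crap for s in src) / len(src) if src else 0.0
    blocked = (
        any(s.crap > threshold_prod for s in src)
        or any(s.crap > threshold_test for s in test)
        or prod_avg > threshold_avg
    )
    return EXIT_FAIL if blocked else EXIT_PASS


if __name__ == "__main__":
    print(gate([], 30.0, 15.0, 10.0))
    print(gate([MethodScore("pkg.f", "src", 4.0)], 30.0, 15.0, 10.0))
```

A distinct exit code lets CI tell "quality regression" apart from "tooling missing", which may call for different handling.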



4. Strict gherkin-lint never fails 🐞 Bug ≡ Correctness

Description

In the awk fallback, most WARN/ERROR lines are printed directly by awk and do not increment
WARN_COUNT/ERROR_COUNT, so --strict and error handling often still exit 0. This defeats
enforcement when gherkin-lint isn’t installed.

Code

.audit-harness/scripts/gherkin-lint.sh[R34-111]

+WARN_COUNT=0
+ERROR_COUNT=0
+
+warn() { echo "WARN  $1:$2 $3"; WARN_COUNT=$((WARN_COUNT + 1)); }
+err()  { echo "ERROR $1:$2 $3"; ERROR_COUNT=$((ERROR_COUNT + 1)); }
+
+# 1. Prefer official gherkin-lint if available
+if command -v gherkin-lint >/dev/null 2>&1; then
+  echo "gherkin-lint: using installed linter"
+  if ! gherkin-lint "$PATH_ARG"; then
+    ERROR_COUNT=1
+  fi
+else
+  echo "gherkin-lint: falling back to awk rubric (install gherkin-lint for full rules)"
+
+  while IFS= read -r -d '' feature; do
+    # Imperative verbs / CSS selectors in steps (declarative warning)
+    awk -v file="$feature" '
+      /^[[:space:]]*(Given|When|Then|And|But)/ {
+        line = $0
+        if (line ~ /click|type|fill[ _]in|press|select.*from[ _]dropdown/) {
+          printf "WARN  %s:%d imperative verb in step (prefer declarative)\n", file, NR
+        }
+        if (line ~ /#[a-zA-Z][-a-zA-Z0-9_]*|\.[a-zA-Z][-a-zA-Z0-9_]*[[:space:]]|xpath/) {
+          printf "WARN  %s:%d CSS selector / xpath in step (prefer business language)\n", file, NR
+        }
+      }
+    ' "$feature"
+
+    # Scenario length (> 10 steps)
+    awk -v file="$feature" '
+      /^[[:space:]]*Scenario/ { sc = NR; steps = 0; sn = $0; next }
+      /^[[:space:]]*(Given|When|Then|And|But)/ { if (sc) steps++ }
+      /^[[:space:]]*Scenario|^[[:space:]]*Feature|^$/ {
+        if (sc && steps > 10) {
+          printf "WARN  %s:%d scenario has %d steps (>10 is too long)\n", file, sc, steps
+        }
+        if (NR != sc) { sc = 0; steps = 0 }
+      }
+      END {
+        if (sc && steps > 10) {
+          printf "WARN  %s:%d scenario has %d steps (>10 is too long)\n", file, sc, steps
+        }
+      }
+    ' "$feature"
+
+    # Repeated Givens without Background (3+ identical Given lines)
+    dupe=$(awk '/^[[:space:]]*Given/ { print }' "$feature" | sort | uniq -c | awk '$1 >= 3 { print }')
+    if [[ -n "$dupe" ]] && ! grep -q "^[[:space:]]*Background:" "$feature"; then
+      warn "$feature" 0 "repeated Given lines without Background block"
+    fi
+
+    # "And" at scenario start (grammar error)
+    awk -v file="$feature" '
+      prev_blank = 1
+      /^[[:space:]]*$/ { prev_blank = 1; next }
+      /^[[:space:]]*Scenario/ { in_scenario = 1; step_count = 0; next }
+      /^[[:space:]]*(Given|When|Then|And|But)/ {
+        if (in_scenario && step_count == 0 && /^[[:space:]]*And/) {
+          printf "ERROR %s:%d scenario starts with And (use Given/When/Then)\n", file, NR
+        }
+        step_count++
+      }
+    ' "$feature"
+
+  done < <(find "$PATH_ARG" -name "*.feature" -print0)
+fi
+
+echo ""
+echo "gherkin-lint summary: $WARN_COUNT warning(s), $ERROR_COUNT error(s)"
+
+if [[ "$ERROR_COUNT" -gt 0 ]]; then
+  exit 1
+fi
+if [[ "$STRICT" -eq 1 && "$WARN_COUNT" -gt 0 ]]; then
+  exit 1
+fi
+exit 0

Evidence
The script’s exit behavior depends on WARN_COUNT/ERROR_COUNT, but the awk rubric emits
warnings/errors without calling the shell warn()/err() functions, leaving counters at 0 even
when issues are printed.
.audit-harness/scripts/gherkin-lint.sh[34-38]
.audit-harness/scripts/gherkin-lint.sh[51-61]
.audit-harness/scripts/gherkin-lint.sh[63-78]
.audit-harness/scripts/gherkin-lint.sh[87-97]
.audit-harness/scripts/gherkin-lint.sh[102-110]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
When falling back to awk, WARN/ERROR output is not reflected in `WARN_COUNT`/`ERROR_COUNT`, so `--strict` and error exits do not reliably trigger.

## Issue Context
`warn()`/`err()` update counters, but the awk blocks currently `printf` directly. The exit behavior must be based on the same signals being printed.

## Fix Focus Areas
- .audit-harness/scripts/gherkin-lint.sh[34-38]
- .audit-harness/scripts/gherkin-lint.sh[49-100]
- .audit-harness/scripts/gherkin-lint.sh[102-110]
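
The underlying principle is that the exit status should be computed from the same lines that were printed. The sketch below shows that logic in Python purely as an illustration; the shell script would need an equivalent step, such as capturing the awk output and counting its `WARN`/`ERROR` prefixes. `tally` and `exit_code` are hypothetical names.

```python
def tally(lines):
    """Count findings by the severity prefix the rubric prints."""
    warns = sum(1 for line in lines if line.startswith("WARN "))
    errors = sum(1 for line in lines if line.startswith("ERROR "))
    return warns, errors


def exit_code(lines, strict):
    """Mirror the script's exit rules: errors always fail, warnings fail under strict."""
    warns, errors = tally(lines)
    if errors > 0:
        return 1
    if strict and warns > 0:
        return 1
    return 0


if __name__ == "__main__":
    emitted = [
        "WARN  login.feature:12 imperative verb in step (prefer declarative)",
        "ERROR login.feature:20 scenario starts with And (use Given/When/Then)",
    ]
    print(tally(emitted), exit_code(emitted, strict=False))
```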


5. escape-scan unbound args 🐞 Bug ☼ Reliability

Description

escape-scan.sh runs with set -u but reads $2 for --range and $1 after shift for
--no-hash without validating arguments exist. This can crash the scan with an unbound-variable
error instead of returning a controlled usage error code.

Code

.audit-harness/scripts/escape-scan.sh[R23-43]

+set -euo pipefail
+
+DIFF_SRC=""
+VERIFY_HASH=1
+ROOT="${ROOT:-$(pwd)}"
+HASH_SCRIPT="$(dirname "$0")/harness-hash.sh"
+
+if [[ "$#" -eq 0 ]]; then
+  echo "escape-scan: pass a diff source (- for stdin, --staged, --range, or a patch file)" >&2
+  exit 2
+fi
+
+case "$1" in
+  -) DIFF_SRC="/dev/stdin" ;;
+  --staged) DIFF_SRC=$(mktemp); git diff --cached > "$DIFF_SRC" ;;
+  --range) DIFF_SRC=$(mktemp); git diff "$2" > "$DIFF_SRC"; shift ;;
+  --no-hash) VERIFY_HASH=0; shift; DIFF_SRC="$1" ;;
+  --help|-h)
+    sed -n '2,22p' "$0"; exit 0 ;;
+  *) DIFF_SRC="$1" ;;
+esac

Evidence

With set -u, referencing missing positional parameters aborts execution; the flag parsing directly
dereferences $2 and $1 in branches that can be invoked with insufficient args.

.audit-harness/scripts/escape-scan.sh[23-43]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`escape-scan.sh` can abort due to unbound positional parameters when `--range` is passed without a range or `--no-hash` is passed without a diff source.

## Issue Context
Because the script uses `set -euo pipefail`, missing args should be handled via explicit validation and a clear error message + exit 2, rather than a shell crash.

## Fix Focus Areas
- .audit-harness/scripts/escape-scan.sh[30-48]
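
The validation being asked for can be summarized as "check the argument count before reading the value that follows a flag". Here is a small Python model of that parsing order, offered as an illustration rather than a port of the script; `UsageError`, `parse_args`, and the returned dictionary shape are all hypothetical.

```python
class UsageError(Exception):
    """Raised for bad invocations; a caller would map this to exit code 2."""


def parse_args(argv):
    """Validate that flags taking a value actually received one."""
    if not argv:
        raise UsageError("pass a diff source (-, --staged, --range, or a patch file)")
    flag = argv[0]
    if flag == "-":
        return {"mode": "stdin", "value": None, "verify_hash": True}
    if flag == "--staged":
        return {"mode": "staged", "value": None, "verify_hash": True}
    if flag == "--range":
        if len(argv) < 2:
            raise UsageError("--range requires a revision range")
        return {"mode": "range", "value": argv[1], "verify_hash": True}
    if flag == "--no-hash":
        if len(argv) < 2:
            raise UsageError("--no-hash requires a diff source")
        return {"mode": "file", "value": argv[1], "verify_hash": False}
    return {"mode": "file", "value": flag, "verify_hash": True}


if __name__ == "__main__":
    print(parse_args(["--range", "main..HEAD"]))
    try:
        parse_args(["--range"])
    except UsageError as exc:
        print("usage error:", exc)
```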


6. bias-count fails without bc 🐞 Bug ☼ Reliability

Description

bias-count.sh uses bc for arithmetic/comparisons under set -e but never checks that bc
exists. On minimal environments this fails with bc: not found instead of producing a report.

Code

.audit-harness/scripts/bias-count.sh[R54-86]

+# Assertion density
+if [ "$TEST_COUNT" -gt 0 ]; then
+  DENSITY=$(echo "scale=2; $ASSERT_COUNT / $TEST_COUNT" | bc)
+else
+  DENSITY="0"
+fi
+
+# Per-100 bias rate
+if [ "$TEST_COUNT" -gt 0 ]; then
+  RATE=$(echo "scale=1; $TOTAL_BIAS * 100 / $TEST_COUNT" | bc)
+else
+  RATE="0"
+fi
+
+echo "SUMMARY"
+echo "─────────────────────────────────────"
+printf "  %-30s %d\n" "Test functions" "$TEST_COUNT"
+printf "  %-30s %d\n" "Total assertions" "$ASSERT_COUNT"
+printf "  %-30s %s\n" "Assertion density" "$DENSITY per test"
+printf "  %-30s %d\n" "Bias patterns found" "$TOTAL_BIAS"
+printf "  %-30s %s\n" "Per-100-tests rate" "$RATE"
+echo
+
+# Grade
+if [ "$(echo "$RATE <= 5" | bc)" -eq 1 ]; then
+  echo "  Grade: LOW — no action needed"
+elif [ "$(echo "$RATE <= 15" | bc)" -eq 1 ]; then
+  echo "  Grade: MODERATE — review flagged tests"
+elif [ "$(echo "$RATE <= 30" | bc)" -eq 1 ]; then
+  echo "  Grade: HIGH — systematic remediation needed"
+else
+  echo "  Grade: CRITICAL — full rewrite of flagged tests"
+fi

Evidence
The script pipes computations and comparisons to bc multiple times, and set -euo pipefail means
missing bc terminates the script immediately.
.audit-harness/scripts/bias-count.sh[8-9]
.audit-harness/scripts/bias-count.sh[55-56]
.audit-harness/scripts/bias-count.sh[62-64]
.audit-harness/scripts/bias-count.sh[78-85]

Agent prompt

The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`bias-count.sh` assumes `bc` is installed; if it is missing, the script aborts due to `set -e`.

## Issue Context
To keep the harness portable/deterministic, either check `command -v bc` early and exit with a clear message/code, or use `awk` for division/comparison.

## Fix Focus Areas
- .audit-harness/scripts/bias-count.sh[8-9]
- .audit-harness/scripts/bias-count.sh[54-66]
- .audit-harness/scripts/bias-count.sh[77-86]
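
For reference, the arithmetic that `bc` performs here is small: a per-100-tests rate and a four-band grade. The sketch below restates it in Python using the thresholds from the quoted script (5, 15, 30). The function names are hypothetical, and the rate is left unrounded, whereas the script formats it to one decimal place.

```python
def bias_rate(total_bias: int, test_count: int) -> float:
    """Bias patterns per 100 test functions; 0 when there are no tests."""
    if test_count <= 0:
        return 0.0
    return total_bias * 100 / test_count


def grade(rate: float) -> str:
    """Band the rate using the thresholds from the quoted script."""
    if rate <= 5:
        return "LOW"
    if rate <= 15:
        return "MODERATE"
    if rate <= 30:
        return "HIGH"
    return "CRITICAL"


if __name__ == "__main__":
    rate = bias_rate(total_bias=3, test_count=60)
    print(rate, grade(rate))
```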


gemini-code-assist

Code Review

This pull request introduces the @intentsolutions/audit-harness toolkit, which provides a suite of scripts for deterministic test enforcement, architecture validation, and AI escape detection. Key features include a multi-language CRAP scorer, a test bias pattern counter, and a SHA-256 manifest system for pinning critical policy files. Feedback focuses on improving the portability of the shell scripts by replacing the bc dependency with awk, adding fallback support for macOS hashing utilities, and ensuring proper cleanup of temporary files created during diff scanning.

gemini-code-assist · 2026-05-02T00:07:30Z

+  DENSITY=$(echo "scale=2; $ASSERT_COUNT / $TEST_COUNT" | bc)
+else
+  DENSITY="0"
+fi
+
+# Per-100 bias rate
+if [ "$TEST_COUNT" -gt 0 ]; then
+  RATE=$(echo "scale=1; $TOTAL_BIAS * 100 / $TEST_COUNT" | bc)


The script depends on bc for floating-point arithmetic, which is often not installed in minimal CI environments (e.g., Alpine-based images). Using awk is more portable as it is a standard POSIX utility usually present in environments where bash is available.

Suggested change

-  DENSITY=$(echo "scale=2; $ASSERT_COUNT / $TEST_COUNT" | bc)
+  DENSITY=$(awk "BEGIN {printf \"%.2f\", $ASSERT_COUNT / $TEST_COUNT}")
 else
   DENSITY="0"
 fi

 # Per-100 bias rate
 if [ "$TEST_COUNT" -gt 0 ]; then
-  RATE=$(echo "scale=1; $TOTAL_BIAS * 100 / $TEST_COUNT" | bc)
+  RATE=$(awk "BEGIN {printf \"%.1f\", $TOTAL_BIAS * 100 / $TEST_COUNT}")

gemini-code-assist · 2026-05-02T00:07:30Z

+if [ "$(echo "$RATE <= 5" | bc)" -eq 1 ]; then
+  echo "  Grade: LOW — no action needed"
+elif [ "$(echo "$RATE <= 15" | bc)" -eq 1 ]; then
+  echo "  Grade: MODERATE — review flagged tests"
+elif [ "$(echo "$RATE <= 30" | bc)" -eq 1 ]; then
+  echo "  Grade: HIGH — systematic remediation needed"
+else
+  echo "  Grade: CRITICAL — full rewrite of flagged tests"
+fi


Continuing the removal of the bc dependency for better portability, the grading logic can also be implemented using awk.

Suggested change

-if [ "$(echo "$RATE <= 5" | bc)" -eq 1 ]; then
+if awk "BEGIN {exit !($RATE <= 5)}"; then
   echo "  Grade: LOW — no action needed"
-elif [ "$(echo "$RATE <= 15" | bc)" -eq 1 ]; then
+elif awk "BEGIN {exit !($RATE <= 15)}"; then
   echo "  Grade: MODERATE — review flagged tests"
-elif [ "$(echo "$RATE <= 30" | bc)" -eq 1 ]; then
+elif awk "BEGIN {exit !($RATE <= 30)}"; then
   echo "  Grade: HIGH — systematic remediation needed"
 else
   echo "  Grade: CRITICAL — full rewrite of flagged tests"
 fi

gemini-code-assist · 2026-05-02T00:07:30Z

+  --staged) DIFF_SRC=$(mktemp); git diff --cached > "$DIFF_SRC" ;;
+  --range) DIFF_SRC=$(mktemp); git diff "$2" > "$DIFF_SRC"; shift ;;


Temporary files created with mktemp are not cleaned up. Adding a trap ensures these files are removed when the script exits, preventing clutter in the temporary directory.

Suggested change

-  --staged) DIFF_SRC=$(mktemp); git diff --cached > "$DIFF_SRC" ;;
-  --range) DIFF_SRC=$(mktemp); git diff "$2" > "$DIFF_SRC"; shift ;;
+  --staged) DIFF_SRC=$(mktemp); trap 'rm -f "$DIFF_SRC"' EXIT; git diff --cached > "$DIFF_SRC" ;;
+  --range) DIFF_SRC=$(mktemp); trap 'rm -f "$DIFF_SRC"' EXIT; git diff "$2" > "$DIFF_SRC"; shift ;;

gemini-code-assist · 2026-05-02T00:07:30Z

+    return 0
+  fi
+  while IFS= read -r f; do
+    printf '%s  %s\n' "$(sha256sum "$f" | awk '{print $1}')" "$f"


sha256sum is not available by default on macOS (which uses shasum -a 256). Using a fallback mechanism makes the hashing script portable across Linux and macOS environments.

Suggested change

-    printf '%s  %s\n' "$(sha256sum "$f" | awk '{print $1}')" "$f"
+    printf '%s  %s\n' "$( (sha256sum "$f" 2>/dev/null || shasum -a 256 "$f") | awk '{print $1}')" "$f"

qodo-code-review · 2026-05-02T00:10:00Z

+This repo currently uses environment variables (Firebase Hosting + GCP service account credentials, Supabase keys, Nightfall API key, Inngest signing key). Per `~/.claude/CLAUDE.md` § "SOPS + age secrets standard", this repo should adopt the canonical 4-file pattern:
+
+```bash
+cd ~/000-projects/executive-intent
+sops-init
+```
+
+Tracked separately under VPS-as-the-home Priority 6 (`OPS-z9b`).
+
+## Testing baseline (2026-05-01 — Intent Solutions Testing SOP)
+
+This repo participates in the **Intent Solutions Testing SOP** per `~/.claude/CLAUDE.md` § "Intent Solutions Testing SOP" and the VPS-as-the-home program (`OPS-5nm`, Priority 6).
+
+**Installed**: `@intentsolutions/audit-harness v0.1.0` vendored at `.audit-harness/` with wrapper at `scripts/audit-harness`. Hash-pinning + escape-scan ride along the in-repo install — never reference `~/.claude/` paths from hooks or CI.
+


1. ~/.claude paths referenced 📘 Rule violation ☼ Reliability

New docs reference developer-local ~/.claude/ paths, which breaks the in-repo/reproducible audit-harness baseline requirement. This violates the rule that repo guidance should not depend on ~/.claude-local assets.

Agent Prompt

## Issue description
Repo docs currently reference developer-local `~/.claude/` paths, which violates the requirement that the audit harness baseline be in-repo and not depend on `~/.claude`.

## Issue Context
Compliance requires hooks/CI and repo guidance to avoid `~/.claude` path dependencies and instead point to the vendored `.audit-harness/` and `scripts/audit-harness` wrapper (or to repo-contained documentation).

## Fix Focus Areas
- CLAUDE.md[47-61]
- 000-docs/009-OD-SOPS-audit-harness-baseline-2026-05-01.md[38-40]


qodo-code-review · 2026-05-02T00:10:00Z

+def score_go(root: Path, kind: str) -> list[MethodScore]:
+    if which_or_none("gocyclo") is None:
+        print("[crap-score] gocyclo not installed", file=sys.stderr)
+        return []
+
+    rc, out, _ = run(["gocyclo", "-ignore", "_test.go" if kind == "src" else ".*\\.go$", "."], root)
+    complexity: list[tuple[str, str, int]] = []
+    for line in out.splitlines():
+        parts = line.strip().split()
+        if len(parts) < 4:
+            continue
+        try:
+            c = int(parts[0])
+        except ValueError:
+            continue
+        pkg = parts[1]
+        func = parts[2]
+        fpath = parts[3].split(":", 1)[0]
+        include = fpath.endswith("_test.go") if kind == "test" else not fpath.endswith("_test.go")
+        if include:
+            complexity.append((fpath, f"{pkg}.{func}", c))


2. Go test CRAP ignored 🐞 Bug ≡ Correctness

score_go() passes -ignore '.*\.go$' when scoring tests, which makes gocyclo ignore all Go files and return no complexity output. Test-method CRAP scores (and thus test CRAP threshold enforcement) never run.

Agent Prompt

## Issue description
`score_go()` uses `gocyclo -ignore '.*\\.go$'` when `kind == "test"`, which effectively ignores every Go source file (including `_test.go`). This produces no complexity output, so the test CRAP report is always empty.

## Issue Context
The function already has logic to include/exclude `_test.go` based on `kind`, so `-ignore` does not need to try to pre-filter test vs src and should not accidentally exclude everything.

## Fix Focus Areas
- .audit-harness/scripts/crap-score.py[150-171]


qodo-code-review · 2026-05-02T00:10:00Z

+    src_scores = [s for s in all_scores if s.kind == "src"]
+    test_scores = [s for s in all_scores if s.kind == "test"]
+    prod_max = max((s.crap for s in src_scores), default=0.0)
+    test_max = max((s.crap for s in test_scores), default=0.0)
+    prod_avg = (sum(s.crap for s in src_scores) / len(src_scores)) if src_scores else 0.0
+
+    prod_blockers = [asdict(s) for s in src_scores if s.crap > args.threshold_prod]
+    test_blockers = [asdict(s) for s in test_scores if s.crap > args.threshold_test]
+    avg_fail = prod_avg > args.threshold_avg
+
+    pass_ = not (prod_blockers or test_blockers or avg_fail)
+
+    summary = {
+        "language": lang,
+        "thresholds": {
+            "production_max": args.threshold_prod,
+            "test_max": args.threshold_test,
+            "project_avg_max": args.threshold_avg,
+        },
+        "production": {
+            "methods_scored": len(src_scores),
+            "max_crap": round(prod_max, 2),
+            "avg_crap": round(prod_avg, 2),
+            "blockers": prod_blockers,
+        },
+        "test": {
+            "methods_scored": len(test_scores),
+            "max_crap": round(test_max, 2),
+            "blockers": test_blockers,
+        },
+        "pass": pass_,
+    }
+
+    if args.format in ("json", "both"):
+        (out_dir / "summary.json").write_text(json.dumps(summary, indent=2))
+
+    print(json.dumps({"pass": pass_, "summary_path": str(out_dir / "summary.json")}))
+    return 0 if pass_ else 1


3. CRAP tool missing false-pass 🐞 Bug ☼ Reliability

When required analyzers (radon/gocyclo/complexity-report) are missing, the scorer returns an empty score set, but main() still reports pass=true and exits 0 because it only checks for blockers/avg-fail. This can make gates succeed even though no CRAP analysis ran.

Agent Prompt

## Issue description
`crap-score.py` returns exit code 0 (pass) when no scoring occurs due to missing analyzers, because an empty score list produces no blockers and no avg-fail.

## Issue Context
This is especially likely for JS/TS in this repo because `complexity-report` is not listed in `devDependencies`. A deterministic quality gate should not silently pass when it cannot execute.

## Fix Focus Areas
- .audit-harness/scripts/crap-score.py[99-102]
- .audit-harness/scripts/crap-score.py[151-153]
- .audit-harness/scripts/crap-score.py[203-207]
- .audit-harness/scripts/crap-score.py[344-381]
- package.json[1-1]
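A minimal sketch of the guard this prompt asks for, assuming the gate should fail closed when nothing was scored. The `Score` fields mirror the names used in the diff (`kind`, `crap`). The `evaluate_gate` helper, the default thresholds, and the separate exit code 2 are illustrative assumptions, not the harness's actual interface.

```python
from dataclasses import dataclass


@dataclass
class Score:
    name: str
    kind: str   # "src" or "test", as in the reviewed script
    crap: float


def evaluate_gate(scores, threshold_prod=30.0, threshold_test=15.0):
    """Return (summary, exit_code); an empty score set is a failure, not a pass."""
    if not scores:
        # Hypothetical exit code 2: the gate could not run at all,
        # kept distinct from 1 (blockers found).
        return {"pass": False, "reason": "no methods scored"}, 2
    limits = {"src": threshold_prod, "test": threshold_test}
    blockers = [s.name for s in scores
                if s.crap > limits.get(s.kind, threshold_prod)]
    passed = not blockers
    return {"pass": passed, "blockers": blockers}, (0 if passed else 1)


empty_summary, empty_code = evaluate_gate([])
ok_summary, ok_code = evaluate_gate([Score("parse", "src", 5.0)])
bad_summary, bad_code = evaluate_gate([Score("parse", "src", 99.0)])
```

Distinguishing "could not run" from "ran and found blockers" lets CI report a missing analyzer separately from a real quality failure.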

qodo-code-review · 2026-05-02T00:10:00Z

+WARN_COUNT=0
+ERROR_COUNT=0
+
+warn() { echo "WARN  $1:$2 $3"; WARN_COUNT=$((WARN_COUNT + 1)); }
+err()  { echo "ERROR $1:$2 $3"; ERROR_COUNT=$((ERROR_COUNT + 1)); }
+
+# 1. Prefer official gherkin-lint if available
+if command -v gherkin-lint >/dev/null 2>&1; then
+  echo "gherkin-lint: using installed linter"
+  if ! gherkin-lint "$PATH_ARG"; then
+    ERROR_COUNT=1
+  fi
+else
+  echo "gherkin-lint: falling back to awk rubric (install gherkin-lint for full rules)"
+
+  while IFS= read -r -d '' feature; do
+    # Imperative verbs / CSS selectors in steps (declarative warning)
+    awk -v file="$feature" '
+      /^[[:space:]]*(Given|When|Then|And|But)/ {
+        line = $0
+        if (line ~ /click|type|fill[ _]in|press|select.*from[ _]dropdown/) {
+          printf "WARN  %s:%d imperative verb in step (prefer declarative)\n", file, NR
+        }
+        if (line ~ /#[a-zA-Z][-a-zA-Z0-9_]*|\.[a-zA-Z][-a-zA-Z0-9_]*[[:space:]]|xpath/) {
+          printf "WARN  %s:%d CSS selector / xpath in step (prefer business language)\n", file, NR
+        }
+      }
+    ' "$feature"
+
+    # Scenario length (> 10 steps)
+    awk -v file="$feature" '
+      /^[[:space:]]*Scenario/ { sc = NR; steps = 0; sn = $0; next }
+      /^[[:space:]]*(Given|When|Then|And|But)/ { if (sc) steps++ }
+      /^[[:space:]]*Scenario|^[[:space:]]*Feature|^$/ {
+        if (sc && steps > 10) {
+          printf "WARN  %s:%d scenario has %d steps (>10 is too long)\n", file, sc, steps
+        }
+        if (NR != sc) { sc = 0; steps = 0 }
+      }
+      END {
+        if (sc && steps > 10) {
+          printf "WARN  %s:%d scenario has %d steps (>10 is too long)\n", file, sc, steps
+        }
+      }
+    ' "$feature"
+
+    # Repeated Givens without Background (3+ identical Given lines)
+    dupe=$(awk '/^[[:space:]]*Given/ { print }' "$feature" | sort | uniq -c | awk '$1 >= 3 { print }')
+    if [[ -n "$dupe" ]] && ! grep -q "^[[:space:]]*Background:" "$feature"; then
+      warn "$feature" 0 "repeated Given lines without Background block"
+    fi
+
+    # "And" at scenario start (grammar error)
+    awk -v file="$feature" '
+      prev_blank = 1
+      /^[[:space:]]*$/ { prev_blank = 1; next }
+      /^[[:space:]]*Scenario/ { in_scenario = 1; step_count = 0; next }
+      /^[[:space:]]*(Given|When|Then|And|But)/ {
+        if (in_scenario && step_count == 0 && /^[[:space:]]*And/) {
+          printf "ERROR %s:%d scenario starts with And (use Given/When/Then)\n", file, NR
+        }
+        step_count++
+      }
+    ' "$feature"
+
+  done < <(find "$PATH_ARG" -name "*.feature" -print0)
+fi
+
+echo ""
+echo "gherkin-lint summary: $WARN_COUNT warning(s), $ERROR_COUNT error(s)"
+
+if [[ "$ERROR_COUNT" -gt 0 ]]; then
+  exit 1
+fi
+if [[ "$STRICT" -eq 1 && "$WARN_COUNT" -gt 0 ]]; then
+  exit 1
+fi
+exit 0


4. Strict gherkin-lint never fails 🐞 Bug ≡ Correctness

In the awk fallback, most WARN/ERROR lines are printed directly by awk and do not increment WARN_COUNT/ERROR_COUNT, so --strict and error handling often still exit 0. This defeats enforcement when gherkin-lint isn’t installed.

Agent Prompt

## Issue description
When falling back to awk, WARN/ERROR output is not reflected in `WARN_COUNT`/`ERROR_COUNT`, so `--strict` and error exits do not reliably trigger.

## Issue Context
`warn()`/`err()` update counters, but the awk blocks currently `printf` directly. The exit behavior must be based on the same signals being printed.

## Fix Focus Areas
- .audit-harness/scripts/gherkin-lint.sh[34-38]
- .audit-harness/scripts/gherkin-lint.sh[49-100]
- .audit-harness/scripts/gherkin-lint.sh[102-110]
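The rule "exit behavior must be based on the same signals being printed" can be illustrated as follows. The script under review is Bash; this Python sketch only shows the counting rule, and the helper name `exit_code_from_lint_output` is hypothetical. It assumes the `WARN  file:line ...` and `ERROR file:line ...` prefixes seen in the diff.

```python
def exit_code_from_lint_output(lines, strict=False):
    """Derive the exit code from the emitted lines themselves."""
    # Count exactly what was printed, so output and exit status cannot drift apart.
    warn_count = sum(1 for line in lines if line.startswith("WARN "))
    error_count = sum(1 for line in lines if line.startswith("ERROR "))
    if error_count > 0:
        return 1
    if strict and warn_count > 0:
        return 1
    return 0


emitted = [
    "WARN  login.feature:7 imperative verb in step (prefer declarative)",
    "WARN  login.feature:0 repeated Given lines without Background block",
]
lenient = exit_code_from_lint_output(emitted)
strict = exit_code_from_lint_output(emitted, strict=True)
with_error = exit_code_from_lint_output(
    emitted + ["ERROR login.feature:3 scenario starts with And (use Given/When/Then)"]
)
```

In the shell script, the equivalent would be to capture the awk output and count its `WARN`/`ERROR` lines into `WARN_COUNT`/`ERROR_COUNT` before the final exit checks.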

jeremylongshore merged commit 281f135 into main May 2, 2026

jeremylongshore deleted the feat/install-audit-harness-baseline branch May 2, 2026 00:06

greptile-apps Bot reviewed May 2, 2026

gemini-code-assist Bot reviewed May 2, 2026

qodo-code-review Bot reviewed May 2, 2026

		--staged) DIFF_SRC=$(mktemp); git diff --cached > "$DIFF_SRC" ;;
		--range) DIFF_SRC=$(mktemp); git diff "$2" > "$DIFF_SRC"; shift ;;

	printf '%s %s\n' "$(sha256sum "$f" | awk '{print $1}')" "$f"
	printf '%s %s\n' "$( (sha256sum "$f" 2>/dev/null || shasum -a 256 "$f") | awk '{print $1}')" "$f"
