From 070378ef40ca1dbe35d98a7affd71fae63d8b74b Mon Sep 17 00:00:00 2001
From: Dean Sharon <deanshrn@gmain.com>
Date: Sat, 21 Mar 2026 01:52:24 +0200
Subject: [PATCH] feat: improve agent quality gates (#120, #121, #124)

Simplifier: replace vague clarity bullets with 8 structured slop
detection categories (language-behavior tests, redundant type checks,
over-defensive handling, debug remnants, commented-out code, unused
imports, verbose names, unnecessary intermediates).

Scrutinizer: add stub detection responsibility between P0 and P1
evaluation, backed by new stub-detection.md reference covering
component stubs, API/service stubs, hook/effect stubs, and wiring gaps.

Shepherd: add goal-backward verification (trace from user goals to
implementation), artifact depth table (Exists/Substantive/Wired), stub
misalignment type, and re-verification section for retry runs.
---
 CHANGELOG.md                                  |   5 +
 shared/agents/scrutinizer.md                  |  10 +-
 shared/agents/shepherd.md                     |  38 +++--
 shared/agents/simplifier.md                   |  39 +++--
 .../self-review/references/stub-detection.md  | 135 ++++++++++++++++++
 5 files changed, 196 insertions(+), 31 deletions(-)
 create mode 100644 shared/skills/self-review/references/stub-detection.md

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 7807c732..11f59ec4 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Added
+- **Simplifier agent** — 8 structured slop detection categories (#120)
+- **Scrutinizer agent** — stub detection patterns with reference file (#121)
+- **Shepherd agent** — goal-backward verification, artifact depth checking, stub type, re-verification (#124)
+
 ---
 
 ## [1.7.0] - 2026-03-20
diff --git a/shared/agents/scrutinizer.md b/shared/agents/scrutinizer.md
index dcdaeb2a..1e26154a 100644
--- a/shared/agents/scrutinizer.md
+++ b/shared/agents/scrutinizer.md
@@ -21,13 +21,15 @@ You receive from orchestrator:
 
 2. **Evaluate P0 pillars** (Design, Functionality, Security): These MUST pass. Fix all issues found.
 
-3. **Evaluate P1 pillars** (Complexity, Error Handling, Tests): These SHOULD pass. Fix all issues found.
+3. **Detect stubs and wiring gaps**: Check for placeholder implementations that compile but don't deliver real functionality. See `references/stub-detection.md` for patterns. Flag as P0-Functionality issues.
 
-4. **Evaluate P2 pillars** (Naming, Consistency, Documentation): Report as suggestions. Fix if straightforward.
+4. **Evaluate P1 pillars** (Complexity, Error Handling, Tests): These SHOULD pass. Fix all issues found.
 
-5. **Commit fixes**: If any changes were made, create a commit with message "fix: address self-review issues".
+5. **Evaluate P2 pillars** (Naming, Consistency, Documentation): Report as suggestions. Fix if straightforward.
 
-6. **Report status**: Return structured report with pillar evaluations and changes made.
+6. **Commit fixes**: If any changes were made, create a commit with message "fix: address self-review issues".
+
+7. **Report status**: Return structured report with pillar evaluations and changes made.
 
 ## Principles
 
diff --git a/shared/agents/shepherd.md b/shared/agents/shepherd.md
index e79ea280..936daa04 100644
--- a/shared/agents/shepherd.md
+++ b/shared/agents/shepherd.md
@@ -21,9 +21,19 @@ You receive from orchestrator:
 
 1. **Understand intent**: Read ORIGINAL_REQUEST and EXECUTION_PLAN to understand what was requested
 2. **Review implementation**: Read FILES_CHANGED to understand what was built
-3. **Check completeness**: Verify all plan steps implemented, all acceptance criteria met
-4. **Check scope**: Identify out-of-scope additions not justified by design improvements
-5. **Report misalignments**: Document issues with sufficient detail for Coder to fix
+3. **Goal-backward verification**: Start from the user's observable goals. For each goal: trace backward through the implementation — is it wired into the running app → does it contain substantive logic → does the file/function exist? Report any goal failing at any depth.
+4. **Check artifact depth**: Classify each deliverable using this scale:
+
+   | Depth | Meaning | Example |
+   |-------|---------|---------|
+   | Exists | File/function created | Route file exists |
+   | Substantive | Contains real logic | Route has validation + DB call |
+   | Wired | Connected to running app | Route registered, imported, reachable |
+
+   Flag anything at "Exists" without reaching "Wired" as `incomplete`.
+5. **Check completeness**: Verify all plan steps implemented, all acceptance criteria met
+6. **Check scope**: Identify out-of-scope additions not justified by design improvements
+7. **Report misalignments**: Document issues with sufficient detail for Coder to fix
 
 ## Principles
 
@@ -51,6 +61,11 @@ Return structured alignment status:
 - Implementation solves: {1-sentence summary}
 - Alignment: aligned | drifted
 
+### Artifact Depth
+| Deliverable | Exists | Substantive | Wired | Status |
+|-------------|--------|-------------|-------|--------|
+| {feature}   | Y/N    | Y/N         | Y/N   | complete/incomplete/stub |
+
 ### Misalignments Found (if MISALIGNED)
 
 | Type | Description | Files | Suggested Fix |
@@ -59,20 +74,17 @@ Return structured alignment status:
 | scope_creep | {what's out of scope} | {file paths} | {remove or justify} |
 | incomplete | {what's partially done} | {file paths} | {what remains} |
 | intent_drift | {how intent drifted} | {file paths} | {how to realign} |
+| stub | {placeholder, not real logic} | {file paths} | {what real implementation needs} |
 
 ### Scope Check
 - Out-of-scope additions: {list or "None"}
 - Justification: {if additions found, are they justified design improvements?}
-```
-
-## Misalignment Types
 
-| Type | Description | Example |
-|------|-------------|---------|
-| `missing` | Functionality in plan not implemented | "Login validation not implemented" |
-| `scope_creep` | Added functionality not in plan | "Analytics tracking added but not requested" |
-| `incomplete` | Partially implemented functionality | "Error handling added but no user-facing messages" |
-| `intent_drift` | Implementation solves different problem | "Built password reset instead of login flow" |
+### Re-verification (if applicable)
+| Previously Failed | Status | Notes |
+|-------------------|--------|-------|
+| {item} | RESOLVED/STILL_FAILING | {details} |
+```
 
 ## Boundaries
 
@@ -81,12 +93,14 @@ Return structured alignment status:
 - Out-of-scope additions not justified by design
 - Partial implementations
 - Intent drift
+- Stubs or placeholders passing as real implementations
 
 **Report as ALIGNED:**
 - All plan steps implemented
 - All acceptance criteria met
 - No unjustified scope additions
 - Implementation matches original intent
+- All deliverables reach "Wired" depth
 
 **Never:**
 - Modify code or create commits
diff --git a/shared/agents/simplifier.md b/shared/agents/simplifier.md
index 4d93fa43..b6a71e07 100644
--- a/shared/agents/simplifier.md
+++ b/shared/agents/simplifier.md
@@ -30,31 +30,39 @@ Analyze recently modified code and apply refinements that:
    - Use proper error handling patterns (avoid try/catch when possible)
    - Maintain consistent naming conventions
 
-3. **Enhance Clarity**: Simplify code structure by:
-
-   - Reducing unnecessary complexity and nesting
-   - Eliminating redundant code and abstractions
-   - Improving readability through clear variable and function names
+3. **Remove Slop**: Detect and remove these categories:
+
+   | Category | Pattern |
+   |----------|---------|
+   | Language-behavior tests | Tests verifying built-in language features work as documented |
+   | Redundant type checks | Runtime checks for types TypeScript already enforces |
+   | Over-defensive handling | try/catch around code that cannot throw |
+   | Debug remnants | console.log, debugger, alert() left behind |
+   | Commented-out code | Dead code preserved in comments |
+   | Unused imports | Imports not referenced anywhere in file |
+   | Verbose names | Unnecessarily long names (`currentUserDataObject` → `user`) |
+   | Unnecessary intermediates | Variables used once, immediately after assignment |
+
+4. **Enhance Clarity**: Simplify code structure by:
+
+   - Reducing unnecessary nesting (early returns, guard clauses)
    - Consolidating related logic
-   - Removing unnecessary comments that describe obvious code
-   - IMPORTANT: Avoid nested ternary operators - prefer switch statements or if/else chains for multiple conditions
-   - Choose clarity over brevity - explicit code is often better than overly compact code
+   - Avoiding nested ternary operators — prefer switch or if/else
+   - Choosing clarity over brevity — explicit code beats compact code
 
-4. **Maintain Balance**: Avoid over-simplification that could:
+5. **Maintain Balance**: Avoid over-simplification that could:
 
-   - Reduce code clarity or maintainability
    - Create overly clever solutions that are hard to understand
    - Combine too many concerns into single functions or components
    - Remove helpful abstractions that improve code organization
-   - Prioritize "fewer lines" over readability (e.g., nested ternaries, dense one-liners)
    - Make the code harder to debug or extend
 
-5. **Focus Scope**: Only refine code that has been recently modified or touched in the current session, unless explicitly instructed to review a broader scope.
+6. **Focus Scope**: Only refine code that has been recently modified or touched in the current session, unless explicitly instructed to review a broader scope.
 
 Your refinement process:
 
 1. Identify the recently modified code sections
-2. Analyze for opportunities to improve elegance and consistency
+2. Analyze for slop categories and clarity improvements
 3. Apply project-specific best practices and coding standards
 4. Ensure all functionality remains unchanged
 5. Verify the refined code is simpler and more maintainable
@@ -87,7 +95,8 @@ Return structured completion status:
 - Files outside the recently modified scope (unless instructed)
 
 **Handle autonomously:**
-- Naming improvements, dead code removal, nesting reduction
+- Slop removal (all 8 categories)
+- Naming improvements, nesting reduction
 - Import sorting and organization
 - Redundant abstraction elimination
-- Comment cleanup (remove obvious, keep non-obvious)
\ No newline at end of file
+- Comment cleanup (remove obvious, keep non-obvious)
diff --git a/shared/skills/self-review/references/stub-detection.md b/shared/skills/self-review/references/stub-detection.md
new file mode 100644
index 00000000..47e21508
--- /dev/null
+++ b/shared/skills/self-review/references/stub-detection.md
@@ -0,0 +1,135 @@
+# Stub Detection Patterns
+
+Placeholder implementations that compile but don't deliver real functionality. Flag as **P0-Functionality** issues.
+
+Cross-reference: `core-patterns/references/code-smell-violations.md` covers hardcoded data and fake functionality labeling. This file focuses on structural stub patterns.
+
+---
+
+## 1. Component Stubs
+
+Render nothing meaningful or return placeholder markup.
+
+```tsx
+// STUB — returns static text, no real rendering
+function UserProfile({ userId }: Props) {
+  return <div>Name</div>;
+}
+
+// REAL — fetches and renders actual data
+function UserProfile({ userId }: Props) {
+  const user = useUser(userId);
+  if (!user) return <Skeleton />;
+  return <div><h2>{user.name}</h2><p>{user.email}</p></div>;
+}
+```
+
+**Patterns to flag:**
+- `return null` / `return <></>` in components that should render content
+- Empty function bodies (`{}`) for handlers or lifecycle methods
+- Components returning only hardcoded strings with no data binding
+
+---
+
+## 2. API / Service Stubs
+
+Functions that exist in signature but throw, return hardcoded values, or do nothing.
+
+```typescript
+// STUB — throws instead of implementing
+async function createOrder(items: CartItem[]): Promise<Order> {
+  throw new Error("TODO: implement");
+}
+
+// STUB — hardcoded return, no real logic
+async function getUser(id: string): Promise<User> {
+  return { id, name: "Test User", email: "test@test.com" };
+}
+
+// REAL — actual implementation
+async function createOrder(items: CartItem[]): Promise<Result<Order, OrderError>> {
+  const validated = validateItems(items);
+  if (!validated.ok) return validated;
+  const order = await db.orders.create({ items: validated.value });
+  await queue.publish("order.created", order);
+  return { ok: true, value: order };
+}
+```
+
+**Patterns to flag:**
+- `throw new Error("TODO")` / `throw new Error("Not implemented")`
+- Functions returning hardcoded objects (no DB/API/computation)
+- `"Not implemented"` strings in response bodies
+- Empty async functions (`async function foo() {}`)
+
+---
+
+## 3. Hook / Effect Stubs
+
+State and effects declared but not wired to behavior.
+
+```typescript
+// STUB — effect does nothing
+useEffect(() => {}, [userId]);
+
+// STUB — state declared, setter never called
+const [items, setItems] = useState<Item[]>([]);
+// ... setItems never appears in the component
+
+// STUB — custom hook returns static value
+function usePermissions(): Permissions {
+  return { canEdit: true, canDelete: false };
+}
+
+// REAL — custom hook with actual logic
+function usePermissions(): Permissions {
+  const { user } = useAuth();
+  const { data } = useQuery(["permissions", user.role], fetchPermissions);
+  return data ?? DEFAULT_PERMISSIONS;
+}
+```
+
+**Patterns to flag:**
+- `useEffect(() => {}, [...])` — empty effect body
+- `useState` where the setter is never called in the component
+- Custom hooks returning static/hardcoded values
+- `useMemo`/`useCallback` wrapping static values
+
+---
+
+## 4. Wiring Gaps
+
+Individual pieces exist but aren't connected to the running application. **Highest-value detection** — these pass compilation and individual tests but the feature doesn't work end-to-end.
+
+```typescript
+// GAP — fetch without await (result discarded)
+function loadDashboard() {
+  fetchMetrics(); // Promise floats, never awaited
+  return <Dashboard />;
+}
+
+// GAP — state declared but never rendered
+const [error, setError] = useState<string | null>(null);
+// ... error never appears in JSX
+
+// GAP — handler defined but not bound
+function handleSubmit(data: FormData) { /* real logic */ }
+// ... <form onSubmit={handleSubmit}> never appears
+
+// GAP — route defined with no-op handler
+app.post("/api/orders", (_req, res) => {
+  res.status(200).json({ ok: true });
+});
+
+// GAP — env var read but unused
+const API_KEY = process.env.STRIPE_API_KEY;
+// ... API_KEY never passed to any client or fetch call
+```
+
+**Patterns to flag:**
+- `fetch`/`axios`/API call without `await` or `.then` (result discarded)
+- State variable (`useState`, `useRef`) never rendered or read in output
+- Event handler defined but not bound to any element/listener
+- Route/endpoint registered with empty or no-op handler
+- Environment variable or config read but never used downstream
+- Import used only in type position but imported as value (in non-type-only import)