Bug Description
Claude Code does not proactively identify or surface its own mistakes during task execution. Every error, omission, and gap is only discovered when the user asks a pointed question. Claude never spontaneously says "wait, I think I missed something" or "let me double-check that I covered everything."
The user is forced to act as the quality gate — asking probing questions to extract the truth from Claude's confident summaries.
Concrete Example: 5 Prompts to Surface 4 Mistakes
In a single session (applying 7 SQL files + QA), the user had to ask 5 sequential probing questions to surface 4 distinct mistakes:
Prompt 1: "Would you do anything differently?"
- What Claude focused on: A false concern about File 6's INSERT vs INSERT IGNORE (which was actually fine)
- What Claude missed: The _00_ file sitting on disk, referenced in the coordination document Claude had already read, that needed to be applied first
Prompt 2: "what has likely gone wrong?"
- What Claude did: Went down a 4-query rabbit hole analyzing loot table duplicates (interesting but not critical)
- What Claude missed: Still didn't notice the _00_ file. Didn't check DBErrors.log. Didn't notice it had skipped session_state.md updates.
Prompt 3: "did you not apply the SQL updates?"
- What Claude finally found: The _00_ file — which was mentioned in the coordination document it read at session start
- Self-audit gap: Claude had to be told exactly what category of thing it missed before it could find it
Prompt 4: "QA/QC everything you did and make sure nothing was forgotten"
Prompt 5: "what else have you forgotten to do"
- What Claude finally found: It never updated session_state.md — a MANDATORY rule in the project's CLAUDE.md that it had read at session start
The Pattern
Each prompt peeled back one layer. Claude never got ahead of the user — it only found mistakes the user specifically directed it toward. At no point did Claude:
- Spontaneously re-read the procedure document to check for missed steps
- Compare its completed actions against the source instructions
- Say "I'm not sure I covered everything — let me verify"
- Preemptively audit its own work before the user asked
Why This Is Distinct From Other Issues
The existing issues (#32281-#32296) cover specific failure modes:
- What Claude does wrong (incorrect code, skipped steps, false claims)
- How Claude verifies (tautological queries, no per-step gates)
- How Claude reports (confident summaries without evidence)
This issue covers the meta-failure: Claude's inability to catch any of those problems without external prompting. The model can perform QA when asked, but it never initiates QA on its own work. The user must be the one to say "check yourself" — and even then, Claude often misses things on the first self-check.
Expected Behavior
After completing a multi-step task, Claude should automatically:
- Re-read the source instructions/procedure document
- Enumerate each required step
- For each step, verify: was it executed? What was the output? Does the output indicate success?
- Surface any gaps BEFORE writing a completion summary
- If gaps are found, either execute the missing steps or explicitly flag them
This is the difference between "I'm done" and "I've verified I'm done."
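The expected self-audit loop above can be sketched in a few lines of Python. This is a minimal illustration only: the Step fields and the audit helper are hypothetical, not part of any real Claude Code API, and the step names are reconstructed from this report's example session.

```python
from dataclasses import dataclass

@dataclass
class Step:
    name: str          # step as written in the procedure document
    executed: bool     # was a corresponding action actually taken?
    output_ok: bool    # did the recorded output indicate success?

def audit(steps):
    """Return every step that was skipped or whose output did not
    indicate success; an empty list means the task may be summarized."""
    return [s for s in steps if not (s.executed and s.output_ok)]

# Hypothetical checklist re-derived from the coordination document,
# mirroring the mistakes surfaced in the example session.
steps = [
    Step("apply _00_ prerequisite file", executed=False, output_ok=False),
    Step("apply SQL files 1-7",          executed=True,  output_ok=True),
    Step("check DBErrors.log",           executed=False, output_ok=False),
    Step("update session_state.md",      executed=False, output_ok=False),
]

gaps = audit(steps)
if gaps:
    # Surface gaps BEFORE writing a completion summary.
    for s in gaps:
        print(f"GAP: {s.name}")
else:
    print("All steps verified.")
```

Run against the example session's checklist, the audit flags the three missed steps instead of emitting a confident "I'm done" summary.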
Impact
- The user becomes the debugger of Claude's work, not the beneficiary of it
- Each additional probing question costs time and erodes trust
- Users who don't ask probing questions never discover the gaps
- The more experienced the user, the more gaps they find — suggesting the gaps are always there, just usually undetected
Related
Environment
- Claude Code 2.1.71
- Windows 11