fix(engine): add degraded_success status and verification reporting#62

Open
mvanhorn wants to merge 3 commits into danshapiro:main from mvanhorn:osc/50-verification-false-green

@mvanhorn
Contributor

Fixes #50

Summary

When verification commands (tsc, pytest, go test) fail because of transient infrastructure issues (DNS failures, missing tools), stages still report unqualified success. This PR adds the foundation for tracking and reporting verification outcomes, so that a stage whose verification was blocked or failed can say so.

Changes

New status: degraded_success (runtime/status.go)

  • Added StatusDegradedSuccess to the canonical status enum
  • Routes like success in the engine (retry reset, goal gate, parallel completion)
  • Parsed from degraded_success, degraded-success, degradedsuccess
  • Included in IsCanonical() check
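
As a rough illustration of the parsing and canonical check described above, here is a minimal sketch. Only `StatusDegradedSuccess`, `IsCanonical()`, and the three accepted spellings come from this PR; the `StageStatus` and `ParseStageStatus` names are inferred from the test names, and the other status values, the function signature, and the trim/lowercase normalization are assumptions rather than the actual contents of runtime/status.go.

```go
package main

import "strings"

// StageStatus is the status a stage reports. The type name is inferred
// from the test names in this PR; its real definition may differ.
type StageStatus string

const (
	StatusSuccess         StageStatus = "success"          // assumed pre-existing value
	StatusDegradedSuccess StageStatus = "degraded_success" // added by this PR
	StatusFail            StageStatus = "fail"             // assumed pre-existing value
)

// ParseStageStatus maps a raw string to a canonical status.
// The (StageStatus, bool) signature and the trimming/lowercasing
// are assumptions made for this sketch.
func ParseStageStatus(raw string) (StageStatus, bool) {
	switch strings.ToLower(strings.TrimSpace(raw)) {
	case "success":
		return StatusSuccess, true
	case "degraded_success", "degraded-success", "degradedsuccess":
		// All three spellings listed in the PR collapse to one value.
		return StatusDegradedSuccess, true
	case "fail":
		return StatusFail, true
	}
	return "", false
}

// IsCanonical reports whether s is one of the canonical enum values.
func (s StageStatus) IsCanonical() bool {
	switch s {
	case StatusSuccess, StatusDegradedSuccess, StatusFail:
		return true
	}
	return false
}
```

With this shape, `ParseStageStatus("degraded-success")` and `ParseStageStatus("degradedsuccess")` both yield `StatusDegradedSuccess`, and unknown strings are rejected instead of silently defaulting.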

Verification types (runtime/status.go)

  • VerificationResult - captures overall verification status (passed/failed/blocked), blocked reason, and individual command results
  • VerificationEntry - records command, exit code, blocked flag, and reason
  • Added Verification *VerificationResult field to Outcome
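
The types above might look roughly like the following sketch. The type names, the `Verification *VerificationResult` field, and the `status` / `blocked_reason` keys come from this PR; every other JSON key (`commands`, `exit_code`, `blocked`, `reason`), the plain-string `Status` field, the `DecodeOutcomeJSON` signature (name inferred from the test names), and the sample payload are assumptions made for illustration.

```go
package main

import "encoding/json"

// VerificationEntry records one verification command and how it ended.
type VerificationEntry struct {
	Command  string `json:"command"`
	ExitCode int    `json:"exit_code"`         // assumed key name
	Blocked  bool   `json:"blocked,omitempty"` // true when infra prevented the run
	Reason   string `json:"reason,omitempty"`
}

// VerificationResult captures the overall verification status.
type VerificationResult struct {
	Status        string              `json:"status"` // "passed", "failed", or "blocked"
	BlockedReason string              `json:"blocked_reason,omitempty"`
	Commands      []VerificationEntry `json:"commands,omitempty"` // assumed key name
}

// Outcome is reduced here to the two fields relevant to this PR.
type Outcome struct {
	Status       string              `json:"status"`
	Verification *VerificationResult `json:"verification,omitempty"`
}

// DecodeOutcomeJSON parses a stage outcome, keeping the verification object.
func DecodeOutcomeJSON(data []byte) (Outcome, error) {
	var o Outcome
	if err := json.Unmarshal(data, &o); err != nil {
		return Outcome{}, err
	}
	return o, nil
}

// sampleOutcome is an invented payload, not the example from the template.
const sampleOutcome = `{
  "status": "degraded_success",
  "verification": {
    "status": "blocked",
    "blocked_reason": "DNS lookup failed while fetching modules",
    "commands": [
      {"command": "go test ./...", "exit_code": 1, "blocked": true,
       "reason": "could not resolve module proxy host"}
    ]
  }
}`
```

Making `Verification` a pointer lets an outcome without any verification data decode to `nil`, which keeps "no verification reported" distinct from "verification reported with empty fields".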

Engine routing (engine/engine.go)

  • degraded_success resets retry counters (like success)
  • degraded_success passes goal gate checks (like success)
  • Runs complete with FinalSuccess when stages report degraded_success
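
One way to picture the routing rules above is a single "success-like" predicate shared by the retry, goal-gate, and run-completion paths. This is a hypothetical sketch, not the code in engine/engine.go: `StatusDegradedSuccess` and `FinalSuccess` are named in this PR, while `isSuccessLike`, `nextRetryCount`, `passesGoalGate`, `finalStatus`, `FinalFail`, and the other status values are invented for illustration.

```go
package main

// StageStatus and its values mirror the status enum; only
// StatusDegradedSuccess is confirmed by this PR.
type StageStatus string

const (
	StatusSuccess         StageStatus = "success"
	StatusDegradedSuccess StageStatus = "degraded_success"
	StatusFail            StageStatus = "fail"
)

// FinalStatus is the run-level result. FinalFail is an assumed counterpart
// to the FinalSuccess value mentioned in the PR.
type FinalStatus string

const (
	FinalSuccess FinalStatus = "success"
	FinalFail    FinalStatus = "fail"
)

// isSuccessLike (hypothetical helper) is the one place that decides
// which statuses the engine routes like success.
func isSuccessLike(s StageStatus) bool {
	return s == StatusSuccess || s == StatusDegradedSuccess
}

// nextRetryCount resets the counter on success-like statuses and
// increments it otherwise.
func nextRetryCount(current int, s StageStatus) int {
	if isSuccessLike(s) {
		return 0
	}
	return current + 1
}

// passesGoalGate lets success-like stages through the goal gate.
func passesGoalGate(s StageStatus) bool {
	return isSuccessLike(s)
}

// finalStatus reports FinalSuccess only if every stage was success-like.
func finalStatus(stages []StageStatus) FinalStatus {
	for _, s := range stages {
		if !isSuccessLike(s) {
			return FinalFail
		}
	}
	return FinalSuccess
}
```

Centralizing the check in one predicate means a later change (for example, surfacing degraded runs differently at the end) would touch one function instead of every routing site.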

Agent instructions (prompts/stage_status_contract_preamble.tmpl)

  • Instructs agents to use degraded_success with verification details when verification commands fail or are blocked by infra issues
  • Provides example JSON format for the verification object

Test plan

  • TestParseStageStatus_CanonicalAndLegacy - degraded_success and aliases parsed correctly
  • TestStageStatus_IsCanonical - degraded_success is canonical
  • TestDecodeOutcomeJSON_DegradedSuccessWithVerification - full verification object preserved through decode
  • TestDecodeOutcomeJSON_SuccessWithPassedVerification - verification data preserved on normal success too
  • go build ./... compiles cleanly
  • All existing tests pass

This contribution was developed with AI assistance (Claude Code).

Adds infrastructure to prevent false-green outcomes when verification
commands fail due to transient infra issues:

- New StatusDegradedSuccess stage status that routes like success but
  signals that required verification was blocked or failed
- New VerificationResult and VerificationEntry types, exposed on Outcome, for
  structured verification reporting (status, blocked_reason, command results)
- Updated stage status contract preamble to instruct agents to use
  degraded_success when verification is blocked
- degraded_success routes like success in the engine (retry reset,
  goal gate, parallel completion) so runs continue correctly

Fixes danshapiro#50

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
mvanhorn and others added 2 commits March 9, 2026 20:40
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…i/ paths

- Add gpt-5.3-codex-spark to cliOnlyModelIDs map (fixes TestIsCLIOnlyModel)
- Replace root .ai/verify_errors.log and .ai/test-evidence/latest/ with
  run-scoped .ai/runs/$KILROY_RUN_ID/ paths in demo spec (fixes
  TestReferenceSurfaces_NoLegacyRootAIScratchPaths)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

verification can report success when core checks are blocked by transient infra/tool failures (false-green risk)