[aw-failures] Fix: Smoke Claude safe_outputs fails when resolve_pull_request_review_thread targets already-resolved thread #29372

@github-actions

Problem Statement

The Smoke Claude smoke test run §25181816514 (2026-04-30 18:14 UTC) failed at the safe_outputs step due to a GraphQL error on resolve_pull_request_review_thread. All 19 smoke sub-tests completed (18 passed, 1 skipped), but the failed safe_outputs stage marked the entire run as a failure.

Error Signature

Processing message 8/11: resolve_pull_request_review_thread
Failed to resolve review thread: Request failed due to following response errors:
Message 8 (resolve_pull_request_review_thread) failed: Request failed due to following response errors:
1 safe output(s) failed:
  - resolve_pull_request_review_thread: Request failed due to following response errors:

Thread ID attempted: PRRT_kwDOPc1QR85-0kYa (on PR #29360)

Root Cause

The Smoke Claude smoke test:

  1. Calls submit_pull_request_review, creating a new review on PR #29360 ("fix: extend detectFirewallAuditArtifacts to find audit files in non-flattened agent artifact")
  2. Then calls resolve_pull_request_review_thread targeting a pre-existing thread ID (PRRT_kwDOPc1QR85-0kYa)
  3. Queues this resolve in safeoutputs.jsonl

The thread referenced in step 2 may already be in a resolved state from a previous Smoke Claude run. The safe_outputs handler then fails when trying to re-resolve an already-resolved thread via the GraphQL API.
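
One way to make the resolve step tolerant of this state is to check the thread before mutating it. The sketch below is illustrative only and is not the actual safe_outputs handler: `graphql` is a hypothetical callable that takes a query string and variables and returns the response's `data` dict. The `isResolved` field and the `resolveReviewThread` mutation come from GitHub's public GraphQL schema.

```python
from typing import Any, Callable, Dict

# Reads the current state of a review thread by its node ID.
THREAD_STATE_QUERY = """
query($threadId: ID!) {
  node(id: $threadId) {
    ... on PullRequestReviewThread { id isResolved }
  }
}
"""

# Resolves the thread; only sent when the thread is still open.
RESOLVE_MUTATION = """
mutation($threadId: ID!) {
  resolveReviewThread(input: {threadId: $threadId}) {
    thread { id isResolved }
  }
}
"""

# Hypothetical transport: (query, variables) -> response "data" dict.
GraphQL = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def resolve_thread_idempotently(graphql: GraphQL, thread_id: str) -> str:
    """Resolve a review thread, skipping the mutation if it is already resolved."""
    data = graphql(THREAD_STATE_QUERY, {"threadId": thread_id})
    node = data.get("node")
    if not node:
        # Unknown ID, or the ID does not refer to a review thread.
        raise LookupError(f"review thread {thread_id} not found")
    if node.get("isResolved"):
        return "already_resolved"
    graphql(RESOLVE_MUTATION, {"threadId": thread_id})
    return "resolved"
```

With this shape, a thread left resolved by an earlier run produces an "already_resolved" result instead of a failed safe output.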

Impact

  • Smoke Claude test incorrectly shows as FAIL despite all sub-tests passing
  • 1 safe output failure triggers full safe_outputs step failure
  • Creates a misleading signal that the Claude engine is unhealthy when it is not

Metrics

Metric                   Value
Agent status             success (9.3m, 42 turns)
Smoke sub-tests passed   18/19 (test 19 skipped)
safe_outputs step        FAILURE (22s, 1/11 items failed)
Tokens                   1.23M
Cost                     ~$0.98

Proposed Remediation

Option A (Recommended): Resolve the thread created in the same run

  • The smoke test should capture the thread ID of the review it just created (from the submit_pull_request_review response) and resolve THAT thread, not a hardcoded pre-existing thread ID
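
A minimal sketch of Option A, under two assumptions that this issue does not confirm: that the ID of the newly created review is available to the test, and that the review includes at least one inline comment (so that a thread exists). The query uses fields from GitHub's public GraphQL schema; the helper name `unresolved_threads_for_review` is hypothetical.

```python
from typing import Any, Dict, List

# Lists a PR's review threads together with the review that opened each one.
REVIEW_THREADS_QUERY = """
query($owner: String!, $repo: String!, $number: Int!) {
  repository(owner: $owner, name: $repo) {
    pullRequest(number: $number) {
      reviewThreads(first: 100) {
        nodes {
          id
          isResolved
          comments(first: 1) { nodes { pullRequestReview { id } } }
        }
      }
    }
  }
}
"""


def unresolved_threads_for_review(data: Dict[str, Any], review_id: str) -> List[str]:
    """Return IDs of open threads whose first comment belongs to `review_id`.

    `data` is the "data" dict returned for REVIEW_THREADS_QUERY.
    """
    threads = data["repository"]["pullRequest"]["reviewThreads"]["nodes"]
    matches: List[str] = []
    for thread in threads:
        if thread["isResolved"]:
            continue  # nothing left to resolve
        first_comments = thread["comments"]["nodes"]
        if not first_comments:
            continue
        review = first_comments[0].get("pullRequestReview") or {}
        if review.get("id") == review_id:
            matches.append(thread["id"])
    return matches
```

Resolving only the IDs returned here keeps each run self-contained, so the result no longer depends on what a previous run left behind.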

Option B: Use continue-on-error for resolve_pull_request_review_thread in safe_outputs

  • Configure the safe_outputs handler to treat resolve_thread failures as warnings rather than errors, since the thread being already-resolved is a harmless state
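
Option B could look roughly like the following. This is a sketch of the idea rather than the real safe_outputs implementation: `process_messages`, `Summary`, and `NON_FATAL_TYPES` are hypothetical names, and each handler is assumed to raise an exception on failure.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Set

# Hypothetical setting: message types whose failure should not fail the step.
NON_FATAL_TYPES: Set[str] = {"resolve_pull_request_review_thread"}


@dataclass
class Summary:
    succeeded: int = 0
    warnings: List[str] = field(default_factory=list)
    errors: List[str] = field(default_factory=list)

    @property
    def failed(self) -> bool:
        # Only hard errors fail the step; warnings are reported but tolerated.
        return bool(self.errors)


def process_messages(
    messages: List[Dict[str, Any]],
    handlers: Dict[str, Callable[[Dict[str, Any]], None]],
    non_fatal: Set[str] = NON_FATAL_TYPES,
) -> Summary:
    """Run each queued message, downgrading failures of non-fatal types to warnings."""
    summary = Summary()
    for index, message in enumerate(messages, start=1):
        kind = message["type"]
        try:
            handlers[kind](message)
            summary.succeeded += 1
        except Exception as exc:
            note = f"Message {index} ({kind}) failed: {exc}"
            if kind in non_fatal:
                summary.warnings.append(note)
            else:
                summary.errors.append(note)
    return summary
```

The trade-off is that downgrading every failure of this type also hides genuine problems such as permission errors; a narrower variant would downgrade only when the thread is confirmed to be already resolved.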

Success Criteria

Smoke Claude runs to full success. resolve_pull_request_review_thread resolves a thread from the same run and does not fail when threads are pre-resolved.

Parent: #29232

Generated by [aw] Failure Investigator (6h) · ● 397.8K

  • expires on May 7, 2026, 7:24 PM UTC
