From 085c862d3ac8461cea7ec8d3839e7582fd4515ee Mon Sep 17 00:00:00 2001
From: Kevin De Porre
Date: Thu, 15 Jan 2026 15:49:52 +0100
Subject: [PATCH 1/5] Claude workflow for fixing opened issues

---
 .../reproduce-and-fix-issues-claude.yml | 122 ++++++++++++++++++
 1 file changed, 122 insertions(+)
 create mode 100644 .github/workflows/reproduce-and-fix-issues-claude.yml

diff --git a/.github/workflows/reproduce-and-fix-issues-claude.yml b/.github/workflows/reproduce-and-fix-issues-claude.yml
new file mode 100644
index 000000000..1f5774fb7
--- /dev/null
+++ b/.github/workflows/reproduce-and-fix-issues-claude.yml
@@ -0,0 +1,122 @@
+name: Claude Auto Issue Fix
+
+on:
+  issues:
+    types: [opened]
+
+jobs:
+  claude_auto_issue_fix:
+    runs-on: ubuntu-latest
+
+    # Prevent multiple runs for the same issue
+    concurrency:
+      group: claude-auto-issue-${{ github.repository }}-${{ github.event.issue.number }}
+      cancel-in-progress: false
+
+    permissions:
+      contents: write
+      pull-requests: write
+      issues: write
+      id-token: write
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v6.0.1
+        with:
+          fetch-depth: 0
+
+      - name: Run Claude Code (auto issue handler)
+        uses: anthropics/claude-code-action@v1
+        with:
+          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
+          prompt: |
+            You are an autonomous coding agent running in CI for this repository.
+            This workflow runs automatically whenever a new GitHub issue is opened.
+
+            === Context ===
+            REPO: ${{ github.repository }}
+            ISSUE NUMBER: ${{ github.event.issue.number }}
+            ISSUE TITLE: ${{ github.event.issue.title }}
+            ISSUE URL: ${{ github.event.issue.html_url }}
+
+            ISSUE BODY:
+            ${{ github.event.issue.body }}
+
+            === Goal ===
+            Apply a strict “repro test → fix” methodology. Produce exactly ONE of these outcomes:
+            1) Repro PR + Fix PR (preferred for actionable bugs)
+            2) Repro PR only (if reproducible but cannot fix)
+            3) Comment-only “Needs info / Not actionable / Question answer draft”
+            No other outcomes.
+
+            === Hard rules (non-negotiable) ===
+            - Never claim reproduction unless a test fails due to a concrete behavioral assertion.
+            - Never use “does not throw / does not crash” as the primary assertion unless:
+              (a) the issue explicitly says throwing is the bug, AND
+              (b) the correct behavior is explicitly “should not throw”.
+            - Never “fix” by skipping, swallowing, or ignoring the problematic case
+              (e.g. guard clauses / try-catch that hides the error) unless the issue explicitly says that is correct.
+              Any conditional handling must still implement the intended behavior.
+            - Do not weaken tests to make them pass. Do not remove assertions. Do not skip tests. Do not mark flaky.
+            - Keep changes minimal and scoped. No refactors. No drive-by formatting.
+            - If expected behavior is unclear or not testable, STOP and request the minimum missing info.
+
+            === Step 0: Determine actionability ===
+            From the issue, extract:
+            - Expected behavior (must be explicit & testable)
+            - Actual behavior
+            - Repro steps / minimal example
+            If you cannot extract a testable expected behavior:
+            - Post a GitHub comment requesting the minimum missing details (inputs, expected outputs, versions, etc.)
+            - Stop (no PRs)
+
+            === Step 1: Reproduction test (behavioral oracle) ===
+            Create a minimal unit/integration test that asserts the expected behavior:
+            - Test name must describe intended behavior (NOT “reproduces issue #…”)
+            - Assert concrete outcomes: returned values, state transitions, emitted events, persisted data, etc.
+            - Add at least one “anti-noop” assertion that would fail if the code simply “does nothing”
+              (e.g. verify a state change, returned value, side effect).
+            - If the issue involves an error, prefer asserting correct error type/message or correct recovery/result.
+
+            === Step 2: Prove it reproduces on base ===
+            Run tests on base:
+            - Confirm the new test FAILS due to assertion mismatch (not due to broken test setup).
+            If the test passes on base:
+            - Do NOT fake a repro by weakening assertions.
+            - If the issue seems intermittent, attempt to make repro deterministic; otherwise comment “not reproducible” and stop.
+
+            Commit repro and open PR:
+            - Branch: ai/issue-${{ github.event.issue.number }}-repro
+            - Commit message: test: assert (issue #${{ github.event.issue.number }})
+            - PR title: [repro] (issue #${{ github.event.issue.number }})
+            - PR body: link issue + what test asserts + how to run + observed failure
+
+            === Step 3: Fix (stacked on repro) ===
+            Create fix branch from repro branch:
+            - Branch: ai/issue-${{ github.event.issue.number }}-fix
+            Implement the minimal fix.
+            Validation:
+            - Previously failing test MUST now pass.
+            - Fix must not be a no-op.
+            - If you add guards/conditionals, justify why it's correct behavior (not hiding bug),
+              and ensure intended work still occurs.
+
+            Commit fix and open stacked PR:
+            - Commit message: fix: (issue #${{ github.event.issue.number }})
+            - PR title: [fix] (issue #${{ github.event.issue.number }})
+            - PR body: link issue + link repro PR + root cause + fix explanation + how to test
+
+            === If stuck ===
+            If you cannot fix without derailing:
+            - Still open the repro PR if it's valid (high value).
+            - Post a comment summarizing findings and blockers.
+            - Do NOT force a “fix” that silences symptoms.
+
+            === Quality checklist (must satisfy before marking fixed) ===
+            - [ ] Expected behavior asserted (not “no throw” unless truly correct)
+            - [ ] Test fails on base for the right reason
+            - [ ] Fix makes test pass without weakening assertions
+            - [ ] Fix is not “skip/swallow/ignore”
+            - [ ] Changes minimal and scoped
+
+            Proceed now.
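
As an illustration of the Step 1 rules in the prompt above, the following TypeScript sketch contrasts a weak "does not throw" check with a behavioral check that includes an anti-noop assertion. Everything in it (`Counter`, `WorkingCounter`, `NoopCounter`, `expectEqual`, `passes`) is hypothetical and invented for this example; none of it comes from the repository or from a real test framework.

```typescript
// Hypothetical interface for the code under test; not part of the repository.
interface Counter {
  increment(key: string): number;
  get(key: string): number;
}

// Correct implementation: increment() changes state and returns the new count.
class WorkingCounter implements Counter {
  private counts: Record<string, number> = {};
  increment(key: string): number {
    const next = this.get(key) + 1;
    this.counts[key] = next;
    return next;
  }
  get(key: string): number {
    return this.counts[key] ?? 0;
  }
}

// Buggy implementation: increment() silently does nothing.
class NoopCounter implements Counter {
  increment(_key: string): number {
    return 0;
  }
  get(_key: string): number {
    return 0;
  }
}

// Hypothetical assertion helper standing in for a real test framework.
function expectEqual(actual: number, expected: number, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${expected}, got ${actual}`);
  }
}

// Weak check: only proves that increment() does not throw.
function doesNotThrow(counter: Counter): void {
  counter.increment("a");
}

// Behavioral check, named after the intended behavior.
function incrementsCountForKey(counter: Counter): void {
  expectEqual(counter.increment("a"), 1, "returned value");
  // Anti-noop assertion: fails if increment() left the state untouched.
  expectEqual(counter.get("a"), 1, "state after increment");
}

// Test-harness helper: runs a check and reports whether it passed.
function passes(check: (c: Counter) => void, counter: Counter): boolean {
  try {
    check(counter);
    return true;
  } catch {
    return false;
  }
}

console.log(passes(doesNotThrow, new NoopCounter()));          // true: the weak check misses the bug
console.log(passes(incrementsCountForKey, new NoopCounter())); // false: the behavioral check catches it
console.log(passes(incrementsCountForKey, new WorkingCounter())); // true
```

The weak check passes against the no-op implementation, which is why the prompt rejects "does not throw" as a primary assertion in most cases.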
From e5b532b6f43a0c171d4aa92b539bd90bcae08ccf Mon Sep 17 00:00:00 2001
From: Kevin De Porre
Date: Mon, 19 Jan 2026 14:06:05 +0100
Subject: [PATCH 2/5] Introduce /reproduce command

---
 .../reproduce-and-fix-issues-claude.yml | 122 ------------------
 1 file changed, 122 deletions(-)
 delete mode 100644 .github/workflows/reproduce-and-fix-issues-claude.yml

diff --git a/.github/workflows/reproduce-and-fix-issues-claude.yml b/.github/workflows/reproduce-and-fix-issues-claude.yml
deleted file mode 100644
index 1f5774fb7..000000000
--- a/.github/workflows/reproduce-and-fix-issues-claude.yml
+++ /dev/null
@@ -1,122 +0,0 @@
-name: Claude Auto Issue Fix
-
-on:
-  issues:
-    types: [opened]
-
-jobs:
-  claude_auto_issue_fix:
-    runs-on: ubuntu-latest
-
-    # Prevent multiple runs for the same issue
-    concurrency:
-      group: claude-auto-issue-${{ github.repository }}-${{ github.event.issue.number }}
-      cancel-in-progress: false
-
-    permissions:
-      contents: write
-      pull-requests: write
-      issues: write
-      id-token: write
-
-    steps:
-      - name: Checkout code
-        uses: actions/checkout@v6.0.1
-        with:
-          fetch-depth: 0
-
-      - name: Run Claude Code (auto issue handler)
-        uses: anthropics/claude-code-action@v1
-        with:
-          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
-          prompt: |
-            You are an autonomous coding agent running in CI for this repository.
-            This workflow runs automatically whenever a new GitHub issue is opened.
-
-            === Context ===
-            REPO: ${{ github.repository }}
-            ISSUE NUMBER: ${{ github.event.issue.number }}
-            ISSUE TITLE: ${{ github.event.issue.title }}
-            ISSUE URL: ${{ github.event.issue.html_url }}
-
-            ISSUE BODY:
-            ${{ github.event.issue.body }}
-
-            === Goal ===
-            Apply a strict “repro test → fix” methodology. Produce exactly ONE of these outcomes:
-            1) Repro PR + Fix PR (preferred for actionable bugs)
-            2) Repro PR only (if reproducible but cannot fix)
-            3) Comment-only “Needs info / Not actionable / Question answer draft”
-            No other outcomes.
-
-            === Hard rules (non-negotiable) ===
-            - Never claim reproduction unless a test fails due to a concrete behavioral assertion.
-            - Never use “does not throw / does not crash” as the primary assertion unless:
-              (a) the issue explicitly says throwing is the bug, AND
-              (b) the correct behavior is explicitly “should not throw”.
-            - Never “fix” by skipping, swallowing, or ignoring the problematic case
-              (e.g. guard clauses / try-catch that hides the error) unless the issue explicitly says that is correct.
-              Any conditional handling must still implement the intended behavior.
-            - Do not weaken tests to make them pass. Do not remove assertions. Do not skip tests. Do not mark flaky.
-            - Keep changes minimal and scoped. No refactors. No drive-by formatting.
-            - If expected behavior is unclear or not testable, STOP and request the minimum missing info.
-
-            === Step 0: Determine actionability ===
-            From the issue, extract:
-            - Expected behavior (must be explicit & testable)
-            - Actual behavior
-            - Repro steps / minimal example
-            If you cannot extract a testable expected behavior:
-            - Post a GitHub comment requesting the minimum missing details (inputs, expected outputs, versions, etc.)
-            - Stop (no PRs)
-
-            === Step 1: Reproduction test (behavioral oracle) ===
-            Create a minimal unit/integration test that asserts the expected behavior:
-            - Test name must describe intended behavior (NOT “reproduces issue #…”)
-            - Assert concrete outcomes: returned values, state transitions, emitted events, persisted data, etc.
-            - Add at least one “anti-noop” assertion that would fail if the code simply “does nothing”
-              (e.g. verify a state change, returned value, side effect).
-            - If the issue involves an error, prefer asserting correct error type/message or correct recovery/result.
-
-            === Step 2: Prove it reproduces on base ===
-            Run tests on base:
-            - Confirm the new test FAILS due to assertion mismatch (not due to broken test setup).
-            If the test passes on base:
-            - Do NOT fake a repro by weakening assertions.
-            - If the issue seems intermittent, attempt to make repro deterministic; otherwise comment “not reproducible” and stop.
-
-            Commit repro and open PR:
-            - Branch: ai/issue-${{ github.event.issue.number }}-repro
-            - Commit message: test: assert (issue #${{ github.event.issue.number }})
-            - PR title: [repro] (issue #${{ github.event.issue.number }})
-            - PR body: link issue + what test asserts + how to run + observed failure
-
-            === Step 3: Fix (stacked on repro) ===
-            Create fix branch from repro branch:
-            - Branch: ai/issue-${{ github.event.issue.number }}-fix
-            Implement the minimal fix.
-            Validation:
-            - Previously failing test MUST now pass.
-            - Fix must not be a no-op.
-            - If you add guards/conditionals, justify why it's correct behavior (not hiding bug),
-              and ensure intended work still occurs.
-
-            Commit fix and open stacked PR:
-            - Commit message: fix: (issue #${{ github.event.issue.number }})
-            - PR title: [fix] (issue #${{ github.event.issue.number }})
-            - PR body: link issue + link repro PR + root cause + fix explanation + how to test
-
-            === If stuck ===
-            If you cannot fix without derailing:
-            - Still open the repro PR if it's valid (high value).
-            - Post a comment summarizing findings and blockers.
-            - Do NOT force a “fix” that silences symptoms.
-
-            === Quality checklist (must satisfy before marking fixed) ===
-            - [ ] Expected behavior asserted (not “no throw” unless truly correct)
-            - [ ] Test fails on base for the right reason
-            - [ ] Fix makes test pass without weakening assertions
-            - [ ] Fix is not “skip/swallow/ignore”
-            - [ ] Changes minimal and scoped
-
-            Proceed now.
From 01d59aad4fd7332502c8b1ffdbb17b5b25f6f876 Mon Sep 17 00:00:00 2001
From: Kevin De Porre
Date: Mon, 19 Jan 2026 14:25:25 +0100
Subject: [PATCH 3/5] Use opus instead of sonnet

---
 .github/workflows/reproduce-and-fix-issue-claude.yml | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/.github/workflows/reproduce-and-fix-issue-claude.yml b/.github/workflows/reproduce-and-fix-issue-claude.yml
index e48ca8290..3bcd564a2 100644
--- a/.github/workflows/reproduce-and-fix-issue-claude.yml
+++ b/.github/workflows/reproduce-and-fix-issue-claude.yml
@@ -33,6 +33,8 @@ jobs:
         uses: anthropics/claude-code-action@v1
         with:
           anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
+          claude_args: |
+            --model claude-opus-4-5
           prompt: |
             You are an autonomous coding agent running in CI for this repository.
             This workflow runs when a maintainer requests issue analysis by commenting "/reproduce" on an issue.

From 6f06d4e164bea6a39f12522b6a9021b672b5e34e Mon Sep 17 00:00:00 2001
From: Kevin De Porre
Date: Mon, 19 Jan 2026 14:32:48 +0100
Subject: [PATCH 4/5] Use opus and allow tools

---
 .github/workflows/reproduce-and-fix-issue-claude.yml | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/.github/workflows/reproduce-and-fix-issue-claude.yml b/.github/workflows/reproduce-and-fix-issue-claude.yml
index 3bcd564a2..343edb07c 100644
--- a/.github/workflows/reproduce-and-fix-issue-claude.yml
+++ b/.github/workflows/reproduce-and-fix-issue-claude.yml
@@ -7,8 +7,10 @@ on:
 jobs:
   claude_auto_issue_fix:
     # Only run when a maintainer comments with /reproduce on an issue
+    # Exclude Claude's own comments to prevent recursive triggers
     if: |
       github.event.issue != null &&
+      github.actor != 'claude[bot]' &&
       contains(github.event.comment.body, '/reproduce') &&
       (github.event.comment.user.login == 'kevin-dp' || github.event.comment.user.login == 'KyleAMathews' || github.event.comment.user.login == 'samwillis')
     runs-on: ubuntu-latest
@@ -22,6 +24,7 @@ jobs:
       pull-requests: write
       issues: write
       id-token: write
+      actions: read
 
     steps:
       - name: Checkout code
@@ -35,6 +38,7 @@ jobs:
           anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
           claude_args: |
             --model claude-opus-4-5
+            --allowedTools Bash,Edit,Read,Write
           prompt: |
             You are an autonomous coding agent running in CI for this repository.
             This workflow runs when a maintainer requests issue analysis by commenting "/reproduce" on an issue.

From 4566ab753d9a03c532fc8681f81b1801bfc83fa3 Mon Sep 17 00:00:00 2001
From: Kevin De Porre
Date: Mon, 19 Jan 2026 15:03:55 +0100
Subject: [PATCH 5/5] Modify prompt to disallow references to the issue in code

---
 .../reproduce-and-fix-issue-claude.yml | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/.github/workflows/reproduce-and-fix-issue-claude.yml b/.github/workflows/reproduce-and-fix-issue-claude.yml
index 343edb07c..b4d810756 100644
--- a/.github/workflows/reproduce-and-fix-issue-claude.yml
+++ b/.github/workflows/reproduce-and-fix-issue-claude.yml
@@ -61,15 +61,19 @@ jobs:
 
             === Hard rules (non-negotiable) ===
             - Never claim reproduction unless a test fails due to a concrete behavioral assertion.
-            - Never use “does not throw / does not crash” as the primary assertion unless:
+            - Never use "does not throw / does not crash" as the primary assertion unless:
               (a) the issue explicitly says throwing is the bug, AND
-              (b) the correct behavior is explicitly “should not throw”.
-            - Never “fix” by skipping, swallowing, or ignoring the problematic case
+              (b) the correct behavior is explicitly "should not throw".
+            - Never "fix" by skipping, swallowing, or ignoring the problematic case
               (e.g. guard clauses / try-catch that hides the error) unless the issue explicitly says that is correct.
               Any conditional handling must still implement the intended behavior.
             - Do not weaken tests to make them pass. Do not remove assertions. Do not skip tests. Do not mark flaky.
             - Keep changes minimal and scoped. No refactors. No drive-by formatting.
             - If expected behavior is unclear or not testable, STOP and request the minimum missing info.
+            - NEVER add issue references (e.g. "issue #1152", "#1152", "Issue 1152") in the codebase itself.
+              This includes test names, code comments, variable names, or any other code artifacts.
+              Issue references are ONLY allowed in: branch names, commit messages, PR titles, PR bodies, and GitHub comments.
+              Test names and code comments should be purely descriptive of current behavior, no description of old behavior and do not reference where the bug was reported.
 
             === Step 0: Determine actionability ===
             From the issue, extract:
@@ -82,11 +86,13 @@ jobs:
 
             === Step 1: Reproduction test (behavioral oracle) ===
             Create a minimal unit/integration test that asserts the expected behavior:
-            - Test name must describe intended behavior (NOT “reproduces issue #…”)
+            - Test name must describe the intended behavior only (e.g. "updates cache entry when using computed query key").
+              Do NOT include issue numbers or references in test names (BAD: "fixes issue #1152", "issue 1152 regression").
             - Assert concrete outcomes: returned values, state transitions, emitted events, persisted data, etc.
-            - Add at least one “anti-noop” assertion that would fail if the code simply “does nothing”
+            - Add at least one "anti-noop" assertion that would fail if the code simply "does nothing"
               (e.g. verify a state change, returned value, side effect).
             - If the issue involves an error, prefer asserting correct error type/message or correct recovery/result.
+            - Do NOT add comments in the test code referencing the issue number.
 
             === Step 2: Prove it reproduces on base ===
             Run tests on base:
@@ -110,6 +116,9 @@ jobs:
             - Fix must not be a no-op.
             - If you add guards/conditionals, justify why it's correct behavior (not hiding bug),
               and ensure intended work still occurs.
+            - Code comments should explain the "why" of the fix, NOT reference the issue number.
+              BAD: "Issue #1152: Fixed cache lookup"
+              GOOD: "Use the actual query key for cache lookup, not just the base key"
 
             Commit fix and open stacked PR:
             - Commit message: fix: (issue #${{ github.event.issue.number }})
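
The "no issue references in code" rule added in this last patch could also be checked mechanically, for example in review tooling. The TypeScript sketch below is one possible heuristic, not something the patch itself contains: `findIssueReferences` and the `ISSUE_REFERENCE` pattern are hypothetical names, and the pattern is derived only from the three example spellings quoted in the prompt ("issue #1152", "#1152", "Issue 1152"). It may also flag unrelated text such as purely numeric `#123456` color codes.

```typescript
// Hypothetical heuristic based on the spellings quoted in the prompt:
// "issue #1152", "#1152", "Issue 1152" (case-insensitive).
const ISSUE_REFERENCE = /\bissue\s*#?\d+\b|#\d+\b/gi;

// Returns every substring of `source` that looks like an issue reference.
function findIssueReferences(source: string): string[] {
  return source.match(ISSUE_REFERENCE) ?? [];
}

// The BAD and GOOD comment examples from the prompt.
const badComment = "// Issue #1152: Fixed cache lookup";
const goodComment =
  "// Use the actual query key for cache lookup, not just the base key";

console.log(findIssueReferences(badComment));  // [ 'Issue #1152' ]
console.log(findIssueReferences(goodComment)); // []
```

A check like this only covers the code artifacts; branch names, commit messages, and PR text are explicitly allowed to carry the issue number.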