Add prescan data preparation to duplicate-issue-detector and stale-issues#481
…sues workflows
- duplicate-issue-detector: prescan fetches newest 500 + oldest 500 issues (number, title, state) into TSV, agent scans index before searching
- stale-issues: prescan fetches open issues sorted by least recently updated with metadata (labels, timestamps) into TSV for immediate candidate pool
- Add bash: true to duplicate-issue-detector tools for file reading
- Update READMEs to document prescan behavior
Co-authored-by: strawgate <6384545+strawgate@users.noreply.github.com>
Walkthrough
This PR adds prescan steps and documentation for two GitHub Actions workflows. The duplicate-issue-detector workflow now creates an issue index by fetching the newest 500 and oldest 500 issues, deduplicating them, and writing a TSV index used as the initial scan input. The stale-issues workflow now prescans up to 500 open issues (sorted by least recently updated) into a TSV to seed the candidate pool. README and workflow docs/lock files were updated to describe the index-driven prescan flow and to adjust step numbering and labels.
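The deduplication step in the walkthrough can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper name `dedupe_issue_index` and the sample rows are hypothetical, and it only assumes the TSV layout described above (issue number in the first column, then title and state).

```bash
# Hypothetical helper: merge TSV listings (number<TAB>title<TAB>state) and
# keep only the first row seen for each issue number, preserving input order.
dedupe_issue_index() {
  # awk prints a row only when its first column has not been seen before.
  awk -F'\t' '!seen[$1]++' "$@"
}

# Sample data standing in for the "newest" and "oldest" listings.
# Issue 11 appears in both files, as it would in a repository with
# fewer issues than the combined prescan window.
work_dir=$(mktemp -d)
printf '12\tCrash on start\tOPEN\n11\tTypo in docs\tCLOSED\n' > "$work_dir/newest.tsv"
printf '1\tFirst issue\tCLOSED\n11\tTypo in docs\tCLOSED\n' > "$work_dir/oldest.tsv"

# Prints three rows (issues 12, 11, 1); the second copy of issue 11 is dropped.
dedupe_issue_index "$work_dir/newest.tsv" "$work_dir/oldest.tsv"
```

Keying on the issue number rather than the whole row means a duplicate is dropped even if the two listings disagree on the title or state.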
/ai Aren't file-read tools included by default? Is enabling bash here a bad idea? Please investigate. Let's make sure we have logic to avoid grabbing overlapping issues for the duplicate-issue detector, and for stale issues let's grab about 500 issues. Please update the PR.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
❌ Author of the following commits did not sign a Contributor Agreement. Please read and sign the above-mentioned agreement if you want to contribute to this project.
- Use --search "sort:updated-asc" for prescan queries to guarantee oldest-updated issues are fetched first (not just sorted client-side)
- Add bash: true to duplicate-issue-detector github/ copy
- Recompile lock files
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
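A prescan query that sorts server-side might look like the sketch below. The output path, the 500-issue limit, and the `sort:updated-asc` search qualifier come from this PR's description; the function name `prescan_open_issues` and the `--jq` expression are assumptions made for illustration, not the workflow's actual step.

```bash
# Sketch only: the function name and jq expression are assumptions.
# The path, limit, and sort qualifier follow the PR description.
prescan_open_issues() {
  out_file="${1:-/tmp/gh-aw/agent/open-issues.tsv}"
  mkdir -p "$(dirname "$out_file")"
  # --search "sort:updated-asc" asks the API for the least recently updated
  # issues first, so the 500-issue window holds the oldest-updated issues
  # rather than an arbitrary page that is only re-sorted client-side.
  gh issue list \
    --state open \
    --limit 500 \
    --search "sort:updated-asc" \
    --json number,title,updatedAt,createdAt,labels \
    --jq '.[] | [.number, .title, .updatedAt, .createdAt, ([.labels[].name] | join(","))] | @tsv' \
    > "$out_file"
}
```

Calling `prescan_open_issues` with no argument would write to the path named in the PR; passing a path as the first argument redirects the TSV elsewhere, which is convenient for local testing.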
The agent can read prescan files with its built-in file-reading tools. Enabling bash unnecessarily expands agent capabilities beyond what's needed for reading a TSV file. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Aligns the runtime workflow with the README and github/ copy which both document a 500-issue prescan window. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/gh-aw-stale-issues.md:
- Around line 115-117: The fenced code block containing the command "cat /tmp/gh-aw/agent/open-issues.tsv" is missing a language tag; update the opening fence from ``` to ```bash so the block declares the language.
- Around line 92-97: The command invoking gh issue list currently swallows all
errors via "2>/dev/null || true", which hides prescan failures and allows the
workflow to continue with an empty "$issues_file"; remove the silent suppression
and let failures surface by deleting the "2>/dev/null || true" tail (or replace
it with explicit error handling that writes stderr to logs and exits non‑zero),
so that the gh issue list invocation fails the job on error and preserves
visibility into problems when populating "$issues_file".
📒 Files selected for processing (2)
- .github/workflows/gh-aw-stale-issues.lock.yml
- .github/workflows/gh-aw-stale-issues.md
…anguage Replaces 2>/dev/null || true with a ::warning annotation so failures are visible in the Actions log. Adds bash language marker to code fence. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
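One way to surface prescan failures as a warning annotation is sketched below. The `::warning::` workflow command is standard GitHub Actions syntax; the wrapper name `run_prescan` and the choice to continue the job after a failure are assumptions for illustration, and the PR's actual step may be structured differently.

```bash
# Hypothetical wrapper: run a prescan command, write its stdout to a file,
# and emit a GitHub Actions warning annotation if the command fails.
# stderr is left untouched so the underlying error stays in the job log.
run_prescan() {
  out_file="$1"
  shift
  if ! "$@" > "$out_file"; then
    echo "::warning::Prescan command failed: $*"
    # Assumption: the job continues with a possibly empty index file.
    return 0
  fi
}
```

A step could then call, for example, `run_prescan "$issues_file" gh issue list --state open --limit 500`. Unlike `2>/dev/null || true`, this keeps the failure visible while still letting the agent fall back to its normal search.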
Requesting changes due to a verified prescan error-handling gap that can silently degrade duplicate detection quality.
Replaces 2>/dev/null || true with ::warning annotations on both prescan queries so API failures are visible in the Actions log. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Keeps both the stale-labeled issues collection step (from this PR) and the prescan open issues step (from merged PR #481). Recompiled lock file. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This PR adds a prescan-first data preparation flow to issue-investigation workflows so the agent starts from concrete candidates before broader search.
duplicate-issue-detector
- Prescan writes /tmp/gh-aw/agent/issues-index.tsv with issue number, title, and state.
- In .github/workflows/gh-aw-duplicate-issue-detector.md, added bash: true and updated the prompt flow to include Step 2: Scan the Issue Index before targeted search and candidate evaluation (with step renumbering through post result).
- Updated github/workflows/gh-aw-duplicate-issue-detector.md with the same prescan-first investigation structure for the workflow-source copy.
stale-issues
- Prescan writes /tmp/gh-aw/agent/open-issues.tsv with number, title, updated_at, created_at, and label_names.
- In .github/workflows/gh-aw-stale-issues.md and github/workflows/gh-aw-stale-issues.md, prescan now fetches up to 500 open issues sorted by least recently updated.
- Added bash: true and updated the prompt flow to include Step 0 so the agent reads the prescanned index first and prioritizes oldest updated_at candidates.
Workflow source and docs updates
- Updated github/workflows/ for both gh-aw-duplicate-issue-detector.md and gh-aw-stale-issues.md with the prescan-first investigation structure.
- Updated .lock.yml compiled workflows to include the prescan steps and prompt updates.
- Updated READMEs for duplicate-issue-detector and stale-issues to document the prescan-first investigation flow.
Notes
estc-actions-resource-not-accessible-detector.