Skip to content

fix(feature-ideation): address Copilot + CodeRabbit review on PR #85 (18 fixes, 17 new tests)#85

Open
don-petry wants to merge 7 commits intomainfrom
feat/feature-ideation-test-hardening
Open

fix(feature-ideation): address Copilot + CodeRabbit review on PR #85 (18 fixes, 17 new tests)#85
don-petry wants to merge 7 commits intomainfrom
feat/feature-ideation-test-hardening

Conversation

@don-petry
Copy link
Copy Markdown
Contributor

@don-petry don-petry commented Apr 7, 2026

Addresses all 14 Copilot and 13 CodeRabbit review comments on the feature-ideation test-hardening PR, closing critical bugs that would have failed at runtime.

Critical bug fixes

  • lint-prompt.sh — now scans prompt: blocks (claude-code-action v1) in addition to direct_prompt: (v0); previously had a blind spot on the exact file it was meant to protect
  • add_label_to_discussion — sends labelIds as a proper JSON array via new gh_safe_graphql_input helper; the GraphQL API would have rejected the [ID!]! mutation with the prior raw-field string
  • validate-signals.py — registers an inline date-time format checker; Draft202012Validator ignores format by default, so invalid timestamps were passing schema validation
  • gh_safe_graphql --jq — jq filter errors no longer swallowed with || true; bad filters now surface as exit-65 instead of silently returning []
  • collect-signals.sh — truncation warning now computed on raw pre-filter count; previously an all-bot result set at the limit would mask real API truncation
  • match-discussions.shMATCH_THRESHOLD validated as a float in [0,1] before entering Python; bad values now fail with exit-64 instead of an opaque traceback

Cleanup

  • Dead normalize_title / jaccard_similarity bash functions removed from match-discussions.sh (Python implementation is the only active path)
  • Schema $id corrected to petry-projects/.github (was pointing at TalkTerm)
  • signals-schema.bats "executable" test now asserts -x not just -f -r
  • README + filter-bots.sh comments: "allowlist" → "blocklist"
  • gh stub logs argv via printf '%q ' for accurate shell-quoted argument recording
  • FEATURE_IDEATION_BOT_AUTHORS CSV entries whitespace-trimmed before joining blocklist
  • validate-signals.py malformed-JSON exits 2 (data error), not 1 (schema error), per documented contract
  • signals.schema.json + collect-signals.sh SCHEMA_VERSION alignment documented and regression-tested
  • match-discussions.bats build_signals computes discussion count from array length

New helper

gh_safe_graphql_input(body) — same defensive contract as gh_safe_graphql but accepts a fully-formed JSON request body via stdin, enabling mutations whose variables include arrays (labelIds: [ID!]!) that gh's flag interface cannot express.

Summary by CodeRabbit

  • New Features

    • Feature ideation pipeline: collects repository signals, validates them against a new canonical schema, matches proposals to discussions, and optionally applies discussion mutations with a dry-run/log mode; workflow inputs added to control dry-run and tooling version.
  • Documentation

    • Added comprehensive usage and testing docs for the tooling, local/CI test workflow, and schema/version bump process.
  • Tests

    • Extensive CI tests and fixtures covering collection, schema validation, matching, linting, mutation dry-run, and helper utilities.

… tests

Refactors the reusable feature-ideation workflow's parsing surface from
an inline 600-line YAML heredoc into testable scripts with deterministic
contracts. Every defect that previously required post-merge review can
now fail in CI before adopters notice.

Why
---
The prior reusable workflow used `2>/dev/null || echo '[]'` for every
gh / GraphQL call, which silently downgraded auth failures, rate limits,
network outages, and GraphQL schema drift to empty arrays. The pipeline
would "succeed" while producing useless signals — and Mary's Discussion
posts would silently degrade across every BMAD repo on the org. The
prompt also instructed Mary to "use fuzzy matching" against existing
Ideas Discussions in her head, which is non-deterministic and untestable.

Risk register (probability × impact, scale 1–9):
  R1=9  swallow-all-errors gh wrapper
  R2=6  literal $() inside YAML direct prompt
  R3=6  no signals.json schema
  R4=6  jq --argjson crash on empty input
  R5=6  fuzzy match in Mary's prompt → duplicate Discussions
  R6=6  retry idempotency hole
  R7=6  GraphQL errors[]/null data not detected
  R8=4  GraphQL partial errors silently accepted
  R10=3 bot filter only catches dependabot/github-actions
  R11=4 pagination silently truncates

What's new
----------
.github/scripts/feature-ideation/
  collect-signals.sh         Orchestrator (replaces inline heredoc)
  validate-signals.py        JSON Schema 2020-12 validator
  match-discussions.sh       Deterministic Jaccard matcher (kills R5/R6)
  discussion-mutations.sh    create/comment/label wrappers + DRY_RUN mode
  lint-prompt.sh             Catches unescaped $() / ${VAR} in prompt blocks
  lib/gh-safe.sh             Defensive gh wrapper, fails loud on every
                             documented failure mode (kills R1, R7, R8)
  lib/compose-signals.sh     Validates JSON inputs before jq composition
  lib/filter-bots.sh         Extensible bot author filter (kills R10)
  lib/date-utils.sh          Cross-platform date helpers
  README.md                  Maintainer docs

.github/schemas/signals.schema.json
  Pinned producer/consumer contract for signals.json (Draft 2020-12).
  CI rejects any drift; the runtime signals.json is also validated by
  the workflow before being handed to Mary.

.github/workflows/feature-ideation-reusable.yml
  Rewritten. Adds a self-checkout of petry-projects/.github so the
  scripts above are available in the runner. Replaces inline bash with
  collect-signals.sh + validate-signals.py. Adds RUN_DATE / SIGNALS_PATH /
  PROPOSALS_PATH / MATCH_PLAN_PATH / TOOLING_DIR env vars passed to
  claude-code-action via env: instead of unescaped shell expansions in
  the prompt body. Adds dry_run input that flows through to
  discussion-mutations.sh, which logs every planned action to a JSONL
  audit log instead of executing — uploaded as the dry-run-log artifact.

.github/workflows/feature-ideation-tests.yml
  New CI gate, path-filtered. Runs shellcheck, lint-prompt, schema
  fixture validation, and the full bats suite on every PR that touches
  the feature-ideation surface.

standards/workflows/feature-ideation.yml
  Updated caller stub template. Adds dry_run workflow_dispatch input
  so adopters get safe smoke-testing for free. Existing TalkTerm caller
  stub continues to work unchanged (dry_run defaults to false).

test/workflows/feature-ideation/
  92 bats tests across 9 suites. 14 GraphQL/REST response fixtures.
  5 expected signals.json fixtures (3 valid + 2 INVALID for negative
  schema testing). Programmable gh PATH stub with single-call and
  multi-call modes for integration testing.

  | Suite                       | Tests | Risks closed       |
  |-----------------------------|------:|--------------------|
  | gh-safe.bats                |    19 | R1, R7, R8         |
  | compose-signals.bats        |     8 | R3, R4             |
  | filter-bots.bats            |     5 | R10                |
  | date-utils.bats             |     7 | R9                 |
  | collect-signals.bats        |    14 | R1, R3, R4, R7, R11|
  | match-discussions.bats      |    13 | R5, R6             |
  | discussion-mutations.bats   |    10 | DRY_RUN contract   |
  | lint-prompt.bats            |     8 | R2                 |
  | signals-schema.bats         |     8 | R3                 |
  | TOTAL                       |    92 |                    |

Test results: 92 passing, 0 failing, 0 skipped. Run with:
  bats test/workflows/feature-ideation/

Backwards compatibility
-----------------------
The reusable workflow's input surface is unchanged for existing callers
(TalkTerm continues to work with no edits). The new dry_run input is
optional and defaults to false. Adopters who copy the new standards
caller stub get dry_run support automatically.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 7, 2026 17:48
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 7, 2026

📝 Walkthrough

Walkthrough

Introduces a Feature Ideation toolset: a strict signals JSON Schema, new scripts to collect/compose/validate/match/mutate signals (with dry-run logging), a prompt linter, test harness/fixtures/stubs, and workflow changes to invoke and CI-test the tooling.

Changes

Cohort / File(s) Summary
Schema
/.github/schemas/signals.schema.json
New JSON Schema defining the canonical Feature Ideation signals contract (strict top-level props, schema_version/date/repo formats, bucketed aggregates, item $defs, and truncation_warnings constraints).
Collector & Orchestrator
/.github/scripts/feature-ideation/collect-signals.sh
New executable entrypoint (exports SCHEMA_VERSION, main()) that queries GitHub (REST/GraphQL), applies bot filtering and 30-day cutoffs, records truncation warnings, composes and emits signals.json, and appends Actions step summary when applicable.
Mutation Helpers
/.github/scripts/feature-ideation/discussion-mutations.sh
GraphQL mutation wrappers: create_discussion, comment_on_discussion, add_label_to_discussion with DRY_RUN JSONL audit logging and live GraphQL execution paths.
Compose & Validation
/.github/scripts/feature-ideation/lib/compose-signals.sh, /.github/scripts/feature-ideation/validate-signals.py
compose_signals() validates inputs and emits canonical signals JSON; Python validator checks signals against schema with a custom date-time checker and structured error reporting.
gh-safe & Utilities
/.github/scripts/feature-ideation/lib/gh-safe.sh, .../lib/date-utils.sh, .../lib/filter-bots.sh
Robust gh REST/GraphQL wrappers (validated JSON, envelope/error handling), cross-platform UTC date helpers, and bot-list builder + filter pipeline.
Matching & Linting
/.github/scripts/feature-ideation/match-discussions.sh, .../lint-prompt.sh
Deterministic matcher (token normalization + Jaccard, greedy assignment, thresholding) and prompt linter that detects unescaped shell expansions inside workflow prompt blocks.
Workflows
/.github/workflows/feature-ideation-reusable.yml, /.github/workflows/feature-ideation-tests.yml, standards/workflows/feature-ideation.yml
Reusable workflow now checks out external tooling, adds dry_run and tooling_ref inputs, invokes collector/validator/matcher/mutator tooling, and a new CI workflow exercises scripts/tests.
Tests, Fixtures & Stubs
test/workflows/feature-ideation/..., test/.../fixtures/**/*, test/.../stubs/gh, test/.../helpers/setup.bash
Comprehensive Bats suites covering all scripts, expected good/invalid/truncated fixtures, GH response fixtures, a multi-call gh test stub, and test helpers for temp dirs and stub installation.
Docs
/.github/scripts/feature-ideation/README.md
README describing modules, CLI usage, dry-run behavior, test/lint workflow, schema/version bump process, and required CI steps.

Sequence Diagram(s)

sequenceDiagram
    participant WF as Workflow
    participant CS as collect-signals.sh
    participant GH as GitHub API
    participant VS as validate-signals.py
    participant MD as match-discussions.sh
    participant MUT as discussion-mutations.sh

    WF->>CS: run with REPO, GH_TOKEN, inputs (dry_run, tooling_ref)
    CS->>GH: fetch issues, PRs, releases, discussion categories/discussions
    GH-->>CS: return REST/GraphQL responses
    CS->>CS: filter bots, apply 30-day cutoffs, record truncation_warnings
    CS->>CS: call compose_signals() -> signals.json
    CS-->>WF: upload signals.json
    WF->>VS: validate signals.json vs schema
    VS-->>WF: validation result
    WF->>MD: run matcher with signals + proposals -> match plan
    MD-->>WF: matched/new_candidates plan
    alt DRY_RUN = true
        WF->>MUT: run mutations with DRY_RUN=1 (log actions to JSONL)
        MUT-->>WF: dry-run JSONL artifact
    else
        WF->>MUT: execute mutations
        MUT->>GH: GraphQL create/comment/label mutations
        GH-->>MUT: mutation responses
        MUT-->>WF: mutation outcomes
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 37.78% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically describes the main change: addressing review feedback with fixes and new tests for feature-ideation tooling.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/feature-ideation-test-hardening

Comment @coderabbitai help to get the list of available commands and usage tips.

… test

CI failure on the previous commit: 91/92 passing, 1 failing.

The filter-bots env-extension test used `sh -c` to source filter-bots.sh
in a sub-shell with FEATURE_IDEATION_BOT_AUTHORS set. On macOS this works
because /bin/sh is bash. On Ubuntu (CI), /bin/sh is dash, which does not
support `set -o pipefail`, so sourcing filter-bots.sh produced:

  sh: 12: set: Illegal option -o pipefail

Fixed by switching to `bash -c`. All scripts already use
`#!/usr/bin/env bash` shebangs; this is the only place a sub-shell was
spawned via `sh`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Refactors the reusable feature-ideation GitHub Actions workflow by extracting the previously inlined bash/GraphQL/jq parsing into testable scripts, pinning a signals.json producer/consumer contract via JSON Schema, and adding a CI gate (ShellCheck + prompt lint + schema validation + bats) to fail regressions pre-merge.

Changes:

  • Add .github/scripts/feature-ideation/ tooling (collection, schema validation, matching, mutation wrappers, and prompt linting).
  • Introduce JSON Schema contract for signals.json and validate both runtime output + fixtures.
  • Add a dedicated path-filtered CI workflow plus a comprehensive bats test suite + fixtures.

Reviewed changes

Copilot reviewed 39 out of 39 changed files in this pull request and generated 14 comments.

Show a summary per file
File Description
test/workflows/feature-ideation/stubs/gh Programmable gh stub for bats tests (single-call + multi-call).
test/workflows/feature-ideation/signals-schema.bats Schema validation tests for fixture signals.json files.
test/workflows/feature-ideation/match-discussions.bats Tests for deterministic proposal↔discussion matching.
test/workflows/feature-ideation/lint-prompt.bats Tests for the workflow prompt linter.
test/workflows/feature-ideation/helpers/setup.bash Shared bats helpers (repo root, fixtures, PATH stub install, tmp dirs).
test/workflows/feature-ideation/gh-safe.bats Tests for the defensive gh wrapper (REST + GraphQL).
test/workflows/feature-ideation/fixtures/gh-responses/release-list.json Release list fixture for integration tests.
test/workflows/feature-ideation/fixtures/gh-responses/pr-list-merged.json Merged PR list fixture for integration tests.
test/workflows/feature-ideation/fixtures/gh-responses/issue-list-open.json Open issues fixture for integration tests.
test/workflows/feature-ideation/fixtures/gh-responses/issue-list-closed.json Closed issues fixture for integration tests.
test/workflows/feature-ideation/fixtures/gh-responses/graphql-no-ideas-category.json GraphQL fixture (no Ideas category).
test/workflows/feature-ideation/fixtures/gh-responses/graphql-errors-envelope.json GraphQL fixture with errors[] envelope.
test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions.json GraphQL fixture for discussions list.
test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions-truncated.json GraphQL fixture for truncated discussions (hasNextPage).
test/workflows/feature-ideation/fixtures/gh-responses/graphql-categories.json GraphQL fixture for discussion categories.
test/workflows/feature-ideation/fixtures/expected/truncated.signals.json Expected signals fixture (truncation warnings).
test/workflows/feature-ideation/fixtures/expected/populated.signals.json Expected signals fixture (populated).
test/workflows/feature-ideation/fixtures/expected/INVALID-missing-field.signals.json Negative schema fixture (missing required field).
test/workflows/feature-ideation/fixtures/expected/INVALID-bad-repo.signals.json Negative schema fixture (invalid repo slug).
test/workflows/feature-ideation/fixtures/expected/empty-repo.signals.json Expected signals fixture (empty repo).
test/workflows/feature-ideation/filter-bots.bats Tests for bot-author filtering.
test/workflows/feature-ideation/discussion-mutations.bats Tests for DRY_RUN mutation logging + live-mode behavior.
test/workflows/feature-ideation/date-utils.bats Tests for cross-platform date helpers.
test/workflows/feature-ideation/compose-signals.bats Tests for signals composition + JSON input validation.
test/workflows/feature-ideation/collect-signals.bats End-to-end integration tests for signals collection + schema validation.
standards/workflows/feature-ideation.yml Standards caller stub updated to include dry_run passthrough.
.github/workflows/feature-ideation-tests.yml New CI gate workflow running ShellCheck, prompt lint, schema checks, and bats.
.github/workflows/feature-ideation-reusable.yml Reusable workflow rewritten to use checked-out tooling scripts + DRY_RUN artifacts.
.github/scripts/feature-ideation/validate-signals.py JSON Schema (Draft 2020-12) validator script for signals.json.
.github/scripts/feature-ideation/README.md Tooling and test strategy documentation + contracts.
.github/scripts/feature-ideation/match-discussions.sh Deterministic Jaccard-based matcher producing a match plan JSON.
.github/scripts/feature-ideation/lint-prompt.sh Linter for detecting unescaped shell expansions in workflow prompt blocks.
.github/scripts/feature-ideation/lib/gh-safe.sh Defensive wrapper for gh REST + GraphQL calls (fail loud on auth/errors/null).
.github/scripts/feature-ideation/lib/filter-bots.sh Bot login filtering helper with env extensibility.
.github/scripts/feature-ideation/lib/date-utils.sh Cross-platform date arithmetic helpers.
.github/scripts/feature-ideation/lib/compose-signals.sh Signals JSON composition with pre-validation to avoid jq --argjson crashes.
.github/scripts/feature-ideation/discussion-mutations.sh Mutation wrappers with DRY_RUN JSONL logging contract.
.github/scripts/feature-ideation/collect-signals.sh Orchestrates collection of signals + truncation warnings + composition.
.github/schemas/signals.schema.json JSON Schema contract for signals.json (consumer/producer pin).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread test/workflows/feature-ideation/signals-schema.bats
Comment thread .github/schemas/signals.schema.json Outdated
Comment thread .github/scripts/feature-ideation/validate-signals.py Outdated
Comment thread .github/scripts/feature-ideation/lint-prompt.sh Outdated
Comment thread .github/scripts/feature-ideation/lib/gh-safe.sh
Comment thread .github/scripts/feature-ideation/match-discussions.sh Outdated
Comment thread .github/scripts/feature-ideation/match-discussions.sh
Comment thread test/workflows/feature-ideation/stubs/gh
Comment thread .github/workflows/feature-ideation-reusable.yml
Comment thread .github/workflows/feature-ideation-reusable.yml
DJ and others added 2 commits April 8, 2026 05:03
Aligns the script-tooling self-checkout with the @v1 pinning convention
introduced in #88. Now when a downstream caller stub pins to
`@v1` of the workflow file, the reusable workflow defaults to checking
out the matching `v1` tag for the scripts. Workflow file and scripts
upgrade in lockstep.

Override `tooling_ref` only for testing forks (`tooling_ref: my-branch`)
or bleeding-edge testing (`tooling_ref: main`). Documented in the input
description.

Note for the v1 tag move: after this PR merges, the v1 tag must be
moved forward to point to the new HEAD so that downstream BMAD repos
pinned to @v1 actually pick up the hardening. The change is purely
additive (new optional inputs `dry_run` and `tooling_ref`, new env vars
in the prompt context), so the move is backwards-compatible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 13

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/schemas/signals.schema.json:
- Line 3: The $id value in signals.schema.json incorrectly references
"TalkTerm/.github/schemas/signals.schema.json"; update the "$id" field to
accurately reflect the hosting repository (e.g.,
"https://github.com/petry-projects/.github/schemas/signals.schema.json" or a
neutral canonical URL) so external consumers resolve the schema correctly;
modify the "$id" string in the signals.schema.json file accordingly and ensure
it matches the org-level .github repository location.

In @.github/scripts/feature-ideation/collect-signals.sh:
- Line 28: Add a clear inline comment above the SCHEMA_VERSION constant to state
it must be kept in sync with the version declared in signals.schema.json, and
implement a CI validation step that parses signals.schema.json and fails if its
"version" (or $schema.version) does not match the SCHEMA_VERSION value in
collect-signals.sh; reference the SCHEMA_VERSION variable in the script and the
signals.schema.json filename so the check can locate and compare those two
sources.

In @.github/scripts/feature-ideation/discussion-mutations.sh:
- Around line 156-159: The gh_safe_graphql call is passing labelIds as a quoted
JSON string which sends the wrong GraphQL type; update the invocation in the
gh_safe_graphql call that sets labelableId/labelIds (the block containing "-f
query=\"$query\" -f labelableId=\"$discussion_id\" -f labelIds=...") to pass an
array using the CLI array syntax (use repeated key[] form for labelIds so each
ID is sent as an element) instead of the quoted JSON string; ensure the final
call provides labelIds[]=$label_id (one entry per label) so the GraphQL
parameter matches the expected [ID!]! type.

In @.github/scripts/feature-ideation/lib/filter-bots.sh:
- Around line 35-42: The env var FEATURE_IDEATION_BOT_AUTHORS is split into
extras but individual entries are not trimmed, so entries like "bot1, bot2" keep
leading spaces; update the block that builds extras to split on commas and trim
whitespace from each element before appending to list (i.e., process the local
array extras and sanitize each item by removing leading/trailing whitespace),
then push the cleaned values into list (references:
FEATURE_IDEATION_BOT_AUTHORS, extras, list in this function).

In @.github/scripts/feature-ideation/lib/gh-safe.sh:
- Around line 152-162: The code currently swallows jq errors by using "|| true"
when running jq; update the block that references has_jq, raw, jq_filter and
filtered so jq failures are detected and propagated: run jq without "|| true",
capture jq's exit status and stderr, and if jq exits non‑zero print the jq
stderr to stderr (or a clear error via process/console) and return non‑zero
instead of returning GH_SAFE_EMPTY_ARRAY; only treat empty/null output as the
empty-array sentinel (GH_SAFE_EMPTY_ARRAY) when jq succeeded.

In @.github/scripts/feature-ideation/lint-prompt.sh:
- Line 51: The regex in shell_expansion inconsistently handles escapes: the
$(...) branch uses (?<!\\) but the ${VAR} branch uses (?<!\$) so \${VAR} still
matches; update the ${VAR} subpattern to also reject backslash-escaped forms
(and still allow $$-escaped) by changing its lookbehind to reject both backslash
and dollar (e.g., use a single-character class lookbehind like (?<![\\$]) before
\${...}) so shell_expansion consistently skips both \${VAR} and $${VAR}.

In @.github/scripts/feature-ideation/match-discussions.sh:
- Around line 34-69: The exported Bash functions normalize_title() and
jaccard_similarity() are dead/duplicative because the script performs the same
normalization and Jaccard logic inside the embedded Python blocks; remove the
unused functions and their export lines (normalize_title and jaccard_similarity)
to avoid duplication and maintenance drift, and ensure no other parts of the
script call these symbols before deleting them.

In @.github/scripts/feature-ideation/README.md:
- Around line 22-33: The table contains an escaped pipe sequence `\|\|` inside
the literal shell fragment which can render incorrectly; update the README entry
so the shell snippet is shown as literal code instead of escaped pipes — for
example replace the in-table escaped text with an inline code span or fenced
code block showing the actual fragment (e.g., the intended `2>/dev/null || echo
'[]'`) and ensure the `lib/gh-safe.sh` description still references that
behavior; keep the table layout unchanged but present the shell fragment using
backticks or a code fence so Markdown renderers display the `||` correctly.
- Around line 3-5: Update the README reference to the actual reusable workflow
filename: replace the string `.github/workflows/feature-ideation.yml` with
`.github/workflows/feature-ideation-reusable.yml` (search for that literal in
the README content) so the documentation matches the reusable workflow name used
in the PR.

In @.github/scripts/feature-ideation/validate-signals.py:
- Around line 53-57: The except block catching json.JSONDecodeError when loading
signals_path should return exit code 2 (usage/file error) instead of 1 to match
the documented contract and the handling of malformed schema JSON; update the
handler that catches json.JSONDecodeError for signals_path.read_text() (the
signals loading block referencing signals_path and exc) to write the same stderr
message but return 2.

In `@test/workflows/feature-ideation/gh-safe.bats`:
- Around line 70-76: The assertion line in the test "rest: EXITS NON-ZERO on
auth failure (gh exit 4)" is ineffective because it ends with "|| true"; remove
the "|| true" and replace the redundant stderr check by asserting the captured
run output contains "rest-failure" (e.g., use [[ "$output" == *"rest-failure"*
]]) after the existing status check; locate the test block referencing
GH_STUB_EXIT and gh_safe_rest to make this change.

In `@test/workflows/feature-ideation/match-discussions.bats`:
- Around line 21-38: build_signals currently hardcodes "count": 0 for
ideas_discussions; change it to compute the actual number of items from the
discussions JSON and inject that value into the "count" field. Inside
build_signals, call jq to derive count (e.g. use echo "$discussions" | jq
'length' or similar) and assign it to a local variable (e.g. count) and then
write `"ideas_discussions": { "count": ${count}, "items": ${discussions} }` into
the signals JSON so count matches the items; keep the existing discussions
parameter and TT_TMP output location and ensure jq is used to safely handle
empty arrays and valid JSON.

In `@test/workflows/feature-ideation/signals-schema.bats`:
- Around line 13-16: The test named "schema: validator script exists and is
executable" is asserting existence/readability only; update the assertions to
match the name by checking executability (use [ -x "$VALIDATOR" ] and optionally
keep [ -f "$VALIDATOR" ]) or alternatively change the test name to "exists and
is readable" to reflect the current checks; locate the `@test` block with that
title and the VALIDATOR variable to apply the fix.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b175ee9c-d0ba-4fe8-9a6a-089485a79cdb

📥 Commits

Reviewing files that changed from the base of the PR and between 67cb057 and 43bcf96.

📒 Files selected for processing (39)
  • .github/schemas/signals.schema.json
  • .github/scripts/feature-ideation/README.md
  • .github/scripts/feature-ideation/collect-signals.sh
  • .github/scripts/feature-ideation/discussion-mutations.sh
  • .github/scripts/feature-ideation/lib/compose-signals.sh
  • .github/scripts/feature-ideation/lib/date-utils.sh
  • .github/scripts/feature-ideation/lib/filter-bots.sh
  • .github/scripts/feature-ideation/lib/gh-safe.sh
  • .github/scripts/feature-ideation/lint-prompt.sh
  • .github/scripts/feature-ideation/match-discussions.sh
  • .github/scripts/feature-ideation/validate-signals.py
  • .github/workflows/feature-ideation-reusable.yml
  • .github/workflows/feature-ideation-tests.yml
  • standards/workflows/feature-ideation.yml
  • test/workflows/feature-ideation/collect-signals.bats
  • test/workflows/feature-ideation/compose-signals.bats
  • test/workflows/feature-ideation/date-utils.bats
  • test/workflows/feature-ideation/discussion-mutations.bats
  • test/workflows/feature-ideation/filter-bots.bats
  • test/workflows/feature-ideation/fixtures/expected/INVALID-bad-repo.signals.json
  • test/workflows/feature-ideation/fixtures/expected/INVALID-missing-field.signals.json
  • test/workflows/feature-ideation/fixtures/expected/empty-repo.signals.json
  • test/workflows/feature-ideation/fixtures/expected/populated.signals.json
  • test/workflows/feature-ideation/fixtures/expected/truncated.signals.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-categories.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions-truncated.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-errors-envelope.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-no-ideas-category.json
  • test/workflows/feature-ideation/fixtures/gh-responses/issue-list-closed.json
  • test/workflows/feature-ideation/fixtures/gh-responses/issue-list-open.json
  • test/workflows/feature-ideation/fixtures/gh-responses/pr-list-merged.json
  • test/workflows/feature-ideation/fixtures/gh-responses/release-list.json
  • test/workflows/feature-ideation/gh-safe.bats
  • test/workflows/feature-ideation/helpers/setup.bash
  • test/workflows/feature-ideation/lint-prompt.bats
  • test/workflows/feature-ideation/match-discussions.bats
  • test/workflows/feature-ideation/signals-schema.bats
  • test/workflows/feature-ideation/stubs/gh

Comment thread .github/schemas/signals.schema.json Outdated
Comment thread .github/scripts/feature-ideation/collect-signals.sh
Comment thread .github/scripts/feature-ideation/discussion-mutations.sh Outdated
Comment thread .github/scripts/feature-ideation/lib/filter-bots.sh
Comment thread .github/scripts/feature-ideation/lib/gh-safe.sh
Comment thread .github/scripts/feature-ideation/README.md
Comment thread .github/scripts/feature-ideation/validate-signals.py Outdated
Comment thread test/workflows/feature-ideation/gh-safe.bats Outdated
Comment thread test/workflows/feature-ideation/match-discussions.bats
Comment thread test/workflows/feature-ideation/signals-schema.bats
DJ and others added 2 commits April 8, 2026 05:17
…6 tests)

Triaged 14 inline comments from Copilot's review of #85; two were already
fixed by the tooling_ref→v1 commit, the remaining 11 are addressed here.

Critical bug fixes
------------------

1. lint-prompt.sh now scans claude-code-action v1 `prompt:` blocks in
   addition to v0 `direct_prompt:`. The reusable workflow uses `prompt:`
   so the linter was silently allowing R2 regressions on the very file
   it was supposed to protect. Added two regression tests covering both
   the v1 form and a clean v1 form passes.

2. add_label_to_discussion now sends labelIds as a proper JSON array via
   gh_safe_graphql_input (new helper). Previously used `gh -f labelIds=`
   which sent the literal string `["L_1"]` and the GraphQL API would have
   rejected the mutation at runtime. Added a test that captures gh's
   stdin and asserts the variables block contains a length-1 array.

3. validate-signals.py now registers a `date-time` format checker via
   FormatChecker so the `format: date-time` keyword in signals.schema.json
   is actually enforced. Draft202012Validator does NOT enforce formats
   by default, and the default FormatChecker omits date-time entirely.
   Used an inline checker (datetime.fromisoformat with Z normalisation)
   to avoid pulling in rfc3339-validator. Added two regression tests:
   one for an invalid timestamp failing, one for a clean timestamp
   passing.

4. gh_safe_graphql --jq path no longer swallows jq filter errors with
   `|| true`. Filter typos / wrong paths now exit non-zero instead of
   silently returning []. Added a regression test using a deliberately
   broken filter.

5. collect-signals.sh now computes the open-issue truncation warning
   BEFORE filter_bots_apply. Previously, a result set composed entirely
   of bots could drop below ISSUE_LIMIT after filtering and mask real
   truncation. Added an integration test with all-bot fixtures.

6. match-discussions.sh now validates MATCH_THRESHOLD as a non-negative
   number in [0, 1] before passing to Python. A typo previously surfaced
   as an opaque traceback. Added regression tests for non-numeric input,
   out-of-range input, and boundary values 0 and 1.

Cleanup
-------

7. Removed dead bash `normalize_title` / `jaccard_similarity` functions
   from match-discussions.sh — the actual matching is implemented in the
   embedded Python block and the bash helpers were never called.

8. Schema $id corrected from petry-projects/TalkTerm/... to the canonical
   petry-projects/.github location.

9. signals-schema.bats "validator script exists and is executable" test
   now actually checks the `-x` bit (was only checking `-f` and `-r`).

10. README + filter-bots.sh comments now describe the bot list as a
    "blocklist" (it removes matching authors) instead of "allowlist".

11. test/workflows/feature-ideation/stubs/gh now logs argv with `printf
    '%q '` so each invocation is shell-quoted and re-parseable, matching
    its documentation. Previously logged `$*` which lost arg boundaries.

New helper
----------

gh_safe_graphql_input — same defensive contract as gh_safe_graphql, but
takes a fully-formed JSON request body via stdin instead of -f/-F flags.
Use for mutations whose variables include arrays (e.g. labelIds: [ID!]!)
that gh's flag-based interface cannot express. Five new tests cover
its happy path and every documented failure mode.

Tests
-----

Test count: 92 → 108 (16 new regression tests, all green). Run with:
  bats test/workflows/feature-ideation/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… 1 test)

Triaged 13 inline comments from CodeRabbit's review of #85; 6 of them
overlapped with Copilot's review and were already fixed by bcaa579. The
remaining 7 are addressed here.

Fixes
-----

1. lint-prompt.sh: ${VAR} branch lookbehind was inconsistent with the
   $(...) branch — only rejected $$VAR but not \${VAR}. Both branches
   now use [\\$] so backslash-escaped and dollar-escaped forms are
   skipped uniformly.

2. filter-bots.sh: FEATURE_IDEATION_BOT_AUTHORS CSV entries are now
   trimmed of leading/trailing whitespace before being added to the
   blocklist, so "bot1, bot2" matches both bots correctly instead of
   keeping a literal " bot2" entry.

3. validate-signals.py: malformed signals JSON now exits 2 (file/data
   error) to match the documented contract, instead of 1 (which means
   schema validation error).

4. README.md: corrected the workflow filename reference from
   feature-ideation.yml to feature-ideation-reusable.yml, and reworded
   the table cell that contained `\|\|` (escaped pipes that don't
   render correctly in some Markdown engines) to use plain prose. Also
   noted that lint-prompt scans both v0 `direct_prompt:` and v1 `prompt:`.

5. collect-signals.sh: added an explicit comment above SCHEMA_VERSION
   documenting the lockstep requirement with signals.schema.json's
   $comment version annotation. Backed by a new bats test that parses
   both files and asserts they match.

6. signals.schema.json: added $comment "version: 1.0.0" annotation so
   the schema file declares its own version explicitly. Used $comment
   instead of a custom keyword to keep Draft202012 compliance.

7. test/workflows/feature-ideation/match-discussions.bats: build_signals
   helper now computes the discussions count from the array length
   instead of hardcoding 0, so the fixture satisfies its own contract
   (cosmetic — the matcher only reads .items, but contract hygiene
   matters in test scaffolding).

8. test/workflows/feature-ideation/gh-safe.bats: removed the `|| true`
   suffix on the rest-failure assertion that made it always pass.
   Now uses --separate-stderr to capture stderr and asserts the
   structured `[gh-safe][rest-failure]` prefix is emitted on the auth
   failure path. Required `bats_require_minimum_version 1.5.0` to
   suppress the bats-core warning about flag usage.

Tests
-----

Test count: 108 → 109 (one new test for SCHEMA_VERSION ↔ schema sync).
All 109 passing locally. Run with:
  bats test/workflows/feature-ideation/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@don-petry
Copy link
Copy Markdown
Contributor Author

Review feedback addressed across two follow-up commits

Thanks @copilot-pull-request-reviewer and @coderabbitai for the thorough reviews. All 14 Copilot comments and all 13 CodeRabbit comments have been triaged and addressed across two commits on top of the merge:

bcaa579 — Copilot review fixes (11 fixes + 16 new regression tests)

Critical bugs:

  • lint-prompt.sh now scans claude-code-action v1 prompt: blocks in addition to v0 direct_prompt: — the linter previously had a blind spot on the very file it was meant to protect
  • add_label_to_discussion now sends labelIds as a proper JSON array via the new gh_safe_graphql_input helper instead of the literal string ["L_1"] (the GraphQL API would have rejected the mutation at runtime)
  • validate-signals.py now registers an inline date-time format checker so format: date-time keywords are actually enforced (default Draft202012Validator ignores format entirely; default FormatChecker omits date-time)
  • gh_safe_graphql --jq no longer swallows jq filter errors with || true — typos surface as exit-65 instead of silently returning []
  • collect-signals.sh now computes the open-issue truncation warning before filter_bots_apply so an all-bot result set at the limit doesn't mask real truncation
  • match-discussions.sh now validates MATCH_THRESHOLD as a non-negative number in [0, 1] before passing to Python (no more opaque tracebacks on typo)

Cleanup:

  • ✅ Dead bash normalize_title / jaccard_similarity functions removed from match-discussions.sh
  • ✅ Schema $id corrected to point at petry-projects/.github (not the original TalkTerm location)
  • signals-schema.bats "executable" test now actually checks -x
  • ✅ README + filter-bots.sh comments now describe the bot list as a "blocklist"
  • gh stub now logs argv with printf '%q ' so each invocation is shell-quoted and re-parseable

New helper:

  • gh_safe_graphql_input(body) — same defensive contract as gh_safe_graphql, but takes a fully-formed JSON request body via stdin. Use for mutations whose variables include arrays (e.g. labelIds: [ID!]!) that gh's flag-based interface cannot express.

67c6037 — CodeRabbit review fixes (7 additional fixes + 1 new test)

CodeRabbit identified 13 actionable items; 6 overlapped with Copilot's set (already fixed in bcaa579). The remaining 7 are addressed here:

  • lint-prompt.sh regex: ${VAR} lookbehind now uses [\\$] so backslash-escaped and dollar-escaped forms are skipped uniformly
  • filter-bots.sh: FEATURE_IDEATION_BOT_AUTHORS CSV entries are now whitespace-trimmed before joining the blocklist
  • validate-signals.py: malformed signals JSON now exits 2 (file/data error) per the documented contract, not 1 (schema error)
  • README.md: corrected workflow filename reference (feature-ideation-reusable.yml), reworded the \|\|-escaped table cell, noted lint-prompt scans both direct_prompt: and prompt:
  • collect-signals.sh: explicit comment above SCHEMA_VERSION documenting the lockstep requirement with signals.schema.json, backed by a new bats test that parses both files and asserts they match
  • signals.schema.json: added $comment "version: 1.0.0" annotation
  • match-discussions.bats: build_signals helper computes discussions count from array length instead of hardcoding 0
  • gh-safe.bats: removed the || true that made the auth-failure assertion ineffective; now uses --separate-stderr and asserts the [gh-safe][rest-failure] prefix

Final state

Metric Before review After both fix batches
bats tests 92 109
Open Copilot issues 14 0
Open CodeRabbit issues 13 0
shellcheck warnings 0 0
lint-prompt warnings 0 0
Schema fixture failures 0 0

Local CI mirror green across all gates. Pushed to feat/feature-ideation-test-hardening at 67c6037.

@coderabbitai please re-review.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 8, 2026

@don-petry: Thank you for the detailed and thorough summary! The two fix batches (bcaa579 and 67c6037) look very well-structured — particularly the critical bug fixes around the add_label_to_discussion GraphQL array issue, the date-time format enforcement in validate-signals.py, and the truncation-before-bot-filter ordering in collect-signals.sh. The new gh_safe_graphql_input helper is a clean addition for array-variable mutations. The jump from 92 → 109 bats tests is a strong signal that regressions are covered. Triggering a full re-review now.


🧠 Learnings used
Learnt from: CR
Repo: petry-projects/.github PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-08T01:43:24.250Z
Learning: All review threads must be resolved before merge; agents MUST address and resolve Copilot, CodeRabbit, and human review comments

Learnt from: CR
Repo: petry-projects/.github PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-08T01:43:24.250Z
Learning: Iterate until all CI checks pass before requesting review or marking PR ready; never bypass CI gates or weaken thresholds
✅ Actions performed

Full review triggered.

Copilot AI changed the title test(feature-ideation): extract bash, add schema + 92 bats tests fix(feature-ideation): address Copilot + CodeRabbit review on PR #85 (18 fixes, 17 new tests) Apr 8, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 11

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/scripts/feature-ideation/collect-signals.sh:
- Around line 56-59: The script accepts ISSUE_LIMIT, PR_LIMIT and
DISCUSSION_LIMIT but does not validate they are numeric; add a fast-fail numeric
check after the assignments for local issue_limit, pr_limit and discussion_limit
(e.g., using a regex like '^[0-9]+$' or bash arithmetic test) and if any value
is non-numeric print a clear error to stderr and exit 1; keep the existing
defaults (ISSUE_LIMIT:-50, PR_LIMIT:-30, DISCUSSION_LIMIT:-100) and ensure you
validate the environment variables (ISSUE_LIMIT, PR_LIMIT, DISCUSSION_LIMIT)
themselves before using them in GraphQL so malformed values aren’t forwarded
(reference the variables issue_limit, pr_limit, discussion_limit and the
existing output_path variable to locate the block to modify).
- Around line 61-67: The REPO parsing is too permissive: validate that REPO
matches exactly one non-empty owner and one non-empty repo segment (no
leading/trailing slash, no empty segments, no extra path parts). Replace the
current split check with a strict pattern check against REPO (e.g. assert REPO
matches '^[^/]+/[^/]+$'), then extract owner and repo_name from REPO only after
that succeeds; if the check fails, print the same error message and return 64.
Refer to the REPO variable and the owner and repo_name extraction logic when
implementing this change.

In @.github/scripts/feature-ideation/lint-prompt.sh:
- Around line 109-120: The loop collapses all nonzero exit codes from scan_file
into 1, losing the distinct missing-file error (2) and allowing later findings
to overwrite earlier file errors; change the handling in the for-loop that
iterates files so that when scan_file "$file" returns nonzero you capture its
actual exit code (e.g., rc=$?) and set exit=$rc only if rc is 2 or if exit is
currently 0 (otherwise keep the higher-priority existing exit); similarly ensure
the existing missing-file branch (which sets exit=2) is preserved and not
overwritten by later scan failures—use the variables exit and scan_file to
implement this priority-preserving logic.
- Around line 81-83: The current remover using no_gh =
re.sub(r'\$\{\{[^}]*\}\}', '', raw) fails on nested/quoted braces; update the
logic that produces no_gh (and the loop using shell_expansion.finditer(no_gh))
to scan raw and strip GitHub Actions expressions by walking characters: on
seeing '${{' consume until the matching '}}' while tracking and ignoring braces
inside single- or double-quoted strings and escaped quotes, then append the
rest; ensure you replace the single re.sub call with this scanner-based removal
so expressions like "${{ format('${FOO}', github.ref) }}" are fully removed
before shell_expansion.finditer runs.
- Around line 57-59: Update the prompt_marker regex to recognize YAML
block-scalar chomping modifiers so block headers like "|-", "|+", ">-", ">+" are
matched; change the pattern used by prompt_marker (the compiled regex named
prompt_marker) to r'(?:direct_prompt|prompt):\s*[|>]?[-+]?\s*$' and add
regression tests that exercise prompt: |- , prompt: |+, prompt: >-, and prompt:
>+ (and the corresponding direct_prompt variants) to ensure those bodies are
linted.

In @.github/scripts/feature-ideation/match-discussions.sh:
- Around line 100-103: Wrap the json.load calls for signals_path and
proposals_path in a try/except that catches json.JSONDecodeError, emit a
structured error message (including which file failed and the exception message)
and exit with code 2 to match validate-signals.py's contract; update the block
containing the signals = json.load(f) and proposals = json.load(f) lines to
perform each load inside the try/except (or a shared try that reports the
specific filename) so malformed JSON results in a controlled, consistent exit
code and message rather than a Python traceback.
- Around line 124-140: The loop treating discussions uses disc.get("id") which
can be None and cause all id-less discussions to be merged; to fix, extract the
id into a variable (e.g., disc_id = disc.get("id")) at the top of the loop, skip
that iteration if disc_id is falsy or None before checking seen_disc_ids or
comparing similarity, and use disc_id when adding to seen_disc_ids and when
skipping; update references to disc.get("id") in this block so comparisons and
set additions always use the validated disc_id.

In @.github/scripts/feature-ideation/validate-signals.py:
- Around line 81-92: The _check_date_time function lacks a return type
annotation; update its signature to include "-> bool" (i.e., change the callback
def _check_date_time(instance): to def _check_date_time(instance) -> bool:) so
static analyzers understand it returns a boolean; keep the existing behavior and
exception handling (ValueError/TypeError) and do not alter the body of
_check_date_time or the `@format_checker.checks` decorator.

In `@test/workflows/feature-ideation/lint-prompt.bats`:
- Around line 119-122: The test suite is missing a regression asserting exit
code 2 for file-error precedence in main(), so add a new BATS test (or extend
tests in test/workflows/feature-ideation/lint-prompt.bats) that invokes the
LINTER in a scenario where a missing workflow file is encountered before any
lint findings and asserts that the process exits with status 2 (rather than
collapsing to 0 or another code); reference the existing test names like
"lint-prompt: scans every .github/workflows file by default" and the program
entrypoint main() when locating where to add the new case to validate the
file-error path.

In `@test/workflows/feature-ideation/match-discussions.bats`:
- Around line 170-184: Add tests to cover syntactically invalid JSON input for
the MATCH runner: create a test named like "match: malformed JSON in signals
file produces error" that writes invalid JSON into signals.json (e.g., printf
'not valid json' >"${TT_TMP}/signals.json"), builds a valid proposals.json via
build_proposals, runs bash "$MATCH" "${TT_TMP}/signals.json"
"${TT_TMP}/proposals.json" and assert the process exits non‑zero (or specific
exit code 1 if you want to codify current behavior); similarly add a mirror test
"match: malformed JSON in proposals file produces error" that injects invalid
JSON into proposals.json and asserts non‑zero exit. Ensure tests reference the
MATCH variable and use TT_TMP paths consistent with existing tests.

In `@test/workflows/feature-ideation/signals-schema.bats`:
- Around line 38-70: Add a new BATS test that verifies syntactically malformed
JSON returns exit code 2: create a bad_file (e.g. write invalid JSON like "}{")
into BATS_TEST_TMPDIR, run python3 "$VALIDATOR" "$bad_file" "$SCHEMA" (same
pattern used in existing tests) and assert [ "$status" -eq 2 ]; give the test a
clear name such as "schema: malformed JSON FAILS with exit code 2" so it
complements the existing tests ("schema: malformed repo string FAILS
validation", "schema: invalid date-time scan_date FAILS validation", etc.).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c17720bc-b640-423c-bda7-2c442a1884e6

📥 Commits

Reviewing files that changed from the base of the PR and between 43bcf96 and 67c6037.

📒 Files selected for processing (16)
  • .github/schemas/signals.schema.json
  • .github/scripts/feature-ideation/README.md
  • .github/scripts/feature-ideation/collect-signals.sh
  • .github/scripts/feature-ideation/discussion-mutations.sh
  • .github/scripts/feature-ideation/lib/filter-bots.sh
  • .github/scripts/feature-ideation/lib/gh-safe.sh
  • .github/scripts/feature-ideation/lint-prompt.sh
  • .github/scripts/feature-ideation/match-discussions.sh
  • .github/scripts/feature-ideation/validate-signals.py
  • test/workflows/feature-ideation/collect-signals.bats
  • test/workflows/feature-ideation/discussion-mutations.bats
  • test/workflows/feature-ideation/gh-safe.bats
  • test/workflows/feature-ideation/lint-prompt.bats
  • test/workflows/feature-ideation/match-discussions.bats
  • test/workflows/feature-ideation/signals-schema.bats
  • test/workflows/feature-ideation/stubs/gh

Comment thread .github/scripts/feature-ideation/collect-signals.sh
Comment thread .github/scripts/feature-ideation/collect-signals.sh Outdated
Comment thread .github/scripts/feature-ideation/lint-prompt.sh Outdated
Comment thread .github/scripts/feature-ideation/lint-prompt.sh Outdated
Comment thread .github/scripts/feature-ideation/lint-prompt.sh
Comment thread .github/scripts/feature-ideation/match-discussions.sh Outdated
Comment thread .github/scripts/feature-ideation/validate-signals.py
Comment thread test/workflows/feature-ideation/lint-prompt.bats
Comment thread test/workflows/feature-ideation/match-discussions.bats
Comment thread test/workflows/feature-ideation/signals-schema.bats
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 8, 2026

@don-petry: Acknowledged — the two-commit fix series is thorough and well-documented. The critical runtime bugs (GraphQL array encoding in add_label_to_discussion, date-time format enforcement, truncation-before-bot-filter ordering, jq error propagation) and the 17 regression tests added on top are exactly the right response to those findings. Triggering a full re-review now across all changed files.


🧠 Learnings used
Learnt from: CR
Repo: petry-projects/.github PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-08T01:43:24.250Z
Learning: All review threads must be resolved before merge; agents MUST address and resolve Copilot, CodeRabbit, and human review comments

Learnt from: CR
Repo: petry-projects/.github PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-08T01:43:24.250Z
Learning: Iterate until all CI checks pass before requesting review or marking PR ready; never bypass CI gates or weaken thresholds
✅ Actions performed

Full review triggered.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

♻️ Duplicate comments (6)
.github/scripts/feature-ideation/match-discussions.sh (1)

100-103: ⚠️ Potential issue | 🟠 Major

Catch malformed JSON and return the data-error code.

Lines 100-103 still let json.load() raise a traceback and exit 1. This is a realistic workflow failure because .github/workflows/feature-ideation-reusable.yml:376-407 has Claude write proposals.json and invokes the matcher immediately afterward; malformed JSON should produce a structured [match-discussions] invalid JSON ... message and exit 2 instead.

Suggested fix
-with open(signals_path) as f:
-    signals = json.load(f)
-with open(proposals_path) as f:
-    proposals = json.load(f)
+def load_json(path: str, label: str):
+    try:
+        with open(path, encoding="utf-8") as f:
+            return json.load(f)
+    except json.JSONDecodeError as exc:
+        sys.stderr.write(f"[match-discussions] invalid JSON in {label}: {exc}\n")
+        sys.exit(2)
+
+signals = load_json(signals_path, "signals")
+proposals = load_json(proposals_path, "proposals")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/match-discussions.sh around lines 100 -
103, Wrap the json.load calls for signals_path and proposals_path in try/except
blocks that catch json.JSONDecodeError (and optionally ValueError), log a
structured message like "[match-discussions] invalid JSON <filename>: <error>"
using the same logging/output mechanism, and exit with status code 2;
specifically update the locations that call json.load(signals_path) and
json.load(proposals_path) (referencing the signals_path and proposals_path
variables and the json.load function) so malformed JSON no longer raises an
uncaught traceback but returns the data-error exit code 2 with the described
message.
.github/scripts/feature-ideation/lint-prompt.sh (3)

109-120: ⚠️ Potential issue | 🟠 Major

Preserve exit code 2 for missing or unreadable files.

Lines 116-118 collapse every scan_file failure to 1. That hides scan_file’s 2 path and lets a later lint finding overwrite an earlier file error, which breaks the documented exit-code contract.

Suggested fix
-  local exit=0
+  local exit=0
+  local file_rc=0
   for file in "$@"; do
     if [ ! -f "$file" ]; then
       printf '[lint-prompt] not found: %s\n' "$file" >&2
       exit=2
       continue
     fi
-    if ! scan_file "$file"; then
-      exit=1
-    fi
+    if scan_file "$file"; then
+      file_rc=0
+    else
+      file_rc=$?
+    fi
+    case "$file_rc" in
+      0) ;;
+      1) if [ "$exit" -eq 0 ]; then exit=1; fi ;;
+      2) exit=2 ;;
+      *) return "$file_rc" ;;
+    esac
   done
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/lint-prompt.sh around lines 109 - 120, The
loop currently collapses all scan_file failures to exit=1 and can overwrite a
prior exit=2 (missing/unreadable file); change the scan_file call in the for
loop to capture its return code (e.g., rc=$?) and then set exit to 2 if rc==2,
else set exit to 1 if rc!=0, making sure you never downgrade an existing exit=2
to 1. Update the block around the scan_file invocation (referencing the
scan_file call and the local exit variable) so the final return "$exit"
preserves the severity of a missing/unreadable-file error.

81-83: ⚠️ Potential issue | 🟠 Major

Replace the ${{ ... }} stripper with a scanner.

Line 82 stops at the first }, so expressions such as ${{ format('${FOO}', github.ref) }} are not fully removed and ${FOO} is falsely reported as a shell expansion. A small stateful scanner that consumes until the matching }} is needed here.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/lint-prompt.sh around lines 81 - 83, The
regexp re.sub(r'\$\{\{[^}]*\}\}', '', raw) assigned to no_gh doesn't handle
nested/embedded braces and stops at the first '}', causing sequences like ${{
format('${FOO}', github.ref) }} to be clipped incorrectly; replace that regexp
with a small stateful scanner that iterates over raw and when it sees the start
token "${{" consumes characters until it finds the matching "}}" (skipping over
any internal braces or quotes) and only then removes the whole "${{...}}" token
before passing the result to shell_expansion.finditer; update the code that sets
no_gh so it uses this scanner function to produce the stripped string instead of
the current regex.

57-59: ⚠️ Potential issue | 🟠 Major

Handle YAML chomping modifiers in prompt_marker.

Lines 57-59 still miss valid block-scalar headers like prompt: |-, |+, >-, and >+, so those prompt bodies bypass the linter entirely.

Suggested fix
-prompt_marker = re.compile(r'(?:direct_prompt|prompt):\s*[|>]?\s*$')
+prompt_marker = re.compile(r'(?:direct_prompt|prompt):\s*[|>]?[-+]?\s*$')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/lint-prompt.sh around lines 57 - 59, The
prompt_marker regex currently matches block scalars like `|` or `>` but misses
YAML chomping modifiers (e.g., `|-`, `|+`, `>-`, `>+`) so those prompt bodies
bypass the linter; update the prompt_marker pattern to accept optional
chomping/indentation modifiers after the block scalar indicator (allowing + or -
and optional indentation digits) so entries like `prompt: |-`, `prompt: >+`, or
`prompt: |+2` are recognized; modify the prompt_marker (the re.compile
definition) accordingly and ensure tests cover examples with `direct_prompt:`
and `prompt:` variants plus the chomping forms.
test/workflows/feature-ideation/lint-prompt.bats (1)

119-122: ⚠️ Potential issue | 🟡 Minor

Still missing exit-code precedence regression for file errors.

Please add a case where a missing file is processed before a lint-failing file and assert final status is 2. Without it, file-error precedence can regress silently.

Suggested test
+@test "lint-prompt: file errors keep exit 2 even with later findings" {
+  write_yml "${TT_TMP}/bad.yml" <<'YML'
+jobs:
+  analyze:
+    steps:
+      - uses: anthropics/claude-code-action@v1
+        with:
+          prompt: |
+            Date: $(date -u +%Y-%m-%d)
+YML
+  run bash "$LINTER" "${TT_TMP}/missing.yml" "${TT_TMP}/bad.yml"
+  [ "$status" -eq 2 ]
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/workflows/feature-ideation/lint-prompt.bats` around lines 119 - 122, Add
a new Bats test that reproduces the exit-code precedence regression: create a
scenario where a missing file is encountered before a lint-failing file and
assert the final exit status is 2; use the existing test harness variables
(e.g., the LINTER invoked in the "lint-prompt: scans every .github/workflows
file by default" test) and name the test clearly (e.g., "lint-prompt: missing
file before failing file returns 2") so reviewers can find it, ensure you set up
one missing path and one file that will fail linting, run bash "$LINTER", and
assert [ "$status" -eq 2 ] to lock in file-error precedence.
.github/scripts/feature-ideation/collect-signals.sh (1)

56-67: ⚠️ Potential issue | 🟡 Minor

Prior review items remain unaddressed.

Two issues flagged in earlier reviews are still present:

  1. Non-numeric limits (lines 56-59): ISSUE_LIMIT, PR_LIMIT, DISCUSSION_LIMIT are not validated. A value like ISSUE_LIMIT=foo will cause inconsistent failures downstream.

  2. REPO format validation (lines 61-67): The current check allows malformed values like /repo, org//repo, or org/repo/extra to pass. The suggested fix was to use a stricter regex: [[ ! $REPO =~ ^[^/]+/[^/]+$ ]].

Both are minor issues since malformed input fails downstream, but the error messages would be clearer with upfront validation.

,

🔧 Combined fix for both issues
   local issue_limit="${ISSUE_LIMIT:-50}"
   local pr_limit="${PR_LIMIT:-30}"
   local discussion_limit="${DISCUSSION_LIMIT:-100}"
   local output_path="${SIGNALS_OUTPUT:-./signals.json}"
 
+  # Validate numeric limits
+  for name in ISSUE_LIMIT PR_LIMIT DISCUSSION_LIMIT; do
+    local value
+    case "$name" in
+      ISSUE_LIMIT) value="$issue_limit" ;;
+      PR_LIMIT) value="$pr_limit" ;;
+      DISCUSSION_LIMIT) value="$discussion_limit" ;;
+    esac
+    if [[ ! $value =~ ^[1-9][0-9]*$ ]]; then
+      printf '[collect-signals] %s must be a positive integer, got: %s\n' "$name" "$value" >&2
+      return 64
+    fi
+  done
+
-  local owner repo_name
-  owner="${REPO%%/*}"
-  repo_name="${REPO##*/}"
-  if [ "$owner" = "$REPO" ] || [ -z "$repo_name" ]; then
+  if [[ ! $REPO =~ ^[^/]+/[^/]+$ ]]; then
     printf '[collect-signals] REPO must be in owner/name format, got: %s\n' "$REPO" >&2
     return 64
   fi
+  local owner repo_name
+  owner="${REPO%%/*}"
+  repo_name="${REPO##*/}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/collect-signals.sh around lines 56 - 67,
Validate ISSUE_LIMIT, PR_LIMIT, and DISCUSSION_LIMIT variables (issue_limit,
pr_limit, discussion_limit) are positive integers and default to their fallback
values if non-numeric, e.g., check with a numeric regex or test before assigning
and error out with a clear message if invalid; also tighten REPO validation by
replacing the current owner/repo parsing and checks with a regex match like [[
$REPO =~ ^[^/]+/[^/]+$ ]] (use that to set owner and repo_name from REPO or
print a clear error and return 64) so malformed values such as "/repo",
"org//repo", or "org/repo/extra" are rejected up-front.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/scripts/feature-ideation/lib/compose-signals.sh:
- Around line 46-56: The preflight JSON check in the loop over variables
open_issues, closed_issues, ideas_discussions, releases, merged_prs,
feature_requests, bug_reports, truncation_warnings currently only verifies valid
JSON; change the jq invocation in that for-loop (inside compose-signals.sh) to
assert the value is a JSON array (i.e., fail if jq reports a non-array type)
instead of using plain 'jq -e .', and keep the existing error message and return
65 behavior so non-array inputs are rejected early.

In @.github/scripts/feature-ideation/lib/date-utils.sh:
- Around line 12-17: The function date_days_ago reads $1 before checking
argument count which breaks under set -u; add an explicit argument-count guard
at the top (e.g. if [ "$#" -ne 1 ] ... ) before touching positional parameters,
then assign local days="$1" and keep the existing numeric validation and error
message; adjust the return code/message as needed so date_days_ago handles
zero-arg calls safely.

In @.github/scripts/feature-ideation/lib/filter-bots.sh:
- Around line 38-40: The unquoted array expansion local
extras=($FEATURE_IDEATION_BOT_AUTHORS) allows globbing; change the splitting to
a safe read into an array and disable globbing around it: temporarily call set
-f, do IFS=',' read -r -a extras <<< "$FEATURE_IDEATION_BOT_AUTHORS"
(referencing FEATURE_IDEATION_BOT_AUTHORS and the extras array), then restore
globbing with set +f so wildcard characters in entries are not expanded against
the filesystem.

In @.github/scripts/feature-ideation/lib/gh-safe.sh:
- Around line 95-98: The script dereferences missing arguments causing
unbound-variable errors under set -u: in the --jq parsing block (referencing
args, has_jq, jq_filter, i) add a bounds check before accessing args[$((i + 1))]
(e.g. ensure $((i+1)) < ${`#args`[@]}) and if the filter is missing emit the same
structured error and return code 64 instead of dereferencing; likewise, in
gh_safe_graphql_input() verify the input argument exists (check $# or existence
of $1) and return error code 64 with a clear message if it is missing, avoiding
unconditional use of $1.

In @.github/scripts/feature-ideation/README.md:
- Around line 54-55: Update the README entry that currently says "Lint the
workflow's direct_prompt block" to reflect that the lint script (lint-prompt.sh)
now validates both direct_prompt: and prompt: blocks, and change any
instructions telling Mary to edit prompts to instead point her to the reusable
workflow file where prompt content is owned/maintained; ensure the command
example and brief description mention both keys (direct_prompt and prompt) and
reference the reusable workflow as the source of truth for prompt edits.

In @.github/scripts/feature-ideation/validate-signals.py:
- Around line 55-69: The JSON loading blocks for signals_path and schema_path
currently only catch json.JSONDecodeError, so file read failures escape; wrap
each signals_path.read_text() and schema_path.read_text() call in a try/except
that catches file I/O errors (e.g., OSError/FileNotFoundError) and on error
write a stderr message similar to the JSON error messages and return 2 to
preserve the exit-code contract; keep the existing json.JSONDecodeError handlers
for malformed JSON after successfully reading the file.

---

Duplicate comments:
In @.github/scripts/feature-ideation/collect-signals.sh:
- Around line 56-67: Validate ISSUE_LIMIT, PR_LIMIT, and DISCUSSION_LIMIT
variables (issue_limit, pr_limit, discussion_limit) are positive integers and
default to their fallback values if non-numeric, e.g., check with a numeric
regex or test before assigning and error out with a clear message if invalid;
also tighten REPO validation by replacing the current owner/repo parsing and
checks with a regex match like [[ $REPO =~ ^[^/]+/[^/]+$ ]] (use that to set
owner and repo_name from REPO or print a clear error and return 64) so malformed
values such as "/repo", "org//repo", or "org/repo/extra" are rejected up-front.

In @.github/scripts/feature-ideation/lint-prompt.sh:
- Around line 109-120: The loop currently collapses all scan_file failures to
exit=1 and can overwrite a prior exit=2 (missing/unreadable file); change the
scan_file call in the for loop to capture its return code (e.g., rc=$?) and then
set exit to 2 if rc==2, else set exit to 1 if rc!=0, making sure you never
downgrade an existing exit=2 to 1. Update the block around the scan_file
invocation (referencing the scan_file call and the local exit variable) so the
final return "$exit" preserves the severity of a missing/unreadable-file error.
- Around line 81-83: The regexp re.sub(r'\$\{\{[^}]*\}\}', '', raw) assigned to
no_gh doesn't handle nested/embedded braces and stops at the first '}', causing
sequences like ${{ format('${FOO}', github.ref) }} to be clipped incorrectly;
replace that regexp with a small stateful scanner that iterates over raw and
when it sees the start token "${{" consumes characters until it finds the
matching "}}" (skipping over any internal braces or quotes) and only then
removes the whole "${{...}}" token before passing the result to
shell_expansion.finditer; update the code that sets no_gh so it uses this
scanner function to produce the stripped string instead of the current regex.
- Around line 57-59: The prompt_marker regex currently matches block scalars
like `|` or `>` but misses YAML chomping modifiers (e.g., `|-`, `|+`, `>-`,
`>+`) so those prompt bodies bypass the linter; update the prompt_marker pattern
to accept optional chomping/indentation modifiers after the block scalar
indicator (allowing + or - and optional indentation digits) so entries like
`prompt: |-`, `prompt: >+`, or `prompt: |+2` are recognized; modify the
prompt_marker (the re.compile definition) accordingly and ensure tests cover
examples with `direct_prompt:` and `prompt:` variants plus the chomping forms.

In @.github/scripts/feature-ideation/match-discussions.sh:
- Around line 100-103: Wrap the json.load calls for signals_path and
proposals_path in try/except blocks that catch json.JSONDecodeError (and
optionally ValueError), log a structured message like "[match-discussions]
invalid JSON <filename>: <error>" using the same logging/output mechanism, and
exit with status code 2; specifically update the locations that call
json.load(signals_path) and json.load(proposals_path) (referencing the
signals_path and proposals_path variables and the json.load function) so
malformed JSON no longer raises an uncaught traceback but returns the data-error
exit code 2 with the described message.

In `@test/workflows/feature-ideation/lint-prompt.bats`:
- Around line 119-122: Add a new Bats test that reproduces the exit-code
precedence regression: create a scenario where a missing file is encountered
before a lint-failing file and assert the final exit status is 2; use the
existing test harness variables (e.g., the LINTER invoked in the "lint-prompt:
scans every .github/workflows file by default" test) and name the test clearly
(e.g., "lint-prompt: missing file before failing file returns 2") so reviewers
can find it, ensure you set up one missing path and one file that will fail
linting, run bash "$LINTER", and assert [ "$status" -eq 2 ] to lock in
file-error precedence.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c1571328-f3bd-46f8-a0ce-580dc17afb6a

📥 Commits

Reviewing files that changed from the base of the PR and between 67cb057 and 67c6037.

📒 Files selected for processing (39)
  • .github/schemas/signals.schema.json
  • .github/scripts/feature-ideation/README.md
  • .github/scripts/feature-ideation/collect-signals.sh
  • .github/scripts/feature-ideation/discussion-mutations.sh
  • .github/scripts/feature-ideation/lib/compose-signals.sh
  • .github/scripts/feature-ideation/lib/date-utils.sh
  • .github/scripts/feature-ideation/lib/filter-bots.sh
  • .github/scripts/feature-ideation/lib/gh-safe.sh
  • .github/scripts/feature-ideation/lint-prompt.sh
  • .github/scripts/feature-ideation/match-discussions.sh
  • .github/scripts/feature-ideation/validate-signals.py
  • .github/workflows/feature-ideation-reusable.yml
  • .github/workflows/feature-ideation-tests.yml
  • standards/workflows/feature-ideation.yml
  • test/workflows/feature-ideation/collect-signals.bats
  • test/workflows/feature-ideation/compose-signals.bats
  • test/workflows/feature-ideation/date-utils.bats
  • test/workflows/feature-ideation/discussion-mutations.bats
  • test/workflows/feature-ideation/filter-bots.bats
  • test/workflows/feature-ideation/fixtures/expected/INVALID-bad-repo.signals.json
  • test/workflows/feature-ideation/fixtures/expected/INVALID-missing-field.signals.json
  • test/workflows/feature-ideation/fixtures/expected/empty-repo.signals.json
  • test/workflows/feature-ideation/fixtures/expected/populated.signals.json
  • test/workflows/feature-ideation/fixtures/expected/truncated.signals.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-categories.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions-truncated.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-errors-envelope.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-no-ideas-category.json
  • test/workflows/feature-ideation/fixtures/gh-responses/issue-list-closed.json
  • test/workflows/feature-ideation/fixtures/gh-responses/issue-list-open.json
  • test/workflows/feature-ideation/fixtures/gh-responses/pr-list-merged.json
  • test/workflows/feature-ideation/fixtures/gh-responses/release-list.json
  • test/workflows/feature-ideation/gh-safe.bats
  • test/workflows/feature-ideation/helpers/setup.bash
  • test/workflows/feature-ideation/lint-prompt.bats
  • test/workflows/feature-ideation/match-discussions.bats
  • test/workflows/feature-ideation/signals-schema.bats
  • test/workflows/feature-ideation/stubs/gh

Comment thread .github/scripts/feature-ideation/lib/compose-signals.sh
Comment thread .github/scripts/feature-ideation/lib/date-utils.sh
Comment thread .github/scripts/feature-ideation/lib/filter-bots.sh Outdated
Comment thread .github/scripts/feature-ideation/lib/gh-safe.sh
Comment thread .github/scripts/feature-ideation/README.md Outdated
Comment thread .github/scripts/feature-ideation/validate-signals.py
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (15)
.github/scripts/feature-ideation/lib/filter-bots.sh (1)

37-40: ⚠️ Potential issue | 🟠 Major

Prevent glob expansion when parsing FEATURE_IDEATION_BOT_AUTHORS.
Line 40 still uses unquoted expansion (local extras=($FEATURE_IDEATION_BOT_AUTHORS)), so wildcard characters can expand against filesystem entries and corrupt the blocklist.

Proposed fix
   if [ -n "${FEATURE_IDEATION_BOT_AUTHORS:-}" ]; then
-    local IFS=','
-    # shellcheck disable=SC2206
-    local extras=($FEATURE_IDEATION_BOT_AUTHORS)
+    local extras=()
+    IFS=',' read -r -a extras <<<"${FEATURE_IDEATION_BOT_AUTHORS}"
In Bash, does unquoted array assignment like `local extras=($FEATURE_IDEATION_BOT_AUTHORS)` perform pathname expansion (globbing), and is `IFS=',' read -r -a extras <<<"$FEATURE_IDEATION_BOT_AUTHORS"` the safer alternative?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/lib/filter-bots.sh around lines 37 - 40,
The unquoted array assignment for FEATURE_IDEATION_BOT_AUTHORS can trigger
pathname expansion; replace the unsafe expansion of local
extras=($FEATURE_IDEATION_BOT_AUTHORS) with a quoted, IFS-aware read to split on
commas and avoid globbing (e.g. use IFS=',' read -r -a extras <<<
"$FEATURE_IDEATION_BOT_AUTHORS"), keeping the IFS=',' and local extras
identifiers so the script still produces the same array without allowing
wildcard expansion.
.github/scripts/feature-ideation/README.md (1)

54-55: ⚠️ Potential issue | 🟡 Minor

Docs still reference outdated prompt scope/location.
Line 54 should mention both direct_prompt: and prompt: linting, and Line 79 should point prompt updates to .github/workflows/feature-ideation-reusable.yml (not feature-ideation.yml).

Also applies to: 79-80

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/README.md around lines 54 - 55, Update the
README text that describes the prompt linter so it references both
direct_prompt: and prompt: scopes (not just direct_prompt) and update the
workflow pointer so it directs maintainers to
.github/workflows/feature-ideation-reusable.yml (instead of
feature-ideation.yml); ensure the section that shows the lint command (bash
.github/scripts/feature-ideation/lint-prompt.sh) and any explanatory text
mention both prompt keys and the correct reusable workflow file name.
.github/scripts/feature-ideation/lib/date-utils.sh (1)

12-14: ⚠️ Potential issue | 🟠 Major

Guard arg count before reading $1 in date_days_ago().
Line 13 dereferences $1 before validation; with set -u, a zero-arg call aborts the shell instead of returning 64.

Proposed fix
 date_days_ago() {
+  if [ "$#" -ne 1 ]; then
+    printf '[date-utils] expected 1 arg (days), got: %d\n' "$#" >&2
+    return 64
+  fi
   local days="$1"
   if [ -z "$days" ] || ! printf '%s' "$days" | grep -Eq '^[0-9]+$'; then
With Bash `set -u`, what happens when a function accesses `$1` but is called with zero arguments, and is an explicit `"$#"` guard the recommended pattern?
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/lib/date-utils.sh around lines 12 - 14,
date_days_ago currently dereferences $1 before validating argument count which
breaks under set -u; update date_days_ago to check the argument count (use "$#"
or an explicit [ "$#" -lt 1 ] guard) before assigning or reading $1, and on
missing args return exit code 64, keeping the numeric-validation logic (grep -Eq
'^[0-9]+$') after the guard so $1 is only accessed when present.
.github/scripts/feature-ideation/validate-signals.py (2)

81-92: 🧹 Nitpick | 🔵 Trivial

Add return type annotation to _check_date_time.

The static analysis tool and a past review flagged the missing return type annotation. Adding -> bool improves readability and satisfies type checkers.

♻️ Suggested fix
     `@format_checker.checks`("date-time", raises=(ValueError, TypeError))
-    def _check_date_time(instance):  # noqa: ANN001 — jsonschema callback signature
+    def _check_date_time(instance) -> bool:  # noqa: ANN001 — jsonschema callback signature
         if not isinstance(instance, str):
             return True  # non-strings handled by `type` keyword, not format
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/validate-signals.py around lines 81 - 92,
The _check_date_time format-check callback is missing a return type annotation;
update the function definition of _check_date_time to include an explicit "->
bool" return type (preserve the existing decorator and the "# noqa: ANN001"
comment and behavior), e.g., modify the signature for the function named
_check_date_time to declare it returns bool so static type checkers are
satisfied while leaving the body unchanged.

55-69: ⚠️ Potential issue | 🟡 Minor

File read errors not caught.

A past review noted that read_text() calls can raise OSError (permissions, I/O errors) which would escape as uncaught exceptions instead of returning exit code 2 per the contract. The current code only catches json.JSONDecodeError.

🛡️ Proposed fix to handle file read errors
     try:
-        signals = json.loads(signals_path.read_text())
-    except json.JSONDecodeError as exc:
+        signals_raw = signals_path.read_text(encoding="utf-8")
+        signals = json.loads(signals_raw)
+    except OSError as exc:
+        sys.stderr.write(f"[validate-signals] cannot read {signals_path}: {exc}\n")
+        return 2
+    except json.JSONDecodeError as exc:
         # Per the docstring contract, exit 2 means usage / file error and
         # exit 1 means schema validation error. A malformed signals file
         # is a file/data error, not a schema violation. Caught by
         # CodeRabbit review on PR petry-projects/.github#85.
         sys.stderr.write(f"[validate-signals] invalid JSON in {signals_path}: {exc}\n")
         return 2

     try:
-        schema = json.loads(schema_path.read_text())
+        schema_raw = schema_path.read_text(encoding="utf-8")
+        schema = json.loads(schema_raw)
+    except OSError as exc:
+        sys.stderr.write(f"[validate-signals] cannot read schema {schema_path}: {exc}\n")
+        return 2
     except json.JSONDecodeError as exc:
         sys.stderr.write(f"[validate-signals] invalid schema JSON: {exc}\n")
         return 2
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/validate-signals.py around lines 55 - 69,
The JSON file reads for signals and schema currently only catch
json.JSONDecodeError and will let OSError (e.g., read_text IO/permission errors)
bubble up; wrap or modify the parsing blocks that use signals_path.read_text()
and schema_path.read_text() (the try/except blocks around signals =
json.loads(...) and schema = json.loads(...)) to also catch OSError (or a
combined except (OSError, json.JSONDecodeError) as exc), write an explanatory
message to sys.stderr including the failing path (signals_path or schema_path)
and the exception, and return exit code 2 so file/read errors follow the
documented contract.
test/workflows/feature-ideation/signals-schema.bats (1)

38-70: 🧹 Nitpick | 🔵 Trivial

Missing test for malformed JSON exit code 2.

A past review suggested adding a test that verifies syntactically malformed JSON (not just schema-invalid) returns exit code 2 per the documented contract in validate-signals.py. This regression would catch if the script reverts to a generic non-zero exit.

🧪 Proposed test addition
 `@test` "schema: well-formed date-time scan_date passes" {
   good_file="${BATS_TEST_TMPDIR}/good-scan-date.json"
   jq '.scan_date = "2026-04-08T12:34:56Z"' "${FIX}/empty-repo.signals.json" >"$good_file"
   run python3 "$VALIDATOR" "$good_file" "$SCHEMA"
   [ "$status" -eq 0 ]
 }

+@test "schema: malformed JSON FAILS with exit code 2" {
+  bad_file="${BATS_TEST_TMPDIR}/malformed.json"
+  printf '{"schema_version":"1.0.0",' >"$bad_file"
+  run python3 "$VALIDATOR" "$bad_file" "$SCHEMA"
+  [ "$status" -eq 2 ]
+}
+
 `@test` "schema: SCHEMA_VERSION constant matches schema file version comment" {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/workflows/feature-ideation/signals-schema.bats` around lines 38 - 70,
Add a BATS test to assert that syntactically malformed JSON yields exit code 2
by invoking the validator the same way other tests do: create a temp file
containing invalid JSON (e.g., truncated or missing a brace), run python3
"$VALIDATOR" "<bad_json_file>" "$SCHEMA", and assert `[ "$status" -eq 2 ]`;
reference the test file signals-schema.bats and the validate-signals.py
invocation via the VALIDATOR variable so the new test mirrors the existing tests
for scan_date and malformed inputs.
test/workflows/feature-ideation/lint-prompt.bats (1)

119-122: 🧹 Nitpick | 🔵 Trivial

Missing regression test for exit code 2 (file error path).

The previous review suggested adding a test that asserts exit code 2 when a missing file is encountered before any lint findings. This ensures the status-collapsing behavior in main() is properly locked down. The current test only checks that the default scan succeeds (status 0).

💡 Suggested test to add
 `@test` "lint-prompt: scans every .github/workflows file by default" {
   run bash "$LINTER"
   [ "$status" -eq 0 ]
 }
+
+@test "lint-prompt: file errors keep exit 2 even with later findings" {
+  write_yml "${TT_TMP}/bad.yml" <<'YML'
+jobs:
+  analyze:
+    steps:
+      - uses: anthropics/claude-code-action@v1
+        with:
+          prompt: |
+            Date: $(date -u +%Y-%m-%d)
+YML
+  run bash "$LINTER" "${TT_TMP}/missing.yml" "${TT_TMP}/bad.yml"
+  [ "$status" -eq 2 ]
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/workflows/feature-ideation/lint-prompt.bats` around lines 119 - 122, Add
a regression test that asserts the linter exits with code 2 when a missing file
is encountered before any lint findings: create a new `@test` in
test/workflows/feature-ideation/lint-prompt.bats that invokes the same "$LINTER"
but arranges for a non-existent file path (or sets the workflow list to include
a missing file) so the run returns status 2, then assert [ "$status" -eq 2 ];
this verifies the status-collapsing behavior implemented in main() for the
file-error path.
.github/scripts/feature-ideation/collect-signals.sh (2)

61-67: ⚠️ Potential issue | 🟡 Minor

REPO validation is still too permissive.

The current check allows values like /repo, org//repo, or org/repo/extra which don't match the schema's ^[^/]+/[^/]+$ pattern. These would fail downstream rather than at input validation.

🛡️ Proposed fix for strict validation
-  local owner repo_name
-  owner="${REPO%%/*}"
-  repo_name="${REPO##*/}"
-  if [ "$owner" = "$REPO" ] || [ -z "$repo_name" ]; then
+  if [[ ! $REPO =~ ^[^/]+/[^/]+$ ]]; then
     printf '[collect-signals] REPO must be in owner/name format, got: %s\n' "$REPO" >&2
     return 64
   fi
+  local owner repo_name
+  owner="${REPO%%/*}"
+  repo_name="${REPO##*/}"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/collect-signals.sh around lines 61 - 67,
The REPO validation currently using owner="${REPO%%/*}" and
repo_name="${REPO##*/}" allows malformed values like "/repo", "org//repo", or
"org/repo/extra"; update the validation in collect-signals.sh to enforce exactly
one slash and non-empty owner and repo segments (pattern ^[^/]+/[^/]+$). Replace
the existing conditional with a stricter test (e.g., use a shell pattern or
grep/egrep with the regex against "$REPO") and return 64 and the same error
message if the test fails, referencing the REPO variable and keeping the
existing printf error path.

56-59: ⚠️ Potential issue | 🟡 Minor

Numeric validation for limits is still missing.

The previous review noted that ISSUE_LIMIT, PR_LIMIT, and DISCUSSION_LIMIT are never validated as numeric. A value like ISSUE_LIMIT=foo will fail inconsistently downstream rather than returning the documented exit code 64.

🛡️ Proposed fix to add validation
   local issue_limit="${ISSUE_LIMIT:-50}"
   local pr_limit="${PR_LIMIT:-30}"
   local discussion_limit="${DISCUSSION_LIMIT:-100}"
   local output_path="${SIGNALS_OUTPUT:-./signals.json}"
+
+  local name value
+  for name in ISSUE_LIMIT PR_LIMIT DISCUSSION_LIMIT; do
+    case "$name" in
+      ISSUE_LIMIT) value="$issue_limit" ;;
+      PR_LIMIT) value="$pr_limit" ;;
+      DISCUSSION_LIMIT) value="$discussion_limit" ;;
+    esac
+    if [[ ! $value =~ ^[1-9][0-9]*$ ]]; then
+      printf '[collect-signals] %s must be a positive integer, got: %s\n' "$name" "$value" >&2
+      return 64
+    fi
+  done
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/collect-signals.sh around lines 56 - 59,
Validate that ISSUE_LIMIT, PR_LIMIT, and DISCUSSION_LIMIT are numeric before
assigning to issue_limit/pr_limit/discussion_limit: after reading the env vars
(ISSUE_LIMIT, PR_LIMIT, DISCUSSION_LIMIT) check each value with a numeric test
(e.g., regex or shell arithmetic check) and if any is non-numeric log a clear
error referencing the offending env var and exit with status 64; otherwise
assign the validated numeric value or fall back to the existing defaults
(ISSUE_LIMIT:-50, PR_LIMIT:-30, DISCUSSION_LIMIT:-100) so that issue_limit,
pr_limit, and discussion_limit are always integers downstream.
.github/scripts/feature-ideation/lib/compose-signals.sh (1)

46-56: ⚠️ Potential issue | 🟠 Major

Array-type validation is still missing.

The previous review confirmed that jq -e . accepts objects, strings, and other non-array types. Since the function's contract declares inputs as "JSON arrays" (line 11), non-arrays can silently produce wrong results: objects yield key counts, strings yield character counts.

🐛 Proposed fix to enforce array type
-    if ! printf '%s' "$input" | jq -e . >/dev/null 2>&1; then
-      printf '[compose-signals] arg #%d is not valid JSON: %s\n' "$idx" "${input:0:120}" >&2
+    if ! printf '%s' "$input" | jq -e 'type == "array"' >/dev/null 2>&1; then
+      printf '[compose-signals] arg #%d must be a JSON array: %s\n' "$idx" "${input:0:120}" >&2
       return 65  # EX_DATAERR
     fi
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/lib/compose-signals.sh around lines 46 -
56, The current JSON validation loop (variables like idx and inputs
"$open_issues" "$closed_issues" ... in compose-signals.sh) only checks that
inputs are valid JSON but not that they are arrays; update the loop so after
confirming valid JSON you also run a jq array-type check (e.g. use jq -e 'type
== "array"') and on failure print the same error with idx and a snippet of the
input and return 65. Ensure the check targets each input in the existing for
loop before composition.
.github/scripts/feature-ideation/match-discussions.sh (1)

100-103: ⚠️ Potential issue | 🟡 Minor

Handle malformed input JSON without a traceback.

These raw json.load() calls still bubble parsing failures to stderr. Catch bad input here and emit a structured [match-discussions] invalid JSON ... error instead so callers get a stable failure contract.

🔧 Proposed fix
-with open(signals_path) as f:
-    signals = json.load(f)
-with open(proposals_path) as f:
-    proposals = json.load(f)
+def load_json(path):
+    try:
+        with open(path) as f:
+            return json.load(f)
+    except OSError as exc:
+        sys.stderr.write(f"[match-discussions] cannot read {path}: {exc}\n")
+        sys.exit(2)
+    except json.JSONDecodeError as exc:
+        sys.stderr.write(f"[match-discussions] invalid JSON in {path}: {exc}\n")
+        sys.exit(2)
+
+signals = load_json(signals_path)
+proposals = load_json(proposals_path)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/match-discussions.sh around lines 100 -
103, Wrap the json.load() calls for signals_path and proposals_path in
try/except blocks that catch json.JSONDecodeError (and optionally
OSError/FileNotFoundError) and, on error, print a single structured message to
stderr like "[match-discussions] invalid JSON <path>: <error>" and exit with a
non‑zero status; update the locations around the json.load(signals_path) and
json.load(proposals_path) usages so malformed input yields that stable error
contract instead of a traceback.
.github/scripts/feature-ideation/lint-prompt.sh (3)

109-120: ⚠️ Potential issue | 🟠 Major

Do not downgrade file errors to lint findings.

scan_file can return 2, but the loop maps every nonzero result to 1. A read error or earlier not found condition loses the documented file-error contract if any later file only has lint findings.

🛠️ Proposed fix
-  local exit=0
+  local rc=0
+  local file_rc=0
   for file in "$@"; do
     if [ ! -f "$file" ]; then
       printf '[lint-prompt] not found: %s\n' "$file" >&2
-      exit=2
+      rc=2
       continue
     fi
-    if ! scan_file "$file"; then
-      exit=1
-    fi
+    if scan_file "$file"; then
+      file_rc=0
+    else
+      file_rc=$?
+    fi
+    case "$file_rc" in
+      0) ;;
+      1)
+        if [ "$rc" -eq 0 ]; then
+          rc=1
+        fi
+        ;;
+      2) rc=2 ;;
+      *) return "$file_rc" ;;
+    esac
   done
-  return "$exit"
+  return "$rc"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/lint-prompt.sh around lines 109 - 120, The
loop currently maps any nonzero return from scan_file to exit=1, which
downgrades file errors (e.g., code 2) to lint findings; change the logic to
capture the actual return code from scan_file (e.g., run scan_file "$file" and
store its return in a local rc or use $? immediately), then update the aggregate
exit variable to preserve the most severe code (for example: if rc is greater
than current exit, set exit=rc, or explicitly set exit=2 when rc==2 else set
exit=1 for lint failures). Reference the loop, the scan_file invocation, the
local exit variable, and use a temporary rc variable to avoid losing the
original return code.

81-83: ⚠️ Potential issue | 🟠 Major

Strip ${{ ... }} with a scanner, not [^}]*.

GitHub Actions expressions allow single-quoted string literals and format() calls, so valid ${{ ... }} content can contain } inside the expression. This re.sub(r'\$\{\{[^}]*\}\}', '', raw) stops at the first brace and leaves false-positive shell expansions behind. (docs.github.com)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/lint-prompt.sh around lines 81 - 83, The
current regex r'\$\{\{[^}]*\}\}' used to compute no_gh from raw fails on
expressions containing '}' (e.g., inside single-quoted strings or format()
calls); replace this brittle regex with a small scanner function (e.g.,
strip_github_expressions or similar) that walks raw, detects the opening
sequence '${{', tracks nested/braced content and quoted literals correctly, and
only removes the full balanced expression before assigning no_gh; update the
call site that currently does no_gh = re.sub(...) to use the scanner and keep
the existing shell_expansion.finditer loop intact.

57-59: ⚠️ Potential issue | 🟠 Major

Recognize YAML chomping modifiers in prompt headers.

prompt: |-, |+, >-, and >+ are legal block-scalar headers, but this regex misses them, so those prompt bodies skip linting completely. (yaml.org)

🛠️ Proposed fix
-prompt_marker = re.compile(r'(?:direct_prompt|prompt):\s*[|>]?\s*$')
+prompt_marker = re.compile(r'(?:direct_prompt|prompt):\s*[|>]?[-+]?\s*$')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/lint-prompt.sh around lines 57 - 59, The
current regex prompt_marker fails to match YAML block-scalar headers that
include chomping modifiers (e.g., `|-`, `|+`, `>-`, `>+`). Update the
prompt_marker regex to allow an optional chomping modifier after the `|` or `>`
indicator, for example change the pattern to use a non-capturing group like
(?:[|>][+-]?)? so the full regex becomes something like
r'(?:direct_prompt|prompt):\s*(?:[|>][+-]?)?\s*$' to ensure prompt headers with
`+` or `-` are recognized.
.github/scripts/feature-ideation/lib/gh-safe.sh (1)

94-98: ⚠️ Potential issue | 🟠 Major

Guard missing --jq / body args before dereferencing them.

With set -u, expanding an unset parameter is an error and a non-interactive shell exits. These dereferences therefore bypass the wrapper's structured 64 path and abort the shell instead. (gnu.org)

🛠️ Proposed fix
   while [ "$i" -lt "${`#args`[@]}" ]; do
     if [ "${args[$i]}" = "--jq" ]; then
+      if [ $((i + 1)) -ge "${`#args`[@]}" ]; then
+        _gh_safe_err "graphql-bad-args" "--jq requires a jq filter"
+        return 64
+      fi
       has_jq=1
       jq_filter="${args[$((i + 1))]}"
       break
     fi
@@
 gh_safe_graphql_input() {
+  if [ "$#" -ne 1 ]; then
+    _gh_safe_err "graphql-bad-input" "expected 1 arg: JSON request body"
+    return 64
+  fi
   local body="$1"

Use this read-only check to confirm the nounset path:

#!/bin/bash
set -euo pipefail

echo "Relevant code:"
sed -n '89,98p;189,193p' .github/scripts/feature-ideation/lib/gh-safe.sh

echo
echo "Missing --jq filter:"
if bash -uc 'args=(--jq); i=0; jq_filter="${args[$((i + 1))]}"' >/tmp/out 2>/tmp/err; then
  echo "unexpected success"
  exit 1
else
  cat /tmp/err
fi

echo
echo "Missing body arg:"
if bash -uc 'f(){ local body="$1"; :; }; f' >/tmp/out 2>/tmp/err; then
  echo "unexpected success"
  exit 1
else
  cat /tmp/err
fi

Also applies to: 189-193

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/lib/gh-safe.sh around lines 94 - 98, The
loop dereferences a possibly-missing next argument when handling "--jq" (vars:
args, i, jq_filter) and elsewhere dereferences positional body without checking
(var: body / "$1"), which fails under set -u; fix by guarding before
dereference: ensure the next index exists (compare i+1 < ${`#args`[@]} or test $#
when in a function) and only then assign jq_filter, and for the body use a safe
default or check that a positional argument exists (e.g., test $# -ge 1) before
using body="$1" or use parameter-expansion defaults like "${args[$((i+1))]-}" /
"${1-}" to avoid nounset errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/scripts/feature-ideation/match-discussions.sh:
- Around line 116-142: The current greedy loop in match-discussions.sh uses
proposal order and a shared seen_disc_ids, causing suboptimal consumes; instead
compute all similarities between proposals and disc_norm using normalize and
jaccard, collect tuples (proposal_index, disc_id, similarity, disc_obj), sort
them descending by similarity, then iterate selecting matches only if similarity
>= threshold and neither the proposal_index nor disc_id have been assigned;
update matched and seen_disc_ids when selecting, and after selection fill
new_candidates for any unassigned proposals with their highest similarity
(rounding like before). Use the existing symbols proposals, disc_norm,
normalize, jaccard, matched, new_candidates, seen_disc_ids, and threshold to
locate where to replace the greedy per-proposal loop.

In `@test/workflows/feature-ideation/date-utils.bats`:
- Around line 17-22: The test uses a direct system call to get today's date
instead of the helper function; change the expectation to use the date_today
helper so the test validates behavior consistently: replace the `today=$(date -u
+%Y-%m-%d)` assignment with `today=$(date_today)` and keep the rest of the
`date_days_ago` test unchanged so it compares `date_days_ago 0` output against
`date_today`.

In `@test/workflows/feature-ideation/stubs/gh`:
- Around line 34-47: The current GH stub uses a global fallback
/tmp/.gh-stub-counter which can collide across parallel tests; update the
counter_file resolution (the counter_file variable created when GH_STUB_SCRIPT
is set) to prefer a per-test temporary dir (use TT_TMP if set, else
BATS_TEST_TMPDIR if available) instead of unscoped /tmp, and add a short note in
the surrounding comments or README that TT_TMP (or BATS_TEST_TMPDIR) must be
provided for parallel test runs; ensure references to GH_STUB_SCRIPT,
counter_file, and the logic that reads/writes the counter are updated
accordingly so parallel suites use isolated counters.

---

Duplicate comments:
In @.github/scripts/feature-ideation/collect-signals.sh:
- Around line 61-67: The REPO validation currently using owner="${REPO%%/*}" and
repo_name="${REPO##*/}" allows malformed values like "/repo", "org//repo", or
"org/repo/extra"; update the validation in collect-signals.sh to enforce exactly
one slash and non-empty owner and repo segments (pattern ^[^/]+/[^/]+$). Replace
the existing conditional with a stricter test (e.g., use a shell pattern or
grep/egrep with the regex against "$REPO") and return 64 and the same error
message if the test fails, referencing the REPO variable and keeping the
existing printf error path.
- Around line 56-59: Validate that ISSUE_LIMIT, PR_LIMIT, and DISCUSSION_LIMIT
are numeric before assigning to issue_limit/pr_limit/discussion_limit: after
reading the env vars (ISSUE_LIMIT, PR_LIMIT, DISCUSSION_LIMIT) check each value
with a numeric test (e.g., regex or shell arithmetic check) and if any is
non-numeric log a clear error referencing the offending env var and exit with
status 64; otherwise assign the validated numeric value or fall back to the
existing defaults (ISSUE_LIMIT:-50, PR_LIMIT:-30, DISCUSSION_LIMIT:-100) so that
issue_limit, pr_limit, and discussion_limit are always integers downstream.

In @.github/scripts/feature-ideation/lib/compose-signals.sh:
- Around line 46-56: The current JSON validation loop (variables like idx and
inputs "$open_issues" "$closed_issues" ... in compose-signals.sh) only checks
that inputs are valid JSON but not that they are arrays; update the loop so
after confirming valid JSON you also run a jq array-type check (e.g. use jq -e
'type == "array"') and on failure print the same error with idx and a snippet of
the input and return 65. Ensure the check targets each input in the existing for
loop before composition.

In @.github/scripts/feature-ideation/lib/date-utils.sh:
- Around line 12-14: date_days_ago currently dereferences $1 before validating
argument count which breaks under set -u; update date_days_ago to check the
argument count (use "$#" or an explicit [ "$#" -lt 1 ] guard) before assigning
or reading $1, and on missing args return exit code 64, keeping the
numeric-validation logic (grep -Eq '^[0-9]+$') after the guard so $1 is only
accessed when present.

In @.github/scripts/feature-ideation/lib/filter-bots.sh:
- Around line 37-40: The unquoted array assignment for
FEATURE_IDEATION_BOT_AUTHORS can trigger pathname expansion; replace the unsafe
expansion of local extras=($FEATURE_IDEATION_BOT_AUTHORS) with a quoted,
IFS-aware read to split on commas and avoid globbing (e.g. use IFS=',' read -r
-a extras <<< "$FEATURE_IDEATION_BOT_AUTHORS"), keeping the IFS=',' and local
extras identifiers so the script still produces the same array without allowing
wildcard expansion.

In @.github/scripts/feature-ideation/lib/gh-safe.sh:
- Around line 94-98: The loop dereferences a possibly-missing next argument when
handling "--jq" (vars: args, i, jq_filter) and elsewhere dereferences positional
body without checking (var: body / "$1"), which fails under set -u; fix by
guarding before dereference: ensure the next index exists (compare i+1 <
${`#args`[@]} or test $# when in a function) and only then assign jq_filter, and
for the body use a safe default or check that a positional argument exists
(e.g., test $# -ge 1) before using body="$1" or use parameter-expansion defaults
like "${args[$((i+1))]-}" / "${1-}" to avoid nounset errors.

In @.github/scripts/feature-ideation/lint-prompt.sh:
- Around line 109-120: The loop currently maps any nonzero return from scan_file
to exit=1, which downgrades file errors (e.g., code 2) to lint findings; change
the logic to capture the actual return code from scan_file (e.g., run scan_file
"$file" and store its return in a local rc or use $? immediately), then update
the aggregate exit variable to preserve the most severe code (for example: if rc
is greater than current exit, set exit=rc, or explicitly set exit=2 when rc==2
else set exit=1 for lint failures). Reference the loop, the scan_file
invocation, the local exit variable, and use a temporary rc variable to avoid
losing the original return code.
- Around line 81-83: The current regex r'\$\{\{[^}]*\}\}' used to compute no_gh
from raw fails on expressions containing '}' (e.g., inside single-quoted strings
or format() calls); replace this brittle regex with a small scanner function
(e.g., strip_github_expressions or similar) that walks raw, detects the opening
sequence '${{', tracks nested/braced content and quoted literals correctly, and
only removes the full balanced expression before assigning no_gh; update the
call site that currently does no_gh = re.sub(...) to use the scanner and keep
the existing shell_expansion.finditer loop intact.
- Around line 57-59: The current regex prompt_marker fails to match YAML
block-scalar headers that include chomping modifiers (e.g., `|-`, `|+`, `>-`,
`>+`). Update the prompt_marker regex to allow an optional chomping modifier
after the `|` or `>` indicator, for example change the pattern to use a
non-capturing group like (?:[|>][+-]?)? so the full regex becomes something like
r'(?:direct_prompt|prompt):\s*(?:[|>][+-]?)?\s*$' to ensure prompt headers with
`+` or `-` are recognized.

In @.github/scripts/feature-ideation/match-discussions.sh:
- Around line 100-103: Wrap the json.load() calls for signals_path and
proposals_path in try/except blocks that catch json.JSONDecodeError (and
optionally OSError/FileNotFoundError) and, on error, print a single structured
message to stderr like "[match-discussions] invalid JSON <path>: <error>" and
exit with a non‑zero status; update the locations around the
json.load(signals_path) and json.load(proposals_path) usages so malformed input
yields that stable error contract instead of a traceback.

In @.github/scripts/feature-ideation/README.md:
- Around line 54-55: Update the README text that describes the prompt linter so
it references both direct_prompt: and prompt: scopes (not just direct_prompt)
and update the workflow pointer so it directs maintainers to
.github/workflows/feature-ideation-reusable.yml (instead of
feature-ideation.yml); ensure the section that shows the lint command (bash
.github/scripts/feature-ideation/lint-prompt.sh) and any explanatory text
mention both prompt keys and the correct reusable workflow file name.

In @.github/scripts/feature-ideation/validate-signals.py:
- Around line 81-92: The _check_date_time format-check callback is missing a
return type annotation; update the function definition of _check_date_time to
include an explicit "-> bool" return type (preserve the existing decorator and
the "# noqa: ANN001" comment and behavior), e.g., modify the signature for the
function named _check_date_time to declare it returns bool so static type
checkers are satisfied while leaving the body unchanged.
- Around line 55-69: The JSON file reads for signals and schema currently only
catch json.JSONDecodeError and will let OSError (e.g., read_text IO/permission
errors) bubble up; wrap or modify the parsing blocks that use
signals_path.read_text() and schema_path.read_text() (the try/except blocks
around signals = json.loads(...) and schema = json.loads(...)) to also catch
OSError (or a combined except (OSError, json.JSONDecodeError) as exc), write an
explanatory message to sys.stderr including the failing path (signals_path or
schema_path) and the exception, and return exit code 2 so file/read errors
follow the documented contract.

In `@test/workflows/feature-ideation/lint-prompt.bats`:
- Around line 119-122: Add a regression test that asserts the linter exits with
code 2 when a missing file is encountered before any lint findings: create a new
`@test` in test/workflows/feature-ideation/lint-prompt.bats that invokes the same
"$LINTER" but arranges for a non-existent file path (or sets the workflow list
to include a missing file) so the run returns status 2, then assert [ "$status"
-eq 2 ]; this verifies the status-collapsing behavior implemented in main() for
the file-error path.

In `@test/workflows/feature-ideation/signals-schema.bats`:
- Around line 38-70: Add a BATS test to assert that syntactically malformed JSON
yields exit code 2 by invoking the validator the same way other tests do: create
a temp file containing invalid JSON (e.g., truncated or missing a brace), run
python3 "$VALIDATOR" "<bad_json_file>" "$SCHEMA", and assert `[ "$status" -eq 2
]`; reference the test file signals-schema.bats and the validate-signals.py
invocation via the VALIDATOR variable so the new test mirrors the existing tests
for scan_date and malformed inputs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d1bec354-118e-4cd2-8940-99c2690c6cf3

📥 Commits

Reviewing files that changed from the base of the PR and between 67cb057 and 67c6037.

📒 Files selected for processing (39)
  • .github/schemas/signals.schema.json
  • .github/scripts/feature-ideation/README.md
  • .github/scripts/feature-ideation/collect-signals.sh
  • .github/scripts/feature-ideation/discussion-mutations.sh
  • .github/scripts/feature-ideation/lib/compose-signals.sh
  • .github/scripts/feature-ideation/lib/date-utils.sh
  • .github/scripts/feature-ideation/lib/filter-bots.sh
  • .github/scripts/feature-ideation/lib/gh-safe.sh
  • .github/scripts/feature-ideation/lint-prompt.sh
  • .github/scripts/feature-ideation/match-discussions.sh
  • .github/scripts/feature-ideation/validate-signals.py
  • .github/workflows/feature-ideation-reusable.yml
  • .github/workflows/feature-ideation-tests.yml
  • standards/workflows/feature-ideation.yml
  • test/workflows/feature-ideation/collect-signals.bats
  • test/workflows/feature-ideation/compose-signals.bats
  • test/workflows/feature-ideation/date-utils.bats
  • test/workflows/feature-ideation/discussion-mutations.bats
  • test/workflows/feature-ideation/filter-bots.bats
  • test/workflows/feature-ideation/fixtures/expected/INVALID-bad-repo.signals.json
  • test/workflows/feature-ideation/fixtures/expected/INVALID-missing-field.signals.json
  • test/workflows/feature-ideation/fixtures/expected/empty-repo.signals.json
  • test/workflows/feature-ideation/fixtures/expected/populated.signals.json
  • test/workflows/feature-ideation/fixtures/expected/truncated.signals.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-categories.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions-truncated.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-errors-envelope.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-no-ideas-category.json
  • test/workflows/feature-ideation/fixtures/gh-responses/issue-list-closed.json
  • test/workflows/feature-ideation/fixtures/gh-responses/issue-list-open.json
  • test/workflows/feature-ideation/fixtures/gh-responses/pr-list-merged.json
  • test/workflows/feature-ideation/fixtures/gh-responses/release-list.json
  • test/workflows/feature-ideation/gh-safe.bats
  • test/workflows/feature-ideation/helpers/setup.bash
  • test/workflows/feature-ideation/lint-prompt.bats
  • test/workflows/feature-ideation/match-discussions.bats
  • test/workflows/feature-ideation/signals-schema.bats
  • test/workflows/feature-ideation/stubs/gh

Comment thread .github/scripts/feature-ideation/match-discussions.sh Outdated
Comment thread test/workflows/feature-ideation/date-utils.bats
Comment thread test/workflows/feature-ideation/filter-bots.bats
Comment thread test/workflows/feature-ideation/stubs/gh
@don-petry
Copy link
Copy Markdown
Contributor Author

@claude - please review and fix remaining feedback

@claude
Copy link
Copy Markdown
Contributor

claude Bot commented Apr 8, 2026

Claude finished @don-petry's task in 9m 35s —— View job


Fixing remaining CodeRabbit feedback — done

Pushed commit 5c52288 addressing all 15 remaining open issues from the CodeRabbit re-reviews. Test count: 109 → 114.

Todo list:

  • Read relevant source files to understand current state
  • collect-signals.sh — add numeric limit validation + strict REPO regex
  • lib/compose-signals.sh — enforce array-type check (jq -e 'type == "array"')
  • lib/date-utils.sh — guard arg count before reading $1
  • lib/filter-bots.sh — prevent glob expansion with IFS=',' read -r -a
  • lib/gh-safe.sh — guard --jq arg bounds + gh_safe_graphql_input arg count
  • lint-prompt.sh — YAML chomping modifiers, nested-brace stripper, exit-code preservation
  • match-discussions.sh — JSON error handling, discussion id guard, optimal matching
  • validate-signals.py — catch OSError + add -> bool return annotation
  • README.md — update prompt linter description
  • test/lint-prompt.bats — add 3 tests (exit-2 precedence, chomping modifiers, nested GH expressions)
  • test/match-discussions.bats — add 2 malformed-JSON tests
  • test/signals-schema.bats — add malformed-JSON exit-2 test
  • test/date-utils.bats — use date_today helper instead of raw date -u
  • test/stubs/gh — prefer TT_TMP/BATS_TEST_TMPDIR for counter isolation
  • Commit and push

Summary of changes

File Fix
collect-signals.sh Validate ISSUE_LIMIT/PR_LIMIT/DISCUSSION_LIMIT are positive integers before forwarding to GraphQL; tighten REPO with strict ^[^/]+/[^/]+$ regex
lib/compose-signals.sh jq -e 'type == "array"' instead of jq -e . — rejects objects/strings that would silently produce wrong counts
lib/date-utils.sh [ "$#" -ne 1 ] guard before touching $1 — prevents set -u abort on zero-arg calls
lib/filter-bots.sh IFS=',' read -r -a extras instead of unquoted expansion — prevents pathname-globbing
lib/gh-safe.sh Bounds-check args[i+1] before --jq dereference; $# guard in gh_safe_graphql_input()
lint-prompt.sh Add [-+]? to prompt_marker for YAML chomping modifiers; replace [^}]* GH-expression stripper with a stateful scanner; preserve exit-2 priority over exit-1 in main() loop
match-discussions.sh _load_json() helper with OSError/JSONDecodeError handling exits 2 instead of traceback; skip id-less discussions; switch to similarity-sorted global optimal matching (not greedy per-proposal)
validate-signals.py Catch OSError on read_text() for both signals and schema files; -> bool on _check_date_time
README.md Mention both direct_prompt: and prompt: in lint command; point to feature-ideation-reusable.yml
5 test files Exit-2 precedence, chomping modifiers, nested GH expression false-positive, malformed JSON (×3), date_today consistency, counter-file isolation
· Branch

…es + 5 new tests)

Critical/major:
- collect-signals.sh: validate ISSUE_LIMIT/PR_LIMIT/DISCUSSION_LIMIT are
  positive integers; tighten REPO validation with strict ^[^/]+/[^/]+$ regex
- compose-signals.sh: enforce array type (jq 'type == "array"') not just
  valid JSON so objects/strings don't silently produce wrong counts
- date-utils.sh: guard $# before reading $1 to prevent set -u abort on
  zero-arg calls
- filter-bots.sh: replace unquoted array expansion with IFS=',' read -r -a
  to prevent pathname-globbing against filesystem entries
- gh-safe.sh: bounds-check args[i+1] before --jq dereference; add $# guard
  to gh_safe_graphql_input() to prevent nounset abort
- lint-prompt.sh: recognise YAML chomping modifiers (|-,|+,>-,>+) in
  prompt_marker regex; replace [^}]* GH-expression stripper with a stateful
  scanner that handles nested braces; preserve exit-2 over exit-1 in main()
- match-discussions.sh: wrap json.load calls in try/except for structured
  error exit-2 instead of Python traceback; skip discussions without an id;
  switch from greedy per-proposal to similarity-sorted global optimal matching
- validate-signals.py: catch OSError on read_text() to preserve exit-2
  contract; add -> bool return type annotation to _check_date_time

Docs:
- README.md: update lint command to mention both direct_prompt: and prompt:;
  fix Mary's prompt pointer to feature-ideation-reusable.yml

Tests (+5 new, 109 → 114 total):
- lint-prompt.bats: missing-file-before-lint-failing-file exits 2; YAML
  chomping modifiers detected; nested GH expressions don't false-positive
- match-discussions.bats: malformed signals JSON exits non-zero; malformed
  proposals JSON exits non-zero
- signals-schema.bats: truncated/malformed JSON exits 2 not 1
- date-utils.bats: use date_today helper instead of raw date -u
- stubs/gh: prefer TT_TMP/BATS_TEST_TMPDIR for counter file isolation

Co-authored-by: don-petry <don-petry@users.noreply.github.com>
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented Apr 8, 2026

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/scripts/feature-ideation/lib/gh-safe.sh:
- Around line 141-156: The two-step GraphQL error-envelope check can be
simplified into one jq expression: replace the dual jq tests that inspect "$raw"
for errors with a single jq predicate that directly checks whether .errors is an
array with length > 0 (operating on the raw JSON stored in the variable raw)
and, if true, set errs via jq -c '.errors' and call _gh_safe_err
"graphql-errors" "$errs" and return 66 as before; update the logic in the
gh-safe.sh block that currently references raw and _gh_safe_err to use the
consolidated jq expression (e.g., testing .errors as an array with length > 0)
to improve readability while preserving behavior.

In `@test/workflows/feature-ideation/stubs/gh`:
- Around line 37-38: The stub should fail fast when GH_STUB_SCRIPT is set but
points to a missing file: add an explicit check for GH_STUB_SCRIPT being
non-empty and the referenced file not existing, write an error to stderr
(including the value of GH_STUB_SCRIPT) and exit with non-zero status; keep the
existing logic that only enters "scripted mode" when the file exists (the
current if [ -n "${GH_STUB_SCRIPT:-}" ] && [ -f "${GH_STUB_SCRIPT}" ]; then
block and the counter_file assignment should remain), but insert the new failure
branch before it to detect the "set but missing" case and prevent silent
fall-through to single-call mode.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1e2cac8e-9479-40bc-a2b7-eb3f00879b57

📥 Commits

Reviewing files that changed from the base of the PR and between 67c6037 and 5c52288.

📒 Files selected for processing (14)
  • .github/scripts/feature-ideation/README.md
  • .github/scripts/feature-ideation/collect-signals.sh
  • .github/scripts/feature-ideation/lib/compose-signals.sh
  • .github/scripts/feature-ideation/lib/date-utils.sh
  • .github/scripts/feature-ideation/lib/filter-bots.sh
  • .github/scripts/feature-ideation/lib/gh-safe.sh
  • .github/scripts/feature-ideation/lint-prompt.sh
  • .github/scripts/feature-ideation/match-discussions.sh
  • .github/scripts/feature-ideation/validate-signals.py
  • test/workflows/feature-ideation/date-utils.bats
  • test/workflows/feature-ideation/lint-prompt.bats
  • test/workflows/feature-ideation/match-discussions.bats
  • test/workflows/feature-ideation/signals-schema.bats
  • test/workflows/feature-ideation/stubs/gh

Comment thread .github/scripts/feature-ideation/lib/gh-safe.sh
Comment thread test/workflows/feature-ideation/stubs/gh
don-petry pushed a commit that referenced this pull request Apr 9, 2026
…6 tests)

Triaged 14 inline comments from Copilot's review of #85; two were already
fixed by the tooling_ref→v1 commit, the remaining 11 are addressed here.

Critical bug fixes
------------------

1. lint-prompt.sh now scans claude-code-action v1 `prompt:` blocks in
   addition to v0 `direct_prompt:`. The reusable workflow uses `prompt:`
   so the linter was silently allowing R2 regressions on the very file
   it was supposed to protect. Added two regression tests covering both
   the v1 form and a clean v1 form passes.

2. add_label_to_discussion now sends labelIds as a proper JSON array via
   gh_safe_graphql_input (new helper). Previously used `gh -f labelIds=`
   which sent the literal string `["L_1"]` and the GraphQL API would have
   rejected the mutation at runtime. Added a test that captures gh's
   stdin and asserts the variables block contains a length-1 array.

3. validate-signals.py now registers a `date-time` format checker via
   FormatChecker so the `format: date-time` keyword in signals.schema.json
   is actually enforced. Draft202012Validator does NOT enforce formats
   by default, and the default FormatChecker omits date-time entirely.
   Used an inline checker (datetime.fromisoformat with Z normalisation)
   to avoid pulling in rfc3339-validator. Added two regression tests:
   one for an invalid timestamp failing, one for a clean timestamp
   passing.

4. gh_safe_graphql --jq path no longer swallows jq filter errors with
   `|| true`. Filter typos / wrong paths now exit non-zero instead of
   silently returning []. Added a regression test using a deliberately
   broken filter.

5. collect-signals.sh now computes the open-issue truncation warning
   BEFORE filter_bots_apply. Previously, a result set composed entirely
   of bots could drop below ISSUE_LIMIT after filtering and mask real
   truncation. Added an integration test with all-bot fixtures.

6. match-discussions.sh now validates MATCH_THRESHOLD as a non-negative
   number in [0, 1] before passing to Python. A typo previously surfaced
   as an opaque traceback. Added regression tests for non-numeric input,
   out-of-range input, and boundary values 0 and 1.

Cleanup
-------

7. Removed dead bash `normalize_title` / `jaccard_similarity` functions
   from match-discussions.sh — the actual matching is implemented in the
   embedded Python block and the bash helpers were never called.

8. Schema $id corrected from petry-projects/TalkTerm/... to the canonical
   petry-projects/.github location.

9. signals-schema.bats "validator script exists and is executable" test
   now actually checks the `-x` bit (was only checking `-f` and `-r`).

10. README + filter-bots.sh comments now describe the bot list as a
    "blocklist" (it removes matching authors) instead of "allowlist".

11. test/workflows/feature-ideation/stubs/gh now logs argv with `printf
    '%q '` so each invocation is shell-quoted and re-parseable, matching
    its documentation. Previously logged `$*` which lost arg boundaries.

New helper
----------

gh_safe_graphql_input — same defensive contract as gh_safe_graphql, but
takes a fully-formed JSON request body via stdin instead of -f/-F flags.
Use for mutations whose variables include arrays (e.g. labelIds: [ID!]!)
that gh's flag-based interface cannot express. Five new tests cover
its happy path and every documented failure mode.

Tests
-----

Test count: 92 → 108 (16 new regression tests, all green). Run with:
  bats test/workflows/feature-ideation/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
don-petry pushed a commit that referenced this pull request Apr 9, 2026
… 1 test)

Triaged 13 inline comments from CodeRabbit's review of #85; 6 of them
overlapped with Copilot's review and were already fixed by bcaa579. The
remaining 7 are addressed here.

Fixes
-----

1. lint-prompt.sh: ${VAR} branch lookbehind was inconsistent with the
   $(...) branch — only rejected $$VAR but not \${VAR}. Both branches
   now use [\\$] so backslash-escaped and dollar-escaped forms are
   skipped uniformly.

2. filter-bots.sh: FEATURE_IDEATION_BOT_AUTHORS CSV entries are now
   trimmed of leading/trailing whitespace before being added to the
   blocklist, so "bot1, bot2" matches both bots correctly instead of
   keeping a literal " bot2" entry.

3. validate-signals.py: malformed signals JSON now exits 2 (file/data
   error) to match the documented contract, instead of 1 (which means
   schema validation error).

4. README.md: corrected the workflow filename reference from
   feature-ideation.yml to feature-ideation-reusable.yml, and reworded
   the table cell that contained `\|\|` (escaped pipes that don't
   render correctly in some Markdown engines) to use plain prose. Also
   noted that lint-prompt scans both v0 `direct_prompt:` and v1 `prompt:`.

5. collect-signals.sh: added an explicit comment above SCHEMA_VERSION
   documenting the lockstep requirement with signals.schema.json's
   $comment version annotation. Backed by a new bats test that parses
   both files and asserts they match.

6. signals.schema.json: added $comment "version: 1.0.0" annotation so
   the schema file declares its own version explicitly. Used $comment
   instead of a custom keyword to keep Draft202012 compliance.

7. test/workflows/feature-ideation/match-discussions.bats: build_signals
   helper now computes the discussions count from the array length
   instead of hardcoding 0, so the fixture satisfies its own contract
   (cosmetic — the matcher only reads .items, but contract hygiene
   matters in test scaffolding).

8. test/workflows/feature-ideation/gh-safe.bats: removed the `|| true`
   suffix on the rest-failure assertion that made it always pass.
   Now uses --separate-stderr to capture stderr and asserts the
   structured `[gh-safe][rest-failure]` prefix is emitted on the auth
   failure path. Required `bats_require_minimum_version 1.5.0` to
   suppress the bats-core warning about flag usage.

Tests
-----

Test count: 108 → 109 (one new test for SCHEMA_VERSION ↔ schema sync).
All 109 passing locally. Run with:
  bats test/workflows/feature-ideation/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
don-petry added a commit that referenced this pull request Apr 9, 2026
…121)

* test(feature-ideation): extract bash to scripts, add schema + 92 bats tests

Refactors the reusable feature-ideation workflow's parsing surface from
an inline 600-line YAML heredoc into testable scripts with deterministic
contracts. Every defect that previously required post-merge review can
now fail in CI before adopters notice.

Why
---
The prior reusable workflow used `2>/dev/null || echo '[]'` for every
gh / GraphQL call, which silently downgraded auth failures, rate limits,
network outages, and GraphQL schema drift to empty arrays. The pipeline
would "succeed" while producing useless signals — and Mary's Discussion
posts would silently degrade across every BMAD repo on the org. The
prompt also instructed Mary to "use fuzzy matching" against existing
Ideas Discussions in her head, which is non-deterministic and untestable.

Risk register (probability × impact, scale 1–9):
  R1=9  swallow-all-errors gh wrapper
  R2=6  literal $() inside YAML direct prompt
  R3=6  no signals.json schema
  R4=6  jq --argjson crash on empty input
  R5=6  fuzzy match in Mary's prompt → duplicate Discussions
  R6=6  retry idempotency hole
  R7=6  GraphQL errors[]/null data not detected
  R8=4  GraphQL partial errors silently accepted
  R10=3 bot filter only catches dependabot/github-actions
  R11=4 pagination silently truncates

What's new
----------
.github/scripts/feature-ideation/
  collect-signals.sh         Orchestrator (replaces inline heredoc)
  validate-signals.py        JSON Schema 2020-12 validator
  match-discussions.sh       Deterministic Jaccard matcher (kills R5/R6)
  discussion-mutations.sh    create/comment/label wrappers + DRY_RUN mode
  lint-prompt.sh             Catches unescaped $() / ${VAR} in prompt blocks
  lib/gh-safe.sh             Defensive gh wrapper, fails loud on every
                             documented failure mode (kills R1, R7, R8)
  lib/compose-signals.sh     Validates JSON inputs before jq composition
  lib/filter-bots.sh         Extensible bot author filter (kills R10)
  lib/date-utils.sh          Cross-platform date helpers
  README.md                  Maintainer docs

.github/schemas/signals.schema.json
  Pinned producer/consumer contract for signals.json (Draft 2020-12).
  CI rejects any drift; the runtime signals.json is also validated by
  the workflow before being handed to Mary.

.github/workflows/feature-ideation-reusable.yml
  Rewritten. Adds a self-checkout of petry-projects/.github so the
  scripts above are available in the runner. Replaces inline bash with
  collect-signals.sh + validate-signals.py. Adds RUN_DATE / SIGNALS_PATH /
  PROPOSALS_PATH / MATCH_PLAN_PATH / TOOLING_DIR env vars passed to
  claude-code-action via env: instead of unescaped shell expansions in
  the prompt body. Adds dry_run input that flows through to
  discussion-mutations.sh, which logs every planned action to a JSONL
  audit log instead of executing — uploaded as the dry-run-log artifact.

.github/workflows/feature-ideation-tests.yml
  New CI gate, path-filtered. Runs shellcheck, lint-prompt, schema
  fixture validation, and the full bats suite on every PR that touches
  the feature-ideation surface.

standards/workflows/feature-ideation.yml
  Updated caller stub template. Adds dry_run workflow_dispatch input
  so adopters get safe smoke-testing for free. Existing TalkTerm caller
  stub continues to work unchanged (dry_run defaults to false).

test/workflows/feature-ideation/
  92 bats tests across 9 suites. 14 GraphQL/REST response fixtures.
  5 expected signals.json fixtures (3 valid + 2 INVALID for negative
  schema testing). Programmable gh PATH stub with single-call and
  multi-call modes for integration testing.

  | Suite                       | Tests | Risks closed       |
  |-----------------------------|------:|--------------------|
  | gh-safe.bats                |    19 | R1, R7, R8         |
  | compose-signals.bats        |     8 | R3, R4             |
  | filter-bots.bats            |     5 | R10                |
  | date-utils.bats             |     7 | R9                 |
  | collect-signals.bats        |    14 | R1, R3, R4, R7, R11|
  | match-discussions.bats      |    13 | R5, R6             |
  | discussion-mutations.bats   |    10 | DRY_RUN contract   |
  | lint-prompt.bats            |     8 | R2                 |
  | signals-schema.bats         |     8 | R3                 |
  | TOTAL                       |    92 |                    |

Test results: 92 passing, 0 failing, 0 skipped. Run with:
  bats test/workflows/feature-ideation/

Backwards compatibility
-----------------------
The reusable workflow's input surface is unchanged for existing callers
(TalkTerm continues to work with no edits). The new dry_run input is
optional and defaults to false. Adopters who copy the new standards
caller stub get dry_run support automatically.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test(feature-ideation): use bash -c instead of sh -c in env-extension test

CI failure on the previous commit: 91/92 passing, 1 failing.

The filter-bots env-extension test used `sh -c` to source filter-bots.sh
in a sub-shell with FEATURE_IDEATION_BOT_AUTHORS set. On macOS this works
because /bin/sh is bash. On Ubuntu (CI), /bin/sh is dash, which does not
support `set -o pipefail`, so sourcing filter-bots.sh produced:

  sh: 12: set: Illegal option -o pipefail

Fixed by switching to `bash -c`. All scripts already use
`#!/usr/bin/env bash` shebangs; this is the only place a sub-shell was
spawned via `sh`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(feature-ideation): default tooling_ref to v1 to match @v1 caller pin

Aligns the script-tooling self-checkout with the @v1 pinning convention
introduced in #88. Now when a downstream caller stub pins to
`@v1` of the workflow file, the reusable workflow defaults to checking
out the matching `v1` tag for the scripts. Workflow file and scripts
upgrade in lockstep.

Override `tooling_ref` only for testing forks (`tooling_ref: my-branch`)
or bleeding-edge testing (`tooling_ref: main`). Documented in the input
description.

Note for the v1 tag move: after this PR merges, the v1 tag must be
moved forward to point to the new HEAD so that downstream BMAD repos
pinned to @v1 actually pick up the hardening. The change is purely
additive (new optional inputs `dry_run` and `tooling_ref`, new env vars
in the prompt context), so the move is backwards-compatible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(feature-ideation): address Copilot review on PR #85 (11 fixes + 16 tests)

Triaged 14 inline comments from Copilot's review of #85; two were already
fixed by the tooling_ref→v1 commit, the remaining 11 are addressed here.

Critical bug fixes
------------------

1. lint-prompt.sh now scans claude-code-action v1 `prompt:` blocks in
   addition to v0 `direct_prompt:`. The reusable workflow uses `prompt:`
   so the linter was silently allowing R2 regressions on the very file
   it was supposed to protect. Added two regression tests covering both
   the v1 form and a clean v1 form passes.

2. add_label_to_discussion now sends labelIds as a proper JSON array via
   gh_safe_graphql_input (new helper). Previously used `gh -f labelIds=`
   which sent the literal string `["L_1"]` and the GraphQL API would have
   rejected the mutation at runtime. Added a test that captures gh's
   stdin and asserts the variables block contains a length-1 array.

3. validate-signals.py now registers a `date-time` format checker via
   FormatChecker so the `format: date-time` keyword in signals.schema.json
   is actually enforced. Draft202012Validator does NOT enforce formats
   by default, and the default FormatChecker omits date-time entirely.
   Used an inline checker (datetime.fromisoformat with Z normalisation)
   to avoid pulling in rfc3339-validator. Added two regression tests:
   one for an invalid timestamp failing, one for a clean timestamp
   passing.

4. gh_safe_graphql --jq path no longer swallows jq filter errors with
   `|| true`. Filter typos / wrong paths now exit non-zero instead of
   silently returning []. Added a regression test using a deliberately
   broken filter.

5. collect-signals.sh now computes the open-issue truncation warning
   BEFORE filter_bots_apply. Previously, a result set composed entirely
   of bots could drop below ISSUE_LIMIT after filtering and mask real
   truncation. Added an integration test with all-bot fixtures.

6. match-discussions.sh now validates MATCH_THRESHOLD as a non-negative
   number in [0, 1] before passing to Python. A typo previously surfaced
   as an opaque traceback. Added regression tests for non-numeric input,
   out-of-range input, and boundary values 0 and 1.

Cleanup
-------

7. Removed dead bash `normalize_title` / `jaccard_similarity` functions
   from match-discussions.sh — the actual matching is implemented in the
   embedded Python block and the bash helpers were never called.

8. Schema $id corrected from petry-projects/TalkTerm/... to the canonical
   petry-projects/.github location.

9. signals-schema.bats "validator script exists and is executable" test
   now actually checks the `-x` bit (was only checking `-f` and `-r`).

10. README + filter-bots.sh comments now describe the bot list as a
    "blocklist" (it removes matching authors) instead of "allowlist".

11. test/workflows/feature-ideation/stubs/gh now logs argv with `printf
    '%q '` so each invocation is shell-quoted and re-parseable, matching
    its documentation. Previously logged `$*` which lost arg boundaries.

New helper
----------

gh_safe_graphql_input — same defensive contract as gh_safe_graphql, but
takes a fully-formed JSON request body via stdin instead of -f/-F flags.
Use for mutations whose variables include arrays (e.g. labelIds: [ID!]!)
that gh's flag-based interface cannot express. Five new tests cover
its happy path and every documented failure mode.

Tests
-----

Test count: 92 → 108 (16 new regression tests, all green). Run with:
  bats test/workflows/feature-ideation/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(feature-ideation): address CodeRabbit review on PR #85 (7 fixes + 1 test)

Triaged 13 inline comments from CodeRabbit's review of #85; 6 of them
overlapped with Copilot's review and were already fixed by bcaa579. The
remaining 7 are addressed here.

Fixes
-----

1. lint-prompt.sh: ${VAR} branch lookbehind was inconsistent with the
   $(...) branch — only rejected $$VAR but not \${VAR}. Both branches
   now use [\\$] so backslash-escaped and dollar-escaped forms are
   skipped uniformly.

2. filter-bots.sh: FEATURE_IDEATION_BOT_AUTHORS CSV entries are now
   trimmed of leading/trailing whitespace before being added to the
   blocklist, so "bot1, bot2" matches both bots correctly instead of
   keeping a literal " bot2" entry.

3. validate-signals.py: malformed signals JSON now exits 2 (file/data
   error) to match the documented contract, instead of 1 (which means
   schema validation error).

4. README.md: corrected the workflow filename reference from
   feature-ideation.yml to feature-ideation-reusable.yml, and reworded
   the table cell that contained `\|\|` (escaped pipes that don't
   render correctly in some Markdown engines) to use plain prose. Also
   noted that lint-prompt scans both v0 `direct_prompt:` and v1 `prompt:`.

5. collect-signals.sh: added an explicit comment above SCHEMA_VERSION
   documenting the lockstep requirement with signals.schema.json's
   $comment version annotation. Backed by a new bats test that parses
   both files and asserts they match.

6. signals.schema.json: added $comment "version: 1.0.0" annotation so
   the schema file declares its own version explicitly. Used $comment
   instead of a custom keyword to keep Draft202012 compliance.

7. test/workflows/feature-ideation/match-discussions.bats: build_signals
   helper now computes the discussions count from the array length
   instead of hardcoding 0, so the fixture satisfies its own contract
   (cosmetic — the matcher only reads .items, but contract hygiene
   matters in test scaffolding).

8. test/workflows/feature-ideation/gh-safe.bats: removed the `|| true`
   suffix on the rest-failure assertion that made it always pass.
   Now uses --separate-stderr to capture stderr and asserts the
   structured `[gh-safe][rest-failure]` prefix is emitted on the auth
   failure path. Required `bats_require_minimum_version 1.5.0` to
   suppress the bats-core warning about flag usage.

Tests
-----

Test count: 108 → 109 (one new test for SCHEMA_VERSION ↔ schema sync).
All 109 passing locally. Run with:
  bats test/workflows/feature-ideation/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(compliance-audit): add claude label to individual finding issues

Individual compliance issues were only tagged with `compliance-audit`,
so Claude agents couldn't discover them for remediation. Now all issues
(new and pre-existing) get the `claude` label alongside the umbrella.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: DJ <dj@Rachels-MacBook-Air.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: DJ <dj@Rachels-Air.localdomain>
Copy link
Copy Markdown
Contributor Author

@don-petry don-petry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Automated review — APPROVED

Risk: MEDIUM
Reviewed commit: 5c52288f5bb16fe6e65cb1d2ce8a175848ca6bcf
Cascade: triage → deep (see triage: haiku 4.5 → deep: sonnet 4.6 + duck: gpt-5.4 → audit: opus 4.6 for models)

Summary

This PR refactors the feature-ideation pipeline from an inline workflow heredoc into testable bash/Python modules, fixing 18 bugs identified by Copilot and CodeRabbit. All CI gates pass (ShellCheck, CodeQL, SonarCloud, Agent Security Scan, bats), CodeRabbit approved after multiple rounds of review, and the defensive coding patterns (fail-loud error handling, gh_safe wrappers, format-checked JSON schema validation) are well implemented. No critical security issues found.

Findings

Minor

  • [minor] supply-chain.github/workflows/feature-ideation-reusable.yml:1742 — The tooling_ref workflow_call input (default: 'v1') is passed directly to actions/checkout ref: for petry-projects/.github. Any caller that overrides this to an unreviewed branch can inject arbitrary scripts that run with discussions: write + id-token: write in the analyze job. Risk is limited to internal org callers who must explicitly set the override, but there is no allow-list validation on the input value.
  • [minor] permissions.github/workflows/feature-ideation-reusable.yml:1882id-token: write is granted to the analyze job. This scope was present in the original workflow and appears required by claude-code-action for OIDC; however, if the action ever stops needing it this should be removed to follow least-privilege.

Info

  • [info] correctness — All 18 critical bug fixes look correct: gh_safe wrappers eliminate the 2>/dev/null || echo '[]' swallow pattern, labelIds is now sent as a proper JSON array via gh_safe_graphql_input, the date-time format checker is properly registered on Draft202012Validator, jq filter errors surface as exit-65 instead of being swallowed, and MATCH_THRESHOLD is validated before Python receives it.
  • [info] security — Shell injection risk is reduced vs. prior art: ${{ github.event.repository.name }} references in inline shell heredocs are replaced with environment variables passed via the env: block. The lint-prompt.sh scanner correctly handles ${{ }} GH expressions via a stateful scanner to avoid false positives. (.github/scripts/feature-ideation/lint-prompt.sh)
  • [info] test-coverage — 17 new bats tests cover the critical paths including auth failures, truncation detection, bot filtering, schema format validation, and the Jaccard matcher. All pass in CI.

CI status

Check Result
ShellCheck ✅ SUCCESS
Agent Security Scan ✅ SUCCESS
Lint, schema, and bats ✅ SUCCESS
CodeQL ✅ SUCCESS
SonarCloud ✅ SUCCESS
CodeRabbit ✅ SUCCESS
Claude Code ✅ SUCCESS

Reviewed by the don-petry PR-review cascade (triage: haiku 4.5 → deep: sonnet 4.6 + duck: gpt-5.4 → audit: opus 4.6). Reply with @don-petry if you need a human.

@don-petry don-petry enabled auto-merge (squash) April 16, 2026 13:21
@github-actions
Copy link
Copy Markdown
Contributor

Auto-rebase failed — merge conflict — this branch has conflicts with main that must be resolved manually.

Please resolve the conflicts and push:

git fetch origin
git merge origin/main
# resolve conflicts, then:
git add .
git commit
git push

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants