test(feature-ideation): extract bash, add schema + 91 bats tests#71

Closed
don-petry wants to merge 3 commits into main from feat/feature-ideation-workflow

Conversation

@don-petry
Contributor

@don-petry don-petry commented Apr 7, 2026

Summary

Refactors the feature-ideation.yml parsing surface from inline YAML heredocs into testable scripts with deterministic contracts. Defects that previously required post-merge review can now fail in CI.

Risk register: killed all P0 risks (R1, R2, R3, R4, R5, R6, R7) plus the secondary risks (R8, R9, R10, R11). Zero open critical risks per Murat's trace-gate criteria.

What's new

Scripts (.github/scripts/feature-ideation/)

  • lib/gh-safe.sh — defensive gh / GraphQL wrapper. Replaces 2>/dev/null || echo '[]'. Fails loud on auth, rate-limit, and network errors, the GraphQL errors[] envelope, and data: null.
  • lib/compose-signals.sh — validates JSON inputs before jq --argjson.
  • lib/filter-bots.sh — extensible bot author filter (catches Renovate, CodeRabbit, Copilot, Claude, etc., not just Dependabot).
  • lib/date-utils.sh — cross-platform date arithmetic.
  • collect-signals.sh — orchestrator. Replaces inline YAML heredoc.
  • match-discussions.sh — deterministic Jaccard-similarity matcher. Replaces "use fuzzy matching" prose. Kills the duplicate-Discussion idempotency hole.
  • discussion-mutations.sh — create_discussion / comment_on_discussion / add_label_to_discussion wrappers with DRY_RUN=1 audit-log mode for safe smoke testing.
  • lint-prompt.sh — catches unescaped $() / ${VAR} in direct_prompt: blocks (R2 fix — Mary was previously receiving the literal string $(date -u +%Y-%m-%d)).
  • validate-signals.py — JSON Schema 2020-12 validator.
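
The fail-loud contract that replaces the old `2>/dev/null || echo '[]'` pattern can be sketched as follows. This is a hypothetical helper, not the wrapper's actual code; `check_graphql_response` and its messages are illustrative, but the three rejection cases match the ones listed above.

```shell
# Sketch (assumed interface): reject the failure modes that the old
# `|| echo '[]'` pattern silently downgraded to an empty array.
check_graphql_response() {
  body="$1"
  # Non-JSON (e.g. an HTML rate-limit page) must fail, not become [].
  if ! printf '%s' "$body" | jq -e . >/dev/null 2>&1; then
    echo "gh-safe: response is not valid JSON" >&2
    return 1
  fi
  # A populated errors[] envelope means the query failed even with HTTP 200.
  if [ "$(printf '%s' "$body" | jq 'has("errors") and (.errors | length > 0)')" = "true" ]; then
    echo "gh-safe: GraphQL errors[] envelope" >&2
    return 1
  fi
  # data: null usually signals an auth or permissions problem.
  if [ "$(printf '%s' "$body" | jq '.data == null')" = "true" ]; then
    echo "gh-safe: data is null" >&2
    return 1
  fi
  return 0
}
```

The point of the design is that every one of these cases now exits non-zero, so the CI job fails instead of "succeeding" with useless signals.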

Schema

  • .github/schemas/signals.schema.json — pinned producer/consumer contract for signals.json. CI rejects any drift.

Workflow rewrite

.github/workflows/feature-ideation.yml shrunk from 514 → 188 lines:

  • All bash extracted to scripts.
  • RUN_DATE, SIGNALS_PATH, DRY_RUN passed via env: instead of unescaped shell expansions in the prompt.
  • New dry_run workflow_dispatch input for sandbox smoke testing — uploads JSONL audit log as artifact.
  • Mary's prompt now references env vars by name ($RUN_DATE, $SIGNALS_PATH) and instructs her to use the deterministic matcher + mutation wrappers.

CI gate

.github/workflows/feature-ideation-tests.yml (path-filtered):

  1. shellcheck on every script
  2. JSON Schema validation against every fixture (positive AND negative)
  3. lint-prompt.sh against every workflow file
  4. Full bats suite
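
The core of the prompt-lint check can be approximated with a single grep. This is a simplified stand-in, not the real lint-prompt.sh (which scopes the scan to direct_prompt: blocks inside workflow YAML); the function name and message are illustrative.

```shell
# Simplified sketch of the lint rule: flag unescaped $( or ${ in prompt
# text read from stdin. Bare $VAR references (e.g. $RUN_DATE) are allowed,
# since the workflow intentionally passes those via env:.
lint_prompt_text() {
  if grep -Eq '\$\(|\$[{]'; then
    echo "lint-prompt: unescaped shell expansion in prompt" >&2
    return 1
  fi
  return 0
}
```

This is exactly the class of defect behind R2: `$(date -u +%Y-%m-%d)` inside a YAML block is never expanded by a shell, so the agent received the literal string.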

Tests — 91 tests, 9 suites, 0 failing, 0 skipped

| Suite | Tests | Risks closed |
| --- | --- | --- |
| gh-safe.bats | 19 | R1, R7, R8 |
| compose-signals.bats | 8 | R3, R4 |
| filter-bots.bats | 5 | R10 |
| date-utils.bats | 7 | R9 |
| collect-signals.bats | 14 (integration) | R1, R3, R4, R7, R11 |
| match-discussions.bats | 13 | R5, R6 |
| discussion-mutations.bats | 10 | DRY_RUN contract |
| lint-prompt.bats | 7 | R2 |
| signals-schema.bats | 8 | R3 |

Includes 9 GraphQL/REST response fixtures, 5 expected signals.json fixtures (3 valid + 2 INVALID for negative-path coverage), and a programmable gh PATH stub with single-call and multi-call modes.
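
The PATH-stub idea can be as small as a fake `gh` that echoes a canned fixture. The sketch below shows only the single-call mode (the real stub also supports scripted multi-call sequences); the `GH_STUB_RESPONSE` variable name is an assumption for illustration.

```shell
# Minimal single-call gh stub: a fake `gh` on PATH prints a canned
# response, so code under test never touches the network.
stubdir=$(mktemp -d)
cat > "$stubdir/gh" <<'EOF'
#!/bin/sh
# Ignore all arguments and emit the fixture (single-call mode).
printf '%s' "$GH_STUB_RESPONSE"
EOF
chmod +x "$stubdir/gh"

# Prepend the stub directory so it shadows any real gh binary.
PATH="$stubdir:$PATH"
export GH_STUB_RESPONSE='{"data":{"repository":{"discussions":{"nodes":[]}}}}'
```

Any script that shells out to `gh` now gets the fixture back deterministically, which is what makes the integration suites reproducible.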

Test plan

  • bats test/workflows/feature-ideation/ → 91 passing locally
  • shellcheck -x clean on every script
  • lint-prompt.sh clean against feature-ideation.yml
  • All schema fixtures validate correctly (positive + negative)
  • Workflow YAML parses cleanly
  • CI runs the new feature-ideation-tests.yml workflow on this PR
  • Smoke test on a fork with dry_run: true before merging — inspect the dry-run-log artifact to audit Mary's planned mutations without polluting real Discussions
  • First scheduled live run after merge produces a valid signals.json and creates Discussions as expected
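
The DRY_RUN audit contract works roughly like this. This is an assumed sketch: the real wrapper's log fields and live-path code differ, and `DRY_RUN_LOG` is a hypothetical variable name used here for illustration.

```shell
# Assumed sketch of the DRY_RUN=1 contract: instead of mutating GitHub,
# append one JSONL record per planned mutation to an audit log.
create_discussion() {
  title="$1"; body="$2"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    jq -cn --arg op create_discussion --arg title "$title" --arg body "$body" \
      '{op: $op, title: $title, body: $body}' >> "${DRY_RUN_LOG:-dry-run.jsonl}"
    return 0
  fi
  # The live path would issue the createDiscussion GraphQL mutation here.
  echo "live mutations not implemented in this sketch" >&2
  return 1
}
```

Uploading that JSONL file as an artifact is what lets a reviewer audit every planned mutation before a live run.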

Notes for reviewers

  • Zero changes outside the feature-ideation surface. src/, _bmad-output/planning-artifacts/ (excluding research), and existing workflows are untouched.
  • The lint-prompt test suite includes a meta-test that runs the linter against the live workflow in this repo — that's how we know R2 stays dead.
  • match-discussions.sh uses Python for normalization (NFKD + emoji stripping + stopword removal). Python 3 is on every GitHub runner; no extra deps beyond jsonschema for the validator.
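
The normalization-plus-Jaccard idea can be sketched as a shell function wrapping `python3 -c`, mirroring the script's own design. The stopword list here is a small illustrative subset, not the matcher's actual list, and the function name is hypothetical.

```shell
# Illustrative sketch of the matcher's core: NFKD-fold and tokenize two
# titles in Python (emoji and punctuation dropped, a few stopwords
# removed), then print the Jaccard similarity of the token sets.
title_jaccard() {
  python3 -c '
import re, sys, unicodedata

STOP = {"a", "an", "the", "of", "for", "to", "add", "new", "feature"}

def tokens(raw):
    folded = unicodedata.normalize("NFKD", raw).encode("ascii", "ignore").decode()
    words = re.sub(r"[^a-z0-9]+", " ", folded.lower()).split()
    return {w for w in words if w not in STOP}

a, b = tokens(sys.argv[1]), tokens(sys.argv[2])
union = a | b
print(f"{(len(a & b) / len(union)) if union else 1.0:.2f}")
' "$1" "$2"
}
```

Because the score is a pure function of the two titles, re-running the matcher on the same proposals is idempotent, which is what closes the duplicate-Discussion hole.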

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added automated Feature Ideation workflow that runs weekly and via manual trigger. The workflow collects repository signals (open/closed issues, discussions, PRs, releases), analyzes them with AI, and generates feature proposals with discussion integration.
    • Introduced dry-run mode for auditing proposed changes before execution.
  • Chores

    • Added JSON schema validation and test infrastructure for feature ideation signals pipeline.

DJ and others added 3 commits April 5, 2026 20:06
Scheduled workflow (Fridays 3am ET) that uses Claude Code Action as the
BMAD Analyst (Mary) to research market trends, analyze project signals,
and create per-idea Discussion threads in the Ideas category. Each run
creates new proposals or updates existing ones with evolving findings.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…, SHA

- Add issues:read, pull-requests:read, discussions:read permissions to
  gather-signals job (was missing with top-level permissions: {})
- Switch from job output to artifact for signals data to avoid GitHub
  Actions output size limits with large issue/PR payloads
- Drop issue body from queries to reduce payload size
- Use category-filtered GraphQL query for Ideas discussions instead of
  fetching all and post-filtering (scales to >50 discussions)
- Increase discussion pagination from 50 to 100
- Clarify DST behavior in cron schedule comment (3 AM EDT / 2 AM EST)
- Align claude-code-action SHA to v1.0.89 matching claude.yml

Addresses Copilot review comments on PR #60.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…t prompt

Refactors the feature-ideation workflow's parsing surface from inline YAML
heredocs into testable scripts with deterministic contracts. Every defect
that previously required post-merge review can now fail in CI.

Why: the prior workflow used `2>/dev/null || echo '[]'` everywhere, which
silently downgraded auth failures, rate limits, network outages, and
GraphQL schema drift to empty arrays. The pipeline would "succeed" while
producing useless signals. Mary's prompt also contained `$(date)` inside
a YAML direct_prompt block — YAML doesn't expand shell, so she received
the literal string. None of this was caught until reviewers noticed.

What's new:
- .github/scripts/feature-ideation/ — orchestrator + 4 lib helpers
  - lib/gh-safe.sh: defensive `gh` / GraphQL wrapper that fails loud on
    auth, rate-limit, network, GraphQL errors envelope, and `data: null`
  - lib/compose-signals.sh: validates JSON inputs before jq composition
  - lib/filter-bots.sh: extensible bot author filter
  - lib/date-utils.sh: cross-platform date arithmetic
  - collect-signals.sh: orchestrator (replaces inline YAML heredoc)
  - match-discussions.sh: deterministic Jaccard matcher (replaces "use
    fuzzy matching" prose) — kills the duplicate-Discussion idempotency hole
  - discussion-mutations.sh: create/comment/label wrappers with DRY_RUN=1
    audit-log mode for safe smoke testing
  - lint-prompt.sh: catches unescaped $() / ${VAR} in direct_prompt blocks
  - validate-signals.py: JSON Schema 2020-12 validator

- .github/schemas/signals.schema.json — producer/consumer contract for
  signals.json. CI rejects any drift.

- .github/workflows/feature-ideation.yml — rewritten (514 → 188 lines).
  Bash logic delegated to scripts. RUN_DATE / SIGNALS_PATH / DRY_RUN
  passed via env: instead of unescaped shell expansions in the prompt.
  New `dry_run` workflow_dispatch input for sandbox smoke testing.

- .github/workflows/feature-ideation-tests.yml — path-filtered CI gate:
  shellcheck + bats + schema validation + prompt linter.

- test/workflows/feature-ideation/ — 91 bats tests across 9 suites,
  14 fixtures, programmable gh stub, shared helpers. Tests cover every
  documented failure mode (auth/rate-limit/network/GraphQL errors/null
  data/empty input/non-JSON/truncation/missing category/idempotent retry).

Risks closed (P × I scoring): R1=9, R2=6, R3=6, R4=6, R5=6, R6=6, R7=6,
R8=4, R9=1, R10=3, R11=4. Zero open critical risks.

Test count: 91 passing, 0 failing, 0 skipped. Run with:
  bats test/workflows/feature-ideation/

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 7, 2026 17:11
@coderabbitai

coderabbitai Bot commented Apr 7, 2026

📝 Walkthrough

Walkthrough

Introduces a complete feature ideation workflow system that collects repository signals (issues, discussions, PRs, releases), validates them against a JSON schema, and provides structured data to an AI agent for feature proposal generation and GitHub discussion management. Includes bash helpers for safe API interaction, signal composition, bot filtering, discussion matching, and comprehensive test coverage.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| JSON Schema & Validation: .github/schemas/signals.schema.json, .github/scripts/feature-ideation/validate-signals.py | Defines canonical signals contract with schema version, scan metadata, and typed issue/PR/discussion/release buckets; Python validator enforces schema compliance via jsonschema. |
| GitHub API Wrappers: .github/scripts/feature-ideation/lib/gh-safe.sh | Defensive wrappers for gh REST and GraphQL calls with structured error handling, JSON validation, null-data rejection, and optional jq filtering. |
| Signal Collection: .github/scripts/feature-ideation/collect-signals.sh, lib/compose-signals.sh, lib/filter-bots.sh | Orchestrates data fetch (issues, discussions, releases, PRs) via safe API calls, filters bot-authored items, composes canonical signals JSON, and logs truncation warnings. |
| Date & Utility Helpers: .github/scripts/feature-ideation/lib/date-utils.sh, lint-prompt.sh | Cross-platform UTC date arithmetic; YAML workflow linter enforcing safe direct_prompt content by rejecting unescaped shell expansions. |
| Discussion Management: .github/scripts/feature-ideation/discussion-mutations.sh, match-discussions.sh | GraphQL mutation wrappers for discussion create/comment/label operations with DRY_RUN audit logging; deterministic title-matching via token-set Jaccard similarity. |
| CI/CD Workflows: .github/workflows/feature-ideation.yml, .github/workflows/feature-ideation-tests.yml | Production workflow (weekly + manual trigger) gathering signals and invoking AI analysis; test workflow validating scripts, schema, fixtures, and prompt safety. |
| Test Infrastructure: test/workflows/feature-ideation/helpers/setup.bash, stubs/gh | Shared Bats helper exporting paths and lifecycle utilities; multi-call gh stub for deterministic test sequencing. |
| Comprehensive Test Suites: test/workflows/feature-ideation/*.bats, fixtures/**/* | Bats tests covering signal collection, composition, date utils, bot filtering, discussion mutations, GH-safe wrappers, prompt linting, matching logic, and schema validation; JSON fixtures for valid/invalid signals and mocked GraphQL responses. |
| Documentation: .github/scripts/feature-ideation/README.md | Describes refactored workflow structure, helper responsibilities, test execution, CI integration, DRY_RUN audit flow, and schema compatibility requirements. |

Sequence Diagrams

```mermaid
sequenceDiagram
    actor User
    participant GitHub as GitHub Actions<br/>(feature-ideation.yml)
    participant Collect as collect-signals.sh
    participant API as GitHub API<br/>(gh-safe.sh)
    participant Schema as validate-signals.py
    participant Analyze as Claude Code<br/>Action (AI Agent)
    participant Match as match-discussions.sh
    participant Mutate as discussion-mutations.sh

    User->>GitHub: Trigger (weekly or manual)
    GitHub->>Collect: Execute gather-signals job
    Collect->>API: Fetch open/closed issues
    API->>Collect: Return issues array
    Collect->>API: Fetch discussion category IDs
    API->>Collect: Return categories
    Collect->>API: Fetch Ideas discussions
    API->>Collect: Return discussions
    Collect->>API: Fetch releases & merged PRs
    API->>Collect: Return PRs/releases
    Collect->>Collect: Compose signals.json
    Collect->>Schema: Validate signals
    Schema->>Collect: ✓ Valid
    GitHub->>GitHub: Upload signals artifact
    GitHub->>Analyze: Execute analyze job<br/>(pass signals.json)
    Analyze->>Analyze: Generate proposals.json
    Analyze->>Match: Deterministic matching
    Match->>Analyze: Return matched/<br/>new_candidates
    Analyze->>Mutate: Create discussions<br/>(live or DRY_RUN)
    Mutate->>API: GraphQL mutations
    API->>Mutate: Success
    Mutate->>Analyze: ✓ Complete
    Analyze->>GitHub: Upload dry-run.jsonl<br/>(if DRY_RUN)
```
```mermaid
sequenceDiagram
    participant Developer
    participant Test as CI (feature-ideation-tests.yml)
    participant ShellCheck as shellcheck
    participant LintPrompt as lint-prompt.sh
    participant Fixture as Schema Validator<br/>(validate-signals.py)
    participant Bats as Bats Test Suite

    Developer->>Test: Push to PR (affecting<br/>feature-ideation files)
    Test->>ShellCheck: Lint all bash scripts
    ShellCheck->>Test: Report syntax issues
    Test->>LintPrompt: Scan workflows for<br/>unsafe $(...) in direct_prompt
    LintPrompt->>Test: Pass/Fail
    Test->>Fixture: Validate each<br/>fixtures/*.signals.json
    Fixture->>Test: Reject INVALID-*<br/>Accept valid fixtures
    Test->>Bats: Run comprehensive<br/>test suite
    Bats->>Bats: collect-signals,<br/>compose, mutations, etc.
    Bats->>Test: Report results
    Test->>Developer: ✓ All checks pass
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 35.56%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly and concisely describes the main change: extracting bash logic into testable scripts, adding a JSON schema contract, and introducing comprehensive test coverage (91 bats tests). |


@don-petry don-petry marked this pull request as draft April 7, 2026 17:14
@don-petry
Contributor Author

⚠️ This PR is based on a stale view of main — converted to draft

While DJ was AFK, I (Murat / Claude) discovered that main has moved on substantially since this branch was created:

That means:

  1. The 514-line workflow this PR refactored no longer exists in TalkTerm as of main.
  2. The bash + jq + GraphQL parsing surface this PR hardens lives in the org reusable workflow, not in TalkTerm.
  3. Merging this PR as-is would revert "refactor: use org-wide reusable feature-ideation workflow" (#68) by re-inlining all the bash that was just removed — almost certainly the wrong call.

What this PR still has of value

The 91 bats tests, schema, and scripts are all sound — they just target the wrong repo. The recommended path forward is one of:

Option A — Move the work to the org repo (highest leverage)
Port .github/scripts/feature-ideation/, .github/schemas/signals.schema.json, and test/workflows/feature-ideation/ to petry-projects/.github, refactor feature-ideation-reusable.yml to call the extracted scripts, and let every BMAD repo benefit from the hardening on next scheduled run. This is the right place architecturally — the parsing logic should live next to the workflow that actually owns it.

Option B — Salvage the prompt linter only
The lint-prompt.sh linter (R2 fix) has general value beyond feature-ideation. Keep that one script + its bats tests as a TalkTerm CI utility, drop everything else, and abandon this PR with a follow-up issue against petry-projects/.github for the rest.

Option C — Close this PR, file findings as an issue against the org repo
Document the risk register (R1–R11) and the proposed test strategy as an issue on petry-projects/.github, let the maintainer of the reusable workflow decide whether to take the test scaffolding wholesale.

What I am NOT doing

Waiting for DJ to choose a path.

— Murat (BMAD TEA)

Copilot AI left a comment


Pull request overview

Refactors the feature-ideation.yml workflow from inline bash/YAML heredocs into a set of testable scripts with a pinned signals.json schema and a dedicated CI gate, aiming to make the workflow deterministic and fail-loud in CI rather than post-merge.

Changes:

  • Extracted feature-ideation logic into .github/scripts/feature-ideation/* (collection, matching, prompt linting, mutations, schema validation).
  • Added JSON Schema contract for signals.json plus fixtures and schema-validation tooling.
  • Added a new path-filtered GitHub Actions workflow that runs shellcheck, schema checks, and a bats suite (91 tests).

Reviewed changes

Copilot reviewed 38 out of 38 changed files in this pull request and generated 6 comments.

Show a summary per file
| File | Description |
| --- | --- |
| test/workflows/feature-ideation/stubs/gh | gh CLI stub supporting scripted multi-call scenarios for integration tests. |
| test/workflows/feature-ideation/helpers/setup.bash | Shared bats setup helpers (repo paths, temp dirs, stub installation). |
| test/workflows/feature-ideation/gh-safe.bats | Unit tests for defensive gh/GraphQL wrapper behavior. |
| test/workflows/feature-ideation/compose-signals.bats | Tests composition logic and input validation for signals assembly. |
| test/workflows/feature-ideation/filter-bots.bats | Tests filtering bot-authored items from signals inputs. |
| test/workflows/feature-ideation/date-utils.bats | Tests cross-platform date helper functions. |
| test/workflows/feature-ideation/collect-signals.bats | End-to-end integration tests for collect-signals.sh using fixtures and schema validation. |
| test/workflows/feature-ideation/match-discussions.bats | Tests deterministic proposal↔discussion matching and idempotency behavior. |
| test/workflows/feature-ideation/discussion-mutations.bats | Tests DRY_RUN logging contract and "fail loud" behavior for mutations wrapper. |
| test/workflows/feature-ideation/lint-prompt.bats | Tests prompt linter detection for unescaped shell expansions. |
| test/workflows/feature-ideation/signals-schema.bats | Validates fixtures against the new signals JSON Schema. |
| test/workflows/feature-ideation/fixtures/gh-responses/issue-list-open.json | Fixture: open issues payload for integration tests. |
| test/workflows/feature-ideation/fixtures/gh-responses/issue-list-closed.json | Fixture: closed issues payload for cutoff filtering tests. |
| test/workflows/feature-ideation/fixtures/gh-responses/graphql-categories.json | Fixture: discussion categories GraphQL response. |
| test/workflows/feature-ideation/fixtures/gh-responses/graphql-no-ideas-category.json | Fixture: categories response without "Ideas". |
| test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions.json | Fixture: discussions GraphQL response (not truncated). |
| test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions-truncated.json | Fixture: discussions response with hasNextPage=true to trigger warnings. |
| test/workflows/feature-ideation/fixtures/gh-responses/graphql-errors-envelope.json | Fixture: GraphQL errors[] envelope for fail-loud tests. |
| test/workflows/feature-ideation/fixtures/gh-responses/release-list.json | Fixture: releases list payload. |
| test/workflows/feature-ideation/fixtures/gh-responses/pr-list-merged.json | Fixture: merged PR list payload for cutoff filtering tests. |
| test/workflows/feature-ideation/fixtures/expected/empty-repo.signals.json | Expected signals.json (empty repo) fixture. |
| test/workflows/feature-ideation/fixtures/expected/populated.signals.json | Expected signals.json (populated) fixture. |
| test/workflows/feature-ideation/fixtures/expected/truncated.signals.json | Expected signals.json fixture including truncation warnings. |
| test/workflows/feature-ideation/fixtures/expected/INVALID-missing-field.signals.json | Negative fixture: missing required field. |
| test/workflows/feature-ideation/fixtures/expected/INVALID-bad-repo.signals.json | Negative fixture: invalid repo slug. |
| .github/workflows/feature-ideation.yml | Workflow rewrite to call extracted scripts, validate schema, and support dry-run logging. |
| .github/workflows/feature-ideation-tests.yml | New CI gate to run shellcheck, prompt lint, schema fixture validation, and bats suites. |
| .github/scripts/feature-ideation/README.md | Documents script responsibilities, test strategy, DRY_RUN mode, and schema contract. |
| .github/scripts/feature-ideation/collect-signals.sh | Orchestrates signal collection, filtering, truncation warnings, and stable output shape. |
| .github/scripts/feature-ideation/discussion-mutations.sh | Wraps discussion mutations with DRY_RUN JSONL logging contract. |
| .github/scripts/feature-ideation/match-discussions.sh | Deterministic Jaccard-similarity matcher for proposals vs existing discussions. |
| .github/scripts/feature-ideation/lint-prompt.sh | Lints workflow direct_prompt: blocks for unescaped $() / ${VAR}. |
| .github/scripts/feature-ideation/validate-signals.py | Validates signals.json against the schema (Draft 2020-12). |
| .github/scripts/feature-ideation/lib/gh-safe.sh | Defensive wrapper around gh REST/GraphQL calls with fail-loud semantics. |
| .github/scripts/feature-ideation/lib/filter-bots.sh | Filters bot-authored items using a configurable bot-login list. |
| .github/scripts/feature-ideation/lib/date-utils.sh | Cross-platform date arithmetic utilities. |
| .github/scripts/feature-ideation/lib/compose-signals.sh | Validates JSON inputs and composes canonical signals.json. |
| .github/schemas/signals.schema.json | JSON Schema contract for signals.json (additionalProperties=false at top-level). |


Comment on lines +151 to +159

```bash
  # If caller asked for a jq filter, apply it now and return that.
  if [ "$has_jq" -eq 1 ]; then
    local filtered
    filtered=$(printf '%s' "$raw" | jq -c "$jq_filter" 2>/dev/null || true)
    if [ -z "$filtered" ] || [ "$filtered" = "null" ]; then
      # The filter resolved to null/empty — caller probably wants "[]" semantics
      # for "no nodes found". Return the empty array sentinel.
      printf '%s' "$GH_SAFE_EMPTY_ARRAY"
      return 0
```
Copilot AI Apr 7, 2026


In gh_safe_graphql, jq filter failures are currently swallowed (... | jq ... || true), and then treated as “empty/null” and normalized to []. That can mask jq syntax errors or schema drift and reintroduce the “silent empty data” failure mode this wrapper is meant to prevent. Capture jq’s exit status and fail non-zero on jq errors; only normalize to [] when the filter evaluates successfully to null/empty.
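The fix Copilot is asking for can be sketched like this. It is a hypothetical standalone function, not the wrapper's actual code: capture jq's exit status explicitly, and only normalize to `[]` when the filter succeeded and genuinely produced null/empty.

```shell
# Sketch: distinguish "filter failed" (fail loud) from "filter succeeded
# and found nothing" (normalize to []).
apply_jq_filter() {
  raw="$1"; jq_filter="$2"
  filtered=$(printf '%s' "$raw" | jq -c "$jq_filter" 2>&1)
  status=$?
  if [ "$status" -ne 0 ]; then
    # jq syntax error or schema drift: surface it instead of returning [].
    echo "jq filter failed (exit $status): $filtered" >&2
    return 1
  fi
  if [ -z "$filtered" ] || [ "$filtered" = "null" ]; then
    printf '%s' '[]'   # genuine "no nodes found" case
    return 0
  fi
  printf '%s' "$filtered"
}
```

The key detail is reading `$?` immediately after the command substitution, since the assignment propagates jq's exit status.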

Copilot uses AI. Check for mistakes.

```bash
  gh_safe_graphql -f query="$query" \
    -f labelableId="$discussion_id" \
    -f labelIds="[\"$label_id\"]"
```

Copilot AI Apr 7, 2026


add_label_to_discussion passes labelIds via -f labelIds="[\"$label_id\"]", which makes the GraphQL variable a string containing JSON rather than an actual [ID!]! array. This will cause the mutation to fail at runtime. Pass labelIds as a typed JSON array (e.g., using -F), or use the CLI’s array variable syntax so it is encoded as an array.

Suggested change

```diff
-    -f labelIds="[\"$label_id\"]"
+    -F labelIds[]="$label_id"
```

Comment on lines +34 to +57

```bash
normalize_title() {
  python3 -c '
import re, sys, unicodedata
raw = sys.argv[1]
# Strip emoji and other symbol/other categories.
no_emoji = "".join(ch for ch in raw if unicodedata.category(ch)[0] not in ("S",))
# Lowercase + ascii fold.
ascii_text = unicodedata.normalize("NFKD", no_emoji).encode("ascii", "ignore").decode()
ascii_text = ascii_text.lower()
# Replace non-alphanumeric with space.
collapsed = re.sub(r"[^a-z0-9]+", " ", ascii_text).strip()
# Drop stopwords.
stopwords = {"a","an","the","of","for","to","and","or","with","in","on","by","via","as","is","are","be","support","add","new","feature","idea"}
tokens = [t for t in collapsed.split() if t and t not in stopwords]
print(" ".join(tokens))
' "$1"
}
export -f normalize_title

# Compute Jaccard similarity between two normalized token strings.
jaccard_similarity() {
  python3 -c '
import sys
a = set(sys.argv[1].split())
```

Copilot AI Apr 7, 2026


normalize_title / jaccard_similarity are defined and exported but never used (the script uses the embedded Python implementation instead). Keeping two separate normalization implementations (and two stopword lists) invites drift and makes the behavior harder to audit. Remove these unused bash/Python one-liners or refactor so there is a single source of truth for normalization/similarity.

Suggested change

```diff
-normalize_title() {
-python3 -c '
-import re, sys, unicodedata
-raw = sys.argv[1]
-# Strip emoji and other symbol/other categories.
-no_emoji = "".join(ch for ch in raw if unicodedata.category(ch)[0] not in ("S",))
-# Lowercase + ascii fold.
-ascii_text = unicodedata.normalize("NFKD", no_emoji).encode("ascii", "ignore").decode()
-ascii_text = ascii_text.lower()
-# Replace non-alphanumeric with space.
-collapsed = re.sub(r"[^a-z0-9]+", " ", ascii_text).strip()
-# Drop stopwords.
-stopwords = {"a","an","the","of","for","to","and","or","with","in","on","by","via","as","is","are","be","support","add","new","feature","idea"}
-tokens = [t for t in collapsed.split() if t and t not in stopwords]
-print(" ".join(tokens))
-' "$1"
-}
-export -f normalize_title
-# Compute Jaccard similarity between two normalized token strings.
-jaccard_similarity() {
-python3 -c '
-import sys
-a = set(sys.argv[1].split())
+# Title normalization and Jaccard similarity are implemented once in the
+# main embedded Python matcher below. Keeping a second shell-exported copy
+# here would duplicate the stopword list and normalization rules.
```
Comment on lines +36 to +38

```bash
  local IFS=','
  # shellcheck disable=SC2206
  local extras=($FEATURE_IDEATION_BOT_AUTHORS)
```

Copilot AI Apr 7, 2026


Parsing FEATURE_IDEATION_BOT_AUTHORS via IFS=','; extras=($FEATURE_IDEATION_BOT_AUTHORS) preserves leading/trailing whitespace (e.g. "bot1, bot2" yields " bot2"), which will silently fail to match and filter that bot. Consider trimming whitespace and dropping empty entries when building the extras list.

Suggested change

```diff
-  local IFS=','
-  # shellcheck disable=SC2206
-  local extras=($FEATURE_IDEATION_BOT_AUTHORS)
+  local extra
+  local extras=()
+  while IFS= read -r extra; do
+    extra="${extra#"${extra%%[![:space:]]*}"}"
+    extra="${extra%"${extra##*[![:space:]]}"}"
+    if [ -n "$extra" ]; then
+      extras+=("$extra")
+    fi
+  done < <(printf '%s' "$FEATURE_IDEATION_BOT_AUTHORS" | tr ',' '\n')
```

Comment on lines +18 to +21

```bash
tt_make_tmpdir() {
  TT_TMP="$(mktemp -d)"
  export TT_TMP
}
```

Copilot AI Apr 7, 2026


mktemp -d is not portable on macOS/BSD mktemp (it typically requires a template or -t). Since the README suggests running these bats suites on macOS, this helper can fail locally. Use a portable mktemp invocation (e.g. provide a template / -t fallback) so tests run cross-platform.

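A portable variant of the helper could look like the sketch below. Note the premise is hedged: modern GNU and macOS `mktemp` both accept a bare `-d`, so the `-t` template fallback mainly matters on older BSD userlands.

```shell
# Portable temp-dir helper: try plain `mktemp -d` first, then fall back
# to the BSD form that requires a -t template prefix.
tt_make_tmpdir() {
  TT_TMP="$(mktemp -d 2>/dev/null || mktemp -d -t 'tt-tmp')"
  # Fail loudly if neither form produced a directory.
  [ -n "$TT_TMP" ] && [ -d "$TT_TMP" ] || return 1
  export TT_TMP
}
```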
Comment on lines +65 to +66

```python
    validator = Draft202012Validator(schema)
    errors = sorted(validator.iter_errors(signals), key=lambda e: list(e.absolute_path))
```

Copilot AI Apr 7, 2026


Draft202012Validator(schema) does not enforce JSON Schema format (e.g. scan_date’s format: date-time) unless a FormatChecker is provided. That means malformed timestamps can still pass “schema validation”. Consider instantiating the validator with a format checker (or explicitly validating formats) so the schema contract is actually enforced.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 10

🧹 Nitpick comments (7)
test/workflows/feature-ideation/date-utils.bats (1)

34-38: String comparison for date ordering is correct but could use a comment.

The \< string comparison works for ISO dates (YYYY-MM-DD) because lexicographic ordering matches chronological ordering. This is a subtle but valid approach — a brief inline comment would aid future maintainers.
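The property being relied on can be demonstrated in one line: zero-padded ISO-8601 dates sort identically as strings and as dates.

```shell
# YYYY-MM-DD is zero-padded, so lexicographic order equals chronological
# order; a portable way to check is to sort the two strings and confirm
# the earlier date comes first.
earlier="2026-03-08"
later="2026-04-07"
first=$(printf '%s\n%s\n' "$later" "$earlier" | sort | head -n 1)
```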

.github/scripts/feature-ideation/validate-signals.py (1)

65-66: Consider validating the schema itself before use.

If the schema file contains valid JSON but invalid JSON Schema, the error message will be confusing. Adding a validator.check_schema(schema) call would catch malformed schemas early:

♻️ Optional schema validation
```diff
+    try:
+        Draft202012Validator.check_schema(schema)
+    except Exception as exc:
+        sys.stderr.write(f"[validate-signals] invalid schema: {exc}\n")
+        return 2
+
     validator = Draft202012Validator(schema)
```
.github/scripts/feature-ideation/lib/filter-bots.sh (1)

41-41: Consider simplifying the double jq invocation.

The pipeline printf '%s\n' "${list[@]}" | jq -R . | jq -sc . can be simplified to a single jq call using --slurp --raw-input with splitting:

♻️ Optional simplification
-  printf '%s\n' "${list[@]}" | jq -R . | jq -sc .
+  printf '%s\n' "${list[@]}" | jq -Rs 'split("\n") | map(select(. != ""))'
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/lib/filter-bots.sh at line 41, Replace the
two-step jq pipeline on the printf output in the filter-bots.sh snippet with a
single jq invocation that uses --slurp/--raw-input to read all lines as one
string and then split on newlines (dropping the final empty element) to produce
the same JSON array; update the line containing printf '%s\n' "${list[@]}" | jq
-R . | jq -sc . to call jq once (using --slurp/--raw-input and split) so the
result remains a JSON array of entries but with a simpler, single jq call.
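A quick standalone check (assumes jq is on PATH) that the one-call form matches the two-step pipeline, using a hypothetical bot list rather than the script's real default:

```shell
# Hypothetical bot list; the real defaults live in filter-bots.sh.
list=("dependabot[bot]" "renovate[bot]" "alice")

# Two-step: encode each line as a JSON string, then slurp into an array.
two_step=$(printf '%s\n' "${list[@]}" | jq -R . | jq -sc .)

# One-step: slurp raw input as a single string, split, drop the trailing "".
one_step=$(printf '%s\n' "${list[@]}" | jq -Rsc 'split("\n") | map(select(. != ""))')

echo "$two_step"
echo "$one_step"
```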
test/workflows/feature-ideation/filter-bots.bats (1)

47-57: Potential quoting fragility in env extension test.

Line 51 embeds '$input' inside the double-quoted sh -c string, so the outer shell expands it before the subshell runs and the surrounding single quotes keep the JSON intact. That works today, but would break if $input ever contained a single quote. Consider using a heredoc or temporary file for more robust JSON passing:

♻️ More robust alternative
 `@test` "filter-bots: env extension adds custom bot logins" {
-  input='[{"number":1,"title":"x","author":{"login":"my-custom-bot"}},{"number":2,"title":"y","author":{"login":"alice"}}]'
-  result=$(FEATURE_IDEATION_BOT_AUTHORS="my-custom-bot" sh -c "
-    . '${TT_SCRIPTS_DIR}/lib/filter-bots.sh'
-    printf '%s' '$input' | filter_bots_apply
-  ")
+  local input_file="${BATS_TEST_TMPDIR}/input.json"
+  cat > "$input_file" <<'JSON'
+[{"number":1,"title":"x","author":{"login":"my-custom-bot"}},{"number":2,"title":"y","author":{"login":"alice"}}]
+JSON
+  result=$(FEATURE_IDEATION_BOT_AUTHORS="my-custom-bot" bash -c "
+    . '${TT_SCRIPTS_DIR}/lib/filter-bots.sh'
+    filter_bots_apply < '$input_file'
+  ")
   count=$(printf '%s' "$result" | jq 'length')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/workflows/feature-ideation/filter-bots.bats` around lines 47 - 57, The
test uses sh -c with a literal '$input' inside single quotes which breaks if
$input contains single quotes; update the test in filter-bots.bats to pass the
JSON to filter_bots_apply in a robust way (e.g., use a heredoc or a temporary
file) instead of embedding '$input' inside the sh -c string—ensure
FEATURE_IDEATION_BOT_AUTHORS is still exported for the subshell and call
filter_bots_apply (the function sourced from lib/filter-bots.sh) reading from
stdin via the heredoc or temp file so the JSON is preserved regardless of
embedded quotes.
.github/scripts/feature-ideation/lib/date-utils.sh (1)

18-25: Minor inefficiency: date is executed twice for the success path.

The function runs date once to test if the format works (discarding output) and again to capture the result. This could be optimized by capturing the output directly and checking the exit code:

♻️ Optional optimization
-  if date -u -d "${days} days ago" +%Y-%m-%d >/dev/null 2>&1; then
-    date -u -d "${days} days ago" +%Y-%m-%d
-  elif date -u -v-"${days}"d +%Y-%m-%d >/dev/null 2>&1; then
-    date -u -v-"${days}"d +%Y-%m-%d
+  local result
+  if result=$(date -u -d "${days} days ago" +%Y-%m-%d 2>/dev/null); then
+    printf '%s\n' "$result"
+  elif result=$(date -u -v-"${days}"d +%Y-%m-%d 2>/dev/null); then
+    printf '%s\n' "$result"
   else
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/lib/date-utils.sh around lines 18 - 25, The
date invocation is run twice for each successful branch; change the if/elif
blocks so you capture the output of date into a variable and test the command
success, e.g. run date -u -d "${days} days ago" (or date -u -v-"${days}"d) once
into a temp var and use if [ $? -eq 0 ] or if output=$(date ... 2>/dev/null);
then echo "$output"; fi — update both the "-d" branch and the "-v-...d" branch
in the current conditional (the lines using date -u -d and date -u
-v-"${days}"d) to avoid running date twice.
test/workflows/feature-ideation/stubs/gh (1)

21-23: Logging loses argument boundaries with $*.

Using $* concatenates arguments with the first character of IFS (space by default), losing the distinction between gh api "foo bar" and gh api foo bar. For debugging complex test failures, preserving argument boundaries would help:

♻️ Suggested improvement
 if [ -n "${GH_STUB_LOG:-}" ]; then
-  printf '%s\n' "$*" >>"$GH_STUB_LOG"
+  printf '%q ' "$@" >>"$GH_STUB_LOG"
+  printf '\n' >>"$GH_STUB_LOG"
 fi

This uses %q to quote each argument preserving boundaries. Alternatively, use a tab-separated format.
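A bash sketch of the %q form (the log_args name is illustrative, not from the stub):

```shell
log_args() {
  # %q shell-quotes each argument, so "foo bar" stays one token in the log
  # instead of collapsing into two space-separated words.
  printf '%q ' "$@"
  printf '\n'
}
log_args api "foo bar" baz
```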

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/workflows/feature-ideation/stubs/gh` around lines 21 - 23, The current
stub logger uses printf with "$*" which collapses argument boundaries; change
the writing to use "$@" with a quoted format so each argument boundary is
preserved (e.g. use printf '%q ' with "$@" and then a newline) and keep the same
GH_STUB_LOG target; update the printf invocation that currently references "$*"
to use "$@" and a %q-based format to retain argument quoting.
test/workflows/feature-ideation/signals-schema.bats (1)

13-16: Consider adding a test for validator error message quality.

The suite verifies exit codes but doesn't assert on error message content. Adding a test that checks the error output contains useful diagnostic info (like the invalid field path) would improve confidence in debugging capability.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/workflows/feature-ideation/signals-schema.bats` around lines 13 - 16,
Add a new Bats test that runs the validator ($VALIDATOR) against a deliberately
malformed input and asserts the process exits non‑zero and that stderr contains
useful diagnostics (e.g., the invalid field path or property name). Reference
the existing test pattern and variables (use $VALIDATOR and a small temp invalid
payload) and check both the exit code and that the captured error message
includes the invalid field path substring to ensure good error quality.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/scripts/feature-ideation/discussion-mutations.sh:
- Around line 146-158: The gh_safe_graphql call is sending labelIds incorrectly
as a JSON string via -f, so the GraphQL mutation (addLabelsToLabelable with
variable labelIds: [ID!]!) receives a string instead of a list; change the
gh_safe_graphql invocation to use form-array syntax (use -F for form fields) and
send labelIds as an array like labelIds[]="$label_id" while keeping
labelableId="$discussion_id" and query intact so the GraphQL variable labelIds
is parsed as a proper list.

In @.github/scripts/feature-ideation/lib/compose-signals.sh:
- Around line 49-55: The loop that currently checks only JSON validity for the
inputs in the for statement (the variables iterated in the for input in ...
block) must also assert the JSON value is an array; replace the jq validation
that now tests only parseability with a jq expression that verifies
type=="array" (keeping the same error path that prints the truncated input via
printf and returns 65) so non-array JSON is rejected early; update the error
message if you want to mention “not a JSON array” when the jq check fails and
keep using the same idx/printf logic referenced in the current
compose-signals.sh loop.

In @.github/scripts/feature-ideation/lib/gh-safe.sh:
- Around line 151-159: The current block running jq (using filtered=$(... | jq
-c "$jq_filter" 2>/dev/null || true)) collapses jq runtime errors into an empty
result; change it so you capture jq's exit status separately: run jq without
discarding stderr, store its output into the filtered variable and its exit code
into a variable (e.g., jq_status), and if jq_status is non-zero emit the jq
error to stderr and return a non-zero failure (do not return
GH_SAFE_EMPTY_ARRAY); only when jq succeeds and filtered is empty or "null"
should you return the GH_SAFE_EMPTY_ARRAY sentinel. This touches the has_jq
check, the filtered variable, the jq_filter invocation, and the
GH_SAFE_EMPTY_ARRAY return logic.

In @.github/scripts/feature-ideation/lint-prompt.sh:
- Around line 102-113: The loop currently sets exit=2 for missing files but
later unconditionally sets exit=1 when scan_file fails, which can downgrade a
prior file-error code; modify the scan_file failure branch so it only sets
exit=1 when a more severe code hasn't already been recorded (e.g., if [ "$exit"
-ne 2 ]; then exit=1; fi), referencing the loop variable exit, the scan_file
invocation and the '[lint-prompt] not found: %s\n' missing-file path check.

In @.github/scripts/feature-ideation/match-discussions.sh:
- Around line 141-161: The current greedy loop in which each proposal iterates
disc_norm and assigns the first-best available discussion (using p_norm,
jaccard, best_sim, best, seen_disc_ids, matched, new_candidates, threshold) is
order-dependent; instead compute the similarity for every proposal/discussion
pair up front, collect tuples (proposal, discussion, similarity), sort them
descending by similarity, then iterate that sorted list claiming the
highest-similarity pair when both the proposal and discussion are still
unassigned (use sets to track assigned proposal ids and seen_disc_ids), add
claimed pairs to matched only if similarity >= threshold, and after claiming
finish by adding any unassigned proposals to new_candidates with their
best_similarity. Ensure you still round similarities the same way when storing
results.
- Around line 135-139: The loop over proposals currently only checks that
proposal is a dict and has a "title" key but then calls
normalize(proposal["title"]) which will crash if title is not a string (e.g.,
null or number); update the guard in the proposals loop to also verify
isinstance(proposal.get("title"), str) (or type(proposal["title"]) is str)
before calling normalize, and if the check fails write the same sys.stderr
skipping message (including the malformed proposal repr) and continue so
normalize is only invoked with true strings.

In
`@test/workflows/feature-ideation/fixtures/expected/INVALID-bad-repo.signals.json`:
- Line 4: The INVALID-bad-repo fixture is passing because the JSON Schema
pattern for "repo" (^[^/]+/[^/]+$) is not being enforced during validation;
locate the validator used for signal fixtures (e.g., the function
validateSignals or validateFixture that loads the schema) and ensure the schema
is actually applied and pattern validation is enabled (configure your JSON
Schema validator like AJV to compile/validate using the provided schema and not
skip pattern checks). Specifically, confirm the schema is passed when calling
validate(...) for the fixture check, enable/disable relevant options that might
ignore patterns, and add/adjust the test to assert that
INVALID-bad-repo.signals.json (repo = "not-a-valid-slug") fails validation.

In
`@test/workflows/feature-ideation/fixtures/expected/INVALID-missing-field.signals.json`:
- Around line 1-6: The INVALID-missing-field.signals.json fixture is marked
invalid but passes validation because the schema isn't enforcing the missing
required fields; either update the schema to require the missing properties
(closed_issues_30d, ideas_discussions, releases, merged_prs_30d,
feature_requests, bug_reports, truncation_warnings) in the Signals schema (e.g.,
SignalsSchema / validateSignals function) or change the fixture to include those
fields so it genuinely fails; locate the validator (validateSignals /
validateSignalsSchema) and add the properties to the "required" list or modify
the fixture INVALID-missing-field.signals.json to include placeholder values for
the missing fields so the test outcome matches the intent.

In
`@test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions.json`:
- Around line 1-20: The fixture's GitHub GraphQL "discussions" payload is
missing fields; update the discussion object(s) under repository.discussions so
pageInfo includes totalCount and endCursor, ensure labels contains totalCount,
pageInfo and its nodes include id, name and color (not just name), and ensure
comments has pageInfo plus a nodes array with comment objects each containing
id, body, createdAt and updatedAt; target the JSON structures named
"discussions", "pageInfo", "labels", and "comments" within the existing
repository.discussions.nodes discussion entries (e.g., the object with id "D_1")
and add the missing fields to match the real API shape.

In `@test/workflows/feature-ideation/gh-safe.bats`:
- Around line 70-76: The assertion that checks for the "rest-failure" text in
the "rest: EXITS NON-ZERO on auth failure (gh exit 4)" test is rendered a no-op
by the trailing "|| true"; locate the test block where GH_STUB_EXIT and
gh_safe_rest are used and either remove the entire redundant text-check line if
the exit-status check is sufficient, or remove the "|| true" and keep the
double-bracket test so it actually fails when the expected "rest-failure" text
is not present (i.e., make the [[ "$output" == *"rest-failure"* ]] || [[
"$stderr" == *"rest-failure"* ]] assertion meaningful).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1e344bd8-694d-4f87-b1a8-11da0b110fb2

📥 Commits

Reviewing files that changed from the base of the PR and between 63cdca3 and 4f6fa42.

📒 Files selected for processing (38)
  • .github/schemas/signals.schema.json
  • .github/scripts/feature-ideation/README.md
  • .github/scripts/feature-ideation/collect-signals.sh
  • .github/scripts/feature-ideation/discussion-mutations.sh
  • .github/scripts/feature-ideation/lib/compose-signals.sh
  • .github/scripts/feature-ideation/lib/date-utils.sh
  • .github/scripts/feature-ideation/lib/filter-bots.sh
  • .github/scripts/feature-ideation/lib/gh-safe.sh
  • .github/scripts/feature-ideation/lint-prompt.sh
  • .github/scripts/feature-ideation/match-discussions.sh
  • .github/scripts/feature-ideation/validate-signals.py
  • .github/workflows/feature-ideation-tests.yml
  • .github/workflows/feature-ideation.yml
  • test/workflows/feature-ideation/collect-signals.bats
  • test/workflows/feature-ideation/compose-signals.bats
  • test/workflows/feature-ideation/date-utils.bats
  • test/workflows/feature-ideation/discussion-mutations.bats
  • test/workflows/feature-ideation/filter-bots.bats
  • test/workflows/feature-ideation/fixtures/expected/INVALID-bad-repo.signals.json
  • test/workflows/feature-ideation/fixtures/expected/INVALID-missing-field.signals.json
  • test/workflows/feature-ideation/fixtures/expected/empty-repo.signals.json
  • test/workflows/feature-ideation/fixtures/expected/populated.signals.json
  • test/workflows/feature-ideation/fixtures/expected/truncated.signals.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-categories.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions-truncated.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-errors-envelope.json
  • test/workflows/feature-ideation/fixtures/gh-responses/graphql-no-ideas-category.json
  • test/workflows/feature-ideation/fixtures/gh-responses/issue-list-closed.json
  • test/workflows/feature-ideation/fixtures/gh-responses/issue-list-open.json
  • test/workflows/feature-ideation/fixtures/gh-responses/pr-list-merged.json
  • test/workflows/feature-ideation/fixtures/gh-responses/release-list.json
  • test/workflows/feature-ideation/gh-safe.bats
  • test/workflows/feature-ideation/helpers/setup.bash
  • test/workflows/feature-ideation/lint-prompt.bats
  • test/workflows/feature-ideation/match-discussions.bats
  • test/workflows/feature-ideation/signals-schema.bats
  • test/workflows/feature-ideation/stubs/gh

Comment on lines +146 to +158
mutation($labelableId: ID!, $labelIds: [ID!]!) {
addLabelsToLabelable(input: {
labelableId: $labelableId,
labelIds: $labelIds
}) {
clientMutationId
}
}
GRAPHQL

gh_safe_graphql -f query="$query" \
-f labelableId="$discussion_id" \
-f labelIds="[\"$label_id\"]"

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

How does gh api graphql encode GraphQL variables passed with -f/--raw-field versus -F/--field, and what is the correct syntax for sending a [ID!]! variable like labelIds?

💡 Result:

  • -f/--raw-field key=value always sends the value as a JSON string (i.e., it’s not type-converted), so GraphQL variables you pass this way will be "some text" even if you intended a number/boolean/array. gh api describes this as “static string parameters.” [1]
  • -F/--field key=value applies “magic” JSON type conversion (e.g., true/false/null, integers, @file, placeholders), and it also supports array construction via repeated key[]=value entries. [1]
  • For GraphQL requests (gh api graphql), every field you pass other than query / operationName becomes a GraphQL variable. [1]

Correct syntax for a [ID!]! variable like labelIds

You must send it as a JSON array of ID strings. With gh api graphql, the idiomatic way is:

gh api graphql -f query='
  mutation($id: ID!, $labelIds: [ID!]!) {
    updateIssue(input: {id: $id, labelIds: $labelIds}) { clientMutationId }
  }
' -F id='PR_NODE_ID' \
  -F 'labelIds[]=LABEL_NODE_ID_1' \
  -F 'labelIds[]=LABEL_NODE_ID_2'

This matches how GitHub’s GraphQL mutations expect labelIds: ["LABEL_ID"] (an array of IDs), not a single string. [2]

If you instead do -f labelIds='["..."]', gh will send that as a string, and GraphQL will reject it because it’s not actually a list. [1]

Sources: [1] gh api manual; [2] GitHub CLI discussion showing labelIds as an ID array in GraphQL.


🏁 Script executed:

cat -n .github/scripts/feature-ideation/discussion-mutations.sh | sed -n '140,165p'

Repository: petry-projects/TalkTerm

Length of output: 597


🏁 Script executed:

# Check context around the function to understand label_id usage
cat -n .github/scripts/feature-ideation/discussion-mutations.sh | sed -n '120,180p'

Repository: petry-projects/TalkTerm

Length of output: 1301


🏁 Script executed:

# Check if gh_safe_graphql is defined and how it handles variables
grep -n "gh_safe_graphql" .github/scripts/feature-ideation/discussion-mutations.sh | head -20

Repository: petry-projects/TalkTerm

Length of output: 191


🏁 Script executed:

# Find where gh_safe_graphql is defined
rg "gh_safe_graphql\s*\(\)" .github/scripts/

Repository: petry-projects/TalkTerm

Length of output: 134


🏁 Script executed:

cat -n .github/scripts/feature-ideation/lib/gh-safe.sh | grep -A 15 "gh_safe_graphql()"

Repository: petry-projects/TalkTerm

Length of output: 515


🏁 Script executed:

cat -n .github/scripts/feature-ideation/lib/gh-safe.sh | sed -n '89,130p'

Repository: petry-projects/TalkTerm

Length of output: 1270


🏁 Script executed:

# Check if there are other calls to add_label_to_discussion or similar patterns
rg "add_label_to_discussion\|labelIds" .github/scripts/

Repository: petry-projects/TalkTerm

Length of output: 49


🏁 Script executed:

# Check other mutations in the file to see how they handle variables
rg "gh_safe_graphql" -A 5 .github/scripts/feature-ideation/discussion-mutations.sh | head -40

Repository: petry-projects/TalkTerm

Length of output: 458


Use -F with array syntax to send labelIds as a proper GraphQL list.

Line 158 passes labelIds using -f (raw-field), which sends the value as a literal JSON string instead of a parsed list. The GraphQL mutation expects labelIds: [ID!]! (a list of IDs), but it will receive a string like "[\"LABEL_ID\"]" and fail at the GitHub API.

Use -F labelIds[]="$label_id" instead, which applies proper JSON type conversion and array construction syntax:

Fix
   gh_safe_graphql -f query="$query" \
     -f labelableId="$discussion_id" \
-    -f labelIds="[\"$label_id\"]"
+    -F labelIds[]="$label_id"

Comment on lines +49 to +55
for input in "$open_issues" "$closed_issues" "$ideas_discussions" "$releases" \
"$merged_prs" "$feature_requests" "$bug_reports" "$truncation_warnings"; do
idx=$((idx + 1))
if ! printf '%s' "$input" | jq -e . >/dev/null 2>&1; then
printf '[compose-signals] arg #%d is not valid JSON: %s\n' "$idx" "${input:0:120}" >&2
return 65 # EX_DATAERR
fi

⚠️ Potential issue | 🟡 Minor

Enforce JSON array type, not just JSON syntax

Line 52 validates only “valid JSON”, but this function's contract requires arrays for these args. Non-array JSON can pass here and fail later with less-clear errors. Validate type == "array" at this boundary.

Proposed fix
-  for input in "$open_issues" "$closed_issues" "$ideas_discussions" "$releases" \
-               "$merged_prs" "$feature_requests" "$bug_reports" "$truncation_warnings"; do
+  for input in "$open_issues" "$closed_issues" "$ideas_discussions" "$releases" \
+               "$merged_prs" "$feature_requests" "$bug_reports" "$truncation_warnings"; do
     idx=$((idx + 1))
-    if ! printf '%s' "$input" | jq -e . >/dev/null 2>&1; then
-      printf '[compose-signals] arg #%d is not valid JSON: %s\n' "$idx" "${input:0:120}" >&2
+    if ! printf '%s' "$input" | jq -e 'type == "array"' >/dev/null 2>&1; then
+      printf '[compose-signals] arg #%d must be a JSON array: %s\n' "$idx" "${input:0:120}" >&2
       return 65  # EX_DATAERR
     fi
   done
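As a standalone illustration of the mechanism the fix relies on (assumes jq on PATH): jq -e maps a false result to a non-zero exit status, so the type check doubles as validation:

```shell
# jq -e exits non-zero when the program's output is false or null,
# so `type == "array"` rejects valid JSON that is not an array.
printf '%s' '[1,2,3]' | jq -e 'type == "array"' >/dev/null \
  && echo "accepted: JSON array"
printf '%s' '{"a":1}' | jq -e 'type == "array"' >/dev/null \
  || echo "rejected: valid JSON, not an array"
```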

Comment on lines +151 to +159
# If caller asked for a jq filter, apply it now and return that.
if [ "$has_jq" -eq 1 ]; then
local filtered
filtered=$(printf '%s' "$raw" | jq -c "$jq_filter" 2>/dev/null || true)
if [ -z "$filtered" ] || [ "$filtered" = "null" ]; then
# The filter resolved to null/empty — caller probably wants "[]" semantics
# for "no nodes found". Return the empty array sentinel.
printf '%s' "$GH_SAFE_EMPTY_ARRAY"
return 0

⚠️ Potential issue | 🟠 Major

Don't downgrade jq failures to “no results”.

Lines 154-159 turn both jq errors and null outputs into []. A typoed filter or nested response change then looks exactly like an empty collection, which reintroduces the silent data-loss path this wrapper is supposed to prevent.

Possible fix
   if [ "$has_jq" -eq 1 ]; then
     local filtered
-    filtered=$(printf '%s' "$raw" | jq -c "$jq_filter" 2>/dev/null || true)
-    if [ -z "$filtered" ] || [ "$filtered" = "null" ]; then
-      # The filter resolved to null/empty — caller probably wants "[]" semantics
-      # for "no nodes found". Return the empty array sentinel.
-      printf '%s' "$GH_SAFE_EMPTY_ARRAY"
-      return 0
-    fi
+    if ! filtered=$(printf '%s' "$raw" | jq -c "$jq_filter" 2>/dev/null); then
+      _gh_safe_err "graphql-bad-jq" "filter=$jq_filter"
+      return 65
+    fi
+    if [ "$filtered" = "null" ]; then
+      _gh_safe_err "graphql-null-jq" "filter=$jq_filter"
+      return 65
+    fi
+    if [ -z "$filtered" ]; then
+      printf '%s' "$GH_SAFE_EMPTY_ARRAY"
+      return 0
+    fi
     printf '%s' "$filtered"
     return 0
   fi
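A standalone sketch of the distinction (assumes jq on PATH): capturing jq's exit status separates a broken filter from a filter that legitimately resolves to null:

```shell
raw='{"data":{"nodes":[]}}'

# A broken filter: indexing an object with a number is a jq runtime error,
# reported through jq's exit status.
if ! filtered=$(printf '%s' "$raw" | jq -c '.[0]' 2>/dev/null); then
  echo "jq failed: surface an error, do not return []"
fi

# A filter that succeeds but finds nothing resolves to the string "null";
# only this case deserves empty-array semantics.
filtered=$(printf '%s' "$raw" | jq -c '.data.missing')
if [ "$filtered" = "null" ]; then
  echo "empty result: the [] sentinel is appropriate here"
fi
```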
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/lib/gh-safe.sh around lines 151 - 159, The
current block running jq (using filtered=$(... | jq -c "$jq_filter" 2>/dev/null
|| true)) collapses jq runtime errors into an empty result; change it so you
capture jq's exit status separately: run jq without discarding stderr, store its
output into the filtered variable and its exit code into a variable (e.g.,
jq_status), and if jq_status is non-zero emit the jq error to stderr and return
a non-zero failure (do not return GH_SAFE_EMPTY_ARRAY); only when jq succeeds
and filtered is empty or "null" should you return the GH_SAFE_EMPTY_ARRAY
sentinel. This touches the has_jq check, the filtered variable, the jq_filter
invocation, and the GH_SAFE_EMPTY_ARRAY return logic.
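The pattern the prompt describes can be sketched as a standalone function. `GH_SAFE_EMPTY_ARRAY` and the error helper mirror names from this PR's gh-safe.sh, but the sketch is illustrative, not the library itself:

```shell
#!/usr/bin/env bash
# Sketch: distinguish "jq failed" and "filter gave null" (hard errors)
# from "jq succeeded with empty output" (the legitimate [] case).
GH_SAFE_EMPTY_ARRAY='[]'

err() { printf 'gh-safe: %s\n' "$*" >&2; }   # stand-in for _gh_safe_err

apply_jq_filter() {
  local raw=$1 jq_filter=$2 filtered
  if ! filtered=$(printf '%s' "$raw" | jq -c "$jq_filter" 2>/dev/null); then
    err "graphql-bad-jq filter=$jq_filter"    # jq syntax or runtime error
    return 65
  fi
  if [ "$filtered" = "null" ]; then
    err "graphql-null-jq filter=$jq_filter"   # filter resolved to JSON null
    return 65
  fi
  if [ -z "$filtered" ]; then
    printf '%s' "$GH_SAFE_EMPTY_ARRAY"        # no nodes found -> [] sentinel
    return 0
  fi
  printf '%s' "$filtered"
}
```

The sketch keeps the original's `2>/dev/null`; a stricter variant would surface jq's own stderr message instead of discarding it.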

Comment on lines +102 to +113
  local exit=0
  for file in "$@"; do
    if [ ! -f "$file" ]; then
      printf '[lint-prompt] not found: %s\n' "$file" >&2
      exit=2
      continue
    fi
    if ! scan_file "$file"; then
      exit=1
    fi
  done
  return "$exit"
⚠️ Potential issue | 🟡 Minor

Preserve exit code 2 when file errors occur

Line 106 sets file errors to 2, but Line 110 can downgrade it to 1 later. That breaks the documented exit-code contract when both conditions occur in one run.

Proposed fix
-  local exit=0
+  local exit_code=0
   for file in "$@"; do
     if [ ! -f "$file" ]; then
       printf '[lint-prompt] not found: %s\n' "$file" >&2
-      exit=2
+      exit_code=2
       continue
     fi
     if ! scan_file "$file"; then
-      exit=1
+      if [ "$exit_code" -lt 2 ]; then
+        exit_code=1
+      fi
     fi
   done
-  return "$exit"
+  return "$exit_code"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/lint-prompt.sh around lines 102 - 113, The
loop currently sets exit=2 for missing files but later unconditionally sets
exit=1 when scan_file fails, which can downgrade a prior file-error code; modify
the scan_file failure branch so it only sets exit=1 when a more severe code
hasn't already been recorded (e.g., if [ "$exit" -ne 2 ]; then exit=1; fi),
referencing the loop variable exit, the scan_file invocation and the
'[lint-prompt] not found: %s\n' missing-file path check.
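The severity-preserving accumulator can be exercised standalone. `check_file` below is a dummy stand-in for the PR's scan_file (it flags files containing the marker string BAD), so the exit-code ladder is the only real content here:

```shell
#!/usr/bin/env bash
# Dummy scanner: "fails" when a file contains the marker string BAD.
check_file() { ! grep -q 'BAD' "$1"; }

lint_files() {
  local exit_code=0 file
  for file in "$@"; do
    if [ ! -f "$file" ]; then
      printf 'not found: %s\n' "$file" >&2
      exit_code=2                 # file error: most severe, must survive
      continue
    fi
    if ! check_file "$file"; then
      if [ "$exit_code" -lt 2 ]; then
        exit_code=1               # scan failure never downgrades a prior 2
      fi
    fi
  done
  return "$exit_code"
}
```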

Comment on lines +135 to +139
for proposal in proposals:
    if not isinstance(proposal, dict) or "title" not in proposal:
        sys.stderr.write(f"[match-discussions] skipping malformed proposal: {proposal!r}\n")
        continue
    p_norm = normalize(proposal["title"])
⚠️ Potential issue | 🟠 Major

Guard title type before normalizing.

Line 139 assumes every present title is a string. A proposal like {"title": null} or {"title": 123} passes the current check and then crashes the whole matcher instead of being skipped as malformed.

Possible fix
-    if not isinstance(proposal, dict) or "title" not in proposal:
+    if (
+        not isinstance(proposal, dict)
+        or not isinstance(proposal.get("title"), str)
+        or not proposal["title"].strip()
+    ):
         sys.stderr.write(f"[match-discussions] skipping malformed proposal: {proposal!r}\n")
         continue
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/match-discussions.sh around lines 135 -
139, The loop over proposals currently only checks that proposal is a dict and
has a "title" key but then calls normalize(proposal["title"]) which will crash
if title is not a string (e.g., null or number); update the guard in the
proposals loop to also verify isinstance(proposal.get("title"), str) (or
type(proposal["title"]) is str) before calling normalize, and if the check fails
write the same sys.stderr skipping message (including the malformed proposal
repr) and continue so normalize is only invoked with true strings.

Comment on lines +141 to +161
best = None
best_sim = 0.0
for disc, d_norm in disc_norm:
    if disc.get("id") in seen_disc_ids:
        continue
    sim = jaccard(p_norm, d_norm)
    if sim > best_sim:
        best_sim = sim
        best = disc

if best is not None and best_sim >= threshold:
    matched.append(
        {
            "proposal": proposal,
            "discussion": best,
            "similarity": round(best_sim, 4),
        }
    )
    seen_disc_ids.add(best.get("id"))
else:
    new_candidates.append({"proposal": proposal, "best_similarity": round(best_sim, 4)})
⚠️ Potential issue | 🟠 Major

This matching is still order-dependent.

Lines 141-161 greedily claim the best currently-available discussion for each proposal. That lets an earlier weak match consume a discussion that a later proposal matches much better, pushing the later one into new_candidates and risking duplicate discussion creation. Score all proposal/discussion pairs first, then claim the highest-similarity unused pairs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/scripts/feature-ideation/match-discussions.sh around lines 141 -
161, The current greedy loop in which each proposal iterates disc_norm and
assigns the first-best available discussion (using p_norm, jaccard, best_sim,
best, seen_disc_ids, matched, new_candidates, threshold) is order-dependent;
instead compute the similarity for every proposal/discussion pair up front,
collect tuples (proposal, discussion, similarity), sort them descending by
similarity, then iterate that sorted list claiming the highest-similarity pair
when both the proposal and discussion are still unassigned (use sets to track
assigned proposal ids and seen_disc_ids), add claimed pairs to matched only if
similarity >= threshold, and after claiming finish by adding any unassigned
proposals to new_candidates with their best_similarity. Ensure you still round
similarities the same way when storing results.
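The claiming strategy the prompt describes can be sketched order-independently. `normalize` and `jaccard` below are simplified stand-ins for the PR's helpers, so treat this as a shape sketch rather than the matcher itself:

```python
def normalize(title: str) -> set[str]:
    # Simplified stand-in: lowercase token set.
    return set(title.lower().split())


def jaccard(a: set[str], b: set[str]) -> float:
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def match(proposals, discussions, threshold=0.5):
    # Score every proposal/discussion pair up front, then claim pairs
    # from a single globally sorted list so iteration order cannot let
    # a weak early match steal a discussion from a stronger later one.
    norms = [normalize(d["title"]) for d in discussions]
    pairs = []
    for pi, p in enumerate(proposals):
        p_norm = normalize(p["title"])
        for di, _ in enumerate(discussions):
            pairs.append((jaccard(p_norm, norms[di]), pi, di))
    pairs.sort(key=lambda t: t[0], reverse=True)

    matched, used_p, used_d = [], set(), set()
    best_sim = {}
    for sim, pi, di in pairs:
        best_sim.setdefault(pi, sim)  # first occurrence is the max (sorted desc)
        if sim < threshold or pi in used_p or di in used_d:
            continue
        matched.append({"proposal": proposals[pi],
                        "discussion": discussions[di],
                        "similarity": round(sim, 4)})
        used_p.add(pi)
        used_d.add(di)

    new_candidates = [
        {"proposal": proposals[pi], "best_similarity": round(best_sim.get(pi, 0.0), 4)}
        for pi in range(len(proposals)) if pi not in used_p
    ]
    return matched, new_candidates
```

With this shape, a proposal that matches a discussion at 1.0 always claims it even when an earlier proposal matched the same discussion at a lower score.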

{
  "schema_version": "1.0.0",
  "scan_date": "2026-04-07T07:00:00Z",
  "repo": "not-a-valid-slug",
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify the repo pattern constraint in the schema and test validation

echo "=== Checking signals.schema.json for repo pattern constraint ==="
if jq -e '.properties.repo.pattern' .github/schemas/signals.schema.json > /dev/null 2>&1; then
  echo "✓ Schema has repo pattern constraint:"
  jq '.properties.repo.pattern' .github/schemas/signals.schema.json
else
  echo "⚠ WARNING: No pattern constraint found for repo field"
fi

echo -e "\n=== Testing INVALID-bad-repo.signals.json (should FAIL) ==="
if .github/scripts/feature-ideation/validate-signals.py \
  .github/schemas/signals.schema.json \
  test/workflows/feature-ideation/fixtures/expected/INVALID-bad-repo.signals.json 2>&1; then
  echo "❌ ERROR: INVALID fixture passed validation"
  exit 1
else
  echo "✓ INVALID fixture correctly failed validation"
fi

Repository: petry-projects/TalkTerm

Length of output: 348


Fix validation: INVALID-bad-repo fixture is incorrectly passing validation.

The schema correctly defines a pattern constraint (^[^/]+/[^/]+$) for the repo field to enforce owner/repo format, but the validation script is not enforcing it. The "not-a-valid-slug" value violates the pattern (missing /) and should fail validation, but the fixture passes instead. Either the validation script is not properly checking the schema constraint, or the schema is not being applied correctly during validation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@test/workflows/feature-ideation/fixtures/expected/INVALID-bad-repo.signals.json`
at line 4, The INVALID-bad-repo fixture is passing because the JSON Schema
pattern for "repo" (^[^/]+/[^/]+$) is not being enforced during validation;
locate the validator used for signal fixtures (e.g., the function
validateSignals or validateFixture that loads the schema) and ensure the schema
is actually applied and pattern validation is enabled (configure your JSON
Schema validator like AJV to compile/validate using the provided schema and not
skip pattern checks). Specifically, confirm the schema is passed when calling
validate(...) for the fixture check, enable/disable relevant options that might
ignore patterns, and add/adjust the test to assert that
INVALID-bad-repo.signals.json (repo = "not-a-valid-slug") fails validation.
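As a hedged sketch of what this fixture is supposed to exercise: the schema's pattern `^[^/]+/[^/]+$` rejects any value that is not exactly `owner/repo`. A stdlib-only equivalent check (the real validator goes through JSON Schema; the function name here is illustrative):

```python
import re

# Illustrative subset of the contract; the source of truth is
# .github/schemas/signals.schema.json.
REPO_PATTERN = re.compile(r"^[^/]+/[^/]+$")


def repo_errors(doc: dict) -> list[str]:
    errors = []
    repo = doc.get("repo")
    if not isinstance(repo, str):
        errors.append("repo: expected string")
    elif not REPO_PATTERN.fullmatch(repo):
        errors.append(f"repo: {repo!r} does not match owner/repo")
    return errors
```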

Comment on lines +1 to +6
{
"schema_version": "1.0.0",
"scan_date": "2026-04-07T07:00:00Z",
"repo": "petry-projects/talkterm",
"open_issues": { "count": 0, "items": [] }
}
⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify INVALID fixtures fail validation and valid fixtures pass

echo "=== Testing INVALID-missing-field.signals.json (should FAIL) ==="
if .github/scripts/feature-ideation/validate-signals.py \
  .github/schemas/signals.schema.json \
  test/workflows/feature-ideation/fixtures/expected/INVALID-missing-field.signals.json 2>&1; then
  echo "❌ ERROR: INVALID fixture passed validation"
  exit 1
else
  echo "✓ INVALID fixture correctly failed validation"
fi

echo -e "\n=== Testing empty-repo.signals.json (should PASS) ==="
if .github/scripts/feature-ideation/validate-signals.py \
  .github/schemas/signals.schema.json \
  test/workflows/feature-ideation/fixtures/expected/empty-repo.signals.json 2>&1; then
  echo "✓ Valid fixture correctly passed validation"
else
  echo "❌ ERROR: Valid fixture failed validation"
  exit 1
fi

Repository: petry-projects/TalkTerm

Length of output: 233


Fix fixture or schema validation: INVALID-missing-field.signals.json currently PASSES validation when it should FAIL.

The fixture with the INVALID- prefix is supposed to trigger a validation failure, but verification shows it passes validation successfully. This fixture contains only 4 fields (schema_version, scan_date, repo, open_issues) and is missing: closed_issues_30d, ideas_discussions, releases, merged_prs_30d, feature_requests, bug_reports, and truncation_warnings. Either the schema is not enforcing these as required fields, or the fixture should not have the INVALID- prefix. This breaks the intended validation test.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@test/workflows/feature-ideation/fixtures/expected/INVALID-missing-field.signals.json`
around lines 1 - 6, The INVALID-missing-field.signals.json fixture is marked
invalid but passes validation because the schema isn't enforcing the missing
required fields; either update the schema to require the missing properties
(closed_issues_30d, ideas_discussions, releases, merged_prs_30d,
feature_requests, bug_reports, truncation_warnings) in the Signals schema (e.g.,
SignalsSchema / validateSignals function) or change the fixture to include those
fields so it genuinely fails; locate the validator (validateSignals /
validateSignalsSchema) and add the properties to the "required" list or modify
the fixture INVALID-missing-field.signals.json to include placeholder values for
the missing fields so the test outcome matches the intent.
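A schema fragment that would make this fixture fail as intended — the field names are taken from the review above, with the property definitions elided:

```json
{
  "required": [
    "schema_version", "scan_date", "repo",
    "open_issues", "closed_issues_30d", "ideas_discussions",
    "releases", "merged_prs_30d", "feature_requests",
    "bug_reports", "truncation_warnings"
  ]
}
```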

Comment on lines +1 to +20
{
"data": {
"repository": {
"discussions": {
"pageInfo": {"hasNextPage": false},
"nodes": [
{
"id": "D_1",
"number": 7,
"title": "💡 Streaming voice responses",
"createdAt": "2026-03-01T00:00:00Z",
"updatedAt": "2026-04-01T00:00:00Z",
"labels": {"nodes": [{"name": "enhancement"}]},
"comments": {"totalCount": 3}
}
]
}
}
}
}
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

What is the exact structure of GitHub GraphQL API v4 response for repository discussions query, including the fields id, number, title, createdAt, updatedAt, labels, and comments? How are labels and comments structured in the response?

💡 Result:

Below is the typical JSON response shape you’ll get back from a repository discussions GraphQL v4 query that requests:

  • id, number, title, createdAt, updatedAt
  • labels (connection)
  • comments (connection)

Example query (shape that determines the response)

query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    discussions(first: 10) {
      totalCount
      pageInfo { endCursor hasNextPage }

      nodes {
        id
        number
        title
        createdAt
        updatedAt

        labels(first: 10) {
          totalCount
          pageInfo { endCursor hasNextPage }
          nodes { id name color }
          edges {
            cursor
            node { id name color }
          }
        }

        comments(first: 10) {
          totalCount
          pageInfo { endCursor hasNextPage }
          nodes { id body createdAt updatedAt }
          edges {
            cursor
            node { id body createdAt updatedAt }
          }
        }
      }

      edges {
        cursor
        node { id }  # (or the full node fields above)
      }
    }
  }
}

GitHub’s repository { discussions(...) } field returns a DiscussionConnection with totalCount, pageInfo, edges { cursor node { ... } }, and nodes [ ... ]. [1]

Corresponding response structure (example)

{
  "data": {
    "repository": {
      "discussions": {
        "totalCount": 123,
        "pageInfo": {
          "endCursor": "Y3Vyc29yOnYyOpH...",
          "hasNextPage": true
        },
        "nodes": [
          {
            "id": "D_kwDO...",
            "number": 42,
            "title": "Discussion title",
            "createdAt": "2024-01-01T12:34:56Z",
            "updatedAt": "2024-01-02T10:00:00Z",

            "labels": {
              "totalCount": 2,
              "pageInfo": { "endCursor": null, "hasNextPage": false },
              "nodes": [
                { "id": "LA_kwDO...", "name": "bug", "color": "d73a4a" }
              ],
              "edges": [
                {
                  "cursor": "Y3Vyc29yOnYyOp...",
                  "node": { "id": "LA_kwDO...", "name": "bug", "color": "d73a4a" }
                }
              ]
            },

            "comments": {
              "totalCount": 5,
              "pageInfo": { "endCursor": "Y3Vyc29yOnYyOp...", "hasNextPage": false },
              "nodes": [
                {
                  "id": "DC_kwDO...",
                  "body": "First comment",
                  "createdAt": "2024-01-01T13:00:00Z",
                  "updatedAt": "2024-01-01T13:05:00Z"
                }
              ],
              "edges": [
                {
                  "cursor": "Y3Vyc29yOnYyOp...",
                  "node": {
                    "id": "DC_kwDO...",
                    "body": "First comment",
                    "createdAt": "2024-01-01T13:00:00Z",
                    "updatedAt": "2024-01-01T13:05:00Z"
                  }
                }
              ]
            }
          }
        ],
        "edges": [
          {
            "cursor": "Y3Vyc29yOnYyOp...",
            "node": { "id": "D_kwDO..." }
          }
        ]
      }
    }
  }
}

How labels and comments are structured

labels

Discussion.labels is a LabelConnection (i.e., a Relay-style connection). [2]
A LabelConnection is structured as: edges[] (each has cursor + node), plus nodes[], plus pageInfo + totalCount. [3]

comments

Discussion.comments(...) returns a DiscussionCommentConnection. [4]
A DiscussionCommentConnection is also a Relay-style connection with: edges[], nodes[], pageInfo, and totalCount. [4]

Key point: in GitHub GraphQL v4, connections almost always let you choose either nodes (simpler) or edges { cursor node } (if you need per-item cursors)—or both. [1][4]

Sources: GitHub Discussions GraphQL guide (connection shape + Discussion fields like number, title, createdAt, updatedAt, and comments). [1] GitHub GraphQL reference indicating labels is a LabelConnection. [2] LabelConnection schema shape. [3] DiscussionCommentConnection schema shape. [4]


Update fixture to match GitHub GraphQL API discussions response structure.

The fixture is incomplete compared to GitHub's actual GraphQL response:

  • discussions connection: Missing totalCount and endCursor in pageInfo
  • labels: Should include totalCount, pageInfo, and label nodes should have id and color fields (not just name)
  • comments: Missing pageInfo and nodes array; should contain comment objects with id, body, createdAt, and updatedAt
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions.json`
around lines 1 - 20, The fixture's GitHub GraphQL "discussions" payload is
missing fields; update the discussion object(s) under repository.discussions so
pageInfo includes totalCount and endCursor, ensure labels contains totalCount,
pageInfo and its nodes include id, name and color (not just name), and ensure
comments has pageInfo plus a nodes array with comment objects each containing
id, body, createdAt and updatedAt; target the JSON structures named
"discussions", "pageInfo", "labels", and "comments" within the existing
repository.discussions.nodes discussion entries (e.g., the object with id "D_1")
and add the missing fields to match the real API shape.

Comment on lines +70 to +76
@test "rest: EXITS NON-ZERO on auth failure (gh exit 4)" {
  GH_STUB_EXIT=4 \
  GH_STUB_STDERR='HTTP 401: Bad credentials' \
    run gh_safe_rest issue list --repo foo/bar
  [ "$status" -ne 0 ]
  [[ "$output" == *"rest-failure"* ]] || [[ "$stderr" == *"rest-failure"* ]] || true
}
⚠️ Potential issue | 🟡 Minor

Assertion on line 75 is a no-op due to trailing || true.

The compound condition always succeeds because of || true, making it dead code. If you intend to verify the error message content, remove || true. If the exit status check alone is sufficient, remove line 75 entirely.

Proposed fix

Option A - Remove the no-op line:

 @test "rest: EXITS NON-ZERO on auth failure (gh exit 4)" {
   GH_STUB_EXIT=4 \
   GH_STUB_STDERR='HTTP 401: Bad credentials' \
     run gh_safe_rest issue list --repo foo/bar
   [ "$status" -ne 0 ]
-  [[ "$output" == *"rest-failure"* ]] || [[ "$stderr" == *"rest-failure"* ]] || true
 }

Option B - Make the assertion meaningful (if message verification is desired):

-  [[ "$output" == *"rest-failure"* ]] || [[ "$stderr" == *"rest-failure"* ]] || true
+  [[ "$output" == *"rest-failure"* || "$stderr" == *"rest-failure"* ]]
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@test/workflows/feature-ideation/gh-safe.bats` around lines 70 - 76, The
assertion that checks for the "rest-failure" text in the "rest: EXITS NON-ZERO
on auth failure (gh exit 4)" test is rendered a no-op by the trailing "|| true";
locate the test block where GH_STUB_EXIT and gh_safe_rest are used and either
remove the entire redundant text-check line if the exit-status check is
sufficient, or remove the "|| true" and keep the double-bracket test so it
actually fails when the expected "rest-failure" text is not present (i.e., make
the [[ "$output" == *"rest-failure"* ]] || [[ "$stderr" == *"rest-failure"* ]]
assertion meaningful).
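A small bash demonstration of why the trailing `|| true` makes the check dead code, alongside the fixed single-`[[` form (the sample `$output` contains no "rest-failure"):

```shell
#!/usr/bin/env bash
output="HTTP 401: Bad credentials"   # no "rest-failure" anywhere
stderr=""

# Original form: the final "|| true" forces status 0 regardless of matches.
rc_noop=0
[[ "$output" == *"rest-failure"* ]] || [[ "$stderr" == *"rest-failure"* ]] || true
rc_noop=$?

# Fixed form: one compound [[ ]] whose status reflects the actual match.
rc_real=0
[[ "$output" == *"rest-failure"* || "$stderr" == *"rest-failure"* ]] || rc_real=$?

echo "noop=$rc_noop real=$rc_real"   # noop=0 real=1
```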

@don-petry
Contributor Author

Closing — wrong branch. The hardening work should not be mixed into the old feat/feature-ideation-workflow branch (which is a separate, stale topic). Will move the commit to a fresh branch off current main once DJ confirms the target.

Commit 4f6fa42 is preserved on the branch and unreferenced from any PR — recoverable at any time.
