fix(compliance-audit): add claude label to individual finding issues#121
fix(compliance-audit): add claude label to individual finding issues#121
Conversation
|
Caution Review failedPull request was closed or merged during review 📝 WalkthroughWalkthroughThis PR refactors the Feature Ideation workflow logic from inline shell/jq commands into a modular, testable script suite. It introduces signal collection and schema validation, discussion matching, mutation handling, and a comprehensive test framework with fixtures and stubs. Changes
Sequence DiagramsequenceDiagram
participant Workflow as Reusable Workflow
participant Collect as collect-signals.sh
participant GH as GitHub API
participant Compose as compose-signals.sh
participant Validate as validate-signals.py
participant Schema as signals.schema.json
participant Match as match-discussions.sh
participant Mutate as discussion-mutations.sh
Workflow->>Collect: execute with REPO, GH_TOKEN
Collect->>GH: fetch open issues, closed issues
Collect->>GH: query discussion categories
Collect->>GH: fetch discussions, releases, merged PRs
GH-->>Collect: JSON response arrays
Collect->>Compose: invoke with fetched arrays
Compose-->>Collect: signals.json payload
Collect->>Validate: validate signals.json
Validate->>Schema: load schema
Schema-->>Validate: schema definition
Validate-->>Collect: validation result
Collect-->>Workflow: signals.json written
Workflow->>Match: match proposals to discussions
Match-->>Workflow: matched/new_candidates JSON
Workflow->>Mutate: create discussions or comments
Mutate-->>Workflow: mutation result (or DRY_RUN log)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
… tests
Refactors the reusable feature-ideation workflow's parsing surface from
an inline 600-line YAML heredoc into testable scripts with deterministic
contracts. Every defect that previously required post-merge review can
now fail in CI before adopters notice.
Why
---
The prior reusable workflow used `2>/dev/null || echo '[]'` for every
gh / GraphQL call, which silently downgraded auth failures, rate limits,
network outages, and GraphQL schema drift to empty arrays. The pipeline
would "succeed" while producing useless signals — and Mary's Discussion
posts would silently degrade across every BMAD repo on the org. The
prompt also instructed Mary to "use fuzzy matching" against existing
Ideas Discussions in her head, which is non-deterministic and untestable.
Risk register (probability × impact, scale 1–9):
R1=9 swallow-all-errors gh wrapper
R2=6 literal $() inside YAML direct prompt
R3=6 no signals.json schema
R4=6 jq --argjson crash on empty input
R5=6 fuzzy match in Mary's prompt → duplicate Discussions
R6=6 retry idempotency hole
R7=6 GraphQL errors[]/null data not detected
R8=4 GraphQL partial errors silently accepted
R10=3 bot filter only catches dependabot/github-actions
R11=4 pagination silently truncates
What's new
----------
.github/scripts/feature-ideation/
collect-signals.sh Orchestrator (replaces inline heredoc)
validate-signals.py JSON Schema 2020-12 validator
match-discussions.sh Deterministic Jaccard matcher (kills R5/R6)
discussion-mutations.sh create/comment/label wrappers + DRY_RUN mode
lint-prompt.sh Catches unescaped $() / ${VAR} in prompt blocks
lib/gh-safe.sh Defensive gh wrapper, fails loud on every
documented failure mode (kills R1, R7, R8)
lib/compose-signals.sh Validates JSON inputs before jq composition
lib/filter-bots.sh Extensible bot author filter (kills R10)
lib/date-utils.sh Cross-platform date helpers
README.md Maintainer docs
.github/schemas/signals.schema.json
Pinned producer/consumer contract for signals.json (Draft 2020-12).
CI rejects any drift; the runtime signals.json is also validated by
the workflow before being handed to Mary.
.github/workflows/feature-ideation-reusable.yml
Rewritten. Adds a self-checkout of petry-projects/.github so the
scripts above are available in the runner. Replaces inline bash with
collect-signals.sh + validate-signals.py. Adds RUN_DATE / SIGNALS_PATH /
PROPOSALS_PATH / MATCH_PLAN_PATH / TOOLING_DIR env vars passed to
claude-code-action via env: instead of unescaped shell expansions in
the prompt body. Adds dry_run input that flows through to
discussion-mutations.sh, which logs every planned action to a JSONL
audit log instead of executing — uploaded as the dry-run-log artifact.
.github/workflows/feature-ideation-tests.yml
New CI gate, path-filtered. Runs shellcheck, lint-prompt, schema
fixture validation, and the full bats suite on every PR that touches
the feature-ideation surface.
standards/workflows/feature-ideation.yml
Updated caller stub template. Adds dry_run workflow_dispatch input
so adopters get safe smoke-testing for free. Existing TalkTerm caller
stub continues to work unchanged (dry_run defaults to false).
test/workflows/feature-ideation/
92 bats tests across 9 suites. 14 GraphQL/REST response fixtures.
5 expected signals.json fixtures (3 valid + 2 INVALID for negative
schema testing). Programmable gh PATH stub with single-call and
multi-call modes for integration testing.
| Suite | Tests | Risks closed |
|-----------------------------|------:|--------------------|
| gh-safe.bats | 19 | R1, R7, R8 |
| compose-signals.bats | 8 | R3, R4 |
| filter-bots.bats | 5 | R10 |
| date-utils.bats | 7 | R9 |
| collect-signals.bats | 14 | R1, R3, R4, R7, R11|
| match-discussions.bats | 13 | R5, R6 |
| discussion-mutations.bats | 10 | DRY_RUN contract |
| lint-prompt.bats | 8 | R2 |
| signals-schema.bats | 8 | R3 |
| TOTAL | 92 | |
Test results: 92 passing, 0 failing, 0 skipped. Run with:
bats test/workflows/feature-ideation/
Backwards compatibility
-----------------------
The reusable workflow's input surface is unchanged for existing callers
(TalkTerm continues to work with no edits). The new dry_run input is
optional and defaults to false. Adopters who copy the new standards
caller stub get dry_run support automatically.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… test CI failure on the previous commit: 91/92 passing, 1 failing. The filter-bots env-extension test used `sh -c` to source filter-bots.sh in a sub-shell with FEATURE_IDEATION_BOT_AUTHORS set. On macOS this works because /bin/sh is bash. On Ubuntu (CI), /bin/sh is dash, which does not support `set -o pipefail`, so sourcing filter-bots.sh produced: sh: 12: set: Illegal option -o pipefail Fixed by switching to `bash -c`. All scripts already use `#!/usr/bin/env bash` shebangs; this is the only place a sub-shell was spawned via `sh`. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Aligns the script-tooling self-checkout with the @v1 pinning convention introduced in #88. Now when a downstream caller stub pins to `@v1` of the workflow file, the reusable workflow defaults to checking out the matching `v1` tag for the scripts. Workflow file and scripts upgrade in lockstep. Override `tooling_ref` only for testing forks (`tooling_ref: my-branch`) or bleeding-edge testing (`tooling_ref: main`). Documented in the input description. Note for the v1 tag move: after this PR merges, the v1 tag must be moved forward to point to the new HEAD so that downstream BMAD repos pinned to @v1 actually pick up the hardening. The change is purely additive (new optional inputs `dry_run` and `tooling_ref`, new env vars in the prompt context), so the move is backwards-compatible. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…6 tests) Triaged 14 inline comments from Copilot's review of #85; two were already fixed by the tooling_ref→v1 commit, the remaining 11 are addressed here. Critical bug fixes ------------------ 1. lint-prompt.sh now scans claude-code-action v1 `prompt:` blocks in addition to v0 `direct_prompt:`. The reusable workflow uses `prompt:` so the linter was silently allowing R2 regressions on the very file it was supposed to protect. Added two regression tests covering both the v1 form and a clean v1 form passes. 2. add_label_to_discussion now sends labelIds as a proper JSON array via gh_safe_graphql_input (new helper). Previously used `gh -f labelIds=` which sent the literal string `["L_1"]` and the GraphQL API would have rejected the mutation at runtime. Added a test that captures gh's stdin and asserts the variables block contains a length-1 array. 3. validate-signals.py now registers a `date-time` format checker via FormatChecker so the `format: date-time` keyword in signals.schema.json is actually enforced. Draft202012Validator does NOT enforce formats by default, and the default FormatChecker omits date-time entirely. Used an inline checker (datetime.fromisoformat with Z normalisation) to avoid pulling in rfc3339-validator. Added two regression tests: one for an invalid timestamp failing, one for a clean timestamp passing. 4. gh_safe_graphql --jq path no longer swallows jq filter errors with `|| true`. Filter typos / wrong paths now exit non-zero instead of silently returning []. Added a regression test using a deliberately broken filter. 5. collect-signals.sh now computes the open-issue truncation warning BEFORE filter_bots_apply. Previously, a result set composed entirely of bots could drop below ISSUE_LIMIT after filtering and mask real truncation. Added an integration test with all-bot fixtures. 6. match-discussions.sh now validates MATCH_THRESHOLD as a non-negative number in [0, 1] before passing to Python. A typo previously surfaced as an opaque traceback. Added regression tests for non-numeric input, out-of-range input, and boundary values 0 and 1. Cleanup ------- 7. Removed dead bash `normalize_title` / `jaccard_similarity` functions from match-discussions.sh — the actual matching is implemented in the embedded Python block and the bash helpers were never called. 8. Schema $id corrected from petry-projects/TalkTerm/... to the canonical petry-projects/.github location. 9. signals-schema.bats "validator script exists and is executable" test now actually checks the `-x` bit (was only checking `-f` and `-r`). 10. README + filter-bots.sh comments now describe the bot list as a "blocklist" (it removes matching authors) instead of "allowlist". 11. test/workflows/feature-ideation/stubs/gh now logs argv with `printf '%q '` so each invocation is shell-quoted and re-parseable, matching its documentation. Previously logged `$*` which lost arg boundaries. New helper ---------- gh_safe_graphql_input — same defensive contract as gh_safe_graphql, but takes a fully-formed JSON request body via stdin instead of -f/-F flags. Use for mutations whose variables include arrays (e.g. labelIds: [ID!]!) that gh's flag-based interface cannot express. Five new tests cover its happy path and every documented failure mode. Tests ----- Test count: 92 → 108 (16 new regression tests, all green). Run with: bats test/workflows/feature-ideation/ Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… 1 test) Triaged 13 inline comments from CodeRabbit's review of #85; 6 of them overlapped with Copilot's review and were already fixed by bcaa579. The remaining 7 are addressed here. Fixes ----- 1. lint-prompt.sh: ${VAR} branch lookbehind was inconsistent with the $(...) branch — only rejected $$VAR but not \${VAR}. Both branches now use [\\$] so backslash-escaped and dollar-escaped forms are skipped uniformly. 2. filter-bots.sh: FEATURE_IDEATION_BOT_AUTHORS CSV entries are now trimmed of leading/trailing whitespace before being added to the blocklist, so "bot1, bot2" matches both bots correctly instead of keeping a literal " bot2" entry. 3. validate-signals.py: malformed signals JSON now exits 2 (file/data error) to match the documented contract, instead of 1 (which means schema validation error). 4. README.md: corrected the workflow filename reference from feature-ideation.yml to feature-ideation-reusable.yml, and reworded the table cell that contained `\|\|` (escaped pipes that don't render correctly in some Markdown engines) to use plain prose. Also noted that lint-prompt scans both v0 `direct_prompt:` and v1 `prompt:`. 5. collect-signals.sh: added an explicit comment above SCHEMA_VERSION documenting the lockstep requirement with signals.schema.json's $comment version annotation. Backed by a new bats test that parses both files and asserts they match. 6. signals.schema.json: added $comment "version: 1.0.0" annotation so the schema file declares its own version explicitly. Used $comment instead of a custom keyword to keep Draft202012 compliance. 7. test/workflows/feature-ideation/match-discussions.bats: build_signals helper now computes the discussions count from the array length instead of hardcoding 0, so the fixture satisfies its own contract (cosmetic — the matcher only reads .items, but contract hygiene matters in test scaffolding). 8. test/workflows/feature-ideation/gh-safe.bats: removed the `|| true` suffix on the rest-failure assertion that made it always pass. Now uses --separate-stderr to capture stderr and asserts the structured `[gh-safe][rest-failure]` prefix is emitted on the auth failure path. Required `bats_require_minimum_version 1.5.0` to suppress the bats-core warning about flag usage. Tests ----- Test count: 108 → 109 (one new test for SCHEMA_VERSION ↔ schema sync). All 109 passing locally. Run with: bats test/workflows/feature-ideation/ Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Individual compliance issues were only tagged with `compliance-audit`, so Claude agents couldn't discover them for remediation. Now all issues (new and pre-existing) get the `claude` label alongside the umbrella. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
c80148b to
b92b519
Compare
|
There was a problem hiding this comment.
Pull request overview
This PR ensures compliance-audit findings are discoverable by agents by adding the claude label to individual finding issues (new + pre-existing). It also introduces a sizeable “feature-ideation” tooling/test harness (scripts, schema, reusable workflow wiring, and CI tests) including a dry_run input for safer rollout.
Changes:
- Compliance audit: create/ensure
claudelabel, label individual finding issues on create and on re-scan. - Feature-ideation: extract workflow logic into tested scripts + schema validation, add DRY_RUN support, and add CI workflow to run shellcheck/schema/bats.
- Standards stub: add
dry_runinput passthrough for downstream caller templates.
Reviewed changes
Copilot reviewed 40 out of 40 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| test/workflows/feature-ideation/stubs/gh | Adds configurable gh stub for bats suites, including multi-call scripting/logging. |
| test/workflows/feature-ideation/signals-schema.bats | Adds schema validation tests for signals fixtures + schema/script version lockstep check. |
| test/workflows/feature-ideation/match-discussions.bats | Adds deterministic matching tests for proposal ↔ discussion matching logic. |
| test/workflows/feature-ideation/lint-prompt.bats | Adds tests for prompt linting (v0 direct_prompt + v1 prompt). |
| test/workflows/feature-ideation/helpers/setup.bash | Adds shared bats helpers (repo root, tmp dir, gh stub install, fixtures). |
| test/workflows/feature-ideation/gh-safe.bats | Adds unit tests for gh-safe.sh failure modes and JSON handling. |
| test/workflows/feature-ideation/fixtures/gh-responses/release-list.json | Adds fixture data for release listing. |
| test/workflows/feature-ideation/fixtures/gh-responses/pr-list-merged.json | Adds fixture data for merged PR listing. |
| test/workflows/feature-ideation/fixtures/gh-responses/issue-list-open.json | Adds fixture data for open issue listing. |
| test/workflows/feature-ideation/fixtures/gh-responses/issue-list-closed.json | Adds fixture data for closed issue listing. |
| test/workflows/feature-ideation/fixtures/gh-responses/graphql-no-ideas-category.json | Adds fixture for repos without an “Ideas” discussion category. |
| test/workflows/feature-ideation/fixtures/gh-responses/graphql-errors-envelope.json | Adds fixture for GraphQL errors[] envelope handling. |
| test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions.json | Adds fixture for discussions query output. |
| test/workflows/feature-ideation/fixtures/gh-responses/graphql-discussions-truncated.json | Adds fixture to exercise truncation warnings via hasNextPage=true. |
| test/workflows/feature-ideation/fixtures/gh-responses/graphql-categories.json | Adds fixture for discussion categories query. |
| test/workflows/feature-ideation/fixtures/expected/truncated.signals.json | Adds schema-valid example payload including truncation warnings. |
| test/workflows/feature-ideation/fixtures/expected/populated.signals.json | Adds schema-valid example payload with populated buckets. |
| test/workflows/feature-ideation/fixtures/expected/INVALID-missing-field.signals.json | Adds negative fixture for schema-required-field failures. |
| test/workflows/feature-ideation/fixtures/expected/INVALID-bad-repo.signals.json | Adds negative fixture for invalid repo shape failures. |
| test/workflows/feature-ideation/fixtures/expected/empty-repo.signals.json | Adds schema-valid minimal “empty repo” payload. |
| test/workflows/feature-ideation/filter-bots.bats | Adds tests for bot filtering logic + env extensibility. |
| test/workflows/feature-ideation/discussion-mutations.bats | Adds tests for DRY_RUN logging and GraphQL mutation request shaping. |
| test/workflows/feature-ideation/date-utils.bats | Adds tests for cross-platform date helpers. |
| test/workflows/feature-ideation/compose-signals.bats | Adds tests for composing stable signals.json output and arg validation. |
| test/workflows/feature-ideation/collect-signals.bats | Adds end-to-end collector test using multi-call gh stub + schema validation. |
| standards/workflows/feature-ideation.yml | Adds dry_run input to the org-standard caller stub and passes it through. |
| scripts/compliance-audit.sh | Ensures/creates claude label and applies it to created + pre-existing finding issues. |
| .github/workflows/feature-ideation-tests.yml | Adds CI workflow to run shellcheck, prompt-lint, schema validation, and bats. |
| .github/workflows/feature-ideation-reusable.yml | Refactors reusable workflow to use checked-out tooling, adds DRY_RUN + tooling_ref, validates signals, uploads logs. |
| .github/scripts/feature-ideation/validate-signals.py | Adds schema validator with enforced date-time format checking and clearer exit codes. |
| .github/scripts/feature-ideation/README.md | Documents feature-ideation tooling architecture, risks, and test strategy. |
| .github/scripts/feature-ideation/match-discussions.sh | Adds deterministic Jaccard matcher for proposal ↔ discussion alignment. |
| .github/scripts/feature-ideation/lint-prompt.sh | Adds prompt linter to catch unescaped shell expansions inside prompt blocks. |
| .github/scripts/feature-ideation/lib/gh-safe.sh | Adds defensive wrappers for gh REST and GraphQL calls with structured failures. |
| .github/scripts/feature-ideation/lib/filter-bots.sh | Adds configurable bot-author filtering for signals collection. |
| .github/scripts/feature-ideation/lib/date-utils.sh | Adds cross-platform date arithmetic helpers. |
| .github/scripts/feature-ideation/lib/compose-signals.sh | Adds validated composition of the canonical signals.json shape. |
| .github/scripts/feature-ideation/discussion-mutations.sh | Adds mutation wrappers with DRY_RUN logging and safer GraphQL variable handling. |
| .github/scripts/feature-ideation/collect-signals.sh | Adds orchestrated signals collection with truncation detection and summary output. |
| .github/schemas/signals.schema.json | Adds JSON Schema contract for signals payload consumed by the prompt. |
Comments suppressed due to low confidence (2)
scripts/compliance-audit.sh:621
ensure_audit_labelsuppresses all errors from creating/updating theclaudelabel (2>/dev/null || true). Since agent discoverability depends on this label, silent failures (e.g., missing label permissions) will defeat the purpose of this change and be hard to diagnose. Consider letting failures surface, or at least capturing the exit status and emitting a warning/error when label creation fails.
# ---------------------------------------------------------------------------
# Check: required-status-check rulesets reference current names
#
# After centralizing workflows into reusables (#87, #88), GitHub composes
# check names as `<caller-job-id> / <reusable-job-id-or-name>`. Repos
# that updated their workflow files but didn't update their rulesets
# are silently broken — the merge gate references a name that no
# longer exists, so it can never be satisfied.
#
# Inspects both the new ruleset system and classic branch protection.
scripts/compliance-audit.sh:677
- Adding the
claudelabel to existing issues is currently best-effort (2>/dev/null || true). If this fails (e.g., label missing, insufficient permissions), the issue will still be updated but remain undiscoverable to agents, and there’s no log signal. Consider checking thegh issue editexit status and warning (or failing) when the label add fails.
ruleset_contexts=$(gh_api "repos/$ORG/$repo/rules/branches/main" \
--jq '.[] | select(.type == "required_status_checks") | .parameters.required_status_checks[].context' 2>/dev/null || echo "")
contexts+="$ruleset_contexts"$'\n'
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Quality gates for the reusable feature-ideation workflow and its | ||
| # supporting scripts. | ||
| # | ||
| # Triggered on any PR that touches: | ||
| # - .github/workflows/feature-ideation*.yml | ||
| # - .github/scripts/feature-ideation/** | ||
| # - .github/schemas/signals.schema.json | ||
| # - test/workflows/feature-ideation/** | ||
| # - standards/workflows/feature-ideation.yml |
There was a problem hiding this comment.
PR title/description focus on compliance-audit labeling, but this PR also adds a large new/updated feature-ideation workflow, scripts, schema, and an extensive bats test suite. Please update the PR title/description to reflect the broader scope, or split into separate PRs so review/rollback risk is manageable.
| if [ "$rc" -ne 0 ]; then | ||
| _gh_safe_err "rest-failure" "exit=$rc args=$* stderr=$stderr" | ||
| return "$rc" |
There was a problem hiding this comment.
In error paths, gh_safe_rest logs args=$*, which collapses argument boundaries and can make debugging ambiguous (especially when args contain spaces or special characters). Consider logging arguments in a boundary-preserving form (e.g., iterate "$@" and shell-quote each token) to keep diagnostics lossless.



Summary
compliance-audit, notclaude, so agents couldn't discover them for remediationcreate_issue_for_finding()adds theclaudelabel to every new issueclaudeadded when the audit re-comments on themWhat changed
ensure_audit_label()— also creates/ensures theclaudelabel in each repocreate_issue_for_finding()— adds--label "claude"togh issue creategh issue edit --add-label claudeclaude)Test plan
dry_run: falseon a single repo and verify created issues have bothcompliance-auditandclaudelabelsclaudelabel added on re-scan🤖 Generated with Claude Code
Summary by CodeRabbit
Release Notes
New Features
Improvements
Tests