
Commit ea312ad

chrisleekr and claude committed
fix(workflow): feature-research silent-success + copilot-cli v1.0.21 pin
Three compounding defects caused run 24232817769 to roll up as `success` while producing zero safe-outputs:

1. Copilot CLI v1.0.22 regression (confirmed in gh-aw v0.68.1 release notes + PR #25689 - "0-byte output due to incompatibilities with v1.0.22"). Fix: upgrade gh-aw to v0.68.1, inherit framework-level pin of Copilot CLI to v1.0.21 via install_copilot_cli.sh.
2. gh-aw has no built-in fail-on-empty for safe-outputs. Fix: new post-steps assertion reads /tmp/gh-aw/agent_output.json and exits non-zero when create_issue AND noop counts are both zero.
3. Agent improvised `bun run check` on a Bun-less runner. Fix: new "Read-only scope" section in prompt body forbidding build/test commands.

Spec correction: original attribution to copilot-cli#2479/#2481 is wrong; those bugs were closed as fixed (2026-04-09 / 2026-04-03, fixed in CLI v1.0.19) before the 2026-04-10 failing run. Corrected in spec.md Background / FR-002 / Assumptions.

Details and walkthrough evidence: specs/20260411-143905-fix-feature-researcher/

Walkthrough T025-T028 still pending - PR opens as draft.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 89a5a52 commit ea312ad

13 files changed

Lines changed: 2292 additions & 197 deletions


.github/aw/actions-lock.json

Lines changed: 8 additions & 3 deletions
@@ -5,10 +5,15 @@
       "version": "v8",
       "sha": "ed597411d8f924073f98dfc5c65a23a2325f34cd"
     },
-    "github/gh-aw-actions/setup@v0.67.0": {
+    "actions/github-script@v9": {
+      "repo": "actions/github-script",
+      "version": "v9",
+      "sha": "373c709c69115d41ff229c7e5df9f8788daa9553"
+    },
+    "github/gh-aw-actions/setup@v0.68.1": {
       "repo": "github/gh-aw-actions/setup",
-      "version": "v0.67.0",
-      "sha": "cde65c546c2b0f6d3f3a9492a04e6687887c4fe8"
+      "version": "v0.68.1",
+      "sha": "2fe53acc038ba01c3bbdc767d4b25df31ca5bdfc"
     }
   }
 }
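
The lock entries in this hunk appear to follow two conventions: each key reads as `repo@version`, and each `sha` is a full 40-character commit hash. The sketch below checks those two properties for the entries added by this commit. It is illustrative only: `check_lock_entries` is a hypothetical helper, not part of gh-aw, and treating the key convention as a rule is an assumption drawn from the three entries visible in the diff.

```python
import re

# Assumption: pinned SHAs are full 40-character lowercase hex commit hashes.
SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def check_lock_entries(entries: dict) -> list[str]:
    """Return a list of problems found in lock entries (empty if none)."""
    problems = []
    for key, entry in entries.items():
        # Assumption: the key is always "<repo>@<version>".
        expected_key = f"{entry.get('repo')}@{entry.get('version')}"
        if key != expected_key:
            problems.append(f"{key}: key does not match repo@version ({expected_key})")
        if not SHA_RE.match(entry.get("sha", "")):
            problems.append(f"{key}: sha is not a 40-character lowercase hex string")
    return problems


# The two entries added by this commit, copied from the hunk above.
ENTRIES = {
    "actions/github-script@v9": {
        "repo": "actions/github-script",
        "version": "v9",
        "sha": "373c709c69115d41ff229c7e5df9f8788daa9553",
    },
    "github/gh-aw-actions/setup@v0.68.1": {
        "repo": "github/gh-aw-actions/setup",
        "version": "v0.68.1",
        "sha": "2fe53acc038ba01c3bbdc767d4b25df31ca5bdfc",
    },
}

print(check_lock_entries(ENTRIES))
```

A check like this would catch a version bump that updates the key but leaves the old `version` or `sha` behind.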

.github/workflows/feature-research.lock.yml

Lines changed: 247 additions & 193 deletions
Some generated files are not rendered by default.

.github/workflows/feature-research.md

Lines changed: 29 additions & 0 deletions
@@ -36,6 +36,25 @@ safe-outputs:
     assignees: [chrisleekr]
     close-older-issues: true
     expires: 7d
+
+post-steps:
+  - name: Assert agent emitted at least one safe-output record
+    if: always()
+    run: |
+      set -euo pipefail
+      AGENT_OUT=/tmp/gh-aw/agent_output.json
+      if [[ ! -s "$AGENT_OUT" ]]; then
+        echo "::error::Agent produced no output file at $AGENT_OUT"
+        exit 1
+      fi
+      CREATE_ISSUE_COUNT=$(jq '[.items[] | select(.type == "create_issue")] | length' "$AGENT_OUT")
+      NOOP_COUNT=$(jq '[.items[] | select(.type == "noop")] | length' "$AGENT_OUT")
+      echo "create_issue records: $CREATE_ISSUE_COUNT"
+      echo "noop records: $NOOP_COUNT"
+      if [[ "$CREATE_ISSUE_COUNT" -lt 1 && "$NOOP_COUNT" -lt 1 ]]; then
+        echo "::error::FR-001 violation: agent emitted zero create_issue and zero noop safe-outputs. Failing the run to prevent silent success."
+        exit 1
+      fi
 ---
 
 # Feature Opportunity Researcher
@@ -59,6 +78,16 @@ in the **past 7 days** for each of these tools:
 You MUST use the `mcp__tavily__search` tool for all web searches and content
 retrieval. Do NOT use WebFetch or direct HTTP requests to external URLs unless Tavily is unable to retrieve the content you need.
 
+## Read-only scope
+
+This research task is **read-only** against the working tree. Do NOT run any
+repository build, test, or validation command. In particular, do NOT invoke
+`bun run check`, `bun install`, `bun test`, `npm test`, `npm install`, `git add`,
+`git commit`, or any equivalent. The GitHub-hosted `ubuntu-24.04` runner does
+not have Bun installed, and repository validation is not this workflow's job;
+your only task is to read source files under `src/agents/*.ts` and compare
+them against the upstream changelogs you retrieve through Tavily.
+
 ## What to Check
 
 Compare any newly discovered config fields against the current agent

CLAUDE.md

Lines changed: 3 additions & 1 deletion
@@ -9,6 +9,8 @@ Auto-generated from all feature plans. Last updated: 2026-04-11
 - Local filesystem (agent config files via `AgentPaths`) (20260406-164513-repo-housekeeping)
 - TypeScript 6.x, strict mode (`"strict": true`), Bun ≥ 1.3.9 runtime + `citty` (CLI), `@clack/prompts` (output), `tar` v7 (archive), `age-encryption` (X25519 encryption), `simple-git` (vault Git ops), `zod` (schema validation), `picocolors` (terminal colours) (20260411-002222-agent-skills-sync)
 - Local filesystem only. Skill sources at `~/.claude/skills/`, `~/.cursor/skills/`, `~/.codex/skills/`, `~/.copilot/skills/`. Encrypted destinations at `<vaultDir>/<agent>/skills/<name>.tar.age`, wired through `AgentPaths` in `src/config/paths.ts` (20260411-002222-agent-skills-sync)
+- YAML (GitHub Actions schema) + Markdown (gh-aw (20260411-143905-fix-feature-researcher)
+- N/A. Workflow file only; no persistent state. (20260411-143905-fix-feature-researcher)
 
 - TypeScript 6.x, strict mode (`"strict": true`) + citty 0.2.x, @clack/prompts 1.2.x, zod 4.x, simple-git 3.x, age-encryption 0.3.x (20260406-094347-stabilise-daemon)
 
@@ -28,9 +30,9 @@ bun run check
 TypeScript 6.x, strict mode (`"strict": true`): Follow standard conventions
 
 ## Recent Changes
+- 20260411-143905-fix-feature-researcher: Added YAML (GitHub Actions schema) + Markdown (gh-aw
 - 20260411-002222-agent-skills-sync: Added TypeScript 6.x, strict mode (`"strict": true`), Bun ≥ 1.3.9 runtime + `citty` (CLI), `@clack/prompts` (output), `tar` v7 (archive), `age-encryption` (X25519 encryption), `simple-git` (vault Git ops), `zod` (schema validation), `picocolors` (terminal colours)
 - 20260406-164513-repo-housekeeping: Added TypeScript 6.x, strict mode + citty 0.2.x (CLI), @clack/prompts 1.2.x (output), zod 4.x (validation), simple-git 3.x, age-encryption 0.3.x, picocolors 1.1.x (new explicit dep)
-- 20260406-125441-config-migration: Added TypeScript 6.x, strict mode (`"strict": true`) + citty 0.2.x (CLI), @clack/prompts 1.2.x (output), @iarna/toml ^2.2.5 (Codex TOML), zod 4.x (validation)
 
 
 <!-- MANUAL ADDITIONS START -->
Lines changed: 61 additions & 0 deletions
@@ -0,0 +1,61 @@
# Specification Quality Checklist: Fix Feature Opportunity Researcher Workflow Silent Failure

**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-04-11
**Feature**: [spec.md](../spec.md)

## Content Quality

- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed

## Requirement Completeness

- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified

## Feature Readiness

- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification

## Notes

- Iteration 1 left two [NEEDS CLARIFICATION] markers (fallback web-search
  policy; shape of the health signal). User review rejected the premise of
  the first one ("Tavily must be available - why did it fail?") and
  answered the second ("A: fail the run"). User also pushed back on the
  third clarification about `bun run check` by asking why the agent runs
  check at all, which led to verifying that the prompt does not contain
  that instruction; the agent improvised it.
- Iteration 2 (this version) integrates all three decisions:
  - FR-002 + User Story 2 + Background root-cause section 1 replace the
    fallback-policy question with a concrete diagnosis (Copilot CLI
    personal-account MCP allowlist bug,
    [github/copilot-cli#2479](https://github.com/github/copilot-cli/issues/2479),
    [#2481](https://github.com/github/copilot-cli/issues/2481)).
  - FR-001 + SC-005 operationalise Option A: zero-safe-output runs end in
    conclusion `failure`.
  - FR-004 + Background root-cause section 2 pin responsibility for the
    `bun run check` improvisation on prompt under-specification, not on a
    missing runner tool.
- Spec deliberately avoids prescribing *how* to apply the Copilot CLI
  workaround (env var vs. version pin vs. alternative CLI). Implementation
  choices belong in the plan phase.
- Two citations ([#2479](https://github.com/github/copilot-cli/issues/2479)
  and [#2481](https://github.com/github/copilot-cli/issues/2481)) appear
  in the Background and in FR-002 as authoritative evidence for the root
  cause; these are not implementation prescriptions, they are the
  incident-diagnosis trail required under the user's "verify with official
  documentation or authoritative references" rule.
- All checklist items pass. Spec is ready for `/speckit.plan`.
Lines changed: 93 additions & 0 deletions
@@ -0,0 +1,93 @@
# Contract: Post-execution assertion (NEW)

**Status**: **NEW**, introduced by this feature to satisfy FR-001.
**Location**: injected into `.github/workflows/feature-research.md`
frontmatter as a `post-steps:` entry. Compiled into the `agent` job
of `feature-research.lock.yml` after the existing “Execute GitHub
Copilot CLI” step.

## Purpose

Make silent-success impossible: fail the agent step when it emits
zero `create_issue` safe-output records and zero `noop` safe-output
records, so the overall run conclusion rolls up as `failure` and the
Actions UI shows a red X.

## Producer → Consumer

| Producer | Consumer |
|----------|----------|
| `post-steps` assertion (new) | GitHub Actions job conclusion resolver |

## Input contract

| Input | Source | Required |
|-------|--------|----------|
| `/tmp/gh-aw/agent_output.json` | Written by existing gh-aw lock step “Write agent output placeholder if missing” at `feature-research.lock.yml:770` | Yes (gh-aw guarantees the file exists: it writes `{"items":[]}` if the agent produced nothing) |

The file shape is a JSON object with an `items` array whose entries
match the `SafeOutputRecord` entity in `data-model.md`.

## Output contract (exit semantics)

| Situation | Exit code | Observable effect |
|-----------|-----------|-------------------|
| `items[].type == "create_issue"` count `>= 1` | `0` | Agent step passes; downstream jobs run as normal |
| `items[].type == "noop"` count `>= 1` (no create_issue) | `0` | Agent step passes; noop tracking issue is created by the existing `handle_noop_message.cjs` step in the conclusion job |
| Both counts are zero | `1` | Agent step fails; `needs.agent.result == "failure"`; conclusion job rolls up as `failure`; **red X in Actions UI** |
| `/tmp/gh-aw/agent_output.json` is missing or not valid JSON | `1` | Same as above: silent-success is impossible even on malformed output |

## Frontmatter declaration

```yaml
post-steps:
  - name: Assert agent emitted at least one safe-output record
    if: always()
    run: |
      set -euo pipefail
      AGENT_OUT=/tmp/gh-aw/agent_output.json
      if [[ ! -s "$AGENT_OUT" ]]; then
        echo "::error::Agent produced no output file at $AGENT_OUT"
        exit 1
      fi
      CREATE_ISSUE_COUNT=$(jq '[.items[] | select(.type == "create_issue")] | length' "$AGENT_OUT")
      NOOP_COUNT=$(jq '[.items[] | select(.type == "noop")] | length' "$AGENT_OUT")
      echo "create_issue records: $CREATE_ISSUE_COUNT"
      echo "noop records: $NOOP_COUNT"
      if [[ "$CREATE_ISSUE_COUNT" -lt 1 && "$NOOP_COUNT" -lt 1 ]]; then
        echo "::error::FR-001 violation: agent emitted zero create_issue and zero noop safe-outputs. Failing the run to prevent silent success."
        exit 1
      fi
```
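
For exercising the exit semantics locally without a runner, the same decision table can be mirrored in a few lines of Python. This is a sketch under stated assumptions, not the workflow's implementation: `assertion_exit_code` and `exit_code_for` are hypothetical names, and the function returns `1` for every failure case to match the contract table, whereas the shell step on invalid JSON would exit with whatever non-zero status `jq` reports.

```python
import json
import os
import tempfile


def assertion_exit_code(path: str) -> int:
    """Return 0 if the output holds a create_issue or noop record, else 1."""
    # Mirrors `[[ ! -s "$AGENT_OUT" ]]`: a missing or zero-byte file fails.
    if not os.path.isfile(path) or os.path.getsize(path) == 0:
        return 1
    try:
        with open(path, encoding="utf-8") as fh:
            items = json.load(fh)["items"]
    except (json.JSONDecodeError, KeyError, TypeError):
        # Malformed output is treated as a failure too.
        return 1
    if not isinstance(items, list):
        return 1
    counts = {"create_issue": 0, "noop": 0}
    for item in items:
        kind = item.get("type") if isinstance(item, dict) else None
        if kind in counts:
            counts[kind] += 1
    return 0 if counts["create_issue"] >= 1 or counts["noop"] >= 1 else 1


def exit_code_for(payload: str) -> int:
    """Hypothetical helper: write payload to a temp file and evaluate it."""
    with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as fh:
        fh.write(payload)
    try:
        return assertion_exit_code(fh.name)
    finally:
        os.remove(fh.name)


if __name__ == "__main__":
    # The placeholder gh-aw writes when the agent produced nothing.
    print(exit_code_for('{"items":[]}'))
```

Feeding it the `{"items":[]}` placeholder exercises the "both counts are zero" row, which is the silent-success case this contract exists to catch.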

## Invariants

| Invariant | Enforcement |
|-----------|-------------|
| `set -euo pipefail` must be at the top of the script | Documented above; reviewers reject PRs without it |
| Error messages go to `::error::` workflow command (not just `echo`) | Makes the failure visible in the Actions UI summary |
| Step uses `if: always()` | Runs even if the agent step crashed (spec Edge Case: max-continuations exhaustion) |
| Script MUST NOT write to `$GITHUB_TOKEN`, MUST NOT read secrets | Post-steps run outside the firewall sandbox; avoid leaking anything from the runner environment |
| Script uses `jq` (pre-installed on `ubuntu-24.04`) | Documented in `docs.github.com/en/actions/reference/runner-images#ubuntu-2404-lts-installed-software` |

## What this contract does NOT cover

- **Signal-to-maintainer-via-email**: spec does not require it.
- **Per-iteration progress tracking**: the agent runs up to 3
  autopilot continuations; this assertion runs once at the end,
  exactly as the agent step completes.
- **Alerting on partial Tavily rate-limit**: spec Edge Case says a
  partial-data run is legitimate success; the assertion only fires on
  zero output, not on degraded output.

## Test strategy

- **Happy path**: `quickstart.md` step 4 dispatches the workflow and
  verifies the post-step passes (green agent job, green run).
- **Forced failure**: `quickstart.md` step 5 temporarily breaks the
  MCP gateway (e.g. by invalidating `TAVILY_API_KEY`) to force zero
  safe-outputs, verifying the assertion fires and the run ends in
  red X.
- **Regression lock-in**: after merge, the next 4 scheduled runs
  (SC-001 window) are observed; any silent-success regression is
  treated as an FR-001 violation and re-opens this spec.
Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
# Contract: `safe-outputs` shape (PRESERVED)

**Status**: **DO NOT CHANGE.** This contract is the shape the workflow
emits today and MUST continue to emit after the fix. Any divergence
fails spec FR-005. This file is a **read-only reference** for task
phase; it does NOT introduce a new contract.

**Source of truth**: compiled `feature-research.lock.yml` line 370 (the
`GH_AW_SAFE_OUTPUTS_CONFIG_*` block) and
`.github/workflows/feature-research.md` frontmatter lines 32–38.

## Producer → Consumer

| Producer | Consumer |
|----------|----------|
| `agent` job calling `mcp__safeoutputs__create_issue` | `safe_outputs` job running `safe_output_handler_manager.cjs` |

## Wire format

One JSON object per record, one record per line in
`/tmp/gh-aw/safeoutputs.jsonl`.

```jsonc
{
  "type": "create_issue",
  "title": "...",        // sanitized, max 128 chars. Will be prefixed by safe-outputs handler with "[feature-research] "
  "body": "...",         // sanitized, max 65000 chars (markdown)
  "labels": ["..."],     // max 128 chars each, sanitized. MUST include "feature-research" and "automated"
  "assignees": ["..."],  // MUST include "chrisleekr"
  "temporary_id": "..."  // gh-aw internal; agent leaves blank
}
```

Validation schema is in the compiled lock under
`safeoutputs/validation.json`: the `create_issue` entry with
`required: ["body", "title"]`.
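
To make the limits in the wire format concrete, here is a minimal sketch of a local check for one JSONL line. It is not gh-aw's validator: `validate_create_issue` is a hypothetical helper that applies only the constraints stated above (required `title` and `body`, the 128 and 65000 character caps, the per-label cap), and treating an empty string as missing is an assumption.

```python
import json

# Limits copied from the wire-format comments above.
TITLE_MAX = 128
BODY_MAX = 65000
LABEL_MAX = 128


def validate_create_issue(line: str) -> list[str]:
    """Return problems for one JSONL record (an empty list means it passes)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["record is not a JSON object"]
    if record.get("type") != "create_issue":
        return ["type is not create_issue"]
    problems = []
    # `required: ["body", "title"]` plus the stated length caps.
    for field, limit in (("title", TITLE_MAX), ("body", BODY_MAX)):
        value = record.get(field)
        if not isinstance(value, str) or not value:
            problems.append(f"{field} is required")
        elif len(value) > limit:
            problems.append(f"{field} exceeds {limit} chars")
    for label in record.get("labels") or []:
        if not isinstance(label, str) or len(label) > LABEL_MAX:
            problems.append(f"label exceeds {LABEL_MAX} chars or is not a string")
    return problems


if __name__ == "__main__":
    sample = json.dumps({"type": "create_issue", "title": "Weekly gaps", "body": "..."})
    print(validate_create_issue(sample))
```

Running each line of a captured `safeoutputs.jsonl` through a check like this gives a quick signal on whether a record would plausibly clear the schema before the handler sees it.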

## Frontmatter declaration (MUST remain exactly as shown)

```yaml
safe-outputs:
  create-issue:
    title-prefix: "[feature-research]"
    labels: [feature-research, automated]
    assignees: [chrisleekr]
    close-older-issues: true
    expires: 7d
```

## Invariants the fix must preserve

| Invariant | Why |
|-----------|-----|
| `title-prefix` is `[feature-research]` | spec FR-005 |
| `labels` contains both `feature-research` and `automated` | spec FR-005, SC-001 |
| `assignees` contains `chrisleekr` | spec FR-005 |
| `close-older-issues: true` | spec FR-005 |
| `expires: 7d` | spec FR-005 |
| `create-issue.max` remains `1` | existing gh-aw default; spec does not loosen |
| `noop` safe-output type remains allowed (even though not named in the source frontmatter, gh-aw injects it as a default) | spec User Story 1 scenario 3 needs `noop` for the “no gaps this week” happy-path |
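
The invariants in this table lend themselves to a mechanical check during review. The sketch below assumes the `create-issue` block has already been parsed into a plain dict (YAML parsing is left out to keep it dependency-free); `check_invariants` is a hypothetical helper, and the `noop` row is not covered because that type is injected by gh-aw rather than declared in the frontmatter.

```python
# Scalar settings that spec FR-005 requires to stay exactly as declared.
EXPECTED = {
    "title-prefix": "[feature-research]",
    "close-older-issues": True,
    "expires": "7d",
}


def check_invariants(create_issue: dict) -> list[str]:
    """Return the invariants violated by a parsed `create-issue` block."""
    violations = []
    for key, want in EXPECTED.items():
        if create_issue.get(key) != want:
            violations.append(f"{key} must be {want!r}")
    labels = create_issue.get("labels") or []
    for label in ("feature-research", "automated"):
        if label not in labels:
            violations.append(f"labels must contain {label!r}")
    if "chrisleekr" not in (create_issue.get("assignees") or []):
        violations.append("assignees must contain 'chrisleekr'")
    # The table treats max = 1 as the gh-aw default, so an absent key passes.
    if create_issue.get("max", 1) != 1:
        violations.append("max must remain 1")
    return violations


# The declared block, transcribed from the frontmatter above.
CREATE_ISSUE = {
    "title-prefix": "[feature-research]",
    "labels": ["feature-research", "automated"],
    "assignees": ["chrisleekr"],
    "close-older-issues": True,
    "expires": "7d",
}

print(check_invariants(CREATE_ISSUE))
```

An empty list means the block still satisfies every row that can be checked statically.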

## What the fix MAY change (but is NOT required to change)

- Nothing in this contract. The fix only **reads** safe-outputs (via
  the new post-steps assertion in `contracts/post-check.md`); it does
  not modify how they are produced or consumed.

## Test strategy

No automated test: this is a docs-only feature per Principle II
exception. Manual walkthrough in `quickstart.md` asserts that on a
successful run:

1. A single issue is created (or updated) with the prefix, labels,
   and assignee above.
2. The issue title is visible on `chrisleekr/agentsync/issues` within
   ~30 seconds of the agent job completing.
3. Prior week’s issue is closed by the `close-older-issues` mechanism.
