diff --git a/openspec/CHANGE_ORDER.md b/openspec/CHANGE_ORDER.md
index c8af0e3b..cade5eb3 100644
--- a/openspec/CHANGE_ORDER.md
+++ b/openspec/CHANGE_ORDER.md
@@ -169,6 +169,17 @@ These are derived extensions of the same 2026-02-15 plan and are required to ope
 | ai-integration | 02 | ai-integration-02-mcp-server | [#252](https://github.com/nold-ai/specfact-cli/issues/252) | validation-02 |
 | ai-integration | 03 | ai-integration-03-instruction-files | [#253](https://github.com/nold-ai/specfact-cli/issues/253) | ai-integration-01 |
 
+### CLI end-user validation (validation gap plan, 2026-02-19)
+
+| Module | Order | Change folder | GitHub # | Blocked by |
+|--------|-------|---------------|----------|------------|
+| cli-val | 01 | cli-val-01-behavior-contract-standard | [#279](https://github.com/nold-ai/specfact-cli/issues/279) | — |
+| cli-val | 02 | cli-val-02-output-snapshot-stability | [#280](https://github.com/nold-ai/specfact-cli/issues/280) | — |
+| cli-val | 03 | cli-val-03-misuse-safety-proof | [#281](https://github.com/nold-ai/specfact-cli/issues/281) | #279 |
+| cli-val | 04 | cli-val-04-acceptance-test-runner | [#282](https://github.com/nold-ai/specfact-cli/issues/282) | #279, #281 |
+| cli-val | 05 | cli-val-05-ci-integration | [#283](https://github.com/nold-ai/specfact-cli/issues/283) | #280, #282 |
+| cli-val | 06 | cli-val-06-copilot-test-generation | [#284](https://github.com/nold-ai/specfact-cli/issues/284) | #279 (soft: #283) |
+
 ### Integration governance and proof (architecture integration plan, 2026-02-15)
 
 | Module | Order | Change folder | GitHub # | Blocked by |
@@ -252,6 +263,7 @@ One parent issue per module group for grouping. Set **Type** to Epic on the proj
 | Sidecar validation | [Epic] Sidecar validation | [#191](https://github.com/nold-ai/specfact-cli/issues/191) |
 | Bundle mapping | [Epic] Bundle/spec mapping | [#192](https://github.com/nold-ai/specfact-cli/issues/192) |
 | Architecture + Marketplace | [Epic] Architecture (CLI structure, modularity, performance) | [#194](https://github.com/nold-ai/specfact-cli/issues/194) |
+| CLI end-user validation | [Epic] CLI End-User Validation | [#285](https://github.com/nold-ai/specfact-cli/issues/285) |
 
 ---
 
@@ -267,6 +279,12 @@ Dependencies flow left-to-right; a wave may start once all its hard blockers are
   - backlog-core-01 ✅
   - validation-01 ✅, sidecar-01 ✅, bundle-mapper-01 ✅
 
+- **Wave 1.5 — CLI end-user validation** (cross-cutting, parallel to Wave 2+):
+  - cli-val-01, cli-val-02 (no blockers — start immediately after Wave 1)
+  - cli-val-03, cli-val-06 (after cli-val-01)
+  - cli-val-04 (after cli-val-01 + cli-val-03)
+  - cli-val-05 (after cli-val-02 + cli-val-04 — capstone)
+
 - **Wave 2 — Marketplace + backlog module layer** (needs Wave 1):
   - marketplace-01 (needs arch-06)
   - backlog-core-02 (needs backlog-core-01)
@@ -319,6 +337,7 @@ A wave cannot be considered complete until all gate criteria listed for that wav
 
 - Wave 0 gate: Core modular CLI and bridge registry flows remain stable and archived changes are validated.
 - Wave 1 gate: arch-06/07, policy-engine-01, patch-mode-01, backlog-core-01, validation-01 produce passing contract and strict OpenSpec validation. ✅ Completed 2026-02-18.
+- Wave 1.5 gate: CLI behavior contract schema validated, snapshot tests pass for all command groups, black-box acceptance tests prove installed binary works, anti-pattern safety assertions pass for all Wave 1 commands, CI gates enforce all of the above.
 - Wave 2 gate: At least one backlog planning workflow completes with no blocking dependency regressions across backlog-core + marketplace-01.
 - Wave 3 gate: Higher-order backlog workflows and marketplace-02 interoperate without command-group regressions.
 - Wave 4 gate: `ceremony-cockpit-01` aliases resolve and execute against installed modules without fallback failures.
diff --git a/openspec/changes/cli-val-01-behavior-contract-standard/CHANGE_VALIDATION.md b/openspec/changes/cli-val-01-behavior-contract-standard/CHANGE_VALIDATION.md
new file mode 100644
index 00000000..0e10625b
--- /dev/null
+++ b/openspec/changes/cli-val-01-behavior-contract-standard/CHANGE_VALIDATION.md
@@ -0,0 +1,93 @@
+# Change Validation: cli-val-01-behavior-contract-standard
+
+- **Validated on (UTC):** 2026-02-19T12:00:00Z
+- **Workflow:** /wf-validate-change (proposal-stage dry-run validation)
+- **Strict command:** `openspec validate cli-val-01-behavior-contract-standard --strict`
+- **Result:** PASS
+
+## Scope Summary
+
+- **New capabilities:** cli-behavior-contracts
+- **Modified capabilities:** none
+- **Declared dependencies:** none (foundation change with no prerequisites)
+- **Downstream dependents:** cli-val-03-misuse-safety-proof, cli-val-04-acceptance-test-runner, cli-val-06-copilot-test-generation
+- **Proposed affected code paths:**
+  - `tests/cli-contracts/schema/cli-scenario.schema.yaml` (new schema file)
+  - `tests/cli-contracts/*.scenarios.yaml` (new pilot scenario files)
+  - `tools/validate_cli_contracts.py` (new validation tool)
+  - `openspec/config.yaml` (extend with CLI behavior contract artifact type)
+  - `docs/` (new contributor page)
+
+## Format Validation
+
+- **proposal.md Format**: PASS
+  - Title format: Correct (`# Change: CLI Behavior Contract Standard`)
+  - Required sections: All present (Why, What Changes, Capabilities, Impact)
+  - "What Changes" format: Correct (uses NEW/EXTEND markers with bullet list)
+  - "Capabilities" section: Present (one capability: `cli-behavior-contracts`)
+  - "Impact" format: Correct (lists Affected specs, Affected code, Integration points, Documentation impact)
+  - Source Tracking section: Present (GitHub Issue #279, repository nold-ai/specfact-cli)
+  - Note: Section heading uses `## Dependencies` instead of the standard `## Impact` subsection for dependency declarations. Dependencies are still clearly stated. Minor deviation, non-blocking.
+- **tasks.md Format**: PASS
+  - Section headers: Correct (hierarchical numbered format: `## 1.`, `## 2.`, ... `## 8.`)
+  - Task format: Correct (`- [ ] 1.1 [Description]`)
+  - Sub-task format: Correct (`- [ ] 1.1.1 [Description]`)
+  - TDD order enforced section: Present at top
+  - Git worktree creation: First task (`## 1.`)
+  - PR creation: Last implementation task (`## 8. Delivery`, task 8.4)
+  - Post-merge cleanup: Present
+  - Quality gate tasks: Present (`## 6.` includes format, type-check, lint, contract-test, smart-test)
+  - Version and changelog tasks: Present (`## 7.`)
+  - Note: No explicit worktree bootstrap pre-flight tasks (hatch env create, smart-test-status, contract-test-status). Minor gap, non-blocking.
+- **specs Format**: PASS
+  - Given/When/Then format: Verified across all 7 scenarios in `specs/cli-behavior-contracts/spec.md`
+  - Three requirements defined: CLI Behavior Contract Schema (3 scenarios), Pilot Scenario Files (3 scenarios), Schema Validation Tool (1 scenario)
+  - All scenarios use GIVEN/WHEN/THEN/AND structure correctly
+- **design.md Format**: PASS
+  - Context section: Present
+  - Goals / Non-Goals: Present (4 goals, 4 non-goals)
+  - Decisions: Present (5 decisions documented)
+  - Risks / Trade-offs: Present (3 risks with mitigations)
+  - Migration Plan: Present (5-step plan)
+  - Schema Design: Present (bonus detail showing YAML structure)
+  - Open Questions: Present (2 questions)
+  - Note: No bridge adapter integration or sequence diagrams documented, but these are not applicable (no multi-repo flows or adapter integration in this change)
+
+## Dependency and Integration Review
+
+- **CHANGE_ORDER.md consistency**: PASS
+  - CHANGE_ORDER.md row: `cli-val | 01 | cli-val-01-behavior-contract-standard | #279 | ---`
+  - Blocked by: none (matches proposal declaration of no hard blockers)
+  - Wave 1.5 position: First in wave alongside cli-val-02 (no blockers -- start immediately after Wave 1)
+- **GitHub Issue consistency**: PASS
+  - Issue #279 is OPEN, title "[Change] CLI Behavior Contract Standard"
+  - Issue body matches proposal scope (YAML schema, pilot files, validation tool, documentation)
+  - No "Blocked by" references in issue body (correct -- no blockers)
+  - Labels: enhancement, change-proposal
+- **Cross-artifact dependency consistency**: PASS
+  - Proposal states no hard blockers
+  - CHANGE_ORDER.md shows no blockers for #279
+  - GitHub issue #279 shows no blocker references
+  - All three sources agree
+
+## Breaking-Change Analysis
+
+- **Breaking changes detected:** 0
+- **Interface changes:** None. This change introduces new files only (schema, validation tool, scenario files). No existing production code, interfaces, contracts, or APIs are modified.
+- **Risk assessment:** No breaking-change risk. All artifacts are additive.
+
+## Impact Assessment
+
+- **Impact Level:** Low
+- **Code Impact:** No production CLI code changes. New test infrastructure files only.
+- **Test Impact:** New schema validation tests in `tests/unit/`. New scenario files in `tests/cli-contracts/`. No changes to existing tests.
+- **Documentation Impact:** New contributor page in `docs/` describing CLI behavior contract format.
+- **Release Impact:** Patch/Minor (additive only, no breaking changes)
+
+## Validation Outcome
+
+- Required artifacts are present: `proposal.md`, `design.md`, `specs/cli-behavior-contracts/spec.md`, `tasks.md`.
+- All format checks pass with minor non-blocking observations noted.
+- Dependency declarations are consistent across proposal, CHANGE_ORDER.md, and GitHub issue #279.
+- No breaking changes detected.
+- Change is ready for implementation-phase intake. No prerequisites to satisfy.
diff --git a/openspec/changes/cli-val-01-behavior-contract-standard/design.md b/openspec/changes/cli-val-01-behavior-contract-standard/design.md
new file mode 100644
index 00000000..fc138e9f
--- /dev/null
+++ b/openspec/changes/cli-val-01-behavior-contract-standard/design.md
@@ -0,0 +1,73 @@
+## Context
+
+This change establishes the foundational artifact format for CLI end-user validation. It defines how command behavior expectations are recorded, validated, and consumed by downstream validation changes (cli-val-03, cli-val-04, cli-val-06). No production CLI code is modified.
+
+## Goals / Non-Goals
+
+**Goals:**
+
+- Define a YAML schema for CLI behavior scenarios that is both human-readable and machine-executable
+- Pilot the format on 2-3 command groups to validate usability
+- Provide a schema validation tool that integrates with the existing hatch script system
+- Keep the format simple enough for AI copilots to generate reliably
+
+**Non-Goals:**
+
+- No test runner implementation (that is cli-val-04)
+- No CI integration (that is cli-val-05)
+- No changes to production CLI commands
+- No snapshot testing infrastructure (that is cli-val-02)
+
+## Decisions
+
+- Store scenario files in `tests/cli-contracts/` to keep them adjacent to existing test tiers
+- Use YAML (not JSON or TOML) for scenario files — aligns with existing SpecFact config patterns and is copilot-friendly
+- Separate patterns and anti-patterns within the same file using a `type` field rather than separate files — keeps related behavior together
+- Use JSON Schema for the schema definition — enables validation with `jsonschema` (already a project dependency)
+- Pilot on three command groups: one with many args (e.g., `backlog ceremony`), one with file I/O (e.g., `validate`), one simple (e.g., `--help`/`--version`)
+
+## Schema Design
+
+```yaml
+# tests/cli-contracts/schema/cli-scenario.schema.yaml
+feature: string  # command group name (e.g., "spec validate")
+scenarios:
+  - name: string  # human-readable scenario name
+    type: pattern | anti-pattern
+    argv: [string]  # exact command tokens
+    context:  # optional workspace setup
+      requires: string  # "empty-repo" | "sample-bundle" | "initialized-project"
+      env: {key: value}  # environment variable overrides
+      stdin: string  # stdin input (if any)
+    expect:
+      exit: int  # exact exit code (for patterns)
+      exit_nonzero: bool  # true for anti-patterns
+      stdout_contains: [string]
+      stdout_regex: [string]
+      stderr_contains: [string]
+      stderr_regex: [string]
+      no_traceback: bool  # assert no Python traceback in output
+    fs:
+      creates: [string]  # files expected to be created
+      modifies: [string]  # files expected to be modified
+      forbidden: [string]  # files that must NOT be created/modified
+```
+
+## Risks / Trade-offs
+
+- [Schema rigidity] -> Mitigation: start with a minimal schema and extend per downstream change needs; version the schema
+- [YAML complexity for many scenarios] -> Mitigation: one file per command group keeps files manageable; add `$ref` support later if needed
+- [Scenario maintenance burden] -> Mitigation: cli-val-06 provides copilot generation; cli-val-05 CI catches stale scenarios
+
+## Migration Plan
+
+1. Define schema and validation tool
+2. Create pilot scenario files for 3 command groups
+3. Validate all pilots pass schema validation
+4. Document format in `docs/` for contributors
+5. Downstream changes (cli-val-03/04/06) consume the format
+
+## Open Questions
+
+- Exact command groups for the pilot: recommend `backlog ceremony standup`, `validate`, and root `specfact --help`/`--version`
+- Whether to support YAML anchors and aliases for reducing repetition in large scenario files
diff --git a/openspec/changes/cli-val-01-behavior-contract-standard/proposal.md b/openspec/changes/cli-val-01-behavior-contract-standard/proposal.md
new file mode 100644
index 00000000..f723169e
--- /dev/null
+++ b/openspec/changes/cli-val-01-behavior-contract-standard/proposal.md
@@ -0,0 +1,39 @@
+# Change: CLI Behavior Contract Standard
+
+## Why
+
+Every SpecFact CLI feature should have a single artifact that answers: "What does this command do, and what happens when you use it wrong?" Today this knowledge lives scattered across 73+ test files, docstrings, and docs pages. When a user reports unexpected behavior, there is no single source of truth to check the expected CLI behavior against. A machine-readable behavior contract per command group — recording exact argv, expected exit codes, output patterns, and filesystem side effects — closes this gap and becomes the foundation for all downstream validation (snapshot tests, acceptance runners, anti-pattern suites, CI gates).
+
+## What Changes
+
+- **NEW**: YAML schema definition for CLI behavior scenarios (`tests/cli-contracts/schema/cli-scenario.schema.yaml`) — defines the contract format for command groups
+- **NEW**: Scenario files for pilot command groups stored in `tests/cli-contracts/` — one YAML file per command group recording patterns (happy paths) and anti-patterns (misuse/invalid input)
+- **NEW**: Schema validation utility (`tools/validate_cli_contracts.py`) that validates scenario YAML files against the schema
+- **EXTEND**: `openspec/config.yaml` context to reference CLI behavior contracts as a recognized artifact type
+- **EXTEND**: Documentation in `docs/` to describe the CLI behavior contract format and authoring guidelines
+
+## Capabilities
+
+### New Capabilities
+
+- `cli-behavior-contracts`: A YAML-based schema and authoring standard for declaring CLI command behavior expectations — exact argv, required context, expected exit class (success/failure), stdout/stderr patterns, and filesystem diff expectations — separated into patterns (happy paths) and anti-patterns (misuse).
+
+## Impact
+
+- **Affected specs**: No existing specs modified; new spec delta defines the schema and authoring process
+- **Affected code**: New schema file and validation tool only; no production CLI code changes
+- **Integration points**: Consumed by cli-val-03 (anti-patterns), cli-val-04 (acceptance runner), cli-val-06 (copilot generation)
+- **Documentation impact**: New page in `docs/` describing the CLI behavior contract format for contributors
+
+## Dependencies
+
+- No hard blockers — this is the foundation change with no prerequisites
+- Downstream dependents: cli-val-03-misuse-safety-proof, cli-val-04-acceptance-test-runner, cli-val-06-copilot-test-generation
+
+## Source Tracking
+
+
+- **GitHub Issue**: #279
+- **Issue URL**: https://github.com/nold-ai/specfact-cli/issues/279
+- **Repository**: nold-ai/specfact-cli
+- **Last Synced Status**: proposed
diff --git a/openspec/changes/cli-val-01-behavior-contract-standard/specs/cli-behavior-contracts/spec.md b/openspec/changes/cli-val-01-behavior-contract-standard/specs/cli-behavior-contracts/spec.md
new file mode 100644
index 00000000..f41944d0
--- /dev/null
+++ b/openspec/changes/cli-val-01-behavior-contract-standard/specs/cli-behavior-contracts/spec.md
@@ -0,0 +1,63 @@
+## ADDED Requirements
+
+### Requirement: CLI Behavior Contract Schema
+
+The system SHALL provide a YAML schema that defines the structure for CLI behavior scenario files, enabling machine-readable declaration of expected command behavior.
+
+#### Scenario: Schema validates a well-formed scenario file
+
+- **GIVEN** a scenario YAML file with feature name, argv list, expect block (exit code, stdout/stderr patterns), and fs block (creates/modifies)
+- **WHEN** the schema validator runs against the file
+- **THEN** validation passes with no errors
+- **AND** all required fields are present and correctly typed.
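
A minimal sketch of the well-formed case above, using only the standard library. It is not the proposed `tools/validate_cli_contracts.py`, which per the design would rely on `jsonschema`. The function name `find_missing_fields`, the set of required keys, and every sample value (including the `Usage` help-text marker) are assumptions made for illustration; the dictionary stands in for a parsed scenario YAML file.

```python
# Hypothetical stand-in for a parsed *.scenarios.yaml file; values are illustrative.
WELL_FORMED = {
    "feature": "validate",
    "scenarios": [
        {
            "name": "help exits cleanly",
            "type": "pattern",
            "argv": ["specfact", "validate", "--help"],
            "expect": {"exit": 0, "stdout_contains": ["Usage"], "no_traceback": True},
            "fs": {"creates": [], "modifies": []},
        }
    ],
}

# Assumed required keys per scenario; the real schema may require more or fewer.
REQUIRED_SCENARIO_KEYS = ("name", "type", "argv", "expect")


def find_missing_fields(doc: dict) -> list[str]:
    """Return one message per missing required field, naming the scenario."""
    errors = []
    if "feature" not in doc:
        errors.append("file: missing required field 'feature'")
    for index, scenario in enumerate(doc.get("scenarios", [])):
        # Fall back to the list position when the scenario has no name.
        label = scenario.get("name", f"scenarios[{index}]")
        for key in REQUIRED_SCENARIO_KEYS:
            if key not in scenario:
                errors.append(f"{label}: missing required field '{key}'")
    return errors
```

An empty list means the file passed this presence check; type checking of each field is left to the schema itself.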
+
+#### Scenario: Schema rejects a scenario file missing required fields
+
+- **GIVEN** a scenario YAML file missing the `argv` field in a scenario entry
+- **WHEN** the schema validator runs against the file
+- **THEN** validation fails with a descriptive error identifying the missing field
+- **AND** the error message includes the scenario name and line context.
+
+#### Scenario: Schema supports both pattern and anti-pattern categorization
+
+- **GIVEN** a scenario YAML file with entries categorized as `type: pattern` and `type: anti-pattern`
+- **WHEN** the file is parsed
+- **THEN** patterns and anti-patterns are distinguishable programmatically
+- **AND** anti-patterns require `exit_nonzero: true` in the expect block.
+
+### Requirement: Pilot Scenario Files
+
+The system SHALL include pilot scenario files for at least three command groups demonstrating the contract format across different command characteristics.
+
+#### Scenario: Pilot covers a command with many arguments
+
+- **GIVEN** a scenario file for a command group with multiple required and optional arguments
+- **WHEN** the file includes happy-path and misuse scenarios
+- **THEN** at least 3 pattern scenarios and 3 anti-pattern scenarios are defined
+- **AND** each scenario records exact argv, expected exit code, and output patterns.
+
+#### Scenario: Pilot covers a command with file I/O
+
+- **GIVEN** a scenario file for a command that reads or writes files
+- **WHEN** the file includes filesystem expectation blocks
+- **THEN** the `fs.creates` and `fs.modifies` fields list expected file paths
+- **AND** anti-patterns assert `fs.creates: []` and `fs.modifies: []` (no side effects on failure).
+
+#### Scenario: Pilot covers a simple informational command
+
+- **GIVEN** a scenario file for a command like `--help` or `--version`
+- **WHEN** the file defines expected output
+- **THEN** stdout patterns match documented help text structure
+- **AND** exit code is 0 for valid invocation.
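
The anti-pattern rules stated above (a non-zero exit is required, and a failing command leaves no filesystem side effects) go beyond field presence, so they could be checked after parsing. A minimal sketch under assumptions: `anti_pattern_violations` is a hypothetical helper, scenarios are plain dictionaries as a YAML parser would produce, and treating a missing `fs` block as "no side effects declared" is a choice made here rather than something the spec states. The spec attaches the empty `fs` rule to the file I/O pilot; applying it to every anti-pattern is also an assumption.

```python
def anti_pattern_violations(scenario: dict) -> list[str]:
    """Check the anti-pattern rules for one parsed scenario entry."""
    if scenario.get("type") != "anti-pattern":
        return []  # the rules below apply to anti-patterns only
    problems = []
    expect = scenario.get("expect", {})
    if expect.get("exit_nonzero") is not True:
        problems.append("anti-pattern must set expect.exit_nonzero: true")
    fs = scenario.get("fs", {})  # assumption: an absent fs block declares nothing
    for key in ("creates", "modifies"):
        if fs.get(key):  # any non-empty list is a declared side effect
            problems.append(f"anti-pattern must declare fs.{key}: []")
    return problems
```

Keeping this separate from the structural schema check means the two kinds of failure can be reported with different messages.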
+
+### Requirement: Schema Validation Tool
+
+The system SHALL provide a validation tool that checks scenario YAML files against the schema and reports errors.
+
+#### Scenario: Validation tool runs via hatch script
+
+- **GIVEN** scenario files exist in `tests/cli-contracts/`
+- **WHEN** `hatch run validate-cli-contracts` is executed
+- **THEN** all scenario files are validated against the schema
+- **AND** errors are reported with file path and line context
+- **AND** exit code is 0 when all files pass, non-zero when any fail.
diff --git a/openspec/changes/cli-val-01-behavior-contract-standard/tasks.md b/openspec/changes/cli-val-01-behavior-contract-standard/tasks.md
new file mode 100644
index 00000000..ddcea9f8
--- /dev/null
+++ b/openspec/changes/cli-val-01-behavior-contract-standard/tasks.md
@@ -0,0 +1,76 @@
+# Tasks: cli-val-01-behavior-contract-standard
+
+## TDD / SDD order (enforced)
+
+Per `openspec/config.yaml`, tests before code for any behavior-changing task. Order: (1) Spec deltas, (2) Tests from scenarios (expect failure), (3) Code last. Do not implement production code until tests exist and have been run (expecting failure).
+
+---
+
+## 1. Create git worktree for this change
+
+- [ ] 1.1 Fetch latest and create a worktree with a new branch from `origin/dev`.
+  - [ ] 1.1.1 `git fetch origin`
+  - [ ] 1.1.2 `git worktree add ../specfact-cli-worktrees/feature/cli-val-01-behavior-contract-standard -b feature/cli-val-01-behavior-contract-standard origin/dev`
+  - [ ] 1.1.3 Change into the worktree: `cd ../specfact-cli-worktrees/feature/cli-val-01-behavior-contract-standard`
+  - [ ] 1.1.4 Create a virtual environment: `python -m venv .venv && source .venv/bin/activate && pip install -e ".[dev]"`
+  - [ ] 1.1.5 `git branch --show-current` (verify correct branch)
+
+## 2. Spec-first preparation
+
+- [ ] 2.1 Review and finalize `specs/cli-behavior-contracts/spec.md` scenarios.
+- [ ] 2.2 Cross-check scenarios against existing CliRunner test patterns in `tests/` for completeness.
+
+## 3. Test-first: schema validation tests
+
+- [ ] 3.1 Create `tests/unit/specfact_cli/test_cli_contract_schema.py` with tests for:
+  - Schema validates well-formed YAML
+  - Schema rejects missing required fields
+  - Schema distinguishes patterns from anti-patterns
+  - Validation tool reports errors with file/line context
+- [ ] 3.2 Run tests and capture failing results in `TDD_EVIDENCE.md`.
+
+## 4. Implementation: schema and validation tool
+
+- [ ] 4.1 Create `tests/cli-contracts/schema/cli-scenario.schema.yaml` — the JSON Schema for scenario files.
+- [ ] 4.2 Create `tools/validate_cli_contracts.py` — validates scenario YAML against schema, reports errors with context.
+- [ ] 4.3 Add `validate-cli-contracts` script to `pyproject.toml` hatch scripts.
+- [ ] 4.4 Re-run tests until passing; record in `TDD_EVIDENCE.md`.
+
+## 5. Pilot scenario files
+
+- [ ] 5.1 Create `tests/cli-contracts/backlog-ceremony-standup.scenarios.yaml` — command with multiple args, 3+ patterns and 3+ anti-patterns.
+- [ ] 5.2 Create `tests/cli-contracts/validate.scenarios.yaml` — command with file I/O, filesystem expectations.
+- [ ] 5.3 Create `tests/cli-contracts/root-help.scenarios.yaml` — simple `--help`/`--version` scenarios.
+- [ ] 5.4 Run schema validation tool against all pilot files; fix issues.
+
+## 6. Quality gates and documentation
+
+- [ ] 6.1 `hatch run format` — ruff format + autofix.
+- [ ] 6.2 `hatch run type-check` — basedpyright strict.
+- [ ] 6.3 `hatch run lint` — full lint suite.
+- [ ] 6.4 `hatch run contract-test` — contract-first validation.
+- [ ] 6.5 `hatch run smart-test` — targeted test run.
+- [ ] 6.6 Update `docs/` with a page describing the CLI behavior contract format and authoring guidelines.
+- [ ] 6.7 Run `openspec validate cli-val-01-behavior-contract-standard --strict` and resolve all issues.
+
+## 7. Version and changelog
+
+- [ ] 7.1 Bump minor version in `pyproject.toml`, `setup.py`, `src/specfact_cli/__init__.py`.
+- [ ] 7.2 Add CHANGELOG.md entry under new version section with `Added` items for CLI behavior contract standard.
+
+## 8. Delivery
+
+- [ ] 8.1 Update `openspec/CHANGE_ORDER.md` with implementation status.
+- [ ] 8.2 Stage and commit: `git add . && git commit -m "feat: add CLI behavior contract standard (cli-val-01)"`
+- [ ] 8.3 Push: `git push -u origin feature/cli-val-01-behavior-contract-standard`
+- [ ] 8.4 Create PR from `feature/cli-val-01-behavior-contract-standard` to `dev` using `gh pr create`.
+- [ ] 8.5 Link PR to GitHub issue and project board.
+
+## Post-merge cleanup (after PR is merged)
+
+- [ ] Return to primary checkout: `cd .../specfact-cli`
+- [ ] `git fetch origin`
+- [ ] `git worktree remove ../specfact-cli-worktrees/feature/cli-val-01-behavior-contract-standard`
+- [ ] `git branch -d feature/cli-val-01-behavior-contract-standard`
+- [ ] `git worktree prune`
+- [ ] (Optional) `git push origin --delete feature/cli-val-01-behavior-contract-standard`
diff --git a/openspec/changes/cli-val-02-output-snapshot-stability/CHANGE_VALIDATION.md b/openspec/changes/cli-val-02-output-snapshot-stability/CHANGE_VALIDATION.md
new file mode 100644
index 00000000..b6e10b1a
--- /dev/null
+++ b/openspec/changes/cli-val-02-output-snapshot-stability/CHANGE_VALIDATION.md
@@ -0,0 +1,91 @@
+# Change Validation: cli-val-02-output-snapshot-stability
+
+- **Validated on (UTC):** 2026-02-19T12:00:00Z
+- **Workflow:** /wf-validate-change (proposal-stage dry-run validation)
+- **Strict command:** `openspec validate cli-val-02-output-snapshot-stability --strict`
+- **Result:** PASS
+
+## Scope Summary
+
+- **New capabilities:** output-snapshot-stability
+- **Modified capabilities:** none
+- **Declared dependencies:** none (can develop in parallel with cli-val-01)
+- **Downstream dependents:** cli-val-05-ci-integration
+- **Proposed affected code paths:**
+  - `tests/snapshots/test_help_snapshots.py` (new)
+  - `tests/snapshots/test_output_snapshots.py` (new)
+  - `tests/snapshots/test_error_snapshots.py` (new)
+  - `pyproject.toml` (extend with syrupy dependency and hatch scripts)
+
+## Format Validation
+
+- **proposal.md Format**: PASS
+  - Title format: Correct (`# Change: Output Snapshot Stability`)
+  - Required sections: All present (Why, What Changes, Capabilities, Impact)
+  - "What Changes" format: Correct (uses NEW/EXTEND markers with bullet list)
+  - "Capabilities" section: Present (one capability: `output-snapshot-stability`)
+  - "Impact" format: Correct (lists Affected specs, Affected code, Integration points, Documentation impact)
+  - Source Tracking section: Present (GitHub Issue #280, repository nold-ai/specfact-cli)
+  - Note: Section heading uses `## Dependencies` instead of standard `## Impact` subsection for dependency declarations. Dependencies are still clearly stated. Minor deviation, non-blocking.
+- **tasks.md Format**: PASS
+  - Section headers: Correct (hierarchical numbered format: `## 1.` through `## 7.`)
+  - Task format: Correct (`- [ ] 1.1 [Description]`)
+  - Sub-task format: Correct (`- [ ] 1.1.1 [Description]`)
+  - TDD order enforced section: Present at top
+  - Git worktree creation: First task (`## 1.`)
+  - PR creation: Last implementation task (`## 7. Delivery`, task 7.4)
+  - Post-merge cleanup: Present
+  - Quality gate tasks: Present (`## 5.` includes format, type-check, lint, contract-test, smart-test)
+  - Version and changelog tasks: Present (`## 6.`)
+  - Note: No explicit worktree bootstrap pre-flight tasks (hatch env create, smart-test-status, contract-test-status). Minor gap, non-blocking.
+- **specs Format**: PASS
+  - Given/When/Then format: Verified across all 6 scenarios in `specs/output-snapshot-stability/spec.md`
+  - Four requirements defined: Help Text Snapshots (2 scenarios), Structured Output Snapshots (1 scenario), Error Message Snapshots (1 scenario), Snapshot Update Workflow (2 scenarios)
+  - All scenarios use GIVEN/WHEN/THEN/AND structure correctly
+- **design.md Format**: PASS
+  - Context section: Present
+  - Goals / Non-Goals: Present (5 goals, 4 non-goals)
+  - Decisions: Present (5 decisions documented)
+  - Risks / Trade-offs: Present (3 risks with mitigations)
+  - Migration Plan: Present (5-step plan)
+  - Open Questions: Present (2 questions)
+  - Note: No bridge adapter integration or sequence diagrams documented, but not applicable for this change
+
+## Dependency and Integration Review
+
+- **CHANGE_ORDER.md consistency**: PASS
+  - CHANGE_ORDER.md row: `cli-val | 02 | cli-val-02-output-snapshot-stability | #280 | ---`
+  - Blocked by: none (matches proposal declaration of no hard blockers)
+  - Wave 1.5 position: First in wave alongside cli-val-01 (no blockers)
+- **GitHub Issue consistency**: PASS
+  - Issue #280 is OPEN, title "[Change] Output Snapshot Stability"
+  - Issue body matches proposal scope (syrupy, snapshot tests, hatch scripts)
+  - No "Blocked by" references in issue body (correct -- no blockers)
+  - Labels: enhancement, change-proposal
+- **Cross-artifact dependency consistency**: PASS
+  - Proposal states no hard blockers
+  - CHANGE_ORDER.md shows no blockers for #280
+  - GitHub issue #280 shows no blocker references
+  - All three sources agree
+
+## Breaking-Change Analysis
+
+- **Breaking changes detected:** 0
+- **Interface changes:** None. This change introduces new test files and a dev dependency (syrupy). No existing production code, interfaces, contracts, or APIs are modified.
+- **Risk assessment:** No breaking-change risk. All artifacts are additive.
+
+## Impact Assessment
+
+- **Impact Level:** Low
+- **Code Impact:** No production CLI code changes. New test infrastructure files and dev dependency only.
+- **Test Impact:** New snapshot test files in `tests/snapshots/`. New hatch scripts (`snapshot-update`, `snapshot-check`). No changes to existing tests.
+- **Documentation Impact:** New contributor page in `docs/` describing snapshot update workflow.
+- **Release Impact:** Patch/Minor (additive only, no breaking changes)
+
+## Validation Outcome
+
+- Required artifacts are present: `proposal.md`, `design.md`, `specs/output-snapshot-stability/spec.md`, `tasks.md`.
+- All format checks pass with minor non-blocking observations noted.
+- Dependency declarations are consistent across proposal, CHANGE_ORDER.md, and GitHub issue #280.
+- No breaking changes detected.
+- Change is ready for implementation-phase intake. No prerequisites to satisfy.
diff --git a/openspec/changes/cli-val-02-output-snapshot-stability/design.md b/openspec/changes/cli-val-02-output-snapshot-stability/design.md
new file mode 100644
index 00000000..e21757e2
--- /dev/null
+++ b/openspec/changes/cli-val-02-output-snapshot-stability/design.md
@@ -0,0 +1,47 @@
+## Context
+
+This change adds snapshot testing infrastructure to freeze CLI output as versioned contracts. It uses syrupy (a pytest snapshot plugin) to capture and compare help text, structured output, and error messages. No production CLI code changes are needed.
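
Captured output usually has to be made deterministic before it can be frozen as a snapshot. A minimal sketch of that step, assuming placeholder tokens and regular expressions chosen here for illustration: `normalize_output` is a hypothetical helper, not part of syrupy or SpecFact, and the patterns only cover ANSI style codes, ISO-like timestamps, and three-part version numbers. In a test, the normalized string would be the value compared against syrupy's `snapshot` fixture.

```python
import re

ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")  # colour/style escape sequences
TIMESTAMP_RE = re.compile(
    r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:\d{2})?"
)
VERSION_RE = re.compile(r"\b\d+\.\d+\.\d+\b")  # e.g. 1.2.3


def normalize_output(text: str, workspace: str) -> str:
    """Replace environment-specific values with stable placeholders."""
    text = ANSI_RE.sub("", text)
    if workspace:  # skip when no workspace path is known
        text = text.replace(workspace, "<WORKSPACE>")
    # Timestamps are replaced before versions so their digits are not re-matched.
    text = TIMESTAMP_RE.sub("<TIMESTAMP>", text)
    text = VERSION_RE.sub("<VERSION>", text)
    return text
```

Placeholders keep the surrounding wording under snapshot control while removing the parts that differ between machines and runs.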
+
+## Goals / Non-Goals
+
+**Goals:**
+
+- Add syrupy as a dev/test dependency
+- Create snapshot tests for all command `--help` outputs
+- Create snapshot tests for key structured outputs and error messages
+- Define a snapshot update policy that prevents accidental drift
+- Integrate with hatch scripts for developer workflow
+
+**Non-Goals:**
+
+- No CI gating implementation (that is cli-val-05)
+- No production CLI code changes
+- No changes to command behavior or output format
+- No interactive test output (snapshots are for deterministic, non-interactive commands only)
+
+## Decisions
+
+- Use syrupy over inline assert strings — syrupy manages snapshot files automatically, supports multiple serialization formats, and provides clear diffs on failure
+- Store snapshots in `tests/snapshots/` — keeps them separate from test code for cleaner diffs
+- Normalize dynamic values (timestamps, absolute paths, version strings) before snapshot comparison — prevents false failures across environments
+- Create one snapshot test file per command tier: `test_help_snapshots.py`, `test_output_snapshots.py`, `test_error_snapshots.py`
+- Add `hatch run snapshot-update` script for explicit snapshot refresh
+
+## Risks / Trade-offs
+
+- [Snapshot maintenance on intentional changes] -> Mitigation: explicit `--snapshot-update` flag + PR review of snapshot diffs
+- [Dynamic output causes false failures] -> Mitigation: normalize timestamps, paths, and version strings before comparison
+- [Large snapshot files in git] -> Mitigation: text-based snapshots are compressible; help text snapshots are small
+
+## Migration Plan
+
+1. Add syrupy dependency to pyproject.toml dev extras
+2. Create snapshot test files with normalization helpers
+3. Generate initial snapshots for all existing commands
+4. Verify all snapshots pass on clean run
+5. Document update workflow in docs/
+
+## Open Questions
+
+- Whether to snapshot Rich-formatted output (with ANSI codes) or plain text — recommend plain text for stability
+- Whether error message snapshots should cover all error paths or only the most common ones — recommend starting with 5-10 key error templates
diff --git a/openspec/changes/cli-val-02-output-snapshot-stability/proposal.md b/openspec/changes/cli-val-02-output-snapshot-stability/proposal.md
new file mode 100644
index 00000000..19e59da6
--- /dev/null
+++ b/openspec/changes/cli-val-02-output-snapshot-stability/proposal.md
@@ -0,0 +1,40 @@
+# Change: Output Snapshot Stability
+
+## Why
+
+Users who script SpecFact in CI/CD pipelines rely on stable output: help text format, error message wording, JSON/YAML structured output shapes. Today, a refactoring PR can silently change help text or error phrasing with no test catching it. A user's `specfact validate --help | grep "spec"` pipeline step breaks without warning. Snapshot testing makes output changes explicit, reviewable, and intentional — treating help text and structured output as versioned contracts.
+ +## What Changes + +- **NEW**: `syrupy` added as a dev/test dependency for pytest snapshot testing +- **NEW**: Snapshot tests for all top-level and subcommand `--help` outputs in `tests/snapshots/` +- **NEW**: Snapshot tests for structured output shapes (JSON/YAML from commands that produce machine output) +- **NEW**: Snapshot tests for key error message templates (ensuring consistent user-facing error phrasing) +- **EXTEND**: CI pipeline to reject unreviewed snapshot changes (snapshot mismatches fail CI; updates require explicit `--snapshot-update` flag) +- **EXTEND**: `pyproject.toml` with syrupy configuration and hatch script for snapshot updates + +## Capabilities + +### New Capabilities + +- `output-snapshot-stability`: Pytest snapshot tests using syrupy that freeze help text, structured output shapes, and error message templates as versioned contracts. Any output change requires explicit snapshot update and PR review. + +## Impact + +- **Affected specs**: No existing specs modified; new spec delta defines snapshot testing requirements +- **Affected code**: No production CLI code changes; new test files and dev dependency only +- **Integration points**: Consumed by cli-val-05 (CI integration — snapshot mismatch becomes a hard gate) +- **Documentation impact**: Developer guide update in `docs/` describing snapshot update workflow + +## Dependencies + +- No hard blockers — can develop in parallel with cli-val-01 +- Downstream dependents: cli-val-05-ci-integration + +## Source Tracking + + +- **GitHub Issue**: #280 +- **Issue URL**: https://github.com/nold-ai/specfact-cli/issues/280 +- **Repository**: nold-ai/specfact-cli +- **Last Synced Status**: proposed diff --git a/openspec/changes/cli-val-02-output-snapshot-stability/specs/output-snapshot-stability/spec.md b/openspec/changes/cli-val-02-output-snapshot-stability/specs/output-snapshot-stability/spec.md new file mode 100644 index 00000000..82fd2b29 --- /dev/null +++ 
b/openspec/changes/cli-val-02-output-snapshot-stability/specs/output-snapshot-stability/spec.md @@ -0,0 +1,59 @@ +## ADDED Requirements + +### Requirement: Help Text Snapshots + +The system SHALL maintain snapshot tests for all command `--help` outputs to detect unintentional changes. + +#### Scenario: Help text snapshot matches for top-level command + +- **GIVEN** the SpecFact CLI is installed +- **WHEN** `specfact --help` is invoked via CliRunner +- **THEN** the output matches the stored snapshot exactly +- **AND** any difference causes the test to fail. + +#### Scenario: Help text snapshot matches for subcommands + +- **GIVEN** the SpecFact CLI is installed +- **WHEN** each registered subcommand `--help` is invoked via CliRunner +- **THEN** each output matches its respective stored snapshot +- **AND** new commands automatically require a snapshot to be created. + +### Requirement: Structured Output Snapshots + +The system SHALL maintain snapshot tests for commands that produce machine-readable output (JSON/YAML). + +#### Scenario: JSON output shape matches snapshot + +- **GIVEN** a command that produces structured JSON output +- **WHEN** the command is invoked with a deterministic input +- **THEN** the JSON output structure matches the stored snapshot +- **AND** dynamic values (timestamps, absolute paths) are normalized before comparison. + +### Requirement: Error Message Snapshots + +The system SHALL maintain snapshot tests for key error message templates. + +#### Scenario: Error messages for common failure modes are stable + +- **GIVEN** a command invoked with a known-invalid input +- **WHEN** the command produces an error message +- **THEN** the error message matches the stored snapshot +- **AND** changes to error wording require explicit snapshot update. + +### Requirement: Snapshot Update Workflow + +The system SHALL provide an explicit workflow for updating snapshots that prevents accidental drift. 
+ +#### Scenario: Snapshot update requires explicit flag + +- **GIVEN** a test run with snapshot mismatches +- **WHEN** tests are run without `--snapshot-update` flag +- **THEN** mismatching tests fail +- **AND** no snapshots are overwritten silently. + +#### Scenario: Snapshot update with explicit flag succeeds + +- **GIVEN** a test run with intentional output changes +- **WHEN** tests are run with `--snapshot-update` flag +- **THEN** snapshots are updated to match new output +- **AND** updated snapshot files appear in git diff for review. diff --git a/openspec/changes/cli-val-02-output-snapshot-stability/tasks.md b/openspec/changes/cli-val-02-output-snapshot-stability/tasks.md new file mode 100644 index 00000000..b7a8dd9b --- /dev/null +++ b/openspec/changes/cli-val-02-output-snapshot-stability/tasks.md @@ -0,0 +1,69 @@ +# Tasks: cli-val-02-output-snapshot-stability + +## TDD / SDD order (enforced) + +Per `openspec/config.yaml`, tests before code for any behavior-changing task. Order: (1) Spec deltas, (2) Tests from scenarios (expect failure), (3) Code last. Do not implement production code until tests exist and have been run (expecting failure). + +--- + +## 1. Create git worktree for this change + +- [ ] 1.1 Fetch latest and create a worktree with a new branch from `origin/dev`. + - [ ] 1.1.1 `git fetch origin` + - [ ] 1.1.2 `git worktree add ../specfact-cli-worktrees/feature/cli-val-02-output-snapshot-stability -b feature/cli-val-02-output-snapshot-stability origin/dev` + - [ ] 1.1.3 Change into the worktree: `cd ../specfact-cli-worktrees/feature/cli-val-02-output-snapshot-stability` + - [ ] 1.1.4 Create a virtual environment: `python -m venv .venv && source .venv/bin/activate && pip install -e ".[dev]"` + - [ ] 1.1.5 `git branch --show-current` (verify correct branch) + +## 2. Spec-first preparation + +- [ ] 2.1 Review and finalize `specs/output-snapshot-stability/spec.md` scenarios. 
+- [ ] 2.2 Inventory all existing command groups and their `--help` outputs for snapshot coverage planning. + +## 3. Test-first: snapshot test structure + +- [ ] 3.1 Add `syrupy` to dev dependencies in `pyproject.toml` (all relevant dependency sections). +- [ ] 3.2 Create `tests/snapshots/test_help_snapshots.py` with parametrized tests for all command `--help` outputs. +- [ ] 3.3 Create `tests/snapshots/test_output_snapshots.py` with tests for structured JSON/YAML outputs. +- [ ] 3.4 Create `tests/snapshots/test_error_snapshots.py` with tests for key error message templates. +- [ ] 3.5 Create normalization helpers for dynamic values (timestamps, paths, versions). +- [ ] 3.6 Run tests to generate initial snapshots; capture results in `TDD_EVIDENCE.md`. + +## 4. Configuration and hatch integration + +- [ ] 4.1 Add syrupy configuration to `pyproject.toml` (snapshot directory, serializer settings). +- [ ] 4.2 Add `snapshot-update` hatch script: `pytest tests/snapshots/ --snapshot-update`. +- [ ] 4.3 Add `snapshot-check` hatch script: `pytest tests/snapshots/` (fails on mismatch). +- [ ] 4.4 Verify all snapshots pass on clean run; record in `TDD_EVIDENCE.md`. + +## 5. Quality gates and documentation + +- [ ] 5.1 `hatch run format` — ruff format + autofix. +- [ ] 5.2 `hatch run type-check` — basedpyright strict. +- [ ] 5.3 `hatch run lint` — full lint suite. +- [ ] 5.4 `hatch run contract-test` — contract-first validation. +- [ ] 5.5 `hatch run smart-test` — targeted test run. +- [ ] 5.6 Update `docs/` with snapshot testing workflow guide for contributors. +- [ ] 5.7 Run `openspec validate cli-val-02-output-snapshot-stability --strict` and resolve all issues. + +## 6. Version and changelog + +- [ ] 6.1 Bump minor version in `pyproject.toml`, `setup.py`, `src/specfact_cli/__init__.py`. +- [ ] 6.2 Add CHANGELOG.md entry under new version section with `Added` items for snapshot testing. + +## 7. 
Delivery + +- [ ] 7.1 Update `openspec/CHANGE_ORDER.md` with implementation status. +- [ ] 7.2 Stage and commit: `git add . && git commit -m "feat: add output snapshot stability tests (cli-val-02)"` +- [ ] 7.3 Push: `git push -u origin feature/cli-val-02-output-snapshot-stability` +- [ ] 7.4 Create PR from `feature/cli-val-02-output-snapshot-stability` to `dev` using `gh pr create`. +- [ ] 7.5 Link PR to GitHub issue and project board. + +## Post-merge cleanup (after PR is merged) + +- [ ] Return to primary checkout: `cd .../specfact-cli` +- [ ] `git fetch origin` +- [ ] `git worktree remove ../specfact-cli-worktrees/feature/cli-val-02-output-snapshot-stability` +- [ ] `git branch -d feature/cli-val-02-output-snapshot-stability` +- [ ] `git worktree prune` +- [ ] (Optional) `git push origin --delete feature/cli-val-02-output-snapshot-stability` diff --git a/openspec/changes/cli-val-03-misuse-safety-proof/CHANGE_VALIDATION.md b/openspec/changes/cli-val-03-misuse-safety-proof/CHANGE_VALIDATION.md new file mode 100644 index 00000000..b05be384 --- /dev/null +++ b/openspec/changes/cli-val-03-misuse-safety-proof/CHANGE_VALIDATION.md @@ -0,0 +1,92 @@ +# Change Validation: cli-val-03-misuse-safety-proof + +- **Validated on (UTC):** 2026-02-19T12:00:00Z +- **Workflow:** /wf-validate-change (proposal-stage dry-run validation) +- **Strict command:** `openspec validate cli-val-03-misuse-safety-proof --strict` +- **Result:** PASS + +## Scope Summary + +- **New capabilities:** misuse-safety-proof +- **Modified capabilities:** none +- **Declared dependencies:** cli-val-01-behavior-contract-standard (#279) -- hard blocker +- **Downstream dependents:** cli-val-04-acceptance-test-runner, cli-val-05-ci-integration +- **Proposed affected code paths:** + - `tests/cli-contracts/*.scenarios.yaml` (extend with anti-patterns) + - `tests/unit/specfact_cli/test_cli_misuse_safety.py` (new) + - `tests/unit/specfact_cli/test_cli_hypothesis_fuzz.py` (new) + - `docs/` (contributor guide for 
anti-pattern authoring) + +## Format Validation + +- **proposal.md Format**: PASS + - Title format: Correct (`# Change: Misuse Safety Proof`) + - Required sections: All present (Why, What Changes, Capabilities, Impact) + - "What Changes" format: Correct (uses NEW/EXTEND markers with bullet list) + - "Capabilities" section: Present (one capability: `misuse-safety-proof`) + - "Impact" format: Correct (lists Affected specs, Affected code, Integration points, Documentation impact) + - Source Tracking section: Present (GitHub Issue #281, repository nold-ai/specfact-cli) +- **tasks.md Format**: PASS + - Section headers: Correct (hierarchical numbered format: `## 1.` through `## 8.`) + - Task format: Correct (`- [ ] 1.1 [Description]`) + - Sub-task format: Correct (`- [ ] 1.1.1 [Description]`) + - TDD order enforced section: Present at top + - Git worktree creation: First task (`## 1.`) + - Blocker verification: Task 1.1.6 verifies cli-val-01 is merged before starting (correct) + - PR creation: Last implementation task (`## 8. Delivery`, task 8.4) + - Post-merge cleanup: Present + - Quality gate tasks: Present (`## 6.`) + - Version and changelog tasks: Present (`## 7.`) + - Bug triage section: Present (`## 5.` -- documents discovered bugs as separate issues) + - Note: No explicit worktree bootstrap pre-flight tasks. Minor gap, non-blocking. +- **specs Format**: PASS + - Given/When/Then format: Verified across all 8 scenarios in `specs/misuse-safety-proof/spec.md` + - Three requirements defined: Systematic Anti-Pattern Catalog (2 scenarios), Three-Property Safety Assertion (3 scenarios), Hypothesis Property-Based Fuzzing (3 scenarios) + - All scenarios use GIVEN/WHEN/THEN/AND structure correctly + - Note: Scenario count is 2 + 3 + 3 = 8 across the three requirements.
+- **design.md Format**: PASS + - Context section: Present + - Goals / Non-Goals: Present (4 goals, 4 non-goals) + - Decisions: Present (5 decisions documented) + - Risks / Trade-offs: Present (3 risks with mitigations) + - Migration Plan: Present (5-step plan) + - Open Questions: Present (2 questions) + +## Dependency and Integration Review + +- **CHANGE_ORDER.md consistency**: PASS + - CHANGE_ORDER.md row: `cli-val | 03 | cli-val-03-misuse-safety-proof | #281 | #279` + - Blocked by: #279 (cli-val-01-behavior-contract-standard) + - Wave 1.5 position: After cli-val-01 (matches dependency) +- **GitHub Issue consistency**: PASS + - Issue #281 is OPEN, title "[Change] Misuse Safety Proof" + - Issue body states: `**Blocked by**: #279 (cli-val-01-behavior-contract-standard)` + - Labels: enhancement, change-proposal +- **Cross-artifact dependency consistency**: PASS + - Proposal states hard blocker: cli-val-01-behavior-contract-standard + - CHANGE_ORDER.md shows blocker: #279 + - GitHub issue #281 states: Blocked by #279 + - tasks.md task 1.1.6 includes blocker verification step + - All four sources agree + +## Breaking-Change Analysis + +- **Breaking changes detected:** 0 +- **Interface changes:** None. This change introduces new test files and extends existing scenario YAML files with anti-patterns. No existing production code, interfaces, contracts, or APIs are modified. +- **Risk assessment:** No breaking-change risk. Anti-pattern testing may surface production bugs, but those are tracked as separate issues per the explicit bug triage task (task 5). + +## Impact Assessment + +- **Impact Level:** Low +- **Code Impact:** No production CLI code changes. New test infrastructure files only. May discover production bugs (documented separately). +- **Test Impact:** New test files in `tests/unit/`. Extended scenario files in `tests/cli-contracts/`. No changes to existing tests. 
+- **Documentation Impact:** New contributor page in `docs/` for anti-pattern authoring conventions. +- **Release Impact:** Patch/Minor (additive only, no breaking changes) + +## Validation Outcome + +- Required artifacts are present: `proposal.md`, `design.md`, `specs/misuse-safety-proof/spec.md`, `tasks.md`. +- All format checks pass with minor non-blocking observations noted. +- Dependency declarations are consistent across proposal, CHANGE_ORDER.md, GitHub issue #281, and tasks.md blocker verification. +- No breaking changes detected. +- Change is ready for implementation-phase intake once cli-val-01 (#279) is implemented and merged. diff --git a/openspec/changes/cli-val-03-misuse-safety-proof/design.md b/openspec/changes/cli-val-03-misuse-safety-proof/design.md new file mode 100644 index 00000000..cd16472d --- /dev/null +++ b/openspec/changes/cli-val-03-misuse-safety-proof/design.md @@ -0,0 +1,46 @@ +## Context + +This change creates a systematic misuse/anti-pattern test suite that proves every command group fails safely on bad input. It builds on the scenario schema from cli-val-01 and adds Hypothesis fuzzing for edge cases humans do not anticipate. 
+ +## Goals / Non-Goals + +**Goals:** + +- Create anti-pattern scenarios for all Wave 1 command groups using the cli-val-01 YAML schema +- Implement a test runner that asserts three safety properties per anti-pattern +- Add Hypothesis strategies for 3-5 major command groups +- Surface and document any production code bugs found during anti-pattern testing + +**Non-Goals:** + +- No production CLI code changes in this change scope (bugs found are tracked separately) +- No acceptance test runner (that is cli-val-04) +- No CI integration (that is cli-val-05) +- No interactive prompt testing (pexpect-based testing is out of scope) + +## Decisions + +- Anti-pattern scenarios are stored in the same `tests/cli-contracts/` directory as patterns, using `type: anti-pattern` in the YAML +- The three-property assertion (exit code, error message, filesystem) is implemented as a reusable test helper function to avoid duplication +- Hypothesis settings: `max_examples=50` and `deadline=30000` (milliseconds) for each command group's fuzz tests, bounded to keep CI runtime predictable; note that Hypothesis applies `deadline` to each generated example rather than to the whole group +- Traceback detection uses regex matching for `Traceback (most recent call last)`, which covers standard Python tracebacks +- Filesystem side-effect detection compares directory snapshots before and after command execution using `tmp_path` + +## Risks / Trade-offs + +- [Anti-patterns may reveal production bugs] -> Mitigation: track found bugs as separate issues; do not block this change on fixing them +- [Hypothesis may find flaky edge cases] -> Mitigation: use `@settings(suppress_health_check=[HealthCheck.too_slow])` and bounded strategies +- [Coverage of all misuse categories] -> Mitigation: start with the 5 common categories (missing args, invalid flags, bad paths, malformed input, forbidden combos); extend incrementally + +## Migration Plan + +1. Create anti-pattern YAML scenarios for all Wave 1 command groups +2. Implement three-property assertion helper +3. Create test files that load anti-patterns and run assertions +4. 
Add Hypothesis strategies for 3-5 major commands +5. Run full suite and document findings + +## Open Questions + +- Whether Hypothesis-discovered edge cases should be added back to the anti-pattern catalog automatically or manually curated +- Whether to include `--debug` mode testing (which intentionally shows tracebacks) as a separate scenario category diff --git a/openspec/changes/cli-val-03-misuse-safety-proof/proposal.md b/openspec/changes/cli-val-03-misuse-safety-proof/proposal.md new file mode 100644 index 00000000..f6e88c7e --- /dev/null +++ b/openspec/changes/cli-val-03-misuse-safety-proof/proposal.md @@ -0,0 +1,38 @@ +# Change: Misuse Safety Proof + +## Why + +When a user invokes SpecFact CLI with wrong flags, missing files, invalid YAML, or forbidden option combinations, the CLI should always: exit non-zero, print a human-readable error to stderr, and leave no partial artifacts on disk. Today, anti-pattern handling is tested ad-hoc in some test files but not systematically per command. Users have encountered stack traces in error cases. A systematic anti-pattern catalog per command group — combined with Hypothesis property-based fuzzing — proves that every misuse case fails safely and predictably. 
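The three guarantees named above (non-zero exit, a readable error on stderr, no partial artifacts) can be sketched as one reusable check. This is an illustration under stated assumptions, not the helper this change will ship: `check_safe_failure` and `snapshot_tree` are hypothetical names, the command runs as a subprocess rather than through CliRunner, and the filesystem comparison tracks regular files only.

```python
import re
import subprocess
from pathlib import Path

TRACEBACK = re.compile(r"Traceback \(most recent call last\)")


def snapshot_tree(root):
    """Map each file under root to (size, mtime_ns); empty directories are not tracked."""
    root = Path(root)
    return {
        str(path.relative_to(root)): (path.stat().st_size, path.stat().st_mtime_ns)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def check_safe_failure(argv, workspace):
    """Run argv inside workspace and return the list of violated safety properties."""
    before = snapshot_tree(workspace)
    result = subprocess.run(argv, cwd=workspace, capture_output=True, text=True)
    after = snapshot_tree(workspace)

    violations = []
    if result.returncode == 0:
        violations.append("exit code was zero")
    if not result.stderr.strip():
        violations.append("no error message on stderr")
    if TRACEBACK.search(result.stdout) or TRACEBACK.search(result.stderr):
        violations.append("traceback leaked")
    if before != after:
        violations.append("filesystem changed")
    return violations
```

Returning a list of violations instead of asserting directly lets a parametrized test report every broken property for a given anti-pattern in one failure message.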
+ +## What Changes + +- **NEW**: Anti-pattern catalog per command group in `tests/cli-contracts/` following the scenario schema from cli-val-01 +- **NEW**: Anti-pattern test suite in `tests/unit/specfact_cli/test_cli_misuse_safety.py` asserting three properties for every anti-pattern: (1) non-zero exit, (2) human-readable error without tracebacks, (3) no unintended filesystem side effects +- **NEW**: Hypothesis property-based test strategies in `tests/unit/specfact_cli/test_cli_hypothesis_fuzz.py` generating invalid enum values, path edge cases, Unicode edge cases per major command group +- **EXTEND**: Existing CliRunner test patterns to include systematic traceback-absence assertions + +## Capabilities + +### New Capabilities + +- `misuse-safety-proof`: Systematic anti-pattern testing for every command group — codifying wrong flags, missing values, invalid paths, malformed input files, illegal combinations — with three-property assertion (non-zero exit, clean error, no side effects) and Hypothesis fuzzing for undiscovered edge cases. 
+ +## Impact + +- **Affected specs**: No existing specs modified; new spec delta defines anti-pattern testing requirements +- **Affected code**: No production CLI code changes; new test files only (may surface bugs that require production fixes) +- **Integration points**: Consumes cli-val-01 scenario schema for anti-pattern definitions; consumed by cli-val-04 (acceptance runner executes anti-patterns) and cli-val-05 (CI gate) +- **Documentation impact**: Contributor guide update describing anti-pattern authoring conventions + +## Dependencies + +- **Hard blocker**: cli-val-01-behavior-contract-standard (anti-patterns follow the scenario YAML schema) +- Downstream dependents: cli-val-04-acceptance-test-runner, cli-val-05-ci-integration + +## Source Tracking + + +- **GitHub Issue**: #281 +- **Issue URL**: https://github.com/nold-ai/specfact-cli/issues/281 +- **Repository**: nold-ai/specfact-cli +- **Last Synced Status**: proposed diff --git a/openspec/changes/cli-val-03-misuse-safety-proof/specs/misuse-safety-proof/spec.md b/openspec/changes/cli-val-03-misuse-safety-proof/specs/misuse-safety-proof/spec.md new file mode 100644 index 00000000..a137f6c4 --- /dev/null +++ b/openspec/changes/cli-val-03-misuse-safety-proof/specs/misuse-safety-proof/spec.md @@ -0,0 +1,66 @@ +## ADDED Requirements + +### Requirement: Systematic Anti-Pattern Catalog + +The system SHALL maintain an anti-pattern catalog for every command group documenting known misuse cases and their expected safe failure behavior. + +#### Scenario: Each command group has anti-pattern scenarios + +- **GIVEN** a command group registered in the CLI +- **WHEN** its anti-pattern catalog is loaded +- **THEN** at least 3 anti-pattern scenarios are defined +- **AND** each scenario specifies exact argv, expected non-zero exit, and error pattern. 
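As a rough illustration of the scenario above, a catalog check might look like the following. The dictionary keys `argv`, `expected_exit`, and `error_pattern` are hypothetical stand-ins, since the cli-val-01 schema is not shown here; only `type: anti-pattern` comes from the design notes. The function name `catalog_problems` is likewise an assumption.

```python
import re


def catalog_problems(scenarios, minimum=3):
    """Return human-readable problems found in one command group's catalog.

    Each scenario is a dict with HYPOTHETICAL keys: "type", "argv",
    "expected_exit", "error_pattern". The real schema may differ.
    """
    problems = []
    anti = [s for s in scenarios if s.get("type") == "anti-pattern"]
    if len(anti) < minimum:
        problems.append(
            f"found {len(anti)} anti-pattern scenarios, expected at least {minimum}"
        )
    for index, scenario in enumerate(anti):
        argv = scenario.get("argv")
        if not (isinstance(argv, list) and argv and all(isinstance(a, str) for a in argv)):
            problems.append(f"scenario {index}: argv must be a non-empty list of strings")
        exit_code = scenario.get("expected_exit")
        if not isinstance(exit_code, int) or exit_code == 0:
            problems.append(f"scenario {index}: expected_exit must be a non-zero integer")
        pattern = scenario.get("error_pattern")
        if not isinstance(pattern, str) or not pattern:
            problems.append(f"scenario {index}: error_pattern is missing")
        else:
            try:
                re.compile(pattern)  # the pattern must at least be a valid regex
            except re.error as exc:
                problems.append(f"scenario {index}: error_pattern is not a valid regex ({exc})")
    return problems
```

An empty return value means the catalog meets the minimum bar described in the scenario; the check says nothing about whether the listed misuse cases are the right ones.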
+ +#### Scenario: Anti-patterns cover common misuse categories + +- **GIVEN** an anti-pattern catalog for a command group +- **WHEN** the catalog is reviewed +- **THEN** it includes scenarios for: missing required arguments, invalid flag values, nonexistent file paths, malformed input files, and forbidden option combinations where applicable. + +### Requirement: Three-Property Safety Assertion + +Every anti-pattern test SHALL assert three properties: non-zero exit, human-readable error, and no unintended side effects. + +#### Scenario: Non-zero exit on invalid input + +- **GIVEN** a CLI command invoked with an anti-pattern argv +- **WHEN** the command executes +- **THEN** the exit code is non-zero. + +#### Scenario: Human-readable error without traceback + +- **GIVEN** a CLI command invoked with an anti-pattern argv and `--debug` is NOT set +- **WHEN** the command produces error output +- **THEN** stderr contains a human-readable error message +- **AND** neither stdout nor stderr contains a Python traceback (`Traceback (most recent call last)`). + +#### Scenario: No unintended filesystem side effects + +- **GIVEN** a CLI command invoked with an anti-pattern argv inside a sandboxed `tmp_path` workspace +- **WHEN** the command fails +- **THEN** no files are created or modified in the workspace beyond what existed before invocation. + +### Requirement: Hypothesis Property-Based Fuzzing + +The system SHALL use Hypothesis to generate edge-case inputs for major command groups and assert safe failure. + +#### Scenario: Fuzz testing with invalid enum values + +- **GIVEN** a command that accepts enum-typed arguments +- **WHEN** Hypothesis generates values outside the valid enum set +- **THEN** the command exits non-zero with a descriptive error +- **AND** no crash or unhandled exception occurs. 
+ +#### Scenario: Fuzz testing with path edge cases + +- **GIVEN** a command that accepts file path arguments +- **WHEN** Hypothesis generates paths with Unicode, spaces, empty strings, and deeply nested paths +- **THEN** the command exits non-zero for invalid paths +- **AND** no crash or unhandled exception occurs. + +#### Scenario: Fuzz testing completes within time budget + +- **GIVEN** Hypothesis strategies configured for CLI fuzzing +- **WHEN** the fuzz suite runs +- **THEN** all tests complete within 30 seconds per command group +- **AND** any failure is reported with a minimal reproducing example. diff --git a/openspec/changes/cli-val-03-misuse-safety-proof/tasks.md b/openspec/changes/cli-val-03-misuse-safety-proof/tasks.md new file mode 100644 index 00000000..c028e55e --- /dev/null +++ b/openspec/changes/cli-val-03-misuse-safety-proof/tasks.md @@ -0,0 +1,79 @@ +# Tasks: cli-val-03-misuse-safety-proof + +## TDD / SDD order (enforced) + +Per `openspec/config.yaml`, tests before code for any behavior-changing task. Order: (1) Spec deltas, (2) Tests from scenarios (expect failure), (3) Code last. Do not implement production code until tests exist and have been run (expecting failure). + +--- + +## 1. Create git worktree for this change + +- [ ] 1.1 Fetch latest and create a worktree with a new branch from `origin/dev`. + - [ ] 1.1.1 `git fetch origin` + - [ ] 1.1.2 `git worktree add ../specfact-cli-worktrees/feature/cli-val-03-misuse-safety-proof -b feature/cli-val-03-misuse-safety-proof origin/dev` + - [ ] 1.1.3 Change into the worktree: `cd ../specfact-cli-worktrees/feature/cli-val-03-misuse-safety-proof` + - [ ] 1.1.4 Create a virtual environment: `python -m venv .venv && source .venv/bin/activate && pip install -e ".[dev]"` + - [ ] 1.1.5 `git branch --show-current` (verify correct branch) + - [ ] 1.1.6 Verify cli-val-01 is merged into dev before starting. + +## 2. 
Spec-first preparation + +- [ ] 2.1 Review and finalize `specs/misuse-safety-proof/spec.md` scenarios. +- [ ] 2.2 Inventory all Wave 1 command groups and their argument signatures for anti-pattern coverage. + +## 3. Anti-pattern catalog creation + +- [ ] 3.1 Create anti-pattern YAML scenarios for each Wave 1 command group in `tests/cli-contracts/`: + - `backlog-ceremony-standup.scenarios.yaml` (extend with anti-patterns) + - `validate.scenarios.yaml` (extend with anti-patterns) + - Additional command groups as identified in 2.2 +- [ ] 3.2 Ensure each command group has at least 3 anti-pattern scenarios covering: missing args, invalid flags, bad paths. + +## 4. Test-first: safety assertion suite + +- [ ] 4.1 Create `tests/unit/specfact_cli/test_cli_misuse_safety.py` with: + - Reusable three-property assertion helper (exit code, error message, filesystem) + - Parametrized tests loading anti-patterns from YAML scenario files + - Traceback-absence assertion for non-debug mode +- [ ] 4.2 Create `tests/unit/specfact_cli/test_cli_hypothesis_fuzz.py` with: + - Hypothesis strategies for invalid enum values + - Hypothesis strategies for path edge cases (Unicode, spaces, empty, deeply nested) + - Bounded settings (`max_examples=50`, `deadline=30s`) +- [ ] 4.3 Run tests and capture results in `TDD_EVIDENCE.md` (expect some failures from discovered bugs). + +## 5. Bug triage and documentation + +- [ ] 5.1 Document any production bugs discovered during anti-pattern testing as separate GitHub issues. +- [ ] 5.2 For each discovered bug, add a `# Known issue: #NNN` comment in the anti-pattern scenario. + +## 6. Quality gates and documentation + +- [ ] 6.1 `hatch run format` — ruff format + autofix. +- [ ] 6.2 `hatch run type-check` — basedpyright strict. +- [ ] 6.3 `hatch run lint` — full lint suite. +- [ ] 6.4 `hatch run contract-test` — contract-first validation. +- [ ] 6.5 `hatch run smart-test` — targeted test run. 
+- [ ] 6.6 Update `docs/` with anti-pattern authoring guide for contributors. +- [ ] 6.7 Run `openspec validate cli-val-03-misuse-safety-proof --strict` and resolve all issues. + +## 7. Version and changelog + +- [ ] 7.1 Bump minor version in `pyproject.toml`, `setup.py`, `src/specfact_cli/__init__.py`. +- [ ] 7.2 Add CHANGELOG.md entry under new version section with `Added` items for misuse safety proof. + +## 8. Delivery + +- [ ] 8.1 Update `openspec/CHANGE_ORDER.md` with implementation status. +- [ ] 8.2 Stage and commit: `git add . && git commit -m "feat: add misuse safety proof tests (cli-val-03)"` +- [ ] 8.3 Push: `git push -u origin feature/cli-val-03-misuse-safety-proof` +- [ ] 8.4 Create PR from `feature/cli-val-03-misuse-safety-proof` to `dev` using `gh pr create`. +- [ ] 8.5 Link PR to GitHub issue and project board. + +## Post-merge cleanup (after PR is merged) + +- [ ] Return to primary checkout: `cd .../specfact-cli` +- [ ] `git fetch origin` +- [ ] `git worktree remove ../specfact-cli-worktrees/feature/cli-val-03-misuse-safety-proof` +- [ ] `git branch -d feature/cli-val-03-misuse-safety-proof` +- [ ] `git worktree prune` +- [ ] (Optional) `git push origin --delete feature/cli-val-03-misuse-safety-proof` diff --git a/openspec/changes/cli-val-04-acceptance-test-runner/CHANGE_VALIDATION.md b/openspec/changes/cli-val-04-acceptance-test-runner/CHANGE_VALIDATION.md new file mode 100644 index 00000000..e6b3319b --- /dev/null +++ b/openspec/changes/cli-val-04-acceptance-test-runner/CHANGE_VALIDATION.md @@ -0,0 +1,93 @@ +# Change Validation: cli-val-04-acceptance-test-runner + +- **Validated on (UTC):** 2026-02-19T12:00:00Z +- **Workflow:** /wf-validate-change (proposal-stage dry-run validation) +- **Strict command:** `openspec validate cli-val-04-acceptance-test-runner --strict` +- **Result:** PASS + +## Scope Summary + +- **New capabilities:** acceptance-test-runner +- **Modified capabilities:** none +- **Declared dependencies:** 
cli-val-01-behavior-contract-standard (#279), cli-val-03-misuse-safety-proof (#281) -- both hard blockers +- **Downstream dependents:** cli-val-05-ci-integration +- **Proposed affected code paths:** + - `tools/cli_acceptance_runner.py` (new dual-path runner) + - `tests/e2e/test_cli_acceptance.py` (new pytest integration) + - `tests/e2e/test_cli_chain_init.py` (new flagship chain test) + - `tests/e2e/test_cli_chain_validate.py` (new flagship chain test) + - `tests/e2e/test_cli_chain_help.py` (new flagship chain test) + - `pyproject.toml` (extend with hatch scripts and pytest marker) + +## Format Validation + +- **proposal.md Format**: PASS + - Title format: Correct (`# Change: Acceptance Test Runner`) + - Required sections: All present (Why, What Changes, Capabilities, Impact) + - "What Changes" format: Correct (uses NEW/EXTEND markers with bullet list) + - "Capabilities" section: Present (one capability: `acceptance-test-runner`) + - "Impact" format: Correct (lists Affected specs, Affected code, Integration points, Documentation impact) + - Source Tracking section: Present (GitHub Issue #282, repository nold-ai/specfact-cli) +- **tasks.md Format**: PASS + - Section headers: Correct (hierarchical numbered format: `## 1.` through `## 8.`) + - Task format: Correct (`- [ ] 1.1 [Description]`) + - Sub-task format: Correct (`- [ ] 1.1.1 [Description]`) + - TDD order enforced section: Present at top + - Git worktree creation: First task (`## 1.`) + - Blocker verification: Task 1.1.6 verifies cli-val-01 and cli-val-03 are merged before starting (correct -- both hard blockers) + - PR creation: Last implementation task (`## 8. Delivery`, task 8.4) + - Post-merge cleanup: Present + - Quality gate tasks: Present (`## 6.`) + - Version and changelog tasks: Present (`## 7.`) + - Flagship command chain tests: Present (`## 5.` -- 3 chain tests for init, validate, help workflows) + - Note: No explicit worktree bootstrap pre-flight tasks. Minor gap, non-blocking. 
+- **specs Format**: PASS + - Given/When/Then format: Verified across all 8 scenarios in `specs/acceptance-test-runner/spec.md` + - Four requirements defined: Dual-Path Scenario Execution (3 scenarios), YAML Scenario Loading (2 scenarios), Pytest Integration (2 scenarios), Flagship Command Chain Tests (1 scenario) + - All scenarios use GIVEN/WHEN/THEN/AND structure correctly +- **design.md Format**: PASS + - Context section: Present + - Goals / Non-Goals: Present (5 goals, 4 non-goals) + - Decisions: Present (5 decisions documented) + - Risks / Trade-offs: Present (3 risks with mitigations) + - Migration Plan: Present (5-step plan) + - Open Questions: Present (2 questions) + +## Dependency and Integration Review + +- **CHANGE_ORDER.md consistency**: PASS + - CHANGE_ORDER.md row: `cli-val | 04 | cli-val-04-acceptance-test-runner | #282 | #279, #281` + - Blocked by: #279 (cli-val-01), #281 (cli-val-03) + - Wave 1.5 position: After cli-val-01 + cli-val-03 (matches dependency chain) +- **GitHub Issue consistency**: PASS + - Issue #282 is OPEN, title "[Change] Acceptance Test Runner" + - Issue body states: `**Blocked by**: #279 (cli-val-01), #281 (cli-val-03)` + - Labels: enhancement, change-proposal +- **Cross-artifact dependency consistency**: PASS + - Proposal states two hard blockers: cli-val-01 (YAML scenario format), cli-val-03 (anti-patterns) + - CHANGE_ORDER.md shows blockers: #279, #281 + - GitHub issue #282 states: Blocked by #279, #281 + - tasks.md task 1.1.6 includes blocker verification for both + - All four sources agree + +## Breaking-Change Analysis + +- **Breaking changes detected:** 0 +- **Interface changes:** None. This change introduces new tools and test files. No existing production code, interfaces, contracts, or APIs are modified. +- **Risk assessment:** No breaking-change risk. All artifacts are additive. The dual-path runner is a new tool with no impact on existing CLI behavior. 
+ +## Impact Assessment + +- **Impact Level:** Low +- **Code Impact:** No production CLI code changes. New test runner tool and e2e test files only. +- **Test Impact:** New tool in `tools/`. New test files in `tests/e2e/` and `tests/unit/tools/`. New pytest marker `@pytest.mark.blackbox`. New hatch scripts (`cli-acceptance-fast`, `cli-acceptance-blackbox`). No changes to existing tests. +- **Documentation Impact:** New contributor page in `docs/` for acceptance test workflow. +- **Release Impact:** Patch/Minor (additive only, no breaking changes) + +## Validation Outcome + +- Required artifacts are present: `proposal.md`, `design.md`, `specs/acceptance-test-runner/spec.md`, `tasks.md`. +- All format checks pass with minor non-blocking observations noted. +- Dependency declarations are consistent across proposal, CHANGE_ORDER.md, GitHub issue #282, and tasks.md blocker verification. +- No breaking changes detected. +- Change is ready for implementation-phase intake once cli-val-01 (#279) and cli-val-03 (#281) are implemented and merged. diff --git a/openspec/changes/cli-val-04-acceptance-test-runner/design.md b/openspec/changes/cli-val-04-acceptance-test-runner/design.md new file mode 100644 index 00000000..08abc8a9 --- /dev/null +++ b/openspec/changes/cli-val-04-acceptance-test-runner/design.md @@ -0,0 +1,47 @@ +## Context + +This change implements the runner that compiles CLI behavior scenarios (from cli-val-01) and anti-patterns (from cli-val-03) into executable tests, supporting both fast in-process and true black-box execution paths. 
+ +## Goals / Non-Goals + +**Goals:** + +- Implement a dual-path runner (CliRunner + subprocess.run) that reads YAML scenarios +- Integrate with pytest collection for standard test reporting +- Support workspace setup, output assertions, and filesystem diff verification +- Create flagship command chain acceptance tests for key workflows +- Add hatch scripts for both execution paths + +**Non-Goals:** + +- No CI gating (that is cli-val-05) +- No scenario file authoring (that is cli-val-01 + cli-val-03) +- No interactive prompt testing (pexpect integration is future scope) +- No Cram/Prysk/Scrut integration in this iteration (evaluate after initial runner proves value) + +## Decisions + +- Runner is a tool (`tools/cli_acceptance_runner.py`) rather than a pytest plugin — keeps it simple and reusable +- Subprocess path uses `subprocess.run()` with `capture_output=True` and `text=True` — standard Python, no external dependencies +- Workspace context setup uses a factory pattern: `empty-repo`, `sample-bundle`, `initialized-project` are predefined fixtures +- Black-box tests use a `@pytest.mark.blackbox` marker for selective execution in CI +- Flagship command chain tests are hand-written in `tests/e2e/` rather than YAML-driven — captures workflow sequencing that YAML scenarios cannot express + +## Risks / Trade-offs + +- [subprocess tests depend on installed binary] -> Mitigation: CI installs via `pip install -e .` before running black-box tests +- [Subprocess tests are slower] -> Mitigation: mark as `blackbox` for selective execution; fast path covers most validation +- [Context setup complexity] -> Mitigation: start with 3 simple context types; extend as needed + +## Migration Plan + +1. Implement scenario loader and dual-path runner +2. Create pytest integration test file +3. Wire hatch scripts for fast and black-box paths +4. Create 3-5 flagship command chain tests +5. 
Verify both paths produce consistent results + +## Open Questions + +- Whether to adopt Cram/Prysk/Scrut for the flagship tests in a future iteration +- Whether the black-box path should test the `specfact` or `specfact-cli` entry point (or both) diff --git a/openspec/changes/cli-val-04-acceptance-test-runner/proposal.md b/openspec/changes/cli-val-04-acceptance-test-runner/proposal.md new file mode 100644 index 00000000..9fb421c0 --- /dev/null +++ b/openspec/changes/cli-val-04-acceptance-test-runner/proposal.md @@ -0,0 +1,41 @@ +# Change: Acceptance Test Runner + +## Why + +The 73+ existing CliRunner tests prove commands work in-process. But SpecFact is distributed as a pip-installable binary (`specfact` and `specfact-cli` entry points). Nothing currently proves the installed binary works — entry point resolution, environment variable handling, exit code propagation to shell, real stdout/stderr separation. A packaging bug could ship a CLI that passes all tests but fails on `pip install specfact-cli && specfact --help`. A dual-path test runner that executes CLI behavior scenarios both in-process (fast) and as a real subprocess (true black-box) closes this gap. 
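
The packaging gap described above can be illustrated with a minimal black-box check. This is a sketch under stated assumptions, not the proposed runner: `run_black_box` and `BlackBoxResult` are hypothetical names, and the only thing assumed about the CLI is that it is reachable as an executable on `PATH` after installation.

```python
from __future__ import annotations

import shutil
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class BlackBoxResult:
    """What a shell user would observe (hypothetical helper type)."""

    exit_code: int
    stdout: str
    stderr: str


def run_black_box(argv: list[str], timeout: float = 30.0) -> BlackBoxResult:
    """Run argv as a real subprocess, resolving argv[0] the way a shell would."""
    executable = shutil.which(argv[0])
    if executable is None:
        # Entry point resolution failed: exactly the kind of bug in-process tests miss.
        raise FileNotFoundError(f"{argv[0]!r} is not installed on PATH")
    completed = subprocess.run(
        [executable, *argv[1:]],
        capture_output=True,  # keep stdout and stderr separate
        text=True,
        timeout=timeout,
        check=False,  # the caller asserts on the exit code itself
    )
    return BlackBoxResult(completed.returncode, completed.stdout, completed.stderr)


if __name__ == "__main__":
    # sys.executable stands in for the installed CLI binary in this demo.
    result = run_black_box([sys.executable, "--version"])
    print(result.exit_code, result.stdout.strip() or result.stderr.strip())
```

In a real black-box run the first element of `argv` would be `specfact` (or `specfact-cli`), so a missing or broken entry point surfaces as a failure before any output assertion is evaluated.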
+ +## What Changes + +- **NEW**: Dual-path scenario runner in `tools/cli_acceptance_runner.py` that reads YAML scenarios from cli-val-01 and executes them via: + - Fast path: `typer.testing.CliRunner` (for development speed) + - Black-box path: `subprocess.run()` against the installed `specfact` binary (for CI/release validation) +- **NEW**: Acceptance test file `tests/e2e/test_cli_acceptance.py` wiring the runner into pytest collection +- **NEW**: Small set of hand-written acceptance tests in `tests/e2e/` for 3-5 flagship command chains (living documentation + executable tests) +- **EXTEND**: `pyproject.toml` with hatch scripts for fast-path and black-box acceptance runs + +## Capabilities + +### New Capabilities + +- `acceptance-test-runner`: A dual-path test runner that executes CLI behavior scenarios from YAML files via CliRunner (fast, in-process) or subprocess (black-box, against installed binary). Supports workspace setup, exit code assertions, output pattern matching, and filesystem diff verification. 
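
The assertion side of the capability described above might look roughly like the following sketch. The field names (`argv`, `exit_code`, `stdout_patterns`) and both function names are assumptions made for illustration; the real scenario schema is defined by cli-val-01 and is not shown in this change. Because the checks take plain values, either execution path could feed them.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Scenario:
    """Hypothetical in-memory form of one YAML scenario."""

    name: str
    argv: list[str]
    exit_code: int = 0
    stdout_patterns: list[str] = field(default_factory=list)


def check_outcome(scenario: Scenario, exit_code: int, stdout: str) -> list[str]:
    """Compare an observed result with the scenario; return failure messages."""
    failures: list[str] = []
    if exit_code != scenario.exit_code:
        failures.append(
            f"{scenario.name}: expected exit code {scenario.exit_code}, got {exit_code}"
        )
    for pattern in scenario.stdout_patterns:
        if re.search(pattern, stdout) is None:
            failures.append(f"{scenario.name}: stdout does not match {pattern!r}")
    return failures


def check_filesystem(workspace: Path, creates: list[str], forbidden: list[str]) -> list[str]:
    """Verify expected files exist and forbidden files are absent in the workspace."""
    failures: list[str] = []
    for rel in creates:
        if not (workspace / rel).exists():
            failures.append(f"expected file missing: {rel}")
    for rel in forbidden:
        if (workspace / rel).exists():
            failures.append(f"forbidden file present: {rel}")
    return failures
```

Returning a list of messages rather than raising on the first mismatch lets a single test report every failed expectation for a scenario at once.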
+ +## Impact + +- **Affected specs**: No existing specs modified; new spec delta defines runner behavior +- **Affected code**: New tools/ and tests/ files only; no production CLI code changes +- **Integration points**: Consumes cli-val-01 YAML scenarios and cli-val-03 anti-patterns; consumed by cli-val-05 (CI gate runs black-box path) +- **Documentation impact**: Developer guide update describing acceptance test workflow and how to add new scenarios + +## Dependencies + +- **Hard blocker**: cli-val-01-behavior-contract-standard (YAML scenario format) +- **Hard blocker**: cli-val-03-misuse-safety-proof (anti-patterns run through the runner) +- Downstream dependents: cli-val-05-ci-integration + +## Source Tracking + + +- **GitHub Issue**: #282 +- **Issue URL**: https://github.com/nold-ai/specfact-cli/issues/282 +- **Repository**: nold-ai/specfact-cli +- **Last Synced Status**: proposed diff --git a/openspec/changes/cli-val-04-acceptance-test-runner/specs/acceptance-test-runner/spec.md b/openspec/changes/cli-val-04-acceptance-test-runner/specs/acceptance-test-runner/spec.md new file mode 100644 index 00000000..b0458add --- /dev/null +++ b/openspec/changes/cli-val-04-acceptance-test-runner/specs/acceptance-test-runner/spec.md @@ -0,0 +1,76 @@ +## ADDED Requirements + +### Requirement: Dual-Path Scenario Execution + +The system SHALL execute CLI behavior scenarios via both in-process (CliRunner) and subprocess (installed binary) paths. + +#### Scenario: Fast path executes scenario via CliRunner + +- **GIVEN** a YAML scenario file with argv, expected exit code, and output patterns +- **WHEN** the runner executes the scenario in fast path mode +- **THEN** the scenario is invoked via `typer.testing.CliRunner` +- **AND** exit code and output assertions are verified +- **AND** execution completes in under 1 second per scenario. 
+ +#### Scenario: Black-box path executes scenario via subprocess + +- **GIVEN** a YAML scenario file and the `specfact` binary is installed on PATH +- **WHEN** the runner executes the scenario in black-box mode +- **THEN** the scenario is invoked via `subprocess.run()` +- **AND** real exit code, stdout, and stderr are captured and asserted +- **AND** the test validates the installed binary, not the source tree. + +#### Scenario: Filesystem diff verification in sandboxed workspace + +- **GIVEN** a scenario with `fs.creates` and `fs.forbidden` expectations +- **WHEN** the runner executes the scenario in a sandboxed `tmp_path` workspace +- **THEN** expected files are verified as created +- **AND** forbidden files are verified as absent +- **AND** the workspace is isolated from the real filesystem. + +### Requirement: YAML Scenario Loading + +The system SHALL load and parse YAML scenario files following the cli-val-01 schema. + +#### Scenario: Runner discovers and loads all scenario files + +- **GIVEN** scenario YAML files exist in `tests/cli-contracts/` +- **WHEN** the runner initializes +- **THEN** all `.scenarios.yaml` files are discovered and parsed +- **AND** both pattern and anti-pattern scenarios are loaded. + +#### Scenario: Runner skips scenarios with unmet context requirements + +- **GIVEN** a scenario requiring context `requires: sample-bundle` and the workspace lacks a sample bundle +- **WHEN** the runner evaluates the scenario +- **THEN** the scenario is skipped with a descriptive message +- **AND** skipped scenarios do not count as failures. + +### Requirement: Pytest Integration + +The system SHALL integrate with pytest so acceptance tests appear in standard test reports. 
+ +#### Scenario: Acceptance tests collected by pytest + +- **GIVEN** `tests/e2e/test_cli_acceptance.py` exists +- **WHEN** `pytest tests/e2e/test_cli_acceptance.py` is run +- **THEN** each scenario appears as a separate test case +- **AND** test names include the scenario name for identification. + +#### Scenario: Black-box tests selectable via marker + +- **GIVEN** acceptance tests with both fast and black-box paths +- **WHEN** `pytest -m "blackbox"` is run +- **THEN** only black-box (subprocess) tests execute +- **AND** fast-path tests are excluded. + +### Requirement: Flagship Command Chain Tests + +The system SHALL include living-documentation acceptance tests for 3-5 flagship command workflows. + +#### Scenario: Init workflow tested end-to-end + +- **GIVEN** an empty temporary directory +- **WHEN** `specfact init` is run followed by a validation command +- **THEN** the full workflow completes successfully +- **AND** the test serves as executable documentation of the expected workflow. diff --git a/openspec/changes/cli-val-04-acceptance-test-runner/tasks.md b/openspec/changes/cli-val-04-acceptance-test-runner/tasks.md new file mode 100644 index 00000000..9b92f7c6 --- /dev/null +++ b/openspec/changes/cli-val-04-acceptance-test-runner/tasks.md @@ -0,0 +1,80 @@ +# Tasks: cli-val-04-acceptance-test-runner + +## TDD / SDD order (enforced) + +Per `openspec/config.yaml`, tests before code for any behavior-changing task. Order: (1) Spec deltas, (2) Tests from scenarios (expect failure), (3) Code last. Do not implement production code until tests exist and have been run (expecting failure). + +--- + +## 1. Create git worktree for this change + +- [ ] 1.1 Fetch latest and create a worktree with a new branch from `origin/dev`. 
+ - [ ] 1.1.1 `git fetch origin` + - [ ] 1.1.2 `git worktree add ../specfact-cli-worktrees/feature/cli-val-04-acceptance-test-runner -b feature/cli-val-04-acceptance-test-runner origin/dev` + - [ ] 1.1.3 Change into the worktree: `cd ../specfact-cli-worktrees/feature/cli-val-04-acceptance-test-runner` + - [ ] 1.1.4 Create a virtual environment: `python -m venv .venv && source .venv/bin/activate && pip install -e ".[dev]"` + - [ ] 1.1.5 `git branch --show-current` (verify correct branch) + - [ ] 1.1.6 Verify cli-val-01 and cli-val-03 are merged into dev before starting. + +## 2. Spec-first preparation + +- [ ] 2.1 Review and finalize `specs/acceptance-test-runner/spec.md` scenarios. +- [ ] 2.2 Verify cli-val-01 scenario files and cli-val-03 anti-patterns are available in the worktree. + +## 3. Test-first: runner tests + +- [ ] 3.1 Create `tests/unit/tools/test_cli_acceptance_runner.py` with tests for: + - YAML scenario loading and parsing + - Fast path (CliRunner) execution and assertion + - Black-box path (subprocess) execution and assertion + - Filesystem diff verification + - Context setup (empty-repo, sample-bundle, initialized-project) +- [ ] 3.2 Run tests and capture failing results in `TDD_EVIDENCE.md`. + +## 4. Implementation: dual-path runner + +- [ ] 4.1 Create `tools/cli_acceptance_runner.py` — scenario loader, dual-path executor, assertion engine. +- [ ] 4.2 Implement workspace context factory (empty-repo, sample-bundle, initialized-project fixtures). +- [ ] 4.3 Create `tests/e2e/test_cli_acceptance.py` — pytest integration wiring scenarios to test cases. +- [ ] 4.4 Add `@pytest.mark.blackbox` marker to `pyproject.toml` markers list. +- [ ] 4.5 Add hatch scripts: `cli-acceptance-fast` (CliRunner path) and `cli-acceptance-blackbox` (subprocess path). +- [ ] 4.6 Re-run tests until passing; record in `TDD_EVIDENCE.md`. + +## 5. Flagship command chain tests + +- [ ] 5.1 Create `tests/e2e/test_cli_chain_init.py` — init workflow end-to-end test. 
+- [ ] 5.2 Create `tests/e2e/test_cli_chain_validate.py` — validate workflow end-to-end test. +- [ ] 5.3 Create `tests/e2e/test_cli_chain_help.py` — help and version end-to-end test. +- [ ] 5.4 Verify all flagship tests pass in both fast and black-box modes. + +## 6. Quality gates and documentation + +- [ ] 6.1 `hatch run format` — ruff format + autofix. +- [ ] 6.2 `hatch run type-check` — basedpyright strict. +- [ ] 6.3 `hatch run lint` — full lint suite. +- [ ] 6.4 `hatch run contract-test` — contract-first validation. +- [ ] 6.5 `hatch run smart-test` — targeted test run. +- [ ] 6.6 Update `docs/` with acceptance test workflow guide. +- [ ] 6.7 Run `openspec validate cli-val-04-acceptance-test-runner --strict` and resolve all issues. + +## 7. Version and changelog + +- [ ] 7.1 Bump minor version in `pyproject.toml`, `setup.py`, `src/specfact_cli/__init__.py`. +- [ ] 7.2 Add CHANGELOG.md entry under new version section with `Added` items for acceptance test runner. + +## 8. Delivery + +- [ ] 8.1 Update `openspec/CHANGE_ORDER.md` with implementation status. +- [ ] 8.2 Stage and commit: `git add . && git commit -m "feat: add CLI acceptance test runner (cli-val-04)"` +- [ ] 8.3 Push: `git push -u origin feature/cli-val-04-acceptance-test-runner` +- [ ] 8.4 Create PR from `feature/cli-val-04-acceptance-test-runner` to `dev` using `gh pr create`. +- [ ] 8.5 Link PR to GitHub issue and project board. 
+ +## Post-merge cleanup (after PR is merged) + +- [ ] Return to primary checkout: `cd .../specfact-cli` +- [ ] `git fetch origin` +- [ ] `git worktree remove ../specfact-cli-worktrees/feature/cli-val-04-acceptance-test-runner` +- [ ] `git branch -d feature/cli-val-04-acceptance-test-runner` +- [ ] `git worktree prune` +- [ ] (Optional) `git push origin --delete feature/cli-val-04-acceptance-test-runner` diff --git a/openspec/changes/cli-val-05-ci-integration/CHANGE_VALIDATION.md b/openspec/changes/cli-val-05-ci-integration/CHANGE_VALIDATION.md new file mode 100644 index 00000000..f86828fd --- /dev/null +++ b/openspec/changes/cli-val-05-ci-integration/CHANGE_VALIDATION.md @@ -0,0 +1,98 @@ +# Change Validation: cli-val-05-ci-integration + +- **Validated on (UTC):** 2026-02-19T12:00:00Z +- **Workflow:** /wf-validate-change (proposal-stage dry-run validation) +- **Strict command:** `openspec validate cli-val-05-ci-integration --strict` +- **Result:** PASS + +## Scope Summary + +- **New capabilities:** cli-validation-ci-gates +- **Modified capabilities:** none +- **Declared dependencies:** cli-val-02-output-snapshot-stability (#280), cli-val-04-acceptance-test-runner (#282) -- both hard blockers +- **Downstream dependents:** cli-val-06-copilot-test-generation (soft dependency for enforcement convention) +- **Proposed affected code paths:** + - `.github/workflows/pr-orchestrator.yml` (extend with new validation steps and job) + - `.github/workflows/snapshot-update.yml` (new manual workflow) + - `tools/contract_first_smart_test.py` (extend with CLI behavior contract tier) + - `pyproject.toml` (extend with combined CLI validation hatch script) + +## Format Validation + +- **proposal.md Format**: PASS + - Title format: Correct (`# Change: CLI Validation CI Integration`) + - Required sections: All present (Why, What Changes, Capabilities, Impact) + - "What Changes" format: Correct (uses EXTEND/NEW markers with bullet list and numbered sub-items) + - "Capabilities" section: 
Present (one capability: `cli-validation-ci-gates`) + - "Impact" format: Correct (lists Affected specs, Affected code, Integration points, Documentation impact) + - Source Tracking section: Present (GitHub Issue #283, repository nold-ai/specfact-cli) + - Note: The proposal lists cli-val-06 under "Downstream dependents" ("enforcement convention"). This is accurate: cli-val-06 has a soft dependency on cli-val-05, not the reverse. +- **tasks.md Format**: PASS + - Section headers: Correct (hierarchical numbered format: `## 1.` through `## 8.`) + - Task format: Correct (`- [ ] 1.1 [Description]`) + - Sub-task format: Correct (`- [ ] 1.1.1 [Description]`) + - TDD order enforced section: Present at top + - Git worktree creation: First task (`## 1.`) + - Blocker verification: Task 1.1.6 verifies cli-val-02 and cli-val-04 are merged before starting (correct -- both hard blockers) + - PR creation: Last implementation task (`## 8. Delivery`, task 8.4) + - Post-merge cleanup: Present + - Quality gate tasks: Present (`## 6.`) + - Version and changelog tasks: Present (`## 7.`) + - Contract-test tier extension: Present (`## 5.` -- extends contract_first_smart_test.py) + - Note: No explicit worktree bootstrap pre-flight tasks. Minor gap, non-blocking. 
+- **specs Format**: PASS + - Given/When/Then format: Verified across all 7 scenarios in `specs/cli-validation-ci-gates/spec.md` + - Four requirements defined: Snapshot Validation Gate (2 scenarios), Black-Box Acceptance Gate (2 scenarios), Tiered Gating Policy (2 scenarios), Contract-Test Tier Extension (1 scenario) + - All scenarios use GIVEN/WHEN/THEN/AND structure correctly +- **design.md Format**: PASS + - Context section: Present + - Goals / Non-Goals: Present (5 goals, 3 non-goals) + - Decisions: Present (6 decisions documented) + - CI Job Dependency Chain: Present (diagram showing job flow) + - Risks / Trade-offs: Present (3 risks with mitigations) + - Migration Plan: Present (6-step plan) + - Open Questions: Present (2 questions) + +## Dependency and Integration Review + +- **CHANGE_ORDER.md consistency**: PASS + - CHANGE_ORDER.md row: `cli-val | 05 | cli-val-05-ci-integration | #283 | #280, #282` + - Blocked by: #280 (cli-val-02), #282 (cli-val-04) + - Wave 1.5 position: Capstone -- after cli-val-02 + cli-val-04 (matches dependency chain) +- **GitHub Issue consistency**: PASS + - Issue #283 is OPEN, title "[Change] CLI Validation CI Integration" + - Issue body states: `**Blocked by**: #280 (cli-val-02), #282 (cli-val-04)` + - Labels: enhancement, change-proposal +- **Cross-artifact dependency consistency**: PASS + - Proposal states two hard blockers: cli-val-02 (snapshot tests), cli-val-04 (acceptance runner) + - CHANGE_ORDER.md shows blockers: #280, #282 + - GitHub issue #283 states: Blocked by #280, #282 + - tasks.md task 1.1.6 includes blocker verification for both + - All four sources agree +- **Transitive dependency note**: cli-val-05 transitively depends on cli-val-01 and cli-val-03 through cli-val-04. This is correctly represented in the CHANGE_ORDER.md wave sequencing (cli-val-05 is the capstone of Wave 1.5). + +## Breaking-Change Analysis + +- **Breaking changes detected:** 0 +- **Interface changes:** None at the production code level. 
This change modifies CI workflow YAML files and extends the contract-test tool. No existing production CLI interfaces, contracts, or APIs are modified. +- **CI workflow changes:** + - `pr-orchestrator.yml` gains new steps and a new job (`cli-acceptance`). These are additive extensions to existing CI infrastructure. + - New `snapshot-update.yml` workflow is manual-trigger only (`workflow_dispatch`). + - `contract_first_smart_test.py` gains a new CLI contract tier. This is additive to the existing tier architecture. +- **Risk assessment:** No breaking-change risk. All changes are additive to CI infrastructure. The existing `tests` job gains new steps, but the behavior of existing steps and jobs is unchanged. + +## Impact Assessment + +- **Impact Level:** Low +- **Code Impact:** No production CLI code changes. CI workflow YAML and contract-test tool extension only. +- **Test Impact:** New unit tests in `tests/unit/tools/` (`test_cli_validation_ci.py`). No changes to existing tests. The existing `tests` job gains new steps and a new sibling job (`cli-acceptance`); existing steps are not modified. +- **Documentation Impact:** CI documentation update describing new gates and snapshot update workflow. +- **Release Impact:** Patch/Minor (additive only, no breaking changes to production code) + +## Validation Outcome + +- Required artifacts are present: `proposal.md`, `design.md`, `specs/cli-validation-ci-gates/spec.md`, `tasks.md`. +- All format checks pass with minor non-blocking observations noted. +- Dependency declarations are consistent across proposal, CHANGE_ORDER.md, GitHub issue #283, and tasks.md blocker verification. +- No breaking changes detected. +- Change is ready for implementation-phase intake once cli-val-02 (#280) and cli-val-04 (#282) are implemented and merged. 
diff --git a/openspec/changes/cli-val-05-ci-integration/design.md b/openspec/changes/cli-val-05-ci-integration/design.md new file mode 100644 index 00000000..03acef00 --- /dev/null +++ b/openspec/changes/cli-val-05-ci-integration/design.md @@ -0,0 +1,58 @@ +## Context + +This change wires the CLI validation layer (snapshots, anti-patterns, acceptance tests) into the existing CI pipeline. It extends `pr-orchestrator.yml` with new jobs and steps, and adds a snapshot update workflow. + +## Goals / Non-Goals + +**Goals:** + +- Add snapshot validation as a hard gate in the existing tests job +- Add black-box acceptance testing as a new job with wheel installation +- Define and document tiered gating (hard vs advisory) +- Add snapshot update workflow for developers +- Extend contract-test system with CLI behavior contract tier + +**Non-Goals:** + +- No new test creation (those are cli-val-01 through cli-val-04) +- No production CLI code changes +- No changes to existing test behavior — only CI pipeline configuration + +## Decisions + +- Snapshot tests run in the existing `tests` job for speed (no separate job overhead) +- Black-box acceptance tests run in a new `cli-acceptance` job that builds and installs the wheel first — true black-box isolation +- The `cli-acceptance` job depends on `tests` (fast path validates first; no point running slow black-box if fast path fails) +- Snapshot update workflow is `workflow_dispatch` only (manual trigger) — prevents accidental automated updates +- Hypothesis edge cases are advisory, not hard gates — avoids blocking PRs on non-deterministic fuzz results +- CLI behavior contracts are added as a new tier in `tools/contract_first_smart_test.py` — fits the existing tiered architecture + +## CI Job Dependency Chain + +``` +tests (existing) → cli-acceptance (new, hard gate) → package-validation (existing) + ↓ + snapshot check (step in tests, hard gate) + anti-pattern safety (step in tests, hard gate) + hypothesis fuzz (step in tests, 
advisory) +``` + +## Risks / Trade-offs + +- [Black-box tests add CI runtime] -> Mitigation: only 3-5 flagship chain tests + YAML scenarios; bounded to ~60s +- [Snapshot updates add PR friction] -> Mitigation: clear workflow documentation; snapshot update job is one-click +- [dev-to-main PR skip may bypass new gates] -> Mitigation: new gates run on all PRs to dev; main skip applies only to already-validated code + +## Migration Plan + +1. Add snapshot check step to `tests` job in pr-orchestrator.yml +2. Add cli-acceptance job to pr-orchestrator.yml +3. Create snapshot update workflow file +4. Extend contract_first_smart_test.py with CLI contract tier +5. Add combined hatch script +6. Test CI changes on a feature branch PR + +## Open Questions + +- Whether to add the CLI acceptance gate to specfact.yml as well (dedicated contract validation workflow) +- Whether dev-to-main PRs should run the full CLI validation or rely on dev-branch validation diff --git a/openspec/changes/cli-val-05-ci-integration/proposal.md b/openspec/changes/cli-val-05-ci-integration/proposal.md new file mode 100644 index 00000000..1d57b19e --- /dev/null +++ b/openspec/changes/cli-val-05-ci-integration/proposal.md @@ -0,0 +1,42 @@ +# Change: CLI Validation CI Integration + +## Why + +Validation only has value if it runs automatically. Today, `pr-orchestrator.yml` runs a `cli-validation` job that only checks `specfact --help` — a single smoke test. The new validation layers (snapshots from cli-val-02, anti-patterns from cli-val-03, acceptance tests from cli-val-04) need to slot into CI with clear gating rules so broken CLI behavior never reaches PyPI. This change wires the complete CLI validation layer into the existing CI pipeline with tiered gates (hard and advisory). + +## What Changes + +- **EXTEND**: `.github/workflows/pr-orchestrator.yml` with three new validation steps: + 1. Snapshot validation step in existing `tests` job (fast, fails on mismatch) + 2. 
Black-box acceptance test job (installs wheel, runs subprocess-path scenarios) + 3. Anti-pattern safety assertion step +- **NEW**: Snapshot update CI workflow (manual trigger) for developers to update snapshots after intentional output changes +- **EXTEND**: `contract-test` system to include CLI behavior contracts as a new tier +- **EXTEND**: `pyproject.toml` hatch scripts with combined CLI validation command + +## Capabilities + +### New Capabilities + +- `cli-validation-ci-gates`: CI pipeline integration for the CLI validation layer — snapshot mismatch detection, black-box acceptance testing, anti-pattern safety verification — with tiered gating (hard gates block merge, advisory gates warn). + +## Impact + +- **Affected specs**: No existing specs modified +- **Affected code**: CI workflow YAML files and pyproject.toml hatch scripts only; no production CLI code changes +- **Integration points**: Consumes cli-val-02 (snapshots), cli-val-03 (anti-patterns), cli-val-04 (acceptance runner) +- **Documentation impact**: CI documentation update describing new gates and snapshot update workflow + +## Dependencies + +- **Hard blocker**: cli-val-02-output-snapshot-stability (snapshot tests to gate) +- **Hard blocker**: cli-val-04-acceptance-test-runner (acceptance tests to gate) +- Downstream dependents: cli-val-06-copilot-test-generation (enforcement convention) + +## Source Tracking + + +- **GitHub Issue**: #283 +- **Issue URL**: https://github.com/nold-ai/specfact-cli/issues/283 +- **Repository**: nold-ai/specfact-cli +- **Last Synced Status**: proposed diff --git a/openspec/changes/cli-val-05-ci-integration/specs/cli-validation-ci-gates/spec.md b/openspec/changes/cli-val-05-ci-integration/specs/cli-validation-ci-gates/spec.md new file mode 100644 index 00000000..94a01108 --- /dev/null +++ b/openspec/changes/cli-val-05-ci-integration/specs/cli-validation-ci-gates/spec.md @@ -0,0 +1,66 @@ +## ADDED Requirements + +### Requirement: Snapshot Validation Gate + +The CI 
pipeline SHALL fail on snapshot mismatches to prevent unintentional output changes. + +#### Scenario: PR fails when snapshots do not match + +- **GIVEN** a PR modifies CLI command output (help text, error messages, structured output) +- **WHEN** the CI pipeline runs snapshot tests without `--snapshot-update` flag +- **THEN** the pipeline fails with a clear error listing mismatched snapshots +- **AND** the PR cannot merge until snapshots are updated or code is fixed. + +#### Scenario: Snapshot update workflow available for intentional changes + +- **GIVEN** a developer intentionally changes CLI output +- **WHEN** the developer triggers the snapshot update workflow +- **THEN** snapshots are regenerated and committed +- **AND** the updated snapshot files appear in the PR diff for review. + +### Requirement: Black-Box Acceptance Gate + +The CI pipeline SHALL run acceptance tests against the installed binary as a hard gate on PRs to main. + +#### Scenario: Black-box tests run against installed wheel + +- **GIVEN** the CI pipeline builds a wheel from the PR branch +- **WHEN** the black-box acceptance job installs the wheel and runs subprocess-path scenarios +- **THEN** all pattern and anti-pattern scenarios pass +- **AND** the installed binary is verified to work as expected. + +#### Scenario: Black-box failure blocks merge to main + +- **GIVEN** a black-box acceptance test fails +- **WHEN** the PR targets main +- **THEN** the merge is blocked +- **AND** the failure is reported with scenario name and expected vs actual output. + +### Requirement: Tiered Gating Policy + +The CI pipeline SHALL distinguish hard gates (block merge) from advisory gates (warn only). + +#### Scenario: Hard gates block merge + +- **GIVEN** a snapshot mismatch, black-box acceptance failure, or anti-pattern safety violation +- **WHEN** the CI pipeline evaluates gates +- **THEN** the PR is blocked from merging +- **AND** the failure reason is visible in PR checks. 
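
The hard/advisory distinction in this requirement can be sketched as a small evaluation function. `Tier`, `GateResult`, and `evaluate_gates` are hypothetical names used only for illustration; the spec fixes the policy (hard gates block merge, advisory gates warn), not how a workflow implements it.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    HARD = "hard"          # failure blocks merge
    ADVISORY = "advisory"  # failure only produces a warning


@dataclass(frozen=True)
class GateResult:
    name: str
    tier: Tier
    passed: bool


def evaluate_gates(results: list[GateResult]) -> tuple[bool, list[str]]:
    """Return (merge_allowed, messages) for a set of gate results."""
    merge_allowed = True
    messages: list[str] = []
    for result in results:
        if result.passed:
            continue
        if result.tier is Tier.HARD:
            merge_allowed = False
            messages.append(f"error: {result.name} failed (hard gate)")
        else:
            messages.append(f"warning: {result.name} failed (advisory gate)")
    return merge_allowed, messages
```

In an actual workflow the same effect would more likely come from job and step configuration than from Python code; the function only makes the decision rule explicit.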
+ +#### Scenario: Advisory gates warn without blocking + +- **GIVEN** a Hypothesis-discovered edge case or coverage threshold miss +- **WHEN** the CI pipeline evaluates gates +- **THEN** a warning annotation is added to the PR +- **AND** the PR is not blocked from merging. + +### Requirement: Contract-Test Tier Extension + +The contract-first test system SHALL include CLI behavior contracts as a recognized tier. + +#### Scenario: CLI validation runs as part of contract-test + +- **GIVEN** the developer runs `hatch run contract-test` +- **WHEN** CLI behavior contract files exist in `tests/cli-contracts/` +- **THEN** CLI scenario validation is included in the test run +- **AND** results appear alongside existing contract/exploration/scenario tiers. diff --git a/openspec/changes/cli-val-05-ci-integration/tasks.md b/openspec/changes/cli-val-05-ci-integration/tasks.md new file mode 100644 index 00000000..2ab03c98 --- /dev/null +++ b/openspec/changes/cli-val-05-ci-integration/tasks.md @@ -0,0 +1,87 @@ +# Tasks: cli-val-05-ci-integration + +## TDD / SDD order (enforced) + +Per `openspec/config.yaml`, tests before code for any behavior-changing task. Order: (1) Spec deltas, (2) Tests from scenarios (expect failure), (3) Code last. Do not implement production code until tests exist and have been run (expecting failure). + +--- + +## 1. Create git worktree for this change + +- [ ] 1.1 Fetch latest and create a worktree with a new branch from `origin/dev`. 
+ - [ ] 1.1.1 `git fetch origin` + - [ ] 1.1.2 `git worktree add ../specfact-cli-worktrees/feature/cli-val-05-ci-integration -b feature/cli-val-05-ci-integration origin/dev` + - [ ] 1.1.3 Change into the worktree: `cd ../specfact-cli-worktrees/feature/cli-val-05-ci-integration` + - [ ] 1.1.4 Create a virtual environment: `python -m venv .venv && source .venv/bin/activate && pip install -e ".[dev]"` + - [ ] 1.1.5 `git branch --show-current` (verify correct branch) + - [ ] 1.1.6 Verify cli-val-02 and cli-val-04 are merged into dev before starting. + +## 2. Spec-first preparation + +- [ ] 2.1 Review and finalize `specs/cli-validation-ci-gates/spec.md` scenarios. +- [ ] 2.2 Review current `pr-orchestrator.yml` job structure for integration points. + +## 3. Test-first: CI gate validation + +- [ ] 3.1 Create `tests/unit/tools/test_cli_validation_ci.py` with tests for: + - Contract-test tier extension recognizes CLI behavior contracts + - Combined CLI validation hatch script runs all tiers +- [ ] 3.2 Run tests and capture failing results in `TDD_EVIDENCE.md`. + +## 4. Implementation: CI workflow changes + +- [ ] 4.1 Add snapshot validation step to `tests` job in `.github/workflows/pr-orchestrator.yml`: + - Run `hatch run snapshot-check` (from cli-val-02) + - Fail on mismatch (hard gate) +- [ ] 4.2 Add anti-pattern safety step to `tests` job: + - Run `hatch run cli-acceptance-fast` for anti-pattern validation + - Fail on safety violations (hard gate) +- [ ] 4.3 Add Hypothesis fuzz step to `tests` job: + - Run Hypothesis tests + - Continue on error (advisory gate) + - Post warning annotation +- [ ] 4.4 Add `cli-acceptance` job to `pr-orchestrator.yml`: + - Build wheel: `hatch build` + - Install wheel: `pip install dist/*.whl` + - Run black-box acceptance: `hatch run cli-acceptance-blackbox` + - Hard gate on failure +- [ ] 4.5 Create `.github/workflows/snapshot-update.yml` — manual workflow_dispatch for updating snapshots. 
+- [ ] 4.6 Re-run tests until passing; record in `TDD_EVIDENCE.md`.
+
+## 5. Contract-test tier extension
+
+- [ ] 5.1 Extend `tools/contract_first_smart_test.py` to include CLI behavior contracts as a new tier.
+- [ ] 5.2 Add `cli-validation` hatch script that runs snapshot check + acceptance fast path + anti-pattern suite.
+- [ ] 5.3 Verify `hatch run contract-test` now includes the CLI validation tier.
+
+## 6. Quality gates and documentation
+
+- [ ] 6.1 `hatch run format` — ruff format + autofix.
+- [ ] 6.2 `hatch run type-check` — basedpyright strict.
+- [ ] 6.3 `hatch run lint` — full lint suite.
+- [ ] 6.4 `hatch run contract-test` — contract-first validation.
+- [ ] 6.5 `hatch run smart-test` — targeted test run.
+- [ ] 6.6 Update `docs/` with CI validation gates documentation and snapshot update guide.
+- [ ] 6.7 Run `openspec validate cli-val-05-ci-integration --strict` and resolve all issues.
+
+## 7. Version and changelog
+
+- [ ] 7.1 Bump minor version in `pyproject.toml`, `setup.py`, `src/specfact_cli/__init__.py`.
+- [ ] 7.2 Add CHANGELOG.md entry under new version section with `Added` items for CI validation gates.
+
+## 8. Delivery
+
+- [ ] 8.1 Update `openspec/CHANGE_ORDER.md` with implementation status.
+- [ ] 8.2 Stage and commit: `git add . && git commit -m "feat: add CLI validation CI gates (cli-val-05)"`
+- [ ] 8.3 Push: `git push -u origin feature/cli-val-05-ci-integration`
+- [ ] 8.4 Create PR from `feature/cli-val-05-ci-integration` to `dev` using `gh pr create`.
+- [ ] 8.5 Link PR to GitHub issue and project board.
+
+## Post-merge cleanup (after PR is merged)
+
+- [ ] Return to primary checkout: `cd .../specfact-cli`
+- [ ] `git fetch origin`
+- [ ] `git worktree remove ../specfact-cli-worktrees/feature/cli-val-05-ci-integration`
+- [ ] `git branch -d feature/cli-val-05-ci-integration`
+- [ ] `git worktree prune`
+- [ ] (Optional) `git push origin --delete feature/cli-val-05-ci-integration`
diff --git a/openspec/changes/cli-val-06-copilot-test-generation/CHANGE_VALIDATION.md b/openspec/changes/cli-val-06-copilot-test-generation/CHANGE_VALIDATION.md
new file mode 100644
index 00000000..ddc81721
--- /dev/null
+++ b/openspec/changes/cli-val-06-copilot-test-generation/CHANGE_VALIDATION.md
@@ -0,0 +1,92 @@
+# Change Validation: cli-val-06-copilot-test-generation
+
+- **Validated on (UTC):** 2026-02-19T12:00:00Z
+- **Workflow:** /wf-validate-change (proposal-stage dry-run validation)
+- **Strict command:** `openspec validate cli-val-06-copilot-test-generation --strict`
+- **Result:** PASS
+
+## Scope Summary
+
+- **New capabilities:** copilot-scenario-generation
+- **Modified capabilities:** none
+- **Declared dependencies:** cli-val-01-behavior-contract-standard (#279) -- hard blocker; cli-val-05-ci-integration (#283) -- soft dependency
+- **Downstream dependents:** none
+- **Proposed affected code paths:**
+  - `resources/prompts/cli-scenario-generation.j2` (new Jinja2 prompt template)
+  - `specfact generate test-prompt` workflow (extend to detect CLI commands)
+  - `docs/` (new contributor page on copilot-driven scenario authoring)
+
+## Format Validation
+
+- **proposal.md Format**: PASS
+  - Title format: Correct (`# Change: Copilot Test Generation for CLI Scenarios`)
+  - Required sections: All present (Why, What Changes, Capabilities, Impact)
+  - "What Changes" format: Correct (uses NEW/EXTEND markers with bullet list)
+  - "Capabilities" section: Present (one capability: `copilot-scenario-generation`)
+  - "Impact" format: Correct (lists Affected specs, Affected code, Integration points, Documentation impact)
+  - Source Tracking section: Present (GitHub Issue #284, repository nold-ai/specfact-cli)
+- **tasks.md Format**: PASS
+  - Section headers: Correct (hierarchical numbered format: `## 1.` through `## 8.`)
+  - Task format: Correct (`- [ ] 1.1 [Description]`)
+  - Sub-task format: Correct (`- [ ] 1.1.1 [Description]`)
+  - TDD order enforced section: Present at top
+  - Git worktree creation: First task (`## 1.`)
+  - Blocker verification: Task 1.1.6 verifies cli-val-01 is merged before starting (correct -- hard blocker)
+  - PR creation: Last implementation task (`## 8. Delivery`, task 8.4)
+  - Post-merge cleanup: Present
+  - Quality gate tasks: Present (`## 6.` -- includes format, type-check, lint, contract-test, smart-test)
+  - Version and changelog tasks: Present (`## 7.`)
+  - Validation with example commands: Present (`## 5.` -- tests template against 3 command types)
+  - Note: Task 1.1.6 correctly verifies only cli-val-01 (hard blocker) and does not require cli-val-05 (soft dependency). This is correct per CHANGE_ORDER.md.
+  - Note: No explicit worktree bootstrap pre-flight tasks. Minor gap, non-blocking.
+- **specs Format**: PASS
+  - Given/When/Then format: Verified across all 5 scenarios in `specs/copilot-scenario-generation/spec.md`
+  - Three requirements defined: CLI Scenario Prompt Template (3 scenarios), Generate Test-Prompt Integration (1 scenario), Convention Enforcement Documentation (1 scenario)
+  - All scenarios use GIVEN/WHEN/THEN/AND structure correctly
+- **design.md Format**: PASS
+  - Context section: Present
+  - Goals / Non-Goals: Present (4 goals, 3 non-goals)
+  - Decisions: Present (4 decisions documented)
+  - Risks / Trade-offs: Present (3 risks with mitigations)
+  - Migration Plan: Present (4-step plan)
+  - Open Questions: Present (2 questions)
+
+## Dependency and Integration Review
+
+- **CHANGE_ORDER.md consistency**: PASS
+  - CHANGE_ORDER.md row: `cli-val | 06 | cli-val-06-copilot-test-generation | #284 | #279 (soft: #283)`
+  - Hard blocked by: #279 (cli-val-01)
+  - Soft dependency: #283 (cli-val-05) -- for enforcement convention
+  - Wave 1.5 position: After cli-val-01 (hard blocker); parallel with cli-val-03
+- **GitHub Issue consistency**: PASS
+  - Issue #284 is OPEN, title "[Change] Copilot Test Generation for CLI Scenarios"
+  - Issue body states: `**Blocked by**: #279 (cli-val-01). Soft dependency: #283 (cli-val-05) for enforcement.`
+  - Labels: enhancement, change-proposal
+- **Cross-artifact dependency consistency**: PASS
+  - Proposal states hard blocker: cli-val-01 (schema the templates generate); soft dependency: cli-val-05 (enforcement)
+  - CHANGE_ORDER.md shows hard blocker: #279; soft: #283
+  - GitHub issue #284 states: Blocked by #279; soft dependency #283
+  - tasks.md task 1.1.6 verifies cli-val-01 only (correct -- only hard blocker needs verification)
+  - All four sources agree on hard vs soft dependency distinction
+
+## Breaking-Change Analysis
+
+- **Breaking changes detected:** 0
+- **Interface changes:** Minimal. This change extends the existing `specfact generate test-prompt` workflow to detect CLI commands and offer a new template option. The extension is additive -- existing template options remain unchanged. The new prompt template in `resources/prompts/` is a new file.
+- **Risk assessment:** No breaking-change risk. The only existing code touched is the `generate test-prompt` workflow, and the modification is purely additive (new option detection, not modification of existing behavior).
+
+## Impact Assessment
+
+- **Impact Level:** Low
+- **Code Impact:** Minor extension to `generate test-prompt` workflow (additive detection logic). New Jinja2 prompt template. No other production CLI code changes.
+- **Test Impact:** New test file in `tests/unit/`. No changes to existing tests.
+- **Documentation Impact:** New contributor page in `docs/` for copilot-driven scenario authoring. Convention documentation for requiring scenario files with CLI command changes.
+- **Release Impact:** Patch/Minor (additive only, no breaking changes)
+
+## Validation Outcome
+
+- Required artifacts are present: `proposal.md`, `design.md`, `specs/copilot-scenario-generation/spec.md`, `tasks.md`.
+- All format checks pass with minor non-blocking observations noted.
+- Dependency declarations are consistent across proposal, CHANGE_ORDER.md, GitHub issue #284, and tasks.md blocker verification. Hard vs soft dependency distinction is correctly maintained.
+- No breaking changes detected.
+- Change is ready for implementation-phase intake once cli-val-01 (#279) is implemented and merged. Soft dependency on cli-val-05 (#283) does not block implementation.
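Reviewer note: the hard vs soft dependency check in CHANGE_VALIDATION.md could be made mechanical. The following is a minimal sketch, assuming every `Blocked by` cell in CHANGE_ORDER.md follows the `#279 (soft: #283)` shape quoted in the row above. `parse_blockers` is a hypothetical helper for illustration only; nothing in this diff confirms that specfact-cli or openspec ships such a function.

```python
from __future__ import annotations

import re

# Assumed cell shape (from the CHANGE_ORDER.md row): "#279 (soft: #283)".
ISSUE = re.compile(r"#(\d+)")
SOFT = re.compile(r"\(soft:([^)]*)\)")


def parse_blockers(cell: str) -> tuple[list[int], list[int]]:
    """Split a 'Blocked by' cell into (hard, soft) issue numbers."""
    soft_match = SOFT.search(cell)
    soft = [int(n) for n in ISSUE.findall(soft_match.group(1))] if soft_match else []
    # Remove the "(soft: ...)" group so its issues are not counted as hard blockers.
    hard = [int(n) for n in ISSUE.findall(SOFT.sub("", cell))]
    return hard, soft


if __name__ == "__main__":
    hard, soft = parse_blockers("#279 (soft: #283)")
    print("hard:", hard, "soft:", soft)
```

With a parser like this, the blocker verification in task 1.1.6 would only need to look at the first list, which matches the rule stated above that only hard blockers must be merged before work starts.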
diff --git a/openspec/changes/cli-val-06-copilot-test-generation/design.md b/openspec/changes/cli-val-06-copilot-test-generation/design.md
new file mode 100644
index 00000000..7e35a7c0
--- /dev/null
+++ b/openspec/changes/cli-val-06-copilot-test-generation/design.md
@@ -0,0 +1,43 @@
+## Context
+
+This change extends the existing test-prompt generation workflow to support CLI behavior scenarios. It creates prompt templates that AI copilots use to generate scenario YAML, anti-pattern catalogs, and acceptance tests for new or modified CLI commands.
+
+## Goals / Non-Goals
+
+**Goals:**
+
+- Create a prompt template in `resources/prompts/` for CLI scenario generation
+- Extend `specfact generate test-prompt` to detect CLI commands and offer the template
+- Document the convention that CLI command changes require scenario files
+- Make scenario authoring a natural part of the copilot-assisted development workflow
+
+**Non-Goals:**
+
+- No new CLI commands or production code beyond prompt template and workflow extension
+- No automatic enforcement (that is cli-val-05)
+- No schema changes (the template generates cli-val-01-compliant YAML)
+
+## Decisions
+
+- Prompt template is a Jinja2 template in `resources/prompts/` — consistent with existing prompt templates
+- Template takes command metadata (name, args, options, types) as input and generates three outputs: scenario YAML, anti-pattern list, and Markdown acceptance test
+- The `generate test-prompt` workflow auto-detects CLI commands by scanning for `@app.command()` and `typer.Typer()` patterns
+- Convention enforcement is documentation-only in this change; CI enforcement comes from cli-val-05
+
+## Risks / Trade-offs
+
+- [Copilot output quality varies] -> Mitigation: template provides structured examples; schema validation (cli-val-01) catches malformed output
+- [Convention may not be followed without enforcement] -> Mitigation: cli-val-05 CI gates enforce scenario presence
+- [Template maintenance as schema evolves] -> Mitigation: template references the schema directly; schema version field enables compatibility checking
+
+## Migration Plan
+
+1. Create prompt template with example inputs and outputs
+2. Extend generate test-prompt workflow to detect CLI commands
+3. Document the convention and workflow in docs/
+4. Test with 2-3 example commands to validate template quality
+
+## Open Questions
+
+- Whether the prompt template should generate scenario files directly to disk or output to stdout for copilot review
+- Whether to include a `specfact generate cli-scenarios` subcommand or rely solely on the existing `generate test-prompt` workflow
diff --git a/openspec/changes/cli-val-06-copilot-test-generation/proposal.md b/openspec/changes/cli-val-06-copilot-test-generation/proposal.md
new file mode 100644
index 00000000..7a72926c
--- /dev/null
+++ b/openspec/changes/cli-val-06-copilot-test-generation/proposal.md
@@ -0,0 +1,39 @@
+# Change: Copilot Test Generation for CLI Scenarios
+
+## Why
+
+SpecFact already has a `generate test-prompt` workflow for creating tests using IDE copilots. Extending this to generate CLI behavior scenarios means every new feature ships with end-user validation from day one — without developers manually writing YAML. This also makes the "patterns + anti-patterns as first-class artifacts" vision operationally sustainable. When the scenario format (cli-val-01) is copilot-native and CI enforces its presence (cli-val-05), the validation layer becomes self-sustaining.
+
+## What Changes
+
+- **NEW**: Prompt template in `resources/prompts/` that generates CLI behavior scenario YAML, anti-pattern catalog, and Markdown acceptance tests in one pass
+- **EXTEND**: Existing `specfact generate test-prompt` workflow to detect CLI commands and offer the behavior scenario template alongside unit test templates
+- **EXTEND**: OpenSpec convention requiring every change that adds or modifies a CLI command to include updated scenario files
+- **EXTEND**: Documentation in `docs/` describing the copilot-driven scenario authoring workflow
+
+## Capabilities
+
+### New Capabilities
+
+- `copilot-scenario-generation`: Prompt templates and workflow integration that enable AI copilots to generate CLI behavior scenario files (patterns + anti-patterns) for new or modified commands, making end-user validation a default part of feature development.
+
+## Impact
+
+- **Affected specs**: No existing specs modified
+- **Affected code**: New prompt template in resources/prompts/; minor extension to generate test-prompt workflow
+- **Integration points**: Consumes cli-val-01 schema (template generates compliant YAML); consumes cli-val-05 convention (CI enforces scenario presence)
+- **Documentation impact**: New contributor guide page on copilot-driven scenario authoring
+
+## Dependencies
+
+- **Hard blocker**: cli-val-01-behavior-contract-standard (schema the templates generate)
+- **Soft dependency**: cli-val-05-ci-integration (enforcement of scenario presence in CI)
+- No downstream dependents
+
+## Source Tracking
+
+
+- **GitHub Issue**: #284
+- **Issue URL**: https://github.com/nold-ai/specfact-cli/issues/284
+- **Repository**: nold-ai/specfact-cli
+- **Last Synced Status**: proposed
diff --git a/openspec/changes/cli-val-06-copilot-test-generation/specs/copilot-scenario-generation/spec.md b/openspec/changes/cli-val-06-copilot-test-generation/specs/copilot-scenario-generation/spec.md
new file mode 100644
index 00000000..ff2373c2
--- /dev/null
+++ b/openspec/changes/cli-val-06-copilot-test-generation/specs/copilot-scenario-generation/spec.md
@@ -0,0 +1,47 @@
+## ADDED Requirements
+
+### Requirement: CLI Scenario Prompt Template
+
+The system SHALL provide a prompt template that generates CLI behavior scenario YAML, anti-pattern catalog, and acceptance test content for a given command.
+
+#### Scenario: Prompt generates valid scenario YAML
+
+- **GIVEN** a CLI command signature (name, arguments, options)
+- **WHEN** the prompt template is applied by a copilot
+- **THEN** the output includes a valid YAML scenario file conforming to the cli-val-01 schema
+- **AND** the file contains at least 3 pattern scenarios and 3 anti-pattern scenarios.
+
+#### Scenario: Prompt generates anti-pattern catalog
+
+- **GIVEN** a CLI command that accepts file paths, enum values, and optional flags
+- **WHEN** the prompt template is applied
+- **THEN** anti-patterns are generated for: missing required args, invalid enum values, nonexistent paths, and forbidden combinations
+- **AND** each anti-pattern includes expected non-zero exit and error pattern.
+
+#### Scenario: Prompt generates Markdown acceptance test
+
+- **GIVEN** a CLI command with a documented workflow
+- **WHEN** the prompt template is applied
+- **THEN** the output includes a Markdown-formatted acceptance test showing the command workflow as a terminal session
+- **AND** the acceptance test includes expected output patterns.
+
+### Requirement: Generate Test-Prompt Integration
+
+The system SHALL extend the existing `specfact generate test-prompt` workflow to offer CLI scenario templates.
+
+#### Scenario: Workflow detects CLI command and offers scenario template
+
+- **GIVEN** the user runs `specfact generate test-prompt` targeting a file containing a Typer command
+- **WHEN** the workflow detects `@app.command()` decorators
+- **THEN** it offers the CLI behavior scenario template as an option alongside unit test templates.
+
+### Requirement: Convention Enforcement Documentation
+
+The system SHALL document the convention that every OpenSpec change modifying CLI commands must include updated scenario files.
+
+#### Scenario: Convention is documented in contributor guide
+
+- **GIVEN** the documentation site at docs.specfact.io
+- **WHEN** a contributor reads the CLI validation guide
+- **THEN** the guide states that scenario files are required for CLI command changes
+- **AND** provides examples of how to generate scenarios using the copilot workflow.
diff --git a/openspec/changes/cli-val-06-copilot-test-generation/tasks.md b/openspec/changes/cli-val-06-copilot-test-generation/tasks.md
new file mode 100644
index 00000000..eba35892
--- /dev/null
+++ b/openspec/changes/cli-val-06-copilot-test-generation/tasks.md
@@ -0,0 +1,76 @@
+# Tasks: cli-val-06-copilot-test-generation
+
+## TDD / SDD order (enforced)
+
+Per `openspec/config.yaml`, tests before code for any behavior-changing task. Order: (1) Spec deltas, (2) Tests from scenarios (expect failure), (3) Code last. Do not implement production code until tests exist and have been run (expecting failure).
+
+---
+
+## 1. Create git worktree for this change
+
+- [ ] 1.1 Fetch latest and create a worktree with a new branch from `origin/dev`.
+  - [ ] 1.1.1 `git fetch origin`
+  - [ ] 1.1.2 `git worktree add ../specfact-cli-worktrees/feature/cli-val-06-copilot-test-generation -b feature/cli-val-06-copilot-test-generation origin/dev`
+  - [ ] 1.1.3 Change into the worktree: `cd ../specfact-cli-worktrees/feature/cli-val-06-copilot-test-generation`
+  - [ ] 1.1.4 Create a virtual environment: `python -m venv .venv && source .venv/bin/activate && pip install -e ".[dev]"`
+  - [ ] 1.1.5 `git branch --show-current` (verify correct branch)
+  - [ ] 1.1.6 Verify cli-val-01 is merged into dev before starting.
+
+## 2. Spec-first preparation
+
+- [ ] 2.1 Review and finalize `specs/copilot-scenario-generation/spec.md` scenarios.
+- [ ] 2.2 Review existing prompt templates in `resources/prompts/` for pattern alignment.
+
+## 3. Test-first: template validation tests
+
+- [ ] 3.1 Create `tests/unit/specfact_cli/test_cli_scenario_prompt.py` with tests for:
+  - Prompt template renders valid YAML for a sample command
+  - Rendered YAML validates against the cli-val-01 schema
+  - Template detects CLI commands by scanning for `@app.command()` patterns
+  - Anti-pattern generation covers required categories
+- [ ] 3.2 Run tests and capture failing results in `TDD_EVIDENCE.md`.
+
+## 4. Implementation: prompt template and workflow extension
+
+- [ ] 4.1 Create `resources/prompts/cli-scenario-generation.j2` — Jinja2 template generating scenario YAML + anti-patterns + Markdown acceptance test.
+- [ ] 4.2 Extend `specfact generate test-prompt` workflow to detect CLI commands and offer scenario template.
+- [ ] 4.3 Re-run tests until passing; record in `TDD_EVIDENCE.md`.
+
+## 5. Validation with example commands
+
+- [ ] 5.1 Test template output against `backlog ceremony standup` command — verify generated YAML passes schema validation.
+- [ ] 5.2 Test template output against `validate` command — verify file I/O scenarios are generated.
+- [ ] 5.3 Test template output against a simple command — verify minimal scenario set is generated.
+
+## 6. Quality gates and documentation
+
+- [ ] 6.1 `hatch run format` — ruff format + autofix.
+- [ ] 6.2 `hatch run type-check` — basedpyright strict.
+- [ ] 6.3 `hatch run lint` — full lint suite.
+- [ ] 6.4 `hatch run contract-test` — contract-first validation.
+- [ ] 6.5 `hatch run smart-test` — targeted test run.
+- [ ] 6.6 Add `docs/` page on copilot-driven CLI scenario authoring workflow.
+- [ ] 6.7 Update contributor guide with convention: CLI command changes require scenario files.
+- [ ] 6.8 Run `openspec validate cli-val-06-copilot-test-generation --strict` and resolve all issues.
+
+## 7. Version and changelog
+
+- [ ] 7.1 Bump minor version in `pyproject.toml`, `setup.py`, `src/specfact_cli/__init__.py`.
+- [ ] 7.2 Add CHANGELOG.md entry under new version section with `Added` items for copilot scenario generation.
+
+## 8. Delivery
+
+- [ ] 8.1 Update `openspec/CHANGE_ORDER.md` with implementation status.
+- [ ] 8.2 Stage and commit: `git add . && git commit -m "feat: add copilot CLI scenario generation (cli-val-06)"`
+- [ ] 8.3 Push: `git push -u origin feature/cli-val-06-copilot-test-generation`
+- [ ] 8.4 Create PR from `feature/cli-val-06-copilot-test-generation` to `dev` using `gh pr create`.
+- [ ] 8.5 Link PR to GitHub issue and project board.
+
+## Post-merge cleanup (after PR is merged)
+
+- [ ] Return to primary checkout: `cd .../specfact-cli`
+- [ ] `git fetch origin`
+- [ ] `git worktree remove ../specfact-cli-worktrees/feature/cli-val-06-copilot-test-generation`
+- [ ] `git branch -d feature/cli-val-06-copilot-test-generation`
+- [ ] `git worktree prune`
+- [ ] (Optional) `git push origin --delete feature/cli-val-06-copilot-test-generation`
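Reviewer note: tasks 3.1 and 4.2 in this tasks.md call for detecting CLI commands by scanning for `@app.command()` patterns, but the diff does not show how. One possible approach is sketched below, assuming the target file is valid Python and that decorator syntax is the only signal needed. It uses the standard-library `ast` module rather than text matching. That is a choice made for this sketch, not something the change specifies, and `find_typer_commands` is a hypothetical name.

```python
from __future__ import annotations

import ast


def find_typer_commands(source: str) -> list[str]:
    """Return names of functions decorated with `@<obj>.command(...)`."""
    names: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        for deco in node.decorator_list:
            # `@app.command()` parses as a Call wrapping an Attribute; a bare
            # attribute decorator is matched too, as a lenient choice here.
            target = deco.func if isinstance(deco, ast.Call) else deco
            if isinstance(target, ast.Attribute) and target.attr == "command":
                names.append(node.name)
                break
    return names


# The sample is only parsed, never executed, so Typer need not be installed.
SAMPLE = '''
import typer

app = typer.Typer()


@app.command()
def standup(team: str) -> None:
    ...


def helper() -> None:
    ...
'''

if __name__ == "__main__":
    print(find_typer_commands(SAMPLE))
```

Parsing the file avoids false positives from comments or strings that merely mention `@app.command()`. The trade-off is that a file with syntax errors raises `SyntaxError` instead of being scanned, so a real implementation would need to decide how to handle that case.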