EmeaAppGbb · kostapetan · Mar 30, 2026
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
@@ -15,7 +15,7 @@ is driven by automated tests generated from those specifications.
 ## Coding Conventions
 - Write minimal code to pass tests — no gold-plating
 - Follow existing patterns in the codebase
-- All code must be covered by tests (unit, integration, or e2e)
+- All code must be covered by tests (unit, integration, or e2e). **Exception:** Brownfield Track B features use manual verification checklists when automated tests are not feasible (see AGENTS.md Testability Gate).
 - No hardcoded secrets — use environment variables
 - No hardcoded URLs — use configuration
 - Error handling: every external call must have error handling
@@ -25,15 +25,41 @@ is driven by automated tests generated from those specifications.
 ## Spec-Driven Development Rules
 - **Never implement features not in the specs** (specs/frd-*.md)
 - **Never modify tests** without human approval — tests are the contract
-- **Always check Gherkin scenarios** before implementing a feature
 - **Never skip tests** — no `test.skip()`, no `xit()`, no `@pending`, no commenting out. Tests are proof that specs are met.
+- **Always check Gherkin scenarios** before implementing a feature
 - **Always run tests** after making changes — the FULL suite, not just "relevant" tests
 - If a spec seems wrong, flag it — do not silently deviate
 
+## Test Discipline Gospel
+Tests are the **proof** that specifications have been implemented correctly. They are not optional, not skippable, and not negotiable. See AGENTS.md §9 for the full protocol.
+
+Key rules:
+- **Tests = proof of spec completion.** A feature without passing tests is not done.
+- **Phase order is sacred.** Tests → Contracts → Implementation → Verify. Never skip or reorder.
+- **All tests must pass before advancing.** No deployment, no commit to main while any test fails.
+- **Regression is mandatory.** After every change, run ALL tests — not just new ones.
+
+### When the User Spots a Problem
+At any point during the flow, if the user identifies something wrong:
+1. **Pause** current work
+2. **Offer** to generate a test that captures the problem
+3. **Present** the test for user approval (mandatory human gate)
+4. **Fix** minimally after approval
+5. **Verify** the test passes and full regression is green
+6. **Resume** the interrupted work
+
+This is the **bug-spot protocol** — it applies in every phase. The user's approval of the test is mandatory because the test defines what "fixed" means.
+
 ## File Organization
 ```
 specs/          → Specifications (PRD, FRDs, Gherkin)
 specs/contracts/→ Contracts (API specs per feature, infrastructure resources)
+specs/ui/       → UI/UX design artifacts (screen-map, design-system, component-inventory, prototypes/, walkthroughs)
+specs/features/ → Gherkin feature files (greenfield + brownfield Track A)
+specs/adrs/     → Architecture Decision Records
+specs/docs/     → Brownfield extraction outputs (technology/, architecture/, testing/)
+specs/assessment/ → Assessment outputs (modernization, security, cloud-native, etc.)
+specs/increment-plan.md → Ordered increment delivery plan
 specs/tech-stack.md → Resolved tech stack (all technology decisions, wiring, deployment)
 e2e/            → Playwright end-to-end tests (integration slice)
 tests/          → Cucumber.js BDD tests (integration slice)
@@ -66,9 +92,11 @@ The agent MUST pause and ask for human approval at these points:
 - After UI/UX design (before increment planning)
 - After increment plan approval (before tech stack resolution)
 - After tech stack resolution (before first increment) — technology choices must be approved
-- Per increment: after Gherkin generation (before BDD test scaffolding)
+- Per increment: after Gherkin generation (before BDD test scaffolding) — **user approves scenarios**
+- Per increment: after test generation (before contracts) — **user approves test code**
 - Per increment: after implementation (before deployment) — via PR review
 - Per increment: after deployment (verify it works)
+- **Bug-spot protocol: user must approve the bug-reproducing test before any fix proceeds**
 
 ## State Management
 - Read `.spec2cloud/state.json` at the start of every session
@@ -139,25 +167,11 @@ specs/
 
 ## Bug Fix Protocol
 - Bug fixes use the `bug-fix` skill — lightweight path through the pipeline
-- Every bug fix MUST: link to an FRD, create a failing test, fix minimally, verify regression
+- Every bug fix MUST: link to an FRD, create a failing test, **get user approval on the test**, fix minimally, verify regression
+- The user approves the test because it defines what "fixed" means — this gate is not optional
 - Commit format: `[bugfix] {frd-id}: {description}`
 - Tracked as micro-increments in state.json
 
-
-## Test Discipline Gospel
-Tests are the **proof** that specifications have been implemented correctly. They are not optional, not skippable, and not negotiable.
-
-Key rules:
-- **Tests = proof of spec completion.** A feature without passing tests is not done.
-- **Phase order is sacred.** Tests → Contracts → Implementation → Verify. Never skip or reorder.
-- **All tests must pass before advancing.** No deployment, no commit to main while any test fails.
-- **Regression is mandatory.** After every change, run ALL tests — not just new ones.
-
-### When the User Spots a Problem
-At any point during the flow, if the user identifies something wrong:
-1. **Pause** current work
-2. **Offer** to generate a test that captures the problem
-
 ## Shell-Specific Extensions
 <!-- Shells should add stack-specific instructions below this line -->
 

diff --git a/.github/skills/api-extractor/SKILL.md b/.github/skills/api-extractor/SKILL.md
@@ -228,3 +228,15 @@ sharedTypes:
    Flask, etc.) in the `path` field.
 7. **Every endpoint.** Missing a route that exists in code is a failure.
    Scan systematically — do not rely on sampling.
+
+## Mandatory Completion Checklist
+
+The orchestrator MUST verify ALL of the following before marking api-extractor as complete:
+
+- [ ] At least one contract file exists in `specs/contracts/api/` in OpenAPI-compatible YAML format
+- [ ] Every HTTP route/endpoint found in source code is documented (method, path, parameters)
+- [ ] Request and response schemas are documented where determinable; noted as "not determined" otherwise
+- [ ] Authentication/authorization patterns are identified per endpoint (e.g., JWT, API key, public)
+- [ ] Error response shapes are documented where the code defines them
+
+**BLOCKING**: If any item is unchecked, the skill has NOT completed successfully. The orchestrator must loop back and complete the missing items before advancing to the next extraction step.
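The BLOCKING rule attached to this checklist (and to the other skill checklists in this change) amounts to a gate: one unchecked item stops the orchestrator from advancing. The TypeScript sketch below illustrates that gate under the assumption that progress is recorded as Markdown task-list items (`- [ ]` / `- [x]`). `readChecklist` and `canAdvance` are hypothetical names and are not part of the spec2cloud skills.

```typescript
// Hypothetical helper: illustrates the BLOCKING gate, not the real orchestrator.
interface ChecklistState {
  total: number;       // number of task-list items found
  unchecked: string[]; // labels of items still marked "- [ ]"
}

function readChecklist(markdown: string): ChecklistState {
  const state: ChecklistState = { total: 0, unchecked: [] };
  for (const line of markdown.split(/\r?\n/)) {
    // Matches "- [ ] label" and "- [x] label" (upper- or lower-case x).
    const match = /^\s*- \[( |x|X)\] (.+)$/.exec(line);
    if (!match) continue;
    state.total += 1;
    if (match[1] === " ") state.unchecked.push(match[2].trim());
  }
  return state;
}

// The skill may only be marked complete when a checklist exists
// and every item on it has been checked off.
function canAdvance(markdown: string): boolean {
  const state = readChecklist(markdown);
  return state.total > 0 && state.unchecked.length === 0;
}
```

Treating an empty checklist as "cannot advance" is a design choice of this sketch: a missing checklist is more likely a skipped step than a finished one.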
diff --git a/.github/skills/architecture-mapper/SKILL.md b/.github/skills/architecture-mapper/SKILL.md
@@ -245,3 +245,15 @@ _Extracted on [date]._
    external API, queue) is a failure. Trace every outbound connection.
 7. **Ambiguity is okay.** If you cannot determine a pattern or relationship
    from the code, say "not determinable" rather than guessing.
+
+## Mandatory Completion Checklist
+
+The orchestrator MUST verify ALL of the following before marking architecture-mapper as complete:
+
+- [ ] `specs/docs/architecture/overview.md` exists with a high-level Mermaid component diagram
+- [ ] `specs/docs/architecture/components.md` exists with per-component detail (responsibility, dependencies, interfaces)
+- [ ] All integration points are cataloged (databases, external APIs, queues, caches, file systems)
+- [ ] Data flow between components is documented with a Mermaid sequence or flow diagram
+- [ ] Layer boundaries (if any) are identified (e.g., controller → service → repository)
+
+**BLOCKING**: If any item is unchecked, the skill has NOT completed successfully. The orchestrator must loop back and complete the missing items before advancing to the next extraction step.
diff --git a/.github/skills/audit-log/SKILL.md b/.github/skills/audit-log/SKILL.md
@@ -39,3 +39,15 @@ Append every significant action to `.spec2cloud/audit.log`. Never overwrite —
 ```
 [2026-02-09T14:40:00Z] phase=deployment action=azd-provision result=error message="quota exceeded in eastus"
 ```
+
+## Mandatory Completion Checklist
+
+Every audit log entry MUST contain ALL of the following fields:
+
+- [ ] ISO-8601 timestamp (e.g., `2026-02-09T14:40:00Z`)
+- [ ] Context identifier: `phase=` or `increment=` (what work is being done)
+- [ ] `action=` field (what specific action was taken)
+- [ ] `result=` field (outcome: `done`, `error`, `approved`, `rejected`, etc.)
+- [ ] `message=` field for errors and rejections (human-readable detail)
+
+**BLOCKING**: An audit entry missing any required field makes the audit trail incomplete. The orchestrator must ensure every append follows this schema.
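The required-field list above can be checked mechanically. The following TypeScript sketch validates a single audit log line against that list. It assumes the line format of the example entry in this file (bracketed UTC timestamp, space-separated `key=value` pairs, double-quoted `message`). `checkAuditEntry` is a hypothetical helper, not an existing part of the audit-log skill.

```typescript
// Hypothetical validator for one line of .spec2cloud/audit.log.
interface AuditCheck {
  ok: boolean;
  missing: string[]; // names of required fields that were not found
}

function checkAuditEntry(line: string): AuditCheck {
  const missing: string[] = [];

  // ISO-8601 UTC timestamp in square brackets at the start of the line.
  if (!/^\[\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z\]/.test(line)) {
    missing.push("timestamp");
  }
  // Context identifier: either phase= or increment=.
  if (!/\b(phase|increment)=\S+/.test(line)) missing.push("context");
  if (!/\baction=\S+/.test(line)) missing.push("action");

  const result = /\bresult=(\S+)/.exec(line);
  if (!result) missing.push("result");

  // message= is required only for errors and rejections.
  const needsMessage =
    result !== null && (result[1] === "error" || result[1] === "rejected");
  if (needsMessage && !/\bmessage="[^"]+"/.test(line)) missing.push("message");

  return { ok: missing.length === 0, missing };
}
```

Run against the example entry shown earlier in this file, the check passes; removing the `message="..."` part from that entry makes it report `message` as missing.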
diff --git a/.github/skills/azure-deployment/SKILL.md b/.github/skills/azure-deployment/SKILL.md
@@ -146,3 +146,17 @@ If smoke tests fail after deployment:
 See `references/stack-deployment.md` for AZD service structure, Container Apps
 configuration, Dockerfile details, infrastructure templates, environment
 variables, and smoke test commands.
+
+## Mandatory Completion Checklist
+
+The orchestrator MUST verify ALL of the following before marking azure-deployment as complete:
+
+- [ ] Full regression suite passes locally before deployment
+- [ ] `azd provision` completes without errors
+- [ ] `azd deploy` completes without errors
+- [ ] Smoke tests pass against the deployed endpoints (not localhost)
+- [ ] HTTPS is enabled on all deployed endpoints
+- [ ] Application logs show no startup errors (check via `azd monitor`)
+- [ ] State JSON and audit log are updated with deployment URLs and status
+
+**BLOCKING**: If any item is unchecked, the skill has NOT completed successfully. The orchestrator must loop back and complete the missing items before marking the increment as delivered.
diff --git a/.github/skills/bug-fix/SKILL.md b/.github/skills/bug-fix/SKILL.md
@@ -216,3 +216,18 @@ Append to `.spec2cloud/audit.log`:
 - **Minimal change only.** Refactoring goes in separate increments.
 - **No silent fixes.** Every fix is committed, logged, and tracked in state.
 - **Regression must pass.** If the fix breaks other tests, the fix is wrong.
+
+## Mandatory Completion Checklist
+
+The orchestrator MUST verify ALL of the following before marking bug-fix as complete:
+
+- [ ] Bug is linked to a specific FRD (or flagged as a feature request if no FRD exists)
+- [ ] A failing test exists that reproduces the bug
+- [ ] Human approved the reproduction test (mandatory gate)
+- [ ] The fix is minimal (< 50 lines; larger fixes escalated to full increments)
+- [ ] The reproduction test now passes
+- [ ] Full regression suite passes (no other tests broken)
+- [ ] State JSON records the micro-increment; audit log has the fix entry
+- [ ] Commit uses `[bugfix] {frd-id}: {description}` format
+
+**BLOCKING**: If any item is unchecked, the skill has NOT completed successfully. The orchestrator must loop back and complete the missing items.
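The commit-format item in the checklist above is easy to verify automatically. The TypeScript sketch below parses a commit subject of the form `[bugfix] {frd-id}: {description}`. It assumes the FRD id contains no whitespace or colon; the diff does not specify the id syntax, so that constraint is a guess. `parseBugfixCommit` is a hypothetical name.

```typescript
// Hypothetical parser for the "[bugfix] {frd-id}: {description}" subject line.
interface BugfixCommit {
  frdId: string;
  description: string;
}

function parseBugfixCommit(subject: string): BugfixCommit | null {
  // Assumption: the FRD id has no whitespace and no colon.
  const match = /^\[bugfix\] ([^\s:]+): (.+)$/.exec(subject);
  if (!match) return null;
  return { frdId: match[1], description: match[2] };
}
```

A subject such as `[bugfix] frd-auth: reject expired tokens` (an invented example) parses into its two parts, while a subject without the prefix returns `null`.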
diff --git a/.github/skills/build-check/SKILL.md b/.github/skills/build-check/SKILL.md
@@ -38,3 +38,14 @@ Details:
 
 - Web (Next.js) requires `output: 'standalone'` in `next.config.ts`
 - API uses TypeScript with Express
+
+## Mandatory Completion Checklist
+
+The orchestrator MUST verify ALL of the following before marking build-check as complete:
+
+- [ ] Every service (API, Web) has been built — none skipped
+- [ ] Each service reports PASS or FAIL with error/warning counts
+- [ ] If any service fails to build, the failure details (file, line, message) are included
+- [ ] A partial build (one service passes, another not attempted) is reported as "incomplete" — not as PASS
+
+**BLOCKING**: A build-check with any FAIL or incomplete service means the codebase is not ready for testing or deployment. The orchestrator must fix build errors before proceeding.
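The distinction the checklist draws between FAIL and "incomplete" can be stated as a small aggregation rule. The TypeScript sketch below illustrates it under these assumptions: the list of expected services is known up front, and a service that was never built simply has no entry in the results. The type and function names are hypothetical.

```typescript
// Hypothetical aggregation of per-service build results.
type BuildResult = "PASS" | "FAIL";
type OverallStatus = "PASS" | "FAIL" | "incomplete";

function overallBuildStatus(
  expectedServices: string[],
  results: { [service: string]: BuildResult | undefined }
): OverallStatus {
  // With nothing to build there is nothing to report as passing.
  if (expectedServices.length === 0) return "incomplete";

  let passed = 0;
  for (const service of expectedServices) {
    const result = results[service];
    if (result === "FAIL") return "FAIL"; // any failure fails the whole check
    if (result === "PASS") passed += 1;
  }
  // A service with no result was not attempted, so the run is incomplete.
  return passed === expectedServices.length ? "PASS" : "incomplete";
}
```

For example, with expected services `["API", "Web"]` and a result only for `API`, the overall status is `"incomplete"` rather than `"PASS"`.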
diff --git a/.github/skills/cloud-native-assessment/SKILL.md b/.github/skills/cloud-native-assessment/SKILL.md
@@ -161,3 +161,16 @@ Generate ADRs via the `adr` skill when the assessment reveals service selection
 - Azure service recommendations should consider cost. Don't recommend AKS when Container Apps suffices.
 - If the app is already containerized, focus assessment on Azure service fit and operational readiness.
 - Monolith decomposition is out of scope. Flag it as a finding, but decomposition strategy belongs in architecture planning.
+
+## Mandatory Completion Checklist
+
+The orchestrator MUST verify ALL of the following before marking cloud-native-assessment as complete:
+
+- [ ] `specs/assessment/cloud-native.md` exists with: 12-factor compliance scorecard, containerization readiness, and Azure service fit analysis
+- [ ] Each 12-factor principle is rated (compliant / partially compliant / non-compliant) with evidence
+- [ ] Azure service recommendations are provided with cost tier awareness (e.g., Container Apps vs AKS)
+- [ ] Configuration externalization gaps are identified (hardcoded values, file-based config)
+- [ ] At least one ADR exists in `specs/adrs/` for significant platform/service decisions
+- [ ] State JSON and audit log are updated
+
+**BLOCKING**: If any item is unchecked, the skill has NOT completed successfully. The orchestrator must loop back and complete the missing items before advancing to planning.
diff --git a/.github/skills/cloud-native-planner/SKILL.md b/.github/skills/cloud-native-planner/SKILL.md
@@ -239,3 +239,16 @@ proceeds through the standard Phase 2 pipeline:
 2. **Contract generation** — update infra contracts for new Azure resources
 3. **Implementation** — build Dockerfiles, IaC, CI/CD workflows
 4. **Build & deploy** — verify containerized app builds and deploys to Azure
+
+## Mandatory Completion Checklist
+
+The orchestrator MUST verify ALL of the following before marking cloud-native-planner as complete:
+
+- [ ] `specs/increment-plan.md` is updated with all cloud-native increments (unique IDs, scope, dependencies, effort)
+- [ ] Infrastructure increments are ordered: containerization → config externalization → IaC → observability → CI/CD
+- [ ] Every new Azure resource has a corresponding entry in `specs/contracts/infra/resources.yaml`
+- [ ] Security defaults are planned (non-root containers, managed identity, Key Vault for secrets)
+- [ ] All increments are consistent with existing ADRs for Azure service choices
+- [ ] State JSON and audit log are updated
+
+**BLOCKING**: If any item is unchecked, the skill has NOT completed successfully. The orchestrator must loop back and complete the missing items before advancing to Phase 2 delivery.
diff --git a/.github/skills/codebase-scanner/SKILL.md b/.github/skills/codebase-scanner/SKILL.md
@@ -206,3 +206,15 @@ _Extracted on [date]. This is a factual inventory of the project as it exists._
    files (e.g., `"^18.2.0"` not just `"18"`).
 7. **No inferencing beyond the code.** If you can't determine something from the
    files, say "not determinable from source" — do not guess.
+
+## Mandatory Completion Checklist
+
+The orchestrator MUST verify ALL of the following before marking codebase-scanner as complete:
+
+- [ ] `specs/docs/technology/stack.md` exists and contains: languages, frameworks, package managers, build tools, and entry points
+- [ ] Every language detected has version constraints documented (from manifest files)
+- [ ] Every framework detected has its role documented (web, API, testing, etc.)
+- [ ] Entry points are identified for each application boundary
+- [ ] Monorepo workspaces (if any) are inventoried separately
+
+**BLOCKING**: If any item is unchecked, the skill has NOT completed successfully. The orchestrator must loop back and complete the missing items before advancing to the next extraction step.
diff --git a/.github/skills/commit-protocol/SKILL.md b/.github/skills/commit-protocol/SKILL.md
@@ -40,3 +40,15 @@ Each increment is a self-contained delivery. Commits at increment boundaries mea
 - `git log --oneline --grep="\[increment\]"` shows the delivery timeline
 - Each delivered increment is a revertable, deployable unit
 - Mid-increment commits at slice granularity create resumable checkpoints
+
+## Mandatory Completion Checklist
+
+The orchestrator MUST verify ALL of the following for every commit:
+
+- [ ] Commit message follows the format table above (correct prefix for the phase/step)
+- [ ] `.spec2cloud/state.json` is included in the commit
+- [ ] `.spec2cloud/audit.log` is included in the commit
+- [ ] No secrets, `.env` files, or `node_modules` are staged
+- [ ] Co-authored-by trailer is present
+
+**BLOCKING**: A commit without state.json and audit.log breaks resume capability. The orchestrator must include them before finalizing the commit.
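Part of this checklist can be enforced by inspecting the list of staged paths before committing. The TypeScript sketch below covers three of the items (state.json included, audit.log included, no `.env` or `node_modules` staged). It does not attempt secret detection or trailer checks. `commitProblems` is a hypothetical helper, and how the staged paths are obtained (for example from `git diff --cached --name-only`) is left to the caller.

```typescript
// Hypothetical pre-commit check over repository-relative staged paths.
const REQUIRED_FILES = [".spec2cloud/state.json", ".spec2cloud/audit.log"];

function commitProblems(stagedPaths: string[]): string[] {
  const problems: string[] = [];

  // state.json and audit.log must travel with every commit.
  for (const required of REQUIRED_FILES) {
    if (stagedPaths.indexOf(required) === -1) {
      problems.push("missing " + required);
    }
  }

  for (const path of stagedPaths) {
    const segments = path.split("/");
    const fileName = segments[segments.length - 1];
    if (segments.indexOf("node_modules") !== -1) {
      problems.push("node_modules staged: " + path);
    } else if (fileName === ".env" || fileName.indexOf(".env.") === 0) {
      // Assumption: every .env variant is treated as sensitive.
      problems.push(".env file staged: " + path);
    }
  }
  return problems;
}
```

A project that deliberately commits a template such as `.env.example` would need an allowlist on top of this sketch.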
diff --git a/.github/skills/contract-generation/SKILL.md b/.github/skills/contract-generation/SKILL.md
@@ -149,3 +149,15 @@ Append to `.spec2cloud/audit.log`:
 ```
 [ISO-timestamp] increment={id} step=contracts action=contracts-generated result=done
 ```
+
+## Mandatory Completion Checklist
+
+The orchestrator MUST verify ALL of the following before marking contract-generation as complete:
+
+- [ ] API contracts exist in `specs/contracts/api/` for every endpoint in the increment
+- [ ] Shared types in `src/shared/types/` cover all data structures used across API and Web
+- [ ] `specs/contracts/infra/resources.yaml` is updated with any NEW infrastructure needs for this increment (new Azure resources, new model deployments, new env vars)
+- [ ] Any new Azure resources needed are flagged for provisioning and have corresponding Bicep definitions in `infra/`
+- [ ] Any new environment variables are documented in both `resources.yaml` (per-resource env_vars) and `specs/tech-stack.md` (per-increment map)
+
+**BLOCKING**: If any item is unchecked, the skill has NOT completed successfully. The orchestrator must loop back and complete the missing items before advancing to implementation.
diff --git a/.github/skills/data-model-extractor/SKILL.md b/.github/skills/data-model-extractor/SKILL.md
@@ -261,3 +261,15 @@ _Extracted on [date]. Documents the data layer as defined in code._
 8. **Relationship accuracy.** Every foreign key and relationship decorator
    must appear in the output. Verify by checking both sides of bidirectional
    relationships.
+
+## Mandatory Completion Checklist
+
+The orchestrator MUST verify ALL of the following before marking data-model-extractor as complete:
+
+- [ ] `specs/docs/architecture/data-models.md` exists with a Mermaid ERD diagram
+- [ ] Every ORM model, migration file, and schema definition in the codebase is covered
+- [ ] All entities have their fields, types, and constraints documented
+- [ ] All relationships (foreign keys, one-to-many, many-to-many) are documented with both sides verified
+- [ ] Index definitions are cataloged where present in schema files
+
+**BLOCKING**: If any item is unchecked, the skill has NOT completed successfully. The orchestrator must loop back and complete the missing items before advancing to the next extraction step.
diff --git a/.github/skills/dependency-inventory/SKILL.md b/.github/skills/dependency-inventory/SKILL.md
@@ -208,3 +208,15 @@ _Extracted on [date]. This is a factual catalog of all declared dependencies._
    then produce a unified summary.
 8. **Transitive accuracy.** Only report transitive relationships you can verify
    from lock files. Do not guess transitive trees.
+
+## Mandatory Completion Checklist
+
+The orchestrator MUST verify ALL of the following before marking dependency-inventory as complete:
+
+- [ ] `specs/docs/technology/dependencies.md` exists and contains: all direct dependencies with versions, purposes, and license info
+- [ ] Every package manifest file in the project is covered (package.json, requirements.txt, *.csproj, etc.)
+- [ ] Lock file versions are used where available; absence of lock file is noted
+- [ ] Dependency relationships (which packages depend on which) are documented
+- [ ] Monorepo workspaces (if any) have per-workspace inventories plus a unified summary
+
+**BLOCKING**: If any item is unchecked, the skill has NOT completed successfully. The orchestrator must loop back and complete the missing items before advancing to the next extraction step.
diff --git a/.github/skills/e2e-generation/SKILL.md b/.github/skills/e2e-generation/SKILL.md
@@ -229,3 +229,17 @@ After completing e2e generation:
    [TIMESTAMP] e2e-generation: All specs compile and are listed ✅
    ```
 3. Commit all generated files with message: `[e2e-gen] scaffold e2e tests for all flows`
+
+## Mandatory Completion Checklist
+
+The orchestrator MUST verify ALL of the following before marking e2e-generation as complete:
+
+- [ ] At least one Playwright spec file (`e2e/*.spec.ts`) exists for every user flow in `specs/ui/flow-walkthrough.md`
+- [ ] A Page Object Model (`e2e/pages/*.page.ts`) exists for every screen in `specs/ui/screen-map.md`
+- [ ] Every POM uses `data-testid` selectors from `specs/ui/component-inventory.md` (not CSS classes or XPath)
+- [ ] `e2e/playwright.config.ts` exists and is configured for the Aspire environment
+- [ ] All spec files compile successfully (`npx playwright test --list` returns all specs without errors)
+- [ ] Navigation flows between pages are tested (not just individual page interactions)
+- [ ] State JSON and audit log are updated
+
+**BLOCKING**: If any item is unchecked, the skill has NOT completed successfully. The orchestrator must loop back and complete the missing items before advancing to Gherkin generation.
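To make the `data-testid` requirement concrete, here is a minimal Page Object Model sketch in TypeScript. The screen (`LoginPage`) and the test ids (`login-email`, `login-password`, `login-submit`) are invented for illustration; real ids would come from `specs/ui/component-inventory.md`. To keep the block self-contained, `PageLike` and `LocatorLike` stand in for the small part of Playwright's `Page` and `Locator` that is used. In a real `e2e/pages/*.page.ts` file those types would be imported from `@playwright/test`, whose `Page` should satisfy `PageLike` structurally.

```typescript
// Stand-ins for the subset of Playwright's Locator and Page used below.
interface LocatorLike {
  fill(value: string): Promise<void>;
  click(): Promise<void>;
}

interface PageLike {
  getByTestId(testId: string): LocatorLike;
}

// Hypothetical page object: every element is located by data-testid,
// never by CSS class or XPath.
class LoginPage {
  private readonly email: LocatorLike;
  private readonly password: LocatorLike;
  private readonly submit: LocatorLike;

  constructor(page: PageLike) {
    this.email = page.getByTestId("login-email");
    this.password = page.getByTestId("login-password");
    this.submit = page.getByTestId("login-submit");
  }

  // One user-level action; specs call this instead of touching locators.
  async signIn(email: string, password: string): Promise<void> {
    await this.email.fill(email);
    await this.password.fill(password);
    await this.submit.click();
  }
}
```

Keeping the selectors inside the page object means that a renamed test id is fixed in one place rather than in every spec that uses the screen.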