Roxabi · MickaelV0 · May 13, 2026
diff --git a/docs/release-convention.md b/docs/release-convention.md
@@ -0,0 +1,49 @@
+# Release Convention
+
+## Tag format
+
+```
+<component>/vX.Y.Z     # monorepo subdir package (e.g. roxabi-nats/v1.2.3)
+vX.Y.Z                 # single-package repo (e.g. v0.5.0)
+```
+
+PRs: merge-commit only (¬squash) — squash causes history divergence on next promotion.
+
+## Branch convention for uv git deps
+
+Roxabi Python repos consume cross-repo deps via `[tool.uv.sources]` in `pyproject.toml`.
+
+| Branch | Ref style | When |
+|--------|-----------|------|
+| `staging` | `branch = "staging"` | Development — tracks latest staging SHA |
+| `main` | `tag = "vX.Y.Z"` | Production — pinned to exact release tag |
+
+This means `pyproject.toml` on `staging` uses `branch=`, and on `main` uses `tag=`. The swap is automated by `/promote` (Step 1b — pin-swap phase).
+
+## `/promote` pin-swap phase
+
+At promotion time (staging→main), `/promote` automatically:
+
+1. Detects `[tool.uv.sources]` entries with `branch=`
+2. Resolves the SHA pinned in `uv.lock` to a release tag on the remote (`git ls-remote --tags`)
+3. Shows a DP(A) diff: `branch=staging → tag=vX.Y.Z`
+4. On Apply: rewrites `pyproject.toml`, regenerates `uv.lock`, stages both
+
+If no release tag exists at the locked SHA, promotion FAILS with:
+
+```
+FAIL: No release tag found at <pkg>@<sha8>.
+Cut a release tag (e.g. <pkg>/vX.Y.Z) at <sha8> upstream first.
+```
+
+This is intentional friction — promotion must ship exactly what staging tested.
+
+## Scope
+
+uv-only (`[tool.uv.sources]`). pip / poetry / pnpm deferred until a real consumer appears.
+
+## References
+
+- `/promote` SKILL.md — Step 1b full spec
+- `lib/pin-swap.ts` — implementation (pure functions, I/O-injected)
+- `__tests__/pin-swap.test.ts` — unit tests
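
The pin-swap phase described in `docs/release-convention.md` can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in `lib/pin-swap.ts` (which this diff does not show): the function names `parseLsRemoteTags`, `resolveReleaseTag`, and `swapBranchForTag`, the release-tag regex, and the example package and URL are all hypothetical. It takes the text output of `git ls-remote --tags` as a plain string so the logic stays free of I/O.

```typescript
// Hypothetical sketch of pin-swap resolution; names are illustrative only.

// Matches vX.Y.Z and <component>/vX.Y.Z (assumed shape of a release tag).
const RELEASE_TAG = /^(?:[\w.-]+\/)?v\d+\.\d+\.\d+$/

/** Parse `git ls-remote --tags` output into a map of SHA -> tag names. */
function parseLsRemoteTags(output: string): Map<string, string[]> {
  const bySha = new Map<string, string[]>()
  for (const line of output.split('\n')) {
    const [sha, ref] = line.trim().split(/\s+/)
    if (!sha || !ref || !ref.startsWith('refs/tags/')) continue
    // Annotated tags add a peeled "^{}" entry whose SHA is the tagged commit.
    const tag = ref.slice('refs/tags/'.length).replace(/\^\{\}$/, '')
    const tags = bySha.get(sha) ?? []
    if (!tags.includes(tag)) tags.push(tag)
    bySha.set(sha, tags)
  }
  return bySha
}

/** Return a release tag at the locked SHA, or throw the FAIL message. */
function resolveReleaseTag(pkg: string, lockedSha: string, lsRemoteOutput: string): string {
  const candidates = (parseLsRemoteTags(lsRemoteOutput).get(lockedSha) ?? []).filter((t) =>
    RELEASE_TAG.test(t),
  )
  if (candidates.length === 0) {
    const sha8 = lockedSha.slice(0, 8)
    throw new Error(
      `FAIL: No release tag found at ${pkg}@${sha8}.\n` +
        `Cut a release tag (e.g. ${pkg}/vX.Y.Z) at ${sha8} upstream first.`,
    )
  }
  return candidates[0]
}

/** Rewrite one `[tool.uv.sources]` entry from branch= to tag=. */
function swapBranchForTag(sourceLine: string, tag: string): string {
  return sourceLine.replace(/branch\s*=\s*"[^"]*"/, `tag = "${tag}"`)
}

// Usage with a made-up SHA, remote listing, and source entry.
const lockedSha = 'a1b2c3d4e5f60718293a4b5c6d7e8f9012345678'
const listing = [
  '9f8e7d6c5b4a39281706f5e4d3c2b1a098765432\trefs/tags/v0.5.0',
  `${lockedSha}\trefs/tags/v0.5.0^{}`,
].join('\n')
const entry = 'my-pkg = { git = "https://example.invalid/my-pkg.git", branch = "staging" }'
console.log(swapBranchForTag(entry, resolveReleaseTag('my-pkg', lockedSha, listing)))
// prints: my-pkg = { git = "https://example.invalid/my-pkg.git", tag = "v0.5.0" }
```

Throwing when no tag matches mirrors the documented behaviour that promotion fails rather than falling back to a branch or a raw SHA.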
diff --git a/plugins/dev-core/skills/frame/README.md b/plugins/dev-core/skills/frame/README.md
@@ -19,18 +19,23 @@ Triggers: `"frame"` | `"frame this"` | `"what's the problem"` | `"define the pro
 
 1. **Parse + Seed** — reads the GitHub issue (title, body, labels) or free text as context.
 2. **Interview** — asks 3–5 focused questions (skips what's already clear from the issue body): problem/pain, affected users, constraints, out-of-scope, related work.
-3. **Tier detection** — infers S / F-lite / F-full from complexity signals (file count, domain breadth, unknowns); lets you override.
-4. **Write frame doc** — creates `artifacts/frames/{N}-{slug}-frame.mdx` with status: `draft`.
-5. **User approval** — presents the frame for confirmation; loops on revisions until approved.
-6. **Commit + status update** — sets issue status to `Analysis` and commits the artifact.
+3. **Premise-validity gate** — required before tier classification. Captures three fields:
+   - `success_in_6mo` — what does success look like? (concrete, observable)
+   - `failure_in_6mo` — what does failure look like? (must be falsifiable)
+   - `simplest_alternative` + why it's insufficient — forces explicit comparison against the minimal solution
+   Cannot proceed without all three. A non-falsifiable failure mode is rejected and re-asked; a failure mode that only tracks proxy metrics triggers a **Reframe** | **Abort** prompt.
+4. **Tier detection** — infers S / F-lite / F-full from complexity signals (file count, domain breadth, unknowns); lets you override.
+5. **Write frame doc** — creates `artifacts/frames/{N}-{slug}-frame.mdx` with status: `draft`.
+6. **User approval** — presents the frame for confirmation; loops on revisions until approved.
+7. **Commit + status update** — sets issue status to `Analysis` and commits the artifact.
 
 ## Output artifact
 
 ```
 artifacts/frames/{N}-{slug}-frame.mdx
 ```
 
-Fields: `title`, `issue`, `status: approved`, `tier`, `date`, Problem, Who, Constraints, Out of Scope, Complexity.
+Fields: `title`, `issue`, `status: approved`, `tier`, `date`, Problem, Who, Constraints, Out of Scope, Premise Validity (required: `success_in_6mo`, `failure_in_6mo`, `simplest_alternative` + why-not), Complexity.
 
 ## Chain position
 

diff --git a/plugins/dev-core/skills/frame/SKILL.md b/plugins/dev-core/skills/frame/SKILL.md
@@ -2,7 +2,7 @@
 name: frame
 argument-hint: '["idea" | --issue <N>]'
 description: Problem framing — capture problem, constraints, scope, tier. Triggers: "frame" | "frame this" | "what's the problem" | "define the problem" | "scope this out" | "define the scope" | "what are we solving" | "help me think through this problem" | "problem statement".
-version: 0.2.0
+version: 0.3.0
 allowed-tools: Bash, Read, Write, Edit, Glob, Grep, ToolSearch
 ---
 
@@ -35,6 +35,7 @@ Standalone-safe: callable without `/dev`. Output consumed by `/analyze`, `/spec`
 |------|----|----------|---------------|-------|
 | 0 | parse | ✓ | `gh issue view N` → JSON | — |
 | 1 | interview | — | — | 3–5 Q max |
+| 1b | premise | ✓ | 3 fields non-empty | **gate** — blocks Step 2 |
 | 2 | tier | ✓ | τ ∈ frontmatter | — |
 | 3 | write | ✓ | φ ∃ | — |
 | 4 | approval | ✓ | `status: approved` | gate |
@@ -43,7 +44,7 @@ Standalone-safe: callable without `/dev`. Output consumed by `/analyze`, `/spec`
 
 Success: φ written ∧ status: approved
 Evidence: `ls artifacts/frames/` after execution
-Steps: parse → interview → tier → write → approval
+Steps: parse → interview → premise-gate → tier → write → approval
 ¬clear → STOP + ask: "What problem are you solving?"
 
 ## Step 0 — Parse + Seed
@@ -79,6 +80,28 @@ Check ∃ φ:
 
 ¬ask all 5 if seed is rich — ask only what's missing.
 
+## Step 1b — Premise-Validity Gate
+
+**Gate: cannot proceed to Step 2 without all 3 fields answered.**
+
+Capture in a single AQ (present all 3 together):
+
+| Field | Prompt | Requirement |
+|-------|--------|-------------|
+| `success_in_6mo` | "What does success look like in 6 months?" | Concrete, observable outcome — ¬vague ("things are better") |
+| `failure_in_6mo` | "What does failure look like in 6 months?" | Must be **falsifiable** — a condition you could actually observe ∧ decide to abort |
+| `simplest_alternative` | "What's the simplest version that would meet the goal — and why isn't it enough?" | Forces explicit comparison; the "why not" is required, ¬optional |
+
+Evaluation rules:
+
+- `failure_in_6mo` ¬falsifiable (e.g. "people aren't happy") → reject, re-ask. Example of falsifiable: "DEBT count stays flat or rises despite 3 sprint cycles." Example of non-falsifiable: "the team doesn't feel better."
+- `simplest_alternative` answer omits the "why not" half → re-ask: "You described the simpler version — why won't it be enough?"
+- Any field empty or answered with ≤5 words → treat as unanswered, re-ask.
+
+**Abort signal:** if the user answers `failure_in_6mo` with a description that matches "we'd still have the problem but with extra bookkeeping" (i.e. the initiative measures proxy metrics, ¬the underlying issue) → surface: "This failure mode suggests the premise may be invalid. Do you want to reframe the problem or abort?" AQ: **Reframe** | **Abort**.
+
+Origin: pattern surfaced by Roxabi/lyra#1162 — quality-debt annotation infrastructure (~1100 LOC + 6 registry files) where the ratchet measured bookkeeping compliance, not code quality. A falsification check at /frame would have caught this.
+
 ## Step 2 — Tier Detection
 
 Auto-detect τ from complexity signals:
@@ -126,6 +149,17 @@ date: {YYYY-MM-DD}
 
 - {explicit non-goals as bullets}
 
+## Premise Validity
+
+**Required — populated from Step 1b. ¬leave blank.**
+
+**Success in 6 months:** {concrete, observable outcome}
+
+**Failure in 6 months:** {falsifiable condition — observable ∧ actionable}
+
+**Simplest alternative:** {minimal version that meets the goal}
+**Why not simplest:** {explicit reason the simpler path is insufficient}
+
 ## Complexity
 
 **Tier: {τ}** — {1-sentence rationale}
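
The mechanical part of the Step 1b evaluation rules (empty or ≤5-word answers, a missing "why not") can be sketched as follows. This is an illustration only: `PremiseAnswers`, `fieldsToReask`, and the separate `why_not_simplest` string are assumptions, and the sample answers other than the DEBT example are made up. Falsifiability and the abort signal are judgment calls and are deliberately not modelled.

```typescript
// Hypothetical sketch of the Step 1b mechanical checks; names are illustrative.
type PremiseAnswers = {
  success_in_6mo: string
  failure_in_6mo: string
  simplest_alternative: string
  // Assumption: the "why not" half is captured as its own string,
  // matching the "Why not simplest" line of the frame template.
  why_not_simplest: string
}

type PremiseField = 'success_in_6mo' | 'failure_in_6mo' | 'simplest_alternative'

const PREMISE_FIELDS: PremiseField[] = [
  'success_in_6mo',
  'failure_in_6mo',
  'simplest_alternative',
]

// Answers with this many words or fewer count as unanswered.
const MAX_UNANSWERED_WORDS = 5

function wordCount(text: string): number {
  return text
    .trim()
    .split(/\s+/)
    .filter((w) => w.length > 0).length
}

/** Fields that must be re-asked before the gate lets Step 2 run. */
function fieldsToReask(answers: PremiseAnswers): PremiseField[] {
  const reask = PREMISE_FIELDS.filter((f) => wordCount(answers[f]) <= MAX_UNANSWERED_WORDS)
  // A simplest alternative without its "why not" half is also re-asked.
  if (answers.why_not_simplest.trim() === '' && !reask.includes('simplest_alternative')) {
    reask.push('simplest_alternative')
  }
  return reask
}

function premiseGatePasses(answers: PremiseAnswers): boolean {
  return fieldsToReask(answers).length === 0
}

// Usage: the failure answer is the falsifiable example from Step 1b.
const sample: PremiseAnswers = {
  success_in_6mo: 'DEBT count drops by half across the three core packages',
  failure_in_6mo: 'DEBT count stays flat or rises despite 3 sprint cycles',
  simplest_alternative: 'Track debt items in one shared markdown checklist',
  why_not_simplest: 'A checklist is not enforced in CI',
}
console.log(premiseGatePasses(sample)) // prints: true
console.log(fieldsToReask({ ...sample, failure_in_6mo: "people aren't happy" }))
// prints: [ 'failure_in_6mo' ]
```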

diff --git a/plugins/dev-core/skills/plan/SKILL.md b/plugins/dev-core/skills/plan/SKILL.md
@@ -90,6 +90,21 @@ Intra-domain parallel: ≥4 independent tasks in 1 domain → multiple same-type
 **2d. Tasks:** ∀ task: description, files, agent, dependencies, parallel-safe (Y/N).
 Order: types → backend → frontend → tests → docs → config.
 
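+**Budget heuristic (ops estimate):** After listing tasks, classify each by cost class and estimate its tool-call ops as items × the midpoint of the class's Ops/item range, rounded to the nearest integer (e.g. 3 `bounded` items → 3 × 2.5 = 7.5 → 8). Record the estimates in the plan artifact's Wave Structure section as a Budget Table.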
+
+Cost classes:
+
+| Class | Ops/item | Examples |
+|-------|----------|---------|
+| `trivial` | 1–2 | string replace, single grep |
+| `bounded` | 2–3 | read + edit known file |
+| `judgmental` | 4–6 | read + context + judge + edit |
+| `exploratory` | 8–15 | open-ended cross-file search |
+
+Rule: if a single task's estimated ops exceed 50 (strictly greater; exactly 50 passes) → **force-split** the task into smaller sub-tasks, or present a DP(A) **Split now** | **Keep as-is (flag)** decision before proceeding.
+
+Implementation helper: `plugins/dev-core/skills/plan/lib/budget.ts` — exports `classifyTask`, `computeBudget`, `renderBudgetTable`.
+
 **2e. Slice Selection (multi-slice only):** ≥2 slices → → DP(C) 1 option/slice `V{N}: {desc} ({files}, {agents})`.
 Default: next unimplemented slice. Respect deps. Re-run `/plan` for remaining.
 
@@ -178,6 +193,21 @@ After micro-tasks, derive waves from the dependency graph. Name parallel agent i
 | 2 | Wave 1 done | {K} ∥ | ... |
 ```
 
+After the wave table, include a **Budget Table** derived from Step 2d classification:
+
+```markdown
+### Budget
+
+| Task | Items | Class | Est. ops | Split? |
+|------|-------|-------|----------|--------|
+| {task name} | {N} | {class} | {ops} | — |
+| {large task} | {N} | exploratory | {ops} | YES — split required |
+
+**Total estimated ops: {total}**
+```
+
+Tasks marked `YES — split required` must be resolved (split or DP-approved) before the plan is finalized.
+
 Rules:
 - Wave 1 = all tasks with no deps.
 - Wave N = tasks whose deps are all in earlier waves.
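
The two wave rules above amount to layering the dependency graph. A minimal sketch, assuming tasks are given as an id plus a list of dependency ids (the `WaveTask` type and `deriveWaves` name are hypothetical and not part of the skill):

```typescript
// Hypothetical sketch: derive waves by repeatedly taking every task whose
// dependencies are already placed in an earlier wave.
type WaveTask = { id: string; deps: string[] }

function deriveWaves(tasks: WaveTask[]): string[][] {
  const waves: string[][] = []
  const placed = new Set<string>()
  let remaining = [...tasks]

  while (remaining.length > 0) {
    // Wave 1: tasks with no deps. Wave N: tasks whose deps are all placed.
    const ready = remaining.filter((t) => t.deps.every((d) => placed.has(d)))
    if (ready.length === 0) {
      // Nothing can be scheduled: a cycle, or a dep that is not in the task list.
      const stuck = remaining.map((t) => t.id).join(', ')
      throw new Error(`Cannot derive waves, unresolved deps for: ${stuck}`)
    }
    waves.push(ready.map((t) => t.id))
    for (const t of ready) placed.add(t.id)
    remaining = remaining.filter((t) => !placed.has(t.id))
  }
  return waves
}

// Usage with made-up task ids following the types → backend → frontend → tests order.
const waves = deriveWaves([
  { id: 'types', deps: [] },
  { id: 'backend', deps: ['types'] },
  { id: 'frontend', deps: ['types'] },
  { id: 'tests', deps: ['backend', 'frontend'] },
])
console.log(waves)
// prints: [ [ 'types' ], [ 'backend', 'frontend' ], [ 'tests' ] ]
```

Tasks that land in the same inner array have no dependency on each other, so they are the candidates for parallel agents within that wave.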

diff --git a/plugins/dev-core/skills/plan/__tests__/budget.test.ts b/plugins/dev-core/skills/plan/__tests__/budget.test.ts
@@ -0,0 +1,123 @@
+import { describe, expect, it } from 'vitest'
+import { classifyTask, computeBudget, renderBudgetTable, SPLIT_THRESHOLD } from '../lib/budget'
+
+describe('budget classifier', () => {
+  describe('classifyTask', () => {
+    it('trivial: 1 item → 2 ops (rounded mid 1.5)', () => {
+      const row = classifyTask({ name: 'Fix typo', items: 1, costClass: 'trivial' })
+      expect(row.estimatedOps).toBe(2)
+      expect(row.mustSplit).toBe(false)
+    })
+
+    it('bounded: 3 items → 8 ops (3 * 2.5 = 7.5 → 8)', () => {
+      const row = classifyTask({ name: 'Edit config files', items: 3, costClass: 'bounded' })
+      expect(row.estimatedOps).toBe(8)
+      expect(row.mustSplit).toBe(false)
+    })
+
+    it('judgmental: 6 items → 30 ops (6 * 5)', () => {
+      const row = classifyTask({ name: 'Review route handlers', items: 6, costClass: 'judgmental' })
+      expect(row.estimatedOps).toBe(30)
+      expect(row.mustSplit).toBe(false)
+    })
+
+    it('exploratory: 5 items → 58 ops (5 * 11.5 = 57.5 → 58) — exceeds threshold', () => {
+      const row = classifyTask({ name: 'Audit cross-file deps', items: 5, costClass: 'exploratory' })
+      expect(row.estimatedOps).toBe(58)
+      expect(row.mustSplit).toBe(true)
+    })
+
+    it('mustSplit is false at exactly the threshold', () => {
+      // judgmental: 10 items * 5 mid = 50 — NOT > 50, no split
+      const row = classifyTask({ name: 'Exactly at threshold', items: 10, costClass: 'judgmental' })
+      expect(row.estimatedOps).toBe(50)
+      expect(row.mustSplit).toBe(false)
+    })
+
+    it('mustSplit is true one item above the threshold boundary', () => {
+      // judgmental: 11 items * 5 = 55 — > 50, split required
+      const row = classifyTask({ name: 'Just over threshold', items: 11, costClass: 'judgmental' })
+      expect(row.estimatedOps).toBe(55)
+      expect(row.mustSplit).toBe(true)
+    })
+
+    it('preserves name and items in output', () => {
+      const row = classifyTask({ name: 'My task', items: 4, costClass: 'bounded' })
+      expect(row.name).toBe('My task')
+      expect(row.items).toBe(4)
+      expect(row.costClass).toBe('bounded')
+    })
+  })
+
+  describe('computeBudget', () => {
+    it('totals ops across all tasks', () => {
+      const { rows, totalOps } = computeBudget([
+        { name: 'T1', items: 2, costClass: 'trivial' }, // 2 * 1.5 = 3 → 3
+        { name: 'T2', items: 4, costClass: 'bounded' }, // 4 * 2.5 = 10
+      ])
+      expect(rows).toHaveLength(2)
+      expect(totalOps).toBe(rows.reduce((s, r) => s + r.estimatedOps, 0))
+    })
+
+    it('returns empty rows and 0 total for empty input', () => {
+      const { rows, totalOps } = computeBudget([])
+      expect(rows).toHaveLength(0)
+      expect(totalOps).toBe(0)
+    })
+
+    it('flags tasks that individually exceed the threshold', () => {
+      const { rows } = computeBudget([
+        { name: 'Big task', items: 6, costClass: 'exploratory' }, // 6 * 11.5 = 69 → mustSplit
+        { name: 'Small task', items: 2, costClass: 'bounded' }, // 2 * 2.5 = 5 → fine
+      ])
+      expect(rows[0].mustSplit).toBe(true)
+      expect(rows[1].mustSplit).toBe(false)
+    })
+  })
+
+  describe('renderBudgetTable', () => {
+    it('includes header and separator rows', () => {
+      const rows = [classifyTask({ name: 'T1', items: 2, costClass: 'bounded' })]
+      const output = renderBudgetTable(rows)
+      expect(output).toContain('| Task | Items | Class | Est. ops | Split? |')
+      expect(output).toContain('|------|-------|-------|----------|--------|')
+    })
+
+    it('shows — for tasks that do not need splitting', () => {
+      const rows = [classifyTask({ name: 'Small task', items: 1, costClass: 'trivial' })]
+      const output = renderBudgetTable(rows)
+      expect(output).toContain('| — |')
+    })
+
+    it('shows YES — split required for tasks over the threshold', () => {
+      const rows = [classifyTask({ name: 'Big task', items: 6, costClass: 'exploratory' })]
+      const output = renderBudgetTable(rows)
+      expect(output).toContain('YES — split required')
+    })
+
+    it('includes total ops footer', () => {
+      const rows = [
+        classifyTask({ name: 'T1', items: 2, costClass: 'bounded' }),
+        classifyTask({ name: 'T2', items: 1, costClass: 'trivial' }),
+      ]
+      const output = renderBudgetTable(rows)
+      const total = rows.reduce((s, r) => s + r.estimatedOps, 0)
+      expect(output).toContain(`**Total estimated ops: ${total}**`)
+    })
+
+    it('renders all tasks as table rows', () => {
+      const inputs = [
+        { name: 'Alpha', items: 3, costClass: 'bounded' as const },
+        { name: 'Beta', items: 2, costClass: 'judgmental' as const },
+      ]
+      const { rows } = computeBudget(inputs)
+      const output = renderBudgetTable(rows)
+      expect(output).toContain('Alpha')
+      expect(output).toContain('Beta')
+    })
+  })
+
+  it('SPLIT_THRESHOLD constant is 50', () => {
+    expect(SPLIT_THRESHOLD).toBe(50)
+  })
+})
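
For orientation, here is one way the behaviour pinned down by these tests could be implemented. It is a sketch reconstructed from the test expectations and the cost-class table in `plan/SKILL.md`, not the actual `lib/budget.ts` (which this diff does not show). The exported names come from the test imports; the type names, the `OPS_RANGE` table, and the exact row layout are assumptions.

```typescript
// Sketch reconstructed from the tests; the real lib/budget.ts may differ.
type CostClass = 'trivial' | 'bounded' | 'judgmental' | 'exploratory'
type TaskInput = { name: string; items: number; costClass: CostClass }
type BudgetRow = TaskInput & { estimatedOps: number; mustSplit: boolean }

const SPLIT_THRESHOLD = 50

// Ops/item ranges from the cost-class table in plan/SKILL.md.
const OPS_RANGE: Record<CostClass, [number, number]> = {
  trivial: [1, 2],
  bounded: [2, 3],
  judgmental: [4, 6],
  exploratory: [8, 15],
}

/** Estimate ops as items × range midpoint, rounded to the nearest integer. */
function classifyTask(task: TaskInput): BudgetRow {
  const [lo, hi] = OPS_RANGE[task.costClass]
  const estimatedOps = Math.round(task.items * ((lo + hi) / 2))
  // Strictly greater than the threshold: exactly 50 does not force a split.
  return { ...task, estimatedOps, mustSplit: estimatedOps > SPLIT_THRESHOLD }
}

function computeBudget(tasks: TaskInput[]): { rows: BudgetRow[]; totalOps: number } {
  const rows = tasks.map(classifyTask)
  const totalOps = rows.reduce((sum, r) => sum + r.estimatedOps, 0)
  return { rows, totalOps }
}

function renderBudgetTable(rows: BudgetRow[]): string {
  const total = rows.reduce((sum, r) => sum + r.estimatedOps, 0)
  const body = rows.map((r) => {
    const split = r.mustSplit ? 'YES — split required' : '—'
    return `| ${r.name} | ${r.items} | ${r.costClass} | ${r.estimatedOps} | ${split} |`
  })
  return [
    '| Task | Items | Class | Est. ops | Split? |',
    '|------|-------|-------|----------|--------|',
    ...body,
    '',
    `**Total estimated ops: ${total}**`,
  ].join('\n')
}

// Usage with two made-up tasks.
const { rows, totalOps } = computeBudget([
  { name: 'Edit config files', items: 3, costClass: 'bounded' },
  { name: 'Audit cross-file deps', items: 5, costClass: 'exploratory' },
])
console.log(renderBudgetTable(rows))
console.log(totalOps) // prints: 66
```

Because each estimate is rounded per task before summing, the total is the sum of the rounded rows (8 + 58 here), which is what the footer test checks.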