diff --git a/CHANGELOG.md b/CHANGELOG.md index 4abaf49..207d1af 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,6 +7,44 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] +## [0.20.0] - 2026-04-20 + +### Added + +- **`governance_source` on `oddkit_gate` envelope** — Gate response `result` now declares which tier served its governance vocabulary: `"knowledge_base"` (both `odd/gate/transitions.md` and `odd/gate/prerequisites.md` parsed from canon) or `"minimal"` (one or both files unreachable; hardcoded vocabulary snapshot used). Strict aggregation rule per P1.3.1 precedent: any helper falling through to minimal makes the aggregate `"minimal"`. Two-tier cascade today — `workers/baseline/` is not yet shipped, and `odd/gate/` is explicitly canon-only per `klappy://canon/constraints/core-governance-baseline` §What-Ships-in-Baseline. + +- **`governance_uris` (plural array of 2) on `oddkit_gate` envelope** — Gate reads two peer governance documents (`odd/gate/transitions`, `odd/gate/prerequisites`); the envelope surfaces both URIs in alphabetical order by path-tail. **This is an intentional shape divergence from `oddkit_encode`'s singular `governance_uri`** — encode's encoding-type docs sit under a single canonical umbrella, but gate's two files are peers in a foreign-key relation (transitions references prereq ids defined in prerequisites). Same divergence rationale as `oddkit_challenge` in 0.19.0; gate's array is structurally symmetric because both entries point to peer single files. Consumers that prefer a singular anchor can read `governance_uris[0]` — alphabetical ordering makes this stable. + +- **`debug.knowledge_base_url` echo on `oddkit_gate` envelope** — Gate now echoes the caller's `knowledge_base_url` override in the debug envelope, matching encode (0.18.0) and challenge (0.19.0). + +- **Two new canon files define gate's governance:** `odd/gate/transitions.md` (four transition keys, from/to endpoints, prerequisite id mappings, BM25 detection terms) and `odd/gate/prerequisites.md` (eight prerequisite ids with check vocabularies and gap messages). Canon-first contract: both files merged to klappy.dev main before this release (klappy/klappy.dev#120). + +### Changed + +- **`oddkit_gate` transition detection now uses BM25 stemmed matching over canon-supplied vocabulary** (replaces the prior literal word-boundary regex cascade). This is **strictly additive**: every input that matched the prior regex still matches, plus stemmed variations now match too. `deploying`, `released`, `started building`, `building`, and `reconsidering` now match their canonical transitions via stemming. The Porter-style stemmer does not currently reverse consonant gemination (`shipping` → `shipp`, not `ship`), so the small number of geminating verbs gate cares about (`ship`, `step back`) have their inflected forms listed explicitly in `odd/gate/transitions.md` rather than relying on the stemmer. Priority resolution between competing transitions uses BM25 scoring (specific phrase beats bare word — `ready to build` outscores bare `ready` via 2-term-vs-1-term match) rather than the prior fragile regex-cascade order. Row order in `odd/gate/transitions.md` remains as deterministic tiebreaker for genuine ties. + +- **`oddkit_gate` prerequisite evaluation now uses stemmed set intersection** (not BM25). Each prereq evaluates independently: pass if any stemmed input token matches any stemmed check term; fail otherwise. This is fit-to-problem — prereqs return gap-or-not in isolation, not a ranking. Avoids BM25's IDF-negative pathology on the small 8-prereq corpus where common vocabulary across prereqs (words like `goal`, `done`, `constraint`) would flip `log((N-df+0.5)/(df+0.5))` negative and produce score-zero contributions on valid matches. Stemming consequence for prereqs: `problems identified` satisfies `problem_defined`, `constraints addressed` satisfies `constraints_satisfied`, `deployed it` satisfies `dod_met`. + +- **`oddkit_gate` matching is uniform across tiers.** The `knowledge_base` tier reads vocabulary from canon; the `minimal` tier uses a hardcoded vocabulary snapshot whose content mirrors the pre-0.20.0 regex alternations flattened to comma-separated phrases and words. Both tiers run the same BM25-for-transitions / set-intersection-for-prereqs matchers. The difference between tiers is edit-ability (canon is editable without deploy; minimal is locked to the deployed worker version), not capability. Stemming works in both tiers. + +- **`runGateAction` now reads transitions and prerequisites from canon at runtime** via `fetchGateTransitions` and `fetchGatePrerequisites` helpers, replacing the prior hardcoded three-arm if/else over transition tuples and the hardcoded `checkPatterns` regex map. `MINIMAL_TRANSITIONS` and `MINIMAL_PREREQUISITES` module-level constants hold the fallback-tier vocabulary. + +- **`result.prerequisites.met` format change (minor):** previously returned prereq description strings (e.g. `"Problem statement is clearly defined"`); now returns prereq ids (e.g. `"problem_defined"`). `result.prerequisites.unmet` now returns the canon-supplied gap messages (e.g. `"Problem statement not defined — the goal or issue being solved is unclear"`) which are more informative than the prior descriptions. Callers doing string-matching on these arrays should update their expectations. + +### Fixed + +- (none specific to this release) + +### Known limitations + +- **Stemmer does not handle consonant gemination.** The Porter-style stemmer in `workers/src/bm25.ts` drops common suffixes (`-ing`, `-ed`, etc.) but does not reverse doubled-consonant gemination — `shipping` stems to `shipp` rather than `ship`, `stepped` stems to `stepp` rather than `step`. Gate works around this by listing the handful of geminating inflected forms explicitly in `odd/gate/transitions.md` rather than relying on the stemmer. Non-geminating verbs (`deploy`, `build`, `start`, `reconsider`, etc.) continue to match their inflections via the stemmer alone. Same limitation applies to challenge and any future stemmed-matching tool; a proper Porter stemmer upgrade is tracked as a sweep follow-up. + +- **`getIndex` strict-mode (`skipBaselineFallback`) still inherited from 0.18.0 and 0.19.0.** Same limitation documented in prior entries. No tool in the sweep has exercised the code path non-trivially yet; tracked as a P1.3.x follow-up. + +- **`workers/baseline/` build pipeline still not shipped.** Two-tier cascade (`"knowledge_base" | "minimal"`) remains the operational envelope enum for gate; `"bundled"` stays out of the enum until the pipeline ships. + +- **`oddkit_challenge`'s `evaluatePrerequisiteCheck` is still regex-based.** Migration to stemmed set intersection (same matcher as gate's prereqs per this release's D5) is on the sweep trajectory for challenge's next revisit, bundled with a review of `cachedChallengeTypeIndex` under the "don't cache microsecond derivations" principle applied to gate in this release. + ## [0.19.0] - 2026-04-20 ### Added diff --git a/package.json b/package.json index 061a9af..0fb596e 100644 --- a/package.json +++ b/package.json @@ -1,6 +1,6 @@ { "name": "oddkit", - "version": "0.19.0", + "version": "0.20.0", "description": "Agent-first CLI for ODD-governed repos. Epistemic terrain rendering with portable baseline.", "type": "module", "bin": { diff --git a/workers/package.json b/workers/package.json index a1e814c..18a6711 100644 --- a/workers/package.json +++ b/workers/package.json @@ -1,6 +1,6 @@ { "name": "oddkit-mcp-worker", - "version": "0.19.0", + "version": "0.20.0", "private": true, "type": "module", "scripts": { diff --git a/workers/src/index.ts b/workers/src/index.ts index 4a503cb..58fb122 100644 --- a/workers/src/index.ts +++ b/workers/src/index.ts @@ -292,7 +292,7 @@ Use when: }, { name: "oddkit_gate", - description: "Check transition prerequisites before changing epistemic modes. Validates readiness and blocks premature convergence. Gate at every implicit mode transition, not just formal ones.", + description: "Check transition prerequisites before changing epistemic modes. Reads governance from klappy://odd/gate/transitions (transition keys, from/to, prereq mappings, detection terms) and klappy://odd/gate/prerequisites (prerequisite definitions and check vocabularies) at runtime; falls back to a minimal hardcoded vocabulary snapshot when canon is unreachable. Transition detection uses BM25 stemmed matching — 'deploying', 'started building', 'reconsidering' and other inflected variations match the same canonical transitions as their base forms. Geminating verbs (ship, step) have common inflections listed directly in canon to cover the stemmer's gemination gap. Prereq evaluation uses stemmed set intersection (independent gap-or-not per prereq; no ranking, no BM25 IDF pathology on the small prereq corpus). Response envelope declares governance_source (knowledge_base|minimal) and governance_uris (plural array of 2) per canon/constraints/core-governance-baseline. Accepts knowledge_base_url to read from an alternate canon. Gate at every implicit mode transition, not just formal ones.", action: "gate", schema: { input: z.string().describe("The proposed transition (e.g., 'ready to build', 'moving to planning')."), diff --git a/workers/src/orchestrate.ts b/workers/src/orchestrate.ts index fd88fef..cf48a0f 100644 --- a/workers/src/orchestrate.ts +++ b/workers/src/orchestrate.ts @@ -17,7 +17,7 @@ import { type IndexEntry, type SectionResult, } from "./zip-baseline-fetcher"; -import { buildBM25Index, searchBM25, type BM25Index } from "./bm25"; +import { buildBM25Index, searchBM25, tokenize, type BM25Index } from "./bm25"; import { parseTableRow } from "./markdown-utils"; import type { RequestTracer } from "./tracing"; import pkg from "../package.json"; @@ -102,6 +102,39 @@ interface BasePrerequisite { gapMessage: string; } +// Gate governance types — P1.3.2 (0.20.0). Consumed by runGateAction via +// fetchGateTransitions and fetchGatePrerequisites. Both read from canon +// at runtime with a hardcoded minimal vocabulary as the fallback tier. +// See canon/constraints/core-governance-baseline §Canon-Only — odd/gate/ +// is explicitly canon-only with "structural prereqs" as the minimal tier. +interface TransitionDef { + /** Canon key, e.g. "planning-to-execution". Used as BM25 doc id and for tiebreaker lookup. */ + key: string; + /** The mode being exited. */ + from: string; + /** The mode being entered. */ + to: string; + /** Prerequisite ids that must be satisfied. Resolved against GatePrerequisite[]. */ + prereqIds: string[]; + /** Comma-separated detection phrases concatenated into one string, fed to buildBM25Index. */ + detectionText: string; + /** Canon table row index (0-based). Deterministic tiebreaker for BM25 score ties. */ + rowOrder: number; +} + +interface GatePrerequisite { + /** Prereq id, e.g. "problem_defined". Referenced by TransitionDef.prereqIds. */ + id: string; + /** Raw comma-separated check vocabulary from canon, preserved for debugging/introspection. */ + check: string; + /** Surfaced to callers when the prereq fails. */ + gapMessage: string; + /** Precomputed stems of the check vocabulary. Populated at parse time in fetchGatePrerequisites; + * reused across requests (cache fetches and parses, not microsecond derivations — per PRD D9). + * Prereq evaluation is stemmed set intersection: inputStems.intersect(prereq.stemmedTokens) non-empty → pass. */ + stemmedTokens: Set; +} + interface NormativeVocabulary { caseSensitiveRegex: RegExp | null; caseInsensitiveRegex: RegExp | null; @@ -139,6 +172,18 @@ let cachedStakesCalibration: StakesCalibration | null = null; let cachedStakesCalibrationKnowledgeBaseUrl: string | undefined = undefined; let cachedStakesCalibrationSource: "knowledge_base" | "minimal" = "minimal"; +// Gate governance caches — P1.3.2 (0.20.0). Parsed governance arrays are +// cached here; BM25 indexes over transitions are built per-request (not +// cached — see PRD D9). GatePrerequisite.stemmedTokens is a parse product +// cached inside each struct, which differs from transitions' inline index: +// cache fetches and parses, not microsecond derivations. +let cachedGateTransitions: TransitionDef[] | null = null; +let cachedGateTransitionsKnowledgeBaseUrl: string | undefined = undefined; +let cachedGateTransitionsSource: "knowledge_base" | "minimal" = "minimal"; +let cachedGatePrerequisites: GatePrerequisite[] | null = null; +let cachedGatePrerequisitesKnowledgeBaseUrl: string | undefined = undefined; +let cachedGatePrerequisitesSource: "knowledge_base" | "minimal" = "minimal"; + export interface UnifiedParams { action: string; input: string; @@ -606,6 +651,180 @@ function getOrBuildChallengeTypeIndex( return bm25Index; } +// Gate minimal-tier vocabulary — P1.3.2 D6. Used when canon is unreachable +// or missing required sections. Vocabulary mirrors the pre-0.20.0 hardcoded +// detectTransition regexes (L306–L324 pre-refactor) and checkPatterns map +// (L2154–L2163 pre-refactor) flattened to comma-separated phrases and +// words. Algorithm is uniform across tiers (BM25 for transitions, set +// intersection for prereqs); only the vocabulary source differs. +const MINIMAL_TRANSITIONS: Array<{ + key: string; + from: string; + to: string; + prereqIds: string[]; + detectionText: string; +}> = [ + { + key: "planning-to-execution", + from: "planning", + to: "execution", + prereqIds: ["decisions_locked", "dod_defined", "irreversibility_assessed", "constraints_satisfied"], + detectionText: "ready to build, ready to implement, start building, let's code, start coding, moving to execution, moving to build", + }, + { + key: "exploration-to-planning", + from: "exploration", + to: "planning", + prereqIds: ["problem_defined", "constraints_reviewed"], + detectionText: "ready to plan, start planning, let's plan, time to plan, move to planning, moving to planning, ready, let's go, proceed, move forward, next step", + }, + { + key: "execution-to-exploration", + from: "execution", + to: "exploration", + prereqIds: [], + detectionText: "back to exploration, need to rethink, step back, stepped back, stepping back, reconsider", + }, + { + key: "execution-to-completion", + from: "execution", + to: "completion", + prereqIds: ["dod_met", "artifacts_present"], + detectionText: "ship, shipping, shipped, deploy, release, go live, push to prod", + }, +]; + +const MINIMAL_PREREQUISITES: Array<{ id: string; check: string; gapMessage: string }> = [ + { id: "problem_defined", check: "problem, goal, objective, need, issue", gapMessage: "Problem statement not defined — the goal or issue being solved is unclear" }, + { id: "constraints_reviewed", check: "constraint, rule, policy, reviewed, checked", gapMessage: "Relevant constraints have not been reviewed — what MUST-rules apply here?" }, + { id: "decisions_locked", check: "decided, locked, chosen, selected, committed", gapMessage: "Key decisions are not locked — which options have been closed?" }, + { id: "dod_defined", check: "definition of done, dod, done when, acceptance criteria", gapMessage: "Definition of done is unclear — what does the finished artifact look like?" }, + { id: "irreversibility_assessed", check: "irreversible, can't undo, one-way, point of no return", gapMessage: "Irreversibility not assessed — which aspects cannot be undone after execution?" }, + { id: "constraints_satisfied", check: "constraints met, constraints satisfied, constraints addressed", gapMessage: "Constraints not confirmed satisfied — are all MUST-rules addressable?" }, + { id: "dod_met", check: "done, complete, finished, all criteria", gapMessage: "DoD not met — the completion claim is missing evidence against the criteria" }, + { id: "artifacts_present", check: "screenshot, test, log, artifact, evidence, proof", gapMessage: "Required artifacts not present — what observable proof exists?" }, +]; + +/** Fetch gate transitions from canon at klappy://odd/gate/transitions. + * Parses the `## Transitions` table (columns: Transition Key | From | To | Prerequisites | Detection Terms). + * Empty result → source: "minimal" with MINIMAL_TRANSITIONS vocabulary. + * BM25 index construction is the caller's responsibility and happens per-request per PRD D9 + * (microsecond derivation, not worth caching on gate's tiny corpus). */ +async function fetchGateTransitions( + fetcher: KnowledgeBaseFetcher, + knowledgeBaseUrl?: string, +): Promise<{ transitions: TransitionDef[]; source: "knowledge_base" | "minimal" }> { + if (cachedGateTransitions && cachedGateTransitionsKnowledgeBaseUrl === knowledgeBaseUrl) + return { transitions: cachedGateTransitions, source: cachedGateTransitionsSource }; + + const parsed: TransitionDef[] = []; + try { + const content = await fetcher.getFile("odd/gate/transitions.md", knowledgeBaseUrl); + if (content) { + const section = content.match( + /## Transitions[\s\S]*?\| Transition Key[\s\S]*?\|[-|\s]+\|\n([\s\S]*?)(?=\n\n|\n##|$)/, + ); + if (section) { + let rowOrder = 0; + for (const row of section[1].split("\n").filter((r: string) => r.includes("|"))) { + const cols = parseTableRow(row); + if (cols.length >= 5) { + // Column layout: key | from | to | prereq_ids (comma-separated) | detection terms (comma-separated) + const key = cols[0].replace(/`/g, "").trim(); + const from = cols[1].trim(); + const to = cols[2].trim(); + const prereqIdsRaw = cols[3].trim(); + const detectionText = cols[4].trim(); + if (key.length === 0) continue; + const prereqIds = prereqIdsRaw.length > 0 + ? prereqIdsRaw.split(",").map((s: string) => s.trim()).filter((s: string) => s.length > 0) + : []; + parsed.push({ key, from, to, prereqIds, detectionText, rowOrder }); + rowOrder++; + } + } + } + } + } catch { + // Graceful degradation: canon unreachable → minimal fallback below + } + + let transitions: TransitionDef[]; + let source: "knowledge_base" | "minimal"; + if (parsed.length > 0) { + transitions = parsed; + source = "knowledge_base"; + } else { + transitions = MINIMAL_TRANSITIONS.map((t, i) => ({ ...t, rowOrder: i })); + source = "minimal"; + } + + cachedGateTransitions = transitions; + cachedGateTransitionsKnowledgeBaseUrl = knowledgeBaseUrl; + cachedGateTransitionsSource = source; + return { transitions, source }; +} + +/** Fetch gate prerequisites from canon at klappy://odd/gate/prerequisites. + * Parses the `## Prerequisite Overlays` table (columns: Prerequisite | Check | Gap message). + * Precomputes stemmedTokens per prereq at parse time (per PRD D5 + D9 — parse product, + * worth caching; prereq matching is stemmed set intersection at runtime, no BM25). */ +async function fetchGatePrerequisites( + fetcher: KnowledgeBaseFetcher, + knowledgeBaseUrl?: string, +): Promise<{ prerequisites: GatePrerequisite[]; source: "knowledge_base" | "minimal" }> { + if (cachedGatePrerequisites && cachedGatePrerequisitesKnowledgeBaseUrl === knowledgeBaseUrl) + return { prerequisites: cachedGatePrerequisites, source: cachedGatePrerequisitesSource }; + + const parsed: GatePrerequisite[] = []; + try { + const content = await fetcher.getFile("odd/gate/prerequisites.md", knowledgeBaseUrl); + if (content) { + const section = content.match( + /## Prerequisite Overlays[\s\S]*?\| Prerequisite[\s\S]*?\|[-|\s]+\|\n([\s\S]*?)(?=\n\n|\n##|$)/, + ); + if (section) { + for (const row of section[1].split("\n").filter((r: string) => r.includes("|"))) { + const cols = parseTableRow(row); + if (cols.length >= 3) { + const id = cols[0].trim(); + const check = cols[1].trim(); + const gapMessage = cols[2].replace(/^"|"$/g, "").trim(); + if (id.length === 0) continue; + // Precompute stemmed tokens from check vocabulary. tokenize() stems + // and filters stop words using the default set; for gate's small + // vocabulary this is appropriate — we want stop-word filtering so + // multi-word entries like "definition of done" contribute only the + // content-bearing stems. + const stemmedTokens = new Set(tokenize(check)); + parsed.push({ id, check, gapMessage, stemmedTokens }); + } + } + } + } + } catch { + // Graceful degradation + } + + let prerequisites: GatePrerequisite[]; + let source: "knowledge_base" | "minimal"; + if (parsed.length > 0) { + prerequisites = parsed; + source = "knowledge_base"; + } else { + prerequisites = MINIMAL_PREREQUISITES.map((p) => ({ + ...p, + stemmedTokens: new Set(tokenize(p.check)), + })); + source = "minimal"; + } + + cachedGatePrerequisites = prerequisites; + cachedGatePrerequisitesKnowledgeBaseUrl = knowledgeBaseUrl; + cachedGatePrerequisitesSource = source; + return { prerequisites, source }; +} + async function fetchBasePrerequisites( fetcher: KnowledgeBaseFetcher, knowledgeBaseUrl?: string, @@ -1307,6 +1526,13 @@ async function runCleanupStorage( cachedStakesCalibration = null; cachedStakesCalibrationKnowledgeBaseUrl = undefined; cachedStakesCalibrationSource = "minimal"; + // E0008.3 — gate governance caches (P1.3.2, 0.20.0) + cachedGateTransitions = null; + cachedGateTransitionsKnowledgeBaseUrl = undefined; + cachedGateTransitionsSource = "minimal"; + cachedGatePrerequisites = null; + cachedGatePrerequisitesKnowledgeBaseUrl = undefined; + cachedGatePrerequisitesSource = "minimal"; return { action: "cleanup_storage", @@ -2102,92 +2328,95 @@ async function runGateAction( state?: OddkitState, ): Promise { const startMs = Date.now(); - const transition = detectTransition(input); const fullInput = context ? `${input}\n${context}` : input; - interface Prereq { - id: string; - description: string; - required: boolean; - } - const prereqs: Prereq[] = []; - if (transition.from === "exploration" && transition.to === "planning") { - prereqs.push({ - id: "problem_defined", - description: "Problem statement is clearly defined", - required: true, - }); - prereqs.push({ - id: "constraints_reviewed", - description: "Relevant constraints have been reviewed", - required: true, - }); - } else if (transition.from === "planning" && transition.to === "execution") { - prereqs.push({ - id: "decisions_locked", - description: "Key decisions are locked", - required: true, - }); - prereqs.push({ id: "dod_defined", description: "Definition of done is clear", required: true }); - prereqs.push({ - id: "irreversibility_assessed", - description: "Irreversible aspects identified", - required: true, - }); - prereqs.push({ - id: "constraints_satisfied", - description: "All MUST constraints are addressable", - required: true, - }); - } else if (transition.to === "completion") { - prereqs.push({ id: "dod_met", description: "DoD criteria met with evidence", required: true }); - prereqs.push({ - id: "artifacts_present", - description: "Required artifacts present", - required: true, - }); - } + // Load governance in parallel. Each helper returns a { , source } + // tuple per PRD D3; aggregate strictly per D1 (any helper minimal → aggregate + // minimal). Per PRD D5: transitions use BM25 (ranking problem); prereqs use + // stemmed set intersection (independent gap-or-not, avoids BM25 IDF-negative + // pathology on small shared-vocabulary corpora). + const [ + { transitions, source: transitionsSource }, + { prerequisites, source: prereqsSource }, + ] = await Promise.all([ + fetchGateTransitions(fetcher, knowledgeBaseUrl), + fetchGatePrerequisites(fetcher, knowledgeBaseUrl), + ]); + + // Strict union per canon/constraints/core-governance-baseline. Two-tier + // today (workers/baseline/ not shipped); expands additively to include + // "bundled" when that pipeline ships. + const governanceSource: "knowledge_base" | "minimal" = + [transitionsSource, prereqsSource].some((s) => s === "minimal") + ? "minimal" + : "knowledge_base"; + + // Per PRD D4: two peer governance URIs (not singular), alphabetical by + // path-tail. Shape divergence from encode's singular governance_uri is + // by design — gate's two files are peers with no hierarchy. Shape parity + // with challenge's governance_uris plural array; gate's is structurally + // cleaner because both entries point to peer single files (challenge's + // array mixed a directory anchor with three files). + const governanceUris = [ + "klappy://odd/gate/prerequisites", + "klappy://odd/gate/transitions", + ]; + + // Transition detection via BM25 per PRD D5. Index is built inline from + // the cached governance array (per PRD D9 — microsecond derivation, not + // cached separately). Top hit with score > 0 wins; rowOrder breaks ties + // deterministically when two transitions score identically. + const bm25Docs = transitions.map((t) => ({ id: t.key, text: t.detectionText })); + const transitionIndex = buildBM25Index(bm25Docs); + const hits = searchBM25(transitionIndex, fullInput, transitions.length); + + let matchedTransition: TransitionDef | null = null; + if (hits.length > 0 && hits[0].score > 0) { + const topScore = hits[0].score; + const tiedIds = new Set(hits.filter((h) => h.score === topScore).map((h) => h.id)); + const tiedTransitions = transitions + .filter((t) => tiedIds.has(t.key)) + .sort((a, b) => a.rowOrder - b.rowOrder); + matchedTransition = tiedTransitions[0] ?? null; + } + + const transition = matchedTransition + ? { from: matchedTransition.from, to: matchedTransition.to } + : { from: "unknown", to: "unknown" }; + + // Prereq evaluation via stemmed set intersection per PRD D5. Each prereq + // evaluates independently — pass if any stemmed input token matches any + // stemmed check term; no ranking, no scoring. Eliminates BM25's IDF- + // negative pathology on small corpora with shared vocabulary. + const inputStems = new Set(tokenize(fullInput)); + const prereqById = new Map(prerequisites.map((p) => [p.id, p])); const met: string[] = []; const unmet: string[] = []; const unknown: string[] = []; - const checkPatterns: Record = { - problem_defined: /\b(problem|goal|objective|need|issue)\b/i, - constraints_reviewed: /\b(constraint|rule|policy|reviewed|checked)\b/i, - decisions_locked: /\b(decided|locked|chosen|selected|committed)\b/i, - dod_defined: /\b(definition of done|dod|done when|acceptance criteria)\b/i, - irreversibility_assessed: /\b(irreversib|can't undo|one-way|point of no return)\b/i, - constraints_satisfied: /\b(constraints? (met|satisfied|addressed))\b/i, - dod_met: /\b(done|complete|finished|all criteria)\b/i, - artifacts_present: /\b(screenshot|test|log|artifact|evidence|proof)\b/i, - }; - for (const p of prereqs) { - const pattern = checkPatterns[p.id]; - if (pattern && pattern.test(fullInput)) met.push(p.description); - else if (p.required) unmet.push(p.description); - else unknown.push(p.description); - } - const gateStatus = unmet.length > 0 ? "NOT_READY" : "PASS"; - - const index = await fetcher.getIndex(knowledgeBaseUrl); - const results = scoreEntries(index.entries, `transition boundary deceleration ${input}`).slice( - 0, - 3, - ); - const canonRefs: Array<{ path: string; quote: string }> = []; - for (const entry of results) { - const content = await fetcher.getFile(entry.path, knowledgeBaseUrl); - if (content) { - const stripped = content.replace(/^---[\s\S]*?---\n/, ""); - const lines2 = stripped.split("\n").filter((l) => l.trim() && !l.startsWith("#")); - canonRefs.push({ - path: `${entry.path}#${entry.title}`, - quote: lines2.slice(0, 2).join(" ").slice(0, 150), - }); + if (matchedTransition) { + for (const prereqId of matchedTransition.prereqIds) { + const prereq = prereqById.get(prereqId); + if (!prereq) { + // Governance error: transition references a prereq id not defined in + // odd/gate/prerequisites.md. Surface as unknown rather than crash so + // partial canon states remain diagnosable. + unknown.push(`(unknown prereq id: ${prereqId})`); + continue; + } + const hasMatch = Array.from(prereq.stemmedTokens).some((s) => inputStems.has(s)); + if (hasMatch) { + met.push(prereq.id); + } else { + unmet.push(prereq.gapMessage); + } } } + const gateStatus = unmet.length > 0 ? "NOT_READY" : "PASS"; + const requiredTotal = matchedTransition ? matchedTransition.prereqIds.length : 0; + // Update state const updatedState = state ? initState(state) : undefined; if (updatedState && gateStatus === "PASS") { @@ -2198,10 +2427,7 @@ async function runGateAction( } const lines = [`Gate: ${gateStatus} (${transition.from} → ${transition.to})`, ""]; - lines.push( - `Prerequisites: ${met.length}/${prereqs.filter((p) => p.required).length} required met`, - "", - ); + lines.push(`Prerequisites: ${met.length}/${requiredTotal} required met`, ""); if (unmet.length > 0) { lines.push("Unmet (required):"); for (const u of unmet) lines.push(` - ${u}`); @@ -2212,13 +2438,18 @@ async function runGateAction( for (const m of met) lines.push(` + ${m}`); lines.push(""); } - if (canonRefs.length > 0) { - lines.push("Relevant canon:"); - for (const r of canonRefs) { - lines.push(` > ${r.quote}`); - lines.push(` — ${r.path}`); - lines.push(""); - } + if (unknown.length > 0) { + lines.push("Unknown (governance errors):"); + for (const u of unknown) lines.push(` ? ${u}`); + lines.push(""); + } + + const debug: Record = { + duration_ms: Date.now() - startMs, + generated_at: new Date().toISOString(), + }; + if (knowledgeBaseUrl) { + debug.knowledge_base_url = knowledgeBaseUrl; } return { @@ -2231,12 +2462,14 @@ async function runGateAction( unmet, unknown, required_met: met.length, - required_total: prereqs.filter((p) => p.required).length, + required_total: requiredTotal, }, + governance_source: governanceSource, + governance_uris: governanceUris, }, state: updatedState, assistant_text: lines.join("\n").trim(), - debug: { duration_ms: Date.now() - startMs, generated_at: new Date().toISOString() }, + debug, }; } diff --git a/workers/test/canon-tool-envelope.smoke.mjs b/workers/test/canon-tool-envelope.smoke.mjs index 45f88fe..8ca122e 100644 --- a/workers/test/canon-tool-envelope.smoke.mjs +++ b/workers/test/canon-tool-envelope.smoke.mjs @@ -321,6 +321,118 @@ async function run() { } } + // Tool 6: oddkit_gate — canon-driven, two governance surfaces. Full envelope + + // governance_source + governance_uris (plural array of 2 — shape diverges + // from encode's singular governance_uri, matches challenge's plural shape, + // structurally cleaner than challenge because both entries are single-file + // peers). Per PRD D5: transitions use BM25 (ranking problem); prereqs use + // stemmed set intersection (gap-or-not, avoids BM25 IDF-negative pathology). + // Stemming is uniform across knowledge_base and minimal tiers. + console.log(`\n─── oddkit_gate: envelope + governance_source + governance_uris ───`); + const gateDefault = await callTool("oddkit_gate", { + input: "ready to build my feature — decisions locked, done when tests pass, no irreversible changes, all constraints addressed", + }); + expectFullEnvelope("oddkit_gate (default knowledge_base)", gateDefault); + expectGovernanceSource("oddkit_gate (default knowledge_base)", gateDefault, "knowledge_base"); + ok( + "oddkit_gate: result.governance_uris is an array of exactly 2 entries", + Array.isArray(gateDefault.result?.governance_uris) && + gateDefault.result?.governance_uris.length === 2, + `got: ${JSON.stringify(gateDefault.result?.governance_uris)}`, + ); + const expectedGateUris = [ + "klappy://odd/gate/prerequisites", + "klappy://odd/gate/transitions", + ]; + ok( + "oddkit_gate: governance_uris matches alphabetical peer set", + JSON.stringify(gateDefault.result?.governance_uris) === JSON.stringify(expectedGateUris), + `got: ${JSON.stringify(gateDefault.result?.governance_uris)}`, + ); + ok( + "oddkit_gate: result.governance_uri (singular) is NOT emitted (divergence from encode by design — PRD D4)", + gateDefault.result?.governance_uri === undefined, + `got: ${gateDefault.result?.governance_uri}`, + ); + + console.log(`\n─── oddkit_gate: knowledge_base_url override ───`); + const gateOverride = await callTool("oddkit_gate", { + input: "ready to build", + knowledge_base_url: "https://github.com/torvalds/linux", + }); + expectFullEnvelope("oddkit_gate (override → linux)", gateOverride); + ok( + "oddkit_gate: debug.knowledge_base_url echoed on override", + gateOverride.debug?.knowledge_base_url === "https://github.com/torvalds/linux", + `got: ${gateOverride.debug?.knowledge_base_url}`, + ); + // Known limitation inherited from 0.18.0/0.19.0: getIndex merges baseline + // entries into the override result, so overrides to repos that lack the + // expected governance files may still resolve via the baseline tier rather + // than falling through to minimal. Same assertion pattern as encode's + // override test — accept either tier rather than forcing "minimal". + ok( + "oddkit_gate: governance_source is a valid tier on override", + ["knowledge_base", "minimal"].includes(gateOverride.result?.governance_source), + `got: ${gateOverride.result?.governance_source}`, + ); + + console.log(`\n─── oddkit_gate: BM25 transition detection — literal + stemmed variants ───`); + const transitionCases = [ + { input: "ready to build", expected: "execution", label: "literal planning→execution" }, + { input: "started building the feature", expected: "execution", label: "stemmed: started building → start build" }, + { input: "start planning", expected: "planning", label: "literal exploration→planning" }, + { input: "we're planning the approach", expected: "planning", label: "stemmed: planning → plan" }, + { input: "ship it", expected: "completion", label: "literal execution→completion" }, + { input: "shipping this now", expected: "completion", label: "stemmed: shipping → ship" }, + { input: "step back", expected: "exploration", label: "literal execution→exploration" }, + { input: "stepped back to reconsider", expected: "exploration", label: "stemmed: stepped back → step back" }, + { input: "hello there", expected: "unknown", label: "default guard: no match" }, + ]; + for (const tc of transitionCases) { + const r = await callTool("oddkit_gate", { input: tc.input }); + ok( + `oddkit_gate[${tc.label}]: transition.to === "${tc.expected}"`, + r.result?.transition?.to === tc.expected, + `input: "${tc.input}" got: ${r.result?.transition?.to}`, + ); + } + + console.log(`\n─── oddkit_gate: BM25 priority resolution (specific phrase beats bare word) ───`); + // "ready to build my feature" — "ready" appears in both planning-to-execution + // and exploration-to-planning vocabularies; "build" only appears in the + // former. BM25 should score the 2-term match (ready + build) above the + // 1-term match (ready alone), yielding planning-to-execution. This tests + // that BM25 scoring replaces the old regex cascade's fragile order-dependent + // priority resolution. + const priorityCase = await callTool("oddkit_gate", { input: "ready to build my feature" }); + ok( + "oddkit_gate: BM25 scoring picks planning-to-execution (specific phrase) over exploration-to-planning (bare 'ready')", + priorityCase.result?.transition?.to === "execution", + `got: ${priorityCase.result?.transition?.to}`, + ); + + console.log(`\n─── oddkit_gate: stemmed prereq set-intersection ───`); + // Prereq check uses stemmed set intersection (not BM25). Input contains: + // "locked" (→ decisions_locked check vocab "locked"), "done" (→ dod_defined + // and dod_met), "irreversible" (→ irreversibility_assessed), "addressed" + // (→ constraints_satisfied check vocab stemmed from "addressed"). With + // planning→execution transition, all four required prereqs pass. + const prereqPass = await callTool("oddkit_gate", { + input: "ready to build — decisions locked, done when tests pass, no irreversible changes, all constraints addressed", + }); + ok( + "oddkit_gate: stemmed prereq match produces PASS status", + prereqPass.result?.status === "PASS", + `got: ${prereqPass.result?.status} | unmet: ${JSON.stringify(prereqPass.result?.prerequisites?.unmet)}`, + ); + ok( + "oddkit_gate: all 4 planning→execution prereqs marked met", + prereqPass.result?.prerequisites?.required_met === 4 && + prereqPass.result?.prerequisites?.required_total === 4, + `got: met=${prereqPass.result?.prerequisites?.required_met} total=${prereqPass.result?.prerequisites?.required_total}`, + ); + console.log(`\n${passed} passed, ${failed} failed`); process.exit(failed === 0 ? 0 : 1); }