Commit 260492c

feat(gate): governance-driven BM25 + set intersection + envelope (0.20.0) (#118)
P1.3.2 Phase 2. runGateAction refactored to consume klappy://odd/gate/transitions and klappy://odd/gate/prerequisites at runtime via two new helpers (fetchGateTransitions, fetchGatePrerequisites). Transition detection via BM25 stemmed matching (ranking problem); prereq evaluation via stemmed set intersection (independent gap-or-not, avoids BM25 IDF-negative pathology on small shared-vocabulary corpora). Envelope declares governance_source + governance_uris (plural array of 2) + debug.knowledge_base_url echo. Preview smoke 158/158 × 3 consecutive clean. Canon-first satisfied: klappy/klappy.dev#120 + #122 merged before this PR.
1 parent 71ee6ed commit 260492c

6 files changed

Lines changed: 476 additions & 93 deletions

CHANGELOG.md

Lines changed: 38 additions & 0 deletions
@@ -7,6 +7,44 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
## [0.20.0] - 2026-04-20
### Added
- **`governance_source` on `oddkit_gate` envelope** — Gate response `result` now declares which tier served its governance vocabulary: `"knowledge_base"` (both `odd/gate/transitions.md` and `odd/gate/prerequisites.md` parsed from canon) or `"minimal"` (one or both files unreachable; hardcoded vocabulary snapshot used). Strict aggregation rule per P1.3.1 precedent: any helper falling through to minimal makes the aggregate `"minimal"`. Two-tier cascade today — `workers/baseline/` is not yet shipped, and `odd/gate/` is explicitly canon-only per `klappy://canon/constraints/core-governance-baseline` §What-Ships-in-Baseline.
- **`governance_uris` (plural array of 2) on `oddkit_gate` envelope** — Gate reads two peer governance documents (`odd/gate/transitions`, `odd/gate/prerequisites`); the envelope surfaces both URIs in alphabetical order by path-tail. **This is an intentional shape divergence from `oddkit_encode`'s singular `governance_uri`** — encode's encoding-type docs sit under a single canonical umbrella, but gate's two files are peers in a foreign-key relation (transitions references prereq ids defined in prerequisites). Same divergence rationale as `oddkit_challenge` in 0.19.0; gate's array is structurally symmetric because both entries point to peer single files. Consumers that prefer a singular anchor can read `governance_uris[0]` — alphabetical ordering makes this stable.
- **`debug.knowledge_base_url` echo on `oddkit_gate` envelope** — Gate now echoes the caller's `knowledge_base_url` override in the debug envelope, matching encode (0.18.0) and challenge (0.19.0).
- **Two new canon files define gate's governance:** `odd/gate/transitions.md` (four transition keys, from/to endpoints, prerequisite id mappings, BM25 detection terms) and `odd/gate/prerequisites.md` (eight prerequisite ids with check vocabularies and gap messages). Canon-first contract: both files merged to klappy.dev main before this release (klappy/klappy.dev#120).
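
The envelope fields described in this section can be pictured with a small type sketch. This is an illustration under stated assumptions, not the code in `workers/src/index.ts`: the `HelperResult` and `GateEnvelope` types and the `aggregateGovernanceSource`, `pathTail`, and `buildEnvelope` helpers are hypothetical names, and the override URL is a placeholder. Only the field names, the two tier values, the two governance URIs, the path-tail ordering, and the "any minimal makes the aggregate minimal" rule come from the entries above.

```typescript
// Hypothetical types; only the field names and tier values come from the changelog.
type GovernanceSource = "knowledge_base" | "minimal";

interface HelperResult {
  uri: string; // governance document the helper tried to read
  source: GovernanceSource; // tier that actually served the vocabulary
}

interface GateEnvelope {
  governance_source: GovernanceSource;
  governance_uris: string[];
  debug: { knowledge_base_url: string | undefined };
}

// Strict aggregation: one helper on the minimal tier makes the aggregate "minimal".
function aggregateGovernanceSource(results: HelperResult[]): GovernanceSource {
  return results.every((r) => r.source === "knowledge_base") ? "knowledge_base" : "minimal";
}

// Last path segment of a URI, used as the alphabetical sort key.
function pathTail(uri: string): string {
  return uri.slice(uri.lastIndexOf("/") + 1);
}

function buildEnvelope(results: HelperResult[], knowledgeBaseUrl?: string): GateEnvelope {
  // Sorting by path-tail keeps governance_uris[0] stable across calls.
  const uris = results.map((r) => r.uri).sort((a, b) => pathTail(a).localeCompare(pathTail(b)));
  return {
    governance_source: aggregateGovernanceSource(results),
    governance_uris: uris,
    debug: { knowledge_base_url: knowledgeBaseUrl }, // echo of the caller's override
  };
}

const envelope = buildEnvelope(
  [
    { uri: "klappy://odd/gate/transitions", source: "knowledge_base" },
    { uri: "klappy://odd/gate/prerequisites", source: "minimal" },
  ],
  "https://example.com/alternate-canon", // placeholder override URL (assumption)
);
console.log(envelope.governance_source); // "minimal"
console.log(envelope.governance_uris[0]); // "klappy://odd/gate/prerequisites"
```

In this sketch a consumer that wants a single anchor reads `governance_uris[0]`, which resolves to the prerequisites URI regardless of the order in which the helpers reported.
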
### Changed
- **`oddkit_gate` transition detection now uses BM25 stemmed matching over canon-supplied vocabulary** (replaces the prior literal word-boundary regex cascade). This is **strictly additive**: every input that matched the prior regex still matches, plus stemmed variations now match too. `deploying`, `released`, `started building`, `building`, and `reconsidering` now match their canonical transitions via stemming. The Porter-style stemmer does not currently reverse consonant gemination (`shipping` → `shipp`, not `ship`), so the small number of geminating verbs gate cares about (`ship`, `step back`) have their inflected forms listed explicitly in `odd/gate/transitions.md` rather than relying on the stemmer. Priority resolution between competing transitions uses BM25 scoring (specific phrase beats bare word — `ready to build` outscores bare `ready` via 2-term-vs-1-term match) rather than the prior fragile regex-cascade order. Row order in `odd/gate/transitions.md` remains as deterministic tiebreaker for genuine ties.
- **`oddkit_gate` prerequisite evaluation now uses stemmed set intersection** (not BM25). Each prereq evaluates independently: pass if any stemmed input token matches any stemmed check term; fail otherwise. This fits the problem: prereqs return gap-or-not in isolation, not a ranking. It also avoids BM25's negative-IDF pathology on the small 8-prereq corpus, where vocabulary shared across prereqs (words like `goal`, `done`, `constraint`) can appear in more than half of the documents; once `df > N/2`, `log((N-df+0.5)/(df+0.5))` turns negative and valid matches produce score-zero contributions. Stemming consequence for prereqs: `problems identified` satisfies `problem_defined`, `constraints addressed` satisfies `constraints_satisfied`, `deployed it` satisfies `dod_met`.
- **`oddkit_gate` matching is uniform across tiers.** The `knowledge_base` tier reads vocabulary from canon; the `minimal` tier uses a hardcoded vocabulary snapshot whose content mirrors the pre-0.20.0 regex alternations flattened to comma-separated phrases and words. Both tiers run the same BM25-for-transitions / set-intersection-for-prereqs matchers. The difference between tiers is editability (canon is editable without deploy; minimal is locked to the deployed worker version), not capability. Stemming works in both tiers.
- **`runGateAction` now reads transitions and prerequisites from canon at runtime** via `fetchGateTransitions` and `fetchGatePrerequisites` helpers, replacing the prior hardcoded three-arm if/else over transition tuples and the hardcoded `checkPatterns` regex map. `MINIMAL_TRANSITIONS` and `MINIMAL_PREREQUISITES` module-level constants hold the fallback-tier vocabulary.
- **`result.prerequisites.met` format change (minor):** previously returned prereq description strings (e.g. `"Problem statement is clearly defined"`); now returns prereq ids (e.g. `"problem_defined"`). `result.prerequisites.unmet` now returns the canon-supplied gap messages (e.g. `"Problem statement not defined — the goal or issue being solved is unclear"`) which are more informative than the prior descriptions. Callers doing string-matching on these arrays should update their expectations.
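
The prerequisite matcher and the IDF sign flip described in this section can be sketched as follows. This is a minimal illustration, not the shipped matcher: `stem` is a toy suffix stripper standing in for the stemmer in `workers/src/bm25.ts`, the `Prereq` shape and the check terms are hypothetical, and only the prereq ids, the sample inputs, and the IDF formula are taken from the entries above.

```typescript
// Toy suffix stripper (assumption); the real stemmer lives in workers/src/bm25.ts.
function stem(token: string): string {
  return token.toLowerCase().replace(/(ing|ed|s)$/, "");
}

// Lowercase, split on non-alphanumerics, drop empties, stem each token.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((t) => t.length > 0)
    .map(stem);
}

// Hypothetical shape: ids come from the changelog, check terms are invented.
interface Prereq {
  id: string;
  checkTerms: string[];
}

const prereqs: Prereq[] = [
  { id: "problem_defined", checkTerms: ["problem", "goal"] },
  { id: "constraints_satisfied", checkTerms: ["constraint"] },
  { id: "dod_met", checkTerms: ["deploy", "done"] },
];

// Set intersection: met if any stemmed input token equals any stemmed check term.
function isMet(prereq: Prereq, input: string): boolean {
  const inputStems = new Set(tokenize(input));
  return prereq.checkTerms.some((term) => tokenize(term).some((t) => inputStems.has(t)));
}

// Each prereq is evaluated on its own; there is no ranking across prereqs.
function metIds(input: string): string[] {
  return prereqs.filter((p) => isMet(p, input)).map((p) => p.id);
}

// Classic BM25 IDF as quoted in the entry; negative once df > n / 2.
function classicIdf(n: number, df: number): number {
  return Math.log((n - df + 0.5) / (df + 0.5));
}

console.log(metIds("problems identified")); // ["problem_defined"]
console.log(metIds("deployed it")); // ["dod_met"]
console.log(classicIdf(8, 5) < 0); // true: a term in 5 of 8 documents gets negative IDF
```

With 8 documents the formula is exactly zero at `df = 4` and negative above it, which is why vocabulary shared by most prereqs is a poor fit for BM25 scoring and a plain intersection test is enough here.
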
### Fixed
- (none specific to this release)
### Known limitations
- **Stemmer does not handle consonant gemination.** The Porter-style stemmer in `workers/src/bm25.ts` drops common suffixes (`-ing`, `-ed`, etc.) but does not reverse doubled-consonant gemination — `shipping` stems to `shipp` rather than `ship`, `stepped` stems to `stepp` rather than `step`. Gate works around this by listing the handful of geminating inflected forms explicitly in `odd/gate/transitions.md` rather than relying on the stemmer. Non-geminating verbs (`deploy`, `build`, `start`, `reconsider`, etc.) continue to match their inflections via the stemmer alone. Same limitation applies to challenge and any future stemmed-matching tool; a proper Porter stemmer upgrade is tracked as a sweep follow-up.
- **`getIndex` strict-mode (`skipBaselineFallback`) limitation is still inherited from 0.18.0 and 0.19.0.** This is the same limitation documented in prior entries. No tool in the sweep has exercised the code path non-trivially yet; tracked as a P1.3.x follow-up.
- **`workers/baseline/` build pipeline still not shipped.** Two-tier cascade (`"knowledge_base" | "minimal"`) remains the operational envelope enum for gate; `"bundled"` stays out of the enum until the pipeline ships.
- **`oddkit_challenge`'s `evaluatePrerequisiteCheck` is still regex-based.** Migration to stemmed set intersection (same matcher as gate's prereqs per this release's D5) is on the sweep trajectory for challenge's next revisit, bundled with a review of `cachedChallengeTypeIndex` under the "don't cache microsecond derivations" principle applied to gate in this release.
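
The gemination gap and its workaround can be reproduced with a deliberately naive stemmer. This is a sketch under assumptions: `naiveStem` and `matches` are hypothetical helpers that mimic the described behavior (suffix stripping without undoubling consonants), not the implementation in `workers/src/bm25.ts`, and the term lists are invented stand-ins for the vocabulary in `odd/gate/transitions.md`.

```typescript
// Hypothetical suffix stripper that never reverses doubled consonants.
function naiveStem(word: string): string {
  const lower = word.toLowerCase();
  for (const suffix of ["ing", "ed", "s"]) {
    // Keep at least three characters of stem so short words are left alone.
    if (lower.length > suffix.length + 2 && lower.endsWith(suffix)) {
      return lower.slice(0, -suffix.length);
    }
  }
  return lower;
}

// True if the stemmed input equals the stem of any listed term.
function matches(input: string, terms: string[]): boolean {
  const stems = new Set(terms.map(naiveStem));
  return stems.has(naiveStem(input));
}

// Base form only: the stemmer alone cannot bridge "shipping" and "ship".
const shipBaseOnly = ["ship"];
// Workaround: list the geminating inflections explicitly next to the base form.
const shipWithInflections = ["ship", "shipping", "shipped"];
// Non-geminating verbs need only the base form.
const deployTerms = ["deploy"];

console.log(naiveStem("shipping")); // "shipp"
console.log(matches("shipping", shipBaseOnly)); // false
console.log(matches("shipping", shipWithInflections)); // true
console.log(matches("deploying", deployTerms)); // true
```

Listing inflections in canon trades a slightly longer vocabulary for not having to change the stemmer, which fits the entry's note that a proper stemmer upgrade is tracked separately.
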
## [0.19.0] - 2026-04-20
### Added

package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "oddkit",
-  "version": "0.19.0",
+  "version": "0.20.0",
   "description": "Agent-first CLI for ODD-governed repos. Epistemic terrain rendering with portable baseline.",
   "type": "module",
   "bin": {

workers/package.json

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 {
   "name": "oddkit-mcp-worker",
-  "version": "0.19.0",
+  "version": "0.20.0",
   "private": true,
   "type": "module",
   "scripts": {

workers/src/index.ts

Lines changed: 1 addition & 1 deletion
@@ -292,7 +292,7 @@ Use when:
   },
   {
     name: "oddkit_gate",
-    description: "Check transition prerequisites before changing epistemic modes. Validates readiness and blocks premature convergence. Gate at every implicit mode transition, not just formal ones.",
+    description: "Check transition prerequisites before changing epistemic modes. Reads governance from klappy://odd/gate/transitions (transition keys, from/to, prereq mappings, detection terms) and klappy://odd/gate/prerequisites (prerequisite definitions and check vocabularies) at runtime; falls back to a minimal hardcoded vocabulary snapshot when canon is unreachable. Transition detection uses BM25 stemmed matching — 'deploying', 'started building', 'reconsidering' and other inflected variations match the same canonical transitions as their base forms. Geminating verbs (ship, step) have common inflections listed directly in canon to cover the stemmer's gemination gap. Prereq evaluation uses stemmed set intersection (independent gap-or-not per prereq; no ranking, no BM25 IDF pathology on the small prereq corpus). Response envelope declares governance_source (knowledge_base|minimal) and governance_uris (plural array of 2) per canon/constraints/core-governance-baseline. Accepts knowledge_base_url to read from an alternate canon. Gate at every implicit mode transition, not just formal ones.",
     action: "gate",
     schema: {
       input: z.string().describe("The proposed transition (e.g., 'ready to build', 'moving to planning')."),
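
The BM25 transition detection named in the new description can be sketched as a ranking over per-transition detection terms. This is an illustration only: the `Transition` shape, the keys `example_ready` and `example_build`, and their term lists are hypothetical, the stemmer is a toy suffix stripper, and the IDF uses a non-negative variant, `log(1 + (N - df + 0.5) / (df + 0.5))`, swapped in so that a two-row example cannot produce negative weights. The actual scoring in `workers/src/bm25.ts` may differ.

```typescript
// Hypothetical shape: one row per transition with its detection terms.
interface Transition {
  key: string;
  terms: string;
}

// Lowercase, split on non-letters, strip "-ing"/"-ed" (toy stemmer, assumption).
function tokens(text: string): string[] {
  return text
    .toLowerCase()
    .split(/[^a-z]+/)
    .filter((t) => t.length > 0)
    .map((t) => t.replace(/(ing|ed)$/, ""));
}

// Okapi BM25 over the rows, with a non-negative IDF variant (assumption).
function rank(input: string, transitions: Transition[], k1 = 1.2, b = 0.75) {
  const docs = transitions.map((t) => ({ key: t.key, words: tokens(t.terms) }));
  const avgLen = docs.reduce((sum, d) => sum + d.words.length, 0) / docs.length;
  const query = Array.from(new Set(tokens(input)));
  const scored = docs.map((doc) => {
    let score = 0;
    for (const term of query) {
      const tf = doc.words.filter((w) => w === term).length;
      if (tf === 0) continue;
      const df = docs.filter((d) => d.words.includes(term)).length;
      const idf = Math.log(1 + (docs.length - df + 0.5) / (df + 0.5));
      const lengthNorm = 1 - b + b * (doc.words.length / avgLen);
      score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
    }
    return { key: doc.key, score };
  });
  // Array.prototype.sort is stable, so equal scores keep row order as the tiebreaker.
  return scored.sort((x, y) => y.score - x.score);
}

// Hypothetical rows: the bare word comes first, the specific phrase second.
const transitions: Transition[] = [
  { key: "example_ready", terms: "ready" },
  { key: "example_build", terms: "ready to build, start building" },
];

console.log(rank("ready to build", transitions)[0]?.key); // "example_build"
console.log(rank("ready", transitions)[0]?.key); // "example_ready"
console.log(rank("started building", transitions)[0]?.key); // "example_build"
```

In this sketch the specific phrase wins because more query terms match its row, while the bare word still resolves to the shorter row through length normalization, so row order is only consulted on exact ties.
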
