feat(gate): governance-driven BM25 + set intersection + envelope (0.20.0)#118
…0.0)

P1.3.2 Phase 2. Consumes the two canon files landed in klappy/klappy.dev#120.

runGateAction refactored:
- fetchGateTransitions → {transitions, source} parses odd/gate/transitions.md, returns TransitionDef[] with detectionText fed to BM25 at gate time.
- fetchGatePrerequisites → {prerequisites, source} parses odd/gate/prerequisites.md, precomputes stemmedTokens: Set<string> per prereq at parse time (cache parse products, not microsecond derivations).
- Transition detection uses BM25 (ranking problem: one transition wins; rowOrder breaks score ties deterministically). Index built inline per request, not cached (microsecond derivation on gate's tiny vocabulary).
- Prereq evaluation uses stemmed set intersection (independent gap-or-not; avoids BM25 IDF-negative pathology on small 8-prereq shared-vocab corpus where log((N-df+0.5)/(df+0.5)) flips negative for df > (N-1)/2).
- Algorithm uniform across tiers. MINIMAL_TRANSITIONS + MINIMAL_PREREQUISITES hold fallback vocabulary matching pre-0.20.0 regex alternations. Stemming works in both tiers.

Envelope additions: governance_source ('knowledge_base' | 'minimal'), governance_uris plural array of 2 (alphabetical by path-tail: prerequisites then transitions), debug.knowledge_base_url echo on override.

Strictly additive: every input that matched pre-0.20.0 word-boundary regex still matches; stemmed variations now match too (shipping→completion, started building→execution, stepped back→exploration, etc.).

cleanup_storage extended with two cache resets. Tool description updated.

Smoke test adds ~30 assertions (envelope shape, governance_uris, override echo, literal+stemmed pairs per transition, BM25 priority resolution, stemmed prereq set-intersection, uniform stemming across tiers).

Refs klappy/klappy.dev#120 (Phase 1 canon), klappy://odd/handoffs/2026-04-20-p1-3-2-phase-2-gate-code-refactor.
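As a rough illustration of the IDF behaviour the commit message cites, the sketch below evaluates the quoted term `log((N-df+0.5)/(df+0.5))` for an 8-document corpus. The helper name `bm25Idf` and the loop are assumptions made for this example; they are not taken from oddkit's code. For N = 8 the term is exactly zero at df = 4 and negative from df = 5 upward, so for integer df the `df > (N-1)/2` threshold marks where the term stops being positive.

```typescript
// Illustrative only: evaluates the IDF form quoted in the commit message.
// `bm25Idf` is a hypothetical helper name, not an oddkit function.
function bm25Idf(totalDocs: number, docFreq: number): number {
  return Math.log((totalDocs - docFreq + 0.5) / (docFreq + 0.5));
}

const N = 8; // corpus size used in the commit message (8 prereqs)
const idfByDf: number[] = [];
for (let df = 1; df <= N; df++) {
  idfByDf.push(bm25Idf(N, df));
}

// Positive for df = 1..3, exactly zero at df = 4, negative for df = 5..8.
idfByDf.forEach((idf, i) => {
  console.log(`df=${i + 1} idf=${idf.toFixed(3)}`);
});
```

A term that appears in half or more of the prereq rows therefore contributes nothing, or a penalty, to a BM25 score, which is the situation set intersection avoids by not scoring at all.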
Deploying with
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! | oddkit | 1acc452 | Commit Preview URL, Branch Preview URL | Apr 20 2026, 03:07 AM |
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix prepared fixes for both issues found in the latest run.
- ✅ Fixed: Dead `detectTransition` function after refactoring
  - Removed the unused `detectTransition` function definition from `workers/src/orchestrate.ts` since it has no remaining callers after the BM25 refactor.
- ✅ Fixed: Backtick stripping inconsistent across cross-referencing identifier columns
  - Added `` .replace(/`/g, "") `` to the prereqIds column in `fetchGateTransitions` and to the prereq id column in `fetchGatePrerequisites` so cross-reference lookups are robust to canon backtick formatting.
Reviewed by Cursor Bugbot for commit 1acc452. Configure here.

    let cachedGatePrerequisites: GatePrerequisite[] | null = null;
    let cachedGatePrerequisitesKnowledgeBaseUrl: string | undefined = undefined;
    let cachedGatePrerequisitesSource: "knowledge_base" | "minimal" = "minimal";

Dead detectTransition function after refactoring
Low Severity
The detectTransition function is now dead code. This PR removed its only call site in runGateAction (replacing it with BM25-based transition detection), but the function definition itself was left behind. It's a non-exported local function with zero remaining callers anywhere in the codebase. The PR description explicitly states it "Removed: the three-arm hardcoded if/else over transition tuples," suggesting this was an oversight.

    if (key.length === 0) continue;
    const prereqIds = prereqIdsRaw.length > 0
      ? prereqIdsRaw.split(",").map((s: string) => s.trim()).filter((s: string) => s.length > 0)
      : [];

Backtick stripping inconsistent across cross-referencing identifier columns
Medium Severity
fetchGateTransitions defensively strips backticks from cols[0] (transition key) but not from cols[3] (prereqIds). fetchGatePrerequisites also does not strip backticks from cols[0] (prereq id). These two id sets are joined via prereqById.get(prereqId) at runtime. If the canon's transitions table wraps prerequisite references in backticks but the prerequisites table does not (or vice versa), the lookup silently fails and prereqs are classified as unknown instead of evaluated.
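To make the failure mode concrete, here is a minimal sketch of normalising identifier cells on both sides of the join so a backticked reference and a bare id resolve to the same key. The helper names (`normalizeId`, `parsePrereqIds`) and the sample ids are hypothetical; only the backtick-stripping idea comes from the review comment.

```typescript
// Hypothetical helper: strip markdown backticks and surrounding whitespace
// from a table cell so that `some-id` and some-id compare equal.
function normalizeId(cell: string): string {
  return cell.replace(/`/g, "").trim();
}

// Hypothetical helper: parse a comma-separated id column using the same rule.
function parsePrereqIds(cell: string): string[] {
  return cell
    .split(",")
    .map((s) => normalizeId(s))
    .filter((s) => s.length > 0);
}

// Made-up ids: one table wraps ids in backticks, the other does not.
const prereqById = new Map<string, { id: string }>();
for (const rawId of ["`decisions-locked`", "constraints-reviewed"]) {
  const id = normalizeId(rawId);
  prereqById.set(id, { id });
}

const refs = parsePrereqIds("decisions-locked, `constraints-reviewed`");
const unresolved = refs.filter((id) => !prereqById.has(id));
console.log(refs, unresolved);
```

Because both columns pass through the same normaliser, a formatting difference between the two canon tables no longer turns into a silent lookup miss.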
Ledger at odd/ledger/2026-04-20-p1-3-2-gate-canary-landed.md captures the 2026-04-20T01:21Z-03:20Z session that shipped oddkit 0.20.0. Mirrors the P1.3.1 ledger structure (Summary, What Shipped, What the Validator Actually Checked, Patterns, Cleared, O-opens, Session Mechanics, Handoff, Provenance).

Two handoffs flipped to status: superseded with superseded_by pointing at the new ledger:
- odd/handoffs/2026-04-21-p1-3-2-gate-canary: the original two-phase handoff, superseded by the full P1.3.2 ship.
- odd/handoffs/2026-04-20-p1-3-2-phase-2-gate-code-refactor: the mid-session Phase 2 forward handoff, superseded on arrival when Phase 2 shipped in the same session. Left in canon as documentation of the 'same-session handoff anti-pattern' called out in the ledger.

Honest accounting in the ledger of the tradeoff between this session's smoke-heavy attestation (9 runs + live self-call) and P1.3.1's Sonnet 4.6 5-corroboration validator pattern. Neither is strictly superior; they catch different classes of issue.

Refs klappy/oddkit#118 (merged 260492c), #119 (promotion 1308245), #120 #121 #122.


Summary
P1.3.2 Phase 2. Consumes the two canon files landed in klappy/klappy.dev#120 (later amended by #122 for geminating-verb inflections).
`oddkit_gate` now reads transitions and prerequisites from canon at runtime, declares `governance_source` + `governance_uris` (plural array of 2) + `debug.knowledge_base_url` echo on its envelope, and ships as 0.20.0.

Matching design: split by fit (PRD D5)

Transitions use BM25 stemmed matching. Picking the best transition is a ranking problem: one transition wins, and BM25's specific-phrase-beats-bare-word scoring (via term frequency) does this correctly. Row order in `odd/gate/transitions.md` remains as deterministic tiebreaker for genuine ties. Index built inline per request (D9: microsecond derivation, not worth caching on gate's tiny vocabulary).

Prereqs use stemmed set intersection. Each prereq evaluates independently (gap-or-not, not ranked). BM25's IDF term flips negative when `df > (N-1)/2` on small shared-vocabulary corpora, which for 8 prereqs with overlapping check vocabularies can produce score=0 on valid matches (`log((N-df+0.5)/(df+0.5))` goes negative). Set intersection sidesteps that entirely with no scoring pass and semantically correct "any stem in common = prereq applies."

Caching is asymmetric by design. Transitions rebuild their BM25 index inline; prereqs cache a precomputed `stemmedTokens: Set<string>` on each `GatePrerequisite` at parse time. Principle: cache parse products, not microsecond derivations.

What changed
New code in `workers/src/orchestrate.ts`

- `TransitionDef` and `GatePrerequisite` interfaces
- `fetchGateTransitions` helper (parses `## Transitions` table from `odd/gate/transitions.md`; `{transitions, source}` tuple per PRD D3)
- `fetchGatePrerequisites` helper (parses `## Prerequisite Overlays` table from `odd/gate/prerequisites.md`; precomputes `stemmedTokens` at parse time; `{prerequisites, source}` tuple)
- `MINIMAL_TRANSITIONS` + `MINIMAL_PREREQUISITES` hardcoded fallback-tier constants
- `runGateAction` fully refactored (~145 lines): parallel governance fetch, strict source aggregation, BM25 transition detection with rowOrder tiebreakers, stemmed set intersection for prereqs, envelope additions
- `cleanup_storage` extended with 6 gate-cache resets

Removed: the three-arm hardcoded if/else over transition tuples, the `checkPatterns` regex map, and the dead `getIndex` + `scoreEntries` call that was producing unused canon refs

Envelope before/after
0.19.0:

    { "action": "gate", "result": {"status": "...", "transition": {...}, "prerequisites": {...}}, ... }

0.20.0:

    {
      "action": "gate",
      "result": {
        "status": "...",
        "transition": {...},
        "prerequisites": {...},
        "governance_source": "knowledge_base",
        "governance_uris": [
          "klappy://odd/gate/prerequisites",
          "klappy://odd/gate/transitions"
        ]
      },
      "debug": { ..., "knowledge_base_url": "..." }
    }

Additive only.
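For consumers that want to type the new fields, the additions above could be modelled roughly as follows. This is a sketch under stated assumptions: the interface name `GateGovernanceFields` and the helpers `pathTail` and `sortGovernanceUris` are invented for illustration, and the sort simply reproduces the "alphabetical by path-tail" ordering described in the commit message.

```typescript
// Field names and literal values come from the 0.20.0 envelope shown above;
// the type and helper names are hypothetical.
type GovernanceSource = "knowledge_base" | "minimal";

interface GateGovernanceFields {
  governance_source: GovernanceSource;
  governance_uris: string[]; // two entries, ordered by path tail
}

// Last path segment of a URI, e.g. "klappy://odd/gate/transitions" -> "transitions".
function pathTail(uri: string): string {
  return uri.slice(uri.lastIndexOf("/") + 1);
}

// Returns a new array sorted alphabetically by path tail.
function sortGovernanceUris(uris: string[]): string[] {
  return [...uris].sort((a, b) => {
    const ta = pathTail(a);
    const tb = pathTail(b);
    return ta < tb ? -1 : ta > tb ? 1 : 0;
  });
}

const fields: GateGovernanceFields = {
  governance_source: "knowledge_base",
  governance_uris: sortGovernanceUris([
    "klappy://odd/gate/transitions",
    "klappy://odd/gate/prerequisites",
  ]),
};
console.log(JSON.stringify(fields, null, 2));
```

Since the change is additive, a 0.19.0 consumer that ignores unknown keys should keep working; only code that validates the envelope strictly would need the new fields declared.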
Strictly additive matching behavior
Every input that matched the pre-0.20.0 word-boundary regex still matches. Stemming now makes `deploying`, `released`, `started building`, `reconsidering` etc. match their canonical transitions too. The Porter stemmer doesn't handle consonant gemination, so `shipping`/`stepped` have their inflected forms listed directly in `odd/gate/transitions.md` (canon amendment in klappy/klappy.dev#122) rather than relying on the stemmer: cheaper and more auditable than extending the stemmer.

Smoke test
Adds ~30 assertions covering envelope shape, governance_uris alphabetical-peer array, override echo, literal+stemmed pairs per transition, BM25 priority resolution (`ready to build` beats bare `ready`), stemmed prereq set-intersection, override `governance_source` accepts either tier (matching encode's inherited-limitation assertion pattern).

Verification
- `npm run typecheck` clean
- `node workers/test/governance-parser.test.mjs` 105/105 passed (no regressions)
- https://gate-governance-source-envelope-oddkit.klappy.workers.dev

Canon-first sequencing satisfied

86a7194 937eb52 `oddkit_get` before this PR opened

Refs
Note
Medium Risk
Changes `oddkit_gate` matching logic and its response schema (`governance_*` fields plus new prerequisite output strings), which can affect downstream consumers and transition/prereq classification. Risk is mitigated by minimal-tier fallbacks and expanded smoke tests, but runtime canon parsing introduces new failure modes.

Overview

`oddkit_gate` is refactored to load its transition + prerequisite governance from canon at runtime (with hardcoded minimal fallbacks) instead of using hardcoded transition/prereq logic.

Transition detection switches from regex cascades to BM25 + stemming with deterministic tie-breaking, while prerequisite checks switch to stemmed set-intersection and now return prereq ids in `met` and canon gap messages in `unmet`.

The gate response envelope now includes `governance_source` and `governance_uris` (2-entry array) and echoes `debug.knowledge_base_url`; cache cleanup resets new gate governance caches, versions are bumped to `0.20.0`, and smoke tests are extended to cover the new envelope and matching behavior.
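The stemmed set-intersection check for prerequisites can be sketched as below. Everything here is illustrative: `toyStem` is a crude suffix stripper standing in for the real stemmer, the `PrereqSketch` shape, ids, check text, and gap messages are made up, and treating "any stem in common" as met (id reported) versus unmet (gap message reported) is an assumption based on the summary above rather than on the actual `runGateAction` code.

```typescript
// Toy stand-in for a real stemmer: lowercases and strips a few suffixes.
function toyStem(word: string): string {
  return word.toLowerCase().replace(/(ing|ed|es|s)$/, "");
}

// Tokenise on alphanumeric runs and stem each token into a set.
function stemTokens(text: string): Set<string> {
  const words = text.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return new Set(words.map(toyStem));
}

// Hypothetical shape; stemmedTokens is computed once, at parse time.
interface PrereqSketch {
  id: string;
  gapMessage: string;
  stemmedTokens: Set<string>;
}

function makePrereq(id: string, checkText: string, gapMessage: string): PrereqSketch {
  return { id, gapMessage, stemmedTokens: stemTokens(checkText) };
}

// Each prereq is judged independently: any shared stem counts as met.
function evaluatePrereqs(
  input: string,
  prereqs: PrereqSketch[],
): { met: string[]; unmet: string[] } {
  const inputStems = stemTokens(input);
  const met: string[] = [];
  const unmet: string[] = [];
  for (const p of prereqs) {
    const hit = Array.from(p.stemmedTokens).some((stem) => inputStems.has(stem));
    if (hit) met.push(p.id);
    else unmet.push(p.gapMessage);
  }
  return { met, unmet };
}

// Made-up prereqs and input.
const prereqs = [
  makePrereq("decisions_locked", "decision decided locked", "Decisions are not locked"),
  makePrereq("constraints_reviewed", "constraint reviewed", "Constraints have not been reviewed"),
];
const result = evaluatePrereqs("We locked the key decisions yesterday", prereqs);
console.log(result);
```

There is no scoring pass here, so a stem shared by most prereqs cannot be down-weighted the way a high document frequency affects BM25; it simply counts as a match for every prereq that lists it.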