25 changes: 25 additions & 0 deletions CHANGELOG.md
@@ -7,6 +7,31 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

## [0.23.0] - 2026-04-20

> **Version note:** P1.3.4 was scoped as 0.22.0 per the handoff, but two envelope-conformance fixes (PR #124 telemetry, PR #125 catalog) landed on main in parallel and were released as 0.22.0 via PR #128 while this branch was in Sonnet 4.6 validator dispatch. Per `klappy://canon/constraints/release-validation-gate` Rule 3 (canon outranks session artifacts) and SemVer discipline, this refactor is re-versioned to 0.23.0. The handoff's "ship as 0.22.0" recommendation was session-scoped; main-reality is the canon.

### Changed

- **`oddkit_encode` trigger-word classifier migrated from regex alternation to stemmed phrase-subset matching** (per PRD D5 from P1.3.4 — split-by-fit; the same matcher family shipped for challenge in 0.21.0 and gate in 0.20.0, adapted here for encode's phrasal vocabulary). `EncodingTypeDef.triggerRegex: RegExp | null` is replaced with `stemmedPhrases: string[][]` — each inner array is the ordered stem sequence of a single canon trigger word or phrase, parsed once per canon fetch. A sketch of the matching semantics follows this entry.
  - **Matcher.** The runtime matcher `matchesStemmedPhrases(phrases, inputStems)` declares a match when ALL stems of at least one phrase appear in the input stem set. Single-stem phrases degenerate to set membership (identical to the old behavior for inflection matching like `deciding` → `decid`); multi-stem phrases like `committed to` → `[committ, to]` require both stems to co-occur, so ubiquitous function words like `to`, `with`, `by`, `up`, `out`, `not` cannot fire as standalone triggers just because they appear inside a canon phrase. This preserves the pre-refactor regex semantic where `\b(committed to)\b` matched only when both words were present.
  - **Canon read path.** Trigger vocabulary reads unchanged from `odd/encoding-types/*.md` (`## Trigger Words` fenced block); the matcher tokenizes each vocabulary entry with stop-words disabled (`tokenize(word, new Set())`), stores the ordered stem array at parse time, and intersects against a stop-word-disabled stemmed input set at runtime. Inflected forms (`deciding`, `realizing`, `discovering`) now match their canonical stems (`decid`, `realiz`, `discover`) without canon having to enumerate each inflection.
  - **Strictly additive** over the pre-refactor regex: every input that matched still matches (both phrase conjunction and word-boundary semantics are preserved), and stemmed variations of single-word vocab now match as well. Stop-words are disabled on both the parse-time and runtime `tokenize()` calls — canon vocab survival is mandatory for the strictly-additive invariant to hold, per the P1.3.3 C-04 precedent.
  - **Call sites.** Both classifier call sites preserve their existing semantics: the `parsePrefixedBatchInput` untagged-paragraph path picks the first match via `break` (one artifact per paragraph); `parseUnstructuredInput` emits one artifact per matching type (no `break` — the load-bearing design comment at L1161–1164 is preserved verbatim). `tokenize(para, new Set())` is hoisted once per paragraph into an `inputStems` set reused across the per-type loop.
  - **Bugbot fix.** The phrase-subset match (all stems co-occurring, any order) was adopted mid-PR in response to a high-severity Cursor Bugbot finding on commit `259170a` — the first version's flat `stemmedTokens: Set<string>` would have fired Decision on virtually every English paragraph, because the ubiquitous function-word constituents of phrasal canon vocab (`to`, `with`) were added as standalone singletons. Per `klappy://canon/principles/vodka-architecture`: fit the matcher to the problem shape.
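
A minimal sketch of the matching semantics, under an assumed stemmer — `stem` and `tokenize` below are crude stand-ins for oddkit's real `tokenize()`, chosen only so the stems line up with the examples above:

```ts
// Toy stemmer — a stand-in for the real tokenize() stemming, good
// enough to show the phrase-subset semantics.
function stem(word: string): string {
  return word.toLowerCase().replace(/(ing|ed|s)$/, "");
}

function tokenize(text: string): string[] {
  // Stop-word filtering intentionally absent — mirrors tokenize(word, new Set()).
  return (text.toLowerCase().match(/[a-z']+/g) ?? []).map(stem);
}

// Parse time: one ordered stem sequence per canon trigger word or phrase.
const phrases: string[][] = ["deciding", "committed to"].map(tokenize);
// → [["decid"], ["committ", "to"]]

// Runtime: a phrase matches when ALL of its stems are in the input stem set.
function matchesStemmedPhrases(phrases: string[][], input: Set<string>): boolean {
  return phrases.some((phrase) => phrase.every((s) => input.has(s)));
}

const hit = new Set(tokenize("I'm deciding to ship the cascade"));
const miss = new Set(tokenize("I need to wait until tomorrow"));
console.log(matchesStemmedPhrases(phrases, hit));  // true  — "decid" singleton matches
console.log(matchesStemmedPhrases(phrases, miss)); // false — "to" alone cannot fire
```

The shipped `matchesStemmedPhrases` in `workers/src/orchestrate.ts` is this same all-stems-of-any-phrase loop; only the stemmer here is a stand-in.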

### Removed

- **Module-level `cachedEncodingTypes` in-process cache** (per PRD D9 from P1.3.4 — don't cache microsecond derivations; the same pattern challenge shipped in 0.21.0 and gate shipped in 0.20.0). The `cachedEncodingTypes`, `cachedEncodingTypesKnowledgeBaseUrl`, and `cachedEncodingTypesSource` module-level fields are deleted, along with the cache-check short-circuit at the top of `discoverEncodingTypes` and the `cleanup_storage` resets for the three fields. Per `klappy://canon/principles/cache-fetches-and-parses`: the fetch layer (Module Memory → Cache API → R2, 5-minute TTL) already caches the canon file content; caching the parse product for microsecond re-derivation savings is exactly the anti-pattern the principle names. Parse now runs fresh per call; overhead is sub-millisecond on hot fetches. A sketch of the resulting read path follows this entry.
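
A minimal sketch of the post-D9 read path, assuming a simplified fetcher shape and a single-line vocabulary match — neither is oddkit's actual signature:

```ts
// Fetch tier stays cached (Module Memory → Cache API → R2, 5-minute TTL);
// only the parse product is recomputed. CanonFetcher is an assumed stand-in.
interface CanonFetcher {
  getFile(path: string): Promise<string>; // the cached, expensive step
}

// No module-level cache, no key to invalidate, no cleanup_storage reset:
// the derivation is sub-millisecond, so it simply reruns on every call.
async function readTriggerWords(fetcher: CanonFetcher, path: string): Promise<string[]> {
  const content = await fetcher.getFile(path);
  // Simplified: the real canon files hold the vocabulary in a fenced block
  // under "## Trigger Words"; a single-line match is enough for a sketch.
  const section = content.match(/## Trigger Words\s*\n+([^\n]+)/);
  return section
    ? section[1].split(",").map((w) => w.trim()).filter((w) => w.length > 0)
    : [];
}
```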

### Added

- **New smoke regression assertions in `workers/test/canon-tool-envelope.smoke.mjs`** anchoring the D5 migration and the Bugbot phrase-subset fix — a sketch of assertion (16)'s shape follows this list:
  - (12) stemmed inflection match — `"I'm deciding to ship the two-tier cascade"` classifies as Decision (the `decid` stem is a degenerate singleton matching `decided` in canon vocab).
  - (13) stop-word phrase survival — `"we're going with option B after the review"` matches Decision via the `[go, with]` phrase, both stems being present in the input set.
  - (14) multi-type preservation — `"We must never deploy without tests because we decided this last week"` emits both `C` and `D` artifacts via the no-break path (`must`/`never` singletons for Constraint; the `decid` singleton for Decision).
  - (15) first-match preservation — an untagged paragraph in a mixed batch emits exactly one artifact via the batch classifier's `break` semantic.
  - (16) phrase-subset regression anchor — `"I need to wait until tomorrow for the review"` does NOT classify as Decision or Handoff. The pre-Bugbot-fix flat-Set implementation would have fired Decision via standalone `to` and Handoff via standalone `to`/`for`; post-fix, no phrase of either type has all of its stems present in the input. Assertion (16) is the Bugbot PR #126 regression anchor and will fail against any revision where multi-word vocab is flattened back into standalone-singleton triggers.
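
A self-contained sketch of assertion (16)'s shape — the toy stemmer and the vocab slices are stand-ins, not the smoke harness's real helpers, which exercise the deployed `oddkit_encode` classifier end to end:

```ts
import assert from "node:assert/strict";

// Toy stemmer/tokenizer standing in for the worker's tokenize().
const stem = (w: string): string => w.toLowerCase().replace(/(ing|ed|s)$/, "");
const stemsOf = (text: string): Set<string> =>
  new Set((text.toLowerCase().match(/[a-z']+/g) ?? []).map(stem));

const matches = (phrases: string[][], input: Set<string>): boolean =>
  phrases.some((p) => p.every((s) => input.has(s)));

// Simplified slices of the Decision / Handoff vocab whose function-word
// constituents caused the pre-fix false positives.
const decision: string[][] = [["committ", "to"], ["go", "with"], ["decid"]];
const handoff: string[][] = [["next", "step"], ["follow", "up"], ["block", "by"]];

const input = stemsOf("I need to wait until tomorrow for the review");
assert.ok(!matches(decision, input), "standalone `to` must not fire Decision");
assert.ok(!matches(handoff, input), "standalone `to`/`for` must not fire Handoff");
```

Flattening multi-word vocab back into singletons puts `to` in the Decision trigger set and the first assertion fails — which is exactly the anchor's job.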

### Refs

- Handoff: `klappy://odd/handoffs/2026-04-20-p1-3-4-encode-canon-parity`
- Canon basis: `klappy://canon/principles/cache-fetches-and-parses`, `klappy://canon/principles/vodka-architecture`
- Precedent: oddkit 0.21.0 (challenge's D5 + D9), 0.20.0 (gate's D5 + D9)
- Shipping gate: `klappy://canon/constraints/release-validation-gate` (binding)
- Bugbot finding dispositioned: PR #126 review `cursor[bot]` 2026-04-20T12:55:03Z (high severity, multi-word vocab flattening) — fix-forward in same PR via Cursor autofix commit `113ba11` (phrase-subset match). The in-session orchestrator proposed a stricter consecutive-subsequence variant; autofix's subset-match was accepted as the simpler design better aligned with encode's multi-type tolerance philosophy.
- Closes the canon-parity sweep — all three tools now use stemmed matching and have their in-process derivation caches removed per `cache-fetches-and-parses`.

## [0.22.0] - 2026-04-20

### Added
4 changes: 2 additions & 2 deletions package-lock.json

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "oddkit",
"version": "0.22.0",
"version": "0.23.0",
"description": "Agent-first CLI for ODD-governed repos. Epistemic terrain rendering with portable baseline.",
"type": "module",
"bin": {
4 changes: 2 additions & 2 deletions workers/package-lock.json

Some generated files are not rendered by default.

2 changes: 1 addition & 1 deletion workers/package.json
@@ -1,6 +1,6 @@
{
"name": "oddkit-mcp-worker",
"version": "0.22.0",
"version": "0.23.0",
"private": true,
"type": "module",
"scripts": {
120 changes: 93 additions & 27 deletions workers/src/orchestrate.ts
@@ -56,12 +56,23 @@ export interface OddkitEnvelope {
/** Internal type — handlers return this, handleUnifiedAction stamps server_time */
type ActionResult = Omit<OddkitEnvelope, "server_time">;

- // Governance-driven encoding types
+ // Governance-driven encoding types. Trigger-word classification is stemmed
+ // phrase-subset matching per klappy://canon/principles/vodka-architecture
+ // (fit the matcher to the problem) — same D5 shape applied to challenge
+ // prereqs in 0.21.0 and gate prereqs in 0.20.0. triggerWords kept for
+ // debugging only; stemmedPhrases is the parse product the runtime evaluates
+ // against. Each inner array is the ordered stem sequence of a single
+ // trigger word or phrase; a type matches an input when ALL stems of at
+ // least one phrase are present in the input's stem set. This preserves
+ // phrase-level semantics (`committed to`, `going with`, `must not`,
+ // `next step`, `follow up`, `blocked by`, `turns out`) so common function
+ // words (`to`, `with`, `by`, `up`, `out`, `not`) do not become standalone
+ // match triggers on every English paragraph.
interface EncodingTypeDef {
letter: string;
name: string;
triggerWords: string[];
-   triggerRegex: RegExp | null;
+   stemmedPhrases: string[][];
qualityCriteria: Array<{ criterion: string; check: string; gapMessage: string }>;
}

@@ -79,9 +90,12 @@ interface ParsedArtifact {
priority_band?: string;
}

- let cachedEncodingTypes: EncodingTypeDef[] | null = null;
- let cachedEncodingTypesKnowledgeBaseUrl: string | undefined = undefined;
- let cachedEncodingTypesSource: "knowledge_base" | "minimal" = "minimal";
+ // D9 / klappy://canon/principles/cache-fetches-and-parses — no module-level
+ // cache on the parse product. fetcher.getFile / fetcher.getIndex already cache
+ // the canon read (Module Memory → Cache API → R2, 5-min TTL). Re-running the
+ // parse loop per request is sub-millisecond derivation work, not worth the
+ // plumbing tax of a keyed cache. Same pattern challenge (0.21.0) and gate
+ // (0.20.0) already applied.

// Governance-driven challenge types (E0008 — mirrors encode pattern from PR #96)
interface ChallengeTypeDef {
@@ -409,10 +423,6 @@ async function discoverEncodingTypes(
fetcher: KnowledgeBaseFetcher,
knowledgeBaseUrl?: string,
): Promise<{ types: EncodingTypeDef[]; source: "knowledge_base" | "minimal" }> {
- if (cachedEncodingTypes && cachedEncodingTypesKnowledgeBaseUrl === knowledgeBaseUrl) {
-   return { types: cachedEncodingTypes, source: cachedEncodingTypesSource };
- }

const index = await fetcher.getIndex(knowledgeBaseUrl);
const typeArticles = index.entries.filter(
(entry: IndexEntry) => entry.tags?.includes("encoding-type") && entry.path.includes("encoding-types/"),
@@ -437,10 +447,28 @@
const triggerWords = triggerSection
? triggerSection[1].split(",").map((w: string) => w.trim()).filter((w: string) => w.length > 0)
: [];
- const triggerRegex =
-   triggerWords.length > 0
-     ? new RegExp("\\b(" + triggerWords.map((w: string) => w.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")).join("|") + ")\\b", "i")
-     : null;
+ // D5 / klappy://canon/principles/vodka-architecture — classification is
+ // stemmed phrase-subset matching, not regex alternation. Each canon
+ // trigger word/phrase is parsed once into its ordered stem sequence;
+ // runtime tokenizes input once and a type matches when ALL stems of
+ // at least one phrase are present. Inflected forms (deciding → decid,
+ // realizing → realiz) match their canonical stems without canon having
+ // to list each inflection. Stop-word filtering is disabled (empty Set)
+ // on both the parse-time and runtime tokenize() calls — canon vocab
+ // includes stop-word-adjacent phrases (`going with`, `committed to`,
+ // `must not`, `turns out`, `next step`, `blocked by`, `found that`)
+ // and dropping them would silently break the strictly-additive
+ // invariant, the same failure mode P1.3.3 hit on challenge's
+ // `from`-in-source-named vocab. Phrase-level conjunction (all stems
+ // of a phrase must match) is the precision floor: without it,
+ // ubiquitous function words like `to`/`with`/`by`/`up`/`out`/`not`
+ // would become standalone triggers on every English paragraph.
+ // Per canon/constraints/release-validation-gate and P1.3.3 C-04.
+ const stemmedPhrases: string[][] = [];
+ for (const word of triggerWords) {
+   const stems = tokenize(word, new Set());
+   if (stems.length > 0) stemmedPhrases.push(stems);
+ }

const criteriaSection = content.match(
/## Quality Criteria[\s\S]*?\| Criterion[\s\S]*?\|[-|\s]+\|\n([\s\S]*?)(?=\n\n|\n##|$)/,
@@ -459,7 +487,7 @@
}
}

- types.push({ letter, name, triggerWords, triggerRegex, qualityCriteria });
+ types.push({ letter, name, triggerWords, stemmedPhrases, qualityCriteria });
} catch {
continue;
}
@@ -495,17 +523,21 @@ async function discoverEncodingTypes(
["H", "Handoff", ["next session", "next step", "todo", "follow up", "blocked by"]],
["E", "Encode", ["encoded", "captured", "crystallized", "persisted", "artifact"]],
];
- resolved = defaults.map(([letter, name, words]) => ({
-   letter, name, triggerWords: words,
-   triggerRegex: new RegExp("\\b(" + words.join("|") + ")\\b", "i"),
-   qualityCriteria: [],
- }));
+ resolved = defaults.map(([letter, name, words]) => {
+   const stemmedPhrases: string[][] = [];
+   for (const word of words) {
+     const stems = tokenize(word, new Set());
+     if (stems.length > 0) stemmedPhrases.push(stems);
+   }
+   return {
+     letter, name, triggerWords: words,
+     stemmedPhrases,
+     qualityCriteria: [],
+   };
+ });
source = "minimal";
}

- cachedEncodingTypes = resolved;
- cachedEncodingTypesKnowledgeBaseUrl = knowledgeBaseUrl;
- cachedEncodingTypesSource = source;
return { types: resolved, source };
}

@@ -1084,6 +1116,25 @@ function isPrefixedBatchInput(input: string): boolean {
return paragraphs.some((p) => PREFIX_TAG_REGEX.test(p));
}

+ // Phrase-subset match — a phrase matches when ALL of its stems appear in the
+ // input stem set. Short-circuits on the first phrase that matches. The D5
+ // matcher shape for encode trigger-word classification, mirroring the shape
+ // used by evaluatePrerequisiteCheck in the P1.3.3 challenge evaluator:
+ // single-stem phrases degenerate to set membership (identical to the old
+ // single-token behavior), while multi-stem phrases like
+ // `committed to` → ["committ","to"] require both stems to co-occur, so
+ // ubiquitous function words cannot match on their own.
+ function matchesStemmedPhrases(phrases: string[][], input: Set<string>): boolean {
+   for (const phrase of phrases) {
+     let allPresent = true;
+     for (const stem of phrase) {
+       if (!input.has(stem)) { allPresent = false; break; }
+     }
+     if (allPresent) return true;
+   }
+   return false;
+ }

function parsePrefixedBatchInput(input: string, types: EncodingTypeDef[]): ParsedArtifact[] {
const typeMap = new Map(types.map((t) => [t.letter, t.name]));
const paragraphs = input.split(/\n\n+/).map((p) => p.trim()).filter((p) => p.length > 0);
@@ -1118,9 +1169,16 @@ function parsePrefixedBatchInput(input: string, types: EncodingTypeDef[]): ParsedArtifact[] {
// Untagged paragraph in a batch that contains tags: classify via trigger
// words like parseUnstructuredInput, but emit one artifact per paragraph
// (not one-per-match) to preserve the author's paragraph boundaries.
+ // Stemmed set intersection mirrors parseUnstructuredInput — stop-words
+ // disabled on tokenize() both sides per P1.3.3 C-04 (canon vocab
+ // includes stop-word phrases like `going with` / `must not`).
let matched: EncodingTypeDef | null = null;
+ const inputStems = new Set(tokenize(para, new Set()));
for (const t of types) {
-   if (t.triggerRegex && t.triggerRegex.test(para)) { matched = t; break; }
+   // Break on first match: this path picks one type per paragraph by
+   // design (paragraph boundaries are the author's). Unlike
+   // parseUnstructuredInput which emits one artifact per matching type.
+   if (matchesStemmedPhrases(t.stemmedPhrases, inputStems)) { matched = t; break; }
}
const pick = matched ?? types[0] ?? { letter: "D", name: "Decision" };
const first = para.split(/[.!?\n]/)[0]?.trim() || para.slice(0, 60);
@@ -1157,12 +1215,19 @@ function parseUnstructuredInput(input: string, types: EncodingTypeDef[]): ParsedArtifact[] {
const artifacts: ParsedArtifact[] = [];
for (const para of paragraphs) {
let matched = false;
+ // Hoist tokenize(para) out of the per-type loop — para is constant across
+ // the loop, stemmedTokens differ per type. Mirrors the P1.3.3 challenge
+ // prereq evaluator shape. Stop-words disabled (empty Set) on both parse-
+ // time and runtime tokenize() calls so canon vocab like `going with`,
+ // `must not`, `turns out`, `found that` survives on both sides. Per
+ // canon/constraints/release-validation-gate and P1.3.3 Bug #1 precedent.
+ const inputStems = new Set(tokenize(para, new Set()));
for (const t of types) {
// DESIGN: no break — a paragraph can match multiple types intentionally.
// "We must never deploy without tests" is both Decision and Constraint.
// Multi-typing at the server level mirrors what the model would do with
// separate TSV rows. Do not add a break here.
-   if (t.triggerRegex && t.triggerRegex.test(para)) {
+   if (matchesStemmedPhrases(t.stemmedPhrases, inputStems)) {
const first = para.split(/[.!?\n]/)[0]?.trim() || para.slice(0, 60);
const title = first.split(/\s+/).length <= 12 ? first : first.split(/\s+/).slice(0, 8).join(" ") + "...";
artifacts.push({ type: t.letter, typeName: t.name, fields: [t.letter, title, para.trim()], title, body: para.trim() });
@@ -1518,9 +1583,10 @@ async function runCleanupStorage(
// Also clear the in-memory BM25 index
cachedBM25Index = null;
cachedBM25Entries = null;
- cachedEncodingTypes = null;
- cachedEncodingTypesKnowledgeBaseUrl = undefined;
- cachedEncodingTypesSource = "minimal";
+ // cachedEncodingTypes removed in 0.23.0 per cache-fetches-and-parses —
+ // encode's parse product is no longer cached in-process. The fetch tier
+ // (Cache API, R2) already handles canon file caching; the derivation is
+ // sub-millisecond. No reset needed here.
// E0008 — governance-driven challenge caches (mirror PR #96 fix)
cachedChallengeTypes = null;
cachedChallengeTypesKnowledgeBaseUrl = undefined;