canon: Search-Corpus Boundary in core-governance-baseline (E0008.5) #155
Merged
… (E0008.5) Establishes the contract for what the search corpus indexes when knowledge_base_url is set. Defaults to overlay + required-baseline; opt-in to merged via include_full_baseline: true. Operationalizes klappy://canon/principles/scoped-truth and closes the gap named in klappy/ptxprint-mcp's oddkit-kb-isolation-feature-request handoff: project KBs were having their own canon outranked in BM25 by the unrelated baseline content of co-located repos. Companion code change ships in klappy/oddkit as a separate PR that cites this section. CHANGELOG bumps to 0.36.0 (Epoch 8.5).
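The default and opt-in described above can be sketched as a filter over index entries. This is a minimal illustration, not the oddkit implementation: the CorpusEntry shape, the selectSearchCorpus function, the sample paths, and the manifest contents are all hypothetical, and the behavior when knowledge_base_url is unset is assumed to stay merged.

```typescript
// Hypothetical entry shape; the real oddkit index types are not shown in this PR.
type Source = "overlay" | "baseline";

interface CorpusEntry {
  path: string;
  source: Source;
}

interface ScopeOptions {
  knowledgeBaseUrl?: string;     // mirrors knowledge_base_url
  includeFullBaseline?: boolean; // mirrors include_full_baseline
}

// Returns the entries that would be indexed for search.
function selectSearchCorpus(
  entries: CorpusEntry[],
  requiredBaseline: ReadonlySet<string>,
  opts: ScopeOptions,
): CorpusEntry[] {
  // Assumption: with no project KB, or with the explicit opt-in,
  // the prior merged corpus is used unchanged.
  if (!opts.knowledgeBaseUrl || opts.includeFullBaseline === true) {
    return entries;
  }
  // Scoped default: overlay docs plus the required-baseline files only.
  return entries.filter(
    (e) => e.source === "overlay" || requiredBaseline.has(e.path),
  );
}

// Illustrative data only; which files are required-baseline is an assumption here.
const entries: CorpusEntry[] = [
  { path: "docs/project-overview.md", source: "overlay" },
  { path: "canon/constraints/core-governance-baseline.md", source: "baseline" },
  { path: "canon/constraints/release-validation-gate.md", source: "baseline" },
];
const requiredBaseline = new Set(["canon/constraints/core-governance-baseline.md"]);
const kbUrl = "https://github.com/klappy/ptxprint-mcp";

const scoped = selectSearchCorpus(entries, requiredBaseline, { knowledgeBaseUrl: kbUrl });
const merged = selectSearchCorpus(entries, requiredBaseline, {
  knowledgeBaseUrl: kbUrl,
  includeFullBaseline: true,
});

console.log(scoped.map((e) => e.path));
console.log(merged.length);
```

In this sketch the unrelated baseline document never enters the scoped corpus, so it cannot compete with project docs for ranking slots; passing includeFullBaseline: true restores the merged corpus.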
This was referenced Apr 29, 2026
Summary
Adds a Search-Corpus Boundary section to canon/constraints/core-governance-baseline.md (tier 1) defining what the worker BM25-indexes when knowledge_base_url is set. Default becomes overlay + required-baseline; the prior merged behavior is opt-in via include_full_baseline: true.

This is the canon-side of E0008.5 (Search-Corpus Boundary, Project-KB Visibility). The companion code change in klappy/oddkit follows in a separate PR that cites this section. Per canon/constraints/governance-change-discipline.md, canon precedes code.

Why
Reproduced today against oddkit v0.26.0 with knowledge_base_url=https://github.com/klappy/ptxprint-mcp:

- oddkit_catalog: total 586 (overlay 21, baseline 566), i.e., 96.4% of the indexed corpus came from klappy.dev, not the project KB
- oddkit_search for a klappy.dev-only query (release validation gate Bugbot Sonnet validator): top hit was canon/constraints/release-validation-gate.md from klappy.dev, ranked above every project doc, labeled source: "baseline"
- https://github.com/klappy/klappy.dev/archive/main.zip (19 MB) during a search keyed to ptxprint-mcp

These numbers match the prior measurement session captured in klappy/ptxprint-mcp at canon/handoffs/oddkit-kb-isolation-feature-request.md (584/19/566 then; 586/21/566 now: overlay grew by two docs, contamination unchanged). That handoff is the design doc; this PR codifies the contract its Option C names.

The failure shape is the one klappy://canon/principles/scoped-truth already names: a single knowledge base serving every context, where unrelated domains compete for ranking slots and the project KB's own canon gets outvoted.

What changed
- canon/constraints/core-governance-baseline.md: new H2 section between §"What Ships in the Baseline" and §"Build-Time Invariants" with six sub-sections: why scoping defaults to on, opt-in to merged, scope-vs-resolution boundary, affected-tools table, cache-key rule, telemetry fields.
- canon/CHANGELOG.md: 0.36.0 entry tagged Epoch 8.5.

No frontmatter changes to existing docs. Runtime Invariant #5 (baseline path is never user-configurable) is preserved: scoping is a search-index property, not a baseline-floor change.

Affected tools (per the new table)
- oddkit_search, oddkit_catalog, oddkit_preflight get the new default + opt-in.
- oddkit_orient, oddkit_get, oddkit_challenge, oddkit_validate, oddkit_gate, oddkit_encode are unchanged; they read governance via the per-file resolver, not the search index.

Code PR sequencing
Per the operator's directive: canon merges first, then the klappy/oddkit PR cites this. The code PR will:

- workers/baseline/MANIFEST.json enumerating the six required-baseline files this section names (closes Build-Time Invariant #4).
- include_full_baseline?: boolean to oddkit_search, oddkit_catalog, oddkit_preflight schemas.
- arbitrateEntries / index build in workers/src/zip-baseline-fetcher.ts to filter baseline against the manifest when scoped.
- (baselineSha, knowledgeBaseSha, scope).
- klappy/ptxprint-mcp (expect catalog total to drop from ~586 to ~27).
- completed; orchestrate.ts/governance-read changes get an independent Sonnet 4.6 validator before promotion.

Note
Medium Risk
Documentation-only change, but it declares a new default retrieval contract for oddkit_search/oddkit_catalog/oddkit_preflight; downstream implementations may change behavior and ranking expectations.

Overview
Introduces a new Tier-1 Search-Corpus Boundary section in canon/constraints/core-governance-baseline.md that defines scoped retrieval when knowledge_base_url is set: default search corpus becomes overlay + required-baseline, with include_full_baseline: true as the explicit opt-in to the prior merged behavior.

The new contract clarifies scope vs per-file resolution, lists affected tools, and specifies cache-key/telemetry fields to distinguish scoped vs merged indexing. Updates canon/CHANGELOG.md with a 0.36.0 (Epoch 8.5) entry documenting the behavioral intent and upcoming companion oddkit code change.

Reviewed by Cursor Bugbot for commit 7d1e844.
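As a rough illustration of why the cache key needs to distinguish scoped from merged indexing, the sketch below derives a key from the (baselineSha, knowledgeBaseSha, scope) tuple named in the code PR plan. The indexCacheKey function, the Scope type, the colon-joined string format, and the SHA values are assumptions made for this example; the PR does not specify them.

```typescript
// "scoped" = overlay + required-baseline; "merged" = include_full_baseline: true.
type Scope = "scoped" | "merged";

interface IndexCacheKeyParts {
  baselineSha: string;
  knowledgeBaseSha: string;
  scope: Scope;
}

// Hypothetical key format: the three parts joined with ":".
function indexCacheKey(parts: IndexCacheKeyParts): string {
  return [parts.baselineSha, parts.knowledgeBaseSha, parts.scope].join(":");
}

// Placeholder SHAs for illustration only.
const scopedKey = indexCacheKey({
  baselineSha: "abc123",
  knowledgeBaseSha: "def456",
  scope: "scoped",
});
const mergedKey = indexCacheKey({
  baselineSha: "abc123",
  knowledgeBaseSha: "def456",
  scope: "merged",
});

// A simple in-memory cache keyed this way keeps the two indexes separate.
const indexCache = new Map<string, string[]>();
indexCache.set(scopedKey, ["overlay docs", "required-baseline docs"]);
indexCache.set(mergedKey, ["overlay docs", "full baseline docs"]);

console.log(scopedKey);
console.log(indexCache.size);
```

Without the scope component, a merged index built for one request could be served to a later scoped request at the same two SHAs, which would silently reintroduce the contamination this contract removes.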