A bootstrap prompt and reference guide for building a self-reinforcing Claude Code development environment in any Taleemabad team repo.
A "harness" is a system that makes Claude reliably good — not just occasionally good. It has two sides:
- Guides (feedforward): CLAUDE.md files, agents, skills — steer Claude before it acts
- Sensors (feedback): Hooks, validators, evals — observe after Claude acts and correct mistakes
Without both, you have a style guide nobody enforces. This repo gives you the pattern to build both, in about 2–4 hours.
- Open a new Claude Code session in your team repo
- Paste the contents of `HARNESS_BOOTSTRAP.md` into the session (or point Claude to it)
- Say: "Read this file completely. Then audit my repo and build the harness described in it, phase by phase. Ask me before completing each phase."
Claude will work through 9 phases and build the harness. You review each phase before it moves on.
| Phase | What it builds |
|---|---|
| 1 — Audit | One-page repo snapshot before touching anything |
| 2 — CLAUDE.md hierarchy | L1 router → L2 folder READMEs → L3 typed docs (6 doc types with full frontmatter) |
| 3 — Hooks | PreToolUse (hard blocks), PostToolUse (validation), SessionStart (context injection), Stop (session-end checks) |
| 4 — Beads | Append-only JSONL work tracker: status.jsonl, decisions.jsonl, failures.jsonl |
| 5 — Agents | Scoped sub-agents for repeated tasks (docs-updater, debugger, deploy-validator) |
| 6 — Skills | Slash-command shortcuts for common operations (/deploy, /debug, /review-pr) |
| 7 — Standards | .claude/standards/ — domain rules Claude enforces without being asked |
| 8 — What not to do | Common failure modes and how this harness avoids them |
| 9 — Eval harness | YAML task suite measuring routing quality: hops, 6 SLOs, cost tracking |
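The Phase 3 hook events map onto Claude Code's hook configuration in `.claude/settings.json`. A minimal sketch of the wiring — the script paths and the `Bash` matcher are illustrative assumptions, not part of this repo:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {"type": "command", "command": ".claude/hooks/block-dangerous.sh"}
        ]
      }
    ],
    "SessionStart": [
      {
        "hooks": [
          {"type": "command", "command": ".claude/hooks/inject-open-beads.sh"}
        ]
      }
    ]
  }
}
```

In Claude Code, a PreToolUse command hook can veto the tool call by exiting with code 2 (its stderr is fed back to Claude) — that is what makes these hard blocks rather than suggestions.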
Progressive disclosure (L1→L2→L3)
Context is a depletable resource: every token loaded dilutes the signal of the tokens already there. The L1/L2/L3 system loads the minimum high-signal context needed for each task, not everything upfront.
| Level | Contents | Line limit | When loaded |
|---|---|---|---|
| L1 | CLAUDE.md — routing only, critical rules | ≤150 lines | Always |
| L2 | Folder READMEs, skill indexes | ≤100 lines | When domain matches |
| L3 | Runbooks, references, investigation docs | ≤300 lines | When blocked without it |
| L4 | Archives, full changelogs | Unlimited | On-demand only |
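The line limits above only hold if something checks them. A minimal sketch of a budget checker, runnable from a PostToolUse hook or CI — the glob patterns are illustrative assumptions about your repo layout:

```python
from pathlib import Path

# Line budgets per level, from the table above. Patterns are examples;
# adjust them to where your L1/L2/L3 docs actually live.
BUDGETS = [
    ("CLAUDE.md", 150),      # L1 router
    ("*/README.md", 100),    # L2 folder READMEs
    ("docs/**/*.md", 300),   # L3 runbooks and references
]

def over_budget(root="."):
    """Yield (path, line_count, limit) for every doc exceeding its budget."""
    seen = set()
    for pattern, limit in BUDGETS:
        for p in Path(root).glob(pattern):
            if p in seen:
                continue
            seen.add(p)
            n = len(p.read_text().splitlines())
            if n > limit:
                yield str(p), n, limit
```

Run it at session end (the Stop hook) and fail loudly on any hit, so doc bloat is caught the moment it happens.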
Beads
Append-only JSONL files in .beads/ that survive context resets. Every task opens a bead; every decision and failure gets logged. A session-start hook re-injects the last 5 open beads automatically.
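A sketch of the append-only mechanics, assuming a minimal bead shape of `id`/`title`/`status` (the real field set is up to your Phase 4 design). Updates are new lines, never edits, so the latest line for a given id wins:

```python
import json
from pathlib import Path

def open_bead(path, bead_id, title):
    """Append a new open bead as one JSONL line (never rewrite old lines)."""
    line = json.dumps({"id": bead_id, "title": title, "status": "open"})
    with open(path, "a") as f:
        f.write(line + "\n")

def last_open_beads(path, n=5):
    """Return up to n beads still marked open, for session-start injection.
    A later line with the same id supersedes earlier ones."""
    latest = {}
    for raw in Path(path).read_text().splitlines():
        bead = json.loads(raw)
        latest[bead["id"]] = bead
    open_beads = [b for b in latest.values() if b["status"] == "open"]
    return open_beads[-n:]
```

Closing a bead is just appending the same id with `"status": "closed"` — the log stays a full history, which is what lets it survive context resets.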
Eval harness
Measures whether your docs actually work: does Claude reach the right file in ≤2 hops? Wrong-route rate target < 5%. Cost tracked per run (Tier 1: free, Tier 2: ~$0.05–0.50). Results stored in .claude/evals/baselines/ as append-only JSONL.
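A task entry in the YAML suite might look like this — the field names are illustrative, not a fixed schema:

```yaml
# Hypothetical routing task; adapt ids, prompts, and paths to your repo.
- id: deploy-routing-001
  prompt: "Where do I find the production deploy runbook?"
  expect:
    target_file: docs/runbooks/deploy.md
    max_hops: 2          # SLO: right file within 2 hops from L1
  tier: 1                # Tier 1: static check, free to run
```

Each run appends its hop counts and pass/fail results to the baselines JSONL, so routing regressions show up as a diff against the previous run.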
- `HARNESS_BOOTSTRAP.md` ← the full 9-phase bootstrap prompt (paste into Claude)
- `README.md` ← this file
Built from the patterns in two internal repos:
- CEO OS — the reference implementation for hooks, progressive disclosure, and doc types
- Rumi — the reference implementation for eval harness, hops measurement, and 6 SLOs
Any team member at Taleemabad can use this bootstrap to replicate those patterns in their own repo in a single session.