feat: EXP-25 faithfulness probe, encoding fixes, MCP HTTP transport#389

Merged
CalebisGross merged 11 commits into main from feat/exp25-faithfulness-probe
Apr 9, 2026
Conversation

@CalebisGross
Collaborator

Summary

This branch bundles the EXP-25 faithfulness probe research with critical infrastructure fixes that came out of it.

EXP-25: Faithfulness Probe

  • Confirmed Qwen 3.5 2B + spoke architecture can learn faithful encoding on diverse inputs (25 categories, 100% entity preservation rate, 100% number preservation, 0% template echo)
  • Added eval_faithfulness.py (7-metric evaluation), prepare_faithfulness_data.py, training_constants.py with build_production_prompt() matching daemon format
  • Implemented chunked_cross_entropy() in the training script to handle Qwen's 248K vocab at seq_len 2375 (see the sketch after this list)
  • Pre-registered EXP-26 (v7 dataset) and EXP-27 (Qwen 3.5 4B)
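
The PR only names the helper, so here is a minimal PyTorch sketch of the chunking idea. The signature, shapes, and default chunk size are assumptions (the commit notes it processes 256 positions at a time), not the actual train_qwen_spokes.py code, and any label shifting for next-token prediction is assumed to happen before the call.

```python
import torch
import torch.nn.functional as F

def chunked_cross_entropy(logits, labels, chunk_size=256, ignore_index=-100):
    """Cross-entropy over a long sequence, chunk_size positions at a time.

    Avoids materializing the full (seq_len, vocab) softmax buffer at once,
    which is what runs out of memory with a ~248K vocab at seq_len > 2048.
    logits: (seq_len, vocab_size), labels: (seq_len,), already shifted.
    """
    total_loss = logits.new_zeros(())
    total_count = 0
    for start in range(0, labels.size(0), chunk_size):
        chunk_logits = logits[start:start + chunk_size]
        chunk_labels = labels[start:start + chunk_size]
        valid = (chunk_labels != ignore_index).sum().item()
        if valid == 0:
            continue
        # Sum per chunk so the final mean weights every token equally.
        total_loss = total_loss + F.cross_entropy(
            chunk_logits.float(), chunk_labels,
            ignore_index=ignore_index, reduction="sum",
        )
        total_count += valid
    return total_loss / max(total_count, 1)
```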

Bug Fixes (#382, #383)

  • Removed getRelatedContext() from the encoding pipeline: FTS5 keyword matching was injecting unrelated memory summaries into the LLM prompt, causing cross-contamination (#383)
  • amend now accepts raw_id in addition to memory_id, falling back to GetMemoryByRawID when the memory_id lookup fails (#382)

MCP HTTP Transport (#384)

  • Daemon serves MCP over POST /mcp, eliminating per-session subprocess spawning (a request-level sketch follows this list)
  • SessionManager creates/caches MCPServer instances per session ID with 30-minute idle expiry
  • Claude Code config: {"type": "http", "url": "http://127.0.0.1:9999/mcp"}
  • Result: N sessions x ~3GB VRAM each → one daemon, one model, ~3GB total
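
For reference, a rough client-side sketch of the session flow (method names follow the MCP JSON-RPC protocol and match the test plan below; the exact initialize params the daemon expects are not spelled out here):

```python
import requests

BASE = "http://127.0.0.1:9999/mcp"  # from the Claude Code config above

# First request carries no session header; the daemon generates a session ID
# and returns it via the Mcp-Session-Id response header.
init = requests.post(BASE, json={
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {},  # real clients send protocolVersion/capabilities/clientInfo
})
session_id = init.headers.get("Mcp-Session-Id")

# Subsequent requests reuse the session, so the daemon routes them to the
# cached MCPServer instance instead of creating a new one.
tools = requests.post(
    BASE,
    json={"jsonrpc": "2.0", "id": 2, "method": "tools/list"},
    headers={"Mcp-Session-Id": session_id},
)
print(tools.json())
```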

Dashboard Fixes

  • Timeline "Today" header no longer overlaps first entry
  • Time formatting fixed (zero-padded minutes)

Test plan

  • make check passes
  • go test passes for all changed packages
  • ROCM=1 make build-embedded compiles
  • Daemon healthy with LLM loaded
  • MCP HTTP transport tested end-to-end (initialize, tools/list, tools/call, DELETE)
  • amend with raw_id verified against live daemon
  • Single GPU process after HTTP transport switch

Closes #382, closes #383, closes #384

🤖 Generated with Claude Code

CalebisGross and others added 11 commits April 9, 2026 09:35
Retrained EXP-25 at seq_len 2375 (up from 1280) after identifying that
the daemon's llama-server was holding ~3.4GB VRAM during the initial run.
All 25 diverse training examples now train untruncated.

Results: 25/25 valid JSON (was 3/25), 100% entity preservation, 100%
number preservation, 100% schema compliance, zero template echoing,
clean adversarial twin discrimination. The architecture can learn
faithful encoding on diverse inputs — failures were a data problem.

Key changes:
- Added chunked_cross_entropy() to train_qwen_spokes.py to handle
  Qwen's 248K vocab at long sequences (OOMs with standard cross_entropy
  at seq_len > 2048). Processes 256 positions at a time.
- Removed redundant HF internal loss computation (was passing labels to
  model AND computing loss manually).
- New scripts: eval_faithfulness.py (7-metric eval), prepare_faithfulness_data.py,
  run_exp25.sh, training_constants.py (build_production_prompt).

Tracking: #381

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The fabrication rate metric was counting semantic expansion in concepts
and structured_concepts as fabrication (e.g., "WAL mode on." -> concepts:
["database"] counted as 100% FR). Now only measures content-bearing
fields (gist, summary, content, narrative, outcome) where fabrication
is a real concern.

FR dropped from 25.8% to 3.0% — all 7 faithfulness metrics now pass.
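
The commit does not spell out how fabrication is detected, so the sketch below only illustrates the field restriction the fix introduces, with naive word overlap standing in for whatever check eval_faithfulness.py actually performs:

```python
CONTENT_FIELDS = ("gist", "summary", "content", "narrative", "outcome")

def fabrication_rate(encoded: dict, raw_input: str) -> float:
    """Fraction of words in content-bearing fields that never appear in the input.

    Crude word-overlap stand-in for the real metric. Keyword-style fields
    (concepts, structured_concepts) are deliberately excluded so legitimate
    semantic expansion is not counted as fabrication.
    """
    source_words = set(raw_input.lower().split())
    fabricated = total = 0
    for field in CONTENT_FIELDS:
        value = encoded.get(field)
        if not isinstance(value, str):
            continue
        for word in value.lower().split():
            total += 1
            if word not in source_words:
                fabricated += 1
    return fabricated / total if total else 0.0
```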

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Generates raw inputs across 5 categories for v7 faithfulness training (target counts sketched after this list):
- Production captures (600): extracted from daemon capture files with
  quality filtering (removes document ingestion, garbage, duplicates)
- Out-of-domain (300): 30 non-tech domains via Gemini 3.1 Pro
- Adversarial twins (100 pairs): matched decision pairs via Gemini
- Minimal inputs (100): 1-10 word script-generated inputs
- Dense numbers (100): metric-heavy inputs via Gemini
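
The per-category targets, roughly as a config sketch (how prepare_faithfulness_data.py actually organizes this is not shown in the diff):

```python
# Target raw-input counts per v7 category (from the commit message above).
V7_CATEGORY_TARGETS = {
    "production_captures": 600,     # filtered daemon capture files
    "out_of_domain": 300,           # 30 non-tech domains via Gemini 3.1 Pro
    "adversarial_twin_pairs": 100,  # matched decision pairs via Gemini
    "minimal_inputs": 100,          # 1-10 word script-generated inputs
    "dense_numbers": 100,           # metric-heavy inputs via Gemini
}
```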

Phase 1 of #381 v7 dataset pipeline.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Combined v6 (4,255 encoding) + v7 (1,200 diverse new examples) dataset.
V7 categories: production captures, out-of-domain (30 domains),
adversarial twins (50 pairs), minimal inputs, dense numbers.

Hypothesis: diverse data eliminates faithfulness failures while
maintaining 100% schema and 7/7 stress test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove getRelatedContext() from encoding pipeline — FTS5 keyword
  matching injected unrelated memory summaries into the LLM prompt,
  causing cross-contamination (#383). Also removes extractKeywords
  and joinConcepts (dead code after removal).

- amend tool now accepts raw_id in addition to memory_id, resolving
  via GetMemoryByRawID when memory_id lookup fails (#382). Mirrors
  the check_memory pattern.

- Dashboard: fix sticky "Today" header overlapping first timeline
  entry (top: 30px → 0, solid background). Fix time formatting
  producing single-digit minutes (manual zero-padding replaces
  locale-dependent toLocaleString).

- Sync Python training_constants.py with Go buildCompressionPrompt
  (remove related_ctx parameter). Remove RELATED_MEMORY_STUB from
  prepare_faithfulness_data.py.

Closes #382, closes #383

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fix: encoding faithfulness, amend raw_id, dashboard timeline
…uning

Extract mnemonic's own 1.5-2B model from Gemma 4 31B (30.7B dense,
60 layers) via Sheared-LLaMA-style targeted structural pruning.

Phases: full fine-tune baseline → learned pruning masks → continued
pretraining → standalone GGUF export. Progressive targets 8B→4B→2B→1.5B
to find the quality cliff.

Target: >200 tok/s, <1.5GB VRAM, match EXP-26 faithfulness metrics.
Hardware: MI300X for pruning, local 7800 XT for deployment.

Tracking: #386

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add POST /mcp endpoint to the daemon API, eliminating the need for
per-session stdio subprocesses. Claude Code connects via HTTP transport
to the already-running daemon, sharing its LLM, store, and agents.

- New SessionManager (internal/mcp/session.go) creates and caches
  MCPServer instances per session ID with 30-minute idle expiry
- HTTP handler (internal/api/routes/mcp.go) accepts JSON-RPC requests,
  generates session IDs on first request (returned via Mcp-Session-Id
  header), routes subsequent requests to existing sessions
- Export JSONRPCRequest/Response types and HandleSingleRequest for
  the HTTP transport layer
- Wire session manager into daemon serve pipeline

Claude Code config changes from stdio to HTTP transport:
  {"type": "http", "url": "http://127.0.0.1:9999/mcp"}

Result: N sessions x ~3GB VRAM each → one daemon, one model, ~3GB total.
The mcp subcommand remains as fallback for offline/no-daemon usage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
feat: serve MCP over HTTP transport from daemon
Model-agnostic script that measures per-layer contribution to the
residual stream via forward-pass hooks. Metrics: residual contribution,
cosine drift, composite importance score.
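
The script itself is not in this diff; below is a minimal PyTorch sketch of the two raw signals (residual contribution and cosine drift) gathered via forward hooks. The layer access path is an assumption for HF-style causal LMs, and the composite score and CPU-offload handling are omitted:

```python
import torch
import torch.nn.functional as F

def layer_importance(model, layers, input_ids):
    """Record per-layer residual contribution and cosine drift in one forward pass.

    `layers` is the list of decoder blocks (e.g. model.model.layers for most
    HF causal LMs). How the script combines these into the composite
    importance score is not reproduced here.
    """
    stats = []

    def make_hook(idx):
        def hook(module, inputs, output):
            h_in = inputs[0]                       # hidden states entering the block
            h_out = output[0] if isinstance(output, tuple) else output
            delta = h_out - h_in                   # what the block adds to the residual stream
            contribution = (delta.norm() / h_in.norm()).item()
            drift = 1.0 - F.cosine_similarity(
                h_in.flatten(1), h_out.flatten(1)
            ).mean().item()
            stats.append({"layer": idx,
                          "residual_contribution": contribution,
                          "cosine_drift": drift})
        return hook

    handles = [layer.register_forward_hook(make_hook(i))
               for i, layer in enumerate(layers)]
    try:
        with torch.no_grad():
            model(input_ids)
    finally:
        for h in handles:
            h.remove()
    return stats
```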

Validated on Gemma 4 E2B (35 layers): clear signal — layers 30-32
nearly dead (importance 0.08-0.14), full attention layers avg 0.68 vs
sliding 0.57, classic U-shaped importance curve.

Supports CPU offload for large models. Next: run on Gemma 4 31B
(60 layers) on MI300X.

Tracking: #387, #386

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CalebisGross merged commit 45a7cd5 into main on Apr 9, 2026
CalebisGross deleted the feat/exp25-faithfulness-probe branch on April 9, 2026 at 20:04