
feat: EXP-15-19 training research, Gemma 4 adapter, data pipeline#377

Merged
CalebisGross merged 3 commits into main from autoresearch/ft-mar25 on Apr 4, 2026
Conversation

@CalebisGross
Collaborator

Summary

  • EXP-15 through EXP-19 completed: rotation experiments, data scaling (12K encoding examples), Gemma 4 E2B integration
  • Encoding spoke solved: 100% novel schema compliance on both Qwen 3.5 2B and Gemma 4 E2B
  • Bespoke spoke models outperform Gemini 3 Flash on mnemonic's encoding task (100% vs 0% schema compliance)
  • Qwen 3.5 2B selected as production encoding model (1.7x faster than Gemma 4 locally at equal quality)
  • Hallucination stress test: both spoke models 5/7, Gemini 1/7

Key additions

  • gemma_spoke_adapter.py — Gemma 4 E2B spoke adapter (NF4, PLE CPU offload, SpokeWrappedLayer); a rough sketch follows this list
  • batch_encode.py — Gemini Batch API pipeline for scalable training data generation
  • compare_models.py — Side-by-side model comparison (schema, speed, quality)
  • stress_test_hallucination.py — Hard input testing for detail preservation
  • enrich_and_generate.py, extract_prenuke_data.py, merge_training_data.py — Data pipeline
  • train_qwen_spokes.py — Updated with --model-type gemma support, OOM protection
  • experiment_registry.md — Full results for EXP-15 through EXP-19
  • Handoff encoding preserved verbatim in agent.go
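The SpokeWrappedLayer and NF4 setup in gemma_spoke_adapter.py are project-specific and not shown in this description. As a rough illustration of the loading path, here is a minimal sketch assuming a bitsandbytes NF4-quantized base model and a LoRA-style low-rank "spoke" wrapper around a frozen linear layer. The model id, rank, and wrapping targets are placeholders, not the actual adapter code.

```python
# Minimal sketch: NF4-quantized base model plus a low-rank "spoke" wrapper.
# Assumes transformers + bitsandbytes; model id, rank, and wrapped layers are placeholders.
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",           # NF4 quantization, as noted in the PR
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "google/gemma-model-id",             # placeholder for the Gemma 4 E2B checkpoint
    quantization_config=bnb_config,
    device_map="auto",                   # allows large embedding tables to sit on CPU
)

class SpokeWrappedLayer(nn.Module):
    """Frozen base linear layer plus a small trainable low-rank 'spoke' path (assumed design)."""

    def __init__(self, base_layer: nn.Module, rank: int = 16):
        super().__init__()
        self.base_layer = base_layer
        for p in self.base_layer.parameters():
            p.requires_grad = False       # base weights stay frozen
        in_f, out_f = base_layer.in_features, base_layer.out_features
        self.down = nn.Linear(in_f, rank, bias=False)
        self.up = nn.Linear(rank, out_f, bias=False)
        nn.init.zeros_(self.up.weight)    # adapter starts as a no-op

    def forward(self, x):
        return self.base_layer(x) + self.up(self.down(x))
```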

Test plan

  • Qwen novel eval: 10/10 (100%) schema compliance
  • Gemma novel eval: 10/10 (100%) schema compliance
  • Hallucination stress test: Qwen 5/7, Gemma 5/7
  • Model comparison: Qwen 19.7s/input, Gemma 33.9s/input, Gemini fails schema compliance
  • End-to-end daemon integration via serve_spokes.py (rough endpoint sketch after this list)
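serve_spokes.py itself is not included in this description; the sketch below only illustrates the general shape of an OpenAI-compatible /v1/chat/completions endpoint on port 8899 that a daemon configured for LM Studio could point at. The endpoint path and port come from the PR text; the handler names, request model, and generation stub are assumptions.

```python
# Minimal sketch of an OpenAI-compatible chat endpoint on port 8899 (assumed shape,
# not the actual serve_spokes.py). Model/tokenizer loading and generation are stubbed.
import time
import uvicorn
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):
    model: str
    messages: list[dict]
    max_tokens: int = 512

def run_spoke_model(messages: list[dict], max_tokens: int) -> str:
    # Placeholder: the real server would build a prompt (e.g. apply_chat_template)
    # and call model.generate on the spoke-adapted Qwen/Gemma model.
    return "stub response"

@app.post("/v1/chat/completions")
def chat_completions(req: ChatRequest):
    text = run_spoke_model(req.messages, req.max_tokens)
    return {
        "id": "chatcmpl-local",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": req.model,
        "choices": [
            {
                "index": 0,
                "message": {"role": "assistant", "content": text},
                "finish_reason": "stop",
            }
        ],
    }

if __name__ == "__main__":
    uvicorn.run(app, host="127.0.0.1", port=8899)
```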

🤖 Generated with Claude Code

CalebisGross and others added 3 commits April 4, 2026 08:33
…testing

Research session covering rotation experiments, data scaling, Gemma 4
integration, and production quality validation.

Experiments:
- EXP-15/15b: orthogonal rotation (refuted full-space, minor bottleneck)
- EXP-16: clean run 3 replication (eval 0.6074, 70% novel schema)
- EXP-17: v2 dataset — 100% novel schema (removed poison data)
- EXP-18: 12K encoding-only — 100% novel schema confirmed
- EXP-19: Gemma 4 E2B + spokes — 100% schema, 5/7 stress test

Infrastructure:
- gemma_spoke_adapter.py (NF4, PLE offload, SpokeWrappedLayer)
- batch_encode.py (Gemini Batch API, 50% cheaper)
- compare_models.py, stress_test_hallucination.py
- enrich_and_generate.py, extract_prenuke_data.py, merge_training_data.py
- train_qwen_spokes.py --model-type gemma support
- Handoff encoding preserved verbatim (agent.go)

Results: Qwen 100% schema 20s/input vs Gemma 100% 34s/input vs Gemini 0%.
Both spoke models 5/7 on hallucination stress test. Qwen selected as
production encoding model for speed at equal quality.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
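For context on the batch_encode.py pipeline mentioned in the commit above, here is a rough sketch of preparing a JSONL file of encoding prompts for batch submission. The per-line request schema is only an approximation of the Gemini Batch API format and should be checked against its documentation; the file path and prompt template are invented for illustration.

```python
# Rough sketch: writing encoding prompts to a JSONL file for batch submission.
# The request schema approximates the Gemini Batch API format; verify against the docs.
# File paths and the prompt template are placeholders.
import json

PROMPT_TEMPLATE = "Encode the following note into the target schema:\n\n{note}"

def build_batch_file(notes: list[str], out_path: str = "encode_requests.jsonl") -> None:
    with open(out_path, "w", encoding="utf-8") as f:
        for i, note in enumerate(notes):
            record = {
                "key": f"encode-{i}",
                "request": {
                    "contents": [
                        {
                            "role": "user",
                            "parts": [{"text": PROMPT_TEMPLATE.format(note=note)}],
                        }
                    ]
                },
            }
            f.write(json.dumps(record) + "\n")

if __name__ == "__main__":
    build_batch_file(["example note text"])
    # The resulting JSONL is then uploaded and submitted as a batch job,
    # which the commit message cites as roughly half the cost of interactive calls.
```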
HTTP server on port 8899 that serves Qwen/Gemma + spokes as a
/v1/chat/completions endpoint. Allows the mnemonic daemon to use the
spoke model like LM Studio, without GGUF export.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@CalebisGross CalebisGross merged commit f6ce427 into main Apr 4, 2026
@CalebisGross CalebisGross deleted the autoresearch/ft-mar25 branch April 4, 2026 12:38