
fix: type-filtered recall surfaces recent memories first #395

Merged
CalebisGross merged 5 commits into main from feat/gemma-e2b-spokes
Apr 11, 2026
Conversation

@CalebisGross
Collaborator

Summary

Fixes #394. Recall with a type filter (e.g. type:"handoff") now reliably surfaces the most recent memory of that type, instead of older memories with richer association graphs.

  • Type-filtered recency boost: New config params TypeFilterRecencyWeight (0.5) and TypeFilterRecencyHalfLife (7 days) override general recency (0.2 / 30 days) for type-filtered queries. When you filter by type, you've already constrained what — the system now prioritizes when.
  • check_memory content: Output now includes the full Content field (previously the field was omitted from the format string entirely, not merely truncated).

Changes

File                                               What
internal/agent/retrieval/agent.go                  Config struct + ranking branch for type-filtered recency
internal/config/config.go                          Config struct + defaults (0.5 weight, 7-day half-life)
cmd/mnemonic/runtime.go                            Wire new config fields to retrieval agent
internal/mcp/server.go                             Add Content line to check_memory output
internal/agent/retrieval/config_behavior_test.go   2 new tests for type-filter recency
internal/mcp/server_test.go                        1 new test for check_memory content

Verified

  • Recency bonus for ~11h-old handoff: 0.469 (was 0.197)
  • Most recent handoff now gets recency_bonus: 0.499 (near max)
  • check_memory shows full content
  • All tests pass, lint clean

Test plan

  • TestConfigTypeFilterRecencyBoostsRecent — recent handoff ranks above older one with more associations
  • TestConfigTypeFilterRecencyParamsUsed — aggressive params override general ones
  • TestHandleCheckMemoryIncludesContent — content field present in output
  • Live verification via daemon HTTP endpoint (curl to /mcp)
  • Live verification via MCP tools after killing stale subprocesses

🤖 Generated with Claude Code

CalebisGross and others added 5 commits April 10, 2026 22:32
…ysis

Best eval loss: 1.2002 (PPL 3.3) at step 4800. Early stopped at step
5800 after 9.5h on RX 7800 XT. Two-phase learning: peak LR caused
instability (regression steps 1200-1600), minimum LR produced steady
second descent through 14 consecutive new bests. Full per-checkpoint
loss table in registry. Evaluation of SC/EPR/FR/NP pending.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Eval loss improved (1.68→1.20) but generation is degenerate: 2/13
valid JSON (15%), 0 SC. Base model without spokes achieves 24/25 valid.

Root cause: autoregressive generation compounds spoke perturbations
through NF4 dequantization noise. Teacher-forced eval loss does not
predict generation quality for spoke adapters on quantized models.

Production path: Gemma E2B + faithful prompt + GBNF grammar (no
spokes). Spoke training requires full bf16 (MI300X).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Multiple eval runs confirm: 1/10 valid JSON (10%), 0 SC. Base model
without spokes achieves 24/25 valid. The spokes generate faithful
content but cannot maintain JSON structure despite training on 5,238
perfectly structured examples.

Eval loss (-0.483 improvement) does not predict generation quality
for NF4 spoke adapters. Teacher-forced training and autoregressive
generation have fundamentally different error dynamics on quantized
models.

Production path: Gemma E2B + faithful prompt + GBNF grammar.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Python HF generate() produces valid faithful JSON with trained spokes.
llama.cpp server produces garbage with the same GGUF. The discrepancy
is an inference engine bug, not a training failure. GBNF grammar was
never tested through a working path. Verdict suspended pending
llama.cpp debugging and spokes + GBNF evaluation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Type-filtered queries (e.g. type:"handoff") now use stronger recency
scoring: weight 0.5 with 7-day half-life (vs general 0.2/30-day).
When you filter by type, you've already constrained relevance — recency
should dominate. Also adds Content field to check_memory output.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@CalebisGross merged commit 86673ea into main Apr 11, 2026
@CalebisGross deleted the feat/gemma-e2b-spokes branch April 11, 2026 13:22


Development

Successfully merging this pull request may close these issues.

Recall fails to surface most recent handoffs
