Problem
The `retrieval_feedback` table stores full graph traversal snapshots in the `traversed_assocs` JSON column. Each record is 300-600KB, making 176 rows consume ~22MB of the 116MB database (19% of total DB size for 0.5% of rows).
Current behavior
When a retrieval query runs, `WriteRetrievalFeedback` stores:
- `traversed_assocs`: full list of every association edge visited during spread activation (300-600KB JSON)
- `access_snapshot`: memory access state at query time (up to 800B)
- `retrieved_memory_ids`: IDs of returned memories (~400B)

This data is read back by `HandleFeedback` in `feedback.go` to apply Hebbian learning -- strengthening/weakening the specific association edges based on the quality rating.
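The Hebbian step can be sketched roughly as follows. This is a minimal illustration, not the actual mnemonic code: the struct fields, rating strings, and delta constants are all assumptions.

```go
package main

import "fmt"

// TraversedAssoc is a hypothetical slimmed view of one association edge
// visited during spread activation (field names are assumptions).
type TraversedAssoc struct {
	SourceID string
	TargetID string
}

// applyHebbian sketches the feedback step: strengthen every traversed edge
// when the rating was "helpful", weaken it when "irrelevant". The delta
// values are illustrative, not real tuning constants.
func applyHebbian(strengths map[[2]string]float64, traversed []TraversedAssoc, rating string) {
	var delta float64
	switch rating {
	case "helpful":
		delta = 0.1
	case "irrelevant":
		delta = -0.1
	default: // "partial" or unrated: leave strengths unchanged
		return
	}
	for _, a := range traversed {
		key := [2]string{a.SourceID, a.TargetID}
		s := strengths[key] + delta
		// clamp to [0, 1]
		if s < 0 {
			s = 0
		} else if s > 1 {
			s = 1
		}
		strengths[key] = s
	}
}

func main() {
	strengths := map[[2]string]float64{{"m1", "m2"}: 0.5}
	applyHebbian(strengths, []TraversedAssoc{{SourceID: "m1", TargetID: "m2"}}, "helpful")
	fmt.Println(strengths[[2]string{"m1", "m2"}]) // 0.6
}
```

The key point for this issue: all the function needs from the stored snapshot is the set of edges traversed, which motivates Option C below.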
Why it's a problem
- 176 feedback records = ~22MB. At scale this will dominate DB size.
- The traversal data is only needed until feedback is applied. After `HandleFeedback` adjusts the association strengths, the raw traversal is never read again.
- Most records (128/176) have empty feedback -- they stored the traversal but were never rated, so the data was written for nothing.
Proposed fix
Three options (not mutually exclusive):
Option A: Prune after feedback is applied
- After `HandleFeedback` processes a record, null out `traversed_assocs` and `access_snapshot`
- Keeps the query text and rating for analytics but drops the bulk
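Option A is a small mutation. A minimal sketch, where the record shape and function name are hypothetical (the equivalent SQL is in the comment):

```go
package main

import "fmt"

// FeedbackRecord mirrors a retrieval_feedback row. Only the columns named
// in this issue are real; the Go field types are assumptions.
type FeedbackRecord struct {
	QueryID         string
	Query           string
	Rating          string
	TraversedAssocs []byte // 300-600KB JSON blob
	AccessSnapshot  []byte // up to 800B
}

// pruneAfterFeedback drops the bulky payload once HandleFeedback has applied
// its adjustments, keeping query text and rating for analytics. Against the
// DB this would be:
//
//	UPDATE retrieval_feedback
//	SET traversed_assocs = NULL, access_snapshot = NULL
//	WHERE query_id = ?
func pruneAfterFeedback(r *FeedbackRecord) {
	r.TraversedAssocs = nil
	r.AccessSnapshot = nil
}

func main() {
	r := FeedbackRecord{QueryID: "q1", Query: "example query", Rating: "helpful",
		TraversedAssocs: make([]byte, 500_000), AccessSnapshot: make([]byte, 800)}
	pruneAfterFeedback(&r)
	fmt.Println(len(r.TraversedAssocs), r.Rating) // 0 helpful
}
```

Note this does not help the 128/176 unrated records, which never reach `HandleFeedback`; those would need Option B or a separate age-based sweep.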
Option B: Only store traversals for rated queries
- Don't write the feedback record during retrieval
- Instead, hold traversal data in memory (keyed by `query_id`) with a TTL (e.g., 30 min)
- Only persist to DB when `HandleFeedback` is called and needs to apply adjustments
- Most queries are never rated, so most traversals never hit disk
Option C: Store only edge IDs, not full objects
- `traversed_assocs` currently stores full `TraversedAssoc` structs
- Could store just `(source_id, target_id)` pairs -- that's all `HandleFeedback` needs
- Would reduce each record from 300-600KB to a few KB
Data from pre-nuke DB (2026-03-21)
- Total feedback records: 176
- Rated (helpful/partial/irrelevant): 48
- Unrated (empty): 128
- Min `traversed_assocs` size: unknown (need to re-measure)
- Max `traversed_assocs` size: ~634KB
- Total `retrieval_feedback` size: ~22MB of 116MB DB
- Backup at `~/.mnemonic/memory.db.backup-2026-03-21`