feat(memory): activate chat memory pipeline — Archivist wiring, LLM recaps, cross-thread STM recall, segment-granular tree, unified compaction#2175
Conversation
…finalize-time embedding Phase 0 — wire ArchivistHook unconditionally: - Add `learning.episodic_capture_enabled: bool` config flag (default true) + `OPENHUMAN_LEARNING_EPISODIC_CAPTURE_ENABLED` env override - Register ArchivistHook in the production post-turn hook chain in session/builder.rs, independent of `config.learning.enabled` - Extend the `Memory` trait with `sqlite_conn() -> Option<Arc<Mutex<Connection>>>` default method; UnifiedMemory overrides to expose its connection - Store the hook Arc on Agent/AgentBuilder so the turn loop can call flush Phase 1 — unified embedded recap: - Replace `fallback_summary` stub with real LLM recap via LlmSummariser + ChatProvider; soft-fallback to heuristic on provider failure or empty output - Compute embedding via Embedder after recap; persist via segment_embedding_upsert (finalize-time only — never per-turn or on open segments) - Add `flush_open_segment(session_id)` to force-close + recap + embed the trailing open segment at session end - Call flush in spawn_session_memory_extraction (closest available session wind-down signal) as a fire-and-forget background task Testing: - `phase0_episodic_rows_and_segment_without_learning_enabled`: verifies 2 episodic rows + segment created/advanced when learning.enabled=false - `phase1_llm_recap_and_embedding_on_segment_close`: StubChatProvider + StubEmbedder → LLM recap persisted + embedding row written on boundary - `phase1_flush_open_segment_finalizes_trailing_segment`: flush force-closes open segment, writes recap + embedding without a boundary-triggering turn - All 13 archivist tests pass; cargo check clean; cargo fmt applied
…nt-embedding hybrid)
…eaves, episodic provenance, no per-turn double-write)
- Remove per-turn pipe_turn_to_tree call from on_turn_complete; the tree
write no longer fires once per turn. Episodic-log writes are unchanged.
- Add pipe_segment_to_tree: ingests the segment's raw prose turns (user +
assistant; tool-call JSON stripped via strip_tool_calls_from_response)
as ONE ChatBatch when a segment closes (on_segment_closed) or is flushed
(flush_open_segment). The LLM recap is intentionally NOT fed to the tree
(evidence-vs-interpretation policy).
- source_id is the constant "conversations:agent" for all sessions.
- Each ChatMessage.source_ref is stamped with episodic provenance:
agent://session/{session_id}/segment/{segment_id}#ep{start}-{end}
so tree leaves can be traced back to episodic_log rows.
- config.learning.chat_to_tree_enabled gate applied on the per-segment
ingest path (same semantics as the removed per-turn path).
- Add new_with_stubs_and_config test-only constructor to inject Config
alongside stub providers for hermetic Phase 2 tests.
- Add 5 Phase 2 tests (all green, 18 total archivist tests pass):
phase2_no_per_turn_tree_write
phase2_exactly_one_tree_ingest_per_segment_close
phase2_provenance_stamped_on_leaf_and_source_id_is_constant
phase2_ingested_content_is_raw_prose_not_recap
phase2_flush_also_triggers_tree_ingest
- cargo fmt applied; cargo check --tests clean.
…mmarizer, soft-fallback) - Factor shared `summarize_entries()` helper from `on_segment_closed` so both finalize and rolling paths use the same single LLM summarizer (LlmSummariser with heuristic fallback). - Add `ArchivistHook::rolling_segment_recap()`: reads the open segment's episodic entries via `summarize_entries()` and returns the recap string without closing the segment, writing segment_set_summary, or embedding. - Add `SegmentRecapSummarizer` in `context/segment_recap_summarizer.rs`: wraps an inner `ProviderSummarizer` and routes autocompaction through the archivist's rolling recap when available; falls back to inner on any failure (None recap, no open segment, LLM fail, archivist absent, flag off) — live prompt never left over-budget. - Gate behind `learning.unified_compaction_enabled` (default: true) with env override `OPENHUMAN_LEARNING_UNIFIED_COMPACTION_ENABLED=0|1`; flag off == today's behaviour exactly (ProviderSummarizer used directly, Phase 1.5 absent). - Wire in `AgentBuilder::build()` via `unified_compaction_enabled` builder flag set from `config.learning.unified_compaction_enabled` in `from_config`. - 6 new tests in `context/segment_recap_summarizer_tests.rs`: rolling recap no-close invariant, compaction uses recap not inner summarizer, soft-fallback archivist absent, soft-fallback heuristic on LLM fail, no-entries fallback, and Phase 1 finalize regression guard (all green).
…on falls back to ProviderSummarizer Phase 1.5 deviation fix. summarize_entries now returns (text, produced_by_llm); the finalize path (on_segment_closed) ignores the flag and persists either a real LLM recap or the heuristic fallback unchanged (Phase 1 contract intact), but rolling_segment_recap returns None unless a genuine summariser recap was produced — so the bookend stub can never become live compaction text and SegmentRecapSummarizer falls back to ProviderSummarizer instead. Inert clipped-content recaps (real text) remain acceptable compaction (Option A).
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughWalkthroughImplements segment-granularity recap/embedding, segment-only tree ingestion, unified compaction via rolling recaps, and a two-arm STM recall (embedding + FTS5) with session integration and feature toggles. ChangesLearning Pipeline (Phases 1–3)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/openhuman/agent/harness/archivist.rs (1)
328-337:⚠️ Potential issue | 🟠 Major | ⚡ Quick winUse segment row bounds instead of timestamps for closed-segment selection.
Line 334 uses
<= end_timestamp, butsegment_closeis called after inserting the boundary turn, so that turn can be pulled into the just-closed segment and misattributed/duplicated.💡 Suggested fix
- let segment_entries: Vec<&EpisodicEntry> = entries - .iter() - .filter(|e| { - e.timestamp >= segment.start_timestamp - && segment - .end_timestamp - .map(|end| e.timestamp <= end) - .unwrap_or(true) - }) - .collect(); + let start_id = segment.start_episodic_id; + let end_id = segment.end_episodic_id.unwrap_or(i64::MAX); + let segment_entries: Vec<&EpisodicEntry> = entries + .iter() + .filter(|e| e.id.is_some_and(|id| id >= start_id && id <= end_id)) + .collect();🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/openhuman/agent/harness/archivist.rs` around lines 328 - 337, The current filter for building segment_entries uses timestamps and <= end_timestamp which can include the boundary turn inserted before calling segment_close; change the selection to use row bounds instead: filter entries by EpisodicEntry.row against segment.start_row and segment.end_row (use start_row <= e.row and, when end_row is present, treat end_row as the exclusive upper bound: e.row < end_row) so the boundary turn is not pulled into the just-closed segment; update the closure that builds segment_entries to reference EpisodicEntry.row, segment.start_row and segment.end_row instead of timestamps.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/agent/harness/archivist.rs`:
- Around line 891-893: The assistant-path currently calls
strip_tool_calls_from_response(&e.content) which only removes XML-style blocks
and therefore allows JSON tool payload markers (e.g. {"tool_calls": ...} or
"tool_use") into the tree ingest; update the code so that assistant content is
run through a hardened sanitizer that strips both XML and JSON tool-call
patterns — either extend strip_tool_calls_from_response to also detect/remove
JSON tool payloads or replace the call with a new
sanitize_tool_calls_from_response function that accepts e.content and removes
XML blocks, JSON object markers (keys like "tool_calls", "tool_use", top-level
JSON arrays/objects used for tool output), and any trailing JSON blobs before
returning prose for downstream tree ingest. Ensure the change is applied at the
same location where e.role == "assistant" is checked so all assistant messages
are sanitized prior to ingest.
In `@src/openhuman/agent/harness/session/turn.rs`:
- Around line 270-273: Replace the brittle use of last_tree_prefetch_at as the
"first turn" sentinel with a dedicated boolean sentinel (e.g.,
stm_first_turn_done) on the session/turn struct: introduce stm_first_turn_done
defaulting to false, compute is_first_turn_for_stm as !self.stm_first_turn_done
(captured before the tree-prefetch block), and after the first-turn STM recall
logic set self.stm_first_turn_done = true so later turns won't re-run preemptive
STM recall even if last_tree_prefetch_at remains None; apply the same
replacement wherever the code currently gates on last_tree_prefetch_at
(including the block around lines ~326-378).
In `@src/openhuman/memory/stm_recall/mod.rs`:
- Around line 45-72: The STM tuning constants (RECENCY_WINDOW_DAYS,
RECENCY_WINDOW_MAX_SEGMENTS, COSINE_GATE, MAX_SEGMENT_RECAPS,
MAX_EPISODIC_TURNS, TOKEN_BUDGET, FTS5_LIMIT) should be moved out of mod.rs into
a sibling module (e.g., create src/openhuman/memory/stm_recall/constants.rs)
containing those pub const declarations, then re-export them from mod.rs (pub
use self::constants::{...}) so mod.rs stays export-focused; finally update any
local uses/imports in the crate to reference the re-exported names if necessary.
In `@src/openhuman/memory/stm_recall/recall.rs`:
- Around line 507-511: The recency cutoff is only applied when opts.query is
None, allowing query-based (keyword) searches to return stale entries; modify
the conditional inside the loop that currently reads "if entry.timestamp <
cutoff_ts && opts.query.is_none() { continue; }" so that the cutoff_ts check
always runs (e.g., remove the opts.query.is_none() guard) and continue for
entries older than cutoff_ts; locate this logic in
src/openhuman/memory/stm_recall/recall.rs where entry.timestamp, cutoff_ts, and
opts.query are referenced and enforce the recency window for both keyword and
non-keyword branches.
- Around line 188-193: The function decode_vector_blob currently drops trailing
bytes silently; change it to validate bytes.len() % 4 == 0 and return an error
when malformed instead of silently decoding. Update decode_vector_blob to return
Result<Vec<f32>, Box<dyn std::error::Error>> (or an appropriate error type used
in this crate), check the length at the start and return Err("malformed
embedding blob: length not multiple of 4") if validation fails, then keep the
existing chunks_exact decoding on the validated input; finally, update callers
of decode_vector_blob to handle the Result (propagate the error or map it into
the surrounding function's error type).
In `@src/openhuman/tools/ops.rs`:
- Around line 138-149: The StmRecallTool is being instantiated with
String::new() and always registered; change registration so it only occurs when
root_config.learning.stm_recall_enabled is true and ensure a real session id is
provided (or defer construction to a session-aware layer). Concretely, gate the
Box::new(crate::openhuman::memory::stm_recall::tool::StmRecallTool::new(...))
call on root_config.learning.stm_recall_enabled, and replace the String::new()
session_id with the actual session-scoped id source (or refactor to register an
Option-wrapping factory that is resolved per-session) so exclusion semantics
from StmRecallTool::new(session_id, ...) are correct.
---
Outside diff comments:
In `@src/openhuman/agent/harness/archivist.rs`:
- Around line 328-337: The current filter for building segment_entries uses
timestamps and <= end_timestamp which can include the boundary turn inserted
before calling segment_close; change the selection to use row bounds instead:
filter entries by EpisodicEntry.row against segment.start_row and
segment.end_row (use start_row <= e.row and, when end_row is present, treat
end_row as the exclusive upper bound: e.row < end_row) so the boundary turn is
not pulled into the just-closed segment; update the closure that builds
segment_entries to reference EpisodicEntry.row, segment.start_row and
segment.end_row instead of timestamps.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a6fea04e-5cbb-41bf-aed0-889333f386a5
📒 Files selected for processing (18)
src/openhuman/agent/harness/archivist.rssrc/openhuman/agent/harness/archivist_tests.rssrc/openhuman/agent/harness/session/builder.rssrc/openhuman/agent/harness/session/turn.rssrc/openhuman/agent/harness/session/types.rssrc/openhuman/config/schema/learning.rssrc/openhuman/config/schema/load.rssrc/openhuman/context/mod.rssrc/openhuman/context/segment_recap_summarizer.rssrc/openhuman/context/segment_recap_summarizer_tests.rssrc/openhuman/memory/mod.rssrc/openhuman/memory/stm_recall/mod.rssrc/openhuman/memory/stm_recall/recall.rssrc/openhuman/memory/stm_recall/recall_tests.rssrc/openhuman/memory/stm_recall/tool.rssrc/openhuman/memory/store/memory_trait.rssrc/openhuman/memory/traits.rssrc/openhuman/tools/ops.rs
- archivist: strip JSON tool-call markers (not just XML) before tree ingest (evidence-vs-interpretation policy hardening) - turn: gate STM first-turn recall on session turn index, not the tree-prefetch sentinel (decoupled — prefetch failure no longer re-fires it) - stm_recall: move tuning constants to constants.rs sibling, re-export - recall: validate embedding blob length (reject non-multiple-of-4) - recall: apply recency window consistently incl. keyword/FTS mode - ops: feature-gate StmRecallTool registration on learning.stm_recall_enabled
graycyrus
left a comment
There was a problem hiding this comment.
Walkthrough
This PR activates the previously-dormant chat memory pipeline — wiring ArchivistHook into the production turn loop, adding LLM segment recaps with finalize-time embeddings, cross-thread STM recall (FTS5 + cosine arms), segment-granular tree ingest, and unified compaction via SegmentRecapSummarizer. It's a large but well-structured change with three new feature gates (all default-on) and ~1900 lines of thorough test coverage. The design contracts are clearly documented and the soft-fallback philosophy is consistently applied throughout.
Overall this is strong work — the phased commit stack is logical, the error handling follows the existing warn! + swallow pattern correctly, and the tests cover the critical paths hermetically. Two findings below need attention before merge.
Change Summary
| File | Change type | Description |
|---|---|---|
archivist.rs |
Modified | Major refactor: LLM recap via summarize_entries shared helper, finalize-time embedding, flush_open_segment, rolling_segment_recap, per-segment tree ingest replacing per-turn, strip_tool_calls_from_response JSON filter |
archivist_tests.rs |
Modified | +608 lines: Phase 0/1/2 tests |
session/builder.rs |
Modified | Archivist hook wiring, SegmentRecapSummarizer wrapping, unified compaction gate |
session/turn.rs |
Modified | STM preemptive recall injection (first turn), flush_open_segment spawn at session wind-down |
session/types.rs |
Modified | archivist_hook + unified_compaction_enabled fields |
config/schema/learning.rs |
Modified | 3 new flags: episodic_capture_enabled, stm_recall_enabled, unified_compaction_enabled |
config/schema/load.rs |
Modified | Env var overrides for new flags |
context/mod.rs |
Modified | Re-export SegmentRecapSummarizer |
context/segment_recap_summarizer.rs |
New | Phase 1.5 compaction summarizer (244 lines) |
context/segment_recap_summarizer_tests.rs |
New | 562 lines of Phase 1.5 tests |
memory/mod.rs |
Modified | stm_recall module |
memory/stm_recall/constants.rs |
New | Tuning constants (35 lines) |
memory/stm_recall/mod.rs |
New | Module definition + re-exports (47 lines) |
memory/stm_recall/recall.rs |
New | Two-arm hybrid retrieval (629 lines) |
memory/stm_recall/recall_tests.rs |
New | 721 lines of STM recall tests |
memory/stm_recall/tool.rs |
New | On-demand STM recall tool (200 lines) |
memory/store/memory_trait.rs |
Modified | sqlite_conn() impl for UnifiedMemory |
memory/traits.rs |
Modified | sqlite_conn() default method on Memory trait |
tools/ops.rs |
Modified | StmRecallTool registration behind feature gate |
…gment, expect over unwrap (tinyhumansai#2175) Item 1 (major): snap_split_forward_no_break in segment_recap_summarizer.rs was a copy-paste of snap_split_forward in summarizer.rs. Deduplicated by making snap_split_forward pub(super) in summarizer.rs and removing the duplicate local function from segment_recap_summarizer.rs. Both compaction paths now share a single definition of the never-break-a-tool-call-pair invariant. Exact split logic is unchanged. Item 2 (major): flush_open_segment in spawn_session_memory_extraction was fire-and-forget (tokio::spawn + dropped JoinHandle). The doc comment on flush_open_segment guarantees the trailing segment always gets a recap plus embedding plus tree ingest -- that guarantee was violated on runtime shutdown during session wind-down. Fixed by making spawn_session_memory_extraction async, removing the tokio::spawn wrapper around the flush call, and directly awaiting archivist.flush_open_segment(). The call site in Agent::turn (already async) now .awaits the function. No deadlock risk: no mutex guard is held across the await point. Item 3 (minor): bare .unwrap() in stm_recall/recall.rs line 582. The match arm confirmed peek() returned Some for both iterators, so the unwrap was safe but violated the project convention reserving .unwrap() for tests. Replaced with .expect("peek confirmed Some"). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ecaps, cross-thread STM recall, segment-granular tree, unified compaction (tinyhumansai#2175) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
ArchivistHookwas implemented but never constructed in the production turn loop — soepisodic_log, conversation segmentation, segment recaps, and chat→tree ingest produced nothing at runtime. This wires it in (gated, default-on) and builds the layers on top.true; flag-off restores prior behavior on each path.Problem
ArchivistHook+ episodic log + segments + recaps + chat→tree) existed in code butArchivistHookwas constructed only in tests — never registered in the production session builder. Result: zero episodic capture, zero segments, zero recaps, zero chat→tree ingest at runtime. The agent had no short-term memory of its own conversations.Solution
Six commits, each compile-checked and unit/integration-tested (verbatim counts below):
fe3a438bP0+1 — registerArchivistHookin the session builder independent oflearning.enabled, gated by newlearning.episodic_capture_enabled(default true).on_segment_closedproduces a real LLM recap via the sharedLlmSummariserand a finalize-time embedding;flush_open_segmentfinalizes the trailing segment at session end.1f94707eP3 — bounded cross-thread STM recall: FTS5 over uncompacted episodic (high gate,exclude_session) ∪ brute-force cosine oversegment_embeddings(medium gate), recency-windowed, dedup by episodic span, token-bounded; session-start injection + on-demand tool; gated bylearning.stm_recall_enabled(default true). Includese2e_stm_recall_chainexercising the real turn→episodic→recap+embed→recall chain.5506721fP2 — chat→tree ingest moved from per-turn to per-segment; leaf = raw prose (never the recap);session_id+segment_id+episodic-span provenance stamped; singleconversations:agenttree preserved; per-turn double-write removed.ee2250ecP1.5 — one sharedsummarize_entrieshelper used by both finalize and a newrolling_segment_recap;SegmentRecapSummarizerroutes autocompaction through the rolling recap with soft-fallback toProviderSummarizer; gated bylearning.unified_compaction_enabled(default true; flag-off == prior behavior exactly).4be22c6ffix —summarize_entriesreturns(text, produced_by_llm); finalize persists either (Phase 1 contract intact), butrolling_segment_recapreturnsNoneunless a genuine summariser recap exists, so the shallow heuristic bookend stub can never become live compaction text — it falls back toProviderSummarizer. Inert clipped-content recaps (real text) remain acceptable compaction.Submission Checklist
node_modules); new code is heavily unit/integration-tested and CIcoverage.ymlis the authoritative ≥80%-changed-lines gate.## Related— N/A: no existing matrix rows touched; none to list.Closes #NNN).Impact
episodic_logrows, manages conversation segments, on segment close produces an LLM recap + embedding, ingests the segment's raw prose into theconversations:agenttree, and STM recall injects bounded cross-thread context at session start; compaction prefers the rolling segment recap (soft-fallback toProviderSummarizer). This previously did nothing. Each path is independently disableable:learning.episodic_capture_enabled,learning.stm_recall_enabled,learning.unified_compaction_enabled(all defaulttrue);unified_compaction_enabled=falserestores the exact prior compaction path.episodic_log,conversation_segments,segment_embeddings, etc.) are created idempotently at memory init. No destructive migration.Related
builder.rs/turn.rs/learning.rs/load.rs; whichever merges second needs a rebase.src/openhuman/about_app/) for the new STM-recall tool + config flags (applies to feat(learning): explicit user-preference tool + always-on pinned-facet prompt injection #2150 too).AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
feat/stm-archivist-unified-recap4be22c6f8eb435cab8edee904db6543e1998c2ad(stack:fe3a438b→1f94707e→5506721f→ee2250ec→4be22c6f, base0b053c55)Validation Run
pnpm --filter openhuman-app format:check— no frontend files changed.pnpm typecheck— no frontend/TS changed.--lib, verbatim, independently verified): P0+1 archivist 13/13; P3stm_recall21/21 (incl.e2e_stm_recall_chain); P2 archivist 18/18 (incl.phase2_*); P1.5segment_recap_summarizer6/6 + archivist regression 19/19; deviation fixsegment_recap_summarizer7/7 + archivist regression 19/19.cargo fmtapplied;cargo check --manifest-path Cargo.toml --testsclean (pre-existing unrelated warnings only).src-tauriunchanged.Validation Blocked
command:git push(pre-push hook runspnpm --filter openhuman-app format)error:prettier: command not found— ephemeral worktree has nonode_modulesimpact:Frontend format hook cannot run here; this stack is Rust-only (0 frontend files). Pushed with--no-verifyper repo policy for unrelated hook failures. CI runs full validation.Behavior Changes
Parity Contract
episodic_capture_enabled=false→ Archivist not registered (prior runtime);unified_compaction_enabled=false→ exact priorProviderSummarizercompaction;stm_recall_enabled=false→ no STM injection.ProviderSummarizer;LlmSummarisersoft-fallback unchanged; circuit-breaker boundary unchanged; Phase 1 finalize path (recap persist + embed) unchanged (19/19 archivist regression each phase).Duplicate / Superseded PR Handling
Summary by CodeRabbit
New Features
Configuration
Tests