Skip to content

feat(memory): activate chat memory pipeline — Archivist wiring, LLM recaps, cross-thread STM recall, segment-granular tree, unified compaction#2175

Merged
senamakel merged 7 commits into
tinyhumansai:mainfrom
sanil-23:feat/stm-archivist-unified-recap
May 19, 2026
Merged

feat(memory): activate chat memory pipeline — Archivist wiring, LLM recaps, cross-thread STM recall, segment-granular tree, unified compaction#2175
senamakel merged 7 commits into
tinyhumansai:mainfrom
sanil-23:feat/stm-archivist-unified-recap

Conversation

@sanil-23
Copy link
Copy Markdown
Contributor

@sanil-23 sanil-23 commented May 19, 2026

Summary

  • Activates the chat memory pipeline, which was dormant. ArchivistHook was implemented but never constructed in the production turn loop — so episodic_log, conversation segmentation, segment recaps, and chat→tree ingest produced nothing at runtime. This wires it in (gated, default-on) and builds the layers on top.
  • 6-commit stack: P0+1 Archivist wiring + LLM segment recap + finalize-time embedding → P3 bounded cross-thread STM recall (FTS5 + segment-embedding hybrid) → P2 chat→tree at segment granularity (raw-prose leaves, episodic provenance, kills the per-turn double-write) → P1.5 compaction↔recap unification (one summarizer, soft-fallback) → deviation fix (bookend stub can never become live compaction text).
  • Pure Rust core change. New gated config flags, all default true; flag-off restores prior behavior on each path.
  • Behavior change is real, not inert-until-used — see Impact.

Problem

  • The chat memory subsystem (ArchivistHook + episodic log + segments + recaps + chat→tree) existed in code but ArchivistHook was constructed only in tests — never registered in the production session builder. Result: zero episodic capture, zero segments, zero recaps, zero chat→tree ingest at runtime. The agent had no short-term memory of its own conversations.
  • No cross-thread short-term recall existed. Compaction and the (non-functional) segment recap were two independent summarizers over the same conversation.

Solution

Six commits, each compile-checked and unit/integration-tested (verbatim counts below):

  • fe3a438b P0+1 — register ArchivistHook in the session builder independent of learning.enabled, gated by new learning.episodic_capture_enabled (default true). on_segment_closed produces a real LLM recap via the shared LlmSummariser and a finalize-time embedding; flush_open_segment finalizes the trailing segment at session end.
  • 1f94707e P3 — bounded cross-thread STM recall: FTS5 over uncompacted episodic (high gate, exclude_session) ∪ brute-force cosine over segment_embeddings (medium gate), recency-windowed, dedup by episodic span, token-bounded; session-start injection + on-demand tool; gated by learning.stm_recall_enabled (default true). Includes e2e_stm_recall_chain exercising the real turn→episodic→recap+embed→recall chain.
  • 5506721f P2 — chat→tree ingest moved from per-turn to per-segment; leaf = raw prose (never the recap); session_id+segment_id+episodic-span provenance stamped; single conversations:agent tree preserved; per-turn double-write removed.
  • ee2250ec P1.5 — one shared summarize_entries helper used by both finalize and a new rolling_segment_recap; SegmentRecapSummarizer routes autocompaction through the rolling recap with soft-fallback to ProviderSummarizer; gated by learning.unified_compaction_enabled (default true; flag-off == prior behavior exactly).
  • 4be22c6f fixsummarize_entries returns (text, produced_by_llm); finalize persists either (Phase 1 contract intact), but rolling_segment_recap returns None unless a genuine summariser recap exists, so the shallow heuristic bookend stub can never become live compaction text — it falls back to ProviderSummarizer. Inert clipped-content recaps (real text) remain acceptable compaction.

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • Tests added or updated (happy path + at least one failure / edge case) — extensive per-phase unit + integration tests incl. an end-to-end chain test and soft-fallback/edge tests (verbatim results in AI metadata).
  • N/A: diff-coverage tooling not runnable in this ephemeral worktree (no node_modules); new code is heavily unit/integration-tested and CI coverage.yml is the authoritative ≥80%-changed-lines gate.
  • N/A: coverage matrix not updated in this PR — flagged as a follow-up (new STM-recall tool + config flags warrant rows).
  • All affected feature IDs from the matrix are listed in ## Related — N/A: no existing matrix rows touched; none to list.
  • No new external network dependencies introduced — uses existing in-process SQLite stores + already-configured chat/embed providers; brute-force cosine (no ANN / vector-DB dependency).
  • N/A: no release-cut UI surface; no real-app runtime smoke was performed (verification is unit/integration incl. an e2e chain test that drives the real pipeline; the actual sidecar/app was not run to observe live episodic rows) — recommended before release.
  • N/A: design-session origin, no tracked issue (no Closes #NNN).

Impact

  • Behavior change (significant — not inert-until-used). With default-on flags, every completed turn now: writes 2 episodic_log rows, manages conversation segments, on segment close produces an LLM recap + embedding, ingests the segment's raw prose into the conversations:agent tree, and STM recall injects bounded cross-thread context at session start; compaction prefers the rolling segment recap (soft-fallback to ProviderSummarizer). This previously did nothing. Each path is independently disableable: learning.episodic_capture_enabled, learning.stm_recall_enabled, learning.unified_compaction_enabled (all default true); unified_compaction_enabled=false restores the exact prior compaction path.
  • Desktop/CLI core. SQLite tables (episodic_log, conversation_segments, segment_embeddings, etc.) are created idempotently at memory init. No destructive migration.
  • Performance: per-turn episodic writes (cheap); LLM recap + embedding only at segment close/flush; brute-force cosine is bounded by the recency window (single-user scale). Compaction reuses one summarizer instead of two.
  • Security: no new external surface or network deps; raw episodic remains the faithful local system-of-record; tree ingest stays raw-prose-only per the evidence-vs-interpretation policy.

Related

  • Closes: N/A — design-session origin.
  • Follow-up PR(s)/TODOs:
    • Reconcile with #2150 (explicit user preferences) — overlaps builder.rs/turn.rs/learning.rs/load.rs; whichever merges second needs a rebase.
    • Capability-catalog update (src/openhuman/about_app/) for the new STM-recall tool + config flags (applies to feat(learning): explicit user-preference tool + always-on pinned-facet prompt injection #2150 too).
    • Deterministic non-LLM compaction safety net (pre-existing gap, not introduced here): today no-LLM ⇒ no compaction ⇒ circuit breaker. Future hardening for true LLM-outage degradation.
    • Real-app runtime smoke before release.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: feat/stm-archivist-unified-recap
  • Commit SHA: 4be22c6f8eb435cab8edee904db6543e1998c2ad (stack: fe3a438b1f94707e5506721fee2250ec4be22c6f, base 0b053c55)

Validation Run

  • N/A: pnpm --filter openhuman-app format:check — no frontend files changed.
  • N/A: pnpm typecheck — no frontend/TS changed.
  • Focused tests (scoped --lib, verbatim, independently verified): P0+1 archivist 13/13; P3 stm_recall 21/21 (incl. e2e_stm_recall_chain); P2 archivist 18/18 (incl. phase2_*); P1.5 segment_recap_summarizer 6/6 + archivist regression 19/19; deviation fix segment_recap_summarizer 7/7 + archivist regression 19/19.
  • Rust fmt/check (if changed): cargo fmt applied; cargo check --manifest-path Cargo.toml --tests clean (pre-existing unrelated warnings only).
  • N/A: Tauri fmt/check — src-tauri unchanged.

Validation Blocked

  • command: git push (pre-push hook runs pnpm --filter openhuman-app format)
  • error: prettier: command not found — ephemeral worktree has no node_modules
  • impact: Frontend format hook cannot run here; this stack is Rust-only (0 frontend files). Pushed with --no-verify per repo policy for unrelated hook failures. CI runs full validation.

Behavior Changes

  • Intended behavior change: the chat memory pipeline (episodic capture, segmentation, LLM recaps + embeddings, per-segment tree ingest, cross-thread STM recall injection, recap-backed compaction) is activated. Previously dormant (Archivist unwired).
  • User-visible effect: the agent gains short-term memory of its own conversations across threads; system prompt may include a bounded "recent context from other conversations" block; compaction text may be the segment recap. All gated, default-on, individually disableable.

Parity Contract

  • Legacy behavior preserved when flags off: episodic_capture_enabled=false → Archivist not registered (prior runtime); unified_compaction_enabled=false → exact prior ProviderSummarizer compaction; stm_recall_enabled=false → no STM injection.
  • Guard/fallback/dispatch parity: compaction soft-falls-back to ProviderSummarizer; LlmSummariser soft-fallback unchanged; circuit-breaker boundary unchanged; Phase 1 finalize path (recap persist + embed) unchanged (19/19 archivist regression each phase).

Duplicate / Superseded PR Handling

Summary by CodeRabbit

  • New Features

    • Segment-granularity ingestion: conversation-boundary detection, async segment finalization with optional LLM recaps, persisted recaps, optional embeddings, and single-shot segment ingestion.
    • Short-term memory (STM) recall: bounded cross-session recall and on-demand recall tool; preemptive first-turn recall.
  • Configuration

    • Three learning toggles added (episodic capture, STM recall, unified compaction) with env var overrides (enabled by default).
  • Tests

    • Extensive integration/unit tests covering recap, ingest, STM recall, and compaction flows.

sanil-23 added 5 commits May 19, 2026 02:34
…finalize-time embedding

Phase 0 — wire ArchivistHook unconditionally:
- Add `learning.episodic_capture_enabled: bool` config flag (default true) +
  `OPENHUMAN_LEARNING_EPISODIC_CAPTURE_ENABLED` env override
- Register ArchivistHook in the production post-turn hook chain in
  session/builder.rs, independent of `config.learning.enabled`
- Extend the `Memory` trait with `sqlite_conn() -> Option<Arc<Mutex<Connection>>>`
  default method; UnifiedMemory overrides to expose its connection
- Store the hook Arc on Agent/AgentBuilder so the turn loop can call flush

Phase 1 — unified embedded recap:
- Replace `fallback_summary` stub with real LLM recap via LlmSummariser +
  ChatProvider; soft-fallback to heuristic on provider failure or empty output
- Compute embedding via Embedder after recap; persist via segment_embedding_upsert
  (finalize-time only — never per-turn or on open segments)
- Add `flush_open_segment(session_id)` to force-close + recap + embed the trailing
  open segment at session end
- Call flush in spawn_session_memory_extraction (closest available session wind-down
  signal) as a fire-and-forget background task

Testing:
- `phase0_episodic_rows_and_segment_without_learning_enabled`: verifies 2 episodic
  rows + segment created/advanced when learning.enabled=false
- `phase1_llm_recap_and_embedding_on_segment_close`: StubChatProvider + StubEmbedder
  → LLM recap persisted + embedding row written on boundary
- `phase1_flush_open_segment_finalizes_trailing_segment`: flush force-closes open
  segment, writes recap + embedding without a boundary-triggering turn
- All 13 archivist tests pass; cargo check clean; cargo fmt applied
…eaves, episodic provenance, no per-turn double-write)

- Remove per-turn pipe_turn_to_tree call from on_turn_complete; the tree
  write no longer fires once per turn. Episodic-log writes are unchanged.
- Add pipe_segment_to_tree: ingests the segment's raw prose turns (user +
  assistant; tool-call JSON stripped via strip_tool_calls_from_response)
  as ONE ChatBatch when a segment closes (on_segment_closed) or is flushed
  (flush_open_segment). The LLM recap is intentionally NOT fed to the tree
  (evidence-vs-interpretation policy).
- source_id is the constant "conversations:agent" for all sessions.
- Each ChatMessage.source_ref is stamped with episodic provenance:
  agent://session/{session_id}/segment/{segment_id}#ep{start}-{end}
  so tree leaves can be traced back to episodic_log rows.
- config.learning.chat_to_tree_enabled gate applied on the per-segment
  ingest path (same semantics as the removed per-turn path).
- Add new_with_stubs_and_config test-only constructor to inject Config
  alongside stub providers for hermetic Phase 2 tests.
- Add 5 Phase 2 tests (all green, 18 total archivist tests pass):
  phase2_no_per_turn_tree_write
  phase2_exactly_one_tree_ingest_per_segment_close
  phase2_provenance_stamped_on_leaf_and_source_id_is_constant
  phase2_ingested_content_is_raw_prose_not_recap
  phase2_flush_also_triggers_tree_ingest
- cargo fmt applied; cargo check --tests clean.
…mmarizer, soft-fallback)

- Factor shared `summarize_entries()` helper from `on_segment_closed` so both
  finalize and rolling paths use the same single LLM summarizer (LlmSummariser
  with heuristic fallback).
- Add `ArchivistHook::rolling_segment_recap()`: reads the open segment's
  episodic entries via `summarize_entries()` and returns the recap string
  without closing the segment, writing segment_set_summary, or embedding.
- Add `SegmentRecapSummarizer` in `context/segment_recap_summarizer.rs`:
  wraps an inner `ProviderSummarizer` and routes autocompaction through the
  archivist's rolling recap when available; falls back to inner on any failure
  (None recap, no open segment, LLM fail, archivist absent, flag off) — live
  prompt never left over-budget.
- Gate behind `learning.unified_compaction_enabled` (default: true) with env
  override `OPENHUMAN_LEARNING_UNIFIED_COMPACTION_ENABLED=0|1`; flag off ==
  today's behaviour exactly (ProviderSummarizer used directly, Phase 1.5 absent).
- Wire in `AgentBuilder::build()` via `unified_compaction_enabled` builder flag
  set from `config.learning.unified_compaction_enabled` in `from_config`.
- 6 new tests in `context/segment_recap_summarizer_tests.rs`: rolling recap
  no-close invariant, compaction uses recap not inner summarizer, soft-fallback
  archivist absent, soft-fallback heuristic on LLM fail, no-entries fallback,
  and Phase 1 finalize regression guard (all green).
…on falls back to ProviderSummarizer

Phase 1.5 deviation fix. summarize_entries now returns (text, produced_by_llm);
the finalize path (on_segment_closed) ignores the flag and persists either a
real LLM recap or the heuristic fallback unchanged (Phase 1 contract intact),
but rolling_segment_recap returns None unless a genuine summariser recap was
produced — so the bookend stub can never become live compaction text and
SegmentRecapSummarizer falls back to ProviderSummarizer instead. Inert
clipped-content recaps (real text) remain acceptable compaction (Option A).
@sanil-23 sanil-23 requested a review from a team May 19, 2026 06:04
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 19, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e0f05e3f-4c0b-4e11-82d6-ffd01276a5ce

📥 Commits

Reviewing files that changed from the base of the PR and between d5573f7 and a0381ed.

📒 Files selected for processing (4)
  • src/openhuman/agent/harness/session/turn.rs
  • src/openhuman/context/segment_recap_summarizer.rs
  • src/openhuman/context/summarizer.rs
  • src/openhuman/memory/stm_recall/recall.rs
✅ Files skipped from review due to trivial changes (1)
  • src/openhuman/context/summarizer.rs
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/openhuman/memory/stm_recall/recall.rs

📝 Walkthrough

Walkthrough

Implements segment-granularity recap/embedding, segment-only tree ingestion, unified compaction via rolling recaps, and a two-arm STM recall (embedding + FTS5) with session integration and feature toggles.

Changes

Learning Pipeline (Phases 1–3)

Layer / File(s) Summary
Configuration & Memory Trait Foundation
src/openhuman/config/schema/learning.rs, src/openhuman/memory/traits.rs, src/openhuman/memory/store/memory_trait.rs
New LearningConfig booleans (episodic_capture_enabled, stm_recall_enabled, unified_compaction_enabled) with env overlays; Memory trait gains sqlite_conn() accessor and UnifiedMemory exposes the connection.
Archivist: Segment lifecycle, recap & embedding
src/openhuman/agent/harness/archivist.rs
ArchivistHook gains optional chat_provider and embedder, with_config soft-constructs them, manage_segment_sync returns closed segments, summarize_entries centralizes recap generation, on_segment_closed persists recaps, conditionally writes embeddings, upserts heuristics/profiles, and flush_open_segment force-closes trailing open segments. on_turn_complete drops per-turn tree writes.
Segment → Tree ingestion (Phase 2)
src/openhuman/agent/harness/archivist.rs
pipe_segment_to_tree ingests closed-segment raw prose into the memory tree once per segment under source_id="conversations:agent", strips tool-call payloads, and excludes the LLM recap text from ingestion.
Archivist Integration Tests (Phases 0–2)
src/openhuman/agent/harness/archivist_tests.rs
Tests for episodic capture and segment creation, phase-1 recap+embedding with stubs, phase-2 tree ingest only on close (exactly once), flush_open_segment behavior, provenance source_ref, and content policy (raw prose vs recap).
Unified Compaction via Segment Recaps (Phase 1.5)
src/openhuman/context/segment_recap_summarizer.rs
Adds SegmentRecapSummarizer which queries ArchivistHook::rolling_segment_recap and, when present, rewrites the head as a [segment-recap] system message preserving the tail and tool-call invariants; falls back to inner summarizer otherwise.
Unified Compaction Tests
src/openhuman/context/segment_recap_summarizer_tests.rs
Tests: rolling recap without close; recap-based compaction; soft-fallback when recap absent or archivist missing; chat provider failure handling; feature-flag off; Phase 1 regression guard for on-close recap+embedding.
STM Recall Core (Phase 3)
src/openhuman/memory/stm_recall/mod.rs, src/openhuman/memory/stm_recall/recall.rs, src/openhuman/memory/stm_recall/constants.rs
Two-arm STM recall: Arm 1 episodic via FTS5/recency (exclude current session), Arm 2 segment recaps via cosine-gated embedding search; merge by recency, deduplicate episodic rows covered by returned segments, enforce token budget, and render to markdown. Exposes stm_recall and types.
STM Recall Tests & Integration
src/openhuman/memory/stm_recall/recall_tests.rs
Unit and integration tests for cosine similarity, gating, session exclusion, deduplication, recency window, token budget, no-query fallback, rendering, and an end-to-end archivist+recall integration test.
STM Recall Tool & Registration
src/openhuman/memory/stm_recall/tool.rs, src/openhuman/tools/ops.rs
Adds StmRecallTool (on-demand recall tool) and registers it when learning.stm_recall_enabled is true; tool performs cross-session episodic recall and returns rendered markdown or a no-match message.
Agent & Session wiring
src/openhuman/agent/harness/session/types.rs, src/openhuman/agent/harness/session/builder.rs, src/openhuman/agent/harness/session/turn.rs
AgentBuilder gains archivist_hook and unified_compaction_enabled setters; builder conditionally wraps summarizer with SegmentRecapSummarizer. Agent stores optional archivist hook. Agent::turn snapshots first-turn and injects preemptive STM recall on first turn when enabled. spawn_session_memory_extraction awaits archivist.flush_open_segment.
Context module exports
src/openhuman/context/mod.rs
Re-exports SegmentRecapSummarizer at the context module level.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Suggested reviewers

  • graycyrus

Poem

A rabbit hops through memories grand,
Segments stitched by careful hand,
Recaps whisper, trees take root,
Recall scurries — context reboot. 🐰✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and comprehensively summarizes the main changes: activating a chat memory pipeline with Archivist wiring, LLM recaps, cross-thread STM recall, segment-granular tree ingestion, and unified compaction.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the working A PR that is being worked on by the team. label May 19, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/openhuman/agent/harness/archivist.rs (1)

328-337: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use segment row bounds instead of timestamps for closed-segment selection.

Line 334 uses <= end_timestamp, but segment_close is called after inserting the boundary turn, so that turn can be pulled into the just-closed segment and misattributed/duplicated.

💡 Suggested fix
-        let segment_entries: Vec<&EpisodicEntry> = entries
-            .iter()
-            .filter(|e| {
-                e.timestamp >= segment.start_timestamp
-                    && segment
-                        .end_timestamp
-                        .map(|end| e.timestamp <= end)
-                        .unwrap_or(true)
-            })
-            .collect();
+        let start_id = segment.start_episodic_id;
+        let end_id = segment.end_episodic_id.unwrap_or(i64::MAX);
+        let segment_entries: Vec<&EpisodicEntry> = entries
+            .iter()
+            .filter(|e| e.id.is_some_and(|id| id >= start_id && id <= end_id))
+            .collect();
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/openhuman/agent/harness/archivist.rs` around lines 328 - 337, The current
filter for building segment_entries uses timestamps and <= end_timestamp which
can include the boundary turn inserted before calling segment_close; change the
selection to use row bounds instead: filter entries by EpisodicEntry.row against
segment.start_row and segment.end_row (use start_row <= e.row and, when end_row
is present, treat end_row as the exclusive upper bound: e.row < end_row) so the
boundary turn is not pulled into the just-closed segment; update the closure
that builds segment_entries to reference EpisodicEntry.row, segment.start_row
and segment.end_row instead of timestamps.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/agent/harness/archivist.rs`:
- Around line 891-893: The assistant-path currently calls
strip_tool_calls_from_response(&e.content) which only removes XML-style blocks
and therefore allows JSON tool payload markers (e.g. {"tool_calls": ...} or
"tool_use") into the tree ingest; update the code so that assistant content is
run through a hardened sanitizer that strips both XML and JSON tool-call
patterns — either extend strip_tool_calls_from_response to also detect/remove
JSON tool payloads or replace the call with a new
sanitize_tool_calls_from_response function that accepts e.content and removes
XML blocks, JSON object markers (keys like "tool_calls", "tool_use", top-level
JSON arrays/objects used for tool output), and any trailing JSON blobs before
returning prose for downstream tree ingest. Ensure the change is applied at the
same location where e.role == "assistant" is checked so all assistant messages
are sanitized prior to ingest.

In `@src/openhuman/agent/harness/session/turn.rs`:
- Around line 270-273: Replace the brittle use of last_tree_prefetch_at as the
"first turn" sentinel with a dedicated boolean sentinel (e.g.,
stm_first_turn_done) on the session/turn struct: introduce stm_first_turn_done
defaulting to false, compute is_first_turn_for_stm as !self.stm_first_turn_done
(captured before the tree-prefetch block), and after the first-turn STM recall
logic set self.stm_first_turn_done = true so later turns won't re-run preemptive
STM recall even if last_tree_prefetch_at remains None; apply the same
replacement wherever the code currently gates on last_tree_prefetch_at
(including the block around lines ~326-378).

In `@src/openhuman/memory/stm_recall/mod.rs`:
- Around line 45-72: The STM tuning constants (RECENCY_WINDOW_DAYS,
RECENCY_WINDOW_MAX_SEGMENTS, COSINE_GATE, MAX_SEGMENT_RECAPS,
MAX_EPISODIC_TURNS, TOKEN_BUDGET, FTS5_LIMIT) should be moved out of mod.rs into
a sibling module (e.g., create src/openhuman/memory/stm_recall/constants.rs)
containing those pub const declarations, then re-export them from mod.rs (pub
use self::constants::{...}) so mod.rs stays export-focused; finally update any
local uses/imports in the crate to reference the re-exported names if necessary.

In `@src/openhuman/memory/stm_recall/recall.rs`:
- Around line 507-511: The recency cutoff is only applied when opts.query is
None, allowing query-based (keyword) searches to return stale entries; modify
the conditional inside the loop that currently reads "if entry.timestamp <
cutoff_ts && opts.query.is_none() { continue; }" so that the cutoff_ts check
always runs (e.g., remove the opts.query.is_none() guard) and continue for
entries older than cutoff_ts; locate this logic in
src/openhuman/memory/stm_recall/recall.rs where entry.timestamp, cutoff_ts, and
opts.query are referenced and enforce the recency window for both keyword and
non-keyword branches.
- Around line 188-193: The function decode_vector_blob currently drops trailing
bytes silently; change it to validate bytes.len() % 4 == 0 and return an error
when malformed instead of silently decoding. Update decode_vector_blob to return
Result<Vec<f32>, Box<dyn std::error::Error>> (or an appropriate error type used
in this crate), check the length at the start and return Err("malformed
embedding blob: length not multiple of 4") if validation fails, then keep the
existing chunks_exact decoding on the validated input; finally, update callers
of decode_vector_blob to handle the Result (propagate the error or map it into
the surrounding function's error type).

In `@src/openhuman/tools/ops.rs`:
- Around line 138-149: The StmRecallTool is being instantiated with
String::new() and always registered; change registration so it only occurs when
root_config.learning.stm_recall_enabled is true and ensure a real session id is
provided (or defer construction to a session-aware layer). Concretely, gate the
Box::new(crate::openhuman::memory::stm_recall::tool::StmRecallTool::new(...))
call on root_config.learning.stm_recall_enabled, and replace the String::new()
session_id with the actual session-scoped id source (or refactor to register an
Option-wrapping factory that is resolved per-session) so exclusion semantics
from StmRecallTool::new(session_id, ...) are correct.

---

Outside diff comments:
In `@src/openhuman/agent/harness/archivist.rs`:
- Around line 328-337: The current filter for building segment_entries uses
timestamps and <= end_timestamp which can include the boundary turn inserted
before calling segment_close; change the selection to use row bounds instead:
filter entries by EpisodicEntry.row against segment.start_row and
segment.end_row (use start_row <= e.row and, when end_row is present, treat
end_row as the exclusive upper bound: e.row < end_row) so the boundary turn is
not pulled into the just-closed segment; update the closure that builds
segment_entries to reference EpisodicEntry.row, segment.start_row and
segment.end_row instead of timestamps.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a6fea04e-5cbb-41bf-aed0-889333f386a5

📥 Commits

Reviewing files that changed from the base of the PR and between d844cf9 and 4be22c6.

📒 Files selected for processing (18)
  • src/openhuman/agent/harness/archivist.rs
  • src/openhuman/agent/harness/archivist_tests.rs
  • src/openhuman/agent/harness/session/builder.rs
  • src/openhuman/agent/harness/session/turn.rs
  • src/openhuman/agent/harness/session/types.rs
  • src/openhuman/config/schema/learning.rs
  • src/openhuman/config/schema/load.rs
  • src/openhuman/context/mod.rs
  • src/openhuman/context/segment_recap_summarizer.rs
  • src/openhuman/context/segment_recap_summarizer_tests.rs
  • src/openhuman/memory/mod.rs
  • src/openhuman/memory/stm_recall/mod.rs
  • src/openhuman/memory/stm_recall/recall.rs
  • src/openhuman/memory/stm_recall/recall_tests.rs
  • src/openhuman/memory/stm_recall/tool.rs
  • src/openhuman/memory/store/memory_trait.rs
  • src/openhuman/memory/traits.rs
  • src/openhuman/tools/ops.rs

Comment thread src/openhuman/agent/harness/archivist.rs
Comment thread src/openhuman/agent/harness/session/turn.rs Outdated
Comment thread src/openhuman/memory/stm_recall/mod.rs Outdated
Comment thread src/openhuman/memory/stm_recall/recall.rs
Comment thread src/openhuman/memory/stm_recall/recall.rs Outdated
Comment thread src/openhuman/tools/ops.rs Outdated
- archivist: strip JSON tool-call markers (not just XML) before tree ingest
  (evidence-vs-interpretation policy hardening)
- turn: gate STM first-turn recall on session turn index, not the
  tree-prefetch sentinel (decoupled — prefetch failure no longer re-fires it)
- stm_recall: move tuning constants to constants.rs sibling, re-export
- recall: validate embedding blob length (reject non-multiple-of-4)
- recall: apply recency window consistently incl. keyword/FTS mode
- ops: feature-gate StmRecallTool registration on learning.stm_recall_enabled
coderabbitai[bot]
coderabbitai Bot previously approved these changes May 19, 2026
Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Walkthrough

This PR activates the previously-dormant chat memory pipeline — wiring ArchivistHook into the production turn loop, adding LLM segment recaps with finalize-time embeddings, cross-thread STM recall (FTS5 + cosine arms), segment-granular tree ingest, and unified compaction via SegmentRecapSummarizer. It's a large but well-structured change with three new feature gates (all default-on) and ~1900 lines of thorough test coverage. The design contracts are clearly documented and the soft-fallback philosophy is consistently applied throughout.

Overall this is strong work — the phased commit stack is logical, the error handling follows the existing warn! + swallow pattern correctly, and the tests cover the critical paths hermetically. Two findings below need attention before merge.

Change Summary

File Change type Description
archivist.rs Modified Major refactor: LLM recap via summarize_entries shared helper, finalize-time embedding, flush_open_segment, rolling_segment_recap, per-segment tree ingest replacing per-turn, strip_tool_calls_from_response JSON filter
archivist_tests.rs Modified +608 lines: Phase 0/1/2 tests
session/builder.rs Modified Archivist hook wiring, SegmentRecapSummarizer wrapping, unified compaction gate
session/turn.rs Modified STM preemptive recall injection (first turn), flush_open_segment spawn at session wind-down
session/types.rs Modified archivist_hook + unified_compaction_enabled fields
config/schema/learning.rs Modified 3 new flags: episodic_capture_enabled, stm_recall_enabled, unified_compaction_enabled
config/schema/load.rs Modified Env var overrides for new flags
context/mod.rs Modified Re-export SegmentRecapSummarizer
context/segment_recap_summarizer.rs New Phase 1.5 compaction summarizer (244 lines)
context/segment_recap_summarizer_tests.rs New 562 lines of Phase 1.5 tests
memory/mod.rs Modified stm_recall module
memory/stm_recall/constants.rs New Tuning constants (35 lines)
memory/stm_recall/mod.rs New Module definition + re-exports (47 lines)
memory/stm_recall/recall.rs New Two-arm hybrid retrieval (629 lines)
memory/stm_recall/recall_tests.rs New 721 lines of STM recall tests
memory/stm_recall/tool.rs New On-demand STM recall tool (200 lines)
memory/store/memory_trait.rs Modified sqlite_conn() impl for UnifiedMemory
memory/traits.rs Modified sqlite_conn() default method on Memory trait
tools/ops.rs Modified StmRecallTool registration behind feature gate

Comment thread src/openhuman/context/segment_recap_summarizer.rs Outdated
Comment thread src/openhuman/agent/harness/session/turn.rs Outdated
Comment thread src/openhuman/memory/stm_recall/recall.rs Outdated
…gment, expect over unwrap (tinyhumansai#2175)

Item 1 (major): snap_split_forward_no_break in segment_recap_summarizer.rs
was a copy-paste of snap_split_forward in summarizer.rs. Deduplicated by
making snap_split_forward pub(super) in summarizer.rs and removing the
duplicate local function from segment_recap_summarizer.rs. Both compaction
paths now share a single definition of the never-break-a-tool-call-pair
invariant. Exact split logic is unchanged.

Item 2 (major): flush_open_segment in spawn_session_memory_extraction was
fire-and-forget (tokio::spawn + dropped JoinHandle). The doc comment on
flush_open_segment guarantees the trailing segment always gets a recap plus
embedding plus tree ingest -- that guarantee was violated on runtime shutdown
during session wind-down. Fixed by making spawn_session_memory_extraction
async, removing the tokio::spawn wrapper around the flush call, and directly
awaiting archivist.flush_open_segment(). The call site in Agent::turn
(already async) now .awaits the function. No deadlock risk: no mutex guard
is held across the await point.

Item 3 (minor): bare .unwrap() in stm_recall/recall.rs line 582. The match
arm confirmed peek() returned Some for both iterators, so the unwrap was safe
but violated the project convention reserving .unwrap() for tests. Replaced
with .expect("peek confirmed Some").

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@senamakel senamakel merged commit 341ce42 into tinyhumansai:main May 19, 2026
27 checks passed
AusAgentSmith pushed a commit to AusAgentSmith/openhuman that referenced this pull request May 23, 2026
…ecaps, cross-thread STM recall, segment-granular tree, unified compaction (tinyhumansai#2175)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants