spacedriveapp · vsumner · Feb 28, 2026 · Mar 10, 2026
diff --git a/docs/content/docs/(core)/cortex.mdx b/docs/content/docs/(core)/cortex.mdx
@@ -36,7 +36,7 @@ Each bulletin generation pass does:
 
 This design avoids the problem of an LLM formulating search queries without conversation context. The retrieval phase uses `SearchMode::Typed`, `SearchMode::Recent`, and `SearchMode::Important` — metadata-based modes that query SQLite directly without needing vector embeddings or search terms. The LLM only gets involved for the part it's good at: turning structured data into readable prose.
 
-On startup, Spacebot runs a best-effort warmup pass before adapters accept traffic (bounded wait), so the first bulletin is usually already present when the first user message arrives. If generation fails, the previous bulletin is preserved. If the memory graph is empty, an empty bulletin is stored without invoking the LLM.
+On startup, Spacebot runs a best-effort warmup pass before adapters accept traffic (bounded wait), so the first bulletin is usually already present when the first user message arrives. The same pass also refreshes a warm recall cache of high-importance memories used only as degraded fallback context when hybrid branch recall fails. If generation fails, the previous bulletin is preserved. If the memory graph is empty, an empty bulletin is stored without invoking the LLM.
 
 ### What Channels See
 

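Reviewer sketch for the paragraph changed above: one way the warmup refresh and the degraded fallback could fit together. This is a minimal illustration under stated assumptions, not the code in this PR. `Memory`, `WarmRecallCache`, `Recall`, and `recall_with_fallback` are hypothetical names, and the hybrid search is modeled as a plain closure so the example stays self-contained.

```rust
use std::sync::RwLock;

/// Hypothetical, simplified memory record; the real type is not shown in this diff.
#[derive(Debug, Clone, PartialEq)]
pub struct Memory {
    pub id: u64,
    pub content: String,
    pub importance: f32,
}

/// Hypothetical warm cache of high-importance memories, most important first.
#[derive(Default)]
pub struct WarmRecallCache {
    entries: RwLock<Vec<Memory>>,
}

impl WarmRecallCache {
    /// Called from the warmup pass: keep only the `limit` most important memories.
    pub fn refresh(&self, mut memories: Vec<Memory>, limit: usize) {
        // total_cmp is a total order over f32, so a NaN score cannot panic the sort.
        memories.sort_by(|a, b| b.importance.total_cmp(&a.importance));
        memories.truncate(limit);
        *self.entries.write().unwrap() = memories;
    }

    /// Return up to `n` cached memories in importance order.
    pub fn top(&self, n: usize) -> Vec<Memory> {
        self.entries.read().unwrap().iter().take(n).cloned().collect()
    }
}

/// Distinguishes normal results from degraded ones so callers can tell them apart.
#[derive(Debug, PartialEq)]
pub enum Recall {
    Hybrid(Vec<Memory>),
    Degraded(Vec<Memory>),
}

/// Try the hybrid path first; consult the warm cache only if it returns an error.
pub fn recall_with_fallback<E>(
    hybrid: impl FnOnce() -> Result<Vec<Memory>, E>,
    cache: &WarmRecallCache,
    limit: usize,
) -> Recall {
    match hybrid() {
        Ok(hits) => Recall::Hybrid(hits),
        Err(_) => Recall::Degraded(cache.top(limit)),
    }
}
```

Returning a distinct `Degraded` variant is a design choice of this sketch: it keeps the fallback visible to the caller, in line with the description of the cache as fallback-only context rather than a substitute for hybrid recall.
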
diff --git a/docs/content/docs/(core)/memory.mdx b/docs/content/docs/(core)/memory.mdx
@@ -99,7 +99,7 @@ Memory recall is always delegated to a worker. No LLM process ever queries the d
 
 The `memory_recall` tool supports four search modes, each suited to different retrieval needs:
 
-**Hybrid** (default) -- Full pipeline: vector similarity (LanceDB HNSW) + full-text search (Tantivy) + graph traversal, merged via Reciprocal Rank Fusion (RRF). Requires a query string. Best when you have a specific topic to search for and conversation context to inform the query.
+**Hybrid** (default) -- Full pipeline: vector similarity (LanceDB HNSW) + full-text search (Tantivy) + graph traversal, merged via Reciprocal Rank Fusion (RRF). Requires a query string. Best when you have a specific topic to search for and conversation context to inform the query. If the hybrid search path errors, Spacebot can return a degraded fallback from a warm importance-sorted cache populated during warmup. This is availability hardening, not a replacement for normal hybrid retrieval.
 
 **Recent** -- Returns the most recent memories ordered by `created_at`. No query needed, no vector/FTS overhead. Pure SQLite. Best for temporal awareness -- "what just happened?"
 

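Reviewer sketch of the Reciprocal Rank Fusion step named in the Hybrid description above. It assumes each retrieval source (vector, full-text, graph) yields a best-first list of memory ids. The function name `rrf_fuse` is hypothetical, and the choice of `k` is left to the caller (60 is a commonly used value, not one confirmed by this diff).

```rust
use std::collections::HashMap;

/// Fuse several best-first rankings of memory ids with Reciprocal Rank Fusion.
/// Each id scores the sum of 1 / (k + rank) over the lists it appears in.
pub fn rrf_fuse(rankings: &[Vec<u64>], k: f64) -> Vec<(u64, f64)> {
    let mut scores: HashMap<u64, f64> = HashMap::new();
    for ranking in rankings {
        for (index, id) in ranking.iter().enumerate() {
            // Ranks are 1-based, so the first entry contributes 1 / (k + 1).
            *scores.entry(*id).or_insert(0.0) += 1.0 / (k + (index as f64 + 1.0));
        }
    }
    let mut fused: Vec<(u64, f64)> = scores.into_iter().collect();
    // Highest score first; ties are broken by id so the order is deterministic.
    fused.sort_by(|a, b| b.1.total_cmp(&a.1).then(a.0.cmp(&b.0)));
    fused
}
```

Because RRF uses only ranks, the raw scores from the vector, full-text, and graph sources never need to be normalized against each other. An id that appears near the top of several lists outranks one that tops a single list.
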
diff --git a/docs/content/docs/(deployment)/roadmap.mdx b/docs/content/docs/(deployment)/roadmap.mdx
@@ -21,6 +21,7 @@ The full message-in → LLM → response-out pipeline is wired end-to-end across
 - **LLM** — `SpacebotModel` implements Rig's `CompletionModel`, routes through `LlmManager` via HTTP with retries and fallback chains across 13 providers (Anthropic, OpenAI, OpenRouter, Kilo Gateway, Z.ai, Groq, Together, Fireworks, DeepSeek, xAI, Mistral, OpenCode Zen, OpenCode Go)
 - **Model routing** — `RoutingConfig` with process-type defaults, task overrides, fallback chains
 - **Memory** — full stack: types, SQLite store (CRUD + graph), LanceDB (embeddings + vector + FTS), fastembed, hybrid search (RRF fusion). `memory_type` filter wired end-to-end through SearchConfig. `total_cmp` for safe sorting.
+- **Warm recall degraded fallback** — warmup refreshes an importance-sorted cache that branch recall can use when the hybrid search path fails, with epoch/lock coordination to avoid reintroducing forgotten memories during concurrent cache mutation.
 - **Memory maintenance** — decay + prune implemented
 - **Identity** — `Identity` struct loads SOUL.md/IDENTITY.md/ROLE.md from agent root, `Prompts` with fallback chain
 - **Agent loops** — all three process types run real Rig loops: