feat(agent-runtime): add SurrealDB memory backend and secure wizard flow#46
Conversation
…m-knowledge-graph-memory
Deploying corvus with
|
| Latest commit: |
fb16489
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://7cf1be43.corvus-42x.pages.dev |
| Branch Preview URL: | https://bart-feature-dallay-131-impl.corvus-42x.pages.dev |
️✅ There are no secrets present in this pull request anymore.If these secrets were true positive and are still valid, we highly recommend you to revoke them. 🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request. |
|
No actionable comments were generated in the recent review. 🎉 📝 WalkthroughWalkthroughAdds an optional SurrealDB memory backend to agent-runtime: feature-flagged dependency and Cargo feature, SurrealMemory implementation with WS client and hybrid recall, config schema and env overrides, onboarding prompts and Docker/dev scaffolding, integration into memory selection, tests, and README updates. Changes
Sequence Diagram(s)sequenceDiagram
participant Agent as Agent/Client
participant Surreal as SurrealMemory
participant Embed as Embedding Provider
participant DB as SurrealDB (WebSocket)
Agent->>Surreal: store(key, content, category)
activate Surreal
Surreal->>Embed: embed(content)
activate Embed
Embed-->>Surreal: vector
deactivate Embed
Surreal->>DB: upsert entry (id, content, vector, meta)
activate DB
DB-->>Surreal: success
deactivate DB
Surreal-->>Agent: ok
deactivate Surreal
Agent->>Surreal: recall(query, limit)
activate Surreal
Surreal->>Embed: embed(query)
activate Embed
Embed-->>Surreal: query_vector
deactivate Embed
Surreal->>DB: vector + keyword search
activate DB
DB-->>Surreal: ranked results
deactivate DB
Surreal-->>Agent: Vec<MemoryEntry>
deactivate Surreal
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
✅ Contributor ReportUser: @yacosta738
Contributor Report evaluates based on public GitHub activity. Analysis period: 2025-02-19 to 2026-02-19 |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
clients/agent-runtime/src/config/schema.rs (1)
824-825:⚠️ Potential issue | 🟡 MinorDoc comment for
backendis stale — missing"surreal".The comment lists
"sqlite" | "lucid" | "markdown" | "none"but this PR adds"surreal"as a valid backend. Update the doc to include it.- /// "sqlite" | "lucid" | "markdown" | "none" (`none` = explicit no-op memory) + /// "sqlite" | "lucid" | "markdown" | "surreal" | "none" (`none` = explicit no-op memory)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/config/schema.rs` around lines 824 - 825, Update the doc comment for the struct field `backend` to include the newly supported "surreal" option; the current comment only lists `"sqlite" | "lucid" | "markdown" | "none"`, so edit the `backend: String` doc comment to enumerate `"sqlite" | "lucid" | "markdown" | "surreal" | "none"` (keeping the `none` explanation) so the documentation matches the code.
🧹 Nitpick comments (12)
clients/agent-runtime/README.md (1)
345-356: Environment variable documentation is thorough.Good coverage of all SurrealDB env overrides. Minor nit: consider using placeholder values (e.g.,
your-password-here) instead ofcorvus-passin the example to reduce the chance of users accidentally deploying with default credentials.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/README.md` around lines 345 - 356, Update the example SurrealDB env block to avoid a realistic default password; replace the literal "corvus-pass" value for CORVUS_SURREALDB_PASSWORD with a clear placeholder like "your-password-here" (referencing the CORVUS_SURREALDB_PASSWORD env variable in the README code block) so the docs encourage users to choose their own secret instead of copying a default.clients/agent-runtime/docker-compose.yml (1)
56-62: Consider pinning the SurrealDB image tag for reproducibility.Using
surrealdb/surrealdb:latestcan lead to unexpected breakages when SurrealDB releases new versions with API changes. Since the Cargo dependency targets2.3, consider pinning to a compatible server version (e.g.,surrealdb/surrealdb:v2.3).Proposed fix
- image: surrealdb/surrealdb:latest + image: surrealdb/surrealdb:v2.3🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/docker-compose.yml` around lines 56 - 62, The SurrealDB service in the docker-compose snippet uses the floating image tag `surrealdb/surrealdb:latest` which can introduce breaking changes; update the `surrealdb` service definition to pin the image to a specific compatible version (for example `surrealdb/surrealdb:v2.3`) to match the Cargo dependency target, ensuring the `image` field is changed accordingly and any documentation or comments reflect the pinned version.clients/agent-runtime/src/main.rs (1)
124-126: Doc comment updated correctly, but consider keeping the default note.The
"default: sqlite"hint was removed from the help text. SincememoryisOption<String>with nodefault_value, CLI users runningcorvus onboard --helplose visibility into the default behavior. Consider restoring the default indication:- /// Memory backend (sqlite, lucid, surreal, markdown, none) - used in quick mode + /// Memory backend (sqlite, lucid, surreal, markdown, none; default: sqlite) - used in quick modeBased on learnings: "Preserve CLI contract unless change is intentional and documented; prefer explicit errors over silent fallback for unsupported critical paths."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/main.rs` around lines 124 - 126, The CLI lost the "(default: sqlite)" hint for the memory option; update the #[arg(...)] on the memory field so the help text includes the default hint (e.g., add help = "Memory backend (sqlite, lucid, surreal, markdown, none) - used in quick mode (default: sqlite)") while keeping the field as Option<String> and not changing its type or behavior; target the memory field's attribute in main.rs to restore the default note in the --help output.dev/config.template.toml (1)
9-19: Dev template defaults tosurrealbackend — ensure this is intentional.The code default for
memory.backendis"sqlite"(seedefault_memory_backend_key()), but this template sets it to"surreal". This is fine if the template is specifically for SurrealDB development/testing, but could confuse users who copy it as a general starting point. Consider adding a comment clarifying this template is tailored for SurrealDB workflows.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dev/config.template.toml` around lines 9 - 19, The template's memory.backend is set to "surreal" while the code default_memory_backend_key() returns "sqlite"; either update the [memory] backend value to "sqlite" to match the code default or add a clear comment above the [memory] section stating this template is intentionally tailored for SurrealDB workflows (and not a generic starting point), referencing the memory.backend key and the default_memory_backend_key() function so reviewers/users understand the discrepancy.clients/agent-runtime/src/memory/surreal.rs (5)
102-138:client()clones six strings on every call, even after theOnceCellis initialized.The clones on lines 103–108 are captured by the
get_or_try_initclosure, but they execute on every invocation — including after the cell is populated. Move the clones inside the closure so they only run during the one-time init.Proposed fix
async fn client(&self) -> Result<&Surreal<Client>> { - let ws_endpoint = self.ws_endpoint.clone(); - let namespace = self.namespace.clone(); - let database = self.database.clone(); - let username = self.username.clone(); - let password = self.password.clone(); - let token = self.token.clone(); - self.client - .get_or_try_init(|| async move { + .get_or_try_init(|| { + let ws_endpoint = self.ws_endpoint.clone(); + let namespace = self.namespace.clone(); + let database = self.database.clone(); + let username = self.username.clone(); + let password = self.password.clone(); + let token = self.token.clone(); + async move { let db = Surreal::new::<Ws>(ws_endpoint.as_str()) .await .context("failed to connect to SurrealDB")?; // ... rest unchanged ... Ok(db) + } }) .await }As per coding guidelines,
clients/agent-runtime/src/**/*.rs: "Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/memory/surreal.rs` around lines 102 - 138, The client() method currently clones ws_endpoint, namespace, database, username, password, and token before calling self.client.get_or_try_init(...), causing those clones to happen on every call; move the clones into the async move closure passed to get_or_try_init so the strings are cloned only during initialization (i.e., capture self.ws_endpoint, self.namespace, self.database, self.username, self.password, self.token by cloning them inside the closure body before using them), and remove the pre-closure clones so subsequent client() calls avoid unnecessary allocations; keep using get_or_try_init, Surreal::new::<Ws>, authentication branches, and Self::ensure_schema(&db) as-is.
439-451: Fallback keyword search recomputes lowercase strings already computed earlier.Lines 443–445 rebuild the
format!("{} {}", row.key.to_lowercase(), row.content.to_lowercase())string for each row, which was already computed on line 406 in the main scoring loop. If you switch to server-side filtering (per thefetch_all_entriescomment), this becomes moot; otherwise consider caching the searchable text.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/memory/surreal.rs` around lines 439 - 451, The fallback branch re-creates the lowercase searchable string for each row (format!("{} {}", row.key.to_lowercase(), row.content.to_lowercase())) even though it was already computed in the main scoring loop; update the code to compute and store the lowercase/searchable text once and reuse it in the fallback instead of recomputing: e.g., when building row_by_id or when converting rows earlier, add a cached searchable_lower (or a small HashMap id->searchable_lower) and then change the fallback filter to use that cached value for .contains(&query_lower) and pass the existing score to Self::row_to_entry; alternatively, if you follow the fetch_all_entries change, perform the server-side filtering to avoid this client-side lowercase work entirely.
195-206:fetch_all_entries()loads the entire table into memory —recall,list, andcountall pay O(n) cost.Every call to
recall,list, orcountdeserializes every row from thememory_entriestable. As memory grows, this becomes a performance bottleneck and defeats the purpose of using a database.Consider:
count: useSELECT count() FROM memory_entries GROUP ALL;list: pushcategoryandsession_idfilters into a SurrealQLWHEREclauserecall: for keyword search, a SurrealQLWHERE content CONTAINS ...or full-text index could reduce the working set. The in-memory vector search is reasonable for now but the full-table scan to feed it is not.This can be addressed incrementally, but
count()is the lowest-hanging fruit.As per coding guidelines,
clients/agent-runtime/src/**/*.rs: "Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency."Also applies to: 383-383, 478-478, 514-516
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/memory/surreal.rs` around lines 195 - 206, fetch_all_entries currently selects and deserializes the entire memory_entries table causing O(n) work for recall/list/count; update the code to avoid full-table materialization by implementing DB-backed variants: change count() to run "SELECT count() FROM memory_entries GROUP ALL;" in count_memory_entries (or in the existing count method) instead of calling fetch_all_entries; change list() to push category and session_id into a SurrealQL WHERE clause (e.g., WHERE category = ... AND session_id = ...) and paginate/limit as needed rather than deserializing all rows; change recall() to perform a WHERE content CONTAINS ... or use a full-text index query for keyword search (or accept optional filters on fetch_all_entries so it issues a SELECT ... WHERE ... to SurrealDB) so only matching rows are returned and deserialized; locate and modify functions fetch_all_entries, count, list, and recall (and any call sites that rely on fetch_all_entries) to use these targeted queries and avoid unnecessary allocations/clones.
140-158: SchemaDEFINEstatements silently swallow all errors, not just "already exists".Line 152–154 catches any error from each schema statement and logs it at
debuglevel. Permission errors, syntax errors in future schema changes, or connectivity issues during init would all be silently ignored. Consider checking the error type or at minimum logging atwarnfor unexpected failures.Proposed fix
for statement in statements { - if let Err(error) = db.query(statement).await { - tracing::debug!("SurrealDB schema statement skipped: {error}"); + if let Err(error) = db.query(statement).await { + let msg = error.to_string(); + if msg.contains("already exists") { + tracing::debug!("SurrealDB schema statement skipped (already exists): {statement}"); + } else { + tracing::warn!("SurrealDB schema statement failed: {error} — statement: {statement}"); + } } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/memory/surreal.rs` around lines 140 - 158, The ensure_schema function currently swallows every error from db.query for each statement; change this so only benign "already exists" errors are ignored and everything else is surfaced: in async fn ensure_schema (and the loop over statements/db.query) inspect the returned error (match on the SurrealDB error type if available or check the error string for "already exists"); if it indicates the object already exists, keep the debug log, otherwise log at warn (or error) and return Err(...) to propagate the failure instead of silently continuing. Ensure references: statements array, the db.query(...) call, and the ensure_schema function are updated accordingly.
272-361:store()makes up to 6 DB round-trips per call.The flow:
get(check previous) →upsert→log_event→log_relation(category) →list(session entries) →log_relation(previous entry). For a session-scoped store, that's 6 separate database calls. Consider batching related writes into a single SurrealQL transaction to reduce latency and ensure atomicity.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/memory/surreal.rs` around lines 272 - 361, store() currently performs multiple sequential DB calls (get, upsert, log_event, log_relation, list, log_relation) causing up to 6 round-trips; change it to run as a single SurrealDB transaction that upserts the EntryWrite (use record_id and payload), returns whether a previous record existed, inserts the event (log_event) and the category relation (log_relation "entry_category"), and if session_id is present also insert entry_session and compute the previous entry within the same transaction (SELECT ... ORDER BY timestamp DESC LIMIT 1 for that session) then insert entry_previous if found; keep using the same payload/record_id/EntryWrite and preserve the embedding logic, but replace the independent calls to get(), upsert(), list(), and separate log_relation/log_event calls with a single transactional block (begin/commit) so the writes are batched and atomic and use the transaction response to determine the "action" (store vs update) and any previous entry id.clients/agent-runtime/src/config/schema.rs (3)
774-797:Debugderive will exposepasswordandtokenin plaintext if this struct is ever logged.
SurrealMemoryConfigderivesDebug, which means anytracing::debug!("{:?}", config.memory.surreal)or similar call will emit credentials. Consider a manualDebugimpl that redacts sensitive fields, or wrap secrets in a redacting newtype.♻️ Example: manual Debug impl that redacts secrets
-#[derive(Debug, Clone, Serialize, Deserialize)] +#[derive(Clone, Serialize, Deserialize)] pub struct SurrealMemoryConfig { ... } + +impl std::fmt::Debug for SurrealMemoryConfig { + fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { + f.debug_struct("SurrealMemoryConfig") + .field("url", &self.url) + .field("namespace", &self.namespace) + .field("database", &self.database) + .field("username", &self.username.as_ref().map(|_| "***")) + .field("password", &self.password.as_ref().map(|_| "***")) + .field("token", &self.token.as_ref().map(|_| "***")) + .field("allow_http_loopback", &self.allow_http_loopback) + .finish() + } +}Based on learnings: "Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable." As per coding guidelines:
clients/agent-runtime/src/**/*.rs: Never log secrets, tokens, raw credentials, or sensitive payloads in any logging statements.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/config/schema.rs` around lines 774 - 797, The SurrealMemoryConfig struct currently derives Debug which will expose sensitive fields (password, token) when logged; replace the automatic Debug derive with a manual implementation or a redacting newtype so that password and token are not printed (e.g., implement fmt::Debug for SurrealMemoryConfig and display placeholders like "<redacted>" for password and token while keeping other fields readable), or wrap the sensitive fields in a custom Redacted<T> type that implements Debug redacting the inner value; ensure you remove Debug from the derive list and update any references relying on the derived Debug to use the new manual impl or redacting wrappers for SurrealMemoryConfig, password, and token.
2354-2395: Repetitive env-override blocks could be condensed with a small helper.Six near-identical blocks follow the same trim-then-set-if-non-empty pattern. A helper would reduce surface area for copy-paste errors and make adding future fields easier.
♻️ Optional helper extraction
fn env_override_optional(var: &str, target: &mut Option<String>) { if let Ok(raw) = std::env::var(var) { let value = raw.trim(); if !value.is_empty() { *target = Some(value.to_string()); } } }Then the six blocks become:
env_override_optional("CORVUS_SURREALDB_URL", &mut self.memory.surreal.url); env_override_optional("CORVUS_SURREALDB_NAMESPACE", &mut self.memory.surreal.namespace); env_override_optional("CORVUS_SURREALDB_DATABASE", &mut self.memory.surreal.database); env_override_optional("CORVUS_SURREALDB_USERNAME", &mut self.memory.surreal.username); env_override_optional("CORVUS_SURREALDB_PASSWORD", &mut self.memory.surreal.password); env_override_optional("CORVUS_SURREALDB_TOKEN", &mut self.memory.surreal.token);This helper pattern could also benefit the other existing env-override blocks (e.g., web search provider, brave API key, etc.).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/config/schema.rs` around lines 2354 - 2395, Extract the repeated trim-and-set logic into a small helper (e.g., env_override_optional) and replace the six near-identical blocks that set self.memory.surreal.url / namespace / database / username / password / token with calls to that helper using the corresponding env var names ("CORVUS_SURREALDB_URL", "CORVUS_SURREALDB_NAMESPACE", "CORVUS_SURREALDB_DATABASE", "CORVUS_SURREALDB_USERNAME", "CORVUS_SURREALDB_PASSWORD", "CORVUS_SURREALDB_TOKEN"); the helper should accept the env var name and a &mut Option<String>, read std::env::var, trim, and assign Some(value.to_string()) only if non-empty, and you can reuse the same helper for other env-override blocks (web search provider, brave API key, etc.).
2161-2190:url,namespace, anddatabaseare connection metadata, not secrets — encrypting them reduces debuggability without meaningful security benefit.Only
username,password, andtokenneed encryption. Encrypting the endpoint URL and schema identifiers means operators can't inspectconfig.tomlto verify connection settings without decrypting, and the decrypt-on-load / encrypt-on-save round-trips add unnecessary complexity.Consider limiting encryption to actual credentials:
♻️ Suggested change for load path (similar change needed for save path)
- decrypt_optional_secret( - &store, - &mut config.memory.surreal.url, - "config.memory.surreal.url", - )?; - decrypt_optional_secret( - &store, - &mut config.memory.surreal.namespace, - "config.memory.surreal.namespace", - )?; - decrypt_optional_secret( - &store, - &mut config.memory.surreal.database, - "config.memory.surreal.database", - )?; decrypt_optional_secret( &store, &mut config.memory.surreal.username, "config.memory.surreal.username", )?; decrypt_optional_secret( &store, &mut config.memory.surreal.password, "config.memory.surreal.password", )?; decrypt_optional_secret( &store, &mut config.memory.surreal.token, "config.memory.surreal.token", )?;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/config/schema.rs` around lines 2161 - 2190, The current decryption loop calls decrypt_optional_secret on connection metadata (config.memory.surreal.url, .namespace, .database) which should not be treated as secrets; update the load path to only call decrypt_optional_secret for credentials (config.memory.surreal.username, .password, .token) and remove calls for .url, .namespace, and .database, leaving them as plain values; also apply the corresponding change in the save/encrypt path so only username/password/token are encrypted/decrypted, referencing the decrypt_optional_secret function and the config.memory.surreal.* fields to locate where to modify.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@clients/agent-runtime/src/memory/mod.rs`:
- Around line 167-190: The Surreal backend branch (MemoryBackendKind::Surreal)
performs f64-to-f32 casts for config.vector_weight and config.keyword_weight
when calling SurrealMemory::new and is missing the Clippy suppression used
elsewhere; add #[allow(clippy::cast_possible_truncation)] to the block
containing the casts (the cfg(feature = "memory-surreal") section) so the as f32
conversions of config.vector_weight and config.keyword_weight do not emit
warnings.
In `@clients/agent-runtime/src/memory/surreal.rs`:
- Around line 559-583: validate_endpoint_security currently permits "ws"
unconditionally; treat "ws" like "http" by applying the same allow_http_loopback
and is_loopback_host checks so unencrypted websocket endpoints cannot connect to
non-loopback hosts. Update the match in validate_endpoint_security to handle
"http" and "ws" together (or duplicate the same guard logic for "ws"), using the
existing allow_http_loopback parameter and is_loopback_host(host) call, and
produce the same informative error messages when rejecting insecure non-loopback
endpoints; leave "wss" and "https" allowed as before.
In `@clients/agent-runtime/src/onboard/wizard.rs`:
- Around line 2428-2449: The docker scaffold in scaffold_surreal_docker_files
writes a compose_contents string that starts SurrealDB with the memory backend
(the literal "memory" argument) which is non-persistent; change the
compose_contents to use a persistent backend (e.g., use a rocksdb:// or
surrealkv:// URI instead of "memory"), add a named volume mount in the compose
service so data persists across container restarts, and include a brief inline
warning comment in the compose_contents string indicating the difference between
in-memory and persistent storage; update any related filenames (compose_path,
env_example_path) usage if needed to reflect this persistence change.
---
Outside diff comments:
In `@clients/agent-runtime/src/config/schema.rs`:
- Around line 824-825: Update the doc comment for the struct field `backend` to
include the newly supported "surreal" option; the current comment only lists
`"sqlite" | "lucid" | "markdown" | "none"`, so edit the `backend: String` doc
comment to enumerate `"sqlite" | "lucid" | "markdown" | "surreal" | "none"`
(keeping the `none` explanation) so the documentation matches the code.
---
Nitpick comments:
In `@clients/agent-runtime/docker-compose.yml`:
- Around line 56-62: The SurrealDB service in the docker-compose snippet uses
the floating image tag `surrealdb/surrealdb:latest` which can introduce breaking
changes; update the `surrealdb` service definition to pin the image to a
specific compatible version (for example `surrealdb/surrealdb:v2.3`) to match
the Cargo dependency target, ensuring the `image` field is changed accordingly
and any documentation or comments reflect the pinned version.
In `@clients/agent-runtime/README.md`:
- Around line 345-356: Update the example SurrealDB env block to avoid a
realistic default password; replace the literal "corvus-pass" value for
CORVUS_SURREALDB_PASSWORD with a clear placeholder like "your-password-here"
(referencing the CORVUS_SURREALDB_PASSWORD env variable in the README code
block) so the docs encourage users to choose their own secret instead of copying
a default.
In `@clients/agent-runtime/src/config/schema.rs`:
- Around line 774-797: The SurrealMemoryConfig struct currently derives Debug
which will expose sensitive fields (password, token) when logged; replace the
automatic Debug derive with a manual implementation or a redacting newtype so
that password and token are not printed (e.g., implement fmt::Debug for
SurrealMemoryConfig and display placeholders like "<redacted>" for password and
token while keeping other fields readable), or wrap the sensitive fields in a
custom Redacted<T> type that implements Debug redacting the inner value; ensure
you remove Debug from the derive list and update any references relying on the
derived Debug to use the new manual impl or redacting wrappers for
SurrealMemoryConfig, password, and token.
- Around line 2354-2395: Extract the repeated trim-and-set logic into a small
helper (e.g., env_override_optional) and replace the six near-identical blocks
that set self.memory.surreal.url / namespace / database / username / password /
token with calls to that helper using the corresponding env var names
("CORVUS_SURREALDB_URL", "CORVUS_SURREALDB_NAMESPACE",
"CORVUS_SURREALDB_DATABASE", "CORVUS_SURREALDB_USERNAME",
"CORVUS_SURREALDB_PASSWORD", "CORVUS_SURREALDB_TOKEN"); the helper should accept
the env var name and a &mut Option<String>, read std::env::var, trim, and assign
Some(value.to_string()) only if non-empty, and you can reuse the same helper for
other env-override blocks (web search provider, brave API key, etc.).
- Around line 2161-2190: The current decryption loop calls
decrypt_optional_secret on connection metadata (config.memory.surreal.url,
.namespace, .database) which should not be treated as secrets; update the load
path to only call decrypt_optional_secret for credentials
(config.memory.surreal.username, .password, .token) and remove calls for .url,
.namespace, and .database, leaving them as plain values; also apply the
corresponding change in the save/encrypt path so only username/password/token
are encrypted/decrypted, referencing the decrypt_optional_secret function and
the config.memory.surreal.* fields to locate where to modify.
In `@clients/agent-runtime/src/main.rs`:
- Around line 124-126: The CLI lost the "(default: sqlite)" hint for the memory
option; update the #[arg(...)] on the memory field so the help text includes the
default hint (e.g., add help = "Memory backend (sqlite, lucid, surreal,
markdown, none) - used in quick mode (default: sqlite)") while keeping the field
as Option<String> and not changing its type or behavior; target the memory
field's attribute in main.rs to restore the default note in the --help output.
In `@clients/agent-runtime/src/memory/surreal.rs`:
- Around line 102-138: The client() method currently clones ws_endpoint,
namespace, database, username, password, and token before calling
self.client.get_or_try_init(...), causing those clones to happen on every call;
move the clones into the async move closure passed to get_or_try_init so the
strings are cloned only during initialization (i.e., capture self.ws_endpoint,
self.namespace, self.database, self.username, self.password, self.token by
cloning them inside the closure body before using them), and remove the
pre-closure clones so subsequent client() calls avoid unnecessary allocations;
keep using get_or_try_init, Surreal::new::<Ws>, authentication branches, and
Self::ensure_schema(&db) as-is.
- Around line 439-451: The fallback branch re-creates the lowercase searchable
string for each row (format!("{} {}", row.key.to_lowercase(),
row.content.to_lowercase())) even though it was already computed in the main
scoring loop; update the code to compute and store the lowercase/searchable text
once and reuse it in the fallback instead of recomputing: e.g., when building
row_by_id or when converting rows earlier, add a cached searchable_lower (or a
small HashMap id->searchable_lower) and then change the fallback filter to use
that cached value for .contains(&query_lower) and pass the existing score to
Self::row_to_entry; alternatively, if you follow the fetch_all_entries change,
perform the server-side filtering to avoid this client-side lowercase work
entirely.
- Around line 195-206: fetch_all_entries currently selects and deserializes the
entire memory_entries table causing O(n) work for recall/list/count; update the
code to avoid full-table materialization by implementing DB-backed variants:
change count() to run "SELECT count() FROM memory_entries GROUP ALL;" in
count_memory_entries (or in the existing count method) instead of calling
fetch_all_entries; change list() to push category and session_id into a
SurrealQL WHERE clause (e.g., WHERE category = ... AND session_id = ...) and
paginate/limit as needed rather than deserializing all rows; change recall() to
perform a WHERE content CONTAINS ... or use a full-text index query for keyword
search (or accept optional filters on fetch_all_entries so it issues a SELECT
... WHERE ... to SurrealDB) so only matching rows are returned and deserialized;
locate and modify functions fetch_all_entries, count, list, and recall (and any
call sites that rely on fetch_all_entries) to use these targeted queries and
avoid unnecessary allocations/clones.
- Around line 140-158: The ensure_schema function currently swallows every error
from db.query for each statement; change this so only benign "already exists"
errors are ignored and everything else is surfaced: in async fn ensure_schema
(and the loop over statements/db.query) inspect the returned error (match on the
SurrealDB error type if available or check the error string for "already
exists"); if it indicates the object already exists, keep the debug log,
otherwise log at warn (or error) and return Err(...) to propagate the failure
instead of silently continuing. Ensure references: statements array, the
db.query(...) call, and the ensure_schema function are updated accordingly.
- Around line 272-361: store() currently performs multiple sequential DB calls
(get, upsert, log_event, log_relation, list, log_relation) causing up to 6
round-trips; change it to run as a single SurrealDB transaction that upserts the
EntryWrite (use record_id and payload), returns whether a previous record
existed, inserts the event (log_event) and the category relation (log_relation
"entry_category"), and if session_id is present also insert entry_session and
compute the previous entry within the same transaction (SELECT ... ORDER BY
timestamp DESC LIMIT 1 for that session) then insert entry_previous if found;
keep using the same payload/record_id/EntryWrite and preserve the embedding
logic, but replace the independent calls to get(), upsert(), list(), and
separate log_relation/log_event calls with a single transactional block
(begin/commit) so the writes are batched and atomic and use the transaction
response to determine the "action" (store vs update) and any previous entry id.
In `@dev/config.template.toml`:
- Around line 9-19: The template's memory.backend is set to "surreal" while the
code default_memory_backend_key() returns "sqlite"; either update the [memory]
backend value to "sqlite" to match the code default or add a clear comment above
the [memory] section stating this template is intentionally tailored for
SurrealDB workflows (and not a generic starting point), referencing the
memory.backend key and the default_memory_backend_key() function so
reviewers/users understand the discrepancy.
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
clients/agent-runtime/src/config/schema.rs (1)
2266-2274: No input validation onCORVUS_MEMORY_BACKEND/MEMORY_BACKENDvaluesAny arbitrary string (e.g.
CORVUS_MEMORY_BACKEND=unknown_typo) is accepted silently without validating against the known set ("sqlite","lucid","markdown","surreal","none"). This can cause a confusing runtime failure deep in the memory backend initialisation rather than a clear config error.This is a pre-existing pattern for other string-typed env overrides (provider, model), but introducing a new
"surreal"backend is a good opportunity to tighten this. Consider logging atracing::warn!when the backend value is unrecognised.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/config/schema.rs` around lines 2266 - 2274, Validate the incoming CORVUS_MEMORY_BACKEND / MEMORY_BACKEND value against the allowed set before assigning to self.memory.backend: after trimming and normalizing (e.g., to_ascii_lowercase()) check it is one of "sqlite", "lucid", "markdown", "surreal", or "none"; if it is, assign it to self.memory.backend as currently done, otherwise do not override and emit a tracing::warn! that includes the unrecognized value and the allowed list so the user gets a clear config warning (use the existing env-var lookup and the self.memory.backend symbol to locate where to change behavior).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@clients/agent-runtime/src/config/schema.rs`:
- Around line 3764-3775: Add a new unit test that verifies the fallback to the
non-prefixed environment variable for memory backend: create a test (e.g.,
env_override_memory_backend_fallback) that uses env_override_test_guard(),
constructs Config::default(), ensures CORVUS_MEMORY_BACKEND is removed, sets
MEMORY_BACKEND to a value like "markdown", calls config.apply_env_overrides(),
and asserts config.memory.backend equals the fallback value, then cleans up by
removing MEMORY_BACKEND; this exercises the apply_env_overrides fallback path
alongside the existing env_override_memory_backend test.
- Around line 822-834: The Debug impl for SurrealMemoryConfig currently exposes
username; change the ".field(\"username\", &self.username)" to redact the value
the same way password and token are handled (e.g., map to "<redacted>" via
self.username.as_ref().map(|_| "<redacted>")) inside the impl fmt::Debug for
SurrealMemoryConfig so Debug never prints raw credentials, and update the test
surreal_memory_config_debug_redacts_sensitive_fields to assert that the username
string does not appear in the debug output (i.e., ensure debug output contains
"<redacted>" for username or simply does not contain the real username).
In `@clients/agent-runtime/src/memory/surreal.rs`:
- Around line 421-509: The recall method currently calls recall_rows which
prefilters by keyword, causing purely semantic queries to miss results; change
recall to also fetch a vector-only candidate set (e.g., call a new
recall_vector_candidates or call recall_rows with an option to skip keyword
filtering, or fetch the most recent N rows) and merge those rows into row_by_id
and searchable_by_id before computing vector similarities and keyword scores;
ensure you compute embeddings for those additional rows if missing, deduplicate
by id, then run the existing vector scoring/keyword scoring and hybrid_merge on
the combined candidate set so semantic-only queries can return vector matches.
---
Nitpick comments:
In `@clients/agent-runtime/src/config/schema.rs`:
- Around line 2266-2274: Validate the incoming CORVUS_MEMORY_BACKEND /
MEMORY_BACKEND value against the allowed set before assigning to
self.memory.backend: after trimming and normalizing (e.g., to_ascii_lowercase())
check it is one of "sqlite", "lucid", "markdown", "surreal", or "none"; if it
is, assign it to self.memory.backend as currently done, otherwise do not
override and emit a tracing::warn! that includes the unrecognized value and the
allowed list so the user gets a clear config warning (use the existing env-var
lookup and the self.memory.backend symbol to locate where to change behavior).
…m-knowledge-graph-memory
This pull request adds support for using SurrealDB as a memory backend in the agent runtime, alongside existing options like SQLite and Markdown. It introduces a new configuration structure for SurrealDB, updates documentation and configuration files to reflect the new backend, and ensures environment variable and secret management support for SurrealDB connection settings.
SurrealDB Memory Backend Integration
surrealdbas an optional dependency inCargo.tomland introduced thememory-surrealfeature flag for enabling SurrealDB-backed memory. [1] [2]SurrealMemoryConfigstruct inconfig/schema.rsto encapsulate SurrealDB connection settings, with sensible defaults and support for deserialization/serialization.MemoryConfigto include asurrealfield for SurrealDB configuration, with default initialization and test coverage. [1] [2] [3] [4]Configuration and Environment Variable Support
CORVUS_SURREALDB_URL,CORVUS_SURREALDB_NAMESPACE, etc.). [1] [2] [3] [4]Documentation and Example Updates
README.mdto document SurrealDB support, configuration options, environment variables, and onboarding wizard changes. [1] [2] [3] [4] [5] [6] [7]docker-compose.ymlto provide example configuration for running SurrealDB locally and setting the relevant environment variables for the agent. [1] [2]Internal Codebase Adjustments
SurrealMemoryConfiginconfig/mod.rs.Summary by CodeRabbit
New Features
Documentation
Chores