feat(agent-runtime): add SurrealDB memory backend and secure wizard flow #46

Merged
yacosta738 merged 8 commits into main from bart/feature/dallay-131-implementing-a-custom-knowledge-graph-memory on Feb 19, 2026

Conversation

Contributor

@yacosta738 yacosta738 commented Feb 19, 2026

This pull request adds support for using SurrealDB as a memory backend in the agent runtime, alongside existing options like SQLite and Markdown. It introduces a new configuration structure for SurrealDB, updates documentation and configuration files to reflect the new backend, and ensures environment variable and secret management support for SurrealDB connection settings.

SurrealDB Memory Backend Integration

  • Added surrealdb as an optional dependency in Cargo.toml and introduced the memory-surreal feature flag for enabling SurrealDB-backed memory.
  • Introduced SurrealMemoryConfig struct in config/schema.rs to encapsulate SurrealDB connection settings, with sensible defaults and support for deserialization/serialization.
  • Extended MemoryConfig to include a surreal field for SurrealDB configuration, with default initialization and test coverage.

Configuration and Environment Variable Support

  • Updated config loading and saving logic to support encrypting/decrypting SurrealDB secrets, and to read SurrealDB settings from environment variables (e.g., CORVUS_SURREALDB_URL, CORVUS_SURREALDB_NAMESPACE, etc.).

Documentation and Example Updates

  • Updated README.md to document SurrealDB support, configuration options, environment variables, and onboarding wizard changes.
  • Updated docker-compose.yml to provide example configuration for running SurrealDB locally and setting the relevant environment variables for the agent.

Internal Codebase Adjustments

  • Updated module exports to include SurrealMemoryConfig in config/mod.rs.
  • Minor formatting and refactoring for config directory resolution and environment variable handling.

Summary by CodeRabbit

  • New Features

    • Optional SurrealDB memory backend (feature-gated) with store/recall, search and health checks.
    • Interactive onboarding can configure SurrealDB and scaffold local Docker helpers.
  • Documentation

    • Configuration docs, examples and help text updated with SurrealDB options and env var examples.
  • Chores

    • Local docker-compose includes an optional SurrealDB service; dev config template updated with Surreal settings.

linear Bot commented Feb 19, 2026

cloudflare-workers-and-pages Bot commented Feb 19, 2026

Deploying corvus with Cloudflare Pages

Latest commit: fb16489
Status: ✅  Deploy successful!
Preview URL: https://7cf1be43.corvus-42x.pages.dev
Branch Preview URL: https://bart-feature-dallay-131-impl.corvus-42x.pages.dev

View logs

gitguardian Bot commented Feb 19, 2026

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
While these secrets were previously flagged, we no longer have a reference to the
specific commits where they were detected. Once a secret has been leaked into a git
repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

Contributor

coderabbitai Bot commented Feb 19, 2026

No actionable comments were generated in the recent review. 🎉


Walkthrough

Adds an optional SurrealDB memory backend to agent-runtime: feature-flagged dependency and Cargo feature, SurrealMemory implementation with WS client and hybrid recall, config schema and env overrides, onboarding prompts and Docker/dev scaffolding, integration into memory selection, tests, and README updates.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Dependency & build: clients/agent-runtime/Cargo.toml | Add optional surrealdb dependency and memory-surreal feature alias. |
| Config schema & exports: clients/agent-runtime/src/config/schema.rs, clients/agent-runtime/src/config/mod.rs, dev/config.template.toml | Introduce SurrealMemoryConfig; add surreal field to MemoryConfig; Default/Debug, encrypt/decrypt on save/load, env overrides (CORVUS_SURREALDB_*, CORVUS_MEMORY_BACKEND); re-export added. |
| Memory backend selection & profiles: clients/agent-runtime/src/memory/backend.rs | Add MemoryBackendKind::Surreal, SURREAL_PROFILE, selectable backends conditional on feature, classifier/profile mapping, and tests updated. |
| Memory module & wiring: clients/agent-runtime/src/memory/mod.rs | Feature-gated surreal module and SurrealMemory export; refactor sqlite builder (build_sqlite_memory); create_memory and migration logic dispatch to Surreal (when enabled) with fallbacks. |
| Surreal backend implementation: clients/agent-runtime/src/memory/surreal.rs | New SurrealMemory module: OnceCell WS client, endpoint normalization, auth (token or user/pass), schema setup, transactional store/forget, hybrid vector+keyword recall, list/get/count/health_check, helpers, and tests (including smoke). |
| Onboarding & CLI: clients/agent-runtime/src/onboard/wizard.rs, clients/agent-runtime/src/main.rs | Onboarding prompts and wizard updated to include surreal; new setup_surreal_memory_options and scaffold_surreal_docker_files; CLI help text adjusted. |
| Docs & examples: clients/agent-runtime/README.md | README additions: build flag, config snippets, env overrides, onboarding notes and examples for SurrealDB. |
| Dev infra & tests: clients/agent-runtime/docker-compose.yml, clients/agent-runtime/tests/memory_comparison.rs | Add surrealdb service (profile surreal) to docker-compose; add feature-gated test helper maybe_surreal_backend and a Surreal store/recall test. |
| Misc: clients/agent-runtime/src/config/mod.rs, clients/agent-runtime/src/main.rs, clients/agent-runtime/README.md | Public re-export of SurrealMemoryConfig; small help/documentation string updates and README tweaks. |

Sequence Diagram(s)

sequenceDiagram
    participant Agent as Agent/Client
    participant Surreal as SurrealMemory
    participant Embed as Embedding Provider
    participant DB as SurrealDB (WebSocket)

    Agent->>Surreal: store(key, content, category)
    activate Surreal
    Surreal->>Embed: embed(content)
    activate Embed
    Embed-->>Surreal: vector
    deactivate Embed
    Surreal->>DB: upsert entry (id, content, vector, meta)
    activate DB
    DB-->>Surreal: success
    deactivate DB
    Surreal-->>Agent: ok
    deactivate Surreal

    Agent->>Surreal: recall(query, limit)
    activate Surreal
    Surreal->>Embed: embed(query)
    activate Embed
    Embed-->>Surreal: query_vector
    deactivate Embed
    Surreal->>DB: vector + keyword search
    activate DB
    DB-->>Surreal: ranked results
    deactivate DB
    Surreal-->>Agent: Vec<MemoryEntry>
    deactivate Surreal
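
The recall path in the diagram combines a vector search with a keyword search. A minimal sketch of how such a hybrid score could be computed follows, assuming cosine similarity for the vector part and a term-overlap ratio for the keyword part. The function names and the exact scoring formula are illustrative assumptions and are not taken from surreal.rs; only the idea of separate vector and keyword weights comes from the PR.

```rust
/// Cosine similarity between two embedding vectors.
/// Returns 0.0 for mismatched lengths, empty input, or zero-norm vectors.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() || a.is_empty() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

/// Fraction of whitespace-separated query terms found in `text`
/// (case-insensitive substring match). Hypothetical keyword scorer.
fn keyword_score(query: &str, text: &str) -> f32 {
    let text = text.to_lowercase();
    let terms: Vec<String> = query.split_whitespace().map(str::to_lowercase).collect();
    if terms.is_empty() {
        return 0.0;
    }
    let hits = terms.iter().filter(|t| text.contains(t.as_str())).count();
    hits as f32 / terms.len() as f32
}

/// Weighted sum of the two partial scores (assumed combination rule).
fn hybrid_score(vector_sim: f32, keyword: f32, vector_weight: f32, keyword_weight: f32) -> f32 {
    vector_weight * vector_sim + keyword_weight * keyword
}
```

With weights of 0.7 and 0.3, an entry with a perfect vector match and half of the query terms present would score 0.85 under this scheme.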

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

  • refactor: corvus env uppercase #18 — Overlaps with onboarding/config env-variable handling; likely related to the new CORVUS_SURREALDB_* and CORVUS_MEMORY_BACKEND overrides.
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 54.12% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | Title clearly describes the main feature: adding SurrealDB as a memory backend with secure wizard flow. |
| Description check | ✅ Passed | Description includes all required sections: related issues, summary of changes, tested information, breaking changes acknowledgment, and completed checklist. |
| Linked Issues check | ✅ Passed | PR fulfills DALLAY-131 objectives: implements SurrealDB-backed memory layer with configuration support, schema design, scoring/decay strategies integration. |
| Out of Scope Changes check | ✅ Passed | All changes directly support SurrealDB memory backend implementation: configuration, features, documentation, and integration with existing memory infrastructure. |


Contributor

github-actions Bot commented Feb 19, 2026

✅ Contributor Report

User: @yacosta738
Status: Passed (12/13 metrics passed)

| Metric | Description | Value | Threshold | Status |
|---|---|---|---|---|
| PR Merge Rate | PRs merged vs closed | 89% | >= 30% | |
| Repo Quality | Repos with ≥100 stars | 0 | >= 0 | |
| Positive Reactions | Positive reactions received | 9 | >= 1 | |
| Negative Reactions | Negative reactions received | 0 | <= 5 | |
| Account Age | GitHub account age | 3036 days | >= 30 days | |
| Activity Consistency | Regular activity over time | 108% | >= 0% | |
| Issue Engagement | Issues with community engagement | 0 | >= 0 | |
| Code Reviews | Code reviews given to others | 363 | >= 0 | |
| Merger Diversity | Unique maintainers who merged PRs | 3 | >= 0 | |
| Repo History Merge Rate | Merge rate in this repo | 87% | >= 0% | |
| Repo History Min PRs | Previous PRs in this repo | 31 | >= 0 | |
| Profile Completeness | Profile richness (bio, followers) | 90 | >= 0 | |
| Suspicious Patterns | Spam-like activity detection | 1 | N/A | |

Contributor Report evaluates based on public GitHub activity. Analysis period: 2025-02-19 to 2026-02-19

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
clients/agent-runtime/src/config/schema.rs (1)

824-825: ⚠️ Potential issue | 🟡 Minor

Doc comment for backend is stale — missing "surreal".

The comment lists "sqlite" | "lucid" | "markdown" | "none" but this PR adds "surreal" as a valid backend. Update the doc to include it.

-    /// "sqlite" | "lucid" | "markdown" | "none" (`none` = explicit no-op memory)
+    /// "sqlite" | "lucid" | "markdown" | "surreal" | "none" (`none` = explicit no-op memory)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/config/schema.rs` around lines 824 - 825, Update
the doc comment for the struct field `backend` to include the newly supported
"surreal" option; the current comment only lists `"sqlite" | "lucid" |
"markdown" | "none"`, so edit the `backend: String` doc comment to enumerate
`"sqlite" | "lucid" | "markdown" | "surreal" | "none"` (keeping the `none`
explanation) so the documentation matches the code.
🧹 Nitpick comments (12)
clients/agent-runtime/README.md (1)

345-356: Environment variable documentation is thorough.

Good coverage of all SurrealDB env overrides. Minor nit: consider using placeholder values (e.g., your-password-here) instead of corvus-pass in the example to reduce the chance of users accidentally deploying with default credentials.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/README.md` around lines 345 - 356, Update the example
SurrealDB env block to avoid a realistic default password; replace the literal
"corvus-pass" value for CORVUS_SURREALDB_PASSWORD with a clear placeholder like
"your-password-here" (referencing the CORVUS_SURREALDB_PASSWORD env variable in
the README code block) so the docs encourage users to choose their own secret
instead of copying a default.
clients/agent-runtime/docker-compose.yml (1)

56-62: Consider pinning the SurrealDB image tag for reproducibility.

Using surrealdb/surrealdb:latest can lead to unexpected breakages when SurrealDB releases new versions with API changes. Since the Cargo dependency targets 2.3, consider pinning to a compatible server version (e.g., surrealdb/surrealdb:v2.3).

Proposed fix
-    image: surrealdb/surrealdb:latest
+    image: surrealdb/surrealdb:v2.3
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/docker-compose.yml` around lines 56 - 62, The SurrealDB
service in the docker-compose snippet uses the floating image tag
`surrealdb/surrealdb:latest` which can introduce breaking changes; update the
`surrealdb` service definition to pin the image to a specific compatible version
(for example `surrealdb/surrealdb:v2.3`) to match the Cargo dependency target,
ensuring the `image` field is changed accordingly and any documentation or
comments reflect the pinned version.
clients/agent-runtime/src/main.rs (1)

124-126: Doc comment updated correctly, but consider keeping the default note.

The "default: sqlite" hint was removed from the help text. Since memory is Option<String> with no default_value, CLI users running corvus onboard --help lose visibility into the default behavior. Consider restoring the default indication:

-        /// Memory backend (sqlite, lucid, surreal, markdown, none) - used in quick mode
+        /// Memory backend (sqlite, lucid, surreal, markdown, none; default: sqlite) - used in quick mode

Based on learnings: "Preserve CLI contract unless change is intentional and documented; prefer explicit errors over silent fallback for unsupported critical paths."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/main.rs` around lines 124 - 126, The CLI lost the
"(default: sqlite)" hint for the memory option; update the #[arg(...)] on the
memory field so the help text includes the default hint (e.g., add help =
"Memory backend (sqlite, lucid, surreal, markdown, none) - used in quick mode
(default: sqlite)") while keeping the field as Option<String> and not changing
its type or behavior; target the memory field's attribute in main.rs to restore
the default note in the --help output.
dev/config.template.toml (1)

9-19: Dev template defaults to surreal backend — ensure this is intentional.

The code default for memory.backend is "sqlite" (see default_memory_backend_key()), but this template sets it to "surreal". This is fine if the template is specifically for SurrealDB development/testing, but could confuse users who copy it as a general starting point. Consider adding a comment clarifying this template is tailored for SurrealDB workflows.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dev/config.template.toml` around lines 9 - 19, The template's memory.backend
is set to "surreal" while the code default_memory_backend_key() returns
"sqlite"; either update the [memory] backend value to "sqlite" to match the code
default or add a clear comment above the [memory] section stating this template
is intentionally tailored for SurrealDB workflows (and not a generic starting
point), referencing the memory.backend key and the default_memory_backend_key()
function so reviewers/users understand the discrepancy.
clients/agent-runtime/src/memory/surreal.rs (5)

102-138: client() clones six strings on every call, even after the OnceCell is initialized.

The clones on lines 103–108 are captured by the get_or_try_init closure, but they execute on every invocation — including after the cell is populated. Move the clones inside the closure so they only run during the one-time init.

Proposed fix
     async fn client(&self) -> Result<&Surreal<Client>> {
-        let ws_endpoint = self.ws_endpoint.clone();
-        let namespace = self.namespace.clone();
-        let database = self.database.clone();
-        let username = self.username.clone();
-        let password = self.password.clone();
-        let token = self.token.clone();
-
         self.client
-            .get_or_try_init(|| async move {
+            .get_or_try_init(|| {
+                let ws_endpoint = self.ws_endpoint.clone();
+                let namespace = self.namespace.clone();
+                let database = self.database.clone();
+                let username = self.username.clone();
+                let password = self.password.clone();
+                let token = self.token.clone();
+                async move {
                 let db = Surreal::new::<Ws>(ws_endpoint.as_str())
                     .await
                     .context("failed to connect to SurrealDB")?;
                 // ... rest unchanged ...
                 Ok(db)
+                }
             })
             .await
     }

As per coding guidelines, clients/agent-runtime/src/**/*.rs: "Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/memory/surreal.rs` around lines 102 - 138, The
client() method currently clones ws_endpoint, namespace, database, username,
password, and token before calling self.client.get_or_try_init(...), causing
those clones to happen on every call; move the clones into the async move
closure passed to get_or_try_init so the strings are cloned only during
initialization (i.e., capture self.ws_endpoint, self.namespace, self.database,
self.username, self.password, self.token by cloning them inside the closure body
before using them), and remove the pre-closure clones so subsequent client()
calls avoid unnecessary allocations; keep using get_or_try_init,
Surreal::new::<Ws>, authentication branches, and Self::ensure_schema(&db) as-is.

439-451: Fallback keyword search recomputes lowercase strings already computed earlier.

Lines 443–445 rebuild the format!("{} {}", row.key.to_lowercase(), row.content.to_lowercase()) string for each row, which was already computed on line 406 in the main scoring loop. If you switch to server-side filtering (per the fetch_all_entries comment), this becomes moot; otherwise consider caching the searchable text.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/memory/surreal.rs` around lines 439 - 451, The
fallback branch re-creates the lowercase searchable string for each row
(format!("{} {}", row.key.to_lowercase(), row.content.to_lowercase())) even
though it was already computed in the main scoring loop; update the code to
compute and store the lowercase/searchable text once and reuse it in the
fallback instead of recomputing: e.g., when building row_by_id or when
converting rows earlier, add a cached searchable_lower (or a small HashMap
id->searchable_lower) and then change the fallback filter to use that cached
value for .contains(&query_lower) and pass the existing score to
Self::row_to_entry; alternatively, if you follow the fetch_all_entries change,
perform the server-side filtering to avoid this client-side lowercase work
entirely.

195-206: fetch_all_entries() loads the entire table into memory — recall, list, and count all pay O(n) cost.

Every call to recall, list, or count deserializes every row from the memory_entries table. As memory grows, this becomes a performance bottleneck and defeats the purpose of using a database.

Consider:

  • count: use SELECT count() FROM memory_entries GROUP ALL;
  • list: push category and session_id filters into a SurrealQL WHERE clause
  • recall: for keyword search, a SurrealQL WHERE content CONTAINS ... or full-text index could reduce the working set. The in-memory vector search is reasonable for now but the full-table scan to feed it is not.

This can be addressed incrementally, but count() is the lowest-hanging fruit.

As per coding guidelines, clients/agent-runtime/src/**/*.rs: "Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency."

Also applies to: 383-383, 478-478, 514-516
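
To make the suggestion above concrete, here is a minimal sketch of pushing the list filters into a SurrealQL WHERE clause with bound parameters instead of filtering in memory. The function `build_list_query` and its return shape are hypothetical; the table name memory_entries, the category and session_id fields, and the count query come from this review comment. The returned pairs are meant to be passed to the database client as query parameters, which is left out here so the sketch stays dependency-free.

```rust
/// Count query quoted in the review comment.
const COUNT_QUERY: &str = "SELECT count() FROM memory_entries GROUP ALL;";

/// Builds a SELECT over `memory_entries` that filters server-side.
/// Returns the query text plus (parameter name, value) pairs to bind.
/// Hypothetical helper: the real `list` signature may differ.
fn build_list_query(
    category: Option<&str>,
    session_id: Option<&str>,
) -> (String, Vec<(&'static str, String)>) {
    let mut conditions = Vec::new();
    let mut binds = Vec::new();

    if let Some(category) = category {
        conditions.push("category = $category");
        binds.push(("category", category.to_string()));
    }
    if let Some(session_id) = session_id {
        conditions.push("session_id = $session_id");
        binds.push(("session_id", session_id.to_string()));
    }

    let mut query = String::from("SELECT * FROM memory_entries");
    if !conditions.is_empty() {
        query.push_str(" WHERE ");
        query.push_str(&conditions.join(" AND "));
    }
    query.push(';');
    (query, binds)
}
```

Using `$name` parameters rather than string interpolation keeps user-supplied category and session values out of the query text.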

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/memory/surreal.rs` around lines 195 - 206,
fetch_all_entries currently selects and deserializes the entire memory_entries
table causing O(n) work for recall/list/count; update the code to avoid
full-table materialization by implementing DB-backed variants: change count() to
run "SELECT count() FROM memory_entries GROUP ALL;" in count_memory_entries (or
in the existing count method) instead of calling fetch_all_entries; change
list() to push category and session_id into a SurrealQL WHERE clause (e.g.,
WHERE category = ... AND session_id = ...) and paginate/limit as needed rather
than deserializing all rows; change recall() to perform a WHERE content CONTAINS
... or use a full-text index query for keyword search (or accept optional
filters on fetch_all_entries so it issues a SELECT ... WHERE ... to SurrealDB)
so only matching rows are returned and deserialized; locate and modify functions
fetch_all_entries, count, list, and recall (and any call sites that rely on
fetch_all_entries) to use these targeted queries and avoid unnecessary
allocations/clones.

140-158: Schema DEFINE statements silently swallow all errors, not just "already exists".

Line 152–154 catches any error from each schema statement and logs it at debug level. Permission errors, syntax errors in future schema changes, or connectivity issues during init would all be silently ignored. Consider checking the error type or at minimum logging at warn for unexpected failures.

Proposed fix
         for statement in statements {
-            if let Err(error) = db.query(statement).await {
-                tracing::debug!("SurrealDB schema statement skipped: {error}");
+            if let Err(error) = db.query(statement).await {
+                let msg = error.to_string();
+                if msg.contains("already exists") {
+                    tracing::debug!("SurrealDB schema statement skipped (already exists): {statement}");
+                } else {
+                    tracing::warn!("SurrealDB schema statement failed: {error} — statement: {statement}");
+                }
             }
         }
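
A stricter variant of the fix above would propagate unexpected failures instead of only logging them. The sketch below shows that control flow in isolation: the database call is replaced by a closure so it can run without a SurrealDB instance, and errors are plain strings. `SchemaOutcome`, `classify_schema_result`, and `apply_schema` are hypothetical names, and matching on the text "already exists" is the same assumption the proposed fix makes about SurrealDB's error message.

```rust
/// Result of running one schema statement.
#[derive(Debug, PartialEq)]
enum SchemaOutcome {
    Applied,
    AlreadyExists,
}

/// Treats "already exists" as benign and surfaces every other error.
fn classify_schema_result(result: Result<(), String>) -> Result<SchemaOutcome, String> {
    match result {
        Ok(()) => Ok(SchemaOutcome::Applied),
        Err(msg) if msg.contains("already exists") => Ok(SchemaOutcome::AlreadyExists),
        Err(msg) => Err(msg),
    }
}

/// Runs each statement through `run` and stops at the first real failure.
/// Returns how many statements were newly applied.
fn apply_schema<F>(statements: &[&str], mut run: F) -> Result<usize, String>
where
    F: FnMut(&str) -> Result<(), String>,
{
    let mut applied = 0;
    for &statement in statements {
        let outcome = classify_schema_result(run(statement))
            .map_err(|e| format!("schema statement failed: {e} (statement: {statement})"))?;
        if outcome == SchemaOutcome::Applied {
            applied += 1;
        }
    }
    Ok(applied)
}
```

In the real `ensure_schema`, the closure would wrap the awaited `db.query(statement)` call, and a typed error match would be preferable to string matching if the client exposes one.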
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/memory/surreal.rs` around lines 140 - 158, The
ensure_schema function currently swallows every error from db.query for each
statement; change this so only benign "already exists" errors are ignored and
everything else is surfaced: in async fn ensure_schema (and the loop over
statements/db.query) inspect the returned error (match on the SurrealDB error
type if available or check the error string for "already exists"); if it
indicates the object already exists, keep the debug log, otherwise log at warn
(or error) and return Err(...) to propagate the failure instead of silently
continuing. Ensure references: statements array, the db.query(...) call, and the
ensure_schema function are updated accordingly.

272-361: store() makes up to 6 DB round-trips per call.

The flow: get (check previous) → upsert → log_event → log_relation (category) → list (session entries) → log_relation (previous entry). For a session-scoped store, that's 6 separate database calls. Consider batching related writes into a single SurrealQL transaction to reduce latency and ensure atomicity.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/memory/surreal.rs` around lines 272 - 361, store()
currently performs multiple sequential DB calls (get, upsert, log_event,
log_relation, list, log_relation) causing up to 6 round-trips; change it to run
as a single SurrealDB transaction that upserts the EntryWrite (use record_id and
payload), returns whether a previous record existed, inserts the event
(log_event) and the category relation (log_relation "entry_category"), and if
session_id is present also insert entry_session and compute the previous entry
within the same transaction (SELECT ... ORDER BY timestamp DESC LIMIT 1 for that
session) then insert entry_previous if found; keep using the same
payload/record_id/EntryWrite and preserve the embedding logic, but replace the
independent calls to get(), upsert(), list(), and separate
log_relation/log_event calls with a single transactional block (begin/commit) so
the writes are batched and atomic and use the transaction response to determine
the "action" (store vs update) and any previous entry id.
clients/agent-runtime/src/config/schema.rs (3)

774-797: Debug derive will expose password and token in plaintext if this struct is ever logged.

SurrealMemoryConfig derives Debug, which means any tracing::debug!("{:?}", config.memory.surreal) or similar call will emit credentials. Consider a manual Debug impl that redacts sensitive fields, or wrap secrets in a redacting newtype.

♻️ Example: manual Debug impl that redacts secrets
-#[derive(Debug, Clone, Serialize, Deserialize)]
+#[derive(Clone, Serialize, Deserialize)]
 pub struct SurrealMemoryConfig {
     ...
 }
+
+impl std::fmt::Debug for SurrealMemoryConfig {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        f.debug_struct("SurrealMemoryConfig")
+            .field("url", &self.url)
+            .field("namespace", &self.namespace)
+            .field("database", &self.database)
+            .field("username", &self.username.as_ref().map(|_| "***"))
+            .field("password", &self.password.as_ref().map(|_| "***"))
+            .field("token", &self.token.as_ref().map(|_| "***"))
+            .field("allow_http_loopback", &self.allow_http_loopback)
+            .finish()
+    }
+}

Based on learnings: "Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable." As per coding guidelines: clients/agent-runtime/src/**/*.rs: Never log secrets, tokens, raw credentials, or sensitive payloads in any logging statements.
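
The alternative mentioned above, a redacting newtype, could look roughly like the following. This is a sketch under assumptions: `Redacted<T>` and `ExampleSurrealConfig` are hypothetical names, the example struct carries only two fields, and serde support is omitted (a real version would also need Serialize/Deserialize, for instance via `#[serde(transparent)]`).

```rust
use std::fmt;

/// Wrapper whose Debug output never reveals the inner value.
#[derive(Clone, Default)]
struct Redacted<T>(T);

impl<T> fmt::Debug for Redacted<T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.write_str("<redacted>")
    }
}

impl<T> Redacted<T> {
    /// Explicit accessor so that reading the secret is a deliberate act.
    fn expose(&self) -> &T {
        &self.0
    }
}

/// Trimmed-down stand-in for SurrealMemoryConfig (illustration only).
/// Deriving Debug is safe here because the secret field redacts itself.
#[derive(Debug, Clone)]
struct ExampleSurrealConfig {
    url: Option<String>,
    password: Option<Redacted<String>>,
}
```

Compared with a manual Debug impl on the whole struct, the newtype keeps the redaction attached to the field, so a newly added secret field cannot be forgotten in a hand-written `fmt`.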

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/config/schema.rs` around lines 774 - 797, The
SurrealMemoryConfig struct currently derives Debug which will expose sensitive
fields (password, token) when logged; replace the automatic Debug derive with a
manual implementation or a redacting newtype so that password and token are not
printed (e.g., implement fmt::Debug for SurrealMemoryConfig and display
placeholders like "<redacted>" for password and token while keeping other fields
readable), or wrap the sensitive fields in a custom Redacted<T> type that
implements Debug redacting the inner value; ensure you remove Debug from the
derive list and update any references relying on the derived Debug to use the
new manual impl or redacting wrappers for SurrealMemoryConfig, password, and
token.

2354-2395: Repetitive env-override blocks could be condensed with a small helper.

Six near-identical blocks follow the same trim-then-set-if-non-empty pattern. A helper would reduce surface area for copy-paste errors and make adding future fields easier.

♻️ Optional helper extraction
fn env_override_optional(var: &str, target: &mut Option<String>) {
    if let Ok(raw) = std::env::var(var) {
        let value = raw.trim();
        if !value.is_empty() {
            *target = Some(value.to_string());
        }
    }
}

Then the six blocks become:

env_override_optional("CORVUS_SURREALDB_URL", &mut self.memory.surreal.url);
env_override_optional("CORVUS_SURREALDB_NAMESPACE", &mut self.memory.surreal.namespace);
env_override_optional("CORVUS_SURREALDB_DATABASE", &mut self.memory.surreal.database);
env_override_optional("CORVUS_SURREALDB_USERNAME", &mut self.memory.surreal.username);
env_override_optional("CORVUS_SURREALDB_PASSWORD", &mut self.memory.surreal.password);
env_override_optional("CORVUS_SURREALDB_TOKEN", &mut self.memory.surreal.token);

This helper pattern could also benefit the other existing env-override blocks (e.g., web search provider, brave API key, etc.).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/config/schema.rs` around lines 2354 - 2395, Extract
the repeated trim-and-set logic into a small helper (e.g.,
env_override_optional) and replace the six near-identical blocks that set
self.memory.surreal.url / namespace / database / username / password / token
with calls to that helper using the corresponding env var names
("CORVUS_SURREALDB_URL", "CORVUS_SURREALDB_NAMESPACE",
"CORVUS_SURREALDB_DATABASE", "CORVUS_SURREALDB_USERNAME",
"CORVUS_SURREALDB_PASSWORD", "CORVUS_SURREALDB_TOKEN"); the helper should accept
the env var name and a &mut Option<String>, read std::env::var, trim, and assign
Some(value.to_string()) only if non-empty, and you can reuse the same helper for
other env-override blocks (web search provider, brave API key, etc.).

2161-2190: url, namespace, and database are connection metadata, not secrets — encrypting them reduces debuggability without meaningful security benefit.

Only username, password, and token need encryption. Encrypting the endpoint URL and schema identifiers means operators can't inspect config.toml to verify connection settings without decrypting, and the decrypt-on-load / encrypt-on-save round-trips add unnecessary complexity.

Consider limiting encryption to actual credentials:

♻️ Suggested change for load path (similar change needed for save path)
-            decrypt_optional_secret(
-                &store,
-                &mut config.memory.surreal.url,
-                "config.memory.surreal.url",
-            )?;
-            decrypt_optional_secret(
-                &store,
-                &mut config.memory.surreal.namespace,
-                "config.memory.surreal.namespace",
-            )?;
-            decrypt_optional_secret(
-                &store,
-                &mut config.memory.surreal.database,
-                "config.memory.surreal.database",
-            )?;
             decrypt_optional_secret(
                 &store,
                 &mut config.memory.surreal.username,
                 "config.memory.surreal.username",
             )?;
             decrypt_optional_secret(
                 &store,
                 &mut config.memory.surreal.password,
                 "config.memory.surreal.password",
             )?;
             decrypt_optional_secret(
                 &store,
                 &mut config.memory.surreal.token,
                 "config.memory.surreal.token",
             )?;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/config/schema.rs` around lines 2161 - 2190, The
current decryption loop calls decrypt_optional_secret on connection metadata
(config.memory.surreal.url, .namespace, .database) which should not be treated
as secrets; update the load path to only call decrypt_optional_secret for
credentials (config.memory.surreal.username, .password, .token) and remove calls
for .url, .namespace, and .database, leaving them as plain values; also apply
the corresponding change in the save/encrypt path so only
username/password/token are encrypted/decrypted, referencing the
decrypt_optional_secret function and the config.memory.surreal.* fields to
locate where to modify.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@clients/agent-runtime/src/memory/mod.rs`:
- Around line 167-190: The Surreal backend branch (MemoryBackendKind::Surreal)
performs f64-to-f32 casts for config.vector_weight and config.keyword_weight
when calling SurrealMemory::new and is missing the Clippy suppression used
elsewhere; add #[allow(clippy::cast_possible_truncation)] to the block
containing the casts (the cfg(feature = "memory-surreal") section) so the as f32
conversions of config.vector_weight and config.keyword_weight do not emit
warnings.
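
A minimal sketch of the suppression pattern described above, assuming the weights are stored as `f64` and consumed as `f32`. `HybridWeights` and `to_f32_weights` are hypothetical names used for illustration only; they are not taken from the PR.

```rust
/// Hypothetical stand-in for the part of the memory config that holds the weights.
pub struct HybridWeights {
    pub vector_weight: f64,
    pub keyword_weight: f64,
}

/// Narrow the configured weights to f32 for a backend constructor.
/// The lint is silenced only here, where the precision loss is intentional.
#[allow(clippy::cast_possible_truncation)]
pub fn to_f32_weights(config: &HybridWeights) -> (f32, f32) {
    (config.vector_weight as f32, config.keyword_weight as f32)
}
```

Scoping the attribute to the smallest item that contains the casts keeps the lint active for the rest of the module.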

In `@clients/agent-runtime/src/memory/surreal.rs`:
- Around line 559-583: validate_endpoint_security currently permits "ws"
unconditionally; treat "ws" like "http" by applying the same allow_http_loopback
and is_loopback_host checks so unencrypted websocket endpoints cannot connect to
non-loopback hosts. Update the match in validate_endpoint_security to handle
"http" and "ws" together (or duplicate the same guard logic for "ws"), using the
existing allow_http_loopback parameter and is_loopback_host(host) call, and
produce the same informative error messages when rejecting insecure non-loopback
endpoints; leave "wss" and "https" allowed as before.
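
The guard described above could look roughly like the following sketch. It assumes the scheme and host have already been parsed out of the endpoint; the body of `is_loopback_host` and the exact signature of `validate_endpoint_security` are assumptions, since the review only names them.

```rust
/// Hypothetical loopback check; the real `is_loopback_host` may differ.
fn is_loopback_host(host: &str) -> bool {
    if host.eq_ignore_ascii_case("localhost") {
        return true;
    }
    // Strip IPv6 brackets, then let the standard library classify the address.
    host.trim_start_matches('[')
        .trim_end_matches(']')
        .parse::<std::net::IpAddr>()
        .map(|ip| ip.is_loopback())
        .unwrap_or(false)
}

/// Treat `ws` exactly like `http`: allowed only for loopback hosts, and only
/// when the caller opted in through `allow_http_loopback`.
fn validate_endpoint_security(
    scheme: &str,
    host: &str,
    allow_http_loopback: bool,
) -> Result<(), String> {
    match scheme {
        // Encrypted transports are always accepted.
        "https" | "wss" => Ok(()),
        "http" | "ws" => {
            if allow_http_loopback && is_loopback_host(host) {
                Ok(())
            } else {
                Err(format!(
                    "insecure scheme '{scheme}' is only allowed for loopback hosts, got '{host}'"
                ))
            }
        }
        other => Err(format!("unsupported endpoint scheme '{other}'")),
    }
}
```

Matching `"http" | "ws"` in one arm keeps the two unencrypted schemes from drifting apart if the guard changes later.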

In `@clients/agent-runtime/src/onboard/wizard.rs`:
- Around line 2428-2449: The docker scaffold in scaffold_surreal_docker_files
writes a compose_contents string that starts SurrealDB with the memory backend
(the literal "memory" argument) which is non-persistent; change the
compose_contents to use a persistent backend (e.g., use a rocksdb:// or
surrealkv:// URI instead of "memory"), add a named volume mount in the compose
service so data persists across container restarts, and include a brief inline
warning comment in the compose_contents string indicating the difference between
in-memory and persistent storage; update the usage of the related file paths
(compose_path, env_example_path) if needed to reflect this persistence change.

---

Outside diff comments:
In `@clients/agent-runtime/src/config/schema.rs`:
- Around line 824-825: Update the doc comment for the struct field `backend` to
include the newly supported "surreal" option; the current comment only lists
`"sqlite" | "lucid" | "markdown" | "none"`, so edit the `backend: String` doc
comment to enumerate `"sqlite" | "lucid" | "markdown" | "surreal" | "none"`
(keeping the `none` explanation) so the documentation matches the code.

---

Nitpick comments:
In `@clients/agent-runtime/docker-compose.yml`:
- Around line 56-62: The SurrealDB service in the docker-compose snippet uses
the floating image tag `surrealdb/surrealdb:latest` which can introduce breaking
changes; update the `surrealdb` service definition to pin the image to a
specific compatible version (for example `surrealdb/surrealdb:v2.3`) to match
the Cargo dependency target, ensuring the `image` field is changed accordingly
and any documentation or comments reflect the pinned version.

In `@clients/agent-runtime/README.md`:
- Around line 345-356: Update the example SurrealDB env block to avoid a
realistic default password; replace the literal "corvus-pass" value for
CORVUS_SURREALDB_PASSWORD with a clear placeholder like "your-password-here"
(referencing the CORVUS_SURREALDB_PASSWORD env variable in the README code
block) so the docs encourage users to choose their own secret instead of copying
a default.

In `@clients/agent-runtime/src/config/schema.rs`:
- Around line 774-797: The SurrealMemoryConfig struct currently derives Debug
which will expose sensitive fields (password, token) when logged; replace the
automatic Debug derive with a manual implementation or a redacting newtype so
that password and token are not printed (e.g., implement fmt::Debug for
SurrealMemoryConfig and display placeholders like "<redacted>" for password and
token while keeping other fields readable), or wrap the sensitive fields in a
custom Redacted<T> type that implements Debug redacting the inner value; ensure
you remove Debug from the derive list and update any references relying on the
derived Debug to use the new manual impl or redacting wrappers for
SurrealMemoryConfig, password, and token.
- Around line 2354-2395: Extract the repeated trim-and-set logic into a small
helper (e.g., env_override_optional) and replace the six near-identical blocks
that set self.memory.surreal.url / namespace / database / username / password /
token with calls to that helper using the corresponding env var names
("CORVUS_SURREALDB_URL", "CORVUS_SURREALDB_NAMESPACE",
"CORVUS_SURREALDB_DATABASE", "CORVUS_SURREALDB_USERNAME",
"CORVUS_SURREALDB_PASSWORD", "CORVUS_SURREALDB_TOKEN"); the helper should accept
the env var name and a &mut Option<String>, read std::env::var, trim, and assign
Some(value.to_string()) only if non-empty, and you can reuse the same helper for
other env-override blocks (web search provider, brave API key, etc.).
- Around line 2161-2190: The current decryption loop calls
decrypt_optional_secret on connection metadata (config.memory.surreal.url,
.namespace, .database) which should not be treated as secrets; update the load
path to only call decrypt_optional_secret for credentials
(config.memory.surreal.username, .password, .token) and remove calls for .url,
.namespace, and .database, leaving them as plain values; also apply the
corresponding change in the save/encrypt path so only username/password/token
are encrypted/decrypted, referencing the decrypt_optional_secret function and
the config.memory.surreal.* fields to locate where to modify.
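
One way the `env_override_optional` helper suggested above might be written, as a sketch only. The split into a pure `apply_optional_override` function is an assumption made here so the trimming rule can be tested without touching the process environment.

```rust
use std::env;

/// Hypothetical pure core: overwrite `target` only when `raw` is present and
/// non-blank after trimming.
fn apply_optional_override(raw: Option<&str>, target: &mut Option<String>) {
    if let Some(value) = raw.map(str::trim).filter(|v| !v.is_empty()) {
        *target = Some(value.to_string());
    }
}

/// Hypothetical helper matching the shape described in the review: takes the
/// env var name and the optional field to override.
fn env_override_optional(var_name: &str, target: &mut Option<String>) {
    apply_optional_override(env::var(var_name).ok().as_deref(), target);
}
```

Each of the six SurrealDB blocks would then reduce to a single call, for example `env_override_optional("CORVUS_SURREALDB_URL", &mut self.memory.surreal.url)`.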

In `@clients/agent-runtime/src/main.rs`:
- Around line 124-126: The CLI lost the "(default: sqlite)" hint for the memory
option; update the #[arg(...)] on the memory field so the help text includes the
default hint (e.g., add help = "Memory backend (sqlite, lucid, surreal,
markdown, none) - used in quick mode (default: sqlite)") while keeping the field
as Option<String> and not changing its type or behavior; target the memory
field's attribute in main.rs to restore the default note in the --help output.

In `@clients/agent-runtime/src/memory/surreal.rs`:
- Around line 102-138: The client() method currently clones ws_endpoint,
namespace, database, username, password, and token before calling
self.client.get_or_try_init(...), causing those clones to happen on every call;
move the clones into the async move closure passed to get_or_try_init so the
strings are cloned only during initialization (i.e., capture self.ws_endpoint,
self.namespace, self.database, self.username, self.password, self.token by
cloning them inside the closure body before using them), and remove the
pre-closure clones so subsequent client() calls avoid unnecessary allocations;
keep using get_or_try_init, Surreal::new::<Ws>, authentication branches, and
Self::ensure_schema(&db) as-is.
- Around line 439-451: The fallback branch re-creates the lowercase searchable
string for each row (format!("{} {}", row.key.to_lowercase(),
row.content.to_lowercase())) even though it was already computed in the main
scoring loop; update the code to compute and store the lowercase/searchable text
once and reuse it in the fallback instead of recomputing: e.g., when building
row_by_id or when converting rows earlier, add a cached searchable_lower (or a
small HashMap id->searchable_lower) and then change the fallback filter to use
that cached value for .contains(&query_lower) and pass the existing score to
Self::row_to_entry; alternatively, if you follow the fetch_all_entries change,
perform the server-side filtering to avoid this client-side lowercase work
entirely.
- Around line 195-206: fetch_all_entries currently selects and deserializes the
entire memory_entries table causing O(n) work for recall/list/count; update the
code to avoid full-table materialization by implementing DB-backed variants:
change count() to run "SELECT count() FROM memory_entries GROUP ALL;" in
count_memory_entries (or in the existing count method) instead of calling
fetch_all_entries; change list() to push category and session_id into a
SurrealQL WHERE clause (e.g., WHERE category = ... AND session_id = ...) and
paginate/limit as needed rather than deserializing all rows; change recall() to
perform a WHERE content CONTAINS ... or use a full-text index query for keyword
search (or accept optional filters on fetch_all_entries so it issues a SELECT
... WHERE ... to SurrealDB) so only matching rows are returned and deserialized;
locate and modify functions fetch_all_entries, count, list, and recall (and any
call sites that rely on fetch_all_entries) to use these targeted queries and
avoid unnecessary allocations/clones.
- Around line 140-158: The ensure_schema function currently swallows every error
from db.query for each statement; change this so only benign "already exists"
errors are ignored and everything else is surfaced: in async fn ensure_schema
(and the loop over statements/db.query) inspect the returned error (match on the
SurrealDB error type if available or check the error string for "already
exists"); if it indicates the object already exists, keep the debug log,
otherwise log at warn (or error) and return Err(...) to propagate the failure
instead of silently continuing. Ensure references: statements array, the
db.query(...) call, and the ensure_schema function are updated accordingly.
- Around line 272-361: store() currently performs multiple sequential DB calls
(get, upsert, log_event, log_relation, list, log_relation) causing up to 6
round-trips; change it to run as a single SurrealDB transaction that upserts the
EntryWrite (use record_id and payload), returns whether a previous record
existed, inserts the event (log_event) and the category relation (log_relation
"entry_category"), and if session_id is present also insert entry_session and
compute the previous entry within the same transaction (SELECT ... ORDER BY
timestamp DESC LIMIT 1 for that session) then insert entry_previous if found;
keep using the same payload/record_id/EntryWrite and preserve the embedding
logic, but replace the independent calls to get(), upsert(), list(), and
separate log_relation/log_event calls with a single transactional block
(begin/commit) so the writes are batched and atomic and use the transaction
response to determine the "action" (store vs update) and any previous entry id.
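
For the `ensure_schema` item above, the error handling could be sketched as follows. The `run` closure is a hypothetical stand-in for `db.query(...)`, and matching on the text "already exists" is an assumption; a typed SurrealDB error variant would be preferable if one is available.

```rust
/// Assumption: "already exists" in the message marks a benign re-definition.
fn is_already_exists(message: &str) -> bool {
    message.to_ascii_lowercase().contains("already exists")
}

/// Run each schema statement through `run` and stop at the first real failure.
fn ensure_schema<F>(statements: &[&str], mut run: F) -> Result<(), String>
where
    F: FnMut(&str) -> Result<(), String>,
{
    for &statement in statements {
        match run(statement) {
            Ok(()) => {}
            // Benign: the object was defined by an earlier start-up.
            Err(message) if is_already_exists(&message) => {}
            // Anything else is surfaced to the caller instead of being swallowed.
            Err(message) => {
                return Err(format!("schema statement failed: {statement}: {message}"));
            }
        }
    }
    Ok(())
}
```

Returning on the first non-benign error means a misconfigured namespace or a permissions problem shows up at start-up rather than on the first write.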

In `@dev/config.template.toml`:
- Around line 9-19: The template's memory.backend is set to "surreal" while the
code default_memory_backend_key() returns "sqlite"; either update the [memory]
backend value to "sqlite" to match the code default or add a clear comment above
the [memory] section stating this template is intentionally tailored for
SurrealDB workflows (and not a generic starting point), referencing the
memory.backend key and the default_memory_backend_key() function so
reviewers/users understand the discrepancy.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (1)
clients/agent-runtime/src/config/schema.rs (1)

2266-2274: No input validation on CORVUS_MEMORY_BACKEND / MEMORY_BACKEND values

Any arbitrary string (e.g. CORVUS_MEMORY_BACKEND=unknown_typo) is accepted silently, with no validation against the known set ("sqlite", "lucid", "markdown", "surreal", "none"). This can cause a confusing runtime failure deep in the memory backend initialisation rather than a clear config error.

This is a pre-existing pattern for other string-typed env overrides (provider, model), but introducing a new "surreal" backend is a good opportunity to tighten this. Consider logging a tracing::warn! when the backend value is unrecognised.
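
A minimal sketch of the check, assuming the allowed set listed above. `normalize_memory_backend` and `apply_backend_override` are hypothetical names, and `eprintln!` stands in for `tracing::warn!` so the example has no external dependencies.

```rust
/// Backend names accepted by the config, as listed in the review comment.
const KNOWN_BACKENDS: [&str; 5] = ["sqlite", "lucid", "markdown", "surreal", "none"];

/// Returns the trimmed, lowercased backend name, or `None` when it is unrecognized.
fn normalize_memory_backend(raw: &str) -> Option<String> {
    let candidate = raw.trim().to_ascii_lowercase();
    if KNOWN_BACKENDS.contains(&candidate.as_str()) {
        Some(candidate)
    } else {
        None
    }
}

/// Override `backend` only for recognized values; otherwise warn and keep the old one.
fn apply_backend_override(raw: &str, backend: &mut String) {
    match normalize_memory_backend(raw) {
        Some(value) => *backend = value,
        // In the real code this would be a tracing::warn! call.
        None => eprintln!(
            "ignoring unrecognized memory backend {raw:?}; expected one of {KNOWN_BACKENDS:?}"
        ),
    }
}
```

With this shape, `CORVUS_MEMORY_BACKEND=unknown_typo` leaves the configured backend untouched and produces a warning that names the accepted values.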

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/config/schema.rs` around lines 2266 - 2274,
Validate the incoming CORVUS_MEMORY_BACKEND / MEMORY_BACKEND value against the
allowed set before assigning to self.memory.backend: after trimming and
normalizing (e.g., to_ascii_lowercase()) check it is one of "sqlite", "lucid",
"markdown", "surreal", or "none"; if it is, assign it to self.memory.backend as
currently done, otherwise do not override and emit a tracing::warn! that
includes the unrecognized value and the allowed list so the user gets a clear
config warning (use the existing env-var lookup and the self.memory.backend
symbol to locate where to change behavior).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@clients/agent-runtime/src/config/schema.rs`:
- Around line 3764-3775: Add a new unit test that verifies the fallback to the
non-prefixed environment variable for memory backend: create a test (e.g.,
env_override_memory_backend_fallback) that uses env_override_test_guard(),
constructs Config::default(), ensures CORVUS_MEMORY_BACKEND is removed, sets
MEMORY_BACKEND to a value like "markdown", calls config.apply_env_overrides(),
and asserts config.memory.backend equals the fallback value, then cleans up by
removing MEMORY_BACKEND; this exercises the apply_env_overrides fallback path
alongside the existing env_override_memory_backend test.
- Around line 822-834: The Debug impl for SurrealMemoryConfig currently exposes
username; change the ".field(\"username\", &self.username)" to redact the value
the same way password and token are handled (e.g., map to "<redacted>" via
self.username.as_ref().map(|_| "<redacted>")) inside the impl fmt::Debug for
SurrealMemoryConfig so Debug never prints raw credentials, and update the test
surreal_memory_config_debug_redacts_sensitive_fields to assert that the username
string does not appear in the debug output (i.e., ensure debug output contains
"<redacted>" for username or simply does not contain the real username).
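
The redaction described above can be sketched as a manual `Debug` impl. The struct here is a hypothetical subset of `SurrealMemoryConfig` (the field names follow the review text; the real struct has more fields), and the `redact` closure is an illustrative helper.

```rust
use std::fmt;

/// Hypothetical subset of the config; the real struct has additional fields.
#[derive(Clone, Default)]
pub struct SurrealMemoryConfig {
    pub url: Option<String>,
    pub username: Option<String>,
    pub password: Option<String>,
    pub token: Option<String>,
}

impl fmt::Debug for SurrealMemoryConfig {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Keep Some/None visible while hiding the credential itself.
        let redact = |value: &Option<String>| value.as_ref().map(|_| "<redacted>");
        f.debug_struct("SurrealMemoryConfig")
            .field("url", &self.url)
            .field("username", &redact(&self.username))
            .field("password", &redact(&self.password))
            .field("token", &redact(&self.token))
            .finish()
    }
}
```

Mapping to `Some("<redacted>")` rather than dropping the field keeps the output useful for debugging, because it still shows whether a credential was configured.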

In `@clients/agent-runtime/src/memory/surreal.rs`:
- Around line 421-509: The recall method currently calls recall_rows which
prefilters by keyword, causing purely semantic queries to miss results; change
recall to also fetch a vector-only candidate set (e.g., call a new
recall_vector_candidates or call recall_rows with an option to skip keyword
filtering, or fetch the most recent N rows) and merge those rows into row_by_id
and searchable_by_id before computing vector similarities and keyword scores;
ensure you compute embeddings for those additional rows if missing, deduplicate
by id, then run the existing vector scoring/keyword scoring and hybrid_merge on
the combined candidate set so semantic-only queries can return vector matches.
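
The merge step described above might look like this sketch. `Row` and `merge_candidates` are hypothetical; the review only names `row_by_id`, and how the vector-only candidates are fetched is left open here.

```rust
use std::collections::HashMap;

/// Hypothetical row shape; the real type carries more fields (embedding, timestamps).
#[derive(Clone, Debug, PartialEq)]
struct Row {
    id: String,
    content: String,
}

/// Combine keyword-prefiltered rows with vector-only candidates, deduplicated by id.
/// When both sets contain the same id, the keyword row is kept.
fn merge_candidates(keyword_rows: Vec<Row>, vector_rows: Vec<Row>) -> HashMap<String, Row> {
    let mut row_by_id = HashMap::new();
    for row in keyword_rows.into_iter().chain(vector_rows) {
        row_by_id.entry(row.id.clone()).or_insert(row);
    }
    row_by_id
}
```

Scoring would then run over the combined map, so a row that matches only semantically still reaches the hybrid merge.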

---

Nitpick comments:
In `@clients/agent-runtime/src/config/schema.rs`:
- Around line 2266-2274: Validate the incoming CORVUS_MEMORY_BACKEND /
MEMORY_BACKEND value against the allowed set before assigning to
self.memory.backend: after trimming and normalizing (e.g., to_ascii_lowercase())
check it is one of "sqlite", "lucid", "markdown", "surreal", or "none"; if it
is, assign it to self.memory.backend as currently done, otherwise do not
override and emit a tracing::warn! that includes the unrecognized value and the
allowed list so the user gets a clear config warning (use the existing env-var
lookup and the self.memory.backend symbol to locate where to change behavior).

@yacosta738 yacosta738 merged commit 7ca0787 into main Feb 19, 2026
14 checks passed
@yacosta738 yacosta738 deleted the bart/feature/dallay-131-implementing-a-custom-knowledge-graph-memory branch February 19, 2026 11:56
@yacosta738 yacosta738 mentioned this pull request Mar 16, 2026
This was referenced Apr 19, 2026