feat: migrate memory to Cerebro MCP#235
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughReplaces the SurrealDB long-term memory backend with an MCP-backed Cerebro service: adds a new Cerebro crate (service, storage, tools, schemas), migrates runtime config to MemoryCerebroConfig, rewires memory tools/loaders to call Cerebro via MCP (with local fallbacks), removes SurrealDB code/tests, and updates docs, CI, and onboarding. Changes
Sequence Diagram(s)sequenceDiagram
participant Agent as Agent Runtime
participant Loader as CerebroMemoryLoader
participant MCP as MCP Client Adapter
participant Cerebro as Cerebro Service
Agent->>Loader: load_context(user_message)
Loader->>Loader: validate cerebro endpoint
alt endpoint configured
Loader->>MCP: call_tool_http(name="mem_search", args)
MCP->>Cerebro: POST /mcp (JSON-RPC: tools/call mem_search)
Cerebro-->>MCP: JSON-RPC result (results array)
MCP-->>Loader: result.output
Loader->>Loader: filter by score & format [Memory context]
Loader-->>Agent: memory context string
else no endpoint
Loader-->>Agent: error (missing endpoint)
end
sequenceDiagram
participant User as Caller
participant Tool as MemoryStoreTool
participant Guard as Sensitive Data Guard
participant Adapter as MCP Client Adapter
participant Cerebro as Cerebro Service
User->>Tool: execute(store_payload)
Tool->>Guard: contains_sensitive_data(content)
alt sensitive detected
Guard-->>Tool: blocked_error
Tool-->>User: ToolResult::error
else clean
Tool->>Adapter: cerebro_tool_adapter() -> call mem_save
Adapter->>Cerebro: POST /mcp (JSON-RPC: tools/call mem_save)
Cerebro-->>Adapter: { result: { output: {...} } }
Adapter-->>Tool: result.output
Tool-->>User: ToolResult::success
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
✅ Contributor ReportUser: @yacosta738
Contributor Report evaluates based on public GitHub activity. Analysis period: 2025-03-16 to 2026-03-16 |
Deploying corvus with
|
| Latest commit: |
ab20532
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://efad9081.corvus-42x.pages.dev |
| Branch Preview URL: | https://feature-213-core-architectur.corvus-42x.pages.dev |
There was a problem hiding this comment.
Actionable comments posted: 43
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
clients/agent-runtime/src/tools/mod.rs (1)
291-325:⚠️ Potential issue | 🟠 MajorDon’t register memory tools with no local fallback.
_memoryis now ignored and these tools are always built fromroot_config.memory.cerebro. On non-Cerebro installs that removes the injected local-memory implementation from the registry, somemory_store/memory_recall/memory_forgetare advertised even though they no longer have an in-process backend to operate on. Either keep the local fallback or stop registering them when Cerebro is unset.As per coding guidelines: "Look for behavioral regressions, missing tests, and contract breaks across modules."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/mod.rs` around lines 291 - 325, The memory tools are always registered from root_config.memory.cerebro and the _memory parameter is ignored, which advertises memory tools when no local backend exists; update all_tools_with_runtime to: check whether a local memory implementation was injected via the _memory Arc and, if present, construct MemoryStoreTool/MemoryRecallTool/MemoryForgetTool from that local Arc<dyn Memory>; otherwise only register those tools when the Cerebro backend is configured (e.g. root_config.memory.cerebro.is_some()); ensure you reference the constructors MemoryStoreTool::new, MemoryRecallTool::new, MemoryForgetTool::new and do not drop or ignore the _memory parameter so the registry contracts remain correct.clients/agent-runtime/src/onboard/wizard.rs (1)
709-749:⚠️ Potential issue | 🟠 MajorQuick setup no longer exposes a working Cerebro migration path.
This flow only accepts
--memoryand then writesMemoryCerebroConfig::default(). There is no prompt or flag for the Cerebro endpoint/token, so fresh configs immediately hit the new"Cerebro MCP endpoint is required"failures in the migrated memory tools until the user hand-edits TOML. Add a Cerebro setup step here, or explicitly hide/disable the Cerebro-only tools when the endpoint is absent.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/onboard/wizard.rs` around lines 709 - 749, The quick-setup path currently sets MemoryCerebroConfig::default() in memory_config_defaults_for_backend(backend: &str) without collecting or disabling Cerebro credentials, causing runtime failures; update memory_config_defaults_for_backend (or backend_key_from_choice flow) to either populate a minimal valid Cerebro config from CLI flags/env (e.g., accept --cerebro-endpoint and --cerebro-token) or explicitly mark cerebro as disabled when no endpoint/token is provided (so MemoryCerebroConfig is not treated as required), and ensure any code that checks for Cerebro MCP endpoint treats the absence as “disabled” rather than an error; reference MemoryCerebroConfig::default(), memory_config_defaults_for_backend, and backend_key_from_choice to locate where to add the flag parsing or the disablement logic.clients/agent-runtime/src/tools/memory_recall.rs (1)
28-55:⚠️ Potential issue | 🟠 MajorHarden recall input validation and keep bad input inside
ToolResult.This path still accepts whitespace-only queries, leaves
limiteffectively unbounded, and returnsErr(...)for a missing query instead of a structured failure. With the current Cerebro search implementation,""matches every record, so" "becomes a memory dump up to whateverlimitthe caller sends.Suggested fix
"query": { "type": "string", + "minLength": 1, "description": "Keywords or phrase to search for in memory" }, "limit": { "type": "integer", + "minimum": 1, + "maximum": 50, "description": "Max results to return (default: 5)" } - let query = args - .get("query") - .and_then(|v| v.as_str()) - .ok_or_else(|| anyhow::anyhow!("Missing 'query' parameter"))?; + let query = match args + .get("query") + .and_then(|v| v.as_str()) + .map(str::trim) + .filter(|value| !value.is_empty()) + { + Some(value) => value, + None => { + return Ok(ToolResult { + success: false, + output: String::new(), + error: Some("Missing or empty 'query' parameter".into()), + structured: None, + }); + } + }; - #[allow(clippy::cast_possible_truncation)] let limit = args .get("limit") .and_then(serde_json::Value::as_u64) - .map_or(5, |v| v as usize); + .map(|v| v.clamp(1, 50) as usize) + .unwrap_or(5);Based on learnings: Implement
Tooltrait insrc/tools/with strict parameter schema, validate and sanitize all inputs, and return structuredToolResultwithout panics in runtime path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/memory_recall.rs` around lines 28 - 55, Update the parameters_schema and execute implementations to strictly validate and sanitize inputs: in parameters_schema (the method named parameters_schema) add constraints for "query" (e.g., minLength: 1) and "limit" (integer minimum 1, maximum a sane MAX_LIMIT like 100) so the schema expresses bounds; in execute (the async fn execute) trim the incoming "query" string and treat a missing or whitespace-only query as a structured failure by returning a ToolResult::Failure (with a clear message) instead of propagating an anyhow::Error; similarly parse and clamp "limit" to a default (5) and an upper bound (MAX_LIMIT) and return a ToolResult::Failure for non-numeric or out-of-range values, ensuring no panics or Err(...) escape and all invalid input paths return ToolResult variants rather than errors.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@clients/agent-runtime/Cargo.toml`:
- Line 203: Remove the dead SurrealRemote enum variant from the config enum in
modules/cerebro/src/config.rs: delete the SurrealRemote variant declaration and
remove any match arms, uses, imports, or serde attributes that reference
SurrealRemote (e.g., pattern matches or serialization tags referring to
SurrealRemote); then rebuild (cargo check/build) and fix any remaining compile
errors by removing or refactoring code that assumed the SurrealRemote variant.
- Line 203: The crate `cerebro` is currently only in dev-dependencies but is
directly imported by production code (e.g., calls to
cerebro::cerebro_tool_adapter() used in src/agent/memory_loader.rs,
src/tools/memory_recall.rs, src/tools/memory_forget.rs,
src/tools/memory_store.rs); move the cerebro = { path = "../../modules/cerebro"
} entry out of [dev-dependencies] and add it to [dependencies] in Cargo.toml
(preserve the path/table format), or alternatively remove the direct imports and
refactor those modules to call Cerebro over HTTP/MCP instead—ensure the
dev-dependencies entry is removed so production builds can compile.
In `@clients/agent-runtime/examples/custom_memory.rs`:
- Line 6: The top-of-file documentation comment in custom_memory.rs currently
points to an internal repo path
("clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md") which
isn't user-resolvable; update that doc comment to a public, user-facing URL
(e.g., the published docs site or the full GitHub URL to the markdown) so
readers can open the migration guide directly, replacing the internal path with
the chosen external link in the file-level doc comment at the top of
custom_memory.rs.
In `@clients/agent-runtime/src/agent/memory_loader.rs`:
- Around line 100-115: The code currently returns Ok(String::new()) for both a
missing endpoint and when adapter.execute yields response.success == false,
which masks misconfiguration or remote failures as “no memory”; change both
early returns to propagate errors instead: when endpoint.is_none() return an Err
describing missing Cerebro endpoint (e.g., using anyhow::anyhow! or the crate's
error type) and when !response.success return an Err that includes response
error/details (or adapter.execute error) so failures from
cerebro::cerebro_tool_adapter, adapter.execute, and response.success are
distinguishable from an empty-memory result; update callers of this loader
(CerebroMemoryLoader / the load method) to handle the error or explicitly fall
back before selecting CerebroMemoryLoader.
In `@clients/agent-runtime/src/config/schema.rs`:
- Around line 2845-2854: The runtime validation in validate_memory_config
currently only rejects Surreal variants and lets any other memory.backend pass;
change it to a deny-by-default allowlist: accept only explicitly allowed backend
names (e.g., the intended backend token(s) such as "cerebro" — replace with your
actual allowed values) and call anyhow::bail! for any other value; update the
match on self.memory.backend.as_str() in validate_memory_config to branch:
allowed => Ok(()), and _ => anyhow::bail!("memory.backend '{}' is not supported;
valid options are: <list allowed names>", self.memory.backend) so unknown values
are rejected at runtime.
In `@clients/agent-runtime/src/memory/backend.rs`:
- Around line 66-71: The removal of legacy surreal backend entries causes legacy
configs to be treated as Unknown and silently routed to markdown; restore
explicit handling by either reintroducing legacy MemoryBackendProfile entries
(e.g., SURREAL_PROFILE / SURREAL_GRAPHS_PROFILE) into the
SELECTABLE_MEMORY_BACKENDS array or add a validation check where backend keys
are parsed to detect legacy "surreal" or "surreal-graphs" names and return an
explicit error (or migration hint) instead of mapping to Unknown; update the
code paths that reference SELECTABLE_MEMORY_BACKENDS and the backend-parsing
function to return a clear error for those legacy keys so migrations fail fast
rather than silently switching backends.
In `@clients/agent-runtime/src/tools/mcp/cerebro.rs`:
- Around line 24-43: build_cerebro_server currently skips setting MCP_AUTH_TOKEN
when MemoryCerebroConfig.auth_token is None, allowing a silent auth bypass;
change build_cerebro_server to validate that cerebro.auth_token is present when
an endpoint is configured and return an error if missing (e.g.,
Err(anyhow::anyhow!("memory.cerebro.auth_token must be configured"))), instead
of silently omitting the MCP_AUTH_TOKEN env insertion; update the code paths
around the existing env insertion and the function return so callers (and
cerebro_tool_adapter consumers) cannot proceed without a configured auth token.
In `@clients/agent-runtime/src/tools/mcp/client.rs`:
- Around line 220-225: The response body is being decoded into JSON without
enforcing the configured output_limit_bytes, letting oversized MCP responses
bypass limits; modify the code around response, status and payload handling in
client.rs (the block that calls response.json().await) to first obtain the raw
response bytes with a size guard: check response.content_length() if present
and/or read the body as a limited stream (using response.bytes_stream() or
response.bytes().await combined with tokio/bytes::Buf/Stream::take) and
immediately error if the total bytes exceed output_limit_bytes, then deserialize
the validated byte slice into serde_json::Value (e.g., via
serde_json::from_slice) and preserve the existing context messages on errors.
- Around line 240-250: The MCP HTTP path currently bails with raw JSON text
(payload.get("error")) and JSON-encodes string outputs via output.to_string(),
which breaks semantics; change the error handling to preserve the MCP error
envelope by returning/propagating the entire error payload (e.g., include the
JSON error value in the anyhow error or map it into the client's MCP error type
instead of formatting "{error}"), and change the output handling so that after
obtaining output from payload.get("result").and_then(|value|
value.get("output")), you detect if output.is_string() and return its inner
string value unquoted, otherwise serialize non-string JSON values as
before—update the logic around payload.get("error"), the output variable, and
the call to output.to_string() accordingly.
In `@clients/agent-runtime/src/tools/memory_forget.rs`:
- Around line 94-97: The memory_forget handler is building a payload that sends
the caller-supplied key as "memory_id", which breaks when callers pass a
topic_key; update memory_forget to resolve the provided key to the canonical
record id before sending or accept topic_key deletion: call the lookup used
elsewhere (reuse the record-resolution function or DAO used by recall/retrieve)
to map the incoming key to the stored memory_id, then populate payload with that
resolved memory_id, or alternatively add support for deleting by "topic_key" in
the Cerebro tool by setting payload to include "topic_key": key when resolution
fails; update references in memory_forget and related helpers so deletion uses
the same identifier returned to callers.
In `@clients/agent-runtime/src/tools/memory_store.rs`:
- Around line 168-181: Before calling the remote Cerebro adapter
(cerebro::cerebro_tool_adapter and adapter.execute) in memory_store.rs, perform
an explicit egress-policy check via the security/egress gating code and abort if
the remote endpoint/domain is not allowed; specifically, call the existing
egress enforcement helper (e.g. a function in crate::security::egress such as
check_egress_policy or enforce_egress_for_endpoint) using the Cerebro
endpoint/host and the current operation (ToolOperation::Act), and return an
error before constructing the payload or invoking adapter.execute when the
policy denies the outbound call.
- Around line 11-22: The sensitive-data check in contains_sensitive_data (and
the constant SENSITIVE_DATA_ERROR) is too narrow and must be expanded to
fail-closed: update contains_sensitive_data to treat any uncertain or matching
patterns as sensitive by adding robust pattern checks for quoted JSON
keys/values, emails, phone numbers, Bearer/JWT tokens, SSH/PGP key blocks,
common API key formats and other credential-looking strings (use/centralize
regexes used by contains_sensitive_labels and looks_like_api_key), and return
true on ambiguous input; ensure the memory_store Tool implementation validates
and sanitizes inputs against this function, returns a structured ToolResult on
rejection (no panics), and documents the rejection using SENSITIVE_DATA_ERROR.
- Around line 318-323: The test store_blocks_api_key_pattern uses a literal API
key string which triggers secret scanners; update the fixture so the token is
constructed at runtime or split to avoid a contiguous signature pattern: in the
test function store_blocks_api_key_pattern (which calls MemoryStoreTool::new and
then tool.execute(...)), replace the literal "api_key:
sk-1234567890abcdef1234567890" with a safe runtime-built value (e.g., assemble
the key from two parts or use format! to join "sk-" with a non-sensitive stub)
so the test behavior is unchanged but the exact signature never appears as a
single literal in source.
In `@clients/agent-runtime/tests/mcp_config_validation.rs`:
- Around line 87-97: Add a new unit test mirroring
rejects_insecure_cerebro_endpoint_without_loopback_opt_in but using a ws://
endpoint to ensure WebSocket insecure transports are rejected: create a test
(e.g., rejects_insecure_cerebro_ws_endpoint_without_loopback_opt_in) that sets
config.memory.cerebro.endpoint = Some("ws://cerebro.example.com/mcp".into()),
sets auth_token and request_timeout_ms, calls
config.validate_for_runtime().unwrap_err().to_string(), and asserts the error
contains "memory.cerebro.endpoint" and "allow_insecure_loopback"; this uses the
same validation path as validate_for_runtime and the same Config struct fields
to confirm ws:// is treated like http://.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_context.json`:
- Around line 10-53: The mem_context JSON schema allows empty strings for
identifiers and leaves timestamp unconstrained; update the schema for the
"session_id" property and the item properties "memory_id" and "summary" to
disallow empty strings (e.g., add "minLength": 1) and tighten the "timestamp"
property to require a valid ISO-8601 datetime (e.g., add "format": "date-time"
or a strict regex pattern); ensure these constraints are applied within the
"input" and "output" object definitions (referencing "session_id", "memory_id",
"summary", and "timestamp") and keep existing required fields and
"additionalProperties": false intact.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_delete.json`:
- Around line 32-34: The response schema's "status" enum only includes "deleted"
but the input supports a `hard_delete` option; update the output to either
expand the "status" enum to include both "soft_deleted" and "hard_deleted"
(replace or augment the existing "deleted" value in the "status" enum) or add a
new boolean field like "hard_deleted" alongside "status"; locate and modify the
"status" enum in mem_delete.json to implement one of these options and ensure
any consumers are updated to read the new field or enum values.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_search.json`:
- Line 48: The timestamp property in the JSON schema is currently just "type":
"string", which allows ambiguous formats; update the mem_search.json schema's
"timestamp" property by adding a format constraint (format: "date-time")
alongside type: "string" to enforce ISO-8601 date-time parsing and downstream
consistency for the "timestamp" field.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_end.json`:
- Around line 10-17: The schema properties "session_id" and "ended_at" need
explicit bounds and RFC3339 validation: update the JSON schema in
mem_session_end.json for the "session_id" property to include minLength (>=1)
and a reasonable maxLength (e.g., 128) and optionally a pattern to restrict
allowed characters (e.g., alphanumeric plus -/_), and set the "ended_at"
property to include format:"date-time" (RFC3339) so requests are validated as
proper timestamps.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_start.json`:
- Around line 10-21: Add validation constraints to the JSON schemas: for
session_id in mem_session_start.json (and session_id in
mem_session_summary.json) add minLength: 1 and a reasonable maxLength (e.g. 255)
to prevent empty/oversized IDs; for started_at in mem_session_start.json (and
ended_at in mem_session_end.json) add an RFC3339 timestamp constraint (use
"format": "date-time" or a validated RFC3339 regex pattern) so timestamps are
validated against the expected format; update the relevant property objects
(session_id, started_at, ended_at) in those JSON files and keep property names
exactly as in the diff to locate them.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_summary.json`:
- Around line 14-39: Update the "summary" object schema: add required:
["goal","discoveries","accomplished","blockers","next_steps"] and set
additionalProperties: false to match other MCP schemas (e.g., mem_save.json,
mem_context.json); optionally tighten validation by adding maxLength on "goal"
and maxItems/maxLength on the arrays
("discoveries","accomplished","blockers","next_steps") to enforce bounds.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_suggest_topic_key.json`:
- Around line 10-37: The schema currently marks "seed" (in the input object) and
"topic_key" (in the output object) as required but allows empty strings; update
the property definitions for "seed" and "topic_key" to enforce non-empty strings
by adding a constraint such as "minLength": 1 (or an equivalent non-empty
pattern) to each property, ensuring the "input.seed" and "output.topic_key"
fields cannot be empty; locate the definitions labeled "seed" and "topic_key" in
the schema and add the minLength constraint while keeping "required" and
"additionalProperties": false intact.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_timeline.json`:
- Line 43: The timestamp properties currently declare only "type": "string"
which permits any string; update both "timestamp" fields in mem_timeline.json to
include "format": "date-time" (i.e., change "timestamp": { "type": "string" } to
"timestamp": { "type": "string", "format": "date-time" }) so the schema
documents ISO 8601 timestamps and enables format-aware tooling to validate them;
apply this to both occurrences of the "timestamp" property.
- Around line 14-22: The schema allows unbounded values for the "before" and
"after" integer properties in mem_timeline.json; add a "maximum" constraint to
both "before" and "after" (matching the reasonable limit your Cerebro service
enforces, e.g., MAX_TIMELINE_ITEMS) so clients cannot request arbitrarily large
ranges, update the "before" and "after" property definitions to include
"maximum": <limit> and keep the existing "minimum": 0 and descriptions
unchanged.
- Around line 24-27: The schema option "include_deleted" currently allows any
authenticated caller to fetch soft-deleted records; update the Cerebro service
code that reads the include_deleted flag so it performs a role/permission check
before honoring it: in the request handler that parses include_deleted (the code
path that reads the mem_timeline include_deleted flag), call your
auth/permission validator to verify the caller has an admin or audit role (e.g.,
via existing token validation utility or middleware) and if not, reject the
request or ignore the flag with a 403/permission error; ensure the check is
applied before any data retrieval logic that uses include_deleted so
non-privileged tokens cannot retrieve deleted records.
In `@clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md`:
- Around line 1-73: The Spanish translation for this user-facing doc is missing;
update docs/content/docs/guides/cerebro/migration.md (the "Cerebro Migration
Guide") to indicate the ES translation is pending and/or add the Spanish
translation: add a short bilingual notice at the top (e.g., "Español: traducción
pendiente") and create a corresponding Spanish version of the guide or open a
follow-up issue referencing the migration guide title so the missing ES parity
is tracked and resolved.
- Around line 67-73: Update the "Migration checklist" to include explicit data
migration and rollback steps: add entries describing how to export long-term
memory from SurrealDB (instructions or tool references), how to import that data
into Cerebro (format, endpoints such as memory.cerebro.endpoint and required
auth_token), clarify whether migration is automatic/manual/tool-assisted and
recommended strategy, provide rollback procedures if Cerebro is unreachable
after switching (including re-enabling SurrealDB or restoring exported data),
and clearly state the data-loss risk when removing SurrealDB and that manual
export is required to preserve data; reference mem_* naming changes and
allow_insecure_loopback considerations where relevant.
In `@modules/cerebro/src/config.rs`:
- Around line 3-9: Add a clarifying doc comment to the StorageMode enum (and
specifically the SurrealRemote variant) explaining that SurrealRemote refers to
Cerebro's own persistence backend using SurrealDB for internal storage and is
not the removed agent-runtime SurrealDB backend; place the doc comment above the
enum and/or the SurrealRemote variant (symbols: StorageMode, SurrealRemote) so
readers won't confuse this with the legacy agent memory/runtime backend.
- Around line 58-60: The endpoint() function currently hardcodes "http://" which
conflicts with validation expecting "https://" for non-loopback hosts; update
the Config struct to allow a configurable scheme (e.g., add a scheme: String or
Option<String>) or default to "https" for non-loopback hosts and "http" for
loopback, then modify endpoint() to build the URL using that scheme instead of
the fixed "http://". Ensure endpoint() references the new field (or applies the
host loopback check) so produced URLs satisfy the existing validation logic.
- Around line 19-31: Replace the raw Option<String> auth_token in the
CerebroConfig struct with a secret-safe type (e.g.
Option<secrecy::SecretString>) and enable the secrecy crate's serde feature so
tokens are not accidentally exposed; update the field declaration in
CerebroConfig (auth_token) to Option<SecretString>, add the necessary use/import
for secrecy::SecretString, remove or adjust the automatic Debug/Display
derivation if it would expose secrets (or rely on SecretString's redaction), and
annotate serde to use the secrecy serde helpers (e.g. #[serde(with =
"secrecy::serde::option")] or the appropriate helper) so
serialization/deserialization works; finally, update any code that reads
auth_token to call expose_secret() where the raw value is required.
In `@modules/cerebro/src/server.rs`:
- Around line 116-120: The authorize method currently treats a missing/blank
config.auth_token as allowing all requests; change this so absence of a token
does not implicitly permit access: either reject requests when auth_token is
None/empty by returning an Err(CerebroError::Unauthorized(...)) from authorize,
or enforce at startup that config.auth_token must be non-empty (validate in the
initializer and return an error) unless an explicit opt-in flag (e.g.
allow_unauthenticated_loopback) is present; update the authorize function and/or
startup validation to use config.auth_token and CerebroError consistently and
ensure any explicit opt-in flag is checked before allowing Ok(()).
In `@modules/cerebro/src/storage/mod.rs`:
- Around line 69-79: InMemoryStorage currently stores records only in a
process-local RwLock<HashMap<String, MemoryRecord>}, so add a durable storage
implementation and wiring: introduce a new DiskBackedStorage (or
SqliteStorage/RocksDbStorage) type that implements the same storage API as
InMemoryStorage, implement persistence (serialize/deserialize MemoryRecord
entries to a file or use an embedded DB) and provide a factory API (e.g., change
InMemoryStorage::new to accept an optional persistence path or add
Storage::from_config) so the system can choose the disk-backed implementation
instead of the in-memory one; ensure the new storage exposes the same
methods/traits used elsewhere so callers using InMemoryStorage, records,
MemoryRecord continue to work and that existing migrations (SurrealDB → Cerebro)
persist across restarts.
In `@modules/cerebro/src/tools.rs`:
- Around line 168-182: The match arms in tools.rs inside the handle method
currently return fake success for unimplemented MCP operations (mem_timeline,
mem_save_prompt, mem_session_start, mem_session_end, mem_session_summary,
mem_context); update handle to return an explicit unsupported/NotImplemented
CerebroError for those symbols instead of delegating to placeholder helpers or
synthesizing IDs, and ensure any helper methods referenced (e.g., mem_timeline,
mem_save_prompt, mem_session_start, mem_session_end, mem_session_summary,
mem_context) either return the same unsupported error or are removed until real
storage-backed implementations exist so callers receive a clear error rather
than silent success/loss.
- Around line 227-239: The current limit calculation uses
input.input.limit.unwrap_or(5).max(1) which prevents zero but allows unbounded
large values; clamp the requested page size before calling self.storage.search
by enforcing a sane upper bound (e.g., define a MAX_MEM_SEARCH_LIMIT constant or
use std::cmp::min/clamp) so limit = input.input.limit.unwrap_or(5).clamp(1,
MAX_MEM_SEARCH_LIMIT) (or equivalent) and then pass that clamped limit into
self.storage.search to prevent excessive scanning/serialization.
In `@modules/cerebro/tests/mcp_tools_contract.rs`:
- Around line 62-70: The test currently hardcodes "topic" as the memory_id for
the mem_delete call; instead, after invoking mem_save via call_tool and
asserting save.error.is_none(), extract the actual memory_id from the save
response (e.g., from save.result.get("memory_id").and_then(|v|
v.as_str()).expect(...)) and pass that extracted memory_id into the mem_delete
call input; update the call_tool invocation that uses "mem_delete" to use this
memory_id and keep the existing assertion that delete.error.is_none().
In `@openspec/changes/archive/2026-03-16-cerebro/tasks.md`:
- Around line 70-83: Add an explicit EN/ES parity status item to the
documentation checklist and relevant files: update
clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md,
clients/agent-runtime/README.md, clients/agent-runtime/examples/custom_memory.rs
(example comment/link), root README.md, and the new MCP schema README under
clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/ to include a
clear line indicating whether Spanish translations were completed or are pending
(e.g., "Spanish translation: complete / pending – PR # or ETA"), and add a
top-level checklist entry for EN/ES parity in the tasks list so reviewers know
to verify translations or mark them intentionally deferred.
- Around line 3-89: Add a new explicit checklist item under Phase 5
(Verification Gaps) titled something like "5.5 Rollback & Threat/Risk Plan" that
documents rollback strategy and threat/risk notes (auth failure modes, insecure
endpoint misconfig, outage fallback), and references any related migration docs;
include test tasks for boundary/failure-mode checks (endpoint misconfig, auth
failures, fallback behavior) so the plan is tracked in the main task list and
linked to the existing migration/verification items.
- Around line 60-66: Update the verification checklist entries 3.3, 3.4, 3.5 and
item 5.4 to include an explicit path to the verification report by appending
"(see ./verify-report.md)" to the parenthetical references to "test log notes"
and "verification report" so reviewers can directly access the cargo fmt,
clippy, test results and coverage guidance in ./verify-report.md.
In `@openspec/changes/archive/2026-03-16-cerebro/verify-report.md`:
- Around line 1-56: The archived markdown has markdownlint violations: add
language labels to unlabeled fenced code blocks (use ```bash for the build/test
blocks), ensure there is a blank line before and after each fenced code block
and before/after the table and headings, and normalize fence usage (use matching
triple backticks) so the build/test output blocks and the coverage block are
properly fenced; update the blocks that currently show ``` to ```bash, add a
blank line between the "Build & Tests Execution" heading, the code fences, and
the surrounding text, and ensure the table (| Metric | Value | ...) has a blank
line above and below to satisfy heading/fence/table spacing rules.
In `@openspec/specs/cerebro/spec.md`:
- Around line 78-80: Update the test/spec expectation for legacy SurrealDB
config rejection to assert a specific, actionable error message: when the
runtime rejects a configuration that references the removed SurrealDB backend
(the GIVEN/WHEN/THEN block), the error string must state "SurrealDB backend has
been removed", recommend using the "Cerebro" backend as the alternative, and
include a URL to the migration documentation (e.g., "See
https://.../migrate-surrealdb-to-cerebro"). Modify the THEN clause in the spec
(the rejection expectation) to require that exact content (or a clearly
specified template with placeholders) so implementations must produce that
message when rejecting legacy SurrealDB configs.
- Around line 87-90: Update the "Out of scope: SurrealDB migration" paragraph to
explicitly point readers to the SurrealDB migration guide by adding a short
cross-reference and link; mention that existing SurrealDB memories remain
inaccessible to aliased tools unless migrated and include the migration guide
(e.g., "See the SurrealDB migration guide for steps to migrate existing
memories") so users have a clear migration path; modify the text that references
`memory_store`/`memory_recall`/`memory_forget` and
`mem_save`/`mem_search`/`mem_delete` in spec.md to include that pointer and, if
available, the doc ID or URL for the migration instructions.
- Around line 118-121: The spec mentions "explicit loopback opt-in" but doesn't
define it; update openspec/specs/cerebro/spec.md to specify a single
configuration flag named loopbackOptIn (boolean) under both the runtime config
and Cerebro config schemas, state that loopback detection is automatic (checks
hostnames/IPs 127.0.0.1, ::1, and localhost) and that the flag must be set true
to allow protocols http or ws to bind to loopback addresses, and add a short
example configuration snippet showing loopbackOptIn: true in the runtime/Cerebro
config for local development and note that absence or false rejects such
bindings with a security error.
- Around line 59-62: Update the spec to mandate a concrete enforcement mechanism
around the MCP boundary and mem_save: require the runtime to validate mem_save
payloads for PII/secrets before sending to Cerebro (define that the runtime
shall perform a validation step), state whether automated PII detection is
REQUIRED or optional (if optional, require developer-provided metadata/flags),
specify audit logging requirements for any attempted sensitive-data transmission
(include log content and retention expectations), and define error handling for
mem_save calls that contain detected sensitive data (e.g., reject the call with
a specific error code, emit an audit event, and keep the data local). Reference
the MCP boundary and mem_save in the text and add explicit normative language
(MUST/SHOULD/MUST NOT) for each enforcement point.
In `@README.md`:
- Line 82: Update the "Hybrid Memory Model" wording to reflect the current
runtime contract: replace the mention of "Neo4j" with the accurate backends
(local memory as "SQLite/Markdown" and long-term MCP-backed "Cerebro"), so the
line that begins with "**Hybrid Memory Model**" now reads something like
"Pluggable memory backends including SQLite/Markdown (local) and MCP-backed
Cerebro for long-term retrieval" to match runtime/backend migration docs and
avoid user confusion.
---
Outside diff comments:
In `@clients/agent-runtime/src/onboard/wizard.rs`:
- Around line 709-749: The quick-setup path currently sets
MemoryCerebroConfig::default() in memory_config_defaults_for_backend(backend:
&str) without collecting or disabling Cerebro credentials, causing runtime
failures; update memory_config_defaults_for_backend (or backend_key_from_choice
flow) to either populate a minimal valid Cerebro config from CLI flags/env
(e.g., accept --cerebro-endpoint and --cerebro-token) or explicitly mark cerebro
as disabled when no endpoint/token is provided (so MemoryCerebroConfig is not
treated as required), and ensure any code that checks for Cerebro MCP endpoint
treats the absence as “disabled” rather than an error; reference
MemoryCerebroConfig::default(), memory_config_defaults_for_backend, and
backend_key_from_choice to locate where to add the flag parsing or the
disablement logic.
In `@clients/agent-runtime/src/tools/memory_recall.rs`:
- Around line 28-55: Update the parameters_schema and execute implementations to
strictly validate and sanitize inputs: in parameters_schema (the method named
parameters_schema) add constraints for "query" (e.g., minLength: 1) and "limit"
(integer minimum 1, maximum a sane MAX_LIMIT like 100) so the schema expresses
bounds; in execute (the async fn execute) trim the incoming "query" string and
treat a missing or whitespace-only query as a structured failure by returning a
ToolResult::Failure (with a clear message) instead of propagating an
anyhow::Error; similarly parse and clamp "limit" to a default (5) and an upper
bound (MAX_LIMIT) and return a ToolResult::Failure for non-numeric or
out-of-range values, ensuring no panics or Err(...) escape and all invalid input
paths return ToolResult variants rather than errors.
In `@clients/agent-runtime/src/tools/mod.rs`:
- Around line 291-325: The memory tools are always registered from
root_config.memory.cerebro and the _memory parameter is ignored, which
advertises memory tools when no local backend exists; update
all_tools_with_runtime to: check whether a local memory implementation was
injected via the _memory Arc and, if present, construct
MemoryStoreTool/MemoryRecallTool/MemoryForgetTool from that local Arc<dyn
Memory>; otherwise only register those tools when the Cerebro backend is
configured (e.g. root_config.memory.cerebro.is_some()); ensure you reference the
constructors MemoryStoreTool::new, MemoryRecallTool::new, MemoryForgetTool::new
and do not drop or ignore the _memory parameter so the registry contracts remain
correct.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 3c90b0c0-5588-4d55-b6ba-683ced9ecc91
⛔ Files ignored due to path filters (2)
clients/agent-runtime/Cargo.lockis excluded by!**/*.lockmodules/cerebro/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (64)
README.mdclients/agent-runtime/Cargo.tomlclients/agent-runtime/README.mdclients/agent-runtime/docker-compose.ymlclients/agent-runtime/examples/custom_memory.rsclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/agent/memory_loader.rsclients/agent-runtime/src/bootstrap/mod.rsclients/agent-runtime/src/config/mod.rsclients/agent-runtime/src/config/schema.rsclients/agent-runtime/src/gateway/admin.rsclients/agent-runtime/src/gateway/mod.rsclients/agent-runtime/src/gateway/utils.rsclients/agent-runtime/src/main.rsclients/agent-runtime/src/memory/backend.rsclients/agent-runtime/src/memory/mod.rsclients/agent-runtime/src/memory/surreal.rsclients/agent-runtime/src/memory/traits.rsclients/agent-runtime/src/onboard/wizard.rsclients/agent-runtime/src/tools/mcp/cerebro.rsclients/agent-runtime/src/tools/mcp/client.rsclients/agent-runtime/src/tools/mcp/mod.rsclients/agent-runtime/src/tools/mcp/normalize.rsclients/agent-runtime/src/tools/memory_forget.rsclients/agent-runtime/src/tools/memory_recall.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/src/tools/mod.rsclients/agent-runtime/tests/mcp_config_validation.rsclients/agent-runtime/tests/memory_backend_selection.rsclients/agent-runtime/tests/memory_cerebro_aliases.rsclients/agent-runtime/tests/memory_cerebro_integration.rsclients/agent-runtime/tests/memory_comparison.rsclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_context.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_delete.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_get_observation.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_save.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_save_prompt.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_search.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_end.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_start.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_summary.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_stats.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_suggest_topic_key.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_timeline.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_update.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/migration.mdmodules/cerebro/Cargo.tomlmodules/cerebro/src/config.rsmodules/cerebro/src/errors.rsmodules/cerebro/src/lib.rsmodules/cerebro/src/main.rsmodules/cerebro/src/server.rsmodules/cerebro/src/storage/mod.rsmodules/cerebro/src/tools.rsmodules/cerebro/src/validation.rsmodules/cerebro/tests/mcp_auth_policy.rsmodules/cerebro/tests/mcp_tools_contract.rsopenspec/changes/archive/2026-03-16-cerebro/cerebro.mdopenspec/changes/archive/2026-03-16-cerebro/design.mdopenspec/changes/archive/2026-03-16-cerebro/proposal.mdopenspec/changes/archive/2026-03-16-cerebro/specs/cerebro/spec.mdopenspec/changes/archive/2026-03-16-cerebro/tasks.mdopenspec/changes/archive/2026-03-16-cerebro/verify-report.mdopenspec/specs/cerebro/spec.md
💤 Files with no reviewable changes (2)
- clients/agent-runtime/src/memory/surreal.rs
- clients/agent-runtime/tests/memory_comparison.rs
| - The agent runtime MUST treat the MCP boundary as long-term, shared memory only. | ||
| - `mem_save` SHOULD carry non-sensitive, durable observations appropriate for long-term storage. | ||
| - Secrets, credentials, PII, and ephemeral context MUST remain local-only in the runtime and MUST | ||
| NOT be sent to Cerebro via MCP. |
There was a problem hiding this comment.
Specify PII/secrets enforcement mechanism.
The requirement correctly states that secrets, credentials, and PII must remain local-only, but the spec doesn't define the enforcement mechanism. Add requirements for:
- Runtime validation before MCP transmission
- Whether automated PII detection is required or developer responsibility
- Audit logging for attempted sensitive data transmission
- Error handling if sensitive data is detected in
mem_savecalls
Without enforcement details, this security requirement is unverifiable and may be violated unintentionally.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@openspec/specs/cerebro/spec.md` around lines 59 - 62, Update the spec to
mandate a concrete enforcement mechanism around the MCP boundary and mem_save:
require the runtime to validate mem_save payloads for PII/secrets before sending
to Cerebro (define that the runtime shall perform a validation step), state
whether automated PII detection is REQUIRED or optional (if optional, require
developer-provided metadata/flags), specify audit logging requirements for any
attempted sensitive-data transmission (include log content and retention
expectations), and define error handling for mem_save calls that contain
detected sensitive data (e.g., reject the call with a specific error code, emit
an audit event, and keep the data local). Reference the MCP boundary and
mem_save in the text and add explicit normative language (MUST/SHOULD/MUST NOT)
for each enforcement point.
|
|
||
| - GIVEN a configuration that uses `http` or `ws` without explicit loopback opt-in | ||
| - WHEN the runtime validates the configuration | ||
| - THEN the configuration is rejected with a security error |
There was a problem hiding this comment.
Define the "loopback opt-in" mechanism explicitly.
Line 119 references "explicit loopback opt-in" without defining the mechanism. Specify:
- The exact configuration flag/parameter name
- Whether loopback detection is automatic (checking for 127.0.0.1/::1/localhost)
- Whether this applies to runtime config, Cerebro config, or both
- Example configuration showing the opt-in for local development
Vague security requirements are unverifiable and may be implemented inconsistently.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@openspec/specs/cerebro/spec.md` around lines 118 - 121, The spec mentions
"explicit loopback opt-in" but doesn't define it; update
openspec/specs/cerebro/spec.md to specify a single configuration flag named
loopbackOptIn (boolean) under both the runtime config and Cerebro config
schemas, state that loopback detection is automatic (checks hostnames/IPs
127.0.0.1, ::1, and localhost) and that the flag must be set true to allow
protocols http or ws to bind to loopback addresses, and add a short example
configuration snippet showing loopbackOptIn: true in the runtime/Cerebro config
for local development and note that absence or false rejects such bindings with
a security error.
| - **Secure Sandboxing**: Execute dangerous commands safely within isolated Docker containers or restricted native runtimes. | ||
| - **Standardized Identity (AIEOS)**: Support for AIEOS v1.1, allowing for portable and model-agnostic AI personas. | ||
| - **Hybrid Memory Model**: Pluggable memory backends including SQLite, Neo4j, and SurrealDB for high-context retrieval. | ||
| - **Hybrid Memory Model**: Pluggable memory backends including SQLite, Neo4j, and MCP-backed Cerebro for high-context retrieval. |
There was a problem hiding this comment.
Update memory backend wording to match current runtime contract.
Line 82 lists Neo4j, but the runtime/backend migration now documents local memory as SQLite/Markdown (plus Cerebro for long-term MCP). This line should be aligned to avoid user confusion.
As per coding guidelines **/*.{md,mdx}: Verify technical accuracy and that docs stay aligned with code changes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@README.md` at line 82, Update the "Hybrid Memory Model" wording to reflect
the current runtime contract: replace the mention of "Neo4j" with the accurate
backends (local memory as "SQLite/Markdown" and long-term MCP-backed "Cerebro"),
so the line that begins with "**Hybrid Memory Model**" now reads something like
"Pluggable memory backends including SQLite/Markdown (local) and MCP-backed
Cerebro for long-term retrieval" to match runtime/backend migration docs and
avoid user confusion.
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Actionable comments posted: 31
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
clients/agent-runtime/src/gateway/admin.rs (1)
1630-1664:⚠️ Potential issue | 🟡 MinorAdd explicit redaction assertion for
cerebro-tokenin admin config view test.The test sets
cfg.memory.cerebro.auth_tokenbut does not assert that this token is absent from serialized output. Add that regression check to lock secret-redaction guarantees.
As per coding guidelines: "Security first, performance second. Validate input boundaries, auth/authz implications, and secret management."Suggested test assertion
let text = serialized.to_string(); assert!(!text.contains("secret-key")); assert!(!text.contains("composio-key")); + assert!(!text.contains("cerebro-token")); assert!(!text.contains("webhook-secret"));🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/gateway/admin.rs` around lines 1630 - 1664, The test that exercises admin_config_view sets cfg.memory.cerebro.auth_token but never asserts it was redacted; add a regression assertion like the existing secret checks to ensure the serialized view does not contain "cerebro-token" (e.g., after let text = serialized.to_string(); add assert!(!text.contains("cerebro-token"));), locating the change near the other redaction assertions in the same test that reference serialized and text.
♻️ Duplicate comments (27)
clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_suggest_topic_key.json (1)
10-31:⚠️ Potential issue | 🟠 MajorEnforce non-empty strings for required contract fields.
requiredalone allows"", so boundary validation is still weak forinput.seed,output.topic_key, and candidate values.Proposed schema hardening
"seed": { "type": "string", + "minLength": 1, "description": "Seed text or phrase describing the topic." }, @@ "topic_key": { - "type": "string" + "type": "string", + "minLength": 1 }, "candidates": { "type": "array", - "items": { "type": "string" } + "items": { "type": "string", "minLength": 1 } }As per coding guidelines:
**/*: Security first, performance second. Validate input boundaries, auth/authz implications, and secret management.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_suggest_topic_key.json` around lines 10 - 31, The schema currently allows empty strings for input.seed, output.topic_key, and individual output.candidates; update the JSON Schema to enforce non-empty strings by adding "minLength": 1 (or a suitable non-empty pattern) to the definitions for "seed" (input), "topic_key" (output), and the "items" schema under "candidates"; also ensure "output" declares required: ["topic_key","candidates"] and consider adding "minItems": 1 for "candidates" to prevent an empty list, while keeping "additionalProperties": false for strictness.openspec/changes/archive/2026-03-16-cerebro/tasks.md (3)
70-83:⚠️ Potential issue | 🟡 MinorDocument EN/ES parity status for user-facing docs.
Line 70–83 marks documentation tasks complete but does not state whether Spanish translations are complete or intentionally deferred. Add a parity status note (or explicit pending gap).
As per coding guidelines: "
**/*.{md,mdx}: Verify technical accuracy and that docs stay aligned with code changes. For user-facing docs, check EN/ES parity or explicitly note pending translation gaps."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/changes/archive/2026-03-16-cerebro/tasks.md` around lines 70 - 83, Add an explicit EN/ES parity status note to the user-facing docs referenced in the checklist: include a short “Translation status” or “EN/ES parity” line in clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md, clients/agent-runtime/README.md, root README.md, and any new guide pages under clients/web/apps/docs/.../mcp-schema/, and mention the translation intent (e.g., “Spanish translation pending” or “Available in EN and ES”) so readers know whether Spanish versions exist or are intentionally deferred; ensure the same note appears in clients/agent-runtime/examples/custom_memory.rs README/linking section if it exposes user-facing prose.
84-89:⚠️ Potential issue | 🟠 MajorTrack rollback + threat/risk plan explicitly in Phase 5.
Line 84–89 still lacks a dedicated checklist item for rollback strategy and threat/risk notes (auth failure modes, insecure endpoint misconfig, outage fallback), plus related failure-mode verification tasks.
Based on learnings: "Include threat/risk notes and rollback strategy for security, runtime, and gateway changes; add or update tests for boundary checks and failure modes".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/changes/archive/2026-03-16-cerebro/tasks.md` around lines 84 - 89, Under "Phase 5: Verification Gaps" (the header containing items 5.1–5.4) add a new checklist item (e.g., 5.5) that explicitly documents a rollback strategy and threat/risk notes (auth failure modes, insecure endpoint misconfig, outage/fallback), and add sub-tasks to verify failure modes: (a) tests for auth failure/retry behavior, (b) boundary checks for insecure endpoint misconfiguration, and (c) outage/fallback verification; also update the verification report entry (referenced by the same Phase 5 section) to include coverage guidance and evidence placeholders for these rollback/threat tests.
60-66:⚠️ Potential issue | 🟡 MinorAdd explicit verification artifact path in checklist text.
Line 60–66 and Line 89 still reference “test log notes”/“verification report” without a concrete path. Please point directly to
./verify-report.mdin these items for reproducible audit trails.Also applies to: 89-89
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/changes/archive/2026-03-16-cerebro/tasks.md` around lines 60 - 66, Update the checklist entries that mention “test log notes”/“verification report” to point to the concrete artifact path ./verify-report.md; specifically modify the items beginning "3.3 Run `cargo fmt --all -- --check`", "3.4 Run `cargo clippy --all-targets -- -D warnings`", and "3.5 Run `cargo test` (or `make test`)" to replace the vague “test log notes”/“verification report” text with "see ./verify-report.md", and make the same change for the similar reference at the later single-line item around Line 89 so all verification references consistently point to ./verify-report.md.openspec/specs/cerebro/spec.md (4)
59-62:⚠️ Potential issue | 🟠 MajorDefine mandatory sensitive-data enforcement at the MCP boundary.
Line 59–62 states policy, but not enforceable behavior. Add normative requirements for: pre-send validation on
mem_save, reject behavior/error code, and audit event requirements when sensitive data is detected.As per coding guidelines: "Security first, performance second. Validate input boundaries, auth/authz implications, and secret management."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/specs/cerebro/spec.md` around lines 59 - 62, The spec currently states policy but lacks enforceable behavior at the MCP boundary; add concrete requirements so implementations must perform pre-send validation in the mem_save pathway: mandate that the MCP client or runtime validate mem_save payloads for sensitive data (secrets, credentials, PII) before sending, define a rejection behavior that returns a specific error code and structured error (e.g., SENSITIVE_DATA_DETECTED) when validation fails, and require emitting an audit event (including caller id, timestamp, detected field names/hash, and rejection reason) for every rejected mem_save; update the mem_save contract and MCP boundary docs to reference the validation step, the exact error token, and mandatory audit-event fields so runtimes can implement enforcement and logging consistently.
78-80:⚠️ Potential issue | 🟡 MinorMake legacy SurrealDB rejection error actionable and testable.
Line 80 should require error content that explicitly says SurrealDB backend is removed, recommends Cerebro, and points to migration docs.
As per coding guidelines: "Validate input boundaries ... and contract breaks across modules."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/specs/cerebro/spec.md` around lines 78 - 80, Update the GIVEN/WHEN/THEN acceptance scenario so the THEN assertion checks the runtime rejects the legacy SurrealDB config with a specific, actionable error message containing the phrase "SurrealDB backend is removed", a recommendation to use "Cerebro", and a pointer to the migration docs URL; modify the spec text for the scenario (the GIVEN/WHEN/THEN block referencing the SurrealDB backend) to assert those exact contents so tests can match the error string and validate the contract-break handling.
87-90:⚠️ Potential issue | 🟠 MajorAdd a direct migration-guide reference in the out-of-scope note.
Line 87–90 documents the break but does not provide the migration path. Add a concrete link to the SurrealDB→Cerebro migration guide.
As per coding guidelines: "Look for behavioral regressions ... and contract breaks across modules."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/specs/cerebro/spec.md` around lines 87 - 90, Update the "Out of scope: SurrealDB migration" note to include a concrete link to the SurrealDB→Cerebro migration guide; specifically edit the paragraph that mentions the runtime alias/bridge mapping `memory_store`/`memory_recall`/`memory_forget` to `mem_save`/`mem_search`/`mem_delete` over the Cerebro MCP service (the same sentence in the Out of scope block) and append a short migration-guide reference (e.g., "See SurrealDB→Cerebro migration guide: <URL>") so readers have a direct path for migrating existing SurrealDB memories.
108-121:⚠️ Potential issue | 🟠 MajorSpecify the exact loopback opt-in contract.
“Explicit loopback opt-in” is still ambiguous. Define the config key name, automatic loopback detection rules (
127.0.0.1,::1,localhost), and rejection behavior when unset.As per coding guidelines: "Security first ... Validate input boundaries, auth/authz implications, and secret management."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/specs/cerebro/spec.md` around lines 108 - 121, Define an explicit loopback opt-in contract and enforce it in the Cerebro MCP configuration validation (e.g., in validateCerebroConfig): add a boolean config key named "cerebro.loopback.allow" (default false); treat an endpoint as loopback if its host is 127.0.0.1, ::1, or localhost (case-insensitive) or resolves to those addresses; during validation reject any http/ws endpoint that is loopback unless cerebro.loopback.allow === true and fail-fast with a clear security error; document that secure schemes (https/wss) bypass this opt-in check and that the validator must log and return the rejection error when the key is unset.clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_delete.json (1)
32-34:⚠️ Potential issue | 🟡 MinorDisambiguate delete result in
output.status(soft vs hard).Line 32 only allows
"deleted"even when Line 14 supportshard_delete; consumers cannot reliably audit which deletion path occurred.Contract clarification option
"status": { "type": "string", - "enum": ["deleted"] + "enum": ["soft_deleted", "hard_deleted"] },As per coding guidelines, "Look for behavioral regressions, missing tests, and contract breaks across modules."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_delete.json` around lines 32 - 34, The schema's output.status enum only contains "deleted", which is ambiguous when the system supports both soft and hard deletes (see hard_delete usage); update the mem_delete.json contract so output.status explicitly distinguishes deletion kinds—e.g., replace/extend the enum to include "soft_deleted" and "hard_deleted" (or add a dedicated "deletion_type" enum field) and keep or map "deleted" only if backward compatibility is required; update any related consumers/tests that read output.status (and the code paths that emit status in the soft/hard delete handlers) to use the new explicit values.clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_search.json (1)
48-48:⚠️ Potential issue | 🟡 MinorEnforce ISO-8601/RFC3339 format for
timestamp.Line [48] uses a plain string; add
format: "date-time"so consumers can rely on a stable timestamp contract.Suggested patch
- "timestamp": { "type": "string" } + "timestamp": { "type": "string", "format": "date-time" }For JSON Schema draft 2020-12, is `format: "date-time"` the recommended constraint for RFC3339 timestamps on string fields?As per coding guidelines, "Look for behavioral regressions, missing tests, and contract breaks across modules."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_search.json` at line 48, The "timestamp" schema currently only specifies "type": "string"; update the "timestamp" property to enforce RFC3339/ISO-8601 by adding "format": "date-time" to its schema so consumers get a stable timestamp contract; locate the "timestamp" property in the JSON Schema (the object with key "timestamp") and augment it to include format: "date-time" consistent with JSON Schema draft 2020-12 usage.README.md (1)
82-82:⚠️ Potential issue | 🟡 MinorRemove Neo4j from the Hybrid Memory Model bullet to match runtime support.
Line 82 is stale and can mislead users about supported backends.
Proposed README fix
-- **Hybrid Memory Model**: Pluggable memory backends including SQLite, Neo4j, and MCP-backed Cerebro for high-context retrieval. +- **Hybrid Memory Model**: Pluggable memory backends including SQLite/Lucid/Markdown (local) and MCP-backed Cerebro for long-term retrieval.As per coding guidelines
**/*.{md,mdx}: Verify technical accuracy and that docs stay aligned with code changes.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@README.md` at line 82, Update the "Hybrid Memory Model" bullet to remove Neo4j so it matches runtime support: edit the bullet text that starts with "**Hybrid Memory Model**" to list only supported backends (e.g., SQLite and MCP-backed Cerebro) and scan the README for other occurrences of "Neo4j" or the "Hybrid Memory Model" phrase to remove or adjust stale references to Neo4j.clients/agent-runtime/examples/custom_memory.rs (1)
6-6:⚠️ Potential issue | 🟡 MinorUse a user-resolvable migration link in the example docs.
Line 6 points to a repo-internal path, which is hard to open when this example is shared outside the repo tree.
Proposed doc fix
-//! See: clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md +//! See: https://github.com/dallay/corvus/blob/main/clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/examples/custom_memory.rs` at line 6, The top-of-file doc comment in clients/agent-runtime/examples/custom_memory.rs currently points to an internal repo path ("//! See: clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md"); update that line to use a user-resolvable URL (e.g., the public GitHub or website URL for the migration guide) so the example works outside the repo, replacing the internal path with the full external link string.clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_context.json (1)
10-43:⚠️ Potential issue | 🟠 MajorTighten identifier and timestamp constraints in
mem_contextschema.Line 10, Line 38, and Line 39 currently allow empty strings, and Line 42 accepts any string as timestamp.
Proposed schema hardening
"session_id": { "type": "string", + "minLength": 1, "description": "Session identifier requesting context." }, @@ "session_id": { - "type": "string" + "type": "string", + "minLength": 1 }, @@ - "memory_id": { "type": "string" }, - "summary": { "type": "string" }, + "memory_id": { "type": "string", "minLength": 1 }, + "summary": { "type": "string", "minLength": 1 }, "topic_key": { "type": "string" }, "scope": { "type": "string" }, - "timestamp": { "type": "string" } + "timestamp": { "type": "string", "format": "date-time" }As per coding guidelines
**/*: Security first, performance second. Validate input boundaries, auth/authz implications, and secret management.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_context.json` around lines 10 - 43, The schema in mem_context.json currently allows empty identifier strings and an unconstrained timestamp; update the input/output validation: add "minLength": 1 to the top-level "session_id" property and to the item identifier properties ("memory_id" and "topic_key" and optionally "scope") so empty strings are rejected, and constrain "timestamp" on the output items to a proper datetime (e.g., add "format": "date-time" or an RFC3339 pattern) to prevent arbitrary strings; ensure these changes are applied to the properties blocks for "session_id", the items' "properties" (memory_id, topic_key, scope) and the items' "timestamp" field in mem_context.json.clients/agent-runtime/tests/mcp_config_validation.rs (1)
87-97:⚠️ Potential issue | 🟡 MinorAdd the missing insecure
ws://regression test.Validation treats
ws://the same ashttp://, but this suite still only covers the HTTP branch. Aws://cerebro.example.com/mcpcase would keep the insecure-WebSocket path from regressing unnoticed.Based on learnings "Include threat/risk notes and rollback strategy for security, runtime, and gateway changes; add or update tests for boundary checks and failure modes".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/tests/mcp_config_validation.rs` around lines 87 - 97, Add a new test mirroring rejects_insecure_cerebro_endpoint_without_loopback_opt_in that uses a ws:// URL to ensure websocket insecure endpoints are validated the same as http; specifically, create a test (e.g., rejects_insecure_cerebro_ws_endpoint_without_loopback_opt_in) that sets config.memory.cerebro.endpoint = Some("ws://cerebro.example.com/mcp".into()), keeps the same auth_token and request_timeout_ms, calls config.validate_for_runtime().unwrap_err().to_string(), and asserts the error contains "memory.cerebro.endpoint" and "allow_insecure_loopback" to prevent regression in the websocket validation branch.clients/agent-runtime/src/tools/mcp/client.rs (2)
220-225:⚠️ Potential issue | 🟠 MajorBound the HTTP body before calling
response.json().
response.json().awaitreads and deserializes the whole payload first, so an oversized Cerebro response can bypassoutput_limit_bytesand spike memory. Read bounded bytes, fail once the cap is exceeded, then deserialize.As per coding guidelines "Security first, performance second. Validate input boundaries, auth/authz implications, and secret management."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/mcp/client.rs` around lines 220 - 225, The code currently calls response.json().await which can allocate arbitrarily large payloads; instead, read the response body with a bounded reader and enforce output_limit_bytes before deserializing: after req.send().await and status = response.status(), stream the body (e.g., using response.bytes_stream() or response.chunked reading) and accumulate into a Vec<u8> while checking against output_limit_bytes on each chunk, return an error if exceeded, and only then call serde_json::from_slice to produce the payload variable; update the logic around the response.json() call (and any uses of payload) to use the safely bounded deserialization approach.
240-249:⚠️ Potential issue | 🟠 MajorKeep JSON-RPC errors structured and return raw string outputs unquoted.
anyhow::bail!("{error}")collapses the MCP error envelope into plain text, andoutput.to_string()JSON-encodes string results, so"ok"becomes"\"ok\"". Preserve the error payload and useoutput.as_str()before falling back to JSON serialization.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/mcp/client.rs` around lines 240 - 249, The MCP error handling currently collapses the JSON-RPC error envelope and the output handling JSON-encodes strings; change the block that checks payload.get("error") to preserve and include the raw error JSON (e.g., serialize the error Value rather than interpolating a Display) when bailing, and change the final return to prefer output.as_str().map(String::from) so that JSON string values return unquoted raw strings, falling back to serde_json::to_string(output) only if output.as_str() is None; refer to the existing payload.get("error") check and the output extraction that looks up .get("result").and_then(|v| v.get("output")) so you update those spots.clients/agent-runtime/src/config/schema.rs (1)
2845-2860:⚠️ Potential issue | 🟠 MajorReject unknown
memory.backendvalues here.This match only blocks the removed Surreal variants. Values like
cerebroor any other typo still pass runtime validation even though the runtime only recognizessqlite,lucid,markdown, andnone.🛠️ Suggested fix
fn validate_memory_config(&self) -> Result<()> { match self.memory.backend.as_str() { + "sqlite" | "lucid" | "markdown" | "none" => {} "surreal" | "surreal-graphs" => { anyhow::bail!( "memory.backend '{}' is no longer supported; configure memory.cerebro instead", self.memory.backend ); } - _ => {} + other => { + anyhow::bail!( + "unsupported memory.backend '{}'; supported values are: sqlite, lucid, markdown, none", + other + ); + } }As per coding guidelines "Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/config/schema.rs` around lines 2845 - 2860, The validate_memory_config function currently only rejects "surreal" and "surreal-graphs" but lets any other unknown memory.backend values through; update validate_memory_config to enforce a whitelist of accepted backends ("sqlite", "lucid", "markdown", "none") and return an error for any other value (except still applying the cerebro.endpoint validation when memory.cerebro is set). Specifically, in validate_memory_config replace the current match on self.memory.backend with logic that checks whether self.memory.backend is one of the allowed strings and otherwise calls anyhow::bail! with a clear message that the backend is unsupported; keep the existing special-case behavior for cerebro.endpoint (validate_cerebro_endpoint) unchanged.clients/agent-runtime/src/tools/mcp/cerebro.rs (1)
24-42:⚠️ Potential issue | 🟠 MajorTrim and revalidate the endpoint/token before building the synthetic MCP server.
validate_cerebro_endpoint()trims before parsing, but this helper forwards the rawendpoint, so" https://cerebro.example.com/mcp "can pass validation and still fail later whenreqwestbuilds the request. The helper also still treats blank/missingauth_tokenas “send no Authorization header,” which silently weakens this path if a caller reaches it withoutvalidate_for_runtime().🛠️ Suggested hardening
fn build_cerebro_server(cerebro: &MemoryCerebroConfig) -> anyhow::Result<McpServerConfig> { let endpoint = cerebro .endpoint .as_deref() - .ok_or_else(|| anyhow::anyhow!("memory.cerebro.endpoint must be configured"))?; + .map(str::trim) + .filter(|value| !value.is_empty()) + .ok_or_else(|| anyhow::anyhow!("memory.cerebro.endpoint must be configured"))?; + let token = cerebro + .auth_token + .as_deref() + .map(str::trim) + .filter(|value| !value.is_empty()) + .ok_or_else(|| anyhow::anyhow!("memory.cerebro.auth_token must be configured"))?; let mut env = BTreeMap::new(); - if let Some(token) = cerebro.auth_token.as_deref() { - env.insert("MCP_AUTH_TOKEN".to_string(), token.to_string()); - } + env.insert("MCP_AUTH_TOKEN".to_string(), token.to_string());As per coding guidelines "Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/mcp/cerebro.rs` around lines 24 - 42, The build_cerebro_server helper currently forwards raw values and may accept endpoint or auth_token padded with whitespace; update build_cerebro_server to trim whitespace from cerebro.endpoint and cerebro.auth_token before using them and re-validate the trimmed endpoint (reuse or call validate_cerebro_endpoint on the trimmed & deref'd string) so " https://..." cannot slip through; also treat an empty or all-whitespace auth_token as invalid (return an error from build_cerebro_server rather than omitting the header) so that MemoryCerebroConfig.auth_token must be present and non-blank before inserting MCP_AUTH_TOKEN into env.clients/agent-runtime/src/tools/memory_store.rs (3)
318-323:⚠️ Potential issue | 🟡 MinorAvoid contiguous API-key signature literals in test fixtures.
This test string still matches secret scanners and will continue causing noisy/failing leak checks.
Patch suggestion
- let result = tool - .execute(json!({"key": "token", "content": "api_key: sk-1234567890abcdef1234567890"})) + let fake_key = format!("{}{}", "sk-", "1234567890abcdef1234567890"); + let result = tool + .execute(json!({"key": "token", "content": format!("api_key: {fake_key}")})) .await .unwrap();🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/memory_store.rs` around lines 318 - 323, The test store_blocks_api_key_pattern uses a literal that resembles a real API key; update the fixture in the test function store_blocks_api_key_pattern to avoid contiguous API-key signature patterns by altering or redacting the token string passed to MemoryStoreTool::execute (e.g., replace the middle of the key with placeholders, split the prefix and rest, or use a clearly synthetic value like "sk-REDACTED" or "api_key: sk-XXXX-REDACTED") while keeping the rest of the test logic intact; references: MemoryStoreTool, MemoryCerebroConfig::default(), test_security(), and execute.
168-181:⚠️ Potential issue | 🟠 MajorAdd explicit egress-policy enforcement before remote Cerebro execution.
This path performs outbound network execution without an explicit endpoint/domain policy gate in the tool path.
As per coding guidelines: "Treat
src/security/,src/gateway/,src/tools/as high-risk surfaces and never broaden filesystem/network execution scope without explicit policy checks."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/memory_store.rs` around lines 168 - 181, This code calls cerebro::cerebro_tool_adapter(...)->execute(...) which performs outbound network execution without an explicit egress-policy check; add an explicit egress policy enforcement step before calling adapter.execute: retrieve the target endpoint/domain from the adapter or normalize::CEREBRO_TOOL_STORE configuration, call the project egress policy guard (e.g., security::egress::enforce or policy::egress::check_allowed(domain)) and abort/return an error if the policy rejects the domain, only then proceed to adapter.execute(payload). Ensure the check is added in the same function that constructs the adapter/response to protect the tool path.
13-68:⚠️ Potential issue | 🟠 MajorSensitive-data detector still has fail-open gaps.
The current checks remain too narrow and can miss common secret/PII patterns before forwarding content to Cerebro.
As per coding guidelines: "Implement
Tooltrait insrc/tools/with strict parameter schema, validate and sanitize all inputs, and return structuredToolResultwithout panics in runtime path."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/memory_store.rs` around lines 13 - 68, The sensitive-data detector in contains_sensitive_data (and its helpers contains_sensitive_labels and looks_like_api_key) is too narrow; extend it to catch more common secret/PII patterns by: (1) broadening LABELS to include more variants and separators (e.g., "secret", "client_secret", "bearer", "auth_token", with ":" "=" and whitespace variants), (2) adding additional matcher functions for common patterns — e.g., regex-based checks for emails, UUIDs, long base64/hex strings, API-key prefixes like "sk-", "pk-", "ak-", and continuous alphanumeric/hex strings above a threshold (>=20), (3) adding checks for SSN and credit-card formats (with Luhn validation for cards), and (4) normalizing input safely (preserve non-whitespace separators when needed) and ensuring no panics or unwraps occur so the Tool implementation can validate/sanitize inputs and return structured ToolResult on detection or safe pass-through; update contains_sensitive_data to call the new matchers and short-circuit on first positive match.modules/cerebro/tests/mcp_tools_contract.rs (1)
62-70: 🧹 Nitpick | 🔵 TrivialExtract
memory_idfrom save response instead of hardcoding.The test hardcodes
"topic"as thememory_id. If the mem_save response includes amemory_idfield, extract and use that value to make the test resilient to implementation changes.Suggested approach
let save = call_tool( &service, "mem_save", json!({...}), ) .await; assert!(save.error.is_none()); + let memory_id = save + .result + .as_ref() + .and_then(|r| r.get("output")) + .and_then(|o| o.get("memory_id")) + .and_then(|v| v.as_str()) + .unwrap_or("topic"); // fallback if not returned let delete = call_tool( &service, "mem_delete", json!({ - "input": { "memory_id": "topic" } + "input": { "memory_id": memory_id } }), ) .await;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/cerebro/tests/mcp_tools_contract.rs` around lines 62 - 70, The test currently hardcodes "topic" when calling mem_delete; instead, extract the memory_id returned by the mem_save call and pass that into call_tool for "mem_delete". After you call_tool for mem_save (the variable named save or similar), read the memory_id from the saved response (e.g., from save.output or save.result JSON field "memory_id"), assert it exists, and use that extracted memory_id in the mem_delete call so the test does not rely on the hardcoded "topic".modules/cerebro/src/tools.rs (2)
227-238:⚠️ Potential issue | 🟠 MajorClamp
mem_search.limitbefore calling storage.
.max(1)blocks zero but still allows arbitrarily large pages. A caller can force the service to scan and serialize far more data than intended. Put a sane upper bound onlimitbeforesearch(...).Suggested fix
- let limit = input.input.limit.unwrap_or(5).max(1); + const MAX_MEM_SEARCH_LIMIT: usize = 100; + let limit = input.input.limit.unwrap_or(5).clamp(1, MAX_MEM_SEARCH_LIMIT);As per coding guidelines: "Validate input boundaries, auth/authz implications, and secret management."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/cerebro/src/tools.rs` around lines 227 - 238, The code sets limit = input.input.limit.unwrap_or(5).max(1) which blocks zero but allows unbounded large values; clamp this before calling self.storage.search by enforcing an upper bound (e.g., const MAX_SEARCH_LIMIT) and computing limit = input.input.limit.unwrap_or(DEFAULT).clamp(1, MAX_SEARCH_LIMIT) (or equivalent min logic) so the value passed to self.storage.search(...) is within a sane range; update references to input.input.limit and the local limit variable and add a named constant (e.g., MAX_SEARCH_LIMIT) near the function for easy tuning.
168-182:⚠️ Potential issue | 🟠 MajorReturn explicit unsupported errors for stubbed MCP tools.
mem_timeline,mem_save_prompt,mem_session_start,mem_session_end,mem_session_summary, andmem_contextall report success without touching storage. Once a client starts relying on them, this becomes silent data loss. Keep them behind an explicit unsupported error until the storage-backed implementation exists.As per coding guidelines: "Look for behavioral regressions, missing tests, and contract breaks across modules."
Also applies to: 367-433
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/cerebro/src/tools.rs` around lines 168 - 182, The listed handlers (mem_timeline, mem_save_prompt, mem_session_start, mem_session_end, mem_session_summary, mem_context) are currently stubs that return success without persisting data; change the match arms in handle to return an explicit unsupported/NotImplemented CerebroError for those tool names (e.g., Err(CerebroError::UnsupportedTool("mem_timeline"))) until a storage-backed implementation exists, and apply the same change to the duplicate handlers in the 367-433 region; update any helper methods (mem_timeline, mem_save_prompt, mem_session_start, mem_session_end, mem_session_summary, mem_context) signatures or bodies to consistently return the unsupported error so callers can detect and handle the unsupported tool case.modules/cerebro/src/config.rs (1)
19-31:⚠️ Potential issue | 🟠 MajorProtect
auth_tokenfrom debug/serialization leaks.
CerebroConfigisDebugandSerialize, whileauth_tokenis a plainString. Any config dump, panic context, or admin serialization will expose the bearer token verbatim. Please move this field behind a redacted secret wrapper or exclude it from debug/serialization paths.As per coding guidelines: "Validate input boundaries, auth/authz implications, and secret management."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/cerebro/src/config.rs` around lines 19 - 31, CerebroConfig currently exposes auth_token in Debug/Serialize; change the auth_token field to a secret wrapper and prevent it from being printed/serialized: replace auth_token: Option<String> with Option<secrecy::SecretString> (or your project’s secret type), add serde attributes to skip serialization (e.g. #[serde(skip_serializing)] or #[serde(skip_serializing_if = "Option::is_none")]) so the token is not emitted, and implement a custom Debug for CerebroConfig that redacts or omits auth_token (do not rely on the derived Debug), keeping deserialization/loading behavior intact (if you need to deserialize from config, add an appropriate deserialize helper or use secrecy’s serde feature). Ensure references to auth_token throughout the code use the new SecretString API (e.g., .expose_secret()) where absolutely necessary.clients/agent-runtime/src/memory/backend.rs (1)
81-88:⚠️ Potential issue | 🟠 MajorFail fast on removed
surreal*backend keys.After this change,
surrealandsurreal-graphscollapse intoUnknown, and the downstream factories still turnUnknowninto markdown. Existing configs will silently switch persistence backends on upgrade instead of surfacing a migration error. Preserve explicit legacy handling or return an error before the markdown fallback path.As per coding guidelines: "Look for behavioral regressions, missing tests, and contract breaks across modules."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/memory/backend.rs` around lines 81 - 88, The change removed explicit handling for legacy surreal keys causing them to collapse to Unknown and silently fall back to Markdown; update classify_memory_backend to explicitly match "surreal" and "surreal-graphs" and map them to distinct MemoryBackendKind variants (e.g. MemoryBackendKind::Surreal and MemoryBackendKind::SurrealGraphs or a single MemoryBackendKind::LegacySurreal), and add those variant(s) to the MemoryBackendKind enum so downstream factories can detect legacy surreal keys and return a migration/error instead of mapping Unknown to Markdown; ensure downstream factory logic that currently turns Unknown into Markdown checks these new Surreal/LegacySurreal variants and returns an explicit error or migration path.modules/cerebro/src/storage/mod.rs (1)
69-79:⚠️ Potential issue | 🔴 CriticalIn-memory-only backend still breaks long-term memory durability.
InMemoryStorageis process-local only, so data is lost on restart. For a SurrealDB → Cerebro long-term-memory migration, this is still a release blocker unless a durable backend is the default or explicitly wired for production paths.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/cerebro/src/storage/mod.rs` around lines 69 - 79, InMemoryStorage (struct InMemoryStorage and its new() constructor) is process-local and will drop all records (records: RwLock<HashMap<...>>) on restart; replace or gate its use in production by wiring a durable backend as the default: add or use a DurableStorage implementation (e.g., SurrealDbStorage::new or PersistedStorage) and update the factory/initialization path that currently calls InMemoryStorage::new() to select the durable backend based on config/env (or provide Storage::from_config factory). Alternatively, mark InMemoryStorage::new() as #[cfg(test)] and ensure production code uses the durable backend constructor so long-term-memory migration from SurrealDB is preserved.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f60c6c4b-0fb7-45a7-bd91-3be309cf5bfc
⛔ Files ignored due to path filters (2)
clients/agent-runtime/Cargo.lockis excluded by!**/*.lockmodules/cerebro/Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (66)
.github/workflows/lychee-links.yml.github/workflows/lychee-web-dist.ymlREADME.mdclients/agent-runtime/Cargo.tomlclients/agent-runtime/README.mdclients/agent-runtime/docker-compose.ymlclients/agent-runtime/examples/custom_memory.rsclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/agent/memory_loader.rsclients/agent-runtime/src/bootstrap/mod.rsclients/agent-runtime/src/config/mod.rsclients/agent-runtime/src/config/schema.rsclients/agent-runtime/src/gateway/admin.rsclients/agent-runtime/src/gateway/mod.rsclients/agent-runtime/src/gateway/utils.rsclients/agent-runtime/src/main.rsclients/agent-runtime/src/memory/backend.rsclients/agent-runtime/src/memory/mod.rsclients/agent-runtime/src/memory/surreal.rsclients/agent-runtime/src/memory/traits.rsclients/agent-runtime/src/onboard/wizard.rsclients/agent-runtime/src/tools/mcp/cerebro.rsclients/agent-runtime/src/tools/mcp/client.rsclients/agent-runtime/src/tools/mcp/mod.rsclients/agent-runtime/src/tools/mcp/normalize.rsclients/agent-runtime/src/tools/memory_forget.rsclients/agent-runtime/src/tools/memory_recall.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/src/tools/mod.rsclients/agent-runtime/tests/mcp_config_validation.rsclients/agent-runtime/tests/memory_backend_selection.rsclients/agent-runtime/tests/memory_cerebro_aliases.rsclients/agent-runtime/tests/memory_cerebro_integration.rsclients/agent-runtime/tests/memory_comparison.rsclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_context.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_delete.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_get_observation.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_save.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_save_prompt.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_search.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_end.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_start.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_summary.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_stats.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_suggest_topic_key.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_timeline.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_update.jsonclients/web/apps/docs/src/content/docs/guides/cerebro/migration.mdmodules/cerebro/Cargo.tomlmodules/cerebro/src/config.rsmodules/cerebro/src/errors.rsmodules/cerebro/src/lib.rsmodules/cerebro/src/main.rsmodules/cerebro/src/server.rsmodules/cerebro/src/storage/mod.rsmodules/cerebro/src/tools.rsmodules/cerebro/src/validation.rsmodules/cerebro/tests/mcp_auth_policy.rsmodules/cerebro/tests/mcp_tools_contract.rsopenspec/changes/archive/2026-03-16-cerebro/cerebro.mdopenspec/changes/archive/2026-03-16-cerebro/design.mdopenspec/changes/archive/2026-03-16-cerebro/proposal.mdopenspec/changes/archive/2026-03-16-cerebro/specs/cerebro/spec.mdopenspec/changes/archive/2026-03-16-cerebro/tasks.mdopenspec/changes/archive/2026-03-16-cerebro/verify-report.mdopenspec/specs/cerebro/spec.md
💤 Files with no reviewable changes (3)
- clients/agent-runtime/tests/memory_comparison.rs
- .github/workflows/lychee-web-dist.yml
- clients/agent-runtime/src/memory/surreal.rs
| contents: write | ||
| pull-requests: write |
There was a problem hiding this comment.
Reduce write-token exposure across the whole job.
Line 19 and Line 20 grant write permissions to every step, including third-party actions. Split this into two jobs (read-only link check + write-scoped PR creation) to enforce least privilege.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/lychee-links.yml around lines 19 - 20, The workflow
currently grants repository-wide write tokens via the top-level permissions
"contents: write" and "pull-requests: write"; split this into two jobs: a
read-only "link-check" job that sets permissions to minimal read-only (e.g.,
"contents: read") and runs the link scanning steps, and a separate "create-pr"
job that runs only when changes are needed (use needs/from-link-check output or
conditional) and is scoped to write permissions ("contents: write" and
"pull-requests: write") at the job level; update job dependencies (needs) so the
PR job runs after the link-check job and only when required to follow least
privilege.
| - name: 💾 Create PR with report | ||
| if: steps.lychee.outputs.exit_code != 0 | ||
| uses: peter-evans/create-pull-request@98357b18bf14b5342f975ff684046ec3b2a07725 # v8.0.0 | ||
| with: | ||
| title: "[Automated] Link Checker Report" | ||
| content-filepath: ./lychee/out.md | ||
| labels: "automated-issue, broken-links" | ||
| repository: ${{ github.repository }} | ||
| env: | ||
| GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
| token: ${{ secrets.GITHUB_TOKEN }} | ||
| branch: "auto-pr/lychee-broken-links" | ||
| commit-message: "chore: update lychee link report [skip ci]" | ||
| title: "chore: update lychee link report" | ||
| labels: | | ||
| pr|chore | ||
| broken-links | ||
| body: | | ||
| ## Summary | ||
|
|
||
| - name: 📝 Update existing issue | ||
| if: steps.lychee.outputs.exit_code != 0 && steps.find-issue.outputs.issue_number != '' && github.ref == 'refs/heads/main' | ||
| uses: peter-evans/create-issue-from-file@e8ef132d6df98ed982188e460ebb3b5d4ef3a9cd # v5 | ||
| with: | ||
| title: "[Automated] Link Checker Report" | ||
| content-filepath: ./lychee/out.md | ||
| labels: "automated-issue, broken-links" | ||
| issue-number: ${{ steps.find-issue.outputs.issue_number }} | ||
| repository: ${{ github.repository }} | ||
| env: | ||
| GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | ||
| Automated lychee report for broken links. | ||
|
|
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "== create-pull-request step inputs =="
rg -n -A25 -B3 'create-pull-request@' .github/workflows/lychee-links.yml
echo
echo "== ignore rules for generated lychee artifacts =="
fd -HI '^\.gitignore$' -x sh -c '
echo "--- $1"
rg -n "(\.lycheecache|lychee/out\.md|\.github/lychee/report\.md)" "$1" || true
' sh {}Repository: dallay/corvus
Length of output: 1104
🏁 Script executed:
# Verify lychee report location and what files lychee creates in the workflow
rg -n "lychee" .github/workflows/lychee-links.yml | head -20Repository: dallay/corvus
Length of output: 973
Scope PR commits to the report file and set an explicit base branch.
The create-pull-request step lacks add-paths constraint, which means all modified files in the workflow—including ./lychee/out.md and .lycheecache—will be committed to the auto-PR instead of only .github/lychee/report.md. Additionally, set an explicit base branch to prevent dispatching to an unintended target.
Suggested patch
- name: 💾 Create PR with report
if: steps.lychee.outputs.exit_code != 0
uses: peter-evans/create-pull-request@98357b18bf14b5342f975ff684046ec3b2a07725 # v8.0.0
with:
token: ${{ secrets.GITHUB_TOKEN }}
branch: "auto-pr/lychee-broken-links"
+ base: "main"
+ add-paths: |
+ .github/lychee/report.md
commit-message: "chore: update lychee link report [skip ci]"
title: "chore: update lychee link report"
labels: |
pr|chore
broken-links🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.github/workflows/lychee-links.yml around lines 48 - 63, Update the "💾
Create PR with report" create-pull-request step to restrict committed files and
target branch: add the add-paths input to only include the report file (e.g.,
".github/lychee/report.md") so other artifacts like "./lychee/out.md" or
".lycheecache" are not included, and set the base input to your intended target
branch (e.g., "main") so the PR is opened against the correct branch; modify the
step that uses
peter-evans/create-pull-request@98357b18bf14b5342f975ff684046ec3b2a07725
accordingly.
| # Optional: Cerebro memory service | ||
| # - CORVUS_CEREBRO_ENDPOINT=https://cerebro.example.com/mcp | ||
| # - CORVUS_CEREBRO_AUTH_TOKEN= | ||
| # - CORVUS_CEREBRO_TIMEOUT_MS=30000 | ||
| # - CORVUS_CEREBRO_ALLOW_INSECURE_LOOPBACK=false |
There was a problem hiding this comment.
Avoid inline auth-token examples; prefer env substitution (or secrets).
CORVUS_CEREBRO_AUTH_TOKEN= in the Compose example nudges users toward storing secrets directly in versioned files (or unintentionally leaving empty auth when uncommenting). Use the existing ${VAR:-} pattern here as well.
🔐 Suggested patch
# Optional: Cerebro memory service
- # - CORVUS_CEREBRO_ENDPOINT=https://cerebro.example.com/mcp
- # - CORVUS_CEREBRO_AUTH_TOKEN=
+ # - CORVUS_CEREBRO_ENDPOINT=${CORVUS_CEREBRO_ENDPOINT:-https://cerebro.example.com/mcp}
+ # - CORVUS_CEREBRO_AUTH_TOKEN=${CORVUS_CEREBRO_AUTH_TOKEN:-}
# - CORVUS_CEREBRO_TIMEOUT_MS=30000
# - CORVUS_CEREBRO_ALLOW_INSECURE_LOOPBACK=falseAs per coding guidelines, "Security first, performance second. Validate input boundaries, auth/authz implications, and secret management."
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| # Optional: Cerebro memory service | |
| # - CORVUS_CEREBRO_ENDPOINT=https://cerebro.example.com/mcp | |
| # - CORVUS_CEREBRO_AUTH_TOKEN= | |
| # - CORVUS_CEREBRO_TIMEOUT_MS=30000 | |
| # - CORVUS_CEREBRO_ALLOW_INSECURE_LOOPBACK=false | |
| # Optional: Cerebro memory service | |
| # - CORVUS_CEREBRO_ENDPOINT=${CORVUS_CEREBRO_ENDPOINT:-https://cerebro.example.com/mcp} | |
| # - CORVUS_CEREBRO_AUTH_TOKEN=${CORVUS_CEREBRO_AUTH_TOKEN:-} | |
| # - CORVUS_CEREBRO_TIMEOUT_MS=30000 | |
| # - CORVUS_CEREBRO_ALLOW_INSECURE_LOOPBACK=false |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clients/agent-runtime/docker-compose.yml` around lines 32 - 36, Replace the
inline/empty auth-token example for CORVUS_CEREBRO_AUTH_TOKEN in the
docker-compose example with an environment substitution or secret reference to
avoid encouraging hardcoded secrets; specifically update the commented line
containing CORVUS_CEREBRO_AUTH_TOKEN to reference an env var fallback pattern
(e.g., use the existing ${VAR:-} style) or a docker secret, so users are
prompted to supply the token via environment or secrets rather than editing the
compose file directly.
| # Cerebro long-term memory runs as a separate MCP service | ||
| # See docs: ../../clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md |
There was a problem hiding this comment.
Fix the repo-relative docs path in this shell snippet.
After cd corvus, ../../clients/web/... points outside the repository. The in-repo path from the project root is clients/web/..., so the current snippet sends readers to a non-existent location.
As per coding guidelines "Verify technical accuracy and that docs stay aligned with code changes."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clients/agent-runtime/README.md` around lines 98 - 99, The docs snippet uses
a repo-relative path
"../../clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md" that
points outside the repo after the preceding cd corvus; update that snippet to
the correct in-repo path
"clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md" so the link
resolves; locate the path string in the README snippet (the line containing
"../../clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md") and
replace it with the corrected path, preserving surrounding comments and
formatting.
| | **Memory** | `Memory` | SQLite (hybrid search), Markdown, MCP-backed Cerebro | Any persistence backend | | ||
| | **Tools** | `Tool` | shell, file_read, file_write, memory_store, memory_recall, memory_forget, browser_open (Brave + allowlist), composio (optional) | Any capability | |
There was a problem hiding this comment.
List lucid anywhere you enumerate supported memory backends.
The runtime still accepts lucid, but these user-facing backend lists now only mention sqlite, markdown, and none. That makes a valid backend look unsupported.
As per coding guidelines "Verify technical accuracy and that docs stay aligned with code changes."
Also applies to: 201-202, 311-312
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clients/agent-runtime/README.md` around lines 164 - 165, Update the README
memory backend enumeration to include "lucid" wherever the Memory backends are
listed (the table rows currently showing "SQLite (hybrid search), Markdown,
MCP-backed Cerebro" and the other occurrences called out in the review);
specifically edit the Memory cell text used in the table rows (and the repeated
lists at the other two locations) to add "lucid" to the comma-separated backend
list so docs match the runtime's supported backends.
…-mcp-implementation-for-cerebro # Conflicts: # clients/agent-runtime/Cargo.lock
…-mcp-implementation-for-cerebro
…entation-for-cerebro
…entation-for-cerebro
There was a problem hiding this comment.
Actionable comments posted: 25
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
clients/agent-runtime/src/tools/memory_forget.rs (1)
182-241: 🧹 Nitpick | 🔵 TrivialAdd tests for new Cerebro/local execution paths and response parsing.
Current tests do not cover: successful local forget flow, endpoint-missing error, recall output parsing failure, and successful Cerebro forget flow. This leaves the migrated path under-tested.
As per coding guidelines: "
**/*: Security first, performance second... Look for behavioral regressions, missing tests, and contract breaks across modules."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/memory_forget.rs` around lines 182 - 241, Add unit tests to cover the Cerebro/local execution and response-parsing branches missing from the tests for MemoryForgetTool: implement tests for a successful local forget flow (e.g., simulate successful local memory deletion and assert result.success and no error), an endpoint-missing error when Cerebro config lacks an endpoint (assert error contains “missing endpoint” or similar), a recall output parsing failure (simulate Cerebro/local response with malformed recall output and assert parsing error is returned), and a successful Cerebro forget flow (mock Cerebro response with a valid forget confirmation and assert success). Place these tests in the existing tests module (same scope as name_and_schema and forget_* tests), reuse MemoryForgetTool::new and MemoryCerebroConfig to construct instances, and assert on result.success and result.error messages to validate each branch.clients/agent-runtime/src/tools/memory_store.rs (1)
142-188:⚠️ Potential issue | 🟠 MajorValidate and screen
key, not justcontent.An empty or sensitive
keystill passes here because the schema has no non-empty constraint and the runtime only checkscontent. That value is then sent astopic_keyand echoed back in the success message.Suggested fix
let key = args - .get("key") - .and_then(|v| v.as_str()) - .ok_or_else(|| anyhow::anyhow!("Missing 'key' parameter"))?; + .get("key") + .and_then(|v| v.as_str()) + .map(str::trim) + .filter(|value| !value.is_empty()) + .ok_or_else(|| anyhow::anyhow!("Missing or empty 'key' parameter"))?; + + if contains_sensitive_data(key) { + return Ok(ToolResult { + success: false, + output: String::new(), + error: Some(SENSITIVE_DATA_ERROR.to_string()), + structured: None, + }); + }As per coding guidelines, "Implement
Tooltrait insrc/tools/with strict parameter schema, validate and sanitize all inputs, and return structuredToolResultwithout panics in runtime path".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/memory_store.rs` around lines 142 - 188, The runtime currently only checks content for sensitivity and emptiness; update parameters_schema and execute to validate and sanitize the key too: in parameters_schema (method parameters_schema) make "key" require a non-empty string (e.g., minLength/regex) and in execute trim the key, reject empty keys and keys containing sensitive data by reusing contains_sensitive_data, and return the same structured ToolResult error (using SENSITIVE_DATA_ERROR) when validation fails; ensure you still map category to MemoryCategory and use the sanitized key where topic_key is sent.clients/agent-runtime/src/config/schema.rs (1)
2668-2683:⚠️ Potential issue | 🟠 MajorUnsupported
CORVUS_MEMORY_BACKENDvalues should fail, not silently fall back.Ignoring invalid overrides leaves the previous backend active. If an upgraded deployment still exports
surrealorsurreal-graphs, startup quietly keeps the existing backend instead of failing fast on the unsupported runtime path.Suggested fix
if let Ok(backend) = std::env::var("CORVUS_MEMORY_BACKEND").or_else(|_| std::env::var("MEMORY_BACKEND")) { let backend_raw = backend.trim(); if !backend_raw.is_empty() { - let backend = backend_raw.to_ascii_lowercase(); - if matches!(backend.as_str(), "sqlite" | "lucid" | "markdown" | "none") { - self.memory.backend = backend; - } else { - tracing::warn!( - "ignoring unknown memory backend override '{}'; allowed: sqlite, lucid, markdown, none", - backend_raw - ); - } + self.memory.backend = backend_raw.to_ascii_lowercase(); } }As per coding guidelines, "Look for behavioral regressions, missing tests, and contract breaks across modules".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/config/schema.rs` around lines 2668 - 2683, The current apply_memory_backend_override silently warns and keeps the previous backend when CORVUS_MEMORY_BACKEND / MEMORY_BACKEND contains an unsupported value; change this to fail fast: in apply_memory_backend_override (the function that reads CORVUS_MEMORY_BACKEND / MEMORY_BACKEND and currently calls tracing::warn), replace the tracing::warn branch for invalid values with a hard failure — log an error including backend_raw and then abort startup (e.g., tracing::error!(...) followed by std::process::exit(1) or return an Err from apply_memory_backend_override if you prefer propagating errors) so invalid overrides cannot be ignored and startup fails fast instead of retaining the previous self.memory.backend.
♻️ Duplicate comments (20)
openspec/changes/archive/2026-03-16-cerebro/verify-report.md (2)
1-1:⚠️ Potential issue | 🟡 MinorUse a top-level H1 for the document title.
Change
## Verification Reportto# Verification Reportto satisfy markdown heading requirements for archive docs.As per coding guidelines, "
**/*.{md,mdx}: Verify technical accuracy and that docs stay aligned with code changes."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/changes/archive/2026-03-16-cerebro/verify-report.md` at line 1, The document title uses a level-2 heading; change the top-level heading from "## Verification Report" to a level-1 heading "# Verification Report" so the archive doc uses a proper H1 title (update the same line containing the header).
1-5:⚠️ Potential issue | 🟡 MinorAdd explicit EN/ES parity status near the header.
Please add a short line such as
EN/ES parity: ES translation pending(oravailable) to satisfy docs parity tracking.As per coding guidelines, "
**/*.{md,mdx}: For user-facing docs, check EN/ES parity or explicitly note pending translation gaps."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/changes/archive/2026-03-16-cerebro/verify-report.md` around lines 1 - 5, Add an explicit EN/ES parity line directly under the "## Verification Report" header in the verify-report.md file (near the existing Change: cerebro / Version: N/A block); insert a short status line in the exact format `EN/ES parity: ES translation pending` (or `EN/ES parity: available` if translated) so docs parity tooling and reviewers can detect the translation state.clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_end.json (1)
10-14:⚠️ Potential issue | 🟡 Minor
session_idstill lacksmaxLengthconstraint.The previous review suggested adding both
minLength: 1(done ✓) andmaxLength: 128(missing). Without an upper bound, oversized identifiers could cause storage or memory issues downstream."session_id": { "type": "string", "minLength": 1, + "maxLength": 128, "description": "Session identifier to end." },Apply the same
maxLengthto the outputsession_idat line 32-35 for consistency.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_end.json` around lines 10 - 14, Add a maxLength: 128 constraint to the input "session_id" schema entry and also add the same maxLength: 128 to the output "session_id" field so both request and response have matching bounds; edit the JSON objects that define "session_id" (the input field currently with "minLength": 1) and the output "session_id" (the response schema at the later lines) to include "maxLength": 128.clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_delete.json (1)
24-28:⚠️ Potential issue | 🟠 MajorAdd
maxLengthtoreasonfield.The
reasonfield lacks an upper bound. Even optional fields should be bounded to prevent abuse.Proposed fix
"reason": { "type": "string", "minLength": 1, + "maxLength": 1024, "description": "Optional deletion reason." }As per coding guidelines, "Security first, performance second. Validate input boundaries."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_delete.json` around lines 24 - 28, The JSON schema for the "reason" property in mem_delete.json has no upper bound; add a "maxLength" (e.g., 256) to the "reason" object to bound input size, update the "reason" description to note the max length, and ensure any validators/clients that consume this schema (the "reason" property) respect the new limit.clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md (1)
43-46:⚠️ Potential issue | 🟠 MajorReplace inline auth token with env-var-first guidance.
Line 45 hardcodes
auth_tokenin config, which encourages committing secrets. Use a placeholder/comment and explicitly direct users toCEREBRO_AUTH_TOKENor a secrets manager.Suggested doc fix
[memory.cerebro] endpoint = "https://cerebro.example.com/mcp" -auth_token = "token-rotate-regularly" +# auth_token = "from-env" # Prefer env var: CEREBRO_AUTH_TOKEN request_timeout_ms = 30000 allow_insecure_loopback = false+Security note: do not commit `auth_token` to source control. Prefer +`CEREBRO_AUTH_TOKEN` (or your secret manager) and inject at runtime.As per coding guidelines: "
**/*: Security first, performance second. Validate ... secret management."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md` around lines 43 - 46, The config block sets memory.cerebro.auth_token inline which risks committing secrets; replace the hardcoded auth_token value with a non-secret placeholder or commented example and update the text to instruct users to source the token from an environment variable (CEREBRO_AUTH_TOKEN) or a secrets manager, showing how to reference that env var instead of embedding the token and adding a short note about rotating/storing tokens securely.openspec/specs/cerebro/spec.md (2)
112-116:⚠️ Potential issue | 🟠 MajorAdd explicit MCP auth/authz requirements, not just transport constraints.
This section enforces
https/wssand loopback policy, but it still lacks a normative auth/authz contract (required auth method(s), least-privilege scope, auth failure behavior, and secret handling/rotation). Please add a dedicated requirement subsection with MUST-level statements for runtime-to-Cerebro authentication and authorization.As per coding guidelines, "Security first, performance second. Validate input boundaries, auth/authz implications, and secret management."
Also applies to: 117-127, 129-141
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/specs/cerebro/spec.md` around lines 112 - 116, Add a new dedicated subsection under "Requirement: Secure Configuration Defaults" (near the existing HTTPS/WSS and loopback policy text) that contains normative MUST-level statements specifying runtime-to-Cerebro authentication and authorization: require mutually authenticated TLS or token-based mTLS/OIDC client authentication as the permitted auth methods, require least-privilege scopes/roles (define minimal scope names or claims), mandate explicit failure behavior for auth failures (deny connection, log with minimal sensitive detail, and rate-limit retries), and require secret handling and rotation policies (short-lived credentials, automated rotation, and secure storage). Reference the existing section header "Requirement: Secure Configuration Defaults" and ensure every statement is at MUST level and adds requirements for auth method(s), scope/role naming, auth failure behavior, and secret lifecycle management.
85-87:⚠️ Potential issue | 🟡 MinorMake legacy SurrealDB rejection message deterministic.
The quoted message still ends with
..., which keeps implementation/tests ambiguous. Define an exact required message (or a strict template with placeholders) including removal, Cerebro alternative, and migration doc reference.As per coding guidelines, "
**/*.{md,mdx}: Verify technical accuracy and that docs stay aligned with code changes."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/specs/cerebro/spec.md` around lines 85 - 87, Replace the non-deterministic rejection message in the THEN clause that currently ends with "..." with a single exact message (or a strict template) so tests are unambiguous; update the quoted text "SurrealDB backend has been removed; use the Cerebro backend for long-term memory..." to a deterministic string such as "SurrealDB backend has been removed; use the Cerebro backend for long-term memory. See {MIGRATION_DOC_URL} for migration instructions." or use the template "SurrealDB backend has been removed; use the Cerebro backend for long-term memory. See <migration-doc>." and apply this exact message wherever the phrase "SurrealDB backend has been removed; use the Cerebro backend for long-term memory" appears in spec.md so examples/tests match precisely.clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_start.json (1)
10-18:⚠️ Potential issue | 🟠 MajorAdd
maxLengthconstraints to identifier fields.Line 10 (
input.session_id), Line 15 (input.agent_id), and Line 37 (output.session_id) are still unbounded on maximum size. That leaves request/response contracts open to oversized IDs.Suggested fix
"session_id": { "type": "string", "minLength": 1, + "maxLength": 255, "description": "Client-generated session identifier." }, "agent_id": { "type": "string", "minLength": 1, + "maxLength": 255, "description": "Agent identifier for attribution." }, @@ "session_id": { "type": "string", - "minLength": 1 + "minLength": 1, + "maxLength": 255 },As per coding guidelines, "Security first, performance second. Validate input boundaries, auth/authz implications, and secret management."
Also applies to: 37-39
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_start.json` around lines 10 - 18, The schema leaves identifier fields unbounded; add sensible maxLength constraints for "session_id", "agent_id" (in input) and "session_id" in output to prevent oversized IDs—locate the properties named session_id and agent_id in the JSON schema (input.session_id, input.agent_id, output.session_id) and add an appropriate "maxLength" (e.g., 256 or your project's chosen limit) to each property, update descriptions if needed to note the limit, and ensure these constraints are applied consistently across the input and output schema entries.clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_summary.json (3)
10-14:⚠️ Potential issue | 🟡 MinorAdd
maxLengthtosession_idto complete input boundary validation.
minLength: 1was added (good), butmaxLengthis still missing. Unbounded identifiers open the door to oversized payloads. Match the pattern in other MCP schemas.Suggested fix
"session_id": { "type": "string", "minLength": 1, + "maxLength": 128, "description": "Session identifier to summarize." },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_summary.json` around lines 10 - 14, The schema property "session_id" currently has minLength but no upper bound; add a maxLength to complete input boundary validation by updating the "session_id" JSON schema entry (the object with "type": "string", "minLength": 1) to include a "maxLength" value consistent with other MCP schemas (e.g., 255 or the project-standard identifier limit) so identifiers are size-limited and payloads are bounded.
15-41:⚠️ Potential issue | 🟠 MajorAdd
requiredarray tosummaryobject to enforce mandatory fields.
additionalProperties: falsewas added (good), but thesummaryobject still lacks arequiredconstraint. This allows clients to submit an empty{}summary, breaking the contract established by the description and peer schemas.Suggested fix
"next_steps": { "type": "array", "items": { "type": "string", "minLength": 1 } } }, + "required": ["goal", "discoveries", "accomplished", "blockers", "next_steps"], "additionalProperties": false },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_summary.json` around lines 15 - 41, The summary object lacks a required array so clients can submit an empty {}; add a "required" property to the summary schema listing the mandatory keys (e.g. "goal", "discoveries", "accomplished", "blockers", "next_steps") to enforce presence of those fields in the "summary" object and keep "additionalProperties": false unchanged; update the schema definition for "summary" (the summary object in the mem_session_summary.json) to include this required array.
23-38: 🧹 Nitpick | 🔵 TrivialConsider adding
maxItemsandmaxLengthto array fields for defense-in-depth.Arrays (
discoveries,accomplished,blockers,next_steps) have no upper bounds. Malicious or buggy clients could submit arbitrarily large payloads. AddingmaxItemsandmaxLengthon items provides a safety net.Example constraint
"discoveries": { "type": "array", + "maxItems": 100, - "items": { "type": "string", "minLength": 1 } + "items": { "type": "string", "minLength": 1, "maxLength": 2000 } },🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_summary.json` around lines 23 - 38, Add upper bounds to the array schemas for defense-in-depth: for each of the properties "discoveries", "accomplished", "blockers", and "next_steps" add a "maxItems" (e.g., 50) and inside each "items" object add a "maxLength" for strings (e.g., 1000) in addition to the existing "minLength", so the schema for each array limits both number of entries and maximum string length; adjust the numeric limits if your domain requires different caps.clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_search.json (2)
31-34: 🧹 Nitpick | 🔵 TrivialConsider adding explicit
default: falseto match documentation.The description states "default false" but the schema lacks an explicit
"default": falsedeclaration. Adding it prevents contract drift across implementations."include_deleted": { "type": "boolean", + "default": false, "description": "Include soft-deleted records (default false)." }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_search.json` around lines 31 - 34, Add an explicit "default": false to the include_deleted boolean property in the mem_search.json JSON schema so the schema matches the description; locate the include_deleted property definition and add "default": false alongside "type": "boolean" (and keep the existing description) to prevent contract drift across implementations.
42-57: 🧹 Nitpick | 🔵 TrivialAdd
maxItemsto bound response size.The
resultsarray has no upper limit, allowing potentially large responses. Consider adding"maxItems": 100to match the inputlimitconstraint and protect against oversized payloads."results": { "type": "array", + "maxItems": 100, "items": {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_search.json` around lines 42 - 57, The response schema's "results" array lacks an upper bound which can return oversized payloads; update the "results" item definition in the mem_search.json schema to add "maxItems": 100 (matching the input limit) inside the "results" object so that the array is constrained, and keep existing "items", "required", and "additionalProperties" intact to enforce element shape while preventing overly large responses.clients/agent-runtime/src/tools/memory_recall.rs (2)
10-28:⚠️ Potential issue | 🟠 Major
memory_recallstill bypassesSecurityPolicy.This type no longer carries a policy handle, so neither the local path nor the Cerebro path enforces
ToolOperation::Readformemory_recall. Based on learnings: Applies to clients/agent-runtime/src/{security,gateway,tools,config}/**/*.rs : Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable.Also applies to: 61-152
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/memory_recall.rs` around lines 10 - 28, MemoryRecallTool currently lacks any SecurityPolicy enforcement (the struct only has cerebro and local), so memory reads via new/with_local bypass ToolOperation::Read checks; update MemoryRecallTool to carry a SecurityPolicy handle (or reference) and validate policy.allow/ enforce ToolOperation::Read before any local or cerebro memory access (in methods that perform recall), using the policy API (e.g., SecurityPolicy::enforce or similar) and return/propagate an authorization error when the check fails; locate the struct and its constructors (MemoryRecallTool, new, with_local) and the recall methods that call into Memory (Arc<dyn Memory>) or MemoryCerebroConfig to add the policy check and fail closed by default.
113-136:⚠️ Potential issue | 🟠 MajorAdd an explicit egress check before remote recall.
After the endpoint presence check, this path constructs the Cerebro adapter and sends outbound MCP traffic without an
enforce_cerebro_egress-style gate.memory_storealready protects its remote path this way. Based on learnings: Applies to clients/agent-runtime/src/{security,gateway,tools}/**/*.rs : Treatsrc/security/,src/gateway/,src/tools/as high-risk surfaces and never broaden filesystem/network execution scope without explicit policy checks.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/memory_recall.rs` around lines 113 - 136, After verifying endpoint presence, add an explicit egress policy gate call (the same pattern used by memory_store) before constructing the adapter so we never send outbound MCP traffic without approval: call the existing enforce_cerebro_egress (or the project's equivalent policy check) with the Cerebro config and abort with a ToolResult-style error if the check denies egress; place this check immediately before invoking cerebro::cerebro_tool_adapter(&self.cerebro, normalize::CEREBRO_TOOL_RECALL) and before calling adapter.execute so that normalize::CEREBRO_TOOL_RECALL, cerebro::cerebro_tool_adapter, and adapter.execute are only reached when egress is allowed.modules/cerebro/src/server.rs (2)
141-145:⚠️ Potential issue | 🟠 MajorRequire canonical
Bearerauth when auth is enabled.Falling back to the raw header value accepts non-standard
Authorizationheaders and weakens the auth contract on this security-sensitive path.As per coding guidelines "Security first, performance second. Validate input boundaries, auth/authz implications, and secret management."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/cerebro/src/server.rs` around lines 141 - 145, The current logic accepts any Authorization header by falling back to the raw value when strip_prefix("Bearer ") fails; change it to require the canonical "Bearer " prefix and reject otherwise. In the auth handling (auth_header, header, token) update the flow so that if auth_header is None or header does not start_with "Bearer " you return Err(CerebroError::Unauthorized); only strip the "Bearer " prefix and trim when the prefix is present to produce token, and treat empty token as Unauthorized as before.
28-34:⚠️ Potential issue | 🟠 MajorUse numeric JSON-RPC error codes.
error.codemust be an integer in JSON-RPC 2.0. Emitting strings like"invalid_request"and"invalid_method"will break interoperable clients.Also applies to: 82-99, 172-176
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/cerebro/src/server.rs` around lines 28 - 34, Change JsonRpcError.code from a string to a numeric type (e.g., i64 or i32) and update all places that construct or serialize JsonRpcError (including the other occurrences noted) to emit JSON-RPC 2.0 numeric error codes instead of textual ones; replace values like "invalid_request" / "invalid_method" with the appropriate integers (e.g., -32600 for Invalid Request, -32601 for Method Not Found, -32602 for Invalid Params, -32700 for Parse Error, or -32000..-32099 for server errors) and ensure any helper builders/constructors that create JsonRpcError (the JsonRpcError struct and any functions that instantiate it) are updated to pass the numeric constants and compile with the new field type.modules/cerebro/src/tools.rs (1)
227-233:⚠️ Potential issue | 🟠 MajorGenerate a real
memory_idon save.Using
topic_keyas the record ID means the secondmem_savefor the same topic overwrites the first record in ID-keyed storage, so history disappears.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/cerebro/src/tools.rs` around lines 227 - 233, The code currently uses topic_key as the record ID (let memory_id = input.input.topic_key.clone()) which causes new saves to overwrite previous records; change the creation of memory_id used in MemoryRecord::new(...) to generate a real unique ID (e.g., a UUID or timestamp+random suffix) at save time so each mem_save produces a distinct MemoryRecord id; update the call site that builds the MemoryRecord (MemoryRecord::new) and any downstream expectations of id uniqueness (e.g., save/insert functions) to use the new generated id instead of input.input.topic_key.clients/agent-runtime/src/agent/memory_loader.rs (2)
86-159:⚠️ Potential issue | 🟠 MajorDon't drop local memory when Cerebro is enabled.
_memoryis unused, so this loader replaces local short-term recall instead of augmenting it. Query local memory first and merge Cerebro hits on top.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/agent/memory_loader.rs` around lines 86 - 159, The loader currently ignores the provided local memory (parameter named _memory) and bails when Cerebro endpoint is missing, which replaces instead of augmenting local short-term recall; update load_context(&self, _memory: &dyn Memory, user_message: &str) to accept/use the memory (remove the leading underscore or shadow to a local variable name like memory), call memory.load_context(user_message).await? to get local_context first, then keep the Cerebro call but if endpoint is missing or Cerebro returns empty results return local_context; when Cerebro returns hits build cerebro_context as now and merge by prepending cerebro_context on top of local_context (or concatenating cerebro_context + local_context) before returning, preserving existing filtering by min_relevance_score and existing error handling for adapter.execute only when endpoint present.
120-127:⚠️ Potential issue | 🟠 MajorMissing
resultsshould be an error, not an empty success.Returning
Ok(String::new())here makes malformed Cerebro/auth/proxy responses indistinguishable from “no matches.”🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/agent/memory_loader.rs` around lines 120 - 127, The code currently treats a missing "results" field as a successful empty match; instead make it return an error so malformed responses are distinguishable. In the match on value.get("results").and_then(serde_json::Value::as_array) (in memory_loader.rs where you parse response.output into `value` and bind `results`), replace the None branch that returns Ok(String::new()) with an Err that includes a clear message and the raw response (e.g., include `response.output` or `value`), using the function's existing Result error type so callers can detect and log malformed Cerebro/auth/proxy responses.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@clients/agent-runtime/src/agent/agent.rs`:
- Around line 360-378: Replace the inline cerebro detection logic with the
existing helper: call crate::memory::cerebro_configured(&config.memory) instead
of repeating the as_deref/trim/is_some_and check; then construct memory_loader
exactly as before (using CerebroMemoryLoader::new(...) when cerebro_configured
is true, otherwise DefaultMemoryLoader::new(...)) so the code uses the single
source of truth for cerebro configuration and avoids duplicated logic.
- Around line 573-576: The call to
memory_loader.load_context(self.memory.as_ref(), user_message).await now
propagates errors (using ?) rather than silently returning an empty context as
before (unwrap_or_default), which can cause runtime failures if Cerebro is
configured but unreachable; either revert to the previous fallback behavior by
mapping the error to a default context with logging (e.g., replace the ? with an
explicit .await.map_err(|e| { log::warn!(...) ; e }).unwrap_or_default() pattern
or a .await.unwrap_or_else(|e| { log::warn!(...) ; Default::default() }) so
unreachable Cerebro yields a safe empty context, or keep the propagation and
update all callers of this call site to accept Result and handle the error;
locate the call at memory_loader.load_context(...), the surrounding use of
self.memory and user_message, and choose one of these two fixes consistently.
In `@clients/agent-runtime/src/config/schema.rs`:
- Around line 2848-2851: The migration error in the match arm handling "surreal"
| "surreal-graphs" incorrectly tells users to "use the Cerebro backend" even
though memory.backend only accepts sqlite, lucid, markdown, or none; update the
error message in that match arm (the code handling "surreal" | "surreal-graphs"
in schema.rs) to clearly state the valid memory.backend choices and instruct
users to configure the Cerebro settings under [memory.cerebro] instead,
including the migration link; ensure the message explicitly directs legacy
SurrealDB users to set up [memory.cerebro] (and not change memory.backend to an
unsupported value) so validation errors are avoided.
In `@clients/agent-runtime/src/tools/mcp/cerebro.rs`:
- Around line 25-34: The code accepts a whitespace-only cerebro.endpoint; update
the endpoint extraction to mirror auth_token by trimming and rejecting blank
values: change the endpoint binding that uses cerebro.endpoint.as_deref() to
also .map(str::trim).filter(|v| !v.is_empty()).ok_or_else(...) so a missing or
blank MemoryCerebroConfig.endpoint fails immediately with a clear error instead
of deferring to later URL parsing; refer to the existing variables/fields named
endpoint, auth_token and the cerebro binding when making the change.
In `@clients/agent-runtime/src/tools/mcp/client.rs`:
- Around line 221-261: The code currently deserializes the HTTP body to JSON
before checking status, causing non-JSON error pages to produce a JSON parse
error; modify the flow in the MCP client (around response, status, body,
payload, and name handling) to: first read the body bytes into body and check
status via status.is_success(); if the status is not success, return the
mcp_transport_error using the raw body (or attempt a best-effort JSON parse but
fall back to including the body as a string) so the error shows the actual
server response; only call serde_json::from_slice(&body) to produce payload when
status.is_success() and you expect a JSON-RPC payload.
In `@clients/agent-runtime/src/tools/memory_forget.rs`:
- Around line 105-110: The endpoint value is only trimmed for validation into
the local variable endpoint but the untrimmed self.cerebro is still passed into
adapter construction (e.g., where adapters are built around lines with
constructs that use self.cerebro to create the adapter); update the adapter
construction sites to use the sanitized endpoint string (use the endpoint
variable or replace cerebro.endpoint with endpoint.unwrap_or_default() as
appropriate) so that the trimmed value is propagated into the adapter creation
paths (references: the local variable endpoint and the adapter construction
calls that currently consume self.cerebro).
- Around line 121-123: The current code in memory_forget.rs uses fallible calls
(cerebro::cerebro_tool_adapter, building recall_payload, and subsequent calls
around recall_adapter/recall_payload) that propagate errors with ? and bypass
the ToolResult contract; change these sites (references:
cerebro::cerebro_tool_adapter, normalize::CEREBRO_TOOL_RECALL, recall_payload,
recall_adapter) to validate and sanitize inputs and map any error into a
structured ToolResult failure value (e.g., ToolResult::Failure or the project’s
equivalent) instead of returning Err(...), ensuring every runtime error is
caught and converted into the tool’s structured failure shape before returning
to callers.
In `@clients/agent-runtime/src/tools/memory_store.rs`:
- Around line 202-253: The current memory_store path leaks backend/transport
errors via `?` on `local.store`, `cerebro::cerebro_tool_adapter`, and
`adapter.execute`; change each to catch errors and convert them into a
`ToolResult` with `success: false`, `output: String::new()`, and `error:
Some(...)` (include the error string) instead of propagating the error;
specifically replace the `local.store(key, content, category, None).await?`
call, the `cerebro::cerebro_tool_adapter(&self.cerebro,
normalize::CEREBRO_TOOL_STORE)?` call, and the `adapter.execute(payload).await?`
call with explicit match/if let handling that returns the structured
`ToolResult` on failure while keeping the existing success `ToolResult` shape on
success.
- Around line 81-103: The last raw string in sensitive_regex_set uses
backslash-escaped quotes inside a r"..." literal which ends the string
prematurely; update that pattern to use Rust's raw string with hashes (e.g.,
r#"... "#) so the inner quotes are treated literally — modify the entry in the
RegexSet array (the one matching JSON key/value like
"\"(password|passphrase|api[_-]?key|access[_-]?key|secret|token)\"\s*:\s*\"[^\"]+\"")
to use r#"... "# raw string syntax so the embedded double quotes and backslashes
are preserved and the RegexSet::new(...) compiles.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_delete.json`:
- Around line 10-18: Add an upper bound to the string fields to prevent
oversized payloads: update the schema properties "memory_id" and "topic_key" to
include a sensible "maxLength" (e.g., 255 or another agreed limit) alongside the
existing "minLength": 1; ensure both entries in the mem_delete.json schema (the
"memory_id" and "topic_key" properties) are modified so validation rejects
strings longer than the chosen limit before reaching backend logic.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_end.json`:
- Around line 20-24: The metadata schema currently allows unbounded objects (the
"metadata" property with "additionalProperties": true). Constrain it by adding
limits such as "maxProperties": <reasonable_number> (e.g., 20) and/or limiting
value sizes with "additionalProperties": { "type": "string", "maxLength":
<chars> } or using "patternProperties" to whitelist expected keys;
alternatively, if unbounded flexibility is required, update the "description" to
document expected usage and size limits. Update the "metadata" object definition
(the "metadata" property) accordingly to enforce these bounds or to clarify
intended limits.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_start.json`:
- Around line 25-29: The metadata schema currently allows unbounded keys/values;
update the "metadata" object schema to enforce limits (e.g., add
"maxProperties": 50 or a sensible cap) and add value constraints such as
"additionalProperties": { "type": "string" } or pattern/value schemas (or use
"patternProperties") to restrict types/lengths; ensure the "metadata" definition
(the object with "type": "object", "description": "Optional session metadata.")
includes "maxProperties" and appropriate "additionalProperties" or
"patternProperties" rules to prevent unbounded payloads.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_suggest_topic_key.json`:
- Around line 10-34: The schema lacks upper bounds for string lengths and array
size; update the mem_suggest_topic_key.json schema by adding reasonable
maxLength limits to the "seed", "scope", and "topic_key" string properties and
adding a maxItems (and optionally maxLength on each candidate string) to the
"candidates" array to prevent oversized request/response payloads; locate and
modify the "seed", "scope", "topic_key", and "candidates" property definitions
in the JSON to include these constraints.
In `@clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md`:
- Line 86: Update the rollback guidance in the sentence "Prepare a rollback plan
(restore old config + re-enable legacy storage if needed)" to remove the
now-invalid "re-enable legacy storage" option and replace it with actionable
rollback steps such as "revert deployment/config to prior release, restore
exported data backups, or run local-only mode until Cerebro recovers"; also add
a short note about including threat/risk considerations for
security/runtime/gateway changes and a reminder to add tests for boundary and
failure modes. Locate this text in migration.md and replace the phrase
accordingly and append the brief risk/test note nearby.
In `@modules/cerebro/Cargo.toml`:
- Line 18: The tokio dependency currently enables the heavier multi-threaded
runtime via the "rt-multi-thread" feature; evaluate whether the memory service
requires multi-threaded concurrency and, if not, change the tokio features in
Cargo.toml by replacing "rt-multi-thread" with the single-threaded "rt" feature
(keeping "macros", "net", "time" as needed) to reduce footprint, otherwise leave
as-is and document why multithreaded runtime is necessary for the service's
workload.
- Around line 9-19: The Cargo.toml dependencies block duplicates versions that
should be inherited from a workspace and leaves some critical crates unpinned;
if a workspace Cargo.toml exists, switch relevant entries in the dependencies
table (e.g., anyhow, axum, tokio, secrecy, uuid) to workspace = true so they
inherit versions (use the pattern like anyhow.workspace = true), and for
security-sensitive crates (axum, tokio, secrecy) change the version specifiers
to exact pins (use '=' exact versions rather than caret ranges) so updates are
controlled; update the dependencies entries for axum, tokio, and secrecy (and
any other critical crates) accordingly and verify Cargo.lock / workspace
manifest consistency.
In `@modules/cerebro/src/config.rs`:
- Around line 71-89: bind_addr() and endpoint() produce invalid strings for IPv6
literals (e.g. ::1) because they interpolate host and port without IPv6 bracket
notation; update both functions to detect IP literals and wrap IPv6 addresses in
brackets or construct the outputs from a SocketAddr/Url. Specifically, in
bind_addr() and endpoint() (which uses scheme, host, port and
is_loopback_host()), normalize self.host by parsing it as IpAddr/SocketAddr or
using std::net::SocketAddr to format as "[::1]:port" for IPv6, and when building
the endpoint ensure the host is bracketed before inserting into
"{scheme}://{host}:{port}/mcp" (or build the full URL via url::Url from a
SocketAddr) so both functions produce valid bind addresses and URLs for IPv6.
In `@modules/cerebro/src/main.rs`:
- Around line 6-14: Initialize tracing at startup and add graceful shutdown
wiring in main: set up a tracing subscriber (e.g., tracing_subscriber::fmt or
layered subscriber) before creating CerebroService to enable logs, and replace
axum::serve(listener, service.router()).await with an axum Server built from the
TcpListener (use tokio::net::TcpListener and axum::Server::from_tcp) so you can
call .serve(service.router().into_make_service()) and chain
.with_graceful_shutdown(shutdown_signal()). Implement shutdown_signal() using
tokio::signal (handle ctrl_c and SIGTERM on Unix) to drive graceful shutdown;
ensure CerebroConfig::default(), CerebroService::from_config(...), main(),
listener and service.router() are used unchanged except for adding tracing init
and the Server + graceful shutdown wiring.
In `@modules/cerebro/src/server.rs`:
- Around line 148-155: The code currently grants audit privileges when
audit_token.is_none() by setting is_audit = audit_token.is_none(); change this
so the main-token match (token == expected) returns AuthContext { is_audit:
false } unconditionally, and only set is_audit: true in the explicit audit-token
match branch (the if using audit_token.is_some_and(|audit| token == audit)).
Update the logic around token, expected, audit_token and AuthContext so audit
access is only given when the audit token explicitly matches.
In `@modules/cerebro/src/storage/mod.rs`:
- Around line 107-116: The storage I/O errors do not include the affected file
paths; update the map_err messages around storage operations (the create-dir
call, serde_json::to_vec_pretty usage, fs::write to tmp_path, and fs::rename to
self.path) to include the path(s) for actionable diagnostics — e.g., include
self.path (and the tmp_path you construct) via their Display form in the
CerebroError::Storage messages so each error shows which file or temp file
failed (use self.path.display() and tmp_path.display() in the formatted error
strings).
- Around line 194-198: The save method mutates the in-memory map before calling
self.persist, so if persist fails the in-memory state diverges; change save (and
the other mutating methods around lines 205-216) to rollback the in-memory
mutation on persist failure: perform the existing map.insert/map.remove
operations as now, call self.persist(&map), and if persist returns Err then undo
the prior mutation (e.g., call map.remove(&record.memory_id) after an insert, or
re-insert the old value after an update/delete) and return the Err from persist.
Ensure you reference the same lock and map variable (self.records.write().await
/ map) so rollback happens under the same write lock.
- Around line 100-117: persist performs blocking std::fs operations while called
under the async write lock in DiskBackedStorage::save and
DiskBackedStorage::delete, causing Tokio thread blocking; change the
implementation so filesystem IO does not run while the write lock is held:
either (A) make persist async and use tokio::fs::create_dir_all /
tokio::fs::write / tokio::fs::rename and await it outside the critical section,
or (B) release the write lock before doing the blocking persist call, or (C)
wrap the current persist body in tokio::task::spawn_blocking and await that task
after unlocking; update references to persist, DiskBackedStorage::save, and
DiskBackedStorage::delete accordingly so the lock is held only for in-memory
state mutation and not for disk IO.
- Around line 112-116: The atomic write uses fs::rename(&tmp_path, &self.path)
which on Windows fails if the destination exists; modify the save sequence in
the Storage implementation (around tmp_path and self.path usage) to proactively
remove the existing target before renaming—e.g. call
fs::remove_file(&self.path).ok() (or a platform-specific removal) immediately
before fs::rename and keep existing map_err handling to convert errors into
CerebroError::Storage so repeated saves succeed on Windows.
In `@modules/cerebro/tests/mcp_auth_policy.rs`:
- Around line 5-51: Add two new async tests alongside
rejects_requests_without_auth_token and accepts_requests_with_valid_auth_token
that exercise CerebroService::handle_json_rpc for malformed tokens: one that
calls handle_json_rpc with Some("Bearer wrong") and asserts the response
contains an auth error (error.code == "unauthorized"), and another that calls
handle_json_rpc with Some("secret") (missing "Bearer " prefix) and also asserts
an unauthorized error; reuse the same request construction and
InMemoryStorage/CerebroConfig setup from the existing tests to keep them
consistent.
In `@modules/cerebro/tests/mcp_tools_contract.rs`:
- Around line 171-178: The call to the helper call_tool is missing the required
second parameter auth_header: Option<&str>; update the invocation of call_tool
(the one using &service, "mem_get_observation", json!({...})) to pass an auth
header option (e.g., Some(auth_header_str) or None) as the second argument so
the call matches the signature call_tool(&service, auth_header,
"mem_get_observation", json!(...)); ensure you pass the actual auth header
variable used elsewhere in the test or None if unauthenticated.
---
Outside diff comments:
In `@clients/agent-runtime/src/config/schema.rs`:
- Around line 2668-2683: The current apply_memory_backend_override silently
warns and keeps the previous backend when CORVUS_MEMORY_BACKEND / MEMORY_BACKEND
contains an unsupported value; change this to fail fast: in
apply_memory_backend_override (the function that reads CORVUS_MEMORY_BACKEND /
MEMORY_BACKEND and currently calls tracing::warn), replace the tracing::warn
branch for invalid values with a hard failure — log an error including
backend_raw and then abort startup (e.g., tracing::error!(...) followed by
std::process::exit(1) or return an Err from apply_memory_backend_override if you
prefer propagating errors) so invalid overrides cannot be ignored and startup
fails fast instead of retaining the previous self.memory.backend.
In `@clients/agent-runtime/src/tools/memory_forget.rs`:
- Around line 182-241: Add unit tests to cover the Cerebro/local execution and
response-parsing branches missing from the tests for MemoryForgetTool: implement
tests for a successful local forget flow (e.g., simulate successful local memory
deletion and assert result.success and no error), an endpoint-missing error when
Cerebro config lacks an endpoint (assert error contains “missing endpoint” or
similar), a recall output parsing failure (simulate Cerebro/local response with
malformed recall output and assert parsing error is returned), and a successful
Cerebro forget flow (mock Cerebro response with a valid forget confirmation and
assert success). Place these tests in the existing tests module (same scope as
name_and_schema and forget_* tests), reuse MemoryForgetTool::new and
MemoryCerebroConfig to construct instances, and assert on result.success and
result.error messages to validate each branch.
In `@clients/agent-runtime/src/tools/memory_store.rs`:
- Around line 142-188: The runtime currently only checks content for sensitivity
and emptiness; update parameters_schema and execute to validate and sanitize the
key too: in parameters_schema (method parameters_schema) make "key" require a
non-empty string (e.g., minLength/regex) and in execute trim the key, reject
empty keys and keys containing sensitive data by reusing
contains_sensitive_data, and return the same structured ToolResult error (using
SENSITIVE_DATA_ERROR) when validation fails; ensure you still map category to
MemoryCategory and use the sanitized key where topic_key is sent.
---
Duplicate comments:
In `@clients/agent-runtime/src/agent/memory_loader.rs`:
- Around line 86-159: The loader currently ignores the provided local memory
(parameter named _memory) and bails when Cerebro endpoint is missing, which
replaces instead of augmenting local short-term recall; update
load_context(&self, _memory: &dyn Memory, user_message: &str) to accept/use the
memory (remove the leading underscore or shadow to a local variable name like
memory), call memory.load_context(user_message).await? to get local_context
first, then keep the Cerebro call but if endpoint is missing or Cerebro returns
empty results return local_context; when Cerebro returns hits build
cerebro_context as now and merge by prepending cerebro_context on top of
local_context (or concatenating cerebro_context + local_context) before
returning, preserving existing filtering by min_relevance_score and existing
error handling for adapter.execute only when endpoint present.
- Around line 120-127: The code currently treats a missing "results" field as a
successful empty match; instead make it return an error so malformed responses
are distinguishable. In the match on
value.get("results").and_then(serde_json::Value::as_array) (in memory_loader.rs
where you parse response.output into `value` and bind `results`), replace the
None branch that returns Ok(String::new()) with an Err that includes a clear
message and the raw response (e.g., include `response.output` or `value`), using
the function's existing Result error type so callers can detect and log
malformed Cerebro/auth/proxy responses.
In `@clients/agent-runtime/src/tools/memory_recall.rs`:
- Around line 10-28: MemoryRecallTool currently lacks any SecurityPolicy
enforcement (the struct only has cerebro and local), so memory reads via
new/with_local bypass ToolOperation::Read checks; update MemoryRecallTool to
carry a SecurityPolicy handle (or reference) and validate policy.allow/ enforce
ToolOperation::Read before any local or cerebro memory access (in methods that
perform recall), using the policy API (e.g., SecurityPolicy::enforce or similar)
and return/propagate an authorization error when the check fails; locate the
struct and its constructors (MemoryRecallTool, new, with_local) and the recall
methods that call into Memory (Arc<dyn Memory>) or MemoryCerebroConfig to add
the policy check and fail closed by default.
- Around line 113-136: After verifying endpoint presence, add an explicit egress
policy gate call (the same pattern used by memory_store) before constructing the
adapter so we never send outbound MCP traffic without approval: call the
existing enforce_cerebro_egress (or the project's equivalent policy check) with
the Cerebro config and abort with a ToolResult-style error if the check denies
egress; place this check immediately before invoking
cerebro::cerebro_tool_adapter(&self.cerebro, normalize::CEREBRO_TOOL_RECALL) and
before calling adapter.execute so that normalize::CEREBRO_TOOL_RECALL,
cerebro::cerebro_tool_adapter, and adapter.execute are only reached when egress
is allowed.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_delete.json`:
- Around line 24-28: The JSON schema for the "reason" property in
mem_delete.json has no upper bound; add a "maxLength" (e.g., 256) to the
"reason" object to bound input size, update the "reason" description to note the
max length, and ensure any validators/clients that consume this schema (the
"reason" property) respect the new limit.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_search.json`:
- Around line 31-34: Add an explicit "default": false to the include_deleted
boolean property in the mem_search.json JSON schema so the schema matches the
description; locate the include_deleted property definition and add "default":
false alongside "type": "boolean" (and keep the existing description) to prevent
contract drift across implementations.
- Around line 42-57: The response schema's "results" array lacks an upper bound
which can return oversized payloads; update the "results" item definition in the
mem_search.json schema to add "maxItems": 100 (matching the input limit) inside
the "results" object so that the array is constrained, and keep existing
"items", "required", and "additionalProperties" intact to enforce element shape
while preventing overly large responses.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_end.json`:
- Around line 10-14: Add a maxLength: 128 constraint to the input "session_id"
schema entry and also add the same maxLength: 128 to the output "session_id"
field so both request and response have matching bounds; edit the JSON objects
that define "session_id" (the input field currently with "minLength": 1) and the
output "session_id" (the response schema at the later lines) to include
"maxLength": 128.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_start.json`:
- Around line 10-18: The schema leaves identifier fields unbounded; add sensible
maxLength constraints for "session_id", "agent_id" (in input) and "session_id"
in output to prevent oversized IDs—locate the properties named session_id and
agent_id in the JSON schema (input.session_id, input.agent_id,
output.session_id) and add an appropriate "maxLength" (e.g., 256 or your
project's chosen limit) to each property, update descriptions if needed to note
the limit, and ensure these constraints are applied consistently across the
input and output schema entries.
In
`@clients/web/apps/docs/src/content/docs/guides/cerebro/mcp-schema/mem_session_summary.json`:
- Around line 10-14: The schema property "session_id" currently has minLength
but no upper bound; add a maxLength to complete input boundary validation by
updating the "session_id" JSON schema entry (the object with "type": "string",
"minLength": 1) to include a "maxLength" value consistent with other MCP schemas
(e.g., 255 or the project-standard identifier limit) so identifiers are
size-limited and payloads are bounded.
- Around line 15-41: The summary object lacks a required array so clients can
submit an empty {}; add a "required" property to the summary schema listing the
mandatory keys (e.g. "goal", "discoveries", "accomplished", "blockers",
"next_steps") to enforce presence of those fields in the "summary" object and
keep "additionalProperties": false unchanged; update the schema definition for
"summary" (the summary object in the mem_session_summary.json) to include this
required array.
- Around line 23-38: Add upper bounds to the array schemas for defense-in-depth:
for each of the properties "discoveries", "accomplished", "blockers", and
"next_steps" add a "maxItems" (e.g., 50) and inside each "items" object add a
"maxLength" for strings (e.g., 1000) in addition to the existing "minLength", so
the schema for each array limits both number of entries and maximum string
length; adjust the numeric limits if your domain requires different caps.
In `@clients/web/apps/docs/src/content/docs/guides/cerebro/migration.md`:
- Around line 43-46: The config block sets memory.cerebro.auth_token inline
which risks committing secrets; replace the hardcoded auth_token value with a
non-secret placeholder or commented example and update the text to instruct
users to source the token from an environment variable (CEREBRO_AUTH_TOKEN) or a
secrets manager, showing how to reference that env var instead of embedding the
token and adding a short note about rotating/storing tokens securely.
In `@modules/cerebro/src/server.rs`:
- Around line 141-145: The current logic accepts any Authorization header by
falling back to the raw value when strip_prefix("Bearer ") fails; change it to
require the canonical "Bearer " prefix and reject otherwise. In the auth
handling (auth_header, header, token) update the flow so that if auth_header is
None or header does not start_with "Bearer " you return
Err(CerebroError::Unauthorized); only strip the "Bearer " prefix and trim when
the prefix is present to produce token, and treat empty token as Unauthorized as
before.
- Around line 28-34: Change JsonRpcError.code from a string to a numeric type
(e.g., i64 or i32) and update all places that construct or serialize
JsonRpcError (including the other occurrences noted) to emit JSON-RPC 2.0
numeric error codes instead of textual ones; replace values like
"invalid_request" / "invalid_method" with the appropriate integers (e.g., -32600
for Invalid Request, -32601 for Method Not Found, -32602 for Invalid Params,
-32700 for Parse Error, or -32000..-32099 for server errors) and ensure any
helper builders/constructors that create JsonRpcError (the JsonRpcError struct
and any functions that instantiate it) are updated to pass the numeric constants
and compile with the new field type.
In `@modules/cerebro/src/tools.rs`:
- Around line 227-233: The code currently uses topic_key as the record ID (let
memory_id = input.input.topic_key.clone()) which causes new saves to overwrite
previous records; change the creation of memory_id used in
MemoryRecord::new(...) to generate a real unique ID (e.g., a UUID or
timestamp+random suffix) at save time so each mem_save produces a distinct
MemoryRecord id; update the call site that builds the MemoryRecord
(MemoryRecord::new) and any downstream expectations of id uniqueness (e.g.,
save/insert functions) to use the new generated id instead of
input.input.topic_key.
In `@openspec/changes/archive/2026-03-16-cerebro/verify-report.md`:
- Line 1: The document title uses a level-2 heading; change the top-level
heading from "## Verification Report" to a level-1 heading "# Verification
Report" so the archive doc uses a proper H1 title (update the same line
containing the header).
- Around line 1-5: Add an explicit EN/ES parity line directly under the "##
Verification Report" header in the verify-report.md file (near the existing
Change: cerebro / Version: N/A block); insert a short status line in the exact
format `EN/ES parity: ES translation pending` (or `EN/ES parity: available` if
translated) so docs parity tooling and reviewers can detect the translation
state.
In `@openspec/specs/cerebro/spec.md`:
- Around line 112-116: Add a new dedicated subsection under "Requirement: Secure
Configuration Defaults" (near the existing HTTPS/WSS and loopback policy text)
that contains normative MUST-level statements specifying runtime-to-Cerebro
authentication and authorization: require mutually authenticated TLS or
token-based mTLS/OIDC client authentication as the permitted auth methods,
require least-privilege scopes/roles (define minimal scope names or claims),
mandate explicit failure behavior for auth failures (deny connection, log with
minimal sensitive detail, and rate-limit retries), and require secret handling
and rotation policies (short-lived credentials, automated rotation, and secure
storage). Reference the existing section header "Requirement: Secure
Configuration Defaults" and ensure every statement is at MUST level and adds
requirements for auth method(s), scope/role naming, auth failure behavior, and
secret lifecycle management.
- Around line 85-87: Replace the non-deterministic rejection message in the THEN
clause that currently ends with "..." with a single exact message (or a strict
template) so tests are unambiguous; update the quoted text "SurrealDB backend
has been removed; use the Cerebro backend for long-term memory..." to a
deterministic string such as "SurrealDB backend has been removed; use the
Cerebro backend for long-term memory. See {MIGRATION_DOC_URL} for migration
instructions." or use the template "SurrealDB backend has been removed; use the
Cerebro backend for long-term memory. See <migration-doc>." and apply this exact
message wherever the phrase "SurrealDB backend has been removed; use the Cerebro
backend for long-term memory" appears in spec.md so examples/tests match
precisely.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
| #[tokio::test] | ||
| async fn rejects_requests_without_auth_token() { | ||
| let storage = InMemoryStorage::new(); | ||
| let config = CerebroConfig { | ||
| auth_token: Some(SecretString::new("secret".to_string())), | ||
| ..Default::default() | ||
| }; | ||
| let service = CerebroService::new(config, storage); | ||
|
|
||
| let request = cerebro::JsonRpcRequest { | ||
| jsonrpc: "2.0".to_string(), | ||
| id: json!("1"), | ||
| method: "tools/call".to_string(), | ||
| params: cerebro::server::JsonRpcParams { | ||
| name: "mem_stats".to_string(), | ||
| arguments: json!({ "input": {} }), | ||
| }, | ||
| }; | ||
|
|
||
| let response = service.handle_json_rpc(request, None).await; | ||
| let error = response.error.expect("expected auth error"); | ||
| assert_eq!(error.code, "unauthorized"); | ||
| } | ||
|
|
||
| #[tokio::test] | ||
| async fn accepts_requests_with_valid_auth_token() { | ||
| let storage = InMemoryStorage::new(); | ||
| let config = CerebroConfig { | ||
| auth_token: Some(SecretString::new("secret".to_string())), | ||
| ..Default::default() | ||
| }; | ||
| let service = CerebroService::new(config, storage); | ||
|
|
||
| let request = cerebro::JsonRpcRequest { | ||
| jsonrpc: "2.0".to_string(), | ||
| id: json!("1"), | ||
| method: "tools/call".to_string(), | ||
| params: cerebro::server::JsonRpcParams { | ||
| name: "mem_stats".to_string(), | ||
| arguments: json!({ "input": {} }), | ||
| }, | ||
| }; | ||
|
|
||
| let response = service | ||
| .handle_json_rpc(request, Some("Bearer secret")) | ||
| .await; | ||
| assert!(response.error.is_none()); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Consider adding edge-case tests for malformed tokens.
Current tests cover missing and valid tokens. Consider adding tests for:
- Invalid token value (
Bearer wrong) - Missing "Bearer " prefix (
secretinstead ofBearer secret)
This ensures auth validation is robust against bypass attempts.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/cerebro/tests/mcp_auth_policy.rs` around lines 5 - 51, Add two new
async tests alongside rejects_requests_without_auth_token and
accepts_requests_with_valid_auth_token that exercise
CerebroService::handle_json_rpc for malformed tokens: one that calls
handle_json_rpc with Some("Bearer wrong") and asserts the response contains an
auth error (error.code == "unauthorized"), and another that calls
handle_json_rpc with Some("secret") (missing "Bearer " prefix) and also asserts
an unauthorized error; reuse the same request construction and
InMemoryStorage/CerebroConfig setup from the existing tests to keep them
consistent.
There was a problem hiding this comment.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
clients/agent-runtime/src/memory/backend.rs (1)
93-100:⚠️ Potential issue | 🟡 MinorAdd direct tests for legacy profile mapping behavior.
Line 99 introduces new behavior (
LegacySurreal -> CUSTOM_PROFILE) but there’s no direct assertion for it. Add a focused test to prevent regressions in migration-facing config/profile handling.✅ Suggested test addition
#[test] fn unknown_profile_preserves_extensibility_defaults() { let profile = memory_backend_profile("custom-memory"); assert_eq!(profile.key, "custom"); assert!(profile.auto_save_default); assert!(!profile.uses_sqlite_hygiene); } + +#[test] +fn legacy_surreal_profile_maps_to_custom_defaults() { + let profile = memory_backend_profile("surreal"); + assert_eq!(profile.key, "custom"); + assert!(profile.auto_save_default); + assert!(!profile.uses_sqlite_hygiene); + assert!(!profile.sqlite_based); +} + +#[test] +fn legacy_surreal_graphs_profile_maps_to_custom_defaults() { + let profile = memory_backend_profile("surreal-graphs"); + assert_eq!(profile.key, "custom"); +}As per coding guidelines: "Look for behavioral regressions, missing tests, and contract breaks across modules."
Also applies to: 153-159
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/memory/backend.rs` around lines 93 - 100, Add a focused unit test asserting that classify_memory_backend returns MemoryBackendKind::LegacySurreal (or that memory_backend_profile maps a legacy backend string) yields CUSTOM_PROFILE to lock in the migration behavior introduced at memory_backend_profile; specifically, add a test that calls classify_memory_backend with the legacy Surreal identifier or calls memory_backend_profile with the legacy backend string and asserts equality with CUSTOM_PROFILE (use the MemoryBackendKind::LegacySurreal and memory_backend_profile symbols to locate code), and mirror the same focused assertion for the other case noted around lines 153-159 to prevent regressions.
♻️ Duplicate comments (8)
modules/cerebro/src/tools.rs (1)
227-234:⚠️ Potential issue | 🟠 MajorUsing
topic_keyasmemory_idcauses record overwrites.This was flagged in a previous review and remains unaddressed. When a second
mem_saveoccurs under the sametopic_key, it overwrites the first record sincememory_idserves as the primary key. Generate a unique ID instead.Suggested fix
+use uuid::Uuid; + - let memory_id = input.input.topic_key.clone(); + let memory_id = Uuid::new_v4().to_string(); let record = MemoryRecord::new( memory_id.clone(), input.input.scope, input.input.topic_key, observation, );🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@modules/cerebro/src/tools.rs` around lines 227 - 234, The code currently uses input.input.topic_key as memory_id which causes new saves to overwrite existing MemoryRecord primary keys; change the memory_id generation in the mem_save path to a unique ID (e.g., Uuid::new_v4().to_string() or another unique generator) and keep input.input.topic_key as the topic field passed into MemoryRecord::new, then call self.storage.save(record).await? as before; update imports to use the UUID generator (e.g., add uuid::Uuid) or your project’s unique-id helper so each MemoryRecord created by MemoryRecord::new has a distinct primary key instead of topic_key.openspec/specs/cerebro/spec.md (2)
112-127:⚠️ Potential issue | 🟠 MajorAdd explicit MCP authentication and authorization requirements.
The spec requires secure transport (https/wss) but does not specify authentication mechanisms for runtime-to-Cerebro MCP connections. Add requirements for:
- Supported auth methods (API key/token, mTLS, bearer auth)
- Token/credential rotation and secure storage guidance
- Auth failure handling (rejection, retry, observability)
- Least-privilege scope expectations
Transport security alone is insufficient; an attacker with network access to the Cerebro endpoint could make unauthorized memory operations without authentication.
🔐 Suggested auth/authz section
Add after line 127:
#### Authentication and authorization requirements The runtime MUST authenticate to Cerebro MCP endpoints using one of: - Bearer token via `Authorization` header (for https endpoints) - API key via `auth_token` configuration parameter - Mutual TLS (mTLS) with client certificate verification Configuration: - Auth credentials MUST be stored securely (environment variables, secret manager) - Tokens SHOULD support rotation without runtime restart - Failed auth attempts MUST be logged and MAY trigger backoff/circuit breaker Authorization: - Cerebro SHOULD enforce least-privilege scopes per client token - Runtime SHOULD use a token with minimal required permissions (mem_save, mem_search, mem_delete only)As per coding guidelines: "Security first ... validate input boundaries, auth/authz implications, and secret management."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/specs/cerebro/spec.md` around lines 112 - 127, Add a new "Authentication and authorization requirements" section after the secure-transport requirement (after the current secure endpoint scenarios) that mandates the runtime MUST authenticate to Cerebro MCP using one of: Bearer token via the Authorization header, an API key supplied via an auth_token configuration parameter, or mutual TLS (mTLS) with client certificate verification; require auth credentials be stored securely (env vars or secret manager), support token/credential rotation without runtime restarts, and log failed auth attempts with configurable backoff/circuit-breaker behavior; additionally require Cerebro to enforce least-privilege scopes per client token (recommend minimal scopes such as mem_save, mem_search, mem_delete) and state that runtimes should use tokens limited to those scopes.
59-64:⚠️ Potential issue | 🟡 MinorClarify the PII/secret detection mechanism.
Lines 63-64 require "a local PII/secret guard" but don't specify whether detection is automated (pattern matching, ML-based) or requires developer-provided metadata/annotations. Consider adding:
- Whether automated PII detection is REQUIRED or OPTIONAL
- If manual, the expected developer workflow (annotations, flags)
- Concrete examples of what constitutes "sensitive content" for this guard
Underspecified security requirements lead to inconsistent implementations.
As per coding guidelines: "Security first ... validate input boundaries, auth/authz implications, and secret management."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/specs/cerebro/spec.md` around lines 59 - 64, Update the PII/secret guard text to specify detection modes and developer workflow: state that the runtime MUST implement automated PII/secret detection (e.g., pattern matching and optional ML models) as the default and that manual developer-provided annotations/flags are OPTIONAL and override/augment automated results; describe the expected developer workflow for annotations (how to mark data as sensitive or non-sensitive when calling mem_save or invoking MCP, and that annotated fields are treated as local-only), require the runtime to return a structured error when either automated detection or an annotation blocks an MCP call, and include concrete examples of sensitive content (credentials, SSNs, credit card numbers, OAuth tokens, private keys, health data, and PII such as full names + DOB) and non-sensitive examples to guide implementers; reference the MCP boundary, mem_save, and the "local PII/secret guard" so implementers know which components must enforce these rules.clients/agent-runtime/src/tools/memory_recall.rs (1)
11-12:⚠️ Potential issue | 🟠 MajorRestore policy and egress checks on the remote recall path.
MemoryRecallToolno longer carries aSecurityPolicy, so the Cerebro branch can issue outbound requests without the deny-by-default gate thatMemoryStoreToolnow applies. InjectArc<SecurityPolicy>, enforceToolOperation::Read, and callenforce_cerebro_egress(...)before constructing the adapter.
As per coding guidelines "Treatsrc/security/,src/gateway/,src/tools/as high-risk surfaces and never broaden filesystem/network execution scope without explicit policy checks".Also applies to: 16-27, 125-148
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/memory_recall.rs` around lines 11 - 12, MemoryRecallTool's remote (Cerebro) path lacks security checks: inject an Arc<SecurityPolicy> into the MemoryRecallTool (alongside MemoryCerebroConfig and local: Option<Arc<dyn Memory>>), ensure the remote branch enforces ToolOperation::Read via the SecurityPolicy before doing work, and call enforce_cerebro_egress(&policy, &cerebro, ...) to validate outbound egress prior to constructing the Cerebro adapter; update all Cerebro-related branches (creation points referenced by MemoryRecallTool, MemoryCerebroConfig and the adapter construction sites) to perform these checks so filesystem/network execution is not expanded without explicit policy enforcement.clients/agent-runtime/src/tools/memory_store.rs (2)
202-203:⚠️ Potential issue | 🟠 MajorReturn storage and transport failures as
ToolResult, notErr.
local.store(...),cerebro::cerebro_tool_adapter(...), andadapter.execute(...)still use?, so backend/network faults escape the tool boundary instead of producingsuccess: falsewith a structured error. Mirror the deny-path handling here so the agent gets a normal tool failure response.
As per coding guidelines "ImplementTooltrait insrc/tools/with strict parameter schema, validate and sanitize all inputs, and return structuredToolResultwithout panics in runtime path".Also applies to: 240-253
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/memory_store.rs` around lines 202 - 203, The calls local.store(...), cerebro::cerebro_tool_adapter(...), and adapter.execute(...) currently use the ? operator which lets backend/network errors escape the Tool boundary; change each to catch/map errors and return a ToolResult with success: false and a structured error (mirroring the existing deny-path handling) instead of propagating Err. Specifically, replace usages of local.store(...) .await?, cerebro::cerebro_tool_adapter(...)? and adapter.execute(...)? with explicit match/map_err that constructs the same ToolResult error shape used in the deny branch (include context text, error.to_string(), and any appropriate error_kind), returning Ok(ToolResult { success: false, error: Some(...) }) so failures are surfaced as tool failures rather than panics/propagation.
181-188:⚠️ Potential issue | 🟠 MajorValidate every outbound memory field for sensitive data.
The new guard only inspects
content, but Line 244 still forwardskeyastopic_keyand Line 248 forwards custom categories in metadata. A model can bypass the block by moving a token, email, or phone number into those fields. Screenkeyand any customcategorybefore either the local or Cerebro path.
As per coding guidelines "ImplementTooltrait insrc/tools/with strict parameter schema, validate and sanitize all inputs, and return structuredToolResultwithout panics in runtime path".Also applies to: 241-249
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/memory_store.rs` around lines 181 - 188, The current sensitive-data guard only checks `content` but must also validate and sanitize outbound fields `key` (used as `topic_key`) and any custom `category` values in `metadata` before taking either the local or Cerebro path; update the logic in the function in clients/agent-runtime/src/tools/memory_store.rs to call `contains_sensitive_data` (or equivalent sanitizer) on `key` and each `metadata.category` entry and return the same `ToolResult` with `SENSITIVE_DATA_ERROR` when any of those checks fail, ensuring both the local-path branch and the Cerebro-path branch never forward unsanitized `topic_key` or `category` values.clients/agent-runtime/src/config/schema.rs (1)
2848-2851:⚠️ Potential issue | 🟠 MajorPoint SurrealDB users at
[memory.cerebro], not a nonexistent backend.This error still says "Use the Cerebro backend", but the allowlist in this same validator only accepts
sqlite,lucid,markdown, andnone. That sends migrations into a second validation error instead of the real fix: keepmemory.backendon a supported value and configure[memory.cerebro].
As per coding guidelines "Look for behavioral regressions, missing tests, and contract breaks across modules".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/config/schema.rs` around lines 2848 - 2851, The error message for the SurrealDB cases incorrectly tells users to "Use the Cerebro backend" which conflicts with the allowlist for memory.backend; update the error text in the match arm handling "surreal" | "surreal-graphs" (where self.memory.backend is referenced) to instruct users to keep memory.backend set to a supported value (sqlite, lucid, markdown, or none) and configure the Cerebro settings under the [memory.cerebro] table/section instead of changing memory.backend; modify the anyhow::bail(...) message accordingly so it points to the correct configuration key and migration doc.clients/agent-runtime/src/tools/mcp/client.rs (1)
254-268:⚠️ Potential issue | 🟠 MajorCheck HTTP status before JSON-decoding error bodies.
Non-2xx responses still go through
serde_json::from_slicefirst, so an HTML/plain-text 401/502 becomesMCP HTTP response was not valid JSONinstead of a transport error with the real status/body. Branch onstatus.is_success()before deserializing, and only parse JSON on the success path.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/mcp/client.rs` around lines 254 - 268, The code currently calls serde_json::from_slice(&body) before checking HTTP status, causing non-JSON error responses to produce a misleading JSON-deserialization error; change the flow so you first check status.is_success() and only call serde_json::from_slice on the success branch. On the non-success branch (using status and body), build the anyhow::bail! error including self.server.name and name and include the raw body as a UTF-8 string (with a fallback representation if decoding fails) under "details" so transport errors show the real status and body; leave JSON parsing of payload to the successful path where payload is set from serde_json::from_slice(&body).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.agents/skills/conventional-commits/SKILL.md:
- Line 12: The file is missing a top-level H1 heading which triggers
markdownlint MD041; add a single top-level heading (e.g., "# Conventional
Commits" or similar) immediately before the existing "## When to Use" heading in
.agents/skills/conventional-commits/SKILL.md so the file starts with an H1 and
satisfies the linter.
In `@clients/agent-runtime/src/config/schema.rs`:
- Around line 985-997: MemoryCerebroConfig's docs advertise ws/wss but
validate_cerebro_endpoint() and the egress matching logic in
clients/agent-runtime/src/security/egress.rs currently allow websocket schemes
while reqwest::Client::post() only supports http/https; update
validate_cerebro_endpoint() to explicitly reject "ws" and "wss" schemes (allow
only "http" and "https") and update the corresponding
matching/allow_insecure_loopback logic in the egress code to treat ws/wss as
invalid, or alternatively implement a websocket transport used by the code paths
that call reqwest::Client so the schemes are actually supported—pick one
approach and make the validation and egress matching consistent with
MemoryCerebroConfig's advertised schemes.
In `@clients/agent-runtime/src/security/pairing.rs`:
- Around line 190-199: The generate_token function currently composes two
UUIDv4s which only yield ~244 bits of randomness due to RFC4122 fixed bits;
replace that with true CSPRNG-generated 32 bytes (e.g., fill a [u8;32] via
getrandom/OsRng/random_device) and then hex-encode them to produce the
64-character token, or alternatively update the docstring to state the actual
~244-bit entropy if you intentionally keep UUIDs; locate the generate_token
function to implement the CSPRNG fill (or update the comment) and ensure the
rest of the code expecting a 64-char hex token remains unchanged.
In `@clients/agent-runtime/src/tools/memory_recall.rs`:
- Around line 114-116: The runtime calls in memory_recall.rs (calls to
local.recall, cerebro::cerebro_tool_adapter, adapter.execute, and
normalize_legacy_recall_output) currently propagate errors with `?`; change them
to catch failures and return a structured ToolResult with success: false
(matching the existing validation error shape) instead of bubbling Err. For each
call wrap the await/execute in a match or map_err->Ok(ToolResult { success:
false, output: <empty_or_preview>, error: Some(detail) }) so
transport/backend/schema errors become a ToolResult payload rather than an Err;
mirror the same error serialization used by the file’s validation errors and
ensure function signatures still return Result<ToolResult, _> but only use Err
for unrecoverable internal issues. Use the same pattern for the blocks around
local.recall, cerebro::cerebro_tool_adapter, adapter.execute, and
normalize_legacy_recall_output so all runtime failures conform to the ToolResult
contract.
In `@modules/cerebro/src/tools.rs`:
- Around line 382-390: The update path is missing applying input.input.metadata
and does not update any timestamp when modifying a record; locate the block that
sets record.topic_key and record.scope (the method that ends with
self.storage.save(record).await?) and set the record's metadata from
input.input.metadata (e.g., if let Some(meta) = input.input.metadata {
record.metadata = meta }) and also update a modification time (either set
record.timestamp = Utc::now() or prefer adding/setting record.updated_at =
Utc::now() depending on the model) before calling
self.storage.save(record).await; ensure the types for metadata and
timestamp/updated_at match the Record struct.
- Around line 173-207: Add the same audit-only guard used by mem_timeline to
mem_search and mem_get_observation: inside the mem_search and
mem_get_observation handler functions, after parsing the input (where you
reference input.input.include_deleted), check if
input.input.include_deleted.unwrap_or(false) && !auth.is_audit and return
Err(CerebroError::Forbidden("include_deleted requires audit
permissions".to_string())); this ensures deleted-record access is restricted
consistently across mem_search, mem_get_observation (and mirrors the existing
logic in mem_timeline).
- Around line 279-282: The JSON response always sets "truncated": false; update
the code that constructs the response (the Ok(json!({ "results": items,
"truncated": ... }))) to compute the truncated flag from the query/result
logic—e.g., derive it from a has_more/was_truncated boolean or from items.len()
compared to the requested limit (use items.len() >= limit if you fetched up to
limit+1 or appropriate sentinel), and set "truncated" to that computed value
instead of the hardcoded false so pagination can be signaled correctly.
In `@openspec/specs/cerebro/spec.md`:
- Around line 20-23: Update the broken external reference in the "Cerebro MCP
Tool Surface" requirement by replacing the incorrect path string
'openspec/changes/cerebro/cerebro.md' with the correct location
'openspec/changes/archive/2026-03-16-cerebro/cerebro.md' (or alternatively
move/copy the target file to the original path so the existing reference remains
valid); change the path in the spec line that currently references
'openspec/changes/cerebro/cerebro.md' so the link resolves to the actual file.
---
Outside diff comments:
In `@clients/agent-runtime/src/memory/backend.rs`:
- Around line 93-100: Add a focused unit test asserting that
classify_memory_backend returns MemoryBackendKind::LegacySurreal (or that
memory_backend_profile maps a legacy backend string) yields CUSTOM_PROFILE to
lock in the migration behavior introduced at memory_backend_profile;
specifically, add a test that calls classify_memory_backend with the legacy
Surreal identifier or calls memory_backend_profile with the legacy backend
string and asserts equality with CUSTOM_PROFILE (use the
MemoryBackendKind::LegacySurreal and memory_backend_profile symbols to locate
code), and mirror the same focused assertion for the other case noted around
lines 153-159 to prevent regressions.
---
Duplicate comments:
In `@clients/agent-runtime/src/config/schema.rs`:
- Around line 2848-2851: The error message for the SurrealDB cases incorrectly
tells users to "Use the Cerebro backend" which conflicts with the allowlist for
memory.backend; update the error text in the match arm handling "surreal" |
"surreal-graphs" (where self.memory.backend is referenced) to instruct users to
keep memory.backend set to a supported value (sqlite, lucid, markdown, or none)
and configure the Cerebro settings under the [memory.cerebro] table/section
instead of changing memory.backend; modify the anyhow::bail(...) message
accordingly so it points to the correct configuration key and migration doc.
In `@clients/agent-runtime/src/tools/mcp/client.rs`:
- Around line 254-268: The code currently calls serde_json::from_slice(&body)
before checking HTTP status, causing non-JSON error responses to produce a
misleading JSON-deserialization error; change the flow so you first check
status.is_success() and only call serde_json::from_slice on the success branch.
On the non-success branch (using status and body), build the anyhow::bail! error
including self.server.name and name and include the raw body as a UTF-8 string
(with a fallback representation if decoding fails) under "details" so transport
errors show the real status and body; leave JSON parsing of payload to the
successful path where payload is set from serde_json::from_slice(&body).
In `@clients/agent-runtime/src/tools/memory_recall.rs`:
- Around line 11-12: MemoryRecallTool's remote (Cerebro) path lacks security
checks: inject an Arc<SecurityPolicy> into the MemoryRecallTool (alongside
MemoryCerebroConfig and local: Option<Arc<dyn Memory>>), ensure the remote
branch enforces ToolOperation::Read via the SecurityPolicy before doing work,
and call enforce_cerebro_egress(&policy, &cerebro, ...) to validate outbound
egress prior to constructing the Cerebro adapter; update all Cerebro-related
branches (creation points referenced by MemoryRecallTool, MemoryCerebroConfig
and the adapter construction sites) to perform these checks so
filesystem/network execution is not expanded without explicit policy
enforcement.
In `@clients/agent-runtime/src/tools/memory_store.rs`:
- Around line 202-203: The calls local.store(...),
cerebro::cerebro_tool_adapter(...), and adapter.execute(...) currently use the ?
operator which lets backend/network errors escape the Tool boundary; change each
to catch/map errors and return a ToolResult with success: false and a structured
error (mirroring the existing deny-path handling) instead of propagating Err.
Specifically, replace usages of local.store(...) .await?,
cerebro::cerebro_tool_adapter(...)? and adapter.execute(...)? with explicit
match/map_err that constructs the same ToolResult error shape used in the deny
branch (include context text, error.to_string(), and any appropriate
error_kind), returning Ok(ToolResult { success: false, error: Some(...) }) so
failures are surfaced as tool failures rather than panics/propagation.
- Around line 181-188: The current sensitive-data guard only checks `content`
but must also validate and sanitize outbound fields `key` (used as `topic_key`)
and any custom `category` values in `metadata` before taking either the local or
Cerebro path; update the logic in the function in
clients/agent-runtime/src/tools/memory_store.rs to call
`contains_sensitive_data` (or equivalent sanitizer) on `key` and each
`metadata.category` entry and return the same `ToolResult` with
`SENSITIVE_DATA_ERROR` when any of those checks fail, ensuring both the
local-path branch and the Cerebro-path branch never forward unsanitized
`topic_key` or `category` values.
In `@modules/cerebro/src/tools.rs`:
- Around line 227-234: The code currently uses input.input.topic_key as
memory_id which causes new saves to overwrite existing MemoryRecord primary
keys; change the memory_id generation in the mem_save path to a unique ID (e.g.,
Uuid::new_v4().to_string() or another unique generator) and keep
input.input.topic_key as the topic field passed into MemoryRecord::new, then
call self.storage.save(record).await? as before; update imports to use the UUID
generator (e.g., add uuid::Uuid) or your project’s unique-id helper so each
MemoryRecord created by MemoryRecord::new has a distinct primary key instead of
topic_key.
In `@openspec/specs/cerebro/spec.md`:
- Around line 112-127: Add a new "Authentication and authorization requirements"
section after the secure-transport requirement (after the current secure
endpoint scenarios) that mandates the runtime MUST authenticate to Cerebro MCP
using one of: Bearer token via the Authorization header, an API key supplied via
an auth_token configuration parameter, or mutual TLS (mTLS) with client
certificate verification; require auth credentials be stored securely (env vars
or secret manager), support token/credential rotation without runtime restarts,
and log failed auth attempts with configurable backoff/circuit-breaker behavior;
additionally require Cerebro to enforce least-privilege scopes per client token
(recommend minimal scopes such as mem_save, mem_search, mem_delete) and state
that runtimes should use tokens limited to those scopes.
- Around line 59-64: Update the PII/secret guard text to specify detection modes
and developer workflow: state that the runtime MUST implement automated
PII/secret detection (e.g., pattern matching and optional ML models) as the
default and that manual developer-provided annotations/flags are OPTIONAL and
override/augment automated results; describe the expected developer workflow for
annotations (how to mark data as sensitive or non-sensitive when calling
mem_save or invoking MCP, and that annotated fields are treated as local-only),
require the runtime to return a structured error when either automated detection
or an annotation blocks an MCP call, and include concrete examples of sensitive
content (credentials, SSNs, credit card numbers, OAuth tokens, private keys,
health data, and PII such as full names + DOB) and non-sensitive examples to
guide implementers; reference the MCP boundary, mem_save, and the "local
PII/secret guard" so implementers know which components must enforce these
rules.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 2a06c650-abaf-4fd1-a3bd-6955bad3a99c
⛔ Files ignored due to path filters (1)
clients/web/pnpm-lock.yamlis excluded by!**/pnpm-lock.yaml
📒 Files selected for processing (11)
.agents/AGENTS.md.agents/skills/conventional-commits/SKILL.mdclients/agent-runtime/src/config/schema.rsclients/agent-runtime/src/memory/backend.rsclients/agent-runtime/src/security/pairing.rsclients/agent-runtime/src/tools/mcp/client.rsclients/agent-runtime/src/tools/memory_recall.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/tests/memory_cerebro_integration.rsmodules/cerebro/src/tools.rsopenspec/specs/cerebro/spec.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: sonar
- GitHub Check: pr-checks
- GitHub Check: pr-checks
- GitHub Check: Cloudflare Pages
🧰 Additional context used
📓 Path-based instructions (9)
**/*.{md,mdx}
⚙️ CodeRabbit configuration file
**/*.{md,mdx}: Verify technical accuracy and that docs stay aligned with code changes.
For user-facing docs, check EN/ES parity or explicitly note pending translation gaps.
Files:
openspec/specs/cerebro/spec.md
**/*
⚙️ CodeRabbit configuration file
**/*: Security first, performance second.
Validate input boundaries, auth/authz implications, and secret management.
Look for behavioral regressions, missing tests, and contract breaks across modules.
Files:
openspec/specs/cerebro/spec.mdclients/agent-runtime/src/tools/mcp/client.rsmodules/cerebro/src/tools.rsclients/agent-runtime/tests/memory_cerebro_integration.rsclients/agent-runtime/src/security/pairing.rsclients/agent-runtime/src/memory/backend.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/src/tools/memory_recall.rsclients/agent-runtime/src/config/schema.rs
clients/agent-runtime/src/tools/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Implement
Tooltrait insrc/tools/with strict parameter schema, validate and sanitize all inputs, and return structuredToolResultwithout panics in runtime path
Files:
clients/agent-runtime/src/tools/mcp/client.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/src/tools/memory_recall.rs
clients/agent-runtime/src/{security,gateway,tools}/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Treat
src/security/,src/gateway/,src/tools/as high-risk surfaces and never broaden filesystem/network execution scope without explicit policy checks
Files:
clients/agent-runtime/src/tools/mcp/client.rsclients/agent-runtime/src/security/pairing.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/src/tools/memory_recall.rs
clients/agent-runtime/src/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
clients/agent-runtime/src/**/*.rs: Never log secrets, tokens, raw credentials, or sensitive payloads in any logging statements
Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency
Files:
clients/agent-runtime/src/tools/mcp/client.rsclients/agent-runtime/src/security/pairing.rsclients/agent-runtime/src/memory/backend.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/src/tools/memory_recall.rsclients/agent-runtime/src/config/schema.rs
clients/agent-runtime/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Run
cargo fmt --all -- --check,cargo clippy --all-targets -- -D warnings, andcargo testfor code validation, or document which checks were skipped and why
Files:
clients/agent-runtime/src/tools/mcp/client.rsclients/agent-runtime/tests/memory_cerebro_integration.rsclients/agent-runtime/src/security/pairing.rsclients/agent-runtime/src/memory/backend.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/src/tools/memory_recall.rsclients/agent-runtime/src/config/schema.rs
clients/agent-runtime/src/{security,gateway,tools,config}/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable
Files:
clients/agent-runtime/src/tools/mcp/client.rsclients/agent-runtime/src/security/pairing.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/src/tools/memory_recall.rsclients/agent-runtime/src/config/schema.rs
**/*.rs
⚙️ CodeRabbit configuration file
**/*.rs: Focus on Rust idioms, memory safety, and ownership/borrowing correctness.
Flag unnecessary clones, unchecked panics in production paths, and weak error context.
Prioritize unsafe blocks, FFI boundaries, concurrency races, and secret handling.
Files:
clients/agent-runtime/src/tools/mcp/client.rsmodules/cerebro/src/tools.rsclients/agent-runtime/tests/memory_cerebro_integration.rsclients/agent-runtime/src/security/pairing.rsclients/agent-runtime/src/memory/backend.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/src/tools/memory_recall.rsclients/agent-runtime/src/config/schema.rs
.agents/AGENTS.md
📄 CodeRabbit inference engine (AGENTS.md)
.agents/AGENTS.md: Document agent configurations and capabilities in AGENTS.md
Maintain comprehensive agent metadata including name, description, purpose, and capabilities
Include version information and compatibility details for agents
Files:
.agents/AGENTS.md
🧠 Learnings (13)
📚 Learning: 2026-02-17T07:28:38.934Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-02-17T07:28:38.934Z
Learning: Applies to .agents/AGENTS.md : Document agent configurations and capabilities in AGENTS.md
Applied to files:
.agents/skills/conventional-commits/SKILL.md.agents/AGENTS.md
📚 Learning: 2026-02-17T07:28:38.934Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-02-17T07:28:38.934Z
Learning: Applies to .agents/AGENTS.md : Maintain comprehensive agent metadata including name, description, purpose, and capabilities
Applied to files:
.agents/skills/conventional-commits/SKILL.md.agents/AGENTS.md
📚 Learning: 2026-02-17T07:28:38.934Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-02-17T07:28:38.934Z
Learning: Applies to .agents/AGENTS.md : Include version information and compatibility details for agents
Applied to files:
.agents/skills/conventional-commits/SKILL.md.agents/AGENTS.md
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Use conventional commit titles and prefer small PRs (size: XS/S/M) when possible
Applied to files:
.agents/skills/conventional-commits/SKILL.md
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools,config}/**/*.rs : Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable
Applied to files:
openspec/specs/cerebro/spec.mdclients/agent-runtime/src/tools/mcp/client.rsmodules/cerebro/src/tools.rsclients/agent-runtime/src/memory/backend.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/src/tools/memory_recall.rsclients/agent-runtime/src/config/schema.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools}/**/*.rs : Treat `src/security/`, `src/gateway/`, `src/tools/` as high-risk surfaces and never broaden filesystem/network execution scope without explicit policy checks
Applied to files:
openspec/specs/cerebro/spec.mdclients/agent-runtime/src/tools/mcp/client.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/src/tools/memory_recall.rsclients/agent-runtime/src/config/schema.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/tools/**/*.rs : Implement `Tool` trait in `src/tools/` with strict parameter schema, validate and sanitize all inputs, and return structured `ToolResult` without panics in runtime path
Applied to files:
clients/agent-runtime/src/tools/mcp/client.rsmodules/cerebro/src/tools.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/src/tools/memory_recall.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/main.rs : Preserve CLI contract unless change is intentional and documented; prefer explicit errors over silent fallback for unsupported critical paths
Applied to files:
clients/agent-runtime/src/tools/mcp/client.rsmodules/cerebro/src/tools.rsclients/agent-runtime/src/memory/backend.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/src/tools/memory_recall.rsclients/agent-runtime/src/config/schema.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/main.rs : Keep startup path lean and avoid heavy initialization in command parsing flow
Applied to files:
clients/agent-runtime/src/tools/mcp/client.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/**/*.rs : Run `cargo fmt --all -- --check`, `cargo clippy --all-targets -- -D warnings`, and `cargo test` for code validation, or document which checks were skipped and why
Applied to files:
clients/agent-runtime/src/tools/mcp/client.rsclients/agent-runtime/tests/memory_cerebro_integration.rsclients/agent-runtime/src/tools/memory_store.rsclients/agent-runtime/src/config/schema.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/channels/**/*.rs : Implement `Channel` trait in `src/channels/` with consistent `send`, `listen`, and `health_check` semantics and cover auth/allowlist/health behavior with tests
Applied to files:
clients/agent-runtime/tests/memory_cerebro_integration.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/**/*.rs : Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency
Applied to files:
clients/agent-runtime/tests/memory_cerebro_integration.rsclients/agent-runtime/src/memory/backend.rsclients/agent-runtime/src/config/schema.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/**/*.rs : Never log secrets, tokens, raw credentials, or sensitive payloads in any logging statements
Applied to files:
clients/agent-runtime/src/tools/memory_store.rs
🪛 markdownlint-cli2 (0.21.0)
.agents/skills/conventional-commits/SKILL.md
[warning] 12-12: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
🔇 Additional comments (9)
.agents/AGENTS.md (1)
191-191: Skill reference path update is correct.The new path under
.agents/skills/...is consistent with the skills catalog structure and keeps references resolvable.As per coding guidelines: "Document agent configurations and capabilities in AGENTS.md."
.agents/skills/conventional-commits/SKILL.md (1)
20-26: Conventional Commits patterns and examples are solid.Rules, breaking-change handling, and command examples are clear and actionable for contributors.
Also applies to: 29-58
clients/agent-runtime/src/memory/backend.rs (3)
7-9: Legacy Surreal classification is correctly hardened.Good change: Line 88 now classifies legacy Surreal keys explicitly, which enables the runtime to fail with a migration error instead of treating them as generic unknowns.
Also applies to: 82-90
67-72: Selectable backend surface and test expectations are consistently updated.The 4-backend list and onboarding/order assertions are aligned, and the backend round-trip test now matches the supported set.
Also applies to: 136-143, 167-179, 243-249
103-250: Please confirm required Rust validation checks were run for this change.I don’t see execution evidence in the provided context for formatting/lint/tests. Please confirm results (or explicitly document skips and rationale):
cargo fmt --all -- --check,cargo clippy --all-targets -- -D warnings, andcargo test.As per coding guidelines: "clients/agent-runtime/**/*.rs: Run
cargo fmt --all -- --check,cargo clippy --all-targets -- -D warnings, andcargo testfor code validation, or document which checks were skipped and why".modules/cerebro/src/tools.rs (3)
330-359: LGTM on soft-delete visibility logic.The handler correctly distinguishes between deleted records based on
include_deleted. The auth gap forinclude_deletedwas flagged in thehandle()review.
398-414: LGTM on basic slug generation.Minimal but functional. The
scopeparameter andcandidatesarray are placeholders for future enhancement.
416-453: LGTM onmem_timelineandmem_stats.
mem_timelinecorrectly validates inputs and checks authorization before returningNotImplemented.mem_statsreturns realmemory_countwith placeholder zeros for unimplemented features—consistent with the tool surface.openspec/specs/cerebro/spec.md (1)
66-72: Migration guide reference is accurate and accessible. The file atclients/web/apps/docs/src/content/docs/guides/cerebro/migration.mdexists and contains proper SurrealDB-to-Cerebro migration steps with EN/ES parity documentation.
| version: "1.0" | ||
| --- | ||
|
|
||
| ## When to Use |
There was a problem hiding this comment.
Add a top-level heading to satisfy markdownlint MD041.
Lint reports missing H1 at file start (after front matter). Add a # ... heading before ## When to Use.
Proposed fix
---
name: conventional-commits
description: >
Conventional Commits specification and usage guidance.
Trigger: Creating commits, git messages, or commit guidelines.
license: Apache-2.0
metadata:
author: generic-author
version: "1.0"
---
+# Conventional Commits Skill
+
## When to Use📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ## When to Use | |
| --- | |
| name: conventional-commits | |
| description: > | |
| Conventional Commits specification and usage guidance. | |
| Trigger: Creating commits, git messages, or commit guidelines. | |
| license: Apache-2.0 | |
| metadata: | |
| author: generic-author | |
| version: "1.0" | |
| --- | |
| # Conventional Commits Skill | |
| ## When to Use |
🧰 Tools
🪛 markdownlint-cli2 (0.21.0)
[warning] 12-12: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.agents/skills/conventional-commits/SKILL.md at line 12, The file is missing
a top-level H1 heading which triggers markdownlint MD041; add a single top-level
heading (e.g., "# Conventional Commits" or similar) immediately before the
existing "## When to Use" heading in
.agents/skills/conventional-commits/SKILL.md so the file starts with an H1 and
satisfies the linter.
| pub struct MemoryCerebroConfig { | ||
| /// MCP endpoint URL, e.g. "https://cerebro.example.com/mcp" or "wss://cerebro.example.com/mcp" | ||
| #[serde(default)] | ||
| pub username: Option<String>, | ||
| /// SurrealDB password (ignored if token is provided) | ||
| pub endpoint: Option<String>, | ||
| /// MCP auth token for Cerebro | ||
| #[serde(default)] | ||
| pub password: Option<String>, | ||
| /// SurrealDB auth token (preferred over username/password) | ||
| pub auth_token: Option<String>, | ||
| /// Request timeout in milliseconds | ||
| #[serde(default = "default_cerebro_timeout_ms")] | ||
| pub request_timeout_ms: u64, | ||
| /// Allow plain HTTP/WS for loopback addresses only. | ||
| #[serde(default)] | ||
| pub token: Option<String>, | ||
| /// Allow plain HTTP for loopback addresses only. | ||
| #[serde(default = "default_true")] | ||
| pub allow_http_loopback: bool, | ||
| pub allow_insecure_loopback: bool, |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Does Rust's reqwest::Client::postsupportws://orwss://URLs, or onlyhttp://andhttps://?
💡 Result:
reqwest::Client::post is an HTTP client API and is intended for http:// and https:// URLs, not ws:// / wss:// WebSocket URLs. (docs.rs)
If you need WebSockets, use a WebSocket client crate (e.g. tokio-tungstenite) or an extension built for reqwest’s HTTP Upgrade handshake (e.g. reqwest-websocket adds an upgrade method to RequestBuilder). (docs.rs)
Citations:
- 1: https://docs.rs/reqwest%3A%3AUrl?utm_source=openai
- 2: https://docs.rs/crate/reqwest-websocket/latest?utm_source=openai
🏁 Script executed:
#!/bin/bash
# Search for validate_cerebro_endpoint function
rg "validate_cerebro_endpoint" clients/agent-runtime/src/config/schema.rs -A 15Repository: dallay/corvus
Length of output: 1348
🏁 Script executed:
#!/bin/bash
# Search for the MCP client code that uses reqwest
rg "reqwest.*Client.*post" clients/agent-runtime/src/tools/mcp/ -B 3 -A 3Repository: dallay/corvus
Length of output: 39
🏁 Script executed:
#!/bin/bash
# Confirm endpoint validation logic accepts ws/wss
rg "ws:/|wss:/|\"ws\"|\"wss\"" clients/agent-runtime/src/config/schema.rs -B 2 -A 2Repository: dallay/corvus
Length of output: 487
🏁 Script executed:
#!/bin/bash
# Find all files in the MCP tools directory
fd . clients/agent-runtime/src/tools/mcp/ -type fRepository: dallay/corvus
Length of output: 227
🏁 Script executed:
#!/bin/bash
# Search for where Cerebro endpoint is actually used at runtime
rg "MemoryCerebroConfig|cerebro.*endpoint" clients/agent-runtime/src/ --type rs -B 2 -A 2Repository: dallay/corvus
Length of output: 83
🏁 Script executed:
#!/bin/bash
# Search for reqwest usage in the runtime
rg "reqwest" clients/agent-runtime/src/ --type rs -lRepository: dallay/corvus
Length of output: 83
🏁 Script executed:
#!/bin/bash
# Find all files in the MCP tools directory
find clients/agent-runtime/src/tools/mcp/ -type f -name "*.rs"Repository: dallay/corvus
Length of output: 288
🏁 Script executed:
#!/bin/bash
# Search for where Cerebro endpoint is actually used
rg "MemoryCerebroConfig|cerebro.*endpoint" clients/agent-runtime/src/ -B 2 -A 2Repository: dallay/corvus
Length of output: 28947
🏁 Script executed:
#!/bin/bash
# Search for reqwest usage in the runtime
rg "reqwest" clients/agent-runtime/src/Repository: dallay/corvus
Length of output: 12134
🏁 Script executed:
#!/bin/bash
# Inspect the MCP client code to see exactly how it uses reqwest
cat -n clients/agent-runtime/src/tools/mcp/client.rs | head -100Repository: dallay/corvus
Length of output: 3823
🏁 Script executed:
#!/bin/bash
# Look for the actual HTTP method used in the Cerebro client
rg "post\|Client::new\|Client::builder" clients/agent-runtime/src/tools/mcp/cerebro.rs -B 3 -A 10Repository: dallay/corvus
Length of output: 39
🏁 Script executed:
#!/bin/bash
# See the full cerebro.rs file to understand how it invokes the MCP client
cat -n clients/agent-runtime/src/tools/mcp/cerebro.rsRepository: dallay/corvus
Length of output: 2077
🏁 Script executed:
#!/bin/bash
# Search for where HTTP requests are actually made to Cerebro endpoint
rg "call_tool_http" clients/agent-runtime/src/tools/mcp/client.rs -A 30Repository: dallay/corvus
Length of output: 2132
Remove websocket scheme support from Cerebro endpoint validation or implement websocket transport.
The config validator accepts ws:// and wss:// schemes, but reqwest::Client::post() only supports HTTP(S). A configuration with a wss://cerebro.example.com/mcp endpoint will pass validation and fail at runtime.
Either reject websocket schemes in validate_cerebro_endpoint() (lines 2970–3018) and the matching logic in clients/agent-runtime/src/security/egress.rs, or add actual websocket transport support before advertising these schemes in the struct docs.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clients/agent-runtime/src/config/schema.rs` around lines 985 - 997,
MemoryCerebroConfig's docs advertise ws/wss but validate_cerebro_endpoint() and
the egress matching logic in clients/agent-runtime/src/security/egress.rs
currently allow websocket schemes while reqwest::Client::post() only supports
http/https; update validate_cerebro_endpoint() to explicitly reject "ws" and
"wss" schemes (allow only "http" and "https") and update the corresponding
matching/allow_insecure_loopback logic in the egress code to treat ws/wss as
invalid, or alternatively implement a websocket transport used by the code paths
that call reqwest::Client so the schemes are actually supported—pick one
approach and make the validation and egress matching consistent with
MemoryCerebroConfig's advertised schemes.
| /// Uses UUID v4 (CSPRNG-backed) to produce 256-bit entropy | ||
| /// (/dev/urandom on Linux, BCryptGenRandom on Windows, SecRandomCopyBytes | ||
| /// on macOS). The 32 random bytes (256 bits) are hex-encoded for a | ||
| /// 64-character token, providing 256 bits of entropy. | ||
| fn generate_token() -> String { | ||
| use rand::RngCore; | ||
| let mut bytes = [0u8; 32]; | ||
| rand::rng().fill_bytes(&mut bytes); | ||
| let first = uuid::Uuid::new_v4(); | ||
| let second = uuid::Uuid::new_v4(); | ||
| bytes[..16].copy_from_slice(first.as_bytes()); | ||
| bytes[16..].copy_from_slice(second.as_bytes()); |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
For RFC 4122 / UUID v4, how many bits are random and which bits are fixed (version and variant)?
💡 Result:
For an RFC 4122 UUID v4 (128 bits total):
-
Fixed bits (6 bits total):
- Version: 4 bits set to
0100(the high nibble of thetime_hi_and_versionfield / octet 6). [1] - Variant (RFC 4122 / “Leach-Salz”): 2 bits set to
10(the two most significant bits ofclock_seq_hi_and_reserved/ octet 8). [1]
- Version: 4 bits set to
-
Random bits: 128 − 6 = 122 bits (all remaining bits are randomly/pseudorandomly chosen). [1]
Sources: RFC 4122, §4.4 (v4 algorithm) and related field definitions. [1]
Use full CSPRNG bytes for true 256-bit token entropy.
Line 190 documents 256-bit entropy, but lines 196–199 concatenate two UUIDv4 values, each providing only 122 random bits due to fixed version and variant bits specified in RFC 4122. This yields 244 random bits total, not the documented 256. In this security-critical path, either use raw CSPRNG output or update the security claim.
Suggested fix
fn generate_token() -> String {
let mut bytes = [0u8; 32];
- let first = uuid::Uuid::new_v4();
- let second = uuid::Uuid::new_v4();
- bytes[..16].copy_from_slice(first.as_bytes());
- bytes[16..].copy_from_slice(second.as_bytes());
+ rand::rngs::OsRng.fill_bytes(&mut bytes);
format!("zc_{}", hex::encode(bytes))
}+use rand::RngCore;🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clients/agent-runtime/src/security/pairing.rs` around lines 190 - 199, The
generate_token function currently composes two UUIDv4s which only yield ~244
bits of randomness due to RFC4122 fixed bits; replace that with true
CSPRNG-generated 32 bytes (e.g., fill a [u8;32] via
getrandom/OsRng/random_device) and then hex-encode them to produce the
64-character token, or alternatively update the docstring to state the actual
~244-bit entropy if you intentionally keep UUIDs; locate the generate_token
function to implement the CSPRNG fill (or update the comment) and ensure the
rest of the code expecting a 64-char hex token remains unchanged.
| pub async fn handle( | ||
| &self, | ||
| tool: &str, | ||
| payload: Value, | ||
| auth: &AuthContext, | ||
| ) -> Result<Value, CerebroError> { | ||
| match tool { | ||
| "mem_save" => self.mem_save(payload).await, | ||
| "mem_search" => self.mem_search(payload).await, | ||
| "mem_delete" => self.mem_delete(payload).await, | ||
| "mem_get_observation" => self.mem_get_observation(payload).await, | ||
| "mem_update" => self.mem_update(payload).await, | ||
| "mem_suggest_topic_key" => self.mem_suggest_topic_key(payload).await, | ||
| "mem_timeline" => self.mem_timeline(payload, auth).await, | ||
| "mem_save_prompt" => Err(CerebroError::NotImplemented( | ||
| "mem_save_prompt".to_string(), | ||
| )), | ||
| "mem_session_start" => Err(CerebroError::NotImplemented( | ||
| "mem_session_start".to_string(), | ||
| )), | ||
| "mem_session_end" => Err(CerebroError::NotImplemented( | ||
| "mem_session_end".to_string(), | ||
| )), | ||
| "mem_session_summary" => Err(CerebroError::NotImplemented( | ||
| "mem_session_summary".to_string(), | ||
| )), | ||
| "mem_context" => Err(CerebroError::NotImplemented( | ||
| "mem_context".to_string(), | ||
| )), | ||
| "mem_stats" => self.mem_stats(payload).await, | ||
| _ => Err(CerebroError::Validation(format!( | ||
| "unsupported tool '{tool}'", | ||
| ))), | ||
| } | ||
| } |
There was a problem hiding this comment.
Authorization check missing for include_deleted in other handlers.
mem_timeline correctly requires auth.is_audit for include_deleted=true, but mem_search (line 252) and mem_get_observation (line 341) allow accessing deleted records without any authorization check. Apply the same guard consistently.
- pub async fn handle(
+ pub async fn handle(
&self,
tool: &str,
payload: Value,
auth: &AuthContext,
) -> Result<Value, CerebroError> {
match tool {
- "mem_save" => self.mem_save(payload).await,
- "mem_search" => self.mem_search(payload).await,
+ "mem_save" => self.mem_save(payload).await,
+ "mem_search" => self.mem_search(payload, auth).await,
"mem_delete" => self.mem_delete(payload).await,
- "mem_get_observation" => self.mem_get_observation(payload).await,
+ "mem_get_observation" => self.mem_get_observation(payload, auth).await,Then add the same check inside those handlers:
if input.input.include_deleted.unwrap_or(false) && !auth.is_audit {
return Err(CerebroError::Forbidden(
"include_deleted requires audit permissions".to_string(),
));
}As per coding guidelines: "Do not silently weaken security policy or access constraints; keep default behavior secure-by-default."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/cerebro/src/tools.rs` around lines 173 - 207, Add the same audit-only
guard used by mem_timeline to mem_search and mem_get_observation: inside the
mem_search and mem_get_observation handler functions, after parsing the input
(where you reference input.input.include_deleted), check if
input.input.include_deleted.unwrap_or(false) && !auth.is_audit and return
Err(CerebroError::Forbidden("include_deleted requires audit
permissions".to_string())); this ensures deleted-record access is restricted
consistently across mem_search, mem_get_observation (and mirrors the existing
logic in mem_timeline).
| Ok(json!({ | ||
| "results": items, | ||
| "truncated": false | ||
| })) |
There was a problem hiding this comment.
truncated flag is always false.
When results hit the limit, truncated should be true to signal pagination. Currently hardcoded to false.
Suggested fix
+ let truncated = results.len() >= limit;
+
let items: Vec<Value> = results
.into_iter()
// ...
Ok(json!({
"results": items,
- "truncated": false
+ "truncated": truncated
}))🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/cerebro/src/tools.rs` around lines 279 - 282, The JSON response
always sets "truncated": false; update the code that constructs the response
(the Ok(json!({ "results": items, "truncated": ... }))) to compute the truncated
flag from the query/result logic—e.g., derive it from a has_more/was_truncated
boolean or from items.len() compared to the requested limit (use items.len() >=
limit if you fetched up to limit+1 or appropriate sentinel), and set "truncated"
to that computed value instead of the hardcoded false so pagination can be
signaled correctly.
| async fn mem_delete(&self, payload: Value) -> Result<Value, CerebroError> { | ||
| let input: ToolInput<MemDeleteRequest> = serde_json::from_value(payload) | ||
| .map_err(|err| CerebroError::Validation(err.to_string()))?; | ||
| let _ = input.input.reason.as_deref(); | ||
| let hard_delete = input.input.hard_delete.unwrap_or(false); | ||
| let memory_id = match ( | ||
| input.input.memory_id.as_deref(), | ||
| input.input.topic_key.as_deref(), | ||
| ) { | ||
| (Some(id), _) => { | ||
| require_non_empty("memory_id", id)?; | ||
| id.to_string() | ||
| } | ||
| (None, Some(topic_key)) => { | ||
| require_non_empty("topic_key", topic_key)?; | ||
| let record = self | ||
| .storage | ||
| .search("", 1, true, None, Some(topic_key)) | ||
| .await? | ||
| .into_iter() | ||
| .next() | ||
| .ok_or(CerebroError::NotFound)?; | ||
| record.memory_id | ||
| } | ||
| _ => { | ||
| return Err(CerebroError::Validation( | ||
| "memory_id or topic_key must be provided".to_string(), | ||
| )); | ||
| } | ||
| }; | ||
|
|
||
| let deleted = self.storage.delete(&memory_id, hard_delete).await?; | ||
| let status = if hard_delete { | ||
| "hard_deleted" | ||
| } else { | ||
| "soft_deleted" | ||
| }; | ||
|
|
||
| Ok(json!({ | ||
| "memory_id": memory_id, | ||
| "status": status, | ||
| "deleted": deleted | ||
| })) | ||
| } |
There was a problem hiding this comment.
Hard delete lacks authorization check.
hard_delete=true permanently removes data but doesn't require elevated permissions. Soft delete is auditable, hard delete is not—apply an auth guard.
+ if hard_delete && !auth.is_audit {
+ return Err(CerebroError::Forbidden(
+ "hard_delete requires audit permissions".to_string(),
+ ));
+ }Additionally, the reason field (line 288) is extracted but discarded. Consider logging it for audit purposes.
| if let Some(topic_key) = input.input.topic_key { | ||
| record.topic_key = topic_key; | ||
| } | ||
|
|
||
| if let Some(scope) = input.input.scope { | ||
| record.scope = scope; | ||
| } | ||
|
|
||
| self.storage.save(record).await?; |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
Timestamp not updated on modification.
When a record is updated, record.timestamp retains its original value. This makes it impossible to track when records were last modified. Consider updating the timestamp or adding a separate updated_at field.
Also, input.input.metadata is declared but never applied to the record.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@modules/cerebro/src/tools.rs` around lines 382 - 390, The update path is
missing applying input.input.metadata and does not update any timestamp when
modifying a record; locate the block that sets record.topic_key and record.scope
(the method that ends with self.storage.save(record).await?) and set the
record's metadata from input.input.metadata (e.g., if let Some(meta) =
input.input.metadata { record.metadata = meta }) and also update a modification
time (either set record.timestamp = Utc::now() or prefer adding/setting
record.updated_at = Utc::now() depending on the model) before calling
self.storage.save(record).await; ensure the types for metadata and
timestamp/updated_at match the Record struct.
| ### Requirement: Cerebro MCP Tool Surface | ||
|
|
||
| The Cerebro module MUST expose the MCP tool set defined in | ||
| `openspec/changes/cerebro/cerebro.md` and return structured, typed errors for invalid requests. |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Verify the referenced Cerebro tool contract file exists and contains expected tool definitions
# Check if the referenced spec file exists
if [ -f "openspec/changes/cerebro/cerebro.md" ]; then
echo "✓ File exists: openspec/changes/cerebro/cerebro.md"
echo ""
echo "=== Checking for tool definitions ==="
# Search for the three main tool definitions
rg -n "mem_save|mem_search|mem_delete" openspec/changes/cerebro/cerebro.md -C 2
else
echo "✗ File not found: openspec/changes/cerebro/cerebro.md"
exit 1
fiRepository: dallay/corvus
Length of output: 110
🏁 Script executed:
# Check directory structure of openspec/changes
find openspec/changes -type f -name "*.md" 2>/dev/null | head -20
# Also check openspec/ root structure
find openspec -type d -name "cerebro" 2>/dev/null
# List all files in openspec/changes if directory exists
if [ -d "openspec/changes" ]; then
echo "=== Files in openspec/changes ==="
ls -la openspec/changes/ 2>/dev/null || echo "Directory exists but no listing"
fi
# Check if spec.md file exists and get context around line 23
if [ -f "openspec/specs/cerebro/spec.md" ]; then
echo ""
echo "=== Context around line 23 of spec.md ==="
sed -n '20,25p' openspec/specs/cerebro/spec.md
fiRepository: dallay/corvus
Length of output: 2057
🏁 Script executed:
# Check the archived cerebro directory structure
find openspec/changes/archive/2026-03-16-cerebro -type f -name "*.md" | head -20
# Check openspec/specs/cerebro directory
find openspec/specs/cerebro -type f -name "*.md" 2>/dev/null
# Look for any cerebro.md file anywhere in openspec/
find openspec -name "cerebro.md" 2>/dev/null
# Check what's in the archive specs/cerebro directory
if [ -d "openspec/changes/archive/2026-03-16-cerebro/specs/cerebro" ]; then
echo "=== Files in archive specs/cerebro ==="
ls -la openspec/changes/archive/2026-03-16-cerebro/specs/cerebro/
fiRepository: dallay/corvus
Length of output: 694
🏁 Script executed:
# Check the content of the actual cerebro.md file
if [ -f "openspec/changes/archive/2026-03-16-cerebro/cerebro.md" ]; then
echo "=== Checking archive cerebro.md for tool definitions ==="
rg -n "mem_save|mem_search|mem_delete" openspec/changes/archive/2026-03-16-cerebro/cerebro.md -C 1
echo ""
echo "=== First 50 lines of archive cerebro.md ==="
head -50 openspec/changes/archive/2026-03-16-cerebro/cerebro.md
fiRepository: dallay/corvus
Length of output: 3492
Fix broken external reference path.
Line 23 references openspec/changes/cerebro/cerebro.md, but this file does not exist at that location. The actual file is at openspec/changes/archive/2026-03-16-cerebro/cerebro.md. Update the reference path to point to the correct location, or move/copy the file to the referenced path to maintain the spec's external contract.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@openspec/specs/cerebro/spec.md` around lines 20 - 23, Update the broken
external reference in the "Cerebro MCP Tool Surface" requirement by replacing
the incorrect path string 'openspec/changes/cerebro/cerebro.md' with the correct
location 'openspec/changes/archive/2026-03-16-cerebro/cerebro.md' (or
alternatively move/copy the target file to the original path so the existing
reference remains valid); change the path in the spec line that currently
references 'openspec/changes/cerebro/cerebro.md' so the link resolves to the
actual file.
|


This pull request migrates Corvus' long-term memory backend from SurrealDB to a new MCP-backed Cerebro service. All references, configuration options, documentation, and code paths related to SurrealDB have been removed or replaced. The runtime now uses the Cerebro service for long-term memory, while SQLite and Markdown remain as options for local, short-term memory. The agent's memory loading logic is updated to select Cerebro if configured. Documentation and configuration examples are updated to guide users through the transition, including migration resources.
Memory Backend Migration and Code Updates:
CerebroMemoryLoaderwhen a Cerebro endpoint is configured, otherwise defaulting to the local loader. Added theCerebroMemoryLoaderimplementation. [1] [2] [3] [4] [5] [6] [7]Documentation and Example Updates:
README.md, example code, and configuration samples) to reference Cerebro instead of SurrealDB, including new configuration keys, migration guides, and schema documentation. [1] [2] [3] [4]Migration Resources:
This update ensures a unified, secure, and maintainable memory backend for Corvus, with clear migration paths for existing users.