fix: Harden sandbox isolation and update allowed status handling#398
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughSummary by CodeRabbit
WalkthroughThis PR wires an injected OS sandbox into tool construction and shell execution, makes sandbox detection failable via a new Changes
Sequence Diagram(s)sequenceDiagram
participant Agent as Agent
participant ShellTool as ShellTool
participant Sandbox as Sandbox
participant Audit as AuditLogger
participant Proc as Process
Agent->>ShellTool: execute(command)
ShellTool->>ShellTool: validate via SecurityPolicy
ShellTool->>ShellTool: sanitize environment
ShellTool->>Sandbox: wrap_command(&mut cmd)
alt wrap_command succeeds
Sandbox-->>ShellTool: wrapped command
ShellTool->>Proc: spawn & await (timeout)
Proc-->>ShellTool: exit status, stdout/stderr
ShellTool->>Audit: log_shell_audit_event(command, approved, risk_level, sandbox_backend)
Audit-->>ShellTool: recorded
ShellTool-->>Agent: ToolResult { success, structured }
else wrap_command fails
Sandbox-->>ShellTool: error
ShellTool->>Audit: log_shell_audit_event(..., sandbox_backend)
ShellTool-->>Agent: ToolResult { success: false, error (sandbox) }
end
sequenceDiagram
participant Agent as Agent
participant BrowserTool as BrowserTool
participant Sidecar as Sidecar
participant Audit as AuditLogger
Agent->>BrowserTool: execute_computer_use_action(action)
alt first action & verification required
BrowserTool->>BrowserTool: derive health URL
BrowserTool->>Sidecar: GET /v1/health (bearer auth, timeout)
alt sidecar responds healthy
Sidecar-->>BrowserTool: {status:"healthy", isolation}
BrowserTool->>BrowserTool: cache SidecarHealthStatus
BrowserTool->>Audit: log_browser_security_event(action, sidecar_health)
else sidecar fails
alt require=true
BrowserTool-->>Agent: error (reject action)
else require=false
BrowserTool->>BrowserTool: warn, cache failure
BrowserTool->>Audit: log_browser_security_event(action, sidecar_health=None)
end
end
else use cached or skip check
end
BrowserTool->>Sidecar: POST action
Sidecar-->>BrowserTool: response
BrowserTool-->>Agent: ToolResult { structured: { computer_use: { action, sidecar_health } } }
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
✅ Contributor ReportUser: @yacosta738
Contributor Report evaluates based on public GitHub activity. Analysis period: 2025-04-03 to 2026-04-03 |
…sandbox isolation guide
Deploying corvus with
|
| Latest commit: |
47a2835
|
| Status: | ✅ Deploy successful! |
| Preview URL: | https://fac2b55f.corvus-42x.pages.dev |
| Branch Preview URL: | https://runtime-sandbox.corvus-42x.pages.dev |
There was a problem hiding this comment.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
clients/agent-runtime/src/security/detect.rs (1)
117-145:⚠️ Potential issue | 🟠 MajorProbe Bubblewrap in Linux auto-detection too.
create_sandbox()accepts an explicitBubblewrapbackend on Linux above, butdetect_best_sandbox()never probes it in the Linux path. A Linux host where Bubblewrap is the only available backend will incorrectly fall back toNoopSandboxor fail startup in require mode.🛠️ Minimal fix
#[cfg(target_os = "linux")] { // Try Landlock first (native, no dependencies) #[cfg(feature = "sandbox-landlock")] { if let Ok(sandbox) = super::landlock::LandlockSandbox::probe() { tracing::info!("Landlock sandbox enabled (Linux kernel 5.13+)"); return Ok(Arc::new(sandbox)); } } // Try Firejail second (user-space tool) if let Ok(sandbox) = super::firejail::FirejailSandbox::probe() { tracing::info!("Firejail sandbox enabled"); return Ok(Arc::new(sandbox)); } + + #[cfg(feature = "sandbox-bubblewrap")] + { + if let Ok(sandbox) = super::bubblewrap::BubblewrapSandbox::probe() { + tracing::info!("Bubblewrap sandbox enabled"); + return Ok(Arc::new(sandbox)); + } + } }Based on learnings "Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/security/detect.rs` around lines 117 - 145, The Linux detection path in detect_best_sandbox() doesn't probe Bubblewrap, so add probing for super::bubblewrap::BubblewrapSandbox::probe() inside the #[cfg(target_os = "linux")] block (similar to how Landlock and Firejail are probed) so that detect_best_sandbox() can return an Arc::new(bubblewrap) when available; ensure the probe block is gated by #[cfg(feature = "sandbox-bubblewrap")] to match create_sandbox() and keep behavior consistent with require-mode/security expectations.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@clients/agent-runtime/src/agent/agent.rs`:
- Around line 608-623: The audit write currently swallows failures in strict
mode by only changing log level inside the execute_tool_call path where
logger.log_command_event(CommandExecutionLog { .. }) is called; change the
logging helper(s) to return Result<(), AuditError> (or appropriate error type)
instead of unit, update the call sites in execute_tool_call to propagate errors
(use ? or map_err) so that when self.audit_strict is true the function returns
early on audit write failure, and mirror the same change for the other audit
call sites (the similar block around record_code_session_result) so strict mode
consistently fails the overall operation instead of only lowering the log level.
- Around line 586-617: The command string is recorded verbatim into
CommandExecutionLog via logger.log_command_event, which can leak secrets; before
building the CommandExecutionLog, sanitize/redact the `command` value (the local
variable `command`) using the same redaction logic used elsewhere in this module
(or add a helper like `redact_command`/`sanitize_shell_command`) to strip inline
secrets (passwords, tokens, auth headers, environment substitutions, or long
hex/ssh-like tokens) and then pass the sanitized value into CommandExecutionLog
so logger.log_command_event never receives raw sensitive payloads.
In `@clients/agent-runtime/src/bootstrap/mod.rs`:
- Around line 227-233: The sandbox is created too late; move the call to
crate::security::create_sandbox(&config.security) so it runs before
init_memory_and_observer_with_overrides(...) and create_runtime(...) to allow
security.sandbox.require = true to fail fast. Update the sequence in
bootstrap/mod.rs so sandbox is created first, then initialize memory/observer
via init_memory_and_observer_with_overrides, then call create_runtime, and
finally pass the created sandbox into tools::all_tools_with_runtime(...) (ensure
the variable names remain sandbox and are Arc/cloned as needed).
In `@clients/agent-runtime/src/security/audit.rs`:
- Around line 251-252: The call currently clones entry.sandbox_backend before
passing it to with_security; since entry is owned by log_command_event(), change
the code to move the backend into with_security instead of cloning it (i.e.,
pass entry.sandbox_backend directly to with_security), ensuring you no longer
call clone() on sandbox_backend in the chain that also uses with_result; update
any subsequent uses of entry after the move accordingly or reorder operations so
the move is valid.
In `@clients/agent-runtime/src/tools/browser.rs`:
- Around line 353-389: verify_computer_use_sidecar has a TOCTOU race: it
releases sidecar_health before calling fetch_computer_use_sidecar_health so
concurrent callers can all trigger the fetch; replace this ad-hoc caching with a
true once-init primitive (tokio::sync::OnceCell) for the sidecar health, or use
OnceCell/get_or_init semantics so the first caller runs
fetch_computer_use_sidecar_health and others await that result; update the
sidecar_health field and adjust verify_computer_use_sidecar to call
OnceCell::get_or_try_init (or equivalent) and reference
fetch_computer_use_sidecar_health and SidecarHealthStatus accordingly to ensure
only one health request is performed under concurrency.
In `@clients/agent-runtime/src/tools/shell.rs`:
- Around line 703-729: Replace the manual unsafe env manipulation and manual
restoration for CORVUS_TEST_SECRET with the module's EnvGuard RAII pattern and
mark the test with #[tokio::test(flavor = "current_thread")]; specifically,
remove the unsafe std::env::set_var / remove_var and the previous-value restore
logic around CORVUS_TEST_SECRET, create an EnvGuard instance at the start of the
test and use it to set CORVUS_TEST_SECRET to "super-secret" (so it will be
restored on drop), keep the rest of the test (EnvInspectSandbox, ShellTool::new,
execute call, and assertions) unchanged, and ensure the test function has the
current_thread tokio flavor to serialize env-dependent tests.
In `@clients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.md`:
- Around line 1-3: This file is missing the required docs frontmatter before the
H1 "Runtime Sandbox Isolation"; add a YAML frontmatter block (--- ... ---)
immediately at the top of the file containing the minimal required fields
(title, description, and any required metadata per the docs pipeline) so the
page builds and is indexed, then keep the existing "Runtime Sandbox Isolation"
heading and content unchanged.
In `@openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.md`:
- Around line 95-97: The rollback description incorrectly states that setting
sandbox.require = false and sandbox.backend = None restores pre-change behavior;
in reality backend = None still routes through NoopSandbox and leaves the new
mutating-command warning path active (so only reverting the PR removes
warnings). Update the prose in this section to state that a config-only change
(sandbox.require = false and sandbox.backend = None) disables enforcement but
does not remove the new warning behavior, and clarify that full removal of the
warnings requires reverting the PR (undoing the ShellTool field and its call) or
a code rollback; reference sandbox.require, sandbox.backend, NoopSandbox,
ShellTool, and the mutating-command warning path in the text so operators know
exactly what each option does.
In
`@openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.md`:
- Line 65: The scenario title "**R2-S3: Require mode with auto finds nothing
fails**" is awkward; update that header to a clearer phrasing such as "Require
mode with auto-detection finds no backend" (or similar) so the intent is
immediately readable; locate and replace the exact header string in the spec
file (the title line currently: R2-S3: Require mode with auto finds nothing
fails) with the new wording.
In
`@openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md`:
- Line 1: Replace the second-level heading "## Verification Report" with a
top-level heading "# Verification Report" and ensure you add a blank line both
before and after every heading and table in this document to resolve
MD041/MD022/MD058; specifically update the "## Verification Report" heading and
any other headings/tables mentioned (occurrences around lines referenced: 8-9,
92-93, 104-105, 135) so each heading is preceded and followed by a single blank
line and tables have the required blank-line padding to make the markdownlint
checks pass.
- Around line 76-78: Update the compliance status for R4-S2 and R4-S3 in
verify-report.md to "Partial" until runtime propagation is validated: change the
entries corresponding to tests audit_log_command_event_writes_structured_entry
and audit_log_command_event_records_real_sandbox_backend to reflect partial
compliance because the current tests in
clients/agent-runtime/src/security/audit.rs only inject values into
CommandExecutionLog (proving serialization) rather than asserting that
NoopSandbox or the real sandbox backend name is emitted via the actual
shell/browser execution path; add a short note referencing CommandExecutionLog
and the need for an end-to-end runtime propagation test to flip these back to
compliant.
---
Outside diff comments:
In `@clients/agent-runtime/src/security/detect.rs`:
- Around line 117-145: The Linux detection path in detect_best_sandbox() doesn't
probe Bubblewrap, so add probing for
super::bubblewrap::BubblewrapSandbox::probe() inside the #[cfg(target_os =
"linux")] block (similar to how Landlock and Firejail are probed) so that
detect_best_sandbox() can return an Arc::new(bubblewrap) when available; ensure
the probe block is gated by #[cfg(feature = "sandbox-bubblewrap")] to match
create_sandbox() and keep behavior consistent with require-mode/security
expectations.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: aae5ee38-5750-4b09-ad34-d325e5d653bc
📒 Files selected for processing (20)
clients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/bootstrap/mod.rsclients/agent-runtime/src/config/schema.rsclients/agent-runtime/src/security/audit.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/security/mod.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/mod.rsclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/tests/mcp_native_regression.rsclients/agent-runtime/tests/mcp_runtime_e2e.rsclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/design.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/exploration.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/state.yamlopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/tasks.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/specs/sandbox-hardening/spec.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: sonar
- GitHub Check: pr-checks
- GitHub Check: Cloudflare Pages
🧰 Additional context used
📓 Path-based instructions (8)
**/*
⚙️ CodeRabbit configuration file
**/*: Security first, performance second.
Validate input boundaries, auth/authz implications, and secret management.
Look for behavioral regressions, missing tests, and contract breaks across modules.
Files:
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/state.yamlclients/agent-runtime/tests/mcp_native_regression.rsclients/agent-runtime/tests/mcp_runtime_e2e.rsclients/agent-runtime/src/security/mod.rsclients/agent-runtime/src/bootstrap/mod.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/security/audit.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/tasks.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/exploration.mdclients/agent-runtime/src/config/schema.rsclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/security/detect.rsclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdclients/agent-runtime/src/tools/shell.rsopenspec/specs/sandbox-hardening/spec.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdclients/agent-runtime/src/tools/browser.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/design.mdclients/agent-runtime/src/tools/mod.rs
clients/agent-runtime/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Run
cargo fmt --all -- --check,cargo clippy --all-targets -- -D warnings, andcargo testfor code validation, or document which checks were skipped and why
Files:
clients/agent-runtime/tests/mcp_native_regression.rsclients/agent-runtime/tests/mcp_runtime_e2e.rsclients/agent-runtime/src/security/mod.rsclients/agent-runtime/src/bootstrap/mod.rsclients/agent-runtime/src/security/audit.rsclients/agent-runtime/src/config/schema.rsclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/mod.rs
**/*.rs
⚙️ CodeRabbit configuration file
**/*.rs: Focus on Rust idioms, memory safety, and ownership/borrowing correctness.
Flag unnecessary clones, unchecked panics in production paths, and weak error context.
Prioritize unsafe blocks, FFI boundaries, concurrency races, and secret handling.
Files:
clients/agent-runtime/tests/mcp_native_regression.rsclients/agent-runtime/tests/mcp_runtime_e2e.rsclients/agent-runtime/src/security/mod.rsclients/agent-runtime/src/bootstrap/mod.rsclients/agent-runtime/src/security/audit.rsclients/agent-runtime/src/config/schema.rsclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/mod.rs
clients/agent-runtime/src/{security,gateway,tools}/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Treat
src/security/,src/gateway/,src/tools/as high-risk surfaces and never broaden filesystem/network execution scope without explicit policy checks
Files:
clients/agent-runtime/src/security/mod.rsclients/agent-runtime/src/security/audit.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/mod.rs
clients/agent-runtime/src/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
clients/agent-runtime/src/**/*.rs: Never log secrets, tokens, raw credentials, or sensitive payloads in any logging statements
Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency
Files:
clients/agent-runtime/src/security/mod.rsclients/agent-runtime/src/bootstrap/mod.rsclients/agent-runtime/src/security/audit.rsclients/agent-runtime/src/config/schema.rsclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/mod.rs
clients/agent-runtime/src/{security,gateway,tools,config}/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable
Files:
clients/agent-runtime/src/security/mod.rsclients/agent-runtime/src/security/audit.rsclients/agent-runtime/src/config/schema.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/mod.rs
**/*.{md,mdx}
⚙️ CodeRabbit configuration file
**/*.{md,mdx}: Verify technical accuracy and that docs stay aligned with code changes.
For user-facing docs, check EN/ES parity or explicitly note pending translation gaps.
Files:
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/tasks.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/exploration.mdclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdopenspec/specs/sandbox-hardening/spec.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/design.md
clients/agent-runtime/src/tools/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Implement
Tooltrait insrc/tools/with strict parameter schema, validate and sanitize all inputs, and return structuredToolResultwithout panics in runtime path
Files:
clients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/mod.rs
🧠 Learnings (12)
📓 Common learnings
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools,config}/**/*.rs : Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools}/**/*.rs : Treat `src/security/`, `src/gateway/`, `src/tools/` as high-risk surfaces and never broaden filesystem/network execution scope without explicit policy checks
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/tools/**/*.rs : Implement `Tool` trait in `src/tools/` with strict parameter schema, validate and sanitize all inputs, and return structured `ToolResult` without panics in runtime path
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/main.rs : Preserve CLI contract unless change is intentional and documented; prefer explicit errors over silent fallback for unsupported critical paths
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/main.rs : Keep startup path lean and avoid heavy initialization in command parsing flow
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools,config}/**/*.rs : Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable
Applied to files:
clients/agent-runtime/tests/mcp_native_regression.rsclients/agent-runtime/tests/mcp_runtime_e2e.rsclients/agent-runtime/src/security/mod.rsclients/agent-runtime/src/bootstrap/mod.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/security/audit.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/tasks.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/exploration.mdclients/agent-runtime/src/config/schema.rsclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/security/detect.rsclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdclients/agent-runtime/src/tools/shell.rsopenspec/specs/sandbox-hardening/spec.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdclients/agent-runtime/src/tools/browser.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/design.mdclients/agent-runtime/src/tools/mod.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools}/**/*.rs : Treat `src/security/`, `src/gateway/`, `src/tools/` as high-risk surfaces and never broaden filesystem/network execution scope without explicit policy checks
Applied to files:
clients/agent-runtime/tests/mcp_native_regression.rsclients/agent-runtime/tests/mcp_runtime_e2e.rsclients/agent-runtime/src/security/mod.rsclients/agent-runtime/src/bootstrap/mod.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/security/audit.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/tasks.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/exploration.mdclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/security/detect.rsclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdclients/agent-runtime/src/tools/shell.rsopenspec/specs/sandbox-hardening/spec.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdclients/agent-runtime/src/tools/browser.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/design.mdclients/agent-runtime/src/tools/mod.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/tools/**/*.rs : Implement `Tool` trait in `src/tools/` with strict parameter schema, validate and sanitize all inputs, and return structured `ToolResult` without panics in runtime path
Applied to files:
clients/agent-runtime/tests/mcp_native_regression.rsclients/agent-runtime/tests/mcp_runtime_e2e.rsclients/agent-runtime/src/bootstrap/mod.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/tasks.mdclients/agent-runtime/src/config/schema.rsclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/shell.rsopenspec/specs/sandbox-hardening/spec.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdclients/agent-runtime/src/tools/browser.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/design.mdclients/agent-runtime/src/tools/mod.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/**/*.rs : Run `cargo fmt --all -- --check`, `cargo clippy --all-targets -- -D warnings`, and `cargo test` for code validation, or document which checks were skipped and why
Applied to files:
clients/agent-runtime/tests/mcp_native_regression.rsclients/agent-runtime/tests/mcp_runtime_e2e.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/tasks.mdclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/shell.rsopenspec/specs/sandbox-hardening/spec.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/mod.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/main.rs : Preserve CLI contract unless change is intentional and documented; prefer explicit errors over silent fallback for unsupported critical paths
Applied to files:
clients/agent-runtime/tests/mcp_native_regression.rsclients/agent-runtime/tests/mcp_runtime_e2e.rsclients/agent-runtime/src/security/mod.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/security/audit.rsclients/agent-runtime/src/config/schema.rsclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/shell.rsopenspec/specs/sandbox-hardening/spec.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdclients/agent-runtime/src/tools/browser.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/design.mdclients/agent-runtime/src/tools/mod.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/**/Cargo.toml : Do not add heavy dependencies for minor convenience; justify new crate additions
Applied to files:
clients/agent-runtime/tests/mcp_native_regression.rsclients/agent-runtime/tests/mcp_runtime_e2e.rsclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/mod.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/**/Cargo.toml : Preserve release-size profile assumptions in `Cargo.toml` and avoid adding heavy dependencies unless clearly justified
Applied to files:
clients/agent-runtime/tests/mcp_native_regression.rsclients/agent-runtime/tests/mcp_runtime_e2e.rsclients/agent-runtime/src/tools/mod.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/**/*.rs : Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency
Applied to files:
clients/agent-runtime/tests/mcp_native_regression.rsclients/agent-runtime/tests/mcp_runtime_e2e.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/channels/**/*.rs : Implement `Channel` trait in `src/channels/` with consistent `send`, `listen`, and `health_check` semantics and cover auth/allowlist/health behavior with tests
Applied to files:
clients/agent-runtime/tests/mcp_runtime_e2e.rsclients/agent-runtime/src/tools/browser.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/main.rs : Keep startup path lean and avoid heavy initialization in command parsing flow
Applied to files:
clients/agent-runtime/src/security/mod.rsclients/agent-runtime/src/security/audit.rsclients/agent-runtime/src/tools/shell.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Include threat/risk notes and rollback strategy for security, runtime, and gateway changes; add or update tests for boundary checks and failure modes
Applied to files:
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/tasks.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/exploration.mdopenspec/specs/sandbox-hardening/spec.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.md
🪛 GitHub Actions: Docs Quality
clients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.md
[error] 1-1: Documentation metadata validation failed: missing frontmatter block.
🪛 LanguageTool
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/tasks.md
[style] ~34-~34: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...= false, return Err(contradiction). Whenrequire = false`, preserve existing No...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
openspec/specs/sandbox-hardening/spec.md
[grammar] ~65-~65: Use a hyphen to join words.
Context: ...backend R2-S3: Require mode with auto finds nothing fails - Given `sandbox.r...
(QB_NEW_EN_HYPHEN)
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.md
[grammar] ~65-~65: Use a hyphen to join words.
Context: ...backend R2-S3: Require mode with auto finds nothing fails - Given `sandbox.r...
(QB_NEW_EN_HYPHEN)
🪛 markdownlint-cli2 (0.22.0)
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md
[warning] 1-1: First line in a file should be a top-level heading
(MD041, first-line-heading, first-line-h1)
[warning] 8-8: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 9-9: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
[warning] 92-92: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 93-93: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
[warning] 104-104: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 105-105: Tables should be surrounded by blank lines
(MD058, blanks-around-tables)
[warning] 135-135: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/design.md
[warning] 90-90: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (12)
clients/agent-runtime/src/tools/browser.rs (5)
57-67: LGTM!Clean state modeling for sidecar health caching. Using
tokio::sync::Mutexis appropriate for the async context.
391-396: LGTM!Sensible timeout capping at 5s for health checks. Creating a new
reqwest::Clientper call is fine for this infrequent operation.
850-856: Good: Fail-closed enforcement wired correctly.When
sidecar_verification_required=true, failed health checks returnErrhere, blocking the action. This aligns with the spec's R5-S3 scenario.
2327-2355: LGTM!Clean test infrastructure using ephemeral ports. The fire-and-forget
tokio::spawnis fine for short-lived tests.
2624-2724: Good test coverage for sidecar verification scenarios.Tests cover R5-S1 (healthy with isolation info), R5-S2 (graceful failure), R5-S3 (required mode rejection), and R5-S4 (missing endpoint). The low timeouts (50-200ms) keep tests fast.
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.md (1)
1-200: Well-structured security specification.Requirements R1-R6 map cleanly to the implementation. Scenarios are testable and the acceptance criteria provide clear validation gates.
clients/agent-runtime/src/tools/mod.rs (4)
92-110: LGTM!Clean threading of
sandboxthrough the factory. OnlyShellToolreceives it—file tools don't execute shell commands and correctly skip sandbox dependency.
112-147: LGTM!Explicit field mapping for
ComputerUseConfigaids maintainability. Builder pattern forsidecar_verification_requiredis clean.
368-373: Good: Singlesandbox.requiregoverns both shell sandbox and sidecar verification.This matches spec R5-S3—operators who mandate sandbox isolation also expect sidecar verification. The semantic coupling is intentional.
401-407: LGTM!Clean test helper.
NoopSandboxis appropriate for unit tests—real sandboxing would add unnecessary complexity.openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/design.md (2)
9-40: Well-documented design decisions.Each choice includes alternatives considered and rationale. The trade-off documentation (e.g.,
wrap_commandfor ALL executions vs. risk-based filtering) will help future maintainers.
209-243: LGTM!Testing strategy covers both unit and integration layers with concrete examples. The
MockSandboxpattern aligns with the actualtest_sandbox()implementation.
| if let Err(error) = logger.log_command_event(CommandExecutionLog { | ||
| channel: "agent", | ||
| command, | ||
| risk_level, | ||
| approved, | ||
| allowed: result.structured.is_some(), | ||
| success: result.success, | ||
| duration_ms, | ||
| sandbox_backend, | ||
| }) { | ||
| if self.audit_strict { | ||
| tracing::error!("Failed to write shell command audit event: {error}"); | ||
| } else { | ||
| tracing::warn!("Failed to write shell command audit event: {error}"); | ||
| } | ||
| } |
There was a problem hiding this comment.
audit_strict is only changing log level here.
When these writes fail, the helpers still return normally in strict mode. That lets shell/browser actions complete without the required audit trail, which is inconsistent with record_code_session_result() further down this file. Please make the helpers return a Result and propagate strict failures out of execute_tool_call().
Also applies to: 667-673
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clients/agent-runtime/src/agent/agent.rs` around lines 608 - 623, The audit
write currently swallows failures in strict mode by only changing log level
inside the execute_tool_call path where
logger.log_command_event(CommandExecutionLog { .. }) is called; change the
logging helper(s) to return Result<(), AuditError> (or appropriate error type)
instead of unit, update the call sites in execute_tool_call to propagate errors
(use ? or map_err) so that when self.audit_strict is true the function returns
early on audit write failure, and mirror the same change for the other audit
call sites (the similar block around record_code_session_result) so strict mode
consistently fails the overall operation instead of only lowering the log level.
| let sandbox = crate::security::create_sandbox(&config.security)?; | ||
|
|
||
| let tools = tools::all_tools_with_runtime( | ||
| Arc::new(config.clone()), | ||
| &security, | ||
| Arc::clone(&runtime), | ||
| sandbox, |
There was a problem hiding this comment.
Fail before initializing runtime state.
create_sandbox() still runs after init_memory_and_observer_with_overrides() and create_runtime(), so security.sandbox.require = true can do avoidable setup work before startup aborts. Move sandbox creation ahead of the memory/runtime initialization so the new require mode is actually fail-fast.
Suggested reordering
) -> anyhow::Result<Self> {
let profile = AgentProfile::from_config(config)?;
+ let sandbox = crate::security::create_sandbox(&config.security)?;
let (memory, observer) = init_memory_and_observer_with_overrides(
config,
profile,
memory_override,
observer_override,
@@
- let sandbox = crate::security::create_sandbox(&config.security)?;
-
let tools = tools::all_tools_with_runtime(
Arc::new(config.clone()),
&security,
Arc::clone(&runtime),
sandbox,🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clients/agent-runtime/src/bootstrap/mod.rs` around lines 227 - 233, The
sandbox is created too late; move the call to
crate::security::create_sandbox(&config.security) so it runs before
init_memory_and_observer_with_overrides(...) and create_runtime(...) to allow
security.sandbox.require = true to fail fast. Update the sequence in
bootstrap/mod.rs so sandbox is created first, then initialize memory/observer
via init_memory_and_observer_with_overrides, then call create_runtime, and
finally pass the created sandbox into tools::all_tools_with_runtime(...) (ensure
the variable names remain sandbox and are Arc/cloned as needed).
- Propagate audit write errors in strict mode instead of swallowing them - Redact shell commands before writing to audit log - Move sandbox creation before memory/observer init for fail-fast - Remove unnecessary clone of sandbox_backend in audit path - Replace TOCTOU sidecar health cache with tokio::sync::OnceCell - Use EnvGuard RAII pattern in shell env sanitization test - Add Bubblewrap probe to Linux auto-detection path - Correct rollback description in proposal - Clarify spec scenario title and verify-report compliance status
There was a problem hiding this comment.
Actionable comments posted: 9
♻️ Duplicate comments (2)
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md (1)
138-139:⚠️ Potential issue | 🟡 MinorAdd a blank line after the “### Verdict” heading.
Line 138 should be followed by one blank line before Line 139 to satisfy markdownlint heading spacing.
Suggested patch
### Verdict + PASS WITH WARNINGSAs per coding guidelines, “Verify technical accuracy and that docs stay aligned with code changes.”
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md` around lines 138 - 139, Add a single blank line immediately after the "### Verdict" heading in the verify-report.md content so the heading is followed by one empty line (i.e., change the block that currently reads "### Verdict" followed immediately by "PASS WITH WARNINGS" to have one blank line between them).clients/agent-runtime/src/agent/agent.rs (1)
535-544:⚠️ Potential issue | 🔴 CriticalDon't turn a completed shell command into a retriable tool failure.
log_shell_audit_event()runs aftertool.execute()returns, so in strict mode this branch can reportsuccess: falseeven though the shell command already ran. That gives the model/user a false retry signal for non-idempotent shell operations. Please surface this as a terminal audit failure path, not a normal tool-execution failure.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/agent/agent.rs` around lines 535 - 544, The code currently converts a post-execution audit logging error from log_shell_audit_event into a retriable tool failure by returning ToolExecutionResult with success: false and DispatchAction::Execute; instead, treat audit logging failures as terminal audit failures so the already-run shell command is not marked retriable. Update the branch handling log_shell_audit_event errors inside the shell call path (where call.name == "shell") to return a terminal/audit-failure action (e.g., DispatchAction::TerminalFailure or the project’s terminal failure variant) and preserve the tool's successful execution state (success: true and appropriate output/message indicating audit logging failed), so the dispatcher will stop retrying the tool but surface the audit error as terminal. Ensure you reference the same symbols: log_shell_audit_event, ToolExecutionResult, call.name, call.tool_call_id, and crate::agent::dispatcher::DispatchAction when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@clients/agent-runtime/src/agent/agent.rs`:
- Around line 605-625: The code is using result.structured.is_some() as a proxy
for "allowed", which is incorrect; update the ShellTool (the tool that builds
runtime command results) to carry explicit fields like allowed: bool and
risk_level: Option<String> (or risk_level: &str) in the result type, populate
those when approval/validation succeeds or fails, and then change the logger
call (logger.log_command_event / CommandExecutionLog construction) to use
result.allowed and result.risk_level.unwrap_or("unknown") instead of
result.structured.is_some() and the current risk_level extraction; ensure any
callers that create CommandExecutionLog use the new explicit fields so
approved-but-failed commands log allowed=true with the correct risk level.
In `@clients/agent-runtime/src/tools/browser.rs`:
- Around line 404-407: The error path currently injects the raw endpoint (which
may include user:pass) into logs and ToolResult. Parse the endpoint (e.g., using
url::Url) before any formatting and either strip userinfo (set username/password
to empty) or reject endpoints containing credentials during validation; update
the with_context formatting around request.send() in the health-call block and
any assignment to sidecar_health.endpoint (the code that
formats/returns/serializes sidecar_health around lines using
sidecar_health.endpoint) to use the sanitized URL (or return an error on
credentials) so no raw credentials are ever logged or returned.
- Around line 421-428: The health-response handling currently normalizes missing
status/isolation into permissive defaults; change it to fail closed by treating
any non-"healthy" status or missing/empty isolation as an error: in the function
that parses the health payload (the code that reads payload.get("status") and
payload.get("isolation")), replace the unwrap_or defaults with explicit
validation — if status is absent or status != "healthy" return an Err (or mark
the response unverified) and if isolation is absent/empty (e.g., equals {} after
cloning) likewise return an Err; update the caller to propagate this failure so
actions requiring security.sandbox.require=true are blocked, and add regression
tests exercising 200 responses with {} and {"status":"unhealthy"} to ensure they
are rejected.
- Around line 345-350: The computer_use_health_url function currently calls
computer_use_endpoint_url() and then uses url.set_path("/v1/health"), which
discards any configured base path; change the logic to preserve the endpoint's
base path by inspecting the path returned from computer_use_endpoint_url(): if
the path ends_with "/v1/actions" replace only that suffix with "/v1/health"
(keeping existing leading path segments), otherwise either accept known
alternative shapes or return an error validating unsupported path shapes; ensure
you still clear query/fragment and return the modified reqwest::Url from
computer_use_health_url, referencing computer_use_health_url and
computer_use_endpoint_url in your changes.
In `@clients/agent-runtime/src/tools/shell.rs`:
- Around line 191-195: The warning currently logs the raw shell command
(variable command) in the tracing::warn call inside the
should_warn_for_noop_sandbox branch; remove or redact that sensitive value so
secrets/tokens aren't emitted. Update the tracing::warn invocation used with
should_warn_for_noop_sandbox(self.security.as_ref(), self.sandbox.as_ref(),
command) to omit the command field (or replace it with a non-sensitive
placeholder such as "<redacted>" or command length/summary), ensuring only
non-secret context is logged and leaving the rest of the message unchanged.
In `@clients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.md`:
- Around line 42-47: The Linux backend autodetection order in the guide is out
of sync with the runtime: update the list for backend = "auto" so Linux reads
"landlock → bubblewrap → firejail → docker" (preserve macOS as "bubblewrap →
docker" and the fallback behavior when require = false -> none); mirror this
exact change in the corresponding translated/ES runtime-sandbox-isolation guide
so both docs match the detection code.
- Around line 122-127: Update the docs to clarify that the audited shell command
field is redacted/sanitized: change the "- command" bullet to something like "-
command (redacted/sanitized)" and ensure the Elasticsearch translation/schema
for the audit sink (including the security.sandbox_backend field) reflects that
only a sanitized command is stored; update any references to "command" in the
runtime-sandbox-isolation guide and the ES translation to explicitly state
redaction to keep docs aligned with the runtime change that strips sensitive
command text before the audit sink.
In `@openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.md`:
- Around line 71-81: Update the affected-areas table to mark
src/security/audit.rs as Modified and describe the change: that a
sandbox_backend field was added to the audit payload and that audit wiring now
includes sandbox_backend in command-event logging and tests (e.g., references to
sandbox_backend in AuditPayload and command-event logging/test helpers). Ensure
the Description briefly notes schema change (added sandbox_backend) and
logging/test blast radius so readers are aware this file's audit schema and
logging behavior were modified.
In
`@openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md`:
- Line 8: The Markdown heading level jumps from H1 to H3 for the section titled
"Completeness"; update that heading ("### Completeness") to a proper H2 ("##
Completeness") so the document maintains a correct heading hierarchy and passes
lint checks, then verify related headings in verify-report.md to ensure
consistent levels throughout.
---
Duplicate comments:
In `@clients/agent-runtime/src/agent/agent.rs`:
- Around line 535-544: The code currently converts a post-execution audit
logging error from log_shell_audit_event into a retriable tool failure by
returning ToolExecutionResult with success: false and DispatchAction::Execute;
instead, treat audit logging failures as terminal audit failures so the
already-run shell command is not marked retriable. Update the branch handling
log_shell_audit_event errors inside the shell call path (where call.name ==
"shell") to return a terminal/audit-failure action (e.g.,
DispatchAction::TerminalFailure or the project’s terminal failure variant) and
preserve the tool's successful execution state (success: true and appropriate
output/message indicating audit logging failed), so the dispatcher will stop
retrying the tool but surface the audit error as terminal. Ensure you reference
the same symbols: log_shell_audit_event, ToolExecutionResult, call.name,
call.tool_call_id, and crate::agent::dispatcher::DispatchAction when making the
change.
In
`@openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md`:
- Around line 138-139: Add a single blank line immediately after the "###
Verdict" heading in the verify-report.md content so the heading is followed by
one empty line (i.e., change the block that currently reads "### Verdict"
followed immediately by "PASS WITH WARNINGS" to have one blank line between
them).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: af4fc779-729e-4c6a-8b4a-c32288c1ba94
📒 Files selected for processing (12)
clients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/bootstrap/mod.rsclients/agent-runtime/src/security/audit.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rsclients/web/apps/docs/astro.config.mjsclients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.mdclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: pr-checks
- GitHub Check: sonar
- GitHub Check: submit-gradle
- GitHub Check: Cloudflare Pages
🧰 Additional context used
📓 Path-based instructions (8)
**/*
⚙️ CodeRabbit configuration file
**/*: Security first, performance second.
Validate input boundaries, auth/authz implications, and secret management.
Look for behavioral regressions, missing tests, and contract breaks across modules.
Files:
clients/web/apps/docs/astro.config.mjsclients/agent-runtime/src/bootstrap/mod.rsclients/agent-runtime/src/security/audit.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/agent/agent.rsclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdclients/agent-runtime/src/tools/shell.rsclients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.mdclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/browser.rs
clients/agent-runtime/src/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
clients/agent-runtime/src/**/*.rs: Never log secrets, tokens, raw credentials, or sensitive payloads in any logging statements
Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency
Files:
clients/agent-runtime/src/bootstrap/mod.rsclients/agent-runtime/src/security/audit.rsclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/browser.rs
clients/agent-runtime/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Run
cargo fmt --all -- --check,cargo clippy --all-targets -- -D warnings, andcargo testfor code validation, or document which checks were skipped and why
Files:
clients/agent-runtime/src/bootstrap/mod.rsclients/agent-runtime/src/security/audit.rsclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/browser.rs
**/*.rs
⚙️ CodeRabbit configuration file
**/*.rs: Focus on Rust idioms, memory safety, and ownership/borrowing correctness.
Flag unnecessary clones, unchecked panics in production paths, and weak error context.
Prioritize unsafe blocks, FFI boundaries, concurrency races, and secret handling.
Files:
clients/agent-runtime/src/bootstrap/mod.rsclients/agent-runtime/src/security/audit.rsclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/browser.rs
clients/agent-runtime/src/{security,gateway,tools}/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Treat
src/security/,src/gateway/,src/tools/as high-risk surfaces and never broaden filesystem/network execution scope without explicit policy checks
Files:
clients/agent-runtime/src/security/audit.rsclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/browser.rs
clients/agent-runtime/src/{security,gateway,tools,config}/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable
Files:
clients/agent-runtime/src/security/audit.rsclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/browser.rs
**/*.{md,mdx}
⚙️ CodeRabbit configuration file
**/*.{md,mdx}: Verify technical accuracy and that docs stay aligned with code changes.
For user-facing docs, check EN/ES parity or explicitly note pending translation gaps.
Files:
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdclients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.md
clients/agent-runtime/src/tools/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Implement
Tooltrait insrc/tools/with strict parameter schema, validate and sanitize all inputs, and return structuredToolResultwithout panics in runtime path
Files:
clients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/tools/browser.rs
🧠 Learnings (11)
📓 Common learnings
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools,config}/**/*.rs : Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools}/**/*.rs : Treat `src/security/`, `src/gateway/`, `src/tools/` as high-risk surfaces and never broaden filesystem/network execution scope without explicit policy checks
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/tools/**/*.rs : Implement `Tool` trait in `src/tools/` with strict parameter schema, validate and sanitize all inputs, and return structured `ToolResult` without panics in runtime path
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools,config}/**/*.rs : Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable
Applied to files:
clients/agent-runtime/src/bootstrap/mod.rsclients/agent-runtime/src/security/audit.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/agent/agent.rsclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdclients/agent-runtime/src/tools/shell.rsclients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.mdclients/agent-runtime/src/security/detect.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/tools/**/*.rs : Implement `Tool` trait in `src/tools/` with strict parameter schema, validate and sanitize all inputs, and return structured `ToolResult` without panics in runtime path
Applied to files:
clients/agent-runtime/src/bootstrap/mod.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/agent/agent.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/security/detect.rsclients/agent-runtime/src/tools/browser.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools}/**/*.rs : Treat `src/security/`, `src/gateway/`, `src/tools/` as high-risk surfaces and never broaden filesystem/network execution scope without explicit policy checks
Applied to files:
clients/agent-runtime/src/bootstrap/mod.rsclients/agent-runtime/src/security/audit.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/agent/agent.rsclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdclients/agent-runtime/src/tools/shell.rsclients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.mdclients/agent-runtime/src/security/detect.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/main.rs : Keep startup path lean and avoid heavy initialization in command parsing flow
Applied to files:
clients/agent-runtime/src/bootstrap/mod.rsclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/shell.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/main.rs : Preserve CLI contract unless change is intentional and documented; prefer explicit errors over silent fallback for unsupported critical paths
Applied to files:
clients/agent-runtime/src/security/audit.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/agent/agent.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/security/detect.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/**/*.rs : Run `cargo fmt --all -- --check`, `cargo clippy --all-targets -- -D warnings`, and `cargo test` for code validation, or document which checks were skipped and why
Applied to files:
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/agent/agent.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.mdclients/agent-runtime/src/tools/shell.rsclients/agent-runtime/src/tools/browser.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Include threat/risk notes and rollback strategy for security, runtime, and gateway changes; add or update tests for boundary checks and failure modes
Applied to files:
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/specs/sandbox-hardening/spec.md
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/**/*.rs : Never log secrets, tokens, raw credentials, or sensitive payloads in any logging statements
Applied to files:
clients/agent-runtime/src/agent/agent.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/**/*.rs : Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency
Applied to files:
clients/agent-runtime/src/tools/shell.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/channels/**/*.rs : Implement `Channel` trait in `src/channels/` with consistent `send`, `listen`, and `health_check` semantics and cover auth/allowlist/health behavior with tests
Applied to files:
clients/agent-runtime/src/tools/browser.rs
🪛 LanguageTool
clients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.md
[grammar] ~11-~11: Corrige la mayúscula.
Context: ...n docType: guide --- # Aislamiento del Sandbox del Runtime Corvus usa **dos capas de ...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_ORTHOGRAPHY_UPPERCASE)
[grammar] ~11-~11: Corrige la mayúscula.
Context: ...uide --- # Aislamiento del Sandbox del Runtime Corvus usa dos capas de seguridad pa...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_ORTHOGRAPHY_UPPERCASE)
[style] ~15-~15: En sentido figurado es preferible usar otras expresiones.
Context: ... disparada por usuarios: 1. Política a nivel de aplicación mediante SecurityPolicy ...
(A_NIVEL_DE)
[grammar] ~16-~16: Agrega un signo de puntuación.
Context: ...nte SecurityPolicy - allowlists de comandos - restricciones de rutas - límite...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[grammar] ~17-~17: Agrega un signo de puntuación.
Context: ...lists de comandos - restricciones de rutas - límites de tasa - clasificación...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[grammar] ~18-~18: Agrega un signo de puntuación.
Context: ... restricciones de rutas - límites de tasa - clasificación de riesgo y compuert...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[grammar] ~19-~19: Agrega un signo de puntuación.
Context: ...clasificación de riesgo y compuertas de aprobación 2. **Sandboxing a nivel de sistema oper...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[style] ~20-~20: En sentido figurado es preferible usar otras expresiones.
Context: ...ompuertas de aprobación 2. Sandboxing a nivel de sistema operativo mediante los backen...
(A_NIVEL_DE)
[grammar] ~20-~20: Aquí puede haber un error.
Context: ...iante los backends Sandbox del runtime - landlock - firejail - bubblewrap - docker Las dos capas importan. La capa de polít...
(QB_NEW_ES)
[grammar] ~55-~55: Cambia la palabra o signo.
Context: ... → error al iniciar - backend = "auto" sin backend encontrado → error al inicia...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_SPACE)
[grammar] ~56-~56: Cambia la palabra o signo.
Context: ... → error al iniciar - backend = "none" → error al iniciar - enabled = false →...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_SPACE)
[grammar] ~57-~57: Cambia la palabra o signo.
Context: ...→ error al iniciar -enabled = false` → error al iniciar Úsalo en despliegues...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_SPACE)
[grammar] ~67-~67: Corrige la minúscula.
Context: ...Corvus aplica ahora esta secuencia: 1. valida el comando contra SecurityPolicy 2. s...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_ORTHOGRAPHY_LOWERCASE)
[grammar] ~68-~68: Oración con errores
Context: ...a el comando contra SecurityPolicy 2. sanitiza variables de entorno 3. envuelve el com...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_MULTITOKEN)
[grammar] ~68-~68: Oración con errores
Context: ...Policy` 2. sanitiza variables de entorno 3. envuelve el comando con el backend de sandbox se...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_MULTITOKEN)
[grammar] ~69-~69: Oración con errores
Context: ...o con el backend de sandbox seleccionado 4. ejecuta con timeout 5. audita el resultado con ...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_MULTITOKEN)
[grammar] ~70-~70: Oración con errores
Context: ...dbox seleccionado 4. ejecuta con timeout 5. audita el resultado con metadata del sandbox ...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_MULTITOKEN)
[grammar] ~71-~71: Aquí puede haber un error.
Context: ...ta el resultado con metadata del sandbox Eso significa que cada comando shell per...
(QB_NEW_ES)
[style] ~77-~77: En sentido figurado es preferible usar otras expresiones.
Context: ...nto del sandbox noop Si no hay backend a nivel de sistema operativo y require = false, ...
(A_NIVEL_DE)
[grammar] ~80-~80: Corrige la minúscula.
Context: ...none (NoopSandbox). En ese modo: - la política a nivel de aplicación sigue ap...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_ORTHOGRAPHY_LOWERCASE)
[style] ~80-~80: En sentido figurado es preferible usar otras expresiones.
Context: ...pSandbox`). En ese modo: - la política a nivel de aplicación sigue aplicando - los comand...
(A_NIVEL_DE)
[grammar] ~80-~80: Agrega un signo de puntuación.
Context: ...la política a nivel de aplicación sigue aplicando - los comandos mutables emiten una advertenci...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[grammar] ~81-~81: Agrega un signo de puntuación.
Context: ...l sandbox del sistema operativo no está activo - los logs de auditoría registran `sandbox_ba...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[grammar] ~82-~82: Aquí puede haber un error.
Context: ...activo - los logs de auditoría registran sandbox_backend = "none" Esto se permite para desarrollo local, p...
(QB_NEW_ES)
[grammar] ~91-~91: Oración con errores
Context: ...idecar. Valores seguros por defecto: - el endpoint por defecto usa loopback: `htt...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_MULTITOKEN)
[grammar] ~91-~91: Elimina la palabra o signo.
Context: ...- el endpoint por defecto usa loopback: http://127.0.0.1:8787/v1/actions - los endpoints remotos/públicos se bloque...
(QB_NEW_ES_OTHER_ERROR_IDS_UNNECESSARY_SPACE)
[grammar] ~92-~92: Elimina la palabra o signo.
Context: ... remotos/públicos se bloquean salvo que allow_remote_endpoint = true - los endpoints públicos remotos deben usa...
(QB_NEW_ES_OTHER_ERROR_IDS_UNNECESSARY_SPACE)
[grammar] ~93-~93: Oración con errores
Context: ...points públicos remotos deben usar HTTPS - dominios permitidos, allowlists de ventanas y lí...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_MULTITOKEN)
[grammar] ~94-~94: Aquí puede haber un error.
Context: ...das se reenvían como política al sidecar Corvus hace un health-check perezoso...
(QB_NEW_ES)
[grammar] ~118-~118: Aquí puede haber un error.
Context: ...o verificarse el aislamiento del sidecar ## Expectativas de auditoría Los eventos d...
(QB_NEW_ES)
[grammar] ~126-~126: Aquí puede haber un error.
Context: ... - bandera de aprobación - éxito/fracaso - security.sandbox_backend La verificación del sidecar de computer-...
(QB_NEW_ES)
[grammar] ~129-~129: Aquí puede haber un error.
Context: ...-use genera una entrada SecurityEvent describiendo: - estado de salud del sidecar - tipo de...
(QB_NEW_ES)
[grammar] ~130-~130: Agrega un signo de puntuación.
Context: ...nt` describiendo: - estado de salud del sidecar - tipo de aislamiento reportado - runti...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[grammar] ~131-~131: Agrega un signo de puntuación.
Context: ...salud del sidecar - tipo de aislamiento reportado - runtime reportado ## Valores recomen...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[grammar] ~132-~132: Aquí puede haber un error.
Context: ...islamiento reportado - runtime reportado ## Valores recomendados para operadores ##...
(QB_NEW_ES)
[grammar] ~171-~171: Elimina la puntuación
Context: ...te de despliegue del sidecar de computer-use.
(QB_NEW_ES_OTHER_ERROR_IDS_UNNECESSARY_PUNCTUATION)
🪛 markdownlint-cli2 (0.22.0)
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md
[warning] 8-8: Heading levels should only increment by one level at a time
Expected: h2; Actual: h3
(MD001, heading-increment)
[warning] 138-138: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🔇 Additional comments (1)
clients/agent-runtime/src/bootstrap/mod.rs (1)
207-209: Good fail-fast placement.Creating the sandbox before memory/runtime setup keeps
sandbox.require = truegenuinely fail-closed and avoids doing startup work that will be discarded anyway.Based on learnings,
Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable.
| let structured = result.structured.as_ref(); | ||
| let risk_level = structured | ||
| .and_then(|v| v.get("risk_level")) | ||
| .and_then(|v| v.as_str()) | ||
| .unwrap_or("unknown"); | ||
| let sandbox_backend = structured | ||
| .and_then(|v| v.get("sandbox_backend")) | ||
| .and_then(|v| v.as_str()) | ||
| .map(str::to_string); | ||
|
|
||
| let duration_ms = u64::try_from(duration.as_millis()).unwrap_or(u64::MAX); | ||
|
|
||
| if let Err(error) = logger.log_command_event(CommandExecutionLog { | ||
| channel: "agent", | ||
| command: &redacted_command, | ||
| risk_level, | ||
| approved, | ||
| allowed: result.structured.is_some(), | ||
| success: result.success, | ||
| duration_ms, | ||
| sandbox_backend, |
There was a problem hiding this comment.
allowed is being inferred from the wrong signal.
result.structured.is_some() is not equivalent to “policy allowed this command.” clients/agent-runtime/src/tools/shell.rs:147-153 still returns structured: None when runtime command construction fails after validation, so this path logs allowed=false / risk_level=unknown for a command that was actually approved and only failed later. Carry an explicit allowed bit (and risk level) out of ShellTool instead of using metadata presence as the proxy.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clients/agent-runtime/src/agent/agent.rs` around lines 605 - 625, The code is
using result.structured.is_some() as a proxy for "allowed", which is incorrect;
update the ShellTool (the tool that builds runtime command results) to carry
explicit fields like allowed: bool and risk_level: Option<String> (or
risk_level: &str) in the result type, populate those when approval/validation
succeeds or fails, and then change the logger call (logger.log_command_event /
CommandExecutionLog construction) to use result.allowed and
result.risk_level.unwrap_or("unknown") instead of result.structured.is_some()
and the current risk_level extraction; ensure any callers that create
CommandExecutionLog use the new explicit fields so approved-but-failed commands
log allowed=true with the correct risk level.
| fn computer_use_health_url(&self) -> anyhow::Result<reqwest::Url> { | ||
| let mut url = self.computer_use_endpoint_url()?; | ||
| url.set_path("/v1/health"); | ||
| url.set_query(None); | ||
| url.set_fragment(None); | ||
| Ok(url) |
There was a problem hiding this comment.
Preserve the configured base path when deriving the health URL.
set_path("/v1/health") always rewrites from the host root. A configured endpoint like https://sidecar.example.com/proxy/v1/actions would probe https://sidecar.example.com/v1/health instead of the sibling route and fail verification even though the action endpoint is valid. Replace only the trailing /v1/actions, or validate the configured path and reject unsupported shapes up front.
Suggested fix
fn computer_use_health_url(&self) -> anyhow::Result<reqwest::Url> {
let mut url = self.computer_use_endpoint_url()?;
- url.set_path("/v1/health");
+ let health_path = url
+ .path()
+ .strip_suffix("/v1/actions")
+ .map(|prefix| format!("{prefix}/v1/health"))
+ .ok_or_else(|| {
+ anyhow::anyhow!(
+ "browser.computer_use.endpoint must end with /v1/actions for sidecar health verification"
+ )
+ })?;
+ url.set_path(&health_path);
url.set_query(None);
url.set_fragment(None);
Ok(url)
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| fn computer_use_health_url(&self) -> anyhow::Result<reqwest::Url> { | |
| let mut url = self.computer_use_endpoint_url()?; | |
| url.set_path("/v1/health"); | |
| url.set_query(None); | |
| url.set_fragment(None); | |
| Ok(url) | |
| fn computer_use_health_url(&self) -> anyhow::Result<reqwest::Url> { | |
| let mut url = self.computer_use_endpoint_url()?; | |
| let health_path = url | |
| .path() | |
| .strip_suffix("/v1/actions") | |
| .map(|prefix| format!("{prefix}/v1/health")) | |
| .ok_or_else(|| { | |
| anyhow::anyhow!( | |
| "browser.computer_use.endpoint must end with /v1/actions for sidecar health verification" | |
| ) | |
| })?; | |
| url.set_path(&health_path); | |
| url.set_query(None); | |
| url.set_fragment(None); | |
| Ok(url) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clients/agent-runtime/src/tools/browser.rs` around lines 345 - 350, The
computer_use_health_url function currently calls computer_use_endpoint_url() and
then uses url.set_path("/v1/health"), which discards any configured base path;
change the logic to preserve the endpoint's base path by inspecting the path
returned from computer_use_endpoint_url(): if the path ends_with "/v1/actions"
replace only that suffix with "/v1/health" (keeping existing leading path
segments), otherwise either accept known alternative shapes or return an error
validating unsupported path shapes; ensure you still clear query/fragment and
return the modified reqwest::Url from computer_use_health_url, referencing
computer_use_health_url and computer_use_endpoint_url in your changes.
| Corvus tries supported backends in platform order and uses the first available one. | ||
|
|
||
| - Linux: `landlock` → `firejail` → `docker` | ||
| - macOS: `bubblewrap` → `docker` | ||
| - other platforms: `docker` | ||
| - if none are available and `require = false`, Corvus falls back to `none` |
There was a problem hiding this comment.
The auto-detection matrix is out of sync with the runtime.
This PR adds Bubblewrap probing on Linux, but the guide omits it from Linux and lists it under macOS. That gives operators the wrong expectation about what backend = "auto" can actually select. Please copy the exact platform order from the detection code here, and mirror the same fix in clients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.md.
As per coding guidelines, verify technical accuracy and that docs stay aligned with code changes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.md`
around lines 42 - 47, The Linux backend autodetection order in the guide is out
of sync with the runtime: update the list for backend = "auto" so Linux reads
"landlock → bubblewrap → firejail → docker" (preserve macOS as "bubblewrap →
docker" and the fallback behavior when require = false -> none); mirror this
exact change in the corresponding translated/ES runtime-sandbox-isolation guide
so both docs match the detection code.
…ning - Extract explicit allowed from structured payload instead of is_some() proxy - Make audit failure terminal instead of retriable (DispatchAction) - Strip credentials from sidecar endpoint URL before logging - Fail closed on missing/bad sidecar health status or isolation - Preserve base path in health URL construction - Redact command in noop sandbox tracing warning - Update docs: Linux autodetect order includes bubblewrap - Note command redaction in audit expectations (EN + ES) - Fix proposal audit.rs status and verify-report heading hierarchy
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
clients/agent-runtime/src/tools/browser.rs (1)
867-873:⚠️ Potential issue | 🟠 MajorRequired sidecar verification failures bypass browser audit logging.
Line 872 uses
?, so a required health-check failure returnsErrbefore anyToolResult.structuredpayload is built.clients/agent-runtime/src/agent/agent.rsonly callslog_browser_security_event()onOk(ToolResult)paths, so the blocked computer-use action is never audited. Return a failedToolResultwith the verification details, or emit the audit event before propagating the error.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/browser.rs` around lines 867 - 873, The verify_computer_use_sidecar() call in execute_computer_use_action currently uses the ? operator which returns Err early and bypasses browser audit logging; update execute_computer_use_action so that when verify_computer_use_sidecar() (and similarly computer_use_endpoint_url()) fails you either (a) construct and return an Err(…) that wraps an Ok(ToolResult::structured(...)) representing the failed verification (including the verification/error details) so callers still receive an Ok(ToolResult) and can call log_browser_security_event(), or (b) explicitly call log_browser_security_event() with the verification failure details before propagating the error; locate execute_computer_use_action, verify_computer_use_sidecar, computer_use_endpoint_url, and ToolResult to implement one of these two approaches and ensure the audit payload contains the verification result and error message.
♻️ Duplicate comments (4)
clients/agent-runtime/src/tools/browser.rs (2)
345-353:⚠️ Potential issue | 🟠 MajorUnsupported computer-use endpoint paths should fail closed.
The fallback branch rewrites any non-
/v1/actionsendpoint to host-root/v1/health. That can verify a different service than the one receiving actions, sorequire=truemay succeed against the wrong health endpoint on proxied or misconfigured deployments. Replace only the trailing/v1/actions, or reject unsupported shapes.Possible fix
fn computer_use_health_url(&self) -> anyhow::Result<reqwest::Url> { let mut url = self.computer_use_endpoint_url()?; - let path = url.path().to_string(); - let new_path = if path.ends_with("/v1/actions") { - path.replace("/v1/actions", "/v1/health") - } else { - "/v1/health".to_string() - }; + let new_path = url + .path() + .strip_suffix("/v1/actions") + .map(|prefix| format!("{prefix}/v1/health")) + .ok_or_else(|| { + anyhow::anyhow!( + "browser.computer_use.endpoint must end with /v1/actions for sidecar health verification" + ) + })?; url.set_path(&new_path); url.set_query(None); url.set_fragment(None); Ok(url) }Based on learnings: "Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/browser.rs` around lines 345 - 353, In computer_use_health_url(), avoid rewriting arbitrary endpoints to "/v1/health": only accept and rewrite when the path ends exactly with "/v1/actions" (use strip_suffix or check the trailing segment) and replace that suffix with "/v1/health"; otherwise return an Err (e.g., anyhow!("unsupported computer_use_endpoint path: {}", url.path())) so unsupported shapes fail closed; update the logic around computer_use_endpoint_url(), path manipulation, and url.set_path accordingly.
912-917:⚠️ Potential issue | 🟠 MajorDo not format the raw sidecar URL into this error.
self.computer_use.endpointcan containuser:pass@.... On request failure that raw value is surfaced through the tool error path. Reuse the sanitizedendpointURL instead.Possible fix
+ let mut sanitized_endpoint = endpoint.clone(); + let _ = sanitized_endpoint.set_username(""); + let _ = sanitized_endpoint.set_password(None); let response = request.send().await.with_context(|| { - format!( - "Failed to call computer-use sidecar at {}", - self.computer_use.endpoint - ) + format!("Failed to call computer-use sidecar at {sanitized_endpoint}") })?;As per coding guidelines, "Never log secrets, tokens, raw credentials, or sensitive payloads in any logging statements."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/tools/browser.rs` around lines 912 - 917, The error message currently interpolates the raw URL via self.computer_use.endpoint which may include credentials; change the with_context closure to use the already-sanitized endpoint value instead (e.g., reuse a previously computed sanitized `endpoint` variable or a sanitized accessor like self.computer_use.sanitized_endpoint) so the formatted string no longer exposes secrets when calling request.send().await.clients/agent-runtime/src/agent/agent.rs (2)
643-690:⚠️ Potential issue | 🟠 Major
audit_strictis still fail-open for browser audit writes.If
logger.log(&event)fails, this helper only changes the log level. In strict mode the browser action still returns normally, unlike the shell and code-session audit paths. Make this returnResult<()>and propagate it fromexecute_tool_call().🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/agent/agent.rs` around lines 643 - 690, The helper log_browser_security_event currently swallows logger.log failures and only alters log level; change fn log_browser_security_event(&self, result: &crate::tools::ToolResult) to return Result<(), crate::Error> (or the crate's common Result type), convert the internal early returns to Ok(()) where appropriate, and propagate logger.log errors as Err(...) when self.audit_strict is true (or always return Err on failure so the caller can decide); update callers—most importantly execute_tool_call()—to handle the Result from log_browser_security_event and propagate the error out of execute_tool_call() so browser audit write failures do not fail-open. Ensure references to audit_strict, logger.log(&event), AuditEvent/AuditEventType, and execute_tool_call are adjusted to the new fallible flow.
607-620:⚠️ Potential issue | 🟠 Major
allowedis still derived fromapproved.
approvedonly means a supervised approval was supplied; it is not equivalent to “policy allowed this command.” Successful low-risk shell executions will now be audited asallowed=false, and post-validationShellToolfailures that returnstructured: Nonestill collapse toallowed=false/risk_level="unknown". Carry an explicitallowedbit out ofclients/agent-runtime/src/tools/shell.rsand read that here.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@clients/agent-runtime/src/agent/agent.rs` around lines 607 - 620, The code is deriving `allowed` from the `approved` structured field, which is incorrect; update the `agent` parsing in agent.rs to read the explicit allowed boolean propagated from the tool (e.g., the explicit allowed bit added to the tool result in clients/agent-runtime/src/tools/shell.rs) instead of using `approved`. Locate the parsing block that sets `risk_level`, `sandbox_backend`, and `allowed` (around the `let structured = result.structured.as_ref();` section) and change `allowed` to prefer the explicit field carried on the result (for example `result.allowed` or `structured.and_then(|v| v.get("allowed")).and_then(|v| v.as_bool())`) falling back only if that explicit flag is absent; remove reliance on `approved` for policy allowance.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@clients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.md`:
- Around line 86-88: The docs list of computer-use sidecar actions is missing
"key_press"; update the English page's action list
(`runtime-sandbox-isolation.md` under the Computer-use sidecar section) to
include `key_press` alongside `mouse_move`, `mouse_click`, `mouse_drag`,
`key_type`, and `screen_capture`, and make the identical edit in the Spanish
file
(`clients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.md`)
so the docs match the implementation in
clients/agent-runtime/src/tools/browser.rs which treats `key_press` as a
computer-use-only action.
In `@openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.md`:
- Around line 57-59: The proposal text must be updated to reflect that sidecar
verification is performed lazily on first use rather than at startup: change all
references in Phase 5, the startup-latency risk, and the success criteria from
"startup" to "first computer-use" (BrowserTool's lazy verification flow), state
that the SecurityEvent audit and health-check call occur on first computer-use,
and update the docs section describing the isolation contract (docs/) to match
the first-use contract; apply the same wording changes to the other affected
sections referenced (sections 90 and 111).
In
`@openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md`:
- Around line 62-89: The compliance total is inconsistent: the table shows 13 "✅
COMPLIANT" rows but the footer reads "Compliance summary: 14/24 scenarios
compliant"; update the footer to match the table by changing the string
"Compliance summary: 14/24 scenarios compliant" to "Compliance summary: 13/24
scenarios compliant" (search for that exact footer line in verify-report.md) or
alternatively flip one of the rows to ✅ if you intended 14 compliant—ensure the
table rows and the final summary string are numerically consistent.
---
Outside diff comments:
In `@clients/agent-runtime/src/tools/browser.rs`:
- Around line 867-873: The verify_computer_use_sidecar() call in
execute_computer_use_action currently uses the ? operator which returns Err
early and bypasses browser audit logging; update execute_computer_use_action so
that when verify_computer_use_sidecar() (and similarly
computer_use_endpoint_url()) fails you either (a) construct and return an Err(…)
that wraps an Ok(ToolResult::structured(...)) representing the failed
verification (including the verification/error details) so callers still receive
an Ok(ToolResult) and can call log_browser_security_event(), or (b) explicitly
call log_browser_security_event() with the verification failure details before
propagating the error; locate execute_computer_use_action,
verify_computer_use_sidecar, computer_use_endpoint_url, and ToolResult to
implement one of these two approaches and ensure the audit payload contains the
verification result and error message.
---
Duplicate comments:
In `@clients/agent-runtime/src/agent/agent.rs`:
- Around line 643-690: The helper log_browser_security_event currently swallows
logger.log failures and only alters log level; change fn
log_browser_security_event(&self, result: &crate::tools::ToolResult) to return
Result<(), crate::Error> (or the crate's common Result type), convert the
internal early returns to Ok(()) where appropriate, and propagate logger.log
errors as Err(...) when self.audit_strict is true (or always return Err on
failure so the caller can decide); update callers—most importantly
execute_tool_call()—to handle the Result from log_browser_security_event and
propagate the error out of execute_tool_call() so browser audit write failures
do not fail-open. Ensure references to audit_strict, logger.log(&event),
AuditEvent/AuditEventType, and execute_tool_call are adjusted to the new
fallible flow.
- Around line 607-620: The code is deriving `allowed` from the `approved`
structured field, which is incorrect; update the `agent` parsing in agent.rs to
read the explicit allowed boolean propagated from the tool (e.g., the explicit
allowed bit added to the tool result in
clients/agent-runtime/src/tools/shell.rs) instead of using `approved`. Locate
the parsing block that sets `risk_level`, `sandbox_backend`, and `allowed`
(around the `let structured = result.structured.as_ref();` section) and change
`allowed` to prefer the explicit field carried on the result (for example
`result.allowed` or `structured.and_then(|v| v.get("allowed")).and_then(|v|
v.as_bool())`) falling back only if that explicit flag is absent; remove
reliance on `approved` for policy allowance.
In `@clients/agent-runtime/src/tools/browser.rs`:
- Around line 345-353: In computer_use_health_url(), avoid rewriting arbitrary
endpoints to "/v1/health": only accept and rewrite when the path ends exactly
with "/v1/actions" (use strip_suffix or check the trailing segment) and replace
that suffix with "/v1/health"; otherwise return an Err (e.g.,
anyhow!("unsupported computer_use_endpoint path: {}", url.path())) so
unsupported shapes fail closed; update the logic around
computer_use_endpoint_url(), path manipulation, and url.set_path accordingly.
- Around line 912-917: The error message currently interpolates the raw URL via
self.computer_use.endpoint which may include credentials; change the
with_context closure to use the already-sanitized endpoint value instead (e.g.,
reuse a previously computed sanitized `endpoint` variable or a sanitized
accessor like self.computer_use.sanitized_endpoint) so the formatted string no
longer exposes secrets when calling request.send().await.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: ca9d1ee2-0d89-4128-88c9-e3e2087acc6c
📒 Files selected for processing (7)
clients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rsclients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.mdclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: pr-checks
- GitHub Check: sonar
- GitHub Check: submit-gradle
- GitHub Check: Cloudflare Pages
🧰 Additional context used
📓 Path-based instructions (8)
**/*.{md,mdx}
⚙️ CodeRabbit configuration file
**/*.{md,mdx}: Verify technical accuracy and that docs stay aligned with code changes.
For user-facing docs, check EN/ES parity or explicitly note pending translation gaps.
Files:
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdclients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.md
**/*
⚙️ CodeRabbit configuration file
**/*: Security first, performance second.
Validate input boundaries, auth/authz implications, and secret management.
Look for behavioral regressions, missing tests, and contract breaks across modules.
Files:
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdclients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.md
clients/agent-runtime/src/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
clients/agent-runtime/src/**/*.rs: Never log secrets, tokens, raw credentials, or sensitive payloads in any logging statements
Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency
Files:
clients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rs
clients/agent-runtime/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Run
cargo fmt --all -- --check,cargo clippy --all-targets -- -D warnings, andcargo testfor code validation, or document which checks were skipped and why
Files:
clients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rs
**/*.rs
⚙️ CodeRabbit configuration file
**/*.rs: Focus on Rust idioms, memory safety, and ownership/borrowing correctness.
Flag unnecessary clones, unchecked panics in production paths, and weak error context.
Prioritize unsafe blocks, FFI boundaries, concurrency races, and secret handling.
Files:
clients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rs
clients/agent-runtime/src/tools/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Implement
Tooltrait insrc/tools/with strict parameter schema, validate and sanitize all inputs, and return structuredToolResultwithout panics in runtime path
Files:
clients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rs
clients/agent-runtime/src/{security,gateway,tools}/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Treat
src/security/,src/gateway/,src/tools/as high-risk surfaces and never broaden filesystem/network execution scope without explicit policy checks
Files:
clients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rs
clients/agent-runtime/src/{security,gateway,tools,config}/**/*.rs
📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)
Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable
Files:
clients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rs
🧠 Learnings (11)
📓 Common learnings
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools,config}/**/*.rs : Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools}/**/*.rs : Treat `src/security/`, `src/gateway/`, `src/tools/` as high-risk surfaces and never broaden filesystem/network execution scope without explicit policy checks
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools,config}/**/*.rs : Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable
Applied to files:
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdclients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.md
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools}/**/*.rs : Treat `src/security/`, `src/gateway/`, `src/tools/` as high-risk surfaces and never broaden filesystem/network execution scope without explicit policy checks
Applied to files:
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.mdclients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.mdclients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.md
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Include threat/risk notes and rollback strategy for security, runtime, and gateway changes; add or update tests for boundary checks and failure modes
Applied to files:
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/tools/**/*.rs : Implement `Tool` trait in `src/tools/` with strict parameter schema, validate and sanitize all inputs, and return structured `ToolResult` without panics in runtime path
Applied to files:
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/main.rs : Preserve CLI contract unless change is intentional and documented; prefer explicit errors over silent fallback for unsupported critical paths
Applied to files:
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/**/*.rs : Run `cargo fmt --all -- --check`, `cargo clippy --all-targets -- -D warnings`, and `cargo test` for code validation, or document which checks were skipped and why
Applied to files:
openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.mdclients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/shell.rsopenspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/**/*.rs : Never log secrets, tokens, raw credentials, or sensitive payloads in any logging statements
Applied to files:
clients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/main.rs : Keep startup path lean and avoid heavy initialization in command parsing flow
Applied to files:
clients/agent-runtime/src/agent/agent.rsclients/agent-runtime/src/tools/shell.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/**/*.rs : Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency
Applied to files:
clients/agent-runtime/src/tools/browser.rsclients/agent-runtime/src/tools/shell.rs
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/channels/**/*.rs : Implement `Channel` trait in `src/channels/` with consistent `send`, `listen`, and `health_check` semantics and cover auth/allowlist/health behavior with tests
Applied to files:
clients/agent-runtime/src/tools/browser.rs
🪛 LanguageTool
clients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.md
[grammar] ~11-~11: Corrige la mayúscula.
Context: ...n docType: guide --- # Aislamiento del Sandbox del Runtime Corvus usa **dos capas de ...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_ORTHOGRAPHY_UPPERCASE)
[grammar] ~11-~11: Corrige la mayúscula.
Context: ...uide --- # Aislamiento del Sandbox del Runtime Corvus usa dos capas de seguridad pa...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_ORTHOGRAPHY_UPPERCASE)
[style] ~15-~15: En sentido figurado es preferible usar otras expresiones.
Context: ... disparada por usuarios: 1. Política a nivel de aplicación mediante SecurityPolicy ...
(A_NIVEL_DE)
[grammar] ~16-~16: Agrega un signo de puntuación.
Context: ...nte SecurityPolicy - allowlists de comandos - restricciones de rutas - límite...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[grammar] ~17-~17: Agrega un signo de puntuación.
Context: ...lists de comandos - restricciones de rutas - límites de tasa - clasificación...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[grammar] ~18-~18: Agrega un signo de puntuación.
Context: ... restricciones de rutas - límites de tasa - clasificación de riesgo y compuert...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[grammar] ~19-~19: Agrega un signo de puntuación.
Context: ...clasificación de riesgo y compuertas de aprobación 2. **Sandboxing a nivel de sistema oper...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[style] ~20-~20: En sentido figurado es preferible usar otras expresiones.
Context: ...ompuertas de aprobación 2. Sandboxing a nivel de sistema operativo mediante los backen...
(A_NIVEL_DE)
[grammar] ~20-~20: Aquí puede haber un error.
Context: ...iante los backends Sandbox del runtime - landlock - firejail - bubblewrap - docker Las dos capas importan. La capa de polít...
(QB_NEW_ES)
[grammar] ~55-~55: Cambia la palabra o signo.
Context: ... → error al iniciar - backend = "auto" sin backend encontrado → error al inicia...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_SPACE)
[grammar] ~56-~56: Cambia la palabra o signo.
Context: ... → error al iniciar - backend = "none" → error al iniciar - enabled = false →...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_SPACE)
[grammar] ~57-~57: Cambia la palabra o signo.
Context: ...→ error al iniciar -enabled = false` → error al iniciar Úsalo en despliegues...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_SPACE)
[grammar] ~67-~67: Corrige la minúscula.
Context: ...Corvus aplica ahora esta secuencia: 1. valida el comando contra SecurityPolicy 2. s...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_ORTHOGRAPHY_LOWERCASE)
[grammar] ~68-~68: Oración con errores
Context: ...a el comando contra SecurityPolicy 2. sanitiza variables de entorno 3. envuelve el com...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_MULTITOKEN)
[grammar] ~68-~68: Oración con errores
Context: ...Policy` 2. sanitiza variables de entorno 3. envuelve el comando con el backend de sandbox se...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_MULTITOKEN)
[grammar] ~69-~69: Oración con errores
Context: ...o con el backend de sandbox seleccionado 4. ejecuta con timeout 5. audita el resultado con ...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_MULTITOKEN)
[grammar] ~70-~70: Oración con errores
Context: ...dbox seleccionado 4. ejecuta con timeout 5. audita el resultado con metadata del sandbox ...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_MULTITOKEN)
[grammar] ~71-~71: Aquí puede haber un error.
Context: ...ta el resultado con metadata del sandbox Eso significa que cada comando shell per...
(QB_NEW_ES)
[style] ~77-~77: En sentido figurado es preferible usar otras expresiones.
Context: ...nto del sandbox noop Si no hay backend a nivel de sistema operativo y require = false, ...
(A_NIVEL_DE)
[grammar] ~80-~80: Corrige la minúscula.
Context: ...none (NoopSandbox). En ese modo: - la política a nivel de aplicación sigue ap...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_ORTHOGRAPHY_LOWERCASE)
[style] ~80-~80: En sentido figurado es preferible usar otras expresiones.
Context: ...pSandbox`). En ese modo: - la política a nivel de aplicación sigue aplicando - los comand...
(A_NIVEL_DE)
[grammar] ~80-~80: Agrega un signo de puntuación.
Context: ...la política a nivel de aplicación sigue aplicando - los comandos mutables emiten una advertenci...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[grammar] ~81-~81: Agrega un signo de puntuación.
Context: ...l sandbox del sistema operativo no está activo - los logs de auditoría registran `sandbox_ba...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[grammar] ~82-~82: Aquí puede haber un error.
Context: ...activo - los logs de auditoría registran sandbox_backend = "none" Esto se permite para desarrollo local, p...
(QB_NEW_ES)
[grammar] ~91-~91: Oración con errores
Context: ...idecar. Valores seguros por defecto: - el endpoint por defecto usa loopback: `htt...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_MULTITOKEN)
[grammar] ~91-~91: Elimina la palabra o signo.
Context: ...- el endpoint por defecto usa loopback: http://127.0.0.1:8787/v1/actions - los endpoints remotos/públicos se bloque...
(QB_NEW_ES_OTHER_ERROR_IDS_UNNECESSARY_SPACE)
[grammar] ~92-~92: Elimina la palabra o signo.
Context: ... remotos/públicos se bloquean salvo que allow_remote_endpoint = true - los endpoints públicos remotos deben usa...
(QB_NEW_ES_OTHER_ERROR_IDS_UNNECESSARY_SPACE)
[grammar] ~93-~93: Oración con errores
Context: ...points públicos remotos deben usar HTTPS - dominios permitidos, allowlists de ventanas y lí...
(QB_NEW_ES_OTHER_ERROR_IDS_REPLACEMENT_MULTITOKEN)
[grammar] ~94-~94: Aquí puede haber un error.
Context: ...das se reenvían como política al sidecar Corvus hace un health-check perezoso...
(QB_NEW_ES)
[grammar] ~118-~118: Aquí puede haber un error.
Context: ...o verificarse el aislamiento del sidecar ## Expectativas de auditoría Los eventos d...
(QB_NEW_ES)
[grammar] ~126-~126: Aquí puede haber un error.
Context: ... - bandera de aprobación - éxito/fracaso - security.sandbox_backend La verificación del sidecar de computer-...
(QB_NEW_ES)
[grammar] ~129-~129: Aquí puede haber un error.
Context: ...-use genera una entrada SecurityEvent describiendo: - estado de salud del sidecar - tipo de...
(QB_NEW_ES)
[grammar] ~130-~130: Agrega un signo de puntuación.
Context: ...nt` describiendo: - estado de salud del sidecar - tipo de aislamiento reportado - runti...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[grammar] ~131-~131: Agrega un signo de puntuación.
Context: ...salud del sidecar - tipo de aislamiento reportado - runtime reportado ## Valores recomen...
(QB_NEW_ES_OTHER_ERROR_IDS_MISSING_PUNCTUATION)
[grammar] ~132-~132: Aquí puede haber un error.
Context: ...islamiento reportado - runtime reportado ## Valores recomendados para operadores ##...
(QB_NEW_ES)
[grammar] ~171-~171: Elimina la puntuación
Context: ...te de despliegue del sidecar de computer-use.
(QB_NEW_ES_OTHER_ERROR_IDS_UNNECESSARY_PUNCTUATION)
| ## Computer-use sidecar isolation | ||
|
|
||
| Computer-use actions (`mouse_move`, `mouse_click`, `mouse_drag`, `key_type`, `screen_capture`) use a sidecar. |
There was a problem hiding this comment.
The computer-use action list is missing key_press.
clients/agent-runtime/src/tools/browser.rs treats key_press as a computer-use-only action, but this guide omits it from the supported sidecar actions. Add it here and mirror the same change in clients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.md to keep EN/ES aligned.
As per coding guidelines, "Verify technical accuracy and that docs stay aligned with code changes."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@clients/web/apps/docs/src/content/docs/guides/runtime-sandbox-isolation.md`
around lines 86 - 88, The docs list of computer-use sidecar actions is missing
"key_press"; update the English page's action list
(`runtime-sandbox-isolation.md` under the Computer-use sidecar section) to
include `key_press` alongside `mouse_move`, `mouse_click`, `mouse_drag`,
`key_type`, and `screen_capture`, and make the identical edit in the Spanish
file
(`clients/web/apps/docs/src/content/docs/es/guides/runtime-sandbox-isolation.md`)
so the docs match the implementation in
clients/agent-runtime/src/tools/browser.rs which treats `key_press` as a
computer-use-only action.
| - Add an optional health-check call to the sidecar endpoint at `BrowserTool` initialization that queries isolation capabilities. | ||
| - Log the sidecar's reported isolation level as an audit event (`SecurityEvent` type). | ||
| - Document the expected isolation contract in `docs/` — what operators SHOULD ensure about sidecar deployment. |
There was a problem hiding this comment.
The proposal still treats sidecar verification as startup-time.
clients/agent-runtime/src/tools/browser.rs now verifies the sidecar lazily on the first computer-use action, not during initialization. Phase 5, the startup-latency risk, and the success criteria still say "startup", which changes when operators expect failures and audit events to surface. Please update those sections to the first-use contract.
As per coding guidelines, "Verify technical accuracy and that docs stay aligned with code changes."
Also applies to: 90-90, 111-111
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/proposal.md`
around lines 57 - 59, The proposal text must be updated to reflect that sidecar
verification is performed lazily on first use rather than at startup: change all
references in Phase 5, the startup-latency risk, and the success criteria from
"startup" to "first computer-use" (BrowserTool's lazy verification flow), state
that the SecurityEvent audit and health-check call occur on first computer-use,
and update the docs section describing the isolation contract (docs/) to match
the first-use contract; apply the same wording changes to the other affected
sections referenced (sections 90 and 111).
| | Requirement | Scenario | Test | Result | | ||
| |-------------|----------|------|--------| | ||
| | R1 | R1-S1 Sandbox wraps every shell command | `tools::shell::tests::shell_calls_wrap_command_on_injected_sandbox` | ✅ COMPLIANT | | ||
| | R1 | R1-S2 Sandbox error blocks execution | `tools::shell::tests::shell_returns_error_when_sandbox_wrap_fails` | ✅ COMPLIANT | | ||
| | R1 | R1-S3 NoopSandbox passes through unchanged | `tools::shell::tests::shell_executes_allowed_command` | ⚠️ PARTIAL | | ||
| | R1 | R1-S4 Wrapping happens after env sanitization | `tools::shell::tests::shell_wraps_after_env_sanitization` | ✅ COMPLIANT | | ||
| | R2 | R2-S1 Require mode with available backend succeeds | `security::detect::tests::docker_backend_falls_back_gracefully` | ⚠️ PARTIAL | | ||
| | R2 | R2-S2 Require mode with unavailable explicit backend fails | `security::detect::tests::require_landlock_on_non_linux_returns_error` | ✅ COMPLIANT | | ||
| | R2 | R2-S3 Require mode with auto finds nothing fails | `security::detect::tests::require_auto_no_backend_returns_error_or_ok` | ⚠️ PARTIAL | | ||
| | R2 | R2-S4 Non-require mode falls back to NoopSandbox | `security::detect::tests::explicit_none_returns_noop`, `security::detect::tests::default_security_config_produces_working_sandbox` | ⚠️ PARTIAL | | ||
| | R2 | R2-S5 Explicit none backend with require is an error | `security::detect::tests::require_none_backend_returns_error` | ✅ COMPLIANT | | ||
| | R2 | R2-S6 Disabled sandbox with require is an error | `security::detect::tests::require_disabled_returns_error` | ✅ COMPLIANT | | ||
| | R3 | R3-S1 Warning for mutating command with NoopSandbox | `tools::shell::tests::noop_sandbox_warning_helper_triggers_for_mutating_commands` | ⚠️ PARTIAL | | ||
| | R3 | R3-S2 No warning for read-only command with NoopSandbox | `tools::shell::tests::noop_sandbox_warning_helper_skips_read_only_commands` | ⚠️ PARTIAL | | ||
| | R3 | R3-S3 No warning when real sandbox is active | `tools::shell::tests::noop_sandbox_warning_helper_skips_real_sandbox` | ⚠️ PARTIAL | | ||
| | R4 | R4-S1 Audit event includes sandbox backend name | `security::audit::tests::audit_log_command_event_writes_structured_entry` | ✅ COMPLIANT | | ||
| | R4 | R4-S2 NoopSandbox is recorded as `none` | `security::audit::tests::audit_log_command_event_writes_structured_entry` | ⚠️ PARTIAL — test injects values into `CommandExecutionLog` proving serialization, but does not assert `NoopSandbox` name is emitted via the actual shell execution path; an end-to-end runtime propagation test is needed | | ||
| | R4 | R4-S3 Real backend name is recorded | `security::audit::tests::audit_log_command_event_records_real_sandbox_backend` | ⚠️ PARTIAL — test injects a backend name into `CommandExecutionLog` proving serialization, but does not assert the real sandbox backend name flows through the actual shell/browser execution path; an end-to-end runtime propagation test is needed | | ||
| | R5 | R5-S1 Healthy sidecar with isolation info | `tools::browser::tests::computer_use_sidecar_health_check_reports_isolation_info` | ✅ COMPLIANT | | ||
| | R5 | R5-S2 Sidecar health-check fails gracefully | `tools::browser::tests::computer_use_sidecar_health_check_fails_gracefully_when_optional` | ✅ COMPLIANT | | ||
| | R5 | R5-S3 Sidecar health-check with require mode | `tools::browser::tests::computer_use_sidecar_health_check_rejects_when_required` | ✅ COMPLIANT | | ||
| | R5 | R5-S4 Sidecar without health endpoint | `tools::browser::tests::computer_use_sidecar_missing_health_endpoint_is_treated_as_failure` | ✅ COMPLIANT | | ||
| | R6 | R6-S1 Default config produces identical behavior | `config::schema::tests::sandbox_config_default_require_is_false`, `bootstrap::tests::bootstrap_context_builds_core_components` | ⚠️ PARTIAL | | ||
| | R6 | R6-S2 Existing SecurityPolicy tests pass unchanged | full `cargo test` (`security::policy::tests::*`) | ✅ COMPLIANT | | ||
| | R6 | R6-S3 Config without require field deserializes correctly | `config::schema::tests::sandbox_config_missing_require_defaults_to_false` | ✅ COMPLIANT | | ||
| | R6 | R6-S4 CLI contract unchanged | full `cargo test`, `mcp_native_regression`, `mcp_runtime_e2e` | ⚠️ PARTIAL | | ||
|
|
||
| **Compliance summary**: 14/24 scenarios compliant |
There was a problem hiding this comment.
Compliance summary is off by one.
The matrix currently contains 13 ✅ COMPLIANT rows, but Line 89 says 14/24. Please fix the summary or the row statuses so the archived report stays internally consistent.
As per coding guidelines, "Verify technical accuracy and that docs stay aligned with code changes."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@openspec/changes/archive/2026-04-02-runtime-sandbox-hardening/verify-report.md`
around lines 62 - 89, The compliance total is inconsistent: the table shows 13
"✅ COMPLIANT" rows but the footer reads "Compliance summary: 14/24 scenarios
compliant"; update the footer to match the table by changing the string
"Compliance summary: 14/24 scenarios compliant" to "Compliance summary: 13/24
scenarios compliant" (search for that exact footer line in verify-report.md) or
alternatively flip one of the rows to ✅ if you intended 14 compliant—ensure the
table rows and the final summary string are numerically consistent.
|


This pull request enhances sandbox security configuration and auditing in the agent runtime. It introduces a new
requirefield to the sandbox config, allowing deployments to enforce the presence of a real OS-level sandbox and fail fast if unavailable. The PR also improves audit logging for shell command execution, including which sandbox backend was used, and strengthens tests for these changes.Sandbox configuration and enforcement:
Added a
requireboolean field toSandboxConfig(default:false). When set totrue, the agent will refuse to start if no real OS-level sandbox (e.g., Firejail, Landlock, Docker) is available, instead of silently falling back to the NoopSandbox. This is enforced increate_sandboxanddetect_best_sandbox. [1] [2] [3] [4] [5]Updated all code paths and tests to handle the new error-returning behavior of
create_sandbox, ensuring proper propagation and test coverage for both required and non-required modes. [1] [2] [3] [4] [5] [6] [7]Audit logging improvements:
Extended
CommandExecutionLogand audit logging to include thesandbox_backendused for each shell command execution. This information is now recorded in audit logs and verified by new tests. [1] [2] [3] [4] [5]Added detailed audit event logging for both shell and browser tool executions, including risk level, approval status, and environment details. [1] [2] [3]
These changes make sandboxing requirements explicit and auditable, improving the agent's security posture and compliance capabilities.