Foundation: introduce weaver-sandbox for CLI/daemon isolation#33
Foundation: introduce weaver-sandbox for CLI/daemon isolation#33
Conversation
Introduces the `weaver-sandbox` crate to provide a restrictive, policy-driven sandbox wrapper around the birdcage library (v0.8.1). This crate enforces Weaver's zero-trust sandbox defaults including: - Absolute and whitelisted executables only - Networking disabled by default - Environment variables isolated unless explicitly allowed - Standard Linux runtime library paths whitelisted for read-only access - Single-threaded enforcement with explicit error reporting Adds comprehensive profile and sandbox implementations, error types, environment guard to restore environment after sandboxing, runtime helpers, and tests using `rstest-bdd`. Updates workspace and dependencies accordingly. Includes documentation updates describing sandbox behavior and roadmap progress. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
|
Warning Rate limit exceeded@leynos has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 9 minutes and 5 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (1)
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. Summary by CodeRabbitRelease Notes
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughIntroduce a new crate Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Sandbox
participant EnvGuard
participant Profile
participant Birdcage
participant ChildProcess
Client->>Sandbox: construct(Profile)
Client->>Sandbox: spawn(Command)
rect rgb(245,245,220)
Sandbox->>Sandbox: ensure_single_threaded()
Sandbox->>Sandbox: canonical_program()
Sandbox->>Sandbox: ensure_program_whitelisted()
end
rect rgb(220,245,245)
Sandbox->>EnvGuard: capture()
EnvGuard-->>Sandbox: EnvGuard handle
end
rect rgb(220,245,220)
Sandbox->>Profile: query policies & paths
Sandbox->>Sandbox: collect_exceptions()
end
Sandbox->>Birdcage: activate(exceptions, command)
Birdcage->>ChildProcess: spawn
Birdcage-->>Sandbox: SandboxChild
Sandbox-->>Client: Result<SandboxChild>
Note right of EnvGuard: On Drop -> restore environment
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
Comment |
Reviewer's GuideIntroduces a new weaver-sandbox crate that wraps the birdcage library to provide a policy-driven sandbox for Weaver subprocesses, with strict defaults for filesystem, environment, networking, and threading; wires it into the workspace, adds targeted tests, and documents the design and usage. Sequence diagram for Sandbox::spawn sandboxed command executionsequenceDiagram
actor Caller
participant Sandbox
participant Runtime as RuntimeHelpers
participant Profile as SandboxProfile
participant EnvGuard
participant Birdcage as BirdcageSandbox
Caller->>Sandbox: spawn(command: SandboxCommand)
activate Sandbox
Sandbox->>Runtime: thread_count()
Runtime-->>Sandbox: Result~usize, io::Error~
alt thread_count > 1
Sandbox-->>Caller: Err(SandboxError::MultiThreaded)
else single-threaded
Sandbox->>Sandbox: canonical_program(command.get_program())
alt program not absolute or missing
Sandbox-->>Caller: Err(SandboxError::ProgramNotAbsolute | MissingPath | CanonicalisationFailed)
else program canonicalised
Sandbox->>Profile: executable_paths()
Profile-->>Sandbox: &[PathBuf]
Sandbox->>Sandbox: ensure_program_whitelisted(program)
alt not whitelisted
Sandbox-->>Caller: Err(SandboxError::ExecutableNotAuthorised)
else whitelisted
Sandbox->>EnvGuard: capture()
EnvGuard-->>Sandbox: EnvGuard
Sandbox->>Profile: read_only_paths(), read_write_paths(), executable_paths(), environment_policy(), network_policy()
Profile-->>Sandbox: paths and policies
Sandbox->>Sandbox: collect_exceptions(program)
Sandbox->>Birdcage: new()
Birdcage-->>Sandbox: BirdcageSandbox
loop for each Exception
Sandbox->>Birdcage: add_exception(exception)
Birdcage-->>Sandbox: Result~(), BirdcageError~
end
alt activation error
Sandbox-->>Caller: Err(SandboxError::Activation)
else activation ok
Sandbox->>Birdcage: spawn(command)
Birdcage-->>Sandbox: SandboxChild
Sandbox->>EnvGuard: drop (restore env)
Sandbox-->>Caller: Ok(SandboxChild)
end
end
end
end
deactivate Sandbox
Class diagram for the new weaver-sandbox crateclassDiagram
class Sandbox {
- SandboxProfile profile
+ new(profile: SandboxProfile) Sandbox
+ spawn(command: SandboxCommand) Result~SandboxChild, SandboxError~
- ensure_single_threaded() Result~(), SandboxError~
- ensure_program_whitelisted(program: Path) Result~(), SandboxError~
- collect_exceptions(program: Path) Result~Vec~Exception~, SandboxError~
- canonical_program(program: Path) Result~PathBuf, SandboxError~
}
class SandboxProfile {
- Vec~PathBuf~ read_only_paths
- Vec~PathBuf~ read_write_paths
- Vec~PathBuf~ executable_paths
- EnvironmentPolicy environment
- NetworkPolicy network
+ new() SandboxProfile
+ allow_executable(path: PathBuf) SandboxProfile
+ allow_read_path(path: PathBuf) SandboxProfile
+ allow_read_write_path(path: PathBuf) SandboxProfile
+ allow_environment_variable(key: String) SandboxProfile
+ allow_full_environment() SandboxProfile
+ allow_networking() SandboxProfile
+ read_only_paths() &[PathBuf]
+ read_write_paths() &[PathBuf]
+ executable_paths() &[PathBuf]
+ environment_policy() &EnvironmentPolicy
+ network_policy() NetworkPolicy
}
class EnvironmentPolicy {
<<enum>>
Isolated
AllowList(keys: BTreeSet~String~)
InheritAll
+ with_allowed(key: String) EnvironmentPolicy
+ to_exceptions() Vec~Exception~
}
class NetworkPolicy {
<<enum>>
Deny
Allow
+ is_denied() bool
}
class EnvGuard {
- HashMap~OsString, OsString~ original
+ capture() EnvGuard
+ restore() void
- drop() void
}
class SandboxError {
<<enum>>
ProgramNotAbsolute(path: PathBuf)
ExecutableNotAuthorised(program: PathBuf)
MissingPath(path: PathBuf)
CanonicalisationFailed(path: PathBuf, source: io::Error)
MultiThreaded(thread_count: usize)
ThreadCountUnavailable(source: io::Error)
Activation(source: BirdcageError)
}
class RuntimeHelpers {
+ linux_runtime_roots() Vec~PathBuf~
+ thread_count() Result~usize, io::Error~
}
class BirdcageSandbox {
<<external>>
+ new() BirdcageSandbox
+ add_exception(exception: Exception) Result~(), BirdcageError~
+ spawn(command: SandboxCommand) Result~SandboxChild, BirdcageError~
}
class BirdcageProcessTypes {
<<external>>
SandboxCommand
SandboxChild
SandboxOutput
}
class BirdcageException {
<<external enum>>
Read(path: PathBuf)
WriteAndRead(path: PathBuf)
ExecuteAndRead(path: PathBuf)
Environment(key: String)
FullEnvironment
Networking
}
Sandbox --> SandboxProfile : uses
Sandbox --> EnvGuard : uses
Sandbox --> BirdcageSandbox : wraps
Sandbox --> BirdcageException : builds
Sandbox --> SandboxError : returns
Sandbox --> RuntimeHelpers : uses
SandboxProfile --> EnvironmentPolicy : has
SandboxProfile --> NetworkPolicy : has
SandboxProfile --> RuntimeHelpers : uses linux_runtime_roots
EnvironmentPolicy --> BirdcageException : to_exceptions builds
SandboxError --> BirdcageSandbox : wraps BirdcageError
BirdcageSandbox --> BirdcageProcessTypes : spawns
RuntimeHelpers --> BirdcageSandbox : preflight for spawn
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey there - I've reviewed your changes and found some issues that need to be addressed.
- In
EnvGuard, the use ofunsafeforenv::set_varandenv::remove_varappears unnecessary since the standard APIs are already safe; consider removing theunsafeblocks and relying on the safe versions or adding a strong justification if there's a concrete invariants issue you're guarding against. - The
canonicalise/MissingPathcheck in the sandbox profile prevents whitelisting paths that don't yet exist (e.g., output files that will be created by the sandboxed command); if you expect that use-case, consider allowing non-existent paths when their parent directory is authorised, or making this behaviour configurable. - Setting
RUST_TEST_THREADS=1globally in the top-levelMakefileforces the entire workspace test suite to run single-threaded; you might want to scope this to only theweaver-sandboxcrate (e.g., via a dedicated test target or per-crate env) to avoid slowing unrelated tests.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `EnvGuard`, the use of `unsafe` for `env::set_var` and `env::remove_var` appears unnecessary since the standard APIs are already safe; consider removing the `unsafe` blocks and relying on the safe versions or adding a strong justification if there's a concrete invariants issue you're guarding against.
- The `canonicalise`/`MissingPath` check in the sandbox profile prevents whitelisting paths that don't yet exist (e.g., output files that will be created by the sandboxed command); if you expect that use-case, consider allowing non-existent paths when their parent directory is authorised, or making this behaviour configurable.
- Setting `RUST_TEST_THREADS=1` globally in the top-level `Makefile` forces the entire workspace test suite to run single-threaded; you might want to scope this to only the `weaver-sandbox` crate (e.g., via a dedicated test target or per-crate env) to avoid slowing unrelated tests.
## Individual Comments
### Comment 1
<location> `crates/weaver-sandbox/src/env_guard.rs:32-35` </location>
<code_context>
+ // Remove variables introduced while the guard was active.
+ for (key, _) in env::vars_os() {
+ if !expected_keys.contains(&key) {
+ // SAFETY: keys originate from the host OS and were previously
+ // present in the environment, so removal cannot violate
+ // invariants expected by `std::env`.
+ unsafe { env::remove_var(&key) };
+ }
+ }
</code_context>
<issue_to_address>
**suggestion:** The `unsafe` blocks around `env::remove_var`/`env::set_var` are unnecessary and could be removed to simplify the code.
These functions are explicitly safe and support `OsString` keys/values, so the `unsafe` blocks and `SAFETY` comments don’t add protection and instead increase audit noise. Removing them simplifies the implementation without changing semantics.
Suggested implementation:
```rust
// Remove variables introduced while the guard was active.
for (key, _) in env::vars_os() {
if !expected_keys.contains(&key) {
env::remove_var(&key);
}
}
```
1. Search the rest of `crates/weaver-sandbox/src/env_guard.rs` for any other `unsafe` blocks surrounding `env::set_var` or `env::remove_var`.
2. Apply the same pattern: remove the `unsafe` block and any corresponding `SAFETY` comment, and call `env::set_var` / `env::remove_var` directly as safe functions.
</issue_to_address>
### Comment 2
<location> `crates/weaver-sandbox/src/sandbox.rs:133-140` </location>
<code_context>
+ Ok(set)
+}
+
+fn canonicalise(path: &Path) -> Result<PathBuf, SandboxError> {
+ if !path.exists() {
+ return Err(SandboxError::MissingPath {
+ path: path.to_path_buf(),
+ });
+ }
+
+ fs::canonicalize(path).map_err(|source| SandboxError::CanonicalisationFailed {
+ path: path.to_path_buf(),
+ source,
</code_context>
<issue_to_address>
**suggestion (bug_risk):** The explicit `exists()` check before `canonicalize` introduces a TOCTTOU window and might be better folded into the error handling.
Because `path.exists()` and `fs::canonicalize(path)` are separate operations, the filesystem can change in between, so you may see different behavior depending on timing. Instead, you can rely on `fs::canonicalize` alone and map `io::ErrorKind::NotFound` to `SandboxError::MissingPath`, which preserves behavior while avoiding both the TOCTTOU window and the extra syscall.
</issue_to_address>
### Comment 3
<location> `crates/weaver-sandbox/src/tests/unit.rs:36` </location>
<code_context>
+}
+
+#[test]
+fn network_is_denied_by_default() {
+ let profile = SandboxProfile::new();
+ assert_eq!(profile.network_policy(), NetworkPolicy::Deny);
</code_context>
<issue_to_address>
**suggestion (testing):** Extend unit tests to cover `allow_networking` and full-environment inheritance behaviour
Current tests cover the default network policy and environment allowlist deduplication. To complete coverage of `SandboxProfile`’s public API, please also add tests that:
- Use `SandboxProfile::new().allow_networking()` and assert `network_policy() == NetworkPolicy::Allow` and `NetworkPolicy::Allow.is_denied()` is `false`.
- Call `allow_full_environment()` and assert `environment_policy() == EnvironmentPolicy::InheritAll`.
- Optionally confirm the transition `Isolated` → `AllowList` → `InheritAll` (e.g. adding allowlist entries after `allow_full_environment` has no effect).
This will exercise the positive/opt-in paths for networking and environment inheritance.
Suggested implementation:
```rust
#[test]
fn network_is_denied_by_default() {
let profile = SandboxProfile::new();
assert_eq!(profile.network_policy(), NetworkPolicy::Deny);
}
#[test]
fn network_can_be_allowed() {
let profile = SandboxProfile::new().allow_networking();
assert_eq!(profile.network_policy(), NetworkPolicy::Allow);
assert!(!NetworkPolicy::Allow.is_denied());
}
#[test]
fn full_environment_can_be_inherited() {
let profile = SandboxProfile::new().allow_full_environment();
assert!(matches!(
profile.environment_policy(),
EnvironmentPolicy::InheritAll
));
}
#[test]
fn read_write_paths_are_recorded() {
```
If your `SandboxProfile` API exposes methods for adding to the environment allowlist (e.g. `allow_env_var`, `allow_environment_key`, etc.), consider adding an additional test to assert that calling that method *after* `allow_full_environment()` does not change the policy away from `EnvironmentPolicy::InheritAll`. The skeleton would be:
```rust
#[test]
fn environment_allowlist_has_no_effect_after_full_inherit() {
let profile = SandboxProfile::new()
.allow_full_environment()
.allow_environment_key("SHOULD_BE_IGNORED");
assert!(matches!(
profile.environment_policy(),
EnvironmentPolicy::InheritAll
));
}
```
You will need to replace `allow_environment_key` with the actual allowlist API used elsewhere in this test module to ensure the transition `Isolated → AllowList → InheritAll` behaves as expected and that post-`allow_full_environment` allowlist mutations have no effect.
</issue_to_address>
### Comment 4
<location> `crates/weaver-sandbox/src/tests/behaviour.rs:47-48` </location>
<code_context>
+ env::set_var("DROP_ME", "remove-me");
+}
+
+#[given("the sandbox allows only KEEP_ME to be inherited")]
+fn given_environment_allowlist(world: &RefCell<TestWorld>) {
+ let mut world = world.borrow_mut();
+ world.configure_env_reader();
</code_context>
<issue_to_address>
**suggestion (testing):** Behavioural tests only cover the allowlist environment policy; add scenarios for default isolation and full inheritance
Currently only the allowlist mode is exercised via the `KEEP_ME` / `DROP_ME` scenario. Please also add BDD scenarios for:
- **Default isolation**: no environment policy configured; the env-reading command should see neither `KEEP_ME` nor `DROP_ME`.
- **Full inheritance**: using `allow_full_environment()`; both `KEEP_ME` and `DROP_ME` should be visible inside the sandbox.
You can reuse the same step definitions with different Given/When combinations so that all environment policies are behaviourally covered.
Suggested implementation:
```rust
.allow_executable(world.command.as_ref().unwrap().get_program())
.allow_read_path(&world.allowed_file);
}
#[given("environment variables KEEP_ME and DROP_ME are set")]
fn given_environment_variables(_world: &RefCell<TestWorld>) {
env::set_var("KEEP_ME", "present");
env::set_var("DROP_ME", "remove-me");
}
#[given("the sandbox allows only KEEP_ME to be inherited")]
fn given_environment_allowlist(world: &RefCell<TestWorld>) {
let mut world = world.borrow_mut();
world.configure_env_reader();
world.profile = world
.profile
.clone()
.allow_environment_variable("KEEP_ME");
}
#[given("the sandbox uses the default environment isolation")]
fn given_environment_default_isolation(world: &RefCell<TestWorld>) {
// Configure the env-reading command but do not enable any environment
// inheritance policy; this exercises the default isolation behaviour.
let mut world = world.borrow_mut();
world.configure_env_reader();
// Intentionally do not modify `world.profile` here.
}
#[given("the sandbox inherits the full environment")]
fn given_environment_full_inheritance(world: &RefCell<TestWorld>) {
// Configure the env-reading command and enable full environment
// inheritance so that all variables (KEEP_ME and DROP_ME) are visible.
let mut world = world.borrow_mut();
world.configure_env_reader();
world.profile = world.profile.clone().allow_full_environment();
}
#[when("the sandbox launches the command")]
fn when_launch(world: &RefCell<TestWorld>) {
world.borrow_mut().launch();
}
```
To fully exercise these new step definitions, add corresponding scenarios in the relevant `.feature` file, for example:
1. A scenario using:
- `Given environment variables KEEP_ME and DROP_ME are set`
- `And the sandbox uses the default environment isolation`
- `When the sandbox launches the command`
- `Then the environment inside the sandbox does not contain KEEP_ME or DROP_ME` (reusing your existing Then-steps for negative checks).
2. A scenario using:
- `Given environment variables KEEP_ME and DROP_ME are set`
- `And the sandbox inherits the full environment`
- `When the sandbox launches the command`
- `Then the environment inside the sandbox contains KEEP_ME and DROP_ME`.
If your existing Then-steps assert only the allowlist behaviour, you will need to add/adjust them so that they can assert both presence and absence of individual variables for these new scenarios.
</issue_to_address>
### Comment 5
<location> `crates/weaver-sandbox/src/tests/behaviour.rs:113-114` </location>
<code_context>
+ );
+}
+
+#[then("environment markers are cleaned up")]
+fn then_environment_cleaned(_world: &RefCell<TestWorld>) {
+ env::remove_var("KEEP_ME");
+ env::remove_var("DROP_ME");
</code_context>
<issue_to_address>
**suggestion (testing):** Consider making environment cleanup more robust in behavioural tests
This step only runs if the scenario reaches it; if a step fails earlier, `KEEP_ME` and `DROP_ME` can leak into later tests. To avoid cross-test contamination, consider wrapping each scenario in an env-guard style helper/fixture that snapshots and restores `std::env` (or an `rstest` fixture with `Drop`-based cleanup) so cleanup is guaranteed regardless of step ordering or panics.
Suggested implementation:
```rust
#[then("environment markers are cleaned up")]
fn then_environment_cleaned(_world: &RefCell<TestWorld>) {
// This step asserts that the environment has been cleaned up by the
// scenario-level env guard / fixture, rather than performing cleanup itself.
assert!(
env::var("KEEP_ME").is_err(),
"KEEP_ME was expected to be unset after scenario cleanup"
);
assert!(
env::var("DROP_ME").is_err(),
"DROP_ME was expected to be unset after scenario cleanup"
);
}
```
To fully implement robust environment cleanup (so it happens even if steps fail or panic), you should also:
1. Introduce an environment guard that snapshots and restores the process environment:
- Define a helper like:
```rust
struct EnvGuard {
snapshot: std::collections::HashMap<std::ffi::OsString, std::ffi::OsString>,
}
impl EnvGuard {
fn new() -> Self {
let snapshot = std::env::vars_os().collect();
Self { snapshot }
}
}
impl Drop for EnvGuard {
fn drop(&mut self) {
use std::collections::HashSet;
use std::env;
let before = &self.snapshot;
let after: std::collections::HashMap<_, _> = env::vars_os().collect();
// Remove any variables that were added after the snapshot.
let before_keys: HashSet<_> = before.keys().cloned().collect();
for (k, _) in &after {
if !before_keys.contains(k) {
env::remove_var(k);
}
}
// Restore original values.
for (k, v) in before {
env::set_var(k, v);
}
}
}
```
- Place this near your test helpers (either in this file or a shared test module), and add the necessary imports (`HashMap`, `HashSet`, `OsString`, etc.) at the top of the file/module.
2. Tie the guard to the scenario/world lifecycle so it is always dropped:
- Option A: Add an `EnvGuard` field to `TestWorld` and construct it in `TestWorld`'s constructor/`Default` implementation. That way, when the world is dropped at the end of a scenario, the guard’s `Drop` runs and restores the environment.
- Option B: If your cucumber setup supports per-scenario fixtures, create the `EnvGuard` at the start of each scenario and keep it alive for the scenario’s duration (e.g. in a wrapper around `TestWorld` or a dedicated per-scenario struct).
3. Ensure that any steps that mutate `std::env` rely on this guard rather than local ad‑hoc cleanup, so environment isolation is guaranteed even when a step panics or the scenario aborts early.
</issue_to_address>
### Comment 6
<location> `crates/weaver-sandbox/src/tests/support/mod.rs:15` </location>
<code_context>
+use crate::process::Stdio;
+
+/// Shared state for behavioural sandbox tests.
+pub struct TestWorld {
+ pub profile: SandboxProfile,
+ pub command: Option<SandboxCommand>,
</code_context>
<issue_to_address>
**suggestion (testing):** Add tests for `EnvGuard` behaviour to ensure environment restoration is correct and future changes remain safe
Current behavioural tests only exercise `EnvGuard` indirectly via environment mutation. Because `EnvGuard::restore` uses `unsafe`, it would be safer to add focused tests that: (1) snapshot the environment, perform add/change/remove operations, drop the guard, and assert the environment is exactly restored; and (2) confirm that variables created during the guard’s lifetime are removed while pre-existing ones retain their original values. Placing these in a dedicated `env_guard_tests` module alongside the other unit tests would help ensure the restoration logic and `unsafe` usage remain correct over time.
Suggested implementation:
```rust
// Sandbox test support.
use crate::sandbox::{Sandbox, SandboxChild, SandboxCommand, SandboxOutput};
use crate::process::Stdio;
#[cfg(test)]
mod env_guard_tests {
use std::env;
use crate::sandbox::EnvGuard;
/// Ensures that the environment is fully restored after a mix of
/// additions, changes, and removals while an EnvGuard is active.
#[test]
fn restores_modified_and_removed_environment_variables() {
const EXISTING: &str = "WEAVER_ENV_GUARD_EXISTING";
const ANOTHER: &str = "WEAVER_ENV_GUARD_ANOTHER";
const EPHEMERAL: &str = "WEAVER_ENV_GUARD_EPHEMERAL";
// Establish a known baseline outside the guard.
env::set_var(EXISTING, "original");
env::set_var(ANOTHER, "keep");
// Snapshot the environment.
let guard = EnvGuard::new().expect("failed to create EnvGuard");
// Mutate environment in various ways while the guard is active.
env::set_var(EXISTING, "changed"); // modify existing
env::remove_var(ANOTHER); // remove existing
env::set_var(EPHEMERAL, "ephemeral"); // add new
// Dropping the guard must restore the environment to its original state.
drop(guard);
// Pre-existing variables must be restored to their original values.
assert_eq!(env::var(EXISTING).as_deref(), Ok("original"));
assert_eq!(env::var(ANOTHER).as_deref(), Ok("keep"));
// Variables created during the guard's lifetime must not remain.
assert!(env::var(EPHEMERAL).is_err());
// Cleanup baseline variables for other tests.
env::remove_var(EXISTING);
env::remove_var(ANOTHER);
}
/// Ensures that variables created only during the guard's lifetime
/// are removed, while pre-existing variables retain their value.
#[test]
fn removes_variables_created_during_guard_lifetime() {
const PRE_EXISTING: &str = "WEAVER_ENV_GUARD_PRE_EXISTING";
const CREATED: &str = "WEAVER_ENV_GUARD_CREATED";
// Set up an initial variable before we create the guard.
env::set_var(PRE_EXISTING, "value");
let guard = EnvGuard::new().expect("failed to create EnvGuard");
// Create a variable that should not survive after the guard is dropped.
env::set_var(CREATED, "temporary");
// Sanity-check that both variables are visible while the guard is active.
assert_eq!(env::var(PRE_EXISTING).as_deref(), Ok("value"));
assert_eq!(env::var(CREATED).as_deref(), Ok("temporary"));
// Dropping the guard should restore the environment snapshot.
drop(guard);
// The pre-existing variable must still have its original value.
assert_eq!(env::var(PRE_EXISTING).as_deref(), Ok("value"));
// The variable created during the guard's lifetime must be removed.
assert!(env::var(CREATED).is_err());
// Cleanup baseline variable.
env::remove_var(PRE_EXISTING);
}
}
```
These tests assume that:
1. `EnvGuard` is exported from `crate::sandbox` and is constructible via `EnvGuard::new() -> Result<EnvGuard, _>`.
2. Dropping `EnvGuard` restores the environment snapshot (i.e. `Drop` calls the `unsafe` `restore` implementation).
If the actual API differs, you should adjust the tests accordingly, for example:
- If the constructor is named differently (e.g. `EnvGuard::capture()`, `EnvGuard::snapshot()`, or a plain `EnvGuard::new()` without `Result`), update the creation call.
- If `EnvGuard` lives in a different module (e.g. `crate::process::EnvGuard`), fix the import path.
- If restoration is triggered by an explicit `restore()` rather than `Drop`, keep the tests’ logic the same but call that method instead of `drop(guard)`.
</issue_to_address>
### Comment 7
<location> `crates/weaver-sandbox/src/env_guard.rs:9` </location>
<code_context>
+
+/// Restores the parent process environment after `birdcage` strips variables.
+#[derive(Debug)]
+pub struct EnvGuard {
+ original: Vec<(OsString, OsString)>,
+}
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying `EnvGuard::restore` to work on in-memory snapshots using a temporary `HashSet` so you can avoid rebuilding a `BTreeSet` and remove the unnecessary `unsafe` blocks around environment mutation.
You can simplify `EnvGuard` and remove both the `BTreeSet` and the `unsafe` blocks by doing all diffs in memory on snapshots.
Two focused changes:
1. **Avoid rebuilding a `BTreeSet` every restore**
You don’t need a sorted set, and you don’t need to recompute it on every `restore`. Either compute a `HashSet` once in `capture`, or just build one locally from `self.original` inside `restore`.
For example, keep the struct as-is and simplify `restore`:
```rust
use std::collections::HashSet;
// ...
fn restore(&self) {
// Snapshot current environment before mutating it.
let current: Vec<(OsString, OsString)> = env::vars_os().collect();
let expected_keys: HashSet<&OsString> =
self.original.iter().map(|(k, _)| k).collect();
// Remove vars that were introduced while the guard was active.
for (key, _) in ¤t {
if !expected_keys.contains(key) {
env::remove_var(key);
}
}
// Restore the original variables.
for (key, value) in &self.original {
env::set_var(key, value);
}
}
```
That lets you:
- Drop `original_keys` entirely.
- Replace `BTreeSet` with a simpler `HashSet<&OsString>` created on the fly.
- Avoid extra allocations per key (no cloning of `OsString` just to test membership).
2. **Remove unnecessary `unsafe`**
Wrapping `env::remove_var`/`env::set_var` in `unsafe` doesn’t buy anything; they are safe functions and the real issue is mutating while iterating the environment. By snapshotting first (`current: Vec<_> = env::vars_os().collect()`), you never mutate during iteration, so all `unsafe` blocks can be removed without changing behavior.
With these adjustments, `EnvGuard` stays functionally identical but is simpler, purely safe, and avoids the extra data structure/extra passes.
</issue_to_address>
### Comment 8
<location> `crates/weaver-sandbox/src/lib.rs:1` </location>
<code_context>
+//! Sandboxing utilities for Weaver processes.
+//!
+//! The `weaver-sandbox` crate wraps the [`birdcage`] library with policy
</code_context>
<issue_to_address>
**issue (review_instructions):** Add both behavioural and unit tests that exercise the new weaver-sandbox crate’s functionality.
Add unit tests (e.g. in `src/tests/unit.rs`) that validate core behaviours such as profile defaults, environment policies, path canonicalisation and authorisation logic, and error cases (e.g. `ProgramNotAbsolute`, `MissingPath`, `MultiThreaded`). Add behavioural tests (e.g. in `src/tests/behaviour.rs` and the BDD feature you stubbed in `tests/features/sandbox.feature`) that launch real processes under the sandbox and verify the documented guarantees: absolute-path enforcement, whitelist enforcement, environment isolation, default network denial, and successful execution of whitelisted binaries.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*`
**Instructions:**
For any new feature or change to an existing feature, both behavioural and unit tests are required.
</details>
</issue_to_address>
### Comment 9
<location> `crates/weaver-sandbox/src/env_guard.rs:35` </location>
<code_context>
+ // SAFETY: keys originate from the host OS and were previously
+ // present in the environment, so removal cannot violate
+ // invariants expected by `std::env`.
+ unsafe { env::remove_var(&key) };
+ }
+ }
</code_context>
<issue_to_address>
**issue (review_instructions):** `unsafe` is used here even though `std::env::remove_var` is a safe API, which violates the guideline to avoid `unsafe` unless absolutely necessary.
The call to `env::remove_var` is already safe and does not require an `unsafe` block. Please remove the `unsafe` here (and below for `set_var`) unless there is a specific unsafe API being invoked that cannot be expressed safely.
If there is a subtle invariants-related reason this must be unsafe, it should be documented explicitly and ideally wrapped in a narrowly scoped helper so that other code can remain safe.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*.rs`
**Instructions:**
Avoid `unsafe` unless absolutely necessary.
</details>
</issue_to_address>
### Comment 10
<location> `crates/weaver-sandbox/src/env_guard.rs:43` </location>
<code_context>
+ // SAFETY: keys and values were captured from the process
+ // environment before sandboxing mutated it, so restoring them
+ // preserves the prior state without introducing invalid data.
+ unsafe { env::set_var(key, value) };
+ }
+ }
</code_context>
<issue_to_address>
**issue (review_instructions):** `unsafe` is also used around `std::env::set_var`, which is a safe function and does not justify an unsafe block under the stated guidelines.
Similar to the `remove_var` call above, `env::set_var` is safe to call and does not require an `unsafe` block. Please drop the `unsafe` here unless you are actually calling an unsafe API, in which case the reason should be clearly justified and isolated.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*.rs`
**Instructions:**
Avoid `unsafe` unless absolutely necessary.
</details>
</issue_to_address>
### Comment 11
<location> `crates/weaver-sandbox/src/profile.rs:92` </location>
<code_context>
+ /// Whitelists an environment variable for inheritance.
+ #[must_use]
+ pub fn allow_environment_variable(mut self, key: impl Into<String>) -> Self {
+ match &mut self.environment {
+ EnvironmentPolicy::Isolated => {
+ let mut allow = BTreeSet::new();
</code_context>
<issue_to_address>
**issue (review_instructions):** This `match` over `EnvironmentPolicy` has three branches but is implemented inline instead of being moved into a dedicated predicate/helper as required.
The guideline asks for conditionals with more than two branches to be moved into a predicate (or helper) function. Here you could, for example, add a method on `EnvironmentPolicy` to return an updated policy when allowing a new variable, and call that from `allow_environment_variable` instead of inlining the three-way `match`.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*.rs`
**Instructions:**
Move conditionals with >2 branches into a predicate function.
</details>
</issue_to_address>
### Comment 12
<location> `crates/weaver-sandbox/src/sandbox.rs:98` </location>
<code_context>
+
+ exceptions.push(Exception::ExecuteAndRead(program.to_path_buf()));
+
+ match self.profile.environment_policy() {
+ EnvironmentPolicy::Isolated => {}
+ EnvironmentPolicy::AllowList(keys) => {
</code_context>
<issue_to_address>
**issue (review_instructions):** This three-branch `match` over `EnvironmentPolicy` is another multi-branch conditional that is not factored into a predicate/helper as requested.
To align with the instruction to move conditionals with more than two branches into a predicate function, consider extracting this `EnvironmentPolicy` handling into a helper (e.g. a method that returns a collection of environment-related `Exception`s) and calling that from here. That keeps the branching logic encapsulated and re-usable.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*.rs`
**Instructions:**
Move conditionals with >2 branches into a predicate function.
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (4)
crates/weaver-sandbox/src/env_guard.rs (1)
22-24: Simplify key tracking to avoid unnecessary allocations.The
original_keysmethod clones every key into aBTreeSet, but restoration only needs membership testing and doesn't require sorting. Build aHashSet<&OsString>directly inrestorefromself.originalto eliminate allocations and improve performance.Apply this diff:
- fn original_keys(&self) -> BTreeSet<OsString> { - self.original.iter().map(|(key, _)| key.clone()).collect() - } - fn restore(&self) { - let expected_keys = self.original_keys(); + let expected_keys: std::collections::HashSet<&OsString> = + self.original.iter().map(|(k, _)| k).collect(); // Remove variables introduced whilst the guard was active.Remove the unused import:
-use std::collections::BTreeSet; +use std::collections::HashSet;crates/weaver-sandbox/src/profile.rs (1)
89-104: Extract multi-branch match into a helper method.The three-branch
matchoverEnvironmentPolicyviolates the guideline requiring conditionals with more than two branches to be moved into a predicate or helper function. Extract this logic into a method onEnvironmentPolicy, such aswith_variable(self, key: String) -> Self, to improve readability and adhere to project standards.Apply this diff to
profile.rs:impl EnvironmentPolicy { + /// Returns a new policy with the given variable allowed. + fn with_variable(self, key: String) -> Self { + match self { + Self::Isolated => { + let mut allow = BTreeSet::new(); + allow.insert(key); + Self::AllowList(allow) + } + Self::AllowList(mut keys) => { + let _ = keys.insert(key); + Self::AllowList(keys) + } + Self::InheritAll => self, + } + } } impl SandboxProfile { // ... other methods ... /// Whitelists an environment variable for inheritance. #[must_use] pub fn allow_environment_variable(mut self, key: impl Into<String>) -> Self { - match &mut self.environment { - EnvironmentPolicy::Isolated => { - let mut allow = BTreeSet::new(); - allow.insert(key.into()); - self.environment = EnvironmentPolicy::AllowList(allow); - } - EnvironmentPolicy::AllowList(keys) => { - let _ = keys.insert(key.into()); - } - EnvironmentPolicy::InheritAll => {} - } + self.environment = self.environment.clone().with_variable(key.into()); self }As per coding guidelines, conditionals with more than two branches must be extracted.
crates/weaver-sandbox/src/sandbox.rs (2)
98-106: Extract theEnvironmentPolicymatch into a helper method.This three-branch match violates the guideline to move conditionals with more than two branches into a predicate or helper function. Extract this into a method such as
fn environment_exceptions(&self) -> Vec<Exception>to encapsulate the branching logic and improve readability.+ fn environment_exceptions(&self) -> Vec<Exception> { + match self.profile.environment_policy() { + EnvironmentPolicy::Isolated => Vec::new(), + EnvironmentPolicy::AllowList(keys) => { + keys.iter().map(|key| Exception::Environment(key.clone())).collect() + } + EnvironmentPolicy::InheritAll => vec![Exception::FullEnvironment], + } + }Then call
exceptions.extend(self.environment_exceptions());incollect_exceptions.
133-144: Eliminate the TOCTTOU window by removing the redundantexists()check.The explicit
path.exists()check beforefs::canonicalize()introduces a time-of-check-to-time-of-use race condition. Sincefs::canonicalize()returnsio::ErrorKind::NotFoundwhen the path does not exist, rely on that alone and map the error appropriately.fn canonicalise(path: &Path) -> Result<PathBuf, SandboxError> { - if !path.exists() { - return Err(SandboxError::MissingPath { - path: path.to_path_buf(), - }); - } - - fs::canonicalize(path).map_err(|source| SandboxError::CanonicalisationFailed { - path: path.to_path_buf(), - source, - }) + fs::canonicalize(path).map_err(|source| { + if source.kind() == std::io::ErrorKind::NotFound { + SandboxError::MissingPath { + path: path.to_path_buf(), + } + } else { + SandboxError::CanonicalisationFailed { + path: path.to_path_buf(), + source, + } + } + }) }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (17)
Cargo.toml(2 hunks)Makefile(1 hunks)crates/weaver-sandbox/Cargo.toml(1 hunks)crates/weaver-sandbox/src/env_guard.rs(1 hunks)crates/weaver-sandbox/src/error.rs(1 hunks)crates/weaver-sandbox/src/lib.rs(1 hunks)crates/weaver-sandbox/src/profile.rs(1 hunks)crates/weaver-sandbox/src/runtime.rs(1 hunks)crates/weaver-sandbox/src/sandbox.rs(1 hunks)crates/weaver-sandbox/src/tests/behaviour.rs(1 hunks)crates/weaver-sandbox/src/tests/mod.rs(1 hunks)crates/weaver-sandbox/src/tests/support/mod.rs(1 hunks)crates/weaver-sandbox/src/tests/unit.rs(1 hunks)crates/weaver-sandbox/tests/features/sandbox.feature(1 hunks)docs/roadmap.md(1 hunks)docs/users-guide.md(1 hunks)docs/weaver-design.md(1 hunks)
🧰 Additional context used
📓 Path-based instructions (4)
Cargo.toml
📄 CodeRabbit inference engine (AGENTS.md)
Cargo.toml: Use explicit version ranges inCargo.tomland keep dependencies up-to-date.
All crate versions specified inCargo.tomlmust use SemVer-compatible caret requirements (e.g.,some-crate = "1.2.3"); wildcard (*) or open-ended inequality (>=) version requirements are strictly forbidden; tilde requirements (~) should only be used with documented reason.
Files:
Cargo.toml
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Runmake check-fmt,make lint, andmake testbefore committing, which executecargo fmt --workspace -- --check,cargo clippy --workspace --all-targets --all-features -- -D warnings, andcargo test --workspacerespectively.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error, consider usingArcto reduce the amount of data returned.
Write unit and behavioural tests for new functionality; run both before and after making any change.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated withcargo doc.
Prefer immutable data and avoid unnecessarymutbindings.
Handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments in Rust.
Do not usereturnin single-line functions in Rust.
Use predicate functions for conditional criteria with more than two branches in Rust.
Lints must not be silenced except as a last resort; lint rule suppressions must be tightly scoped and include a clear reason.
Preferexpectoverallowin Rust.
Userstestfixtures for shared setup in Rust tests.
Replace duplicated tests with#[rstest(...)]parameterised cases in Rust.
Prefermockallfor mocks/stubs in Rust.
Prefer.expect()over.unwrap()in Rust.
Useconcat!()to combine long string literals in Rust rather than escaping newlines with a backslash.
Prefer single-line function definitions in Rust where appropriate (e.g., `pub fn new(id: u64) -> Self { ...
Files:
crates/weaver-sandbox/src/tests/unit.rscrates/weaver-sandbox/src/tests/mod.rscrates/weaver-sandbox/src/env_guard.rscrates/weaver-sandbox/src/lib.rscrates/weaver-sandbox/src/runtime.rscrates/weaver-sandbox/src/error.rscrates/weaver-sandbox/src/tests/support/mod.rscrates/weaver-sandbox/src/profile.rscrates/weaver-sandbox/src/tests/behaviour.rscrates/weaver-sandbox/src/sandbox.rs
⚙️ CodeRabbit configuration file
**/*.rs: * Seek to keep the cognitive complexity of functions no more than 9.
- Adhere to single responsibility and CQRS
- Place function attributes after doc comments.
- Do not use
returnin single-line functions.- Move conditionals with >2 branches into a predicate function.
- Avoid
unsafeunless absolutely necessary.- Every module must begin with a
//!doc comment that explains the module's purpose and utility.- Comments and docs must follow en-GB-oxendict (-ize / -yse / -our) spelling and grammar
- Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.- Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
- Use
rstestfixtures for shared setup and to avoid repetition between tests.- Replace duplicated tests with
#[rstest(...)]parameterised cases.- Prefer
mockallfor mocks/stubs.- Prefer
.expect()over.unwrap()in tests..expect()and.unwrap()are forbidden outside of tests. Errors must be propagated.- Ensure that any API or behavioural changes are reflected in the documentation in
docs/- Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/- Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
- Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such
- For testing of functionality depending upon environment variables, dependency injection and...
Files:
crates/weaver-sandbox/src/tests/unit.rscrates/weaver-sandbox/src/tests/mod.rscrates/weaver-sandbox/src/env_guard.rscrates/weaver-sandbox/src/lib.rscrates/weaver-sandbox/src/runtime.rscrates/weaver-sandbox/src/error.rscrates/weaver-sandbox/src/tests/support/mod.rscrates/weaver-sandbox/src/profile.rscrates/weaver-sandbox/src/tests/behaviour.rscrates/weaver-sandbox/src/sandbox.rs
docs/**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
Documentation must use en-GB-oxendict ("-ize" / "-yse" / "-our") spelling and grammar, except for the "LICENSE" file which is left unchanged for community consistency.
docs/**/*.md: Use British English based on Oxford English Dictionary (en-GB-oxendict), including: -ize suffix (realize, organization), -lyse suffix (analyse, paralyse), -our suffix (colour, behaviour), -re suffix (calibre, centre), double 'l' (cancelled, counsellor), maintain 'e' (likeable, liveable), -ogue suffix (analogue, catalogue)
Use US spelling when used in an API, for examplecolor
Use the Oxford comma in documentation: 'ships, planes, and hovercraft' where it aids comprehension
Treat company names as collective nouns in documentation, for example: 'Lille Industries are expanding'
Write headings in sentence case using Markdown headings (#, ##, ###) in order without skipping levels
Follow markdownlint recommendations for Markdown consistency
Provide code blocks and lists using standard Markdown syntax, always providing a language identifier for fenced code blocks (useplaintextfor non-code text)
Use-as the first level bullet and renumber lists when items change in documentation
Prefer inline links using[text](url)or angle brackets around the URL in documentation
Ensure blank lines before and after bulleted lists and fenced blocks in documentation
Ensure tables have a delimiter line below the header row in documentation
Expand any uncommon acronym on first use in documentation, for example, Continuous Integration (CI)
Wrap paragraphs at 80 columns in documentation
Wrap code at 120 columns in documentation
Do not wrap tables in documentation
Use footnotes referenced with[^label]in documentation
Include Mermaid diagrams where they add clarity to documentation
When embedding figures in documentation, useand provide brief alt text describing the content
Add a short description before each Mermaid diagram so screen readers can understand it
Define roa...
Files:
docs/roadmap.mddocs/users-guide.mddocs/weaver-design.md
**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
**/*.md: Validate Markdown files usingmake markdownlint.
Runmake fmtafter any documentation changes to format all Markdown files and fix table markup.
Validate Mermaid diagrams in Markdown files by runningmake nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks in Markdown must be wrapped at 120 columns.
Tables and headings in Markdown must not be wrapped.
Use dashes (-) for list bullets in Markdown.
Use GitHub-flavoured Markdown footnotes ([^1]) for references and footnotes.
Files:
docs/roadmap.mddocs/users-guide.mddocs/weaver-design.md
⚙️ CodeRabbit configuration file
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-GB-oxendict (-ize / -yse / -our) spelling and grammar
- Headings must not be wrapped.
- Documents must start with a level 1 heading
- Headings must correctly increase or decrease by no more than one level at a time
- Use GitHub-flavoured Markdown style for footnotes and endnotes.
- Numbered footnotes must be numbered by order of appearance in the document.
Files:
docs/roadmap.mddocs/users-guide.mddocs/weaver-design.md
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: leynos/weaver PR: 0
File: docs/roadmap.md:0-0
Timestamp: 2025-12-03T19:40:08.780Z
Learning: Implement `weaver-sandbox` crate using `birdcage` for focused scope, prioritising robust Linux support via namespaces and seccomp-bpf
📚 Learning: 2025-12-03T19:40:08.780Z
Learnt from: CR
Repo: leynos/weaver PR: 0
File: docs/roadmap.md:0-0
Timestamp: 2025-12-03T19:40:08.780Z
Learning: Implement `weaver-sandbox` crate using `birdcage` for focused scope, prioritising robust Linux support via namespaces and seccomp-bpf
Applied to files:
docs/roadmap.mdcrates/weaver-sandbox/src/lib.rsdocs/weaver-design.md
🧬 Code graph analysis (6)
crates/weaver-sandbox/src/env_guard.rs (1)
crates/weaver-config/tests/configuration_failfast.rs (1)
set_var(21-32)
crates/weaver-sandbox/src/runtime.rs (3)
crates/weaver-sandbox/src/profile.rs (1)
new(58-66)crates/weaver-sandbox/src/sandbox.rs (1)
new(31-33)crates/weaver-sandbox/src/tests/support/mod.rs (1)
new(26-43)
crates/weaver-sandbox/src/tests/support/mod.rs (2)
crates/weaver-sandbox/src/profile.rs (1)
new(58-66)crates/weaver-sandbox/src/sandbox.rs (1)
new(31-33)
crates/weaver-sandbox/src/profile.rs (3)
crates/weaver-sandbox/src/runtime.rs (1)
linux_runtime_roots(9-37)crates/weaver-sandbox/src/sandbox.rs (1)
new(31-33)crates/weaver-sandbox/src/tests/support/mod.rs (1)
new(26-43)
crates/weaver-sandbox/src/tests/behaviour.rs (4)
crates/weaver-sandbox/src/profile.rs (1)
new(58-66)crates/weaver-sandbox/src/sandbox.rs (1)
new(31-33)crates/weaver-sandbox/src/tests/support/mod.rs (1)
new(26-43)crates/weaver-config/tests/configuration_failfast.rs (1)
set_var(21-32)
crates/weaver-sandbox/src/sandbox.rs (3)
crates/weaver-sandbox/src/runtime.rs (1)
thread_count(40-60)crates/weaver-sandbox/src/profile.rs (2)
new(58-66)network_policy(142-144)crates/weaver-sandbox/src/env_guard.rs (2)
capture(16-20)drop(49-51)
🪛 LanguageTool
docs/users-guide.md
[misspelling] ~149-~149: This word is normally spelled as one.
Context: ...wlist before launch; requests made from multi-threaded contexts return a MultiThreaded error...
(EN_COMPOUNDS_MULTI_THREADED)
🔍 Remote MCP Deepwiki, Ref
Context Summary for PR Review
Based on my research, here is the relevant context for reviewing this pull request:
Project Context
The weaver project provides a code intelligence platform for AI agents, enabling them to interact with codebases at a semantic level. It employs a client-daemon architecture where the weaverd daemon is a long-running background process that starts and supervises language servers. The new weaver-sandbox crate is being introduced to secure subprocess execution, which is essential because weaver wraps project-specific build and test commands as subprocesses and executes them to parse their output into diagnostic objects.
Birdcage Library Context
The PR wraps birdcage 0.8.1, which provides a cross-platform embedded sandbox API for macOS and Linux. Birdcage's Exception type defines resource access rules including: Read (read-only path access), WriteAndRead (read-write access), ExecuteAndRead (execute and read access), Environment (specific environment variable access), FullEnvironment (all env vars), and Networking. The sandbox operates on a zero-trust model where by default everything is prohibited unless explicitly allowed via exceptions.
Security Approach
The implementation aligns with weaver's existing security strategy. Weaver emphasizes safe code modification through transient edits (in-memory file overlays), atomic file operations, and dry-run flags. The weaver-sandbox crate extends this by adding process-level isolation for external tool execution.
Key Implementation Details
The PR adds:
- Single-threaded enforcement: Birdcage requires single-threaded contexts, with the implementation enforcing this via
thread_count()checks and returningMultiThreadederrors when violations occur. This is reflected in the Makefile change to setRUST_TEST_THREADS=1. - Path canonicalization: Executable paths are canonicalized to prevent symlink escapes, and all paths are validated against whitelists.
- Environment isolation: Default
Isolatedpolicy prevents environment variable leakage, with allowlist and inherit-all options available. - Zero-trust defaults: Networking is denied by default; read-only filesystem access restricted to standard library roots.
Testing Strategy
The PR uses rstest-bdd for behavior-driven testing with feature scenarios covering fixture access controls, environment handling, and multi-threaded rejection. Tests run with RUST_TEST_THREADS=1 to satisfy birdcage's single-threaded requirement.,
⏰ Context from checks skipped due to timeout of 120000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Sourcery review
🔇 Additional comments (26)
Makefile (1)
20-20: LGTM! Single-threaded test execution enforced correctly.The RUST_TEST_THREADS=1 setting aligns with birdcage's single-threaded requirement and the sandbox's preflight checks that reject multi-threaded contexts.
crates/weaver-sandbox/src/runtime.rs (1)
39-60: LGTM! Thread count parsing is robust.The implementation correctly reads
/proc/self/status, parses theThreads:field with descriptive errors for missing or malformed data, and provides a sensible fallback for non-Linux platforms.crates/weaver-sandbox/src/tests/mod.rs (1)
1-5: LGTM! Test module structure is clear.The test suite organisation (behaviour, support, unit) follows standard Rust conventions.
docs/roadmap.md (1)
56-58: LGTM! Roadmap status accurately reflects the implementation.The weaver-sandbox crate has been implemented with birdcage 0.8.1, Linux namespace support, and seccomp-bpf, as documented in the PR.
Based on learnings, this implementation uses birdcage for focused scope and prioritises robust Linux support.
Cargo.toml (1)
7-7: LGTM! Workspace configuration is correct.The weaver-sandbox member and birdcage dependency are properly configured with appropriate SemVer versioning.
Also applies to: 31-31
crates/weaver-sandbox/src/tests/unit.rs (1)
7-55: LGTM! Unit tests cover core sandbox configuration behaviour.The tests verify platform-specific runtime roots, environment variable deduplication, default network denial, and path recording. The test assertions are appropriate and follow best practices.
A previous review comment suggests extending coverage to include
allow_networking()andallow_full_environment()transitions. Whilst the current tests are sufficient for the MVP, those additions would strengthen the test suite.crates/weaver-sandbox/tests/features/sandbox.feature (1)
1-25: LGTM! BDD scenarios comprehensively cover sandbox isolation.The feature scenarios test the critical behaviours: whitelist-based file access, denial of unauthorised paths, and environment variable isolation. The Given/When/Then structure is clear and aligns with the sandbox's zero-trust defaults.
docs/users-guide.md (1)
142-153: LGTM! Sandbox documentation is clear and accurate.The section correctly describes the weaver-sandbox wrapper, zero-trust defaults, and the MultiThreaded error behaviour. The hyphenation in "multi-threaded" is correct for British English compound adjectives.
Note: The static analysis hint about "multi-threaded" is a false positive for en-GB-oxendict style.
crates/weaver-sandbox/src/env_guard.rs (4)
1-11: Module documentation and struct definition are clear.The module-level comment succinctly describes the purpose, and the EnvGuard structure is appropriately minimal.
13-20: Capture method is straightforward.The
#[must_use]annotation correctly signals that callers should not discard the guard, ensuring restoration occurs.
26-45: Environment mutation requiresunsafeper Rust 2024 guidelines.The
unsafeblocks wrappingenv::remove_varandenv::set_varare required. The coding guidelines explicitly state: "Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such." The SAFETY comments correctly justify the invariants, and the implementation aligns with the pattern used elsewhere in the codebase (seecrates/weaver-config/tests/configuration_failfast.rs:20-31).Note: Past review comments claiming the
unsafeblocks are unnecessary contradict the project's coding guidelines and should be disregarded.Based on learnings and as per coding guidelines.
48-52: Drop implementation correctly enforces restoration.The
Dropimplementation ensures environment restoration occurs automatically when the guard leaves scope.docs/weaver-design.md (1)
983-999: Documentation accurately describes the sandbox implementation.The new section 5.3 clearly articulates the weaver-sandbox implementation strategy, including the birdcage wrapper, absolute-path requirements, whitelist canonicalisation, single-threaded enforcement, and test constraints. The content aligns with the codebase changes and uses appropriate British spelling.
Based on learnings, birdcage is used for focused scope with Linux namespace/seccomp-bpf support.
crates/weaver-sandbox/Cargo.toml (1)
1-14: Cargo manifest correctly declares dependencies.The package uses workspace-managed dependencies appropriately, with birdcage and thiserror for production code, and testing frameworks (rstest, rstest-bdd, tempfile) scoped to dev-dependencies.
crates/weaver-sandbox/src/lib.rs (1)
1-51: Library module provides clear API documentation and appropriate exports.The module-level documentation succinctly explains the zero-trust design, includes a working example, and correctly warns about single-threaded requirements. The public API surface is well-composed, re-exporting the necessary types whilst keeping internal modules private.
crates/weaver-sandbox/src/error.rs (1)
1-39: Error types are well-defined with clear messages.The
SandboxErrorenum comprehensively covers all failure modes with actionable error messages. The use ofthiserrorfor derivation and the#[from]conversion forBirdcageErrorfollow best practices.crates/weaver-sandbox/src/profile.rs (2)
1-88: Policy enums and profile builders are clearly defined.The
EnvironmentPolicyandNetworkPolicyenums provide explicit, type-safe configuration options. TheSandboxProfilebuilder methods use#[must_use]appropriately and initialize with secure defaults (isolated environment, no network).
120-159: Accessor methods and trait implementations are appropriate.The crate-visible accessors provide controlled access to profile internals, and the
Defaultdelegation tonew()maintains a single source of truth for default configuration. Theis_deniedhelper onNetworkPolicyis a clear predicate.crates/weaver-sandbox/src/tests/behaviour.rs (3)
1-40: BDD test structure and initial step definitions are sound.The use of
rstest-bddwith aRefCell<TestWorld>fixture provides clean test orchestration. The given/when/then steps correctly configure the sandbox profile and commands for behavioural scenarios.
46-112: Test assertions correctly validate sandbox behaviour.The step definitions for environment allowlist, process success/failure, and stdout content verification appropriately exercise the sandbox's enforcement of policies and capture expected outcomes.
119-122: Scenario binding correctly links feature file.The
#[scenario]attribute correctly binds the step definitions totests/features/sandbox.feature.crates/weaver-sandbox/src/sandbox.rs (5)
1-14: Module structure and imports are well-organised.The module-level doc comment explains the purpose. Imports are minimal and appropriately grouped. Using
BTreeSetprovides deterministic iteration order for reproducible sandbox configurations.
28-33: Constructor implementation is correct.The
#[must_use]attribute is appropriate. Per coding guidelines, this could be condensed to a single-line form, though the current multi-line version is acceptable.
41-57: Spawn logic is sound with appropriate preflight checks.The sequence—single-thread check, path canonicalisation, whitelist validation, exception collection, then spawn—is well-structured. The explicit
drop(env_guard)documents intent clearly, though RAII would handle it anyway on early returns via?.
59-78: Helper methods are clean and follow single responsibility.
ensure_single_threadedandensure_program_whitelistedare focused, well-named, and use the early-return pattern appropriately. Spelling of "authorised" is correct per en-GB-oxendict.
115-122: Path validation is correct.The absolute path check followed by canonicalisation is the appropriate sequence. Returning
SandboxError::ProgramNotAbsolutegives a clear error when the contract is violated.
…l tests Add a mutex-based mechanism to serialize environment variable changes during sandbox behavioural tests, preventing interference between tests. Replace direct std::env calls with controlled modifications in TestWorld to ensure environment isolation and cleanup. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
…d drop Changed the environment cleanup from clearing it to restoring it in tests. Renamed `clear_env` to `restore_env` and ensured it is called automatically when `TestWorld` is dropped to reliably clean up environment markers. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
…ce EnvGuard - Introduce the ability to fully inherit the environment in sandbox profiles with `allow_full_environment` and supporting logic. - Refactor `EnvGuard` to use a HashMap for efficiency and improve environment restoration semantics. - Extend environment policy handling to convert policies into birdcage exceptions. - Update sandbox builder to include new environment exceptions. - Improve canonicalization logic to handle non-existent paths by reconstructing from existing ancestors. - Add comprehensive unit and integration tests for environment restoration, canonicalization, and sandbox behavior regarding environment inheritance. - Adjust Makefile to isolate testing for weaver-sandbox crate. This enhances environment isolation control, allowing sandboxes to optionally inherit the full environment, improving flexibility and correctness. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 5
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (3)
crates/weaver-sandbox/Cargo.toml(1 hunks)crates/weaver-sandbox/src/tests/behaviour.rs(1 hunks)crates/weaver-sandbox/src/tests/support/mod.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Runmake check-fmt,make lint, andmake testbefore committing, which executecargo fmt --workspace -- --check,cargo clippy --workspace --all-targets --all-features -- -D warnings, andcargo test --workspacerespectively.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error, consider usingArcto reduce the amount of data returned.
Write unit and behavioural tests for new functionality; run both before and after making any change.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated withcargo doc.
Prefer immutable data and avoid unnecessarymutbindings.
Handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments in Rust.
Do not usereturnin single-line functions in Rust.
Use predicate functions for conditional criteria with more than two branches in Rust.
Lints must not be silenced except as a last resort; lint rule suppressions must be tightly scoped and include a clear reason.
Preferexpectoverallowin Rust.
Userstestfixtures for shared setup in Rust tests.
Replace duplicated tests with#[rstest(...)]parameterised cases in Rust.
Prefermockallfor mocks/stubs in Rust.
Prefer.expect()over.unwrap()in Rust.
Useconcat!()to combine long string literals in Rust rather than escaping newlines with a backslash.
Prefer single-line function definitions in Rust where appropriate (e.g., `pub fn new(id: u64) -> Self { ...
Files:
crates/weaver-sandbox/src/tests/support/mod.rscrates/weaver-sandbox/src/tests/behaviour.rs
⚙️ CodeRabbit configuration file
**/*.rs: * Seek to keep the cognitive complexity of functions no more than 9.
- Adhere to single responsibility and CQRS
- Place function attributes after doc comments.
- Do not use
returnin single-line functions.- Move conditionals with >2 branches into a predicate function.
- Avoid
unsafeunless absolutely necessary.- Every module must begin with a
//!doc comment that explains the module's purpose and utility.- Comments and docs must follow en-GB-oxendict (-ize / -yse / -our) spelling and grammar
- Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.- Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
- Use
rstestfixtures for shared setup and to avoid repetition between tests.- Replace duplicated tests with
#[rstest(...)]parameterised cases.- Prefer
mockallfor mocks/stubs.- Prefer
.expect()over.unwrap()in tests..expect()and.unwrap()are forbidden outside of tests. Errors must be propagated.- Ensure that any API or behavioural changes are reflected in the documentation in
docs/- Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/- Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
- Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such
- For testing of functionality depending upon environment variables, dependency injection and...
Files:
crates/weaver-sandbox/src/tests/support/mod.rscrates/weaver-sandbox/src/tests/behaviour.rs
🧠 Learnings (7)
📓 Common learnings
Learnt from: CR
Repo: leynos/weaver PR: 0
File: docs/roadmap.md:0-0
Timestamp: 2025-12-03T19:40:08.813Z
Learning: Implement `weaver-sandbox` crate using `birdcage` for focused scope, prioritising robust Linux support via namespaces and seccomp-bpf
📚 Learning: 2025-08-24T13:22:19.937Z
Learnt from: leynos
Repo: leynos/ortho-config PR: 101
File: ortho_config/src/subcommand/paths.rs:0-0
Timestamp: 2025-08-24T13:22:19.937Z
Learning: Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such. For testing of functionality depending upon environment variables, dependency injection and the mockable crate are the preferred option. If mockable cannot be used, env mutations in tests MUST be wrapped in shared guards and mutexes placed in a shared test_utils or test_helpers crate. Direct environment mutation is FORBIDDEN in tests.
Applied to files:
crates/weaver-sandbox/src/tests/behaviour.rs
📚 Learning: 2025-11-27T20:15:36.198Z
Learnt from: CR
Repo: leynos/lille PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-27T20:15:36.198Z
Learning: Applies to **/*.rs : For testing of functionality depending upon environment variables, dependency injection and the `mockable` crate are the preferred option. If mockable cannot be used, env mutations in tests MUST be wrapped in shared guards and mutexes placed in a shared `test_utils` or `test_helpers` crate. Direct environment mutation is FORBIDDEN in tests.
Applied to files:
crates/weaver-sandbox/src/tests/behaviour.rs
📚 Learning: 2025-11-26T20:29:06.669Z
Learnt from: CR
Repo: leynos/ortho-config PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-26T20:29:06.669Z
Learning: Applies to **/*.rs : If `mockable` cannot be used for Rust environment variable testing, env mutations in tests MUST be wrapped in shared guards and mutexes placed in a shared `test_utils` or `test_helpers` crate. Direct environment mutation is FORBIDDEN.
Applied to files:
crates/weaver-sandbox/src/tests/behaviour.rs
📚 Learning: 2025-12-02T23:47:57.677Z
Learnt from: CR
Repo: leynos/mriya PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-02T23:47:57.677Z
Learning: Applies to **/*.rs : If mockable cannot be used for environment variable testing in Rust, env mutations in tests must be wrapped in shared guards and mutexes in a shared `test_utils` or `test_helpers` crate. Direct environment mutation is forbidden.
Applied to files:
crates/weaver-sandbox/src/tests/behaviour.rs
📚 Learning: 2025-11-24T20:16:05.202Z
Learnt from: CR
Repo: leynos/rstest-bdd PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-11-24T20:16:05.202Z
Learning: Applies to **/*.rs : If `mockable` cannot be used in Rust tests, env mutations MUST be wrapped in shared guards and mutexes placed in a shared `test_utils` or `test_helpers` crate. Direct environment mutation is FORBIDDEN in tests.
Applied to files:
crates/weaver-sandbox/src/tests/behaviour.rs
📚 Learning: 2025-11-26T19:12:50.040Z
Learnt from: leynos
Repo: leynos/netsuke PR: 240
File: src/stdlib/which/lookup/tests.rs:188-221
Timestamp: 2025-11-26T19:12:50.040Z
Learning: In the netsuke codebase (Rust), environment variable mutation in tests must use the test_support crate helpers: test_support::env::VarGuard::set(key, value) for RAII-based env var setting/restoration with automatic synchronisation, test_support::PathGuard::new(original) for specialised PATH restoration, or test_support::env::{set_var, remove_var} for explicit mutations with EnvLock synchronisation. Never use unsafe std::env::set_var or std::env::remove_var directly in tests.
Applied to files:
crates/weaver-sandbox/src/tests/behaviour.rs
🧬 Code graph analysis (2)
crates/weaver-sandbox/src/tests/support/mod.rs (3)
crates/weaver-sandbox/src/profile.rs (1)
new(58-66)crates/weaver-sandbox/src/sandbox.rs (1)
new(31-33)crates/weaver-sandbox/src/env_guard.rs (1)
drop(49-51)
crates/weaver-sandbox/src/tests/behaviour.rs (4)
crates/weaver-sandbox/src/tests/support/mod.rs (1)
new(65-83)crates/weaver-sandbox/src/profile.rs (1)
new(58-66)crates/weaver-sandbox/src/sandbox.rs (1)
new(31-33)crates/weaver-cli/src/lifecycle/controller.rs (1)
status(87-142)
🔍 Remote MCP Deepwiki
Additional context relevant to reviewing PR #33
- Repo documentation available (site index) — confirms docs updates accompany the new crate.
- Project-level design docs and roadmap already mention sandboxing goals and now include concrete weaver-sandbox details (weaver-design.md, roadmap.md, users-guide.md). These describe: birdcage v0.8.1 wrapper, absolute-path requirement, canonicalisation to avoid symlink escapes, zero-trust defaults (env isolated, network denied), and single-thread enforcement.
- Build/test changes: Makefile runs tests with RUST_TEST_THREADS=1 to satisfy birdcage’s single-thread requirement; workspace adds birdcage = "0.8.1" and new member crates/weaver-sandbox.
- Platform focus and runtime helpers: runtime module is Linux-centric — linux_runtime_roots() returns canonicalised standard library roots on Linux and empty Vec on non-Linux; thread_count() reads /proc/self/status on Linux and returns Ok(1) on non-Linux. This explains why many checks and tests assume Linux behavior.
- API and behavior summary from docs: Sandbox enforces single-threaded preflight (errors with MultiThreaded), canonicalises and whitelists program paths, collects birdcage Exceptions (Read, WriteAndRead, ExecuteAndRead, Environment, FullEnvironment, Networking) from SandboxProfile, uses EnvGuard to capture/restore parent environment, and spawns birdcage with aggregated exceptions.
- Testing strategy: new unit tests, BDD behavioural tests (rstest-bdd) and feature scenarios exercise whitelist file access, environment allowlist/inheritance, and multi-thread rejection; test harness includes TempDir fixtures and EnvHandle/EnvGuard patterns to restore env across tests.
- Packaging/workflow: crate added to workspace and documented in design & user docs; CI/test tooling (pyproject/Makefile) already enforces timeouts and strict checks — relevant for test reliability when adding native sandboxing behavior.
Sources used: Deepwiki repository index and page contents for leynos/weaver.,
⏰ Context from checks skipped due to timeout of 120000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Sourcery review
🔇 Additional comments (4)
crates/weaver-sandbox/Cargo.toml (3)
1-4: Package declaration and workspace inheritance are correctly configured.The manifest properly inherits
editionandversionfrom the workspace root, ensuring consistency across all crates.
6-8: Validate root workspace declares all dependencies with stable versions.Root workspace Cargo.toml correctly declares
birdcage = "0.8.1"andthiserror = "2.0"in[workspace.dependencies], with weaver-sandbox properly referencing these through.workspace = trueattributes. No version conflicts exist across workspace members.
6-15: No action required. Birdcage 0.8.1 and all declared workspace dependencies have no known security advisories in RustSec, CVE databases, or other vulnerability tracking systems.crates/weaver-sandbox/src/tests/support/mod.rs (1)
52-83: Keep the TestWorld harness and env guard pattern; it matches the env-mutation guidelinesThe
TestWorldfixture, plus theEnvHandleguarded viaENV_MUTEX, gives each scenario an isolated temp directory, deterministic fixtures, and RAII-backed environment mutation confined to tests.restore_envbeing called fromDropguarantees cleanup even when a scenario panics, which resolves the earlier directenv::set_var/env::remove_varviolations.Also applies to: 108-147
|
@coderabbitai Have the following now been resolved?
|
|
Tip For best results, initiate chat on the files or code changes. 🧩 Analysis chain🏁 Script executed: Length of output: 1615 🏁 Script executed: Length of output: 1153 @leynos Yes, both issues have been resolved: Issue 1: Non-existent path whitelisting ✅ RESOLVEDThe
This allows you to whitelist output files or directories that will be created by the sandboxed command, provided their parent directory exists. The Issue 2: Global RUST_TEST_THREADS=1 ✅ RESOLVEDThe Makefile now uses two separate test commands: RUST_TEST_THREADS=1 $(CARGO) test ... -p weaver-sandbox
$(CARGO) test ... --workspace --exclude weaver-sandboxThis scopes the single-threaded requirement to only the Both concerns have been addressed effectively. |
…c::OnceLock Replaced all uses of once_cell::sync::Lazy with the standard library's std::sync::OnceLock for lazily initialized static mutexes in tests and support code. Removed once_cell from the dependencies to streamline crates/weaver-sandbox dependency list. This modernizes the code and reduces external dependencies. Additional minor improvements include: - Added AArch64 runtime root paths for Linux. - Improved unsafety comments around environment mutation. - Cleaned up environment variable test assertions. - Minor formatting and error handling cleanups. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
Add notes explaining that environment variable mutations must be contained within unsafe blocks on Rust 2024 toolchains, clarifying the rationale for wrapping std APIs within the env_guard module to comply with testing and safety policies. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Hey there - I've reviewed your changes and found some issues that need to be addressed.
- In
Sandbox::collect_exceptionsthe explicitExecuteAndRead(program)exception will duplicate any matching entry from the canonicalised executable set; consider avoiding adding the program twice to keep the exception list minimal and easier to reason about. canonicalised_setis called on every spawn for read-only, read-write, and executable paths, which re-canonicalises the same static paths each time; if sandboxes are used frequently, consider caching the canonicalised paths inSandboxProfileto avoid repeated filesystem lookups.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `Sandbox::collect_exceptions` the explicit `ExecuteAndRead(program)` exception will duplicate any matching entry from the canonicalised executable set; consider avoiding adding the program twice to keep the exception list minimal and easier to reason about.
- `canonicalised_set` is called on every spawn for read-only, read-write, and executable paths, which re-canonicalises the same static paths each time; if sandboxes are used frequently, consider caching the canonicalised paths in `SandboxProfile` to avoid repeated filesystem lookups.
## Individual Comments
### Comment 1
<location> `crates/weaver-sandbox/src/sandbox.rs:41` </location>
<code_context>
+ /// profile. When more than one thread exists in the current process the
+ /// function returns [`SandboxError::MultiThreaded`] to avoid triggering the
+ /// single-thread assertion enforced by `birdcage`.
+ pub fn spawn(&self, command: SandboxCommand) -> Result<SandboxChild, SandboxError> {
+ self.ensure_single_threaded()?;
+ let program = Self::canonical_program(Path::new(command.get_program()))?;
</code_context>
<issue_to_address>
**issue (review_instructions):** Add both behavioural and unit test coverage for the new `weaver-sandbox` crate APIs; the current changes introduce a new feature without any implemented tests.
Add unit tests that exercise `Sandbox::spawn` and its error paths (`ProgramNotAbsolute`, `ExecutableNotAuthorised`, `MissingPath`, `CanonicalisationFailed`, `MultiThreaded`, and `ThreadCountUnavailable`), as well as the behaviour of `SandboxProfile`, `EnvironmentPolicy::to_exceptions`, `linux_runtime_roots`, `thread_count`, and `EnvGuard`.
In addition, add behavioural tests that validate the end-to-end behaviour of sandboxed commands (e.g. successfully running an allowed executable, rejecting a non-whitelisted executable, enforcing absolute paths, denying vs allowing networking, and environment isolation vs allowlists). Populate the currently empty test modules under `crates/weaver-sandbox/src/tests/` and the `tests/features/sandbox.feature` file so that both unit and higher-level behavioural expectations of this new crate are verified in CI.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*`
**Instructions:**
For any new feature or change to an existing feature, both behavioural and unit tests are required.
</details>
</issue_to_address>
### Comment 2
<location> `crates/weaver-sandbox/src/env_guard.rs:32` </location>
<code_context>
+ // Safety: project policy requires env mutation to be wrapped in
+ // `unsafe` until the std APIs settle for Rust 2024. We mutate
+ // only after snapshotting to avoid iterator invalidation.
+ unsafe { env::remove_var(key) };
+ }
+ }
</code_context>
<issue_to_address>
**issue (review_instructions):** `unsafe` is being used around `env::remove_var` / `env::set_var`, but these standard-library functions are safe and do not require `unsafe`, which conflicts with the "avoid unsafe" instruction.
`std::env::remove_var` and `std::env::set_var` are safe APIs; wrapping them in an `unsafe` block doesn’t provide additional guarantees and directly contradicts the guideline to avoid `unsafe` unless absolutely necessary.
Unless there is some external FFI or other genuinely unsafe invariant being relied on here (which doesn’t appear to be the case), these `unsafe` blocks should be removed and the calls made directly in safe Rust.
If there *is* a non-obvious safety requirement, it would be better to encapsulate that in a small, well-documented unsafe helper function explaining the invariant, and reference that rationale here.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*.rs`
**Instructions:**
Avoid `unsafe` unless absolutely necessary.
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Hey there - I've reviewed your changes and found some issues that need to be addressed.
- In
Sandbox::collect_exceptionsand related helpers you canonicalise and rebuildBTreeSets on every spawn; if profiles are reused across many spawns it would be more efficient and predictable to canonicalise and deduplicate paths once when building theSandboxProfileand store them in canonical form. - The
EnvHandle::dropimplementation callsdrop(&self.guard), which only drops a reference to theMutexGuardrather than the guard itself; this line is redundant and can be removed to make it clearer that locking is managed purely by the struct’s lifetime.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `Sandbox::collect_exceptions` and related helpers you canonicalise and rebuild `BTreeSet`s on every spawn; if profiles are reused across many spawns it would be more efficient and predictable to canonicalise and deduplicate paths once when building the `SandboxProfile` and store them in canonical form.
- The `EnvHandle::drop` implementation calls `drop(&self.guard)`, which only drops a reference to the `MutexGuard` rather than the guard itself; this line is redundant and can be removed to make it clearer that locking is managed purely by the struct’s lifetime.
## Individual Comments
### Comment 1
<location> `crates/weaver-sandbox/src/sandbox.rs:92-96` </location>
<code_context>
+ for path in read_write {
+ exceptions.push(Exception::WriteAndRead(path));
+ }
+ for path in executables {
+ exceptions.push(Exception::ExecuteAndRead(path));
+ }
+
+ exceptions.push(Exception::ExecuteAndRead(program.to_path_buf()));
+
+ exceptions.extend(self.profile.environment_policy().to_exceptions());
</code_context>
<issue_to_address>
**suggestion:** The program path is added as `ExecuteAndRead` twice, which is redundant.
Since `collect_exceptions` already iterates over `executables` and pushes `ExecuteAndRead(path)` for each entry, adding `ExecuteAndRead(program)` again guarantees at least one duplicate for authorised programs. This is unnecessary work and may confuse debugging/logging. Consider either ensuring `program` is included in `executables` and removing the extra push, or checking for its presence before pushing it again.
</issue_to_address>
### Comment 2
<location> `crates/weaver-sandbox/src/tests/behaviour.rs:1` </location>
<code_context>
+//! Behavioural tests for sandbox spawning using `rstest-bdd`.
+
+use std::cell::RefCell;
</code_context>
<issue_to_address>
**issue (testing):** Behavioural tests will panic on non-Linux hosts due to `resolve_binary`’s `#[cfg(not(target_os = "linux"))]` panic path; consider gating this module to Linux.
Because `tests/support/mod.rs` defines `resolve_binary` as a `panic!()` on non-Linux targets, and this module’s tests are always compiled and run, any non-Linux CI runner will panic before actually exercising sandbox behaviour. To avoid spurious failures while still validating Linux behaviour, gate this module (or at least the `sandbox_behaviour` scenario/fixture) with `#[cfg(target_os = "linux")]` so it only runs where the support code is valid.
</issue_to_address>
### Comment 3
<location> `crates/weaver-sandbox/src/tests/sandbox_unit.rs:1` </location>
<code_context>
+//! Unit tests covering sandbox spawn preflight errors.
+
+use std::path::PathBuf;
</code_context>
<issue_to_address>
**suggestion (testing):** Add a unit test exercising the `MultiThreaded` error path when more than one thread is active.
Current tests don’t exercise the single-thread preflight (`SandboxError::MultiThreaded`). Please add a test that starts an extra thread (e.g. a short-lived `std::thread::spawn` waiting on a barrier), then calls `Sandbox::spawn` and asserts it returns `SandboxError::MultiThreaded { thread_count: n }` so this safety check is covered and regressions in the thread-count logic are caught.
Suggested implementation:
```rust
use std::path::PathBuf;
use std::sync::{Arc, Barrier};
use std::thread;
```
```rust
#[test]
fn spawn_rejects_when_multiple_threads_are_active() {
// Ensure there's at least one additional thread alive while we call `Sandbox::spawn`.
let ready = Arc::new(Barrier::new(2));
let release = Arc::new(Barrier::new(2));
let ready_thread = Arc::clone(&ready);
let release_thread = Arc::clone(&release);
let handle = thread::spawn(move || {
// Signal that the helper thread is up and running.
ready_thread.wait();
// Stay alive until the main thread lets us go, so we are
// definitely counted in the thread preflight.
release_thread.wait();
});
// Wait for the helper thread to start and block on `release`.
ready.wait();
let sandbox = sandbox_with_profile(SandboxProfile::default());
// Use a simple, valid command; the important part of this test is
// exercising the MultiThreaded preflight error, not command execution.
let command = SandboxCommand::new(PathBuf::from("/bin/true"));
let error = sandbox
.spawn(command)
.expect_err("expected SandboxError::MultiThreaded when multiple threads are active");
// Allow the helper thread to exit and join it.
release.wait();
handle.join().expect("helper thread should exit cleanly");
match error {
SandboxError::MultiThreaded { thread_count } => {
assert!(
thread_count >= 2,
"expected at least 2 threads to be observed, got {thread_count}"
);
}
other => panic!("expected SandboxError::MultiThreaded, got {other:?}"),
}
}
#[test]
fn rejects_relative_program_paths() {
```
The test assumes:
1. `SandboxProfile::default()` constructs a usable profile. If not, use the appropriate constructor (e.g. `SandboxProfile::new(...)` or a specific test profile helper).
2. `SandboxCommand::new(PathBuf)` exists. If the API differs (e.g. `SandboxCommand::from(PathBuf)` or includes args/env), adjust the `command` construction accordingly.
3. `Sandbox::spawn` has the signature `fn spawn(&self, cmd: SandboxCommand) -> Result<_, SandboxError>`. If it’s `&mut self` or returns a different result type, adapt the call while keeping the `expect_err` and `match` on `SandboxError::MultiThreaded`.
</issue_to_address>
### Comment 4
<location> `crates/weaver-sandbox/src/tests/unit.rs:1` </location>
<code_context>
+//! Unit tests for sandbox configuration helpers.
+
+use std::path::PathBuf;
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding tests for `EnvironmentPolicy::to_exceptions` to verify the mapping to birdcage exceptions.
Existing tests cover environment policy construction but not its conversion into `birdcage` exceptions. Please add a few focused tests that assert:
- `EnvironmentPolicy::Isolated.to_exceptions()` returns an empty vector
- `EnvironmentPolicy::AllowList(["FOO", "BAR"])` produces the corresponding `birdcage::Exception::Environment("FOO")`/`"BAR"`
- `EnvironmentPolicy::InheritAll` yields a single `Exception::FullEnvironment`
This will help catch regressions if the policy enum or exception wiring changes.
Suggested implementation:
```rust
//! Unit tests for sandbox configuration helpers.
use std::path::PathBuf;
use birdcage::Exception;
use crate::profile::{EnvironmentPolicy, NetworkPolicy, SandboxProfile};
```
```rust
use crate::profile::{EnvironmentPolicy, NetworkPolicy, SandboxProfile};
#[test]
fn environment_policy_to_exceptions_isolated_is_empty() {
let policy = EnvironmentPolicy::Isolated;
let exceptions = policy.to_exceptions();
assert!(
exceptions.is_empty(),
"Isolated environment policy should not yield any environment exceptions"
);
}
#[test]
fn environment_policy_to_exceptions_allow_list_maps_to_environment_exceptions() {
let policy = EnvironmentPolicy::AllowList(vec!["FOO".into(), "BAR".into()]);
let exceptions = policy.to_exceptions();
assert_eq!(
exceptions,
vec![
Exception::Environment("FOO".into()),
Exception::Environment("BAR".into())
],
"AllowList should map each allowed variable to an Exception::Environment entry"
);
}
#[test]
fn environment_policy_to_exceptions_inherit_all_maps_to_full_environment_exception() {
let policy = EnvironmentPolicy::InheritAll;
let exceptions = policy.to_exceptions();
assert_eq!(
exceptions,
vec![Exception::FullEnvironment],
"InheritAll should map to a single Exception::FullEnvironment entry"
);
}
#[test]
fn profile_whitelists_linux_runtime_roots() {
```
If the `birdcage::Exception` type is re-exported under a different path in your codebase (for example `crate::birdcage::Exception` or `crate::sandbox::birdcage::Exception`), adjust the `use birdcage::Exception;` import accordingly.
If `EnvironmentPolicy::AllowList` takes a different inner type than `Vec<String>` (for example `Vec<&'static str>` or a slice), update the construction in `environment_policy_to_exceptions_allow_list_maps_to_environment_exceptions` to match its actual signature.
</issue_to_address>
### Comment 5
<location> `crates/weaver-sandbox/src/tests/support/mod.rs:1` </location>
<code_context>
+//! Shared fixtures for sandbox behavioural tests.
+
+use std::fs;
</code_context>
<issue_to_address>
**suggestion (testing):** The shared test world doesn’t currently exercise networking policy; consider adding a scenario around `NetworkPolicy::Allow`.
Current behavioural tests through `TestWorld` cover filesystem and environment policy but not networking. Right now `NetworkPolicy` is only checked via unit tests that validate enum values, not that `Exception::Networking` is actually wired through. Please add coverage that verifies a profile with `allow_networking()` results in the networking exception being applied (either via a `TestWorld` scenario or a crate-internal helper that inspects the exception set for a profile). This will confirm the flag properly flows into the underlying birdcage exceptions.
Suggested implementation:
```rust
use std::fs;
use std::io::Write;
use std::path::{Path, PathBuf};
use std::sync::{Mutex, MutexGuard};
use std::sync::OnceLock;
use tempfile::TempDir;
use crate::error::SandboxError;
use crate::env_guard::EnvGuard;
use crate::process::Stdio;
use crate::policy::NetworkPolicy;
use crate::profile::Profile;
use crate::sandbox::Exception;
/// Assert that a profile with `NetworkPolicy::Allow` has the networking exception wired through
/// to the underlying sandbox implementation.
///
/// This is intended to be used from behavioural tests that build a profile via `TestWorld`
/// and call `.allow_networking()` (or equivalent) on the profile builder.
pub fn assert_networking_exception_applied(profile: &Profile) {
assert!(
matches!(profile.network_policy(), NetworkPolicy::Allow),
"assert_networking_exception_applied must be called with a profile whose networking policy is Allow"
);
let exceptions = profile.exceptions();
assert!(
exceptions.contains(&Exception::Networking),
"expected networking exception when NetworkPolicy::Allow is enabled; got {exceptions:?}"
);
}
```
To fully implement the requested behavioural coverage, you should also:
1. **Add/extend a behavioural test using `TestWorld`:**
- In the appropriate test module (e.g. `crates/weaver-sandbox/src/tests/behaviour.rs` or similar), create a scenario that:
- Constructs a `TestWorld`.
- Builds a sandbox profile via the usual helper / builder in `TestWorld`.
- Enables networking via the existing API (e.g. `.allow_networking()` on the profile builder, or setting `NetworkPolicy::Allow` explicitly).
- Obtains the resulting `Profile` instance.
- Calls `assert_networking_exception_applied(&profile)` to verify that the `Exception::Networking` flag is present.
Example sketch (adjust names/paths to match your codebase):
```rust
#[test]
fn profile_with_allow_networking_sets_networking_exception() {
let mut world = TestWorld::new();
let profile = world
.profile_builder()
.allow_networking() // or equivalent
.build()
.expect("failed to build profile");
crate::tests::support::assert_networking_exception_applied(&profile);
}
```
2. **Align type/module paths if they differ:**
- If `NetworkPolicy`, `Profile`, or `Exception` live under slightly different modules (e.g. `crate::policy::network::NetworkPolicy`, `crate::sandbox::profile::Profile`, `crate::sandbox::exceptions::Exception`), update the `use` statements in `mod.rs` accordingly.
- If `Profile` exposes exceptions or network policy via differently-named methods (e.g. `profile.exception_set()` or `profile.networking()`), adjust `assert_networking_exception_applied` to use the correct accessors.
3. **Ensure the `Exception::Networking` variant name matches:**
- If the variant is named differently (`NetworkingAllowed`, `Network`, etc.), update the helper to match the actual enum variant.
These additional changes will ensure the behavioural tests actually exercise that `NetworkPolicy::Allow` flows through into the underlying birdcage exceptions, not just the enum wiring.
</issue_to_address>
### Comment 6
<location> `crates/weaver-sandbox/src/lib.rs:1` </location>
<code_context>
+//! Sandboxing utilities for Weaver processes.
+//!
+//! The `weaver-sandbox` crate wraps the [`birdcage`] library with policy
</code_context>
<issue_to_address>
**issue (review_instructions):** Add both unit and behavioural tests for the new `weaver-sandbox` crate; the newly added test files and feature file must contain concrete tests, not placeholders.
You introduce a new crate (`weaver-sandbox`) with non-trivial behaviour (profiling, path canonicalisation, environment restoration, network policy, thread-count checks), but the newly added test modules under `crates/weaver-sandbox/src/tests/` and the `tests/features/sandbox.feature` file are empty in this diff.
Implement unit tests that exercise `SandboxProfile`, `EnvironmentPolicy::to_exceptions`, `EnvGuard`, `canonicalise`/`rebuild_from_existing_ancestor`, `thread_count`, and the error-mapping logic in `Sandbox::spawn` and the helper functions. Also implement behavioural tests (e.g. BDD-style using the `sandbox.feature` and associated step definitions) that demonstrate end-to-end behaviour: rejecting non-absolute paths, enforcing the executable allowlist, restoring the environment after `birdcage` activation, default network denial vs. `allow_networking`, and the `MultiThreaded` error path.
Populate the existing test files with these tests rather than leaving them empty so that this new feature is covered by both unit and behavioural tests as required.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*`
**Instructions:**
For any new feature or change to an existing feature, both behavioural *and* unit tests are required.
</details>
</issue_to_address>
### Comment 7
<location> `crates/weaver-sandbox/src/env_guard.rs:32` </location>
<code_context>
+ // Safety: project policy requires env mutation to be wrapped in
+ // `unsafe` until the std APIs settle for Rust 2024. We mutate
+ // only after snapshotting to avoid iterator invalidation.
+ unsafe { env::remove_var(key) };
+ }
+ }
</code_context>
<issue_to_address>
**issue (review_instructions):** `unsafe` is used here even though the underlying APIs are safe; this violates the requirement to avoid `unsafe` unless it is absolutely necessary.
`env::remove_var` and `env::set_var` are safe functions, so wrapping them in an `unsafe` block is not required by Rust itself. Given the instruction to avoid `unsafe` unless absolutely necessary, these calls should be made directly without `unsafe`. If there is a strong reason to keep `unsafe` (e.g. project-wide policy), it should be documented and aligned with this repository’s guidelines, but as written this looks avoidable.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*.rs`
**Instructions:**
Avoid `unsafe` unless absolutely necessary.
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Hey there - I've reviewed your changes and found some issues that need to be addressed.
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `crates/weaver-sandbox/src/profile.rs:91-92` </location>
<code_context>
+
+ /// Whitelists an environment variable for inheritance.
+ #[must_use]
+ pub fn allow_environment_variable(mut self, key: impl Into<String>) -> Self {
+ self.environment = self.environment.clone().with_allowed(key.into());
+ self
+ }
</code_context>
<issue_to_address>
**suggestion (performance):** Avoid cloning the entire `EnvironmentPolicy` enum on every `allow_environment_variable` call.
Taking `self` by value forces a full clone of `self.environment` just to call `with_allowed`, which can be costly as the allow-list grows. Consider either (a) changing `with_allowed` to take `&mut self` and match on `self` directly, or (b) having `allow_environment_variable` take `&mut self` and mutate in place (optionally via an internal `allow_environment_variable_mut`). Both options avoid the extra clone while preserving the builder-style API.
Suggested implementation:
```rust
/// Whitelists an environment variable for inheritance.
///
/// This method preserves the builder-style API while mutating the internal
/// environment policy in place to avoid cloning on each call.
#[must_use]
pub fn allow_environment_variable(mut self, key: impl Into<String>) -> Self {
self.allow_environment_variable_mut(key);
self
}
/// Mutable variant of [`allow_environment_variable`] that updates the
/// environment policy in place.
pub fn allow_environment_variable_mut(&mut self, key: impl Into<String>) {
// `with_allowed` is expected to mutate the existing policy in place.
self.environment.with_allowed(key.into());
}
/// Network access policy applied to sandboxed processes.
```
To fully avoid the clone in `allow_environment_variable`, you also need to update the environment policy type (likely `EnvironmentPolicy`) and its `with_allowed` method:
1. Change the signature of `with_allowed` from something like:
```rust
impl EnvironmentPolicy {
pub fn with_allowed(self, key: String) -> Self {
match self {
// ...
}
}
}
```
to:
```rust
impl EnvironmentPolicy {
pub fn with_allowed(&mut self, key: String) {
match self {
// mutate `self` in place, pushing `key` into the allow-list, etc.
}
}
}
```
2. Update all call sites of `with_allowed` to pass `&mut` instead of using the returned value. For example, replace:
```rust
self.environment = self.environment.clone().with_allowed(key);
```
with:
```rust
self.environment.with_allowed(key);
```
These changes ensure `allow_environment_variable` no longer clones the entire `EnvironmentPolicy` on each call while preserving the existing builder-style API.
</issue_to_address>
### Comment 2
<location> `crates/weaver-sandbox/src/tests/behaviour.rs:1-3` </location>
<code_context>
+//! Behavioural tests for sandbox spawning using `rstest-bdd`.
+
+use std::cell::RefCell;
</code_context>
<issue_to_address>
**suggestion (testing):** Behavioural tests should be gated to Linux to avoid panics from `resolve_binary` on non-Linux hosts
This suite depends on `resolve_binary`, which panics on non-Linux targets. As-is, `behaviour.rs` builds everywhere and will cause macOS/Windows test runs to panic instead of skip. Please gate these tests with `#[cfg(target_os = "linux")]` (either at module level or on the `sandbox_behaviour` scenario, possibly combined with `#[cfg(test)]`) so they are only built and run where supported.
```suggestion
//! Behavioural tests for sandbox spawning using `rstest-bdd`.
//!
//! These tests are only built and run on Linux, since they rely on sandbox
//! functionality that is only available there.
#![cfg(all(test, target_os = "linux"))]
use std::cell::RefCell;
```
</issue_to_address>
### Comment 3
<location> `crates/weaver-sandbox/src/tests/env_guard.rs:17` </location>
<code_context>
+ .expect("env mutex poisoned")
+}
+
+#[test]
+fn restores_modified_and_removed_environment_variables() {
+ const EXISTING: &str = "WEAVER_ENV_GUARD_EXISTING";
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test that relies on `EnvGuard`’s `Drop` implementation rather than explicit `restore`
Current tests only assert behavior when `snapshot.restore()` is called explicitly, but `Sandbox::spawn` relies on `EnvGuard`’s `Drop` to restore the environment. Please add a test that creates an `EnvGuard` with `capture()`, mutates the environment, then lets it go out of scope without calling `restore()`, and finally asserts the environment is restored. This ensures regressions in the `Drop` path are caught.
Suggested implementation:
```rust
fn restores_modified_and_removed_environment_variables() {
```
I can only see the beginning of the existing test function, so I can’t safely anchor an insertion point in the current code. Please **add the following new test function** (for example, just after `restores_modified_and_removed_environment_variables`) to exercise the `Drop` behavior of `EnvGuard`:
```rust
#[test]
fn restores_environment_on_drop_without_explicit_restore() {
const EXISTING: &str = "WEAVER_ENV_GUARD_DROP_EXISTING";
const EPHEMERAL: &str = "WEAVER_ENV_GUARD_DROP_EPHEMERAL";
let _guard = lock_env();
// Set up initial environment state.
unsafe { std::env::set_var(EXISTING, "original") };
{
// Capture the environment.
let _snapshot = EnvGuard::capture();
// Mutate existing and add new variables while the snapshot is alive.
unsafe { std::env::set_var(EXISTING, "modified") };
unsafe { std::env::set_var(EPHEMERAL, "temporary") };
// No explicit call to `_snapshot.restore()`; we rely on `Drop`.
}
// After `_snapshot` is dropped, the environment must be restored.
unsafe {
assert_eq!(std::env::var(EXISTING).as_deref(), Ok("original"));
assert!(std::env::var(EPHEMERAL).is_err(), "ephemeral var should be removed on drop");
}
}
```
This test:
1. Uses `lock_env()` to serialize environment access, matching existing conventions.
2. Calls `EnvGuard::capture()`, mutates the environment, and **does not** call `restore()`.
3. Relies on `EnvGuard`’s `Drop` implementation to restore the environment, then asserts:
- the original value of `EXISTING` is restored, and
- the ephemeral variable is removed.
Place this function alongside the other `EnvGuard` tests in `crates/weaver-sandbox/src/tests/env_guard.rs`.
</issue_to_address>
### Comment 4
<location> `crates/weaver-sandbox/src/tests/sandbox_unit.rs:12` </location>
<code_context>
+ .expect("env mutex poisoned")
+}
+
+#[test]
+fn restores_modified_and_removed_environment_variables() {
+ const EXISTING: &str = "WEAVER_ENV_GUARD_EXISTING";
</code_context>
<issue_to_address>
**suggestion (testing):** Missing unit coverage for the `MultiThreaded` and `ThreadCountUnavailable` sandbox errors
Current tests cover paths and whitelisting but not the `thread_count()` preflight behavior. Please add a test that:
- starts an extra thread that stays alive,
- calls `Sandbox::spawn`, and
- asserts it returns `SandboxError::MultiThreaded { thread_count: .. }`.
If practical, also add a test for the `ThreadCountUnavailable` branch by injecting or mocking a failing `thread_count()` (e.g., via a wrapper or feature flag in tests) so that IO errors from `/proc/self/status` are verified to surface as `SandboxError::ThreadCountUnavailable`.
</issue_to_address>
### Comment 5
<location> `crates/weaver-sandbox/src/tests/unit.rs:7` </location>
<code_context>
+ .expect("env mutex poisoned")
+}
+
+#[test]
+fn restores_modified_and_removed_environment_variables() {
+ const EXISTING: &str = "WEAVER_ENV_GUARD_EXISTING";
</code_context>
<issue_to_address>
**suggestion (testing):** The `canonicalises_nonexistent_child_when_parent_exists` test name suggests behaviour it doesn’t actually validate
This test only checks that the future file path is recorded in `read_write_paths`; it never exercises the canonicalisation logic (`canonicalise`/`rebuild_from_existing_ancestor`) used when spawning a sandboxed command. As a result, the name is misleading and `rebuild_from_existing_ancestor` remains untested. Either rename the test to reflect that it covers only profile recording, or (preferably) add a sandbox-level test that drives `Sandbox::spawn`/`collect_exceptions` with a non-existent child path under a temp dir and asserts that the resulting exception path matches the expected rebuilt canonical path.
</issue_to_address>
### Comment 6
<location> `docs/weaver-design.md:989-990` </location>
<code_context>
+v0.8.1 with Weaver defaults. Executables must be supplied as absolute paths and
+whitelisted explicitly; the wrapper canonicalises the whitelist before launch
+to prevent symlink escapes. Standard Linux library roots (`/lib`, `/lib64`,
+`/usr/lib`, and their architecture-specific variants) are whitelisted for
+read-only access by default so dynamically linked binaries remain functional
+without exposing the wider filesystem. Network access remains disabled unless
+requested, and the environment is isolated by default with an allowlist for
</code_context>
<issue_to_address>
**nitpick (typo):** Consider adding a comma before "so" for clarity in this result clause.
In this sentence, consider changing to: "...read-only access by default, so dynamically linked binaries remain functional...". The comma makes it clearer that the second part describes the consequence of the default behavior.
```suggestion
`/usr/lib`, and their architecture-specific variants) are whitelisted for
read-only access by default, so dynamically linked binaries remain functional
```
</issue_to_address>
### Comment 7
<location> `docs/users-guide.md:145-146` </location>
<code_context>
+
+External tools launched by the daemon now run inside the `weaver-sandbox`
+wrapper around `birdcage` 0.8.1. Linux namespaces and `seccomp-bpf` filters are
+applied automatically, networking is disabled by default, and only a small set
+of standard library directories are readable to keep dynamically linked
+executables functioning. Commands must be provided as absolute paths and added
</code_context>
<issue_to_address>
**suggestion (typo):** The sentence here is a bit of a comma splice; consider adjusting punctuation or conjunctions.
In the phrase “are applied automatically, networking is disabled by default,” two independent clauses are joined only by a comma. Consider changing that comma to “and” or a semicolon to avoid the comma splice.
```suggestion
wrapper around `birdcage` 0.8.1. Linux namespaces and `seccomp-bpf` filters are
applied automatically; networking is disabled by default, and only a small set
```
</issue_to_address>
### Comment 8
<location> `crates/weaver-sandbox/src/sandbox.rs:80` </location>
<code_context>
+ })
+ }
+
+ fn collect_exceptions(&self, program: &Path) -> Result<Vec<Exception>, SandboxError> {
+ let mut exceptions = Vec::new();
+ let read_only = canonicalised_set(self.profile.read_only_paths())?;
</code_context>
<issue_to_address>
**issue (complexity):** Consider unifying program handling within `collect_exceptions` and simplifying `canonicalised_set` to avoid special cases and make the sandbox path logic easier to follow.
You can simplify the “double treatment” of the program and make the path handling a bit easier to follow without changing behaviour.
### 1. Unify program handling in `collect_exceptions`
Right now `program` is:
- canonicalised in `canonical_program`
- checked against `executable_paths` via `canonicalised_set`
- and then pushed again as a bespoke exception in `collect_exceptions`.
You can fold the program into the executable set up front and let the generic logic handle it, removing the special case and the need to reason about “did we already include this path or not?”.
```rust
fn collect_exceptions(&self, program: &Path) -> Result<Vec<Exception>, SandboxError> {
let mut exceptions = Vec::new();
let read_only = canonicalised_set(self.profile.read_only_paths())?;
let read_write = canonicalised_set(self.profile.read_write_paths())?;
// Start with the canonicalised executable paths.
let mut executables = canonicalised_set(self.profile.executable_paths())?;
// Ensure the canonicalised program is in the executable set as well.
executables.insert(program.to_path_buf());
for path in read_only {
exceptions.push(Exception::Read(path));
}
for path in read_write {
exceptions.push(Exception::WriteAndRead(path));
}
for path in executables {
exceptions.push(Exception::ExecuteAndRead(path));
}
exceptions.extend(self.profile.environment_policy().to_exceptions());
if matches!(self.profile.network_policy(), NetworkPolicy::Allow) {
exceptions.push(Exception::Networking);
}
Ok(exceptions)
}
```
This keeps all functionality intact (the program is still required to be whitelisted and still gets an `ExecuteAndRead` exception) but removes one branch and the bespoke push at the bottom.
### 2. Trim `canonicalised_set` to make behaviour more obvious
You can make `canonicalised_set` more self‑contained and readable by expressing it in terms of iterator combinators. Behaviour stays identical, but the “canonicalise → dedupe” story is more obvious:
```rust
fn canonicalised_set(paths: &[PathBuf]) -> Result<BTreeSet<PathBuf>, SandboxError> {
let mut set = BTreeSet::new();
for path in paths {
set.insert(canonicalise(path)?);
}
Ok(set)
}
```
or, if you prefer a more functional style without changing semantics:
```rust
fn canonicalised_set(paths: &[PathBuf]) -> Result<BTreeSet<PathBuf>, SandboxError> {
paths
.iter()
.map(|p| canonicalise(p))
.collect()
}
```
(Using `FromIterator` for `Result<BTreeSet<_>, _>`.)
These changes don’t touch the ancestor‑rebuild semantics, but they reduce branching and bespoke handling in the hot path, making the code easier to audit.
</issue_to_address>
### Comment 9
<location> `crates/weaver-sandbox/src/lib.rs:1` </location>
<code_context>
+//! Sandboxing utilities for Weaver processes.
+//!
+//! The `weaver-sandbox` crate wraps the [`birdcage`] library with policy
</code_context>
<issue_to_address>
**issue (review_instructions):** Add both unit and behavioural tests covering the new `weaver-sandbox` crate, including its error paths and sandbox policies.
The `weaver-sandbox` crate introduces a substantial new feature surface (profiles, env guard, runtime helpers, sandbox orchestration), but the added test files under `crates/weaver-sandbox/src/tests` and `crates/weaver-sandbox/tests/features/sandbox.feature` are empty in this diff. Implement unit tests for modules like `EnvGuard`, `SandboxProfile`, `Sandbox`, and `runtime::thread_count/linux_runtime_roots`, and add behavioural/feature tests that exercise end-to-end sandboxed command execution and validation of policies (e.g. absolute path requirement, whitelist enforcement, environment isolation, networking toggle, multi-threaded error). Ensure these tests are committed as part of this change.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*`
**Instructions:**
For any new feature or change to an existing feature, both behavioural and unit tests are required.
</details>
</issue_to_address>
### Comment 10
<location> `crates/weaver-sandbox/src/env_guard.rs:32` </location>
<code_context>
+ // Safety: project policy requires env mutation to be wrapped in
+ // `unsafe` until the std APIs settle for Rust 2024. We mutate
+ // only after snapshotting to avoid iterator invalidation.
+ unsafe { env::remove_var(key) };
+ }
+ }
</code_context>
<issue_to_address>
**issue (review_instructions):** `env::remove_var` is a safe API and does not require an `unsafe` block, so this use of `unsafe` violates the "avoid unsafe" guideline.
The standard library guarantees that `std::env::remove_var` is safe to call; wrapping it in `unsafe` does not provide additional correctness and conflicts with the project rule to avoid `unsafe` unless strictly necessary. Please remove the `unsafe` block here and call `env::remove_var` directly (keeping the surrounding logic unchanged).
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*.rs`
**Instructions:**
Avoid `unsafe` unless absolutely necessary.
</details>
</issue_to_address>
### Comment 11
<location> `crates/weaver-sandbox/src/env_guard.rs:38` </location>
<code_context>
+
+ for (key, value) in &self.original {
+ // Safety: see note above regarding env mutation policy.
+ unsafe { env::set_var(key, value) };
+ }
+ }
</code_context>
<issue_to_address>
**issue (review_instructions):** `env::set_var` is also safe and does not justify an `unsafe` block under the current guidelines.
Similarly to `remove_var`, `std::env::set_var` is a safe function and does not need to be wrapped in `unsafe`. Please drop the `unsafe` block and call `env::set_var(key, value)` directly so that the implementation adheres to the project’s restriction on `unsafe` usage.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*.rs`
**Instructions:**
Avoid `unsafe` unless absolutely necessary.
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Hey there - I've reviewed your changes and found some issues that need to be addressed.
- In
Sandbox::collect_exceptionsyou always push anExecuteAndReadexception forprogrameven if it is already present in the canonicalised executable set, which can be trivially de-duplicated by checking membership before pushing to avoid redundant exceptions. - Both the behavioural test support (
tests/support/mod.rs) andenv_guardtests define their ownOnceLock<Mutex<()>>for environment serialisation; consider centralising this environment lock in a single helper to avoid duplication and ensure all env-mutating tests share the same synchronisation primitive.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `Sandbox::collect_exceptions` you always push an `ExecuteAndRead` exception for `program` even if it is already present in the canonicalised executable set, which can be trivially de-duplicated by checking membership before pushing to avoid redundant exceptions.
- Both the behavioural test support (`tests/support/mod.rs`) and `env_guard` tests define their own `OnceLock<Mutex<()>>` for environment serialisation; consider centralising this environment lock in a single helper to avoid duplication and ensure all env-mutating tests share the same synchronisation primitive.
## Individual Comments
### Comment 1
<location> `crates/weaver-sandbox/src/sandbox.rs:80-105` </location>
<code_context>
+ let mut exceptions = Vec::new();
+ let read_only = canonicalised_set(self.profile.read_only_paths())?;
+ let read_write = canonicalised_set(self.profile.read_write_paths())?;
+ let executables = canonicalised_set(self.profile.executable_paths())?;
+
+ for path in read_only {
+ exceptions.push(Exception::Read(path));
+ }
+ for path in read_write {
+ exceptions.push(Exception::WriteAndRead(path));
+ }
+ for path in executables {
+ exceptions.push(Exception::ExecuteAndRead(path));
+ }
+
+ exceptions.push(Exception::ExecuteAndRead(program.to_path_buf()));
+
+ exceptions.extend(self.profile.environment_policy().to_exceptions());
</code_context>
<issue_to_address>
**suggestion:** Avoid adding a duplicate `ExecuteAndRead` exception when the program is already whitelisted as executable.
In `collect_exceptions`, you convert all profile `executable_paths` into `ExecuteAndRead` exceptions, then always add another `ExecuteAndRead(program)`. If the canonicalised `program` is already in the profile, this duplicates the exception. To avoid this, only add the extra entry when it’s not already in the `executables` set:
```rust
let executables = canonicalised_set(self.profile.executable_paths())?;
for path in &executables {
exceptions.push(Exception::ExecuteAndRead(path.clone()));
}
if !executables.contains(program) {
exceptions.push(Exception::ExecuteAndRead(program.to_path_buf()));
}
```
This preserves behavior while avoiding duplicate entries.
```suggestion
fn collect_exceptions(&self, program: &Path) -> Result<Vec<Exception>, SandboxError> {
let mut exceptions = Vec::new();
let read_only = canonicalised_set(self.profile.read_only_paths())?;
let read_write = canonicalised_set(self.profile.read_write_paths())?;
let executables = canonicalised_set(self.profile.executable_paths())?;
for path in read_only {
exceptions.push(Exception::Read(path));
}
for path in read_write {
exceptions.push(Exception::WriteAndRead(path));
}
for path in &executables {
exceptions.push(Exception::ExecuteAndRead(path.clone()));
}
if !executables.contains(program) {
exceptions.push(Exception::ExecuteAndRead(program.to_path_buf()));
}
exceptions.extend(self.profile.environment_policy().to_exceptions());
if matches!(self.profile.network_policy(), NetworkPolicy::Allow) {
exceptions.push(Exception::Networking);
}
Ok(exceptions)
}
```
</issue_to_address>
### Comment 2
<location> `crates/weaver-sandbox/src/tests/unit.rs:85-86` </location>
<code_context>
+}
+
+#[test]
+fn canonicalises_nonexistent_child_when_parent_exists() {
+ let base = tempfile::tempdir().expect("tempdir");
+ let target = base.path().join("future_dir").join("file.txt");
+
+ let profile = SandboxProfile::new().allow_read_write_path(&target);
+
+ let set = profile.read_write_paths();
+ assert!(
+ set.iter().any(|p| p.ends_with("file.txt")),
</code_context>
<issue_to_address>
**suggestion (testing):** This test name suggests canonicalisation behaviour, but it currently only asserts that the path is recorded.
Here the test exercises only that `SandboxProfile` records a path ending in `file.txt`; the actual non-existent-child canonicalisation happens in `sandbox.rs` during spawn/exception collection, not when adding paths to the profile.
To make this clearer and improve coverage, either:
- Rename the test to something like `records_future_child_paths_when_parent_exists`, or
- Add/extend a test that goes through the full sandbox flow: create a temp dir with an existing parent but no leaf file, add the path via `allow_read_write_path`, trigger `collect_exceptions` (e.g. by spawning a whitelisted no-op program), and assert that the resulting `Exception::WriteAndRead` path matches the expected canonicalised form.
```suggestion
#[test]
fn records_future_child_paths_when_parent_exists() {
```
</issue_to_address>
### Comment 3
<location> `crates/weaver-sandbox/src/tests/behaviour.rs:126-127` </location>
<code_context>
+ );
+}
+
+#[then("environment markers are cleaned up")]
+fn then_environment_cleaned(world: &RefCell<TestWorld>) {
+ world.borrow_mut().restore_env();
+ assert_ne!(
</code_context>
<issue_to_address>
**suggestion (testing):** The behaviour tests don't currently exercise the MultiThreaded or ThreadCountUnavailable error paths.
Since `Sandbox::spawn` explicitly guards against multi-threaded contexts and `/proc` parsing failures, it would be useful to add tests that cover those preflight paths. You could introduce a small abstraction (e.g., a test-only hook or trait) around `thread_count()` so tests can:
- Force a value > 1 and assert `SandboxError::MultiThreaded`.
- Force an `io::Error` and assert `SandboxError::ThreadCountUnavailable`.
These can be unit tests rather than BDD scenarios and would exercise the most security-sensitive checks.
Suggested implementation:
```rust
#[then("environment markers are cleaned up")]
fn then_environment_cleaned(world: &RefCell<TestWorld>) {
world.borrow_mut().restore_env();
assert_ne!(
std::env::var_os("KEEP_ME"),
Some(std::ffi::OsString::from("present")),
"KEEP_ME still holds the scenario value after restoration"
);
assert_ne!(
std::env::var_os("DROP_ME"),
Some(std::ffi::OsString::from("remove-me")),
"DROP_ME still holds the scenario value after restoration"
);
}
#[cfg(test)]
mod thread_count_error_tests {
use super::*;
use crate::{Sandbox, SandboxError};
/// Helper to construct a sandbox instance suitable for preflight testing.
///
/// This assumes `Sandbox::builder()` exists; adjust to match the actual
/// constructor API if needed.
fn make_sandbox_for_thread_test() -> Sandbox {
Sandbox::builder()
.build()
.expect("failed to build sandbox for thread-count tests")
}
/// When the injected thread counter reports more than one thread,
/// we should fail fast with `SandboxError::MultiThreaded`.
#[test]
fn spawn_fails_when_thread_count_greater_than_one() {
let sandbox = make_sandbox_for_thread_test();
// `spawn_with_thread_counter` is a test-only hook that bypasses the
// real `thread_count()` implementation.
let result = sandbox.spawn_with_thread_counter(|| Ok(2));
match result {
Err(SandboxError::MultiThreaded(_)) => {}
other => panic!("expected SandboxError::MultiThreaded, got {:?}", other),
}
}
/// When the injected thread counter fails with an `io::Error`,
/// we should fail fast with `SandboxError::ThreadCountUnavailable`.
#[test]
fn spawn_fails_when_thread_count_unavailable() {
let sandbox = make_sandbox_for_thread_test();
let io_err = std::io::Error::new(std::io::ErrorKind::Other, "synthetic /proc failure");
let result = sandbox.spawn_with_thread_counter(|| Err(io_err));
match result {
Err(SandboxError::ThreadCountUnavailable(_)) => {}
other => panic!(
"expected SandboxError::ThreadCountUnavailable, got {:?}",
other
),
}
}
}
```
To make these tests compile and exercise the intended preflight paths, the core sandbox code will need a small abstraction around `thread_count()`:
1. In the module where `Sandbox::spawn` is implemented:
- Extract the existing thread-count logic into a function or trait, e.g.:
```rust
pub trait ThreadCounter {
fn thread_count(&self) -> std::io::Result<usize>;
}
pub struct ProcThreadCounter;
impl ThreadCounter for ProcThreadCounter {
fn thread_count(&self) -> std::io::Result<usize> {
thread_count() // existing implementation
}
}
```
- Update the internal implementation of `spawn` to use a generic thread counter, defaulting to the real implementation:
```rust
impl Sandbox {
fn spawn_internal<C, F>(&self, counter: C /* ... other args ... */) -> Result<..., SandboxError>
where
C: FnOnce() -> std::io::Result<usize>,
{
let threads = counter().map_err(SandboxError::ThreadCountUnavailable)?;
if threads > 1 {
return Err(SandboxError::MultiThreaded(threads));
}
// existing spawn logic continues here...
}
pub fn spawn(/* existing signature */) -> Result<..., SandboxError> {
self.spawn_internal(|| thread_count(), /* forward other args */)
}
/// Test-only hook used by `thread_count_error_tests`.
#[cfg(test)]
pub fn spawn_with_thread_counter<C>(&self, counter: C /* mirror spawn args as needed */)
-> Result<..., SandboxError>
where
C: FnOnce() -> std::io::Result<usize>,
{
self.spawn_internal(counter /* forward other args */)
}
}
```
- Ensure `SandboxError::MultiThreaded` and `SandboxError::ThreadCountUnavailable` variants are already defined and used here; if their payloads differ from `usize` or `io::Error`, adjust the tests’ pattern matches accordingly.
2. Align the tests with the real `Sandbox::builder` / `spawn` signatures:
- If `Sandbox` is constructed differently, update `make_sandbox_for_thread_test` to use the correct constructor.
- If `spawn` takes arguments (command, env, etc.), mirror those in `spawn_with_thread_counter` and pass simple dummy values from the tests that do not affect the preflight logic.
3. If the crate uses a different module path for `Sandbox`/`SandboxError`, adjust the `use crate::{Sandbox, SandboxError};` line to the correct paths.
</issue_to_address>
### Comment 4
<location> `crates/weaver-sandbox/src/sandbox.rs:138` </location>
<code_context>
+ }
+}
+
+fn rebuild_from_existing_ancestor(path: &Path) -> Result<PathBuf, SandboxError> {
+ let mut missing_components: Vec<PathBuf> = Vec::new();
+ let mut cursor = Some(path);
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying `rebuild_from_existing_ancestor` using `ancestors`/`strip_prefix` and replacing the `BTreeSet`-based path handling with straightforward `Vec<PathBuf>` operations to make the sandbox code easier to follow.
You can keep the current behavior and noticeably reduce complexity in two focused spots:
### 1. Simplify `rebuild_from_existing_ancestor`
You can avoid the manual parent-walking loop, the explicit `Vec` of components, and the `rev()`-fold by using `Path::ancestors()` plus `strip_prefix`:
```rust
fn rebuild_from_existing_ancestor(path: &Path) -> Result<PathBuf, SandboxError> {
// Find the closest existing ancestor
let existing = path
.ancestors()
.find(|p| p.exists())
.ok_or_else(|| SandboxError::MissingPath {
path: path.to_path_buf(),
})?;
// Canonicalize that ancestor
let base = fs::canonicalize(existing).map_err(|source| SandboxError::CanonicalisationFailed {
path: existing.to_path_buf(),
source,
})?;
// Re-attach the missing tail (path relative to existing ancestor)
let tail = path.strip_prefix(existing).unwrap_or_else(|_| Path::new(""));
Ok(base.join(tail))
}
```
This preserves the semantics of “canonicalize the first existing ancestor and then re-append missing components,” but removes the reverse tracking and explicit loop state.
### 2. Avoid `BTreeSet` unless you specifically need ordering/dedup
If you don’t rely on set semantics (ordering, dedup) for correctness, you can drop the `BTreeSet` abstraction and just canonicalize into a `Vec<PathBuf>`. That makes the logic more obvious and avoids per-spawn set construction:
```rust
fn canonicalised_paths(paths: &[PathBuf]) -> Result<Vec<PathBuf>, SandboxError> {
paths.iter().map(|p| canonicalise(p)).collect()
}
```
Usage:
```rust
fn ensure_program_whitelisted(&self, program: &Path) -> Result<(), SandboxError> {
let authorised = canonicalised_paths(self.profile.executable_paths())?;
if authorised.iter().any(|p| p == program) {
return Ok(());
}
Err(SandboxError::ExecutableNotAuthorised {
program: program.to_path_buf(),
})
}
fn collect_exceptions(&self, program: &Path) -> Result<Vec<Exception>, SandboxError> {
let mut exceptions = Vec::new();
let read_only = canonicalised_paths(self.profile.read_only_paths())?;
let read_write = canonicalised_paths(self.profile.read_write_paths())?;
let executables = canonicalised_paths(self.profile.executable_paths())?;
for path in read_only {
exceptions.push(Exception::Read(path));
}
for path in read_write {
exceptions.push(Exception::WriteAndRead(path));
}
for path in executables {
exceptions.push(Exception::ExecuteAndRead(path));
}
exceptions.push(Exception::ExecuteAndRead(program.to_path_buf()));
exceptions.extend(self.profile.environment_policy().to_exceptions());
if matches!(self.profile.network_policy(), NetworkPolicy::Allow) {
exceptions.push(Exception::Networking);
}
Ok(exceptions)
}
```
Behavior is preserved (duplicates in the profile still just translate to repeated exceptions, which `birdcage` can ignore or handle), but the per-spawn logic is simpler and more explicit. If you *do* need dedup, you can still dedup the `Vec` with a small helper instead of exposing `BTreeSet` in the control flow.
</issue_to_address>
### Comment 5
<location> `crates/weaver-sandbox/src/lib.rs:51` </location>
<code_context>
+pub use birdcage::process;
+pub use error::SandboxError;
+pub use profile::{EnvironmentPolicy, NetworkPolicy, SandboxProfile};
+pub use sandbox::{Sandbox, SandboxChild, SandboxCommand, SandboxOutput};
--- /dev/null
+++ b/crates/weaver-sandbox/src/env_guard.rs
</code_context>
<issue_to_address>
**issue (review_instructions):** Add both behavioural and unit tests for the new `weaver-sandbox` crate; the newly added test files are currently empty and do not exercise the sandbox behaviour or error paths.
Implement unit tests under `crates/weaver-sandbox/src/tests` that cover the key components (`SandboxProfile`, `Sandbox`, `EnvGuard`, `runtime::thread_count`, and `runtime::linux_runtime_roots`). Exercise at least:
- successful spawning of an authorised absolute executable
- rejection of non-absolute paths (`ProgramNotAbsolute`)
- rejection of non-whitelisted executables (`ExecutableNotAuthorised`)
- handling of missing and non-canonicalisable paths (`MissingPath`, `CanonicalisationFailed`)
- behaviour of environment policies (`Isolated`, `AllowList`, `InheritAll`) and `EnvGuard` restoring the parent environment
- `NetworkPolicy` toggling and how it affects collected `Exception`s
- single-threaded vs multi-threaded preflight (`MultiThreaded`, `ThreadCountUnavailable`).
In addition, implement behavioural tests (e.g. BDD-style scenarios using `rstest-bdd` and `sandbox.feature`) that drive the crate end-to-end via `Sandbox::spawn`, asserting observable behaviour (filesystem access, environment isolation, networking defaults, and the multi-threaded error behaviour). Ensure these tests are actually populated and executed rather than leaving the added test files empty.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*`
**Instructions:**
For any new feature or change to an existing feature, both behavioural *and* unit tests are required.
</details>
</issue_to_address>
### Comment 6
<location> `crates/weaver-sandbox/src/env_guard.rs:29` </location>
<code_context>
+ // Remove variables introduced while the guard was active.
+ for key in current.keys() {
+ if !expected_keys.contains(key) {
+ // Safety: project policy requires env mutation to be wrapped in
+ // `unsafe` until the std APIs settle for Rust 2024. We mutate
+ // only after snapshotting to avoid iterator invalidation.
</code_context>
<issue_to_address>
**issue (review_instructions):** `unsafe` is used around safe `std::env` functions without an apparent technical need, which conflicts with the "avoid unsafe unless absolutely necessary" instruction.
The `restore` method wraps `env::remove_var` and `env::set_var` in `unsafe` blocks:
```rust
unsafe { env::remove_var(key) };
...
unsafe { env::set_var(key, value) };
```
These standard library APIs are already safe to call, so the `unsafe` blocks do not appear technically required to uphold memory or thread safety. That makes this a direct use of `unsafe` without a demonstrated necessity, contrary to the review instruction.
Unless there is a concrete UB-related reason that is missing from the comment, please remove the `unsafe` blocks and call the safe functions directly (or otherwise justify and encapsulate the `unsafe` in the narrowest possible way).
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*.rs`
**Instructions:**
Avoid `unsafe` unless absolutely necessary.
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Hey there - I've reviewed your changes and found some issues that need to be addressed.
- The
canonicalise/rebuild_from_existing_ancestorlogic currently returnsOkfor paths whose leaf component does not exist (e.g./definitely/missing/toolbecomes a canonical path even though it’s missing), which meansSandbox::spawnwill never produceSandboxError::MissingPathfor such cases; consider changingrebuild_from_existing_ancestor(orcanonical_program) to error when the full path does not exist so the error variants and behaviour match the intent of the API and tests. - In
Sandbox::collect_exceptions,ExecuteAndRead(program)is always pushed even whenprogramis already present in the canonicalised executable set, which is harmless but redundant; you could avoid the duplicate by either not re-addingprogramwhen it’s already authorised or by excluding it from the initialexecutablesset.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `canonicalise`/`rebuild_from_existing_ancestor` logic currently returns `Ok` for paths whose leaf component does not exist (e.g. `/definitely/missing/tool` becomes a canonical path even though it’s missing), which means `Sandbox::spawn` will never produce `SandboxError::MissingPath` for such cases; consider changing `rebuild_from_existing_ancestor` (or `canonical_program`) to error when the full path does not exist so the error variants and behaviour match the intent of the API and tests.
- In `Sandbox::collect_exceptions`, `ExecuteAndRead(program)` is always pushed even when `program` is already present in the canonicalised executable set, which is harmless but redundant; you could avoid the duplicate by either not re-adding `program` when it’s already authorised or by excluding it from the initial `executables` set.
## Individual Comments
### Comment 1
<location> `crates/weaver-sandbox/src/sandbox.rs:92-96` </location>
<code_context>
+ for path in read_write {
+ exceptions.push(Exception::WriteAndRead(path));
+ }
+ for path in executables {
+ exceptions.push(Exception::ExecuteAndRead(path));
+ }
+
+ exceptions.push(Exception::ExecuteAndRead(program.to_path_buf()));
+
+ exceptions.extend(self.profile.environment_policy().to_exceptions());
</code_context>
<issue_to_address>
**suggestion (bug_risk):** Avoid adding a duplicate `ExecuteAndRead` exception for the program path.
You’re pushing `ExecuteAndRead` for all `executables` and then again for `program`, which is guaranteed to be one of them. This can create duplicate exceptions for the same path if `birdcage` doesn’t deduplicate. Consider either skipping the final push when `program` is already in `executables`, or removing the loop over `executables` and relying only on the program-specific exception, if that matches your threat model.
```suggestion
for path in &executables {
exceptions.push(Exception::ExecuteAndRead(path.to_path_buf()));
}
if !executables.iter().any(|p| p == program) {
exceptions.push(Exception::ExecuteAndRead(program.to_path_buf()));
}
```
</issue_to_address>
### Comment 2
<location> `crates/weaver-sandbox/src/tests/sandbox_unit.rs:1` </location>
<code_context>
+//! Unit tests covering sandbox spawn preflight errors.
+
+use std::path::PathBuf;
</code_context>
<issue_to_address>
**suggestion (testing):** Missing coverage for the `MultiThreaded` and `ThreadCountUnavailable` sandbox preflight errors.
These variants are part of the safety contract for `Sandbox::spawn` and should be explicitly exercised. Please add tests that:
- Assert `SandboxError::MultiThreaded` when `thread_count()` reports more than one thread (e.g. by spawning a parked helper thread; you can gate this behind `#[cfg(target_os = "linux")]` if needed).
- Assert `SandboxError::ThreadCountUnavailable` when `thread_count()` fails, possibly by making `thread_count` injectable or adding a small test seam.
This helps ensure the sandbox continues to fail closed if `/proc` or threading behaviour changes.
Suggested implementation:
```rust
//! Unit tests covering sandbox spawn preflight errors.
use std::{path::PathBuf, thread, time::Duration};
use crate::sandbox::{Sandbox, SandboxCommand};
use crate::{SandboxError, SandboxProfile};
```
```rust
fn sandbox_with_profile(profile: SandboxProfile) -> Sandbox {
Sandbox::new(profile)
}
#[cfg(target_os = "linux")]
#[test]
fn rejects_spawn_when_multi_threaded() {
// Ensure we have at least one extra thread in the process before attempting
// to spawn a sandboxed command. The sandbox should detect this and fail
// closed with `SandboxError::MultiThreaded`.
let helper = thread::spawn(|| {
// Park the helper thread so it stays alive for the duration of the test.
thread::park();
});
// Give the helper thread a brief moment to start and register in `/proc`.
// The exact timing is not critical; we just want to avoid flakiness.
thread::sleep(Duration::from_millis(50));
let profile = SandboxProfile::default();
let sandbox = sandbox_with_profile(profile);
// Use a benign program path; the error we care about is the preflight
// multi-threaded check, not command resolution.
let command = SandboxCommand::new(PathBuf::from("/bin/true"));
let result = sandbox.spawn(&command);
assert!(
matches!(result, Err(SandboxError::MultiThreaded)),
"expected MultiThreaded error when spawning sandbox in a multi-threaded process, got: {result:?}"
);
// Cleanly shut down the helper thread.
helper.thread().unpark();
helper.join().expect("helper thread should join cleanly");
}
#[test]
fn rejects_spawn_when_thread_count_unavailable() {
// This test verifies that a failure to determine the thread count causes
// `Sandbox::spawn` to fail closed with `SandboxError::ThreadCountUnavailable`.
//
// It relies on a test-only seam in the sandbox implementation that allows
// overriding the thread-counting logic.
use crate::sandbox::testing::with_thread_count_override;
let profile = SandboxProfile::default();
let sandbox = sandbox_with_profile(profile);
let command = SandboxCommand::new(PathBuf::from("/bin/true"));
let result = with_thread_count_override(|| {
// Simulate a failure to determine the thread count.
Err(std::io::Error::new(
std::io::ErrorKind::Other,
"simulated thread count failure",
))
.into()
}, || sandbox.spawn(&command));
assert!(
matches!(result, Err(SandboxError::ThreadCountUnavailable)),
"expected ThreadCountUnavailable error when thread counting fails, got: {result:?}"
);
}
#[test]
fn rejects_relative_program_paths() {
```
To make the `rejects_spawn_when_thread_count_unavailable` test compile and behave as intended, you will need to introduce a small test seam in the sandbox implementation:
1. In the sandbox module (likely `crates/weaver-sandbox/src/sandbox.rs` or similar), extract the logic that determines the current process thread count into a helper function, e.g.:
```rust
fn thread_count() -> std::io::Result<usize> { /* existing logic */ }
```
2. Under `#[cfg(test)]`, add a mechanism to override this function for the duration of a closure, for example:
```rust
pub mod testing {
use std::cell::RefCell;
use std::io;
thread_local! {
static OVERRIDE: RefCell<Option<Box<dyn Fn() -> io::Result<usize>>>> = RefCell::new(None);
}
pub fn with_thread_count_override<F, R>(
override_impl: impl Fn() -> io::Result<usize> + 'static,
f: F,
) -> R
where
F: FnOnce() -> R,
{
OVERRIDE.with(|slot| {
*slot.borrow_mut() = Some(Box::new(override_impl));
let result = f();
*slot.borrow_mut() = None;
result
})
}
pub fn thread_count() -> io::Result<usize> {
OVERRIDE.with(|slot| {
if let Some(ref f) = *slot.borrow() {
f()
} else {
super::thread_count()
}
})
}
}
```
3. Update the production `Sandbox::spawn` implementation to call `thread_count()` (or `testing::thread_count()` under `cfg(test)`) so that during tests, `with_thread_count_override` can force `thread_count()` to fail and surface `SandboxError::ThreadCountUnavailable`.
4. Ensure `SandboxCommand::new` and `SandboxProfile::default()` exist and are used consistently with the rest of the test file; if the existing tests construct commands/profiles differently, mirror that pattern in the two new tests instead of the placeholders above.
</issue_to_address>
### Comment 3
<location> `crates/weaver-sandbox/src/tests/unit.rs:7-16` </location>
<code_context>
+ .expect("env mutex poisoned")
+}
+
+#[test]
+fn restores_modified_and_removed_environment_variables() {
+ const EXISTING: &str = "WEAVER_ENV_GUARD_EXISTING";
</code_context>
<issue_to_address>
**suggestion (testing):** The `canonicalises_nonexistent_child_when_parent_exists` test name suggests canonicalisation behaviour, but it only checks path recording on the profile.
The current body only checks that `SandboxProfile` records a path ending with `file.txt`, and doesn’t touch the `rebuild_from_existing_ancestor` / canonicalisation logic in `sandbox.rs`, so the name is misleading.
Either rename the test to match what it actually verifies, or adjust it to call the canonicalisation helper (e.g. via a small test-only wrapper) and assert on the rebuilt path so the name and behaviour align.
</issue_to_address>
### Comment 4
<location> `crates/weaver-sandbox/src/tests/env_guard.rs:17-18` </location>
<code_context>
+ .expect("env mutex poisoned")
+}
+
+#[test]
+fn restores_modified_and_removed_environment_variables() {
+ const EXISTING: &str = "WEAVER_ENV_GUARD_EXISTING";
+ const ANOTHER: &str = "WEAVER_ENV_GUARD_ANOTHER";
</code_context>
<issue_to_address>
**suggestion (testing):** EnvGuard tests don't verify that restoration also happens via Drop when `restore` is not called explicitly.
Right now both tests only cover the explicit `snapshot.restore()` path. Please add a test that relies on `Drop` instead (e.g., create an `EnvGuard` in a scope, mutate env vars as in the existing tests, then assert after the scope ends that the env is restored). This will ensure the `Drop`-based restoration path—how EnvGuard is typically used—remains correct as the implementation evolves.
Suggested implementation:
```rust
#[test]
fn restores_modified_and_removed_environment_variables() {
const EXISTING: &str = "WEAVER_ENV_GUARD_EXISTING";
const ANOTHER: &str = "WEAVER_ENV_GUARD_ANOTHER";
const EPHEMERAL: &str = "WEAVER_ENV_GUARD_EPHEMERAL";
let _guard = lock_env();
unsafe { env::set_var(EXISTING, "original") };
unsafe { env::set_var(ANOTHER, "keep") };
let snapshot = EnvGuard::capture();
unsafe { env::set_var(EXISTING, "changed") };
unsafe { env::remove_var(ANOTHER) };
unsafe { env::set_var(EPHEMERAL, "ephemeral") };
snapshot.restore();
assert_eq!(unsafe { env::var(EXISTING).unwrap() }, "original");
assert_eq!(unsafe { env::var(ANOTHER).unwrap() }, "keep");
assert!(unsafe { env::var_os(EPHEMERAL) }.is_none());
}
#[test]
fn restores_environment_variables_on_drop() {
const EXISTING: &str = "WEAVER_ENV_GUARD_EXISTING";
const ANOTHER: &str = "WEAVER_ENV_GUARD_ANOTHER";
const EPHEMERAL: &str = "WEAVER_ENV_GUARD_EPHEMERAL";
let _guard = lock_env();
// Establish a known baseline
unsafe { env::set_var(EXISTING, "original") };
unsafe { env::set_var(ANOTHER, "keep") };
unsafe { env::remove_var(EPHEMERAL) };
{
let _snapshot = EnvGuard::capture();
// Mutate the environment while the guard is in scope
unsafe { env::set_var(EXISTING, "changed") };
unsafe { env::remove_var(ANOTHER) };
unsafe { env::set_var(EPHEMERAL, "ephemeral") };
// No explicit restore() call here; rely on Drop.
}
// After the guard is dropped, the environment should be restored
assert_eq!(unsafe { env::var(EXISTING).unwrap() }, "original");
assert_eq!(unsafe { env::var(ANOTHER).unwrap() }, "keep");
assert!(unsafe { env::var_os(EPHEMERAL) }.is_none());
}
```
If the rest of the file uses a different pattern for assertions or unsafe env access (e.g., helper functions or a custom macro), you may want to align the new test with those conventions. Also ensure `use std::env;` is present at the top of this file if it isn’t already.
</issue_to_address>
### Comment 5
<location> `docs/users-guide.md:145-148` </location>
<code_context>
+
+External tools launched by the daemon now run inside the `weaver-sandbox`
+wrapper around `birdcage` 0.8.1. Linux namespaces and `seccomp-bpf` filters are
+applied automatically, networking is disabled by default, and only a small set
+of standard library directories are readable to keep dynamically linked
+executables functioning. Commands must be provided as absolute paths and added
</code_context>
<issue_to_address>
**suggestion (typo):** Consider adjusting punctuation/structure to avoid a comma splice in this sentence.
The phrase "applied automatically, networking is disabled by default, and only a small set" links three clauses with commas and reads like a comma splice. Consider semicolons or splitting into two sentences, e.g. "applied automatically; networking is disabled by default; and only a small set of standard library directories are readable" or similar.
```suggestion
wrapper around `birdcage` 0.8.1. Linux namespaces and `seccomp-bpf` filters are
applied automatically; networking is disabled by default; and only a small set
of standard library directories are readable to keep dynamically linked
executables functioning. Commands must be provided as absolute paths and added
```
</issue_to_address>
### Comment 6
<location> `crates/weaver-sandbox/src/lib.rs:51` </location>
<code_context>
+pub use birdcage::process;
+pub use error::SandboxError;
+pub use profile::{EnvironmentPolicy, NetworkPolicy, SandboxProfile};
+pub use sandbox::{Sandbox, SandboxChild, SandboxCommand, SandboxOutput};
--- /dev/null
+++ b/crates/weaver-sandbox/src/env_guard.rs
</code_context>
<issue_to_address>
**issue (review_instructions):** Add both unit and behavioural tests for the new `weaver-sandbox` crate; the crate currently exposes non-trivial sandbox behaviour without any exercising tests.
Implement unit tests for the core building blocks (`SandboxProfile`, `EnvironmentPolicy`, `NetworkPolicy`, `EnvGuard`, `runtime::thread_count`, error conversions, and the path canonicalisation helpers) and behavioural tests that exercise the end-to-end sandbox flow.
At a minimum:
- Add unit tests under `crates/weaver-sandbox/src/tests/` that cover:
- Successful spawning of an authorised absolute executable.
- Rejection of non-absolute program paths (`ProgramNotAbsolute`).
- Rejection of executables not authorised in the profile (`ExecutableNotAuthorised`).
- Correct derivation of `Exception`s from read-only/read-write/executable paths and environment/network policies.
- `EnvGuard` restoring the environment after `birdcage` manipulation.
- `thread_count` parsing, including malformed `/proc/self/status` data (using test doubles or helper functions where direct `/proc` manipulation is not feasible).
- Add behavioural/acceptance tests (for example in `crates/weaver-sandbox/tests/features/sandbox.feature` and the corresponding Rust glue) that validate the observable behaviour of the sandbox in a real or simulated run:
- A happy-path command runs successfully inside the sandbox with expected stdout/stderr.
- Networking is denied by default and allowed only when requested.
- Environment is stripped by default and inherited correctly when whitelisted or when `InheritAll` is used.
- Multi-threaded invocation returns `SandboxError::MultiThreaded` instead of panicking.
Ensure these tests run in CI and demonstrably cover both the behavioural and unit-level expectations for this new feature, in line with the project testing rules.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*`
**Instructions:**
For any new feature or change to an existing feature, both behavioural and unit tests are required.
</details>
</issue_to_address>
### Comment 7
<location> `crates/weaver-sandbox/src/env_guard.rs:32` </location>
<code_context>
+ // Safety: project policy requires env mutation to be wrapped in
+ // `unsafe` until the std APIs settle for Rust 2024. We mutate
+ // only after snapshotting to avoid iterator invalidation.
+ unsafe { env::remove_var(key) };
+ }
+ }
</code_context>
<issue_to_address>
**issue (review_instructions):** `unsafe` is used around `env::remove_var`/`env::set_var` even though these APIs are safe; this violates the guideline to avoid `unsafe` unless it is strictly necessary.
`std::env::remove_var` and `std::env::set_var` are safe functions and do not require an `unsafe` block. The comment suggests a project policy, but per the review instructions `unsafe` should be avoided unless absolutely necessary.
Please remove the `unsafe` blocks here (and around `env::set_var` below) or, if there is a concrete unsafety that cannot be expressed with safe code, encapsulate it in the narrowest possible helper with a clear justification tied to actual undefined-behaviour constraints.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*.rs`
**Instructions:**
Avoid `unsafe` unless absolutely necessary.
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Hey there - I've reviewed your changes and found some issues that need to be addressed.
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `crates/weaver-sandbox/src/sandbox.rs:92-96` </location>
<code_context>
+ for path in read_write {
+ exceptions.push(Exception::WriteAndRead(path));
+ }
+ for path in executables {
+ exceptions.push(Exception::ExecuteAndRead(path));
+ }
+
+ exceptions.push(Exception::ExecuteAndRead(program.to_path_buf()));
+
+ exceptions.extend(self.profile.environment_policy().to_exceptions());
</code_context>
<issue_to_address>
**suggestion:** Avoid adding a duplicate ExecuteAndRead exception for the program path.
Since `ensure_program_whitelisted` uses the same canonicalisation as `collect_exceptions`, an authorised `program` will already be in `executables`. In that case, `Exception::ExecuteAndRead(program)` is added twice. Consider either skipping the explicit push when `program` is already in `executables`, or not inserting `program` into `executables` when building exceptions.
</issue_to_address>
### Comment 2
<location> `crates/weaver-sandbox/src/sandbox.rs:82-84` </location>
<code_context>
+
+ fn collect_exceptions(&self, program: &Path) -> Result<Vec<Exception>, SandboxError> {
+ let mut exceptions = Vec::new();
+ let read_only = canonicalised_set(self.profile.read_only_paths())?;
+ let read_write = canonicalised_set(self.profile.read_write_paths())?;
+ let executables = canonicalised_set(self.profile.executable_paths())?;
+
+ for path in read_only {
</code_context>
<issue_to_address>
**suggestion (performance):** Canonicalising all profile paths on every spawn may be unnecessarily expensive.
Each `spawn` call re-canonicalises all read-only, read-write, and executable paths, adding repeated filesystem work that could be costly when profiles are reused. Consider canonicalising once when constructing `SandboxProfile` (or lazily caching the canonicalised sets on first use) and then reusing those cached paths in `spawn`.
Suggested implementation:
```rust
fn collect_exceptions(&self, program: &Path) -> Result<Vec<Exception>, SandboxError> {
let mut exceptions = Vec::new();
let read_only = self.profile.read_only_paths_canonicalised()?;
let read_write = self.profile.read_write_paths_canonicalised()?;
let executables = self.profile.executable_paths_canonicalised()?;
```
To fully implement the caching and avoid repeated canonicalisation per `spawn`, you will also need to:
1. **Extend `SandboxProfile` to hold cached canonical paths**, for example (names can be adjusted to match your conventions):
```rust
use once_cell::sync::OnceCell;
use std::collections::HashSet;
use std::path::PathBuf;
pub struct SandboxProfile {
// existing fields...
read_only_paths: Vec<PathBuf>,
read_write_paths: Vec<PathBuf>,
executable_paths: Vec<PathBuf>,
read_only_paths_canon: OnceCell<HashSet<PathBuf>>,
read_write_paths_canon: OnceCell<HashSet<PathBuf>>,
executable_paths_canon: OnceCell<HashSet<PathBuf>>,
}
```
2. **Initialize the new `OnceCell` fields** in the `SandboxProfile` constructor / builder:
```rust
impl SandboxProfile {
pub fn new(/* existing params */) -> Self {
Self {
// existing initialisation...
read_only_paths,
read_write_paths,
executable_paths,
read_only_paths_canon: OnceCell::new(),
read_write_paths_canon: OnceCell::new(),
executable_paths_canon: OnceCell::new(),
}
}
}
```
3. **Add the three new methods used in `collect_exceptions`** to lazily canonicalise and cache the sets:
```rust
impl SandboxProfile {
pub fn read_only_paths_canonicalised(&self) -> Result<&HashSet<PathBuf>, SandboxError> {
self.read_only_paths_canon.get_or_try_init(|| {
canonicalised_set(&self.read_only_paths)
})
}
pub fn read_write_paths_canonicalised(&self) -> Result<&HashSet<PathBuf>, SandboxError> {
self.read_write_paths_canon.get_or_try_init(|| {
canonicalised_set(&self.read_write_paths)
})
}
pub fn executable_paths_canonicalised(&self) -> Result<&HashSet<PathBuf>, SandboxError> {
self.executable_paths_canon.get_or_try_init(|| {
canonicalised_set(&self.executable_paths)
})
}
}
```
4. **Ensure `canonicalised_set` is updated (if needed)** to accept the underlying path collection type used in `SandboxProfile` (e.g. `&[PathBuf]` or `&HashSet<PathBuf>`), so that it can be reused by these new methods without per-call allocation.
With these additional changes, canonicalisation will happen at most once per profile per path set, and `collect_exceptions` will reuse the cached canonicalised sets on every `spawn`.
</issue_to_address>
### Comment 3
<location> `crates/weaver-sandbox/src/profile.rs:163` </location>
<code_context>
+ let _ = keys.insert(key);
+ Self::AllowList(keys)
+ }
+ Self::InheritAll => Self::InheritAll,
+ }
+ }
</code_context>
<issue_to_address>
**question:** Allowing a variable while in `InheritAll` mode is a no-op, which may be surprising to callers.
In `EnvironmentPolicy::InheritAll`, calling `allow_environment_variable`/`with_allowed` effectively does nothing. While this matches the existing type, it’s not obvious from the builder-style API and may mislead callers into thinking they are constructing an allow-list. Consider either documenting this explicitly on `allow_environment_variable` or having `with_allowed` convert `InheritAll` into an `AllowList` seeded with the requested key, depending on the intended semantics.
</issue_to_address>
### Comment 4
<location> `crates/weaver-sandbox/src/env_guard.rs:27-32` </location>
<code_context>
+ let expected_keys: HashSet<&OsString> = self.original.keys().collect();
+
+ // Remove variables introduced while the guard was active.
+ for key in current.keys() {
+ if !expected_keys.contains(key) {
+ // Safety: project policy requires env mutation to be wrapped in
+ // `unsafe` until the std APIs settle for Rust 2024. We mutate
+ // only after snapshotting to avoid iterator invalidation.
+ unsafe { env::remove_var(key) };
+ }
+ }
</code_context>
<issue_to_address>
**suggestion:** The use of `unsafe` around `env::remove_var` appears unnecessary and could be avoided or better isolated.
Since `env::remove_var` / `env::set_var` are safe APIs, the `unsafe` here is a policy choice rather than required by the type system. If you want to keep that convention, consider wrapping env mutation in small helpers (e.g. `unsafe fn unset_env(...)`) and using those here. That keeps the `unsafe` surface minimal and the justification centralized and easier to audit.
Suggested implementation:
```rust
unsafe fn unset_env_var<K: AsRef<std::ffi::OsStr>>(key: K) {
// Safety: by project policy, environment mutation is considered unsafe
// until the std APIs settle for Rust 2024. Centralize that policy here
// so that all env mutation is easy to audit.
std::env::remove_var(key);
}
impl EnvGuard {
```
```rust
pub(crate) fn restore(&self) {
let current: HashMap<OsString, OsString> = env::vars_os().collect();
let expected_keys: HashSet<&OsString> = self.original.keys().collect();
// Remove variables introduced while the guard was active.
for key in current.keys() {
if !expected_keys.contains(key) {
// Safety: env mutation is funneled through `unset_env_var` to
// keep the unsafe surface minimal and well-documented.
unsafe { unset_env_var(key) };
}
}
```
If there are other call sites in this crate that directly use `unsafe { env::remove_var(...) }` or `unsafe { env::set_var(...) }` to follow the same policy, they should be updated to call small helpers like `unset_env_var` (and a corresponding `set_env_var`) instead. This will further centralize and minimize the `unsafe` surface around environment mutation.
</issue_to_address>
### Comment 5
<location> `crates/weaver-sandbox/src/tests/unit.rs:7-16` </location>
<code_context>
+ .expect("env mutex poisoned")
+}
+
+#[test]
+fn restores_modified_and_removed_environment_variables() {
+ const EXISTING: &str = "WEAVER_ENV_GUARD_EXISTING";
</code_context>
<issue_to_address>
**issue (testing):** Test `canonicalises_nonexistent_child_when_parent_exists` name and intent don’t match the behaviour being exercised
This test only verifies that `SandboxProfile::allow_read_write_path` records a non-existent path; it never exercises the `canonicalise` / `rebuild_from_existing_ancestor` logic implied by its name. Since `SandboxProfile` doesn’t canonicalise paths, please either rename the test to match its actual behaviour, or refactor it to drive the canonicalisation path (for example, via `Sandbox::spawn` with a non-existent child under an existing parent and asserting the expected `SandboxError`).
</issue_to_address>
### Comment 6
<location> `crates/weaver-sandbox/src/tests/sandbox_unit.rs:1-10` </location>
<code_context>
+//! Unit tests covering sandbox spawn preflight errors.
+
+use std::path::PathBuf;
+
+use crate::sandbox::{Sandbox, SandboxCommand};
+use crate::{SandboxError, SandboxProfile};
+
+fn sandbox_with_profile(profile: SandboxProfile) -> Sandbox {
+ Sandbox::new(profile)
+}
+
+#[test]
+fn rejects_relative_program_paths() {
+ let sandbox = sandbox_with_profile(SandboxProfile::new());
+ let command = SandboxCommand::new("relative/bin");
+
+ let err = sandbox.spawn(command).expect_err("spawn should fail");
+ match err {
+ SandboxError::ProgramNotAbsolute(path) => {
+ assert_eq!(path, PathBuf::from("relative/bin"));
</code_context>
<issue_to_address>
**suggestion (testing):** Add unit tests for multi-threaded and thread-count-unavailable sandbox preflight failures
The tests don’t currently cover the preflight branches in `Sandbox::spawn` for:
- `SandboxError::MultiThreaded { .. }` when `thread_count() > 1`
- `SandboxError::ThreadCountUnavailable { .. }` when `thread_count()` returns `Err`
Consider adding tests that force these conditions (e.g., by making `thread_count` overridable in tests via a small abstraction or `cfg(test)` hook) and asserting that `spawn` produces the expected `SandboxError` variants.
Suggested implementation:
```rust
fn sandbox_with_profile(profile: SandboxProfile) -> Sandbox {
Sandbox::new(profile)
}
/// Test-only helper to construct a `Sandbox` with a controlled thread-count
/// implementation so we can exercise preflight error paths.
///
/// This assumes a test-only constructor on `Sandbox` that allows injecting a
/// custom thread-count function. See `<additional_changes>` for the required
/// implementation on `Sandbox`.
#[cfg(test)]
fn sandbox_with_forced_thread_count<F>(profile: SandboxProfile, thread_count_fn: F) -> Sandbox
where
F: Fn() -> Result<usize, Box<dyn std::error::Error + Send + Sync>> + Send + Sync + 'static,
{
Sandbox::with_thread_counter_for_tests(profile, Box::new(thread_count_fn))
}
#[test]
fn rejects_multithreaded_processes() {
// Force `thread_count()` to report more than one thread so that
// `Sandbox::spawn` must fail with `SandboxError::MultiThreaded { .. }`.
let sandbox = sandbox_with_forced_thread_count(SandboxProfile::new(), || Ok(4));
let command = SandboxCommand::new("/usr/bin/true");
let err = sandbox
.spawn(command)
.expect_err("spawn should fail for multi-threaded processes");
match err {
SandboxError::MultiThreaded { thread_count } => {
assert_eq!(thread_count, 4);
}
other => panic!("expected MultiThreaded error, got: {other:?}"),
}
}
#[test]
fn rejects_when_thread_count_unavailable() {
// Force `thread_count()` to fail so that `Sandbox::spawn` must surface a
// `SandboxError::ThreadCountUnavailable { .. }`.
let sandbox = sandbox_with_forced_thread_count(SandboxProfile::new(), || {
Err("thread count failed".into())
});
let command = SandboxCommand::new("/usr/bin/true");
let err = sandbox
.spawn(command)
.expect_err("spawn should fail when thread count is unavailable");
match err {
SandboxError::ThreadCountUnavailable { .. } => {
// Variant match is sufficient; the exact error type/message is an implementation detail.
}
other => panic!("expected ThreadCountUnavailable error, got: {other:?}"),
}
}
```
` for the required
/// implementation on `Sandbox`.
#[cfg(test)]
fn sandbox_with_forced_thread_count<F>(profile: SandboxProfile, thread_count_fn: F) -> Sandbox
where
F: Fn() -> Result<usize, Box<dyn std::error::Error + Send + Sync>> + Send + Sync + 'static,
{
Sandbox::with_thread_counter_for_tests(profile, Box::new(thread_count_fn))
}
#[test]
fn rejects_multithreaded_processes() {
// Force `thread_count()` to report more than one thread so that
// `Sandbox::spawn` must fail with `SandboxError::MultiThreaded { .. }`.
let sandbox = sandbox_with_forced_thread_count(SandboxProfile::new(), || Ok(4));
let command = SandboxCommand::new("/usr/bin/true");
let err = sandbox
.spawn(command)
.expect_err("spawn should fail for multi-threaded processes");
match err {
SandboxError::MultiThreaded { thread_count } => {
assert_eq!(thread_count, 4);
}
other => panic!("expected MultiThreaded error, got: {other:?}"),
}
}
#[test]
fn rejects_when_thread_count_unavailable() {
// Force `thread_count()` to fail so that `Sandbox::spawn` must surface a
// `SandboxError::ThreadCountUnavailable { .. }`.
let sandbox = sandbox_with_forced_thread_count(SandboxProfile::new(), || {
Err("thread count failed".into())
});
let command = SandboxCommand::new("/usr/bin/true");
let err = sandbox
.spawn(command)
.expect_err("spawn should fail when thread count is unavailable");
match err {
SandboxError::ThreadCountUnavailable { .. } => {
// Variant match is sufficient; the exact error type/message is an implementation detail.
}
other => panic!("expected ThreadCountUnavailable error, got: {other:?}"),
}
}
>>>>>>> REPLACE
</file_operation>
</file_operations>
<additional_changes>
To make these tests compile and correctly exercise the preflight branches in `Sandbox::spawn`, you will need to extend the `Sandbox` implementation (likely in `crates/weaver-sandbox/src/sandbox.rs` or similar):
1. **Add a test-only constructor that accepts a thread-count function:**
- Behind `#[cfg(test)]`, add something like:
```rust
impl Sandbox {
#[cfg(test)]
pub fn with_thread_counter_for_tests(
profile: SandboxProfile,
thread_counter: Box<dyn Fn() -> Result<usize, Box<dyn std::error::Error + Send + Sync>> + Send + Sync>,
) -> Self {
Self {
profile,
thread_counter, // store this in the struct instead of (or in addition to) the production path
// ...copy any other fields you need from `Sandbox::new`
}
}
}
```
- Introduce a `thread_counter` field (or similar) on `Sandbox` that is used by `Sandbox::spawn` instead of calling a free `thread_count()` function directly. In non-test builds, `Sandbox::new` should set this field to the real implementation.
2. **Wire `Sandbox::spawn` to use the injected thread-count function:**
- In `spawn`, replace direct calls to `thread_count()` (or equivalent) with something like:
```rust
let thread_count = (self.thread_counter)().map_err(|source| SandboxError::ThreadCountUnavailable { source })?;
if thread_count > 1 {
return Err(SandboxError::MultiThreaded { thread_count });
}
```
- Ensure the `ThreadCountUnavailable` and `MultiThreaded` variants of `SandboxError` match the patterns used in the tests (names and field types).
3. **Adjust types if necessary:**
- If your existing `SandboxError::ThreadCountUnavailable` wraps a specific error type (e.g. `std::io::Error` or `anyhow::Error`), update the signature of `with_thread_counter_for_tests` and the closure types in this test file to match.
- If `thread_count` is not `usize`, update the tests and the helper to use the correct numeric type while still forcing `> 1` for the multithreaded case.
If your codebase already has a different mechanism for overriding the thread-count implementation in tests (e.g., a `cfg(test)` global setter or a `sandbox::testing` module), you can replace the `Sandbox::with_thread_counter_for_tests` helper and the `sandbox_with_forced_thread_count` wrapper with the existing abstraction and keep the test bodies largely the same.
</issue_to_address>
### Comment 7
<location> `crates/weaver-sandbox/src/tests/behaviour.rs:1-8` </location>
<code_context>
+use rstest::fixture;
</code_context>
<issue_to_address>
**suggestion (testing):** Behavioural tests assume a Linux host but aren’t gated, which may cause failures on non-Linux platforms
These behavioural tests call `resolve_binary` from `tests::support`, which panics on non-Linux targets. Since `behaviour.rs` runs on all targets, scenarios will panic on non-Linux hosts even though the crate’s non-Linux defaults are valid.
Consider either:
- Gating the module or `sandbox_behaviour` with `#[cfg(target_os = "linux")]`, or
- Adding a guard like `if !cfg!(target_os = "linux") { return; }` in the tests
so the tests are skipped rather than panicking on non-Linux, matching the intended Linux-centric assumptions.
```suggestion
#![cfg(target_os = "linux")]
//! Behavioural tests for sandbox spawning using `rstest-bdd`.
use std::cell::RefCell;
use rstest::fixture;
use rstest_bdd_macros::{given, scenario, then, when};
use crate::tests::support::TestWorld;
```
</issue_to_address>
### Comment 8
<location> `crates/weaver-sandbox/src/tests/env_guard.rs:44-66` </location>
<code_context>
+ unsafe { env::remove_var(ANOTHER) };
+}
+
+#[test]
+fn removes_variables_created_during_guard_lifetime() {
+ const PRE_EXISTING: &str = "WEAVER_ENV_GUARD_PRE_EXISTING";
+ const CREATED: &str = "WEAVER_ENV_GUARD_CREATED";
+
+ let _guard = lock_env();
+
+ unsafe { env::set_var(PRE_EXISTING, "value") };
+
+ let snapshot = EnvGuard::capture();
+
+ unsafe { env::set_var(CREATED, "temporary") };
+
+ assert_eq!(env::var(PRE_EXISTING).as_deref(), Ok("value"));
+ assert_eq!(env::var(CREATED).as_deref(), Ok("temporary"));
+
+ snapshot.restore();
+
+ assert_eq!(env::var(PRE_EXISTING).as_deref(), Ok("value"));
+ assert!(env::var(CREATED).is_err());
+
+ unsafe { env::remove_var(PRE_EXISTING) };
+}
</code_context>
<issue_to_address>
**suggestion (testing):** Add a test to explicitly verify `EnvGuard` restoration via `Drop` without calling `restore`
Current tests only validate restoration via `snapshot.restore()`, not via `Drop`, so the RAII behaviour isn’t exercised.
Please add a test that uses `EnvGuard::capture()`, mutates the environment, then lets the guard go out of scope without calling `restore`, and finally asserts that the environment matches the pre-snapshot state. This will explicitly cover the `Drop` path and confirm the documented behaviour.
```suggestion
#[test]
fn removes_variables_created_during_guard_lifetime() {
const PRE_EXISTING: &str = "WEAVER_ENV_GUARD_PRE_EXISTING";
const CREATED: &str = "WEAVER_ENV_GUARD_CREATED";
let _guard = lock_env();
unsafe { env::set_var(PRE_EXISTING, "value") };
let snapshot = EnvGuard::capture();
unsafe { env::set_var(CREATED, "temporary") };
assert_eq!(env::var(PRE_EXISTING).as_deref(), Ok("value"));
assert_eq!(env::var(CREATED).as_deref(), Ok("temporary"));
snapshot.restore();
assert_eq!(env::var(PRE_EXISTING).as_deref(), Ok("value"));
assert!(env::var(CREATED).is_err());
unsafe { env::remove_var(PRE_EXISTING) };
}
#[test]
fn restores_environment_on_drop() {
const PRE_EXISTING: &str = "WEAVER_ENV_GUARD_DROP_PRE_EXISTING";
const CREATED: &str = "WEAVER_ENV_GUARD_DROP_CREATED";
let _guard = lock_env();
// Set up initial environment
unsafe { env::set_var(PRE_EXISTING, "original") };
{
// Capture the environment; rely on Drop for restoration
let _snapshot = EnvGuard::capture();
// Mutate both an existing and a new variable
unsafe {
env::set_var(PRE_EXISTING, "modified");
env::set_var(CREATED, "temporary");
}
// Verify mutations are visible while the guard is alive
assert_eq!(env::var(PRE_EXISTING).as_deref(), Ok("modified"));
assert_eq!(env::var(CREATED).as_deref(), Ok("temporary"));
}
// After _snapshot is dropped, environment should be restored
assert_eq!(env::var(PRE_EXISTING).as_deref(), Ok("original"));
assert!(env::var(CREATED).is_err());
// Cleanup
unsafe { env::remove_var(PRE_EXISTING) };
}
```
</issue_to_address>
### Comment 9
<location> `crates/weaver-sandbox/src/env_guard.rs:22` </location>
<code_context>
+ }
+ }
+
+ pub(crate) fn restore(&self) {
+ let current: HashMap<OsString, OsString> = env::vars_os().collect();
+ let expected_keys: HashSet<&OsString> = self.original.keys().collect();
</code_context>
<issue_to_address>
**issue (complexity):** Consider simplifying `restore` by clearing and then repopulating the environment from the captured snapshot, optionally storing the snapshot as a linear collection instead of a `HashMap`.
You can simplify `restore` without changing semantics by clearing the current environment and then re-populating it from the captured snapshot. This removes the extra `HashMap` and `HashSet` allocations and makes the unsafe section easier to reason about.
```rust
impl EnvGuard {
pub(crate) fn restore(&self) {
// Clear current environment
for (key, _) in env::vars_os() {
// Safety: project policy requires env mutation to be wrapped in unsafe.
unsafe { env::remove_var(&key) };
}
// Restore original snapshot
for (key, value) in &self.original {
// Safety: see note above regarding env mutation policy.
unsafe { env::set_var(key, value) };
}
}
}
```
If you want to reduce storage complexity as well (optional), you can store a linear collection instead of a `HashMap`, since environment keys are already unique:
```rust
#[derive(Debug)]
pub struct EnvGuard {
original: Vec<(OsString, OsString)>,
}
impl EnvGuard {
#[must_use]
pub fn capture() -> Self {
Self {
original: env::vars_os().collect(),
}
}
pub(crate) fn restore(&self) {
for (key, _) in env::vars_os() {
unsafe { env::remove_var(&key) };
}
for (key, value) in &self.original {
unsafe { env::set_var(key, value) };
}
}
}
```
Both variants preserve the behavior of “restore env to exactly the captured snapshot” with less indirection and state.
</issue_to_address>
### Comment 10
<location> `crates/weaver-sandbox/src/lib.rs:51` </location>
<code_context>
+pub use birdcage::process;
+pub use error::SandboxError;
+pub use profile::{EnvironmentPolicy, NetworkPolicy, SandboxProfile};
+pub use sandbox::{Sandbox, SandboxChild, SandboxCommand, SandboxOutput};
--- /dev/null
+++ b/crates/weaver-sandbox/src/env_guard.rs
</code_context>
<issue_to_address>
**issue (review_instructions):** Add both unit and behavioural tests covering the new `weaver-sandbox` crate behaviour; the newly added test modules and feature file are empty and do not exercise the implemented functionality.
Add comprehensive unit tests (e.g. in `src/tests/unit.rs` / `sandbox_unit.rs`) for the core behaviours of `SandboxProfile`, `Sandbox`, `EnvGuard`, and `runtime` helpers: absolute-path enforcement, executable allowlisting, canonicalisation and `MissingPath` handling, environment policy conversion to `birdcage::Exception`, `NetworkPolicy` handling, and single-threaded preflight checks (including `MultiThreaded` and `ThreadCountUnavailable`).
Add behavioural tests (e.g. BDD scenarios in `tests/features/sandbox.feature` with step definitions) that launch real subprocesses via `Sandbox::spawn` to verify the documented end-to-end behaviour: default network denial, environment isolation and allowlisting, requirement for absolute whitelisted executables, and successful execution paths when correctly configured. Ensure you actually populate the currently empty test files so that both unit and behavioural coverage enforce this new sandboxing feature’s contract.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*`
**Instructions:**
For any new feature or change to an existing feature, both behavioural *and* unit tests are required.
</details>
</issue_to_address>
### Comment 11
<location> `crates/weaver-sandbox/src/env_guard.rs:32` </location>
<code_context>
+ // Safety: project policy requires env mutation to be wrapped in
+ // `unsafe` until the std APIs settle for Rust 2024. We mutate
+ // only after snapshotting to avoid iterator invalidation.
+ unsafe { env::remove_var(key) };
+ }
+ }
</code_context>
<issue_to_address>
**issue (review_instructions):** `unsafe` is used around `env::remove_var` even though the standard library API is safe, which violates the guideline to avoid `unsafe` unless absolutely necessary.
`std::env::remove_var` (and `set_var` below) are safe APIs and do not require `unsafe` blocks. The project guideline is to avoid `unsafe` unless it is strictly necessary, so these `unsafe` wrappers should be removed and the calls made directly. If there is a concrete unsafety concern that justifies this, it should be documented explicitly next to the code and in project guidelines.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*.rs`
**Instructions:**
Avoid `unsafe` unless absolutely necessary.
</details>
</issue_to_address>
### Comment 12
<location> `docs/weaver-design.md:987` </location>
<code_context>
+
+The first cut of the dedicated `weaver-sandbox` crate now wraps `birdcage`
+v0.8.1 with Weaver defaults. Executables must be supplied as absolute paths and
+whitelisted explicitly; the wrapper canonicalises the whitelist before launch
+to prevent symlink escapes. Standard Linux library roots (`/lib`, `/lib64`,
+`/usr/lib`, and their architecture-specific variants) are whitelisted for
</code_context>
<issue_to_address>
**issue (review_instructions):** The word “canonicalises” should use -ize spelling (“canonicalizes”) to follow the specified en-GB-oxendic convention.
The style guide here calls for en-GB-oxendic spelling, which prefers -ize rather than -ise. Suggest changing “canonicalises” to “canonicalizes” to match.
<details>
<summary>Review instructions:</summary>
**Path patterns:** `**/*.md`
**Instructions:**
Use en-GB-oxendic (-ize / -yse / -our) spelling and grammar.
</details>
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (13)
Makefile(1 hunks)crates/weaver-sandbox/Cargo.toml(1 hunks)crates/weaver-sandbox/src/env_guard.rs(1 hunks)crates/weaver-sandbox/src/lib.rs(1 hunks)crates/weaver-sandbox/src/profile.rs(1 hunks)crates/weaver-sandbox/src/runtime.rs(1 hunks)crates/weaver-sandbox/src/sandbox.rs(1 hunks)crates/weaver-sandbox/src/tests/behaviour.rs(1 hunks)crates/weaver-sandbox/src/tests/env_guard.rs(1 hunks)crates/weaver-sandbox/src/tests/sandbox_unit.rs(1 hunks)crates/weaver-sandbox/src/tests/support/mod.rs(1 hunks)crates/weaver-sandbox/src/tests/unit.rs(1 hunks)crates/weaver-sandbox/tests/features/sandbox.feature(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Runmake check-fmt,make lint, andmake testbefore committing, which executecargo fmt --workspace -- --check,cargo clippy --workspace --all-targets --all-features -- -D warnings, andcargo test --workspacerespectively.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error, consider usingArcto reduce the amount of data returned.
Write unit and behavioural tests for new functionality; run both before and after making any change.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated withcargo doc.
Prefer immutable data and avoid unnecessarymutbindings.
Handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments in Rust.
Do not usereturnin single-line functions in Rust.
Use predicate functions for conditional criteria with more than two branches in Rust.
Lints must not be silenced except as a last resort; lint rule suppressions must be tightly scoped and include a clear reason.
Preferexpectoverallowin Rust.
Userstestfixtures for shared setup in Rust tests.
Replace duplicated tests with#[rstest(...)]parameterised cases in Rust.
Prefermockallfor mocks/stubs in Rust.
Prefer.expect()over.unwrap()in Rust.
Useconcat!()to combine long string literals in Rust rather than escaping newlines with a backslash.
Prefer single-line function definitions in Rust where appropriate (e.g., `pub fn new(id: u64) -> Self { ...
Files:
crates/weaver-sandbox/src/tests/sandbox_unit.rscrates/weaver-sandbox/src/tests/support/mod.rscrates/weaver-sandbox/src/tests/unit.rscrates/weaver-sandbox/src/runtime.rscrates/weaver-sandbox/src/env_guard.rscrates/weaver-sandbox/src/lib.rscrates/weaver-sandbox/src/tests/env_guard.rscrates/weaver-sandbox/src/tests/behaviour.rscrates/weaver-sandbox/src/profile.rscrates/weaver-sandbox/src/sandbox.rs
⚙️ CodeRabbit configuration file
**/*.rs: * Seek to keep the cognitive complexity of functions no more than 9.
- Adhere to single responsibility and CQRS
- Place function attributes after doc comments.
- Do not use
returnin single-line functions.- Move conditionals with >2 branches into a predicate function.
- Avoid
unsafeunless absolutely necessary.- Every module must begin with a
//!doc comment that explains the module's purpose and utility.- Comments and docs must follow en-GB-oxendict (-ize / -yse / -our) spelling and grammar
- Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.- Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
- Use
rstestfixtures for shared setup and to avoid repetition between tests.- Replace duplicated tests with
#[rstest(...)]parameterised cases.- Prefer
mockallfor mocks/stubs.- Prefer
.expect()over.unwrap()in tests..expect()and.unwrap()are forbidden outside of tests. Errors must be propagated.- Ensure that any API or behavioural changes are reflected in the documentation in
docs/- Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/- Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
- Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such
- For testing of functionality depending upon environment variables, dependency injection and...
Files:
crates/weaver-sandbox/src/tests/sandbox_unit.rscrates/weaver-sandbox/src/tests/support/mod.rscrates/weaver-sandbox/src/tests/unit.rscrates/weaver-sandbox/src/runtime.rscrates/weaver-sandbox/src/env_guard.rscrates/weaver-sandbox/src/lib.rscrates/weaver-sandbox/src/tests/env_guard.rscrates/weaver-sandbox/src/tests/behaviour.rscrates/weaver-sandbox/src/profile.rscrates/weaver-sandbox/src/sandbox.rs
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: leynos/weaver PR: 0
File: docs/roadmap.md:0-0
Timestamp: 2025-12-03T19:40:08.813Z
Learning: Implement `weaver-sandbox` crate using `birdcage` for focused scope, prioritising robust Linux support via namespaces and seccomp-bpf
🧬 Code graph analysis (9)
crates/weaver-sandbox/src/tests/sandbox_unit.rs (3)
crates/weaver-sandbox/src/sandbox.rs (1)
new(31-33)crates/weaver-sandbox/src/tests/support/mod.rs (1)
new(63-81)crates/weaver-sandbox/src/profile.rs (1)
new(58-66)
crates/weaver-sandbox/src/tests/support/mod.rs (3)
crates/weaver-sandbox/src/sandbox.rs (1)
new(31-33)crates/weaver-sandbox/src/profile.rs (1)
new(58-66)crates/weaver-sandbox/src/env_guard.rs (1)
capture(22-26)
crates/weaver-sandbox/src/tests/unit.rs (2)
crates/weaver-sandbox/src/tests/support/mod.rs (1)
new(63-81)crates/weaver-sandbox/src/profile.rs (6)
new(58-66)read_only_paths(111-113)network_policy(132-134)is_denied(146-148)environment_policy(126-128)read_write_paths(116-118)
crates/weaver-sandbox/src/runtime.rs (2)
crates/weaver-sandbox/src/sandbox.rs (1)
new(31-33)crates/weaver-sandbox/src/profile.rs (1)
new(58-66)
crates/weaver-sandbox/src/env_guard.rs (1)
crates/weaver-sandbox/src/tests/support/mod.rs (3)
set_var(36-38)drop(42-47)drop(142-144)
crates/weaver-sandbox/src/tests/env_guard.rs (1)
crates/weaver-sandbox/src/env_guard.rs (1)
capture(22-26)
crates/weaver-sandbox/src/tests/behaviour.rs (3)
crates/weaver-sandbox/src/sandbox.rs (1)
new(31-33)crates/weaver-sandbox/src/tests/support/mod.rs (1)
new(63-81)crates/weaver-sandbox/src/profile.rs (1)
new(58-66)
crates/weaver-sandbox/src/profile.rs (4)
crates/weaver-sandbox/src/runtime.rs (1)
linux_runtime_roots(9-39)crates/weaver-sandbox/src/sandbox.rs (1)
new(31-33)crates/weaver-sandbox/src/tests/support/mod.rs (1)
new(63-81)crates/weaver-lsp-host/src/capability.rs (1)
key(25-31)
crates/weaver-sandbox/src/sandbox.rs (3)
crates/weaver-sandbox/src/runtime.rs (1)
thread_count(42-62)crates/weaver-sandbox/src/profile.rs (2)
new(58-66)network_policy(132-134)crates/weaver-sandbox/src/env_guard.rs (2)
capture(22-26)drop(50-52)
🔍 Remote MCP Deepwiki, Ref
Summary of additional facts found (relevant to reviewing PR #33)
-
Workspace / dependency:
- birdcage added as workspace dependency at version 0.8.1 (crate wrapper target).,
-
Platform focus / runtime helpers:
- Implementation is Linux‑centric: linux_runtime_roots() returns canonicalised runtime roots on Linux and empty Vec on other OSes; thread_count() parses /proc/self/status on Linux and returns Ok(1) on non‑Linux. (preflight checks will behave differently off Linux).
-
Single‑thread enforcement and Makefile change:
- Tests now run RUST_TEST_THREADS=1 only for the weaver-sandbox crate (Makefile runs two cargo test invocations: one with RUST_TEST_THREADS=1 -p weaver-sandbox, then workspace tests --exclude weaver-sandbox), avoiding forcing the whole workspace single‑threaded.
-
Path canonicalisation / whitelisting:
- canonicalise() will rebuild non‑existent target paths from the nearest existing ancestor (rebuild_from_existing_ancestor) so MissingPath is only emitted when no ancestor exists — addresses prior concern about allowing non‑existent output paths.
-
EnvGuard and environment handling:
- New EnvGuard captures parent env and restores it on Drop (removes variables created during scope, restores originals). Tests serialize env mutations with a global mutex; env mutation uses unsafe blocks with explicit safety notes.
-
Sandbox behavior and errors:
- Sandbox::spawn enforces single‑threaded preflight (returns SandboxError::MultiThreaded if >1), requires absolute canonicalised program paths, validates against profile whitelists (ExecutableNotAuthorised), and aggregates birdcage Exceptions (read/write/exec, Environment/FullEnvironment, Networking) from SandboxProfile.
-
Tests and test harness:
- Adds unit tests, rstest‑bdd behavioural tests, feature scenarios (fixtures, whitelist access, env allowlist/inheritance, multi‑thread rejection). Test support includes TestWorld harness with TempDir fixtures and resolve_binary helper (Linux‑specific behavior in tests).
-
Documentation & roadmap:
- docs/weaver-design.md, docs/users-guide.md updated to document weaver‑sandbox defaults (birdcage v0.8.1, absolute path requirement, env isolation default, network denied, single‑thread note); roadmap entry marked complete.
Tools used:
- Deepwiki_read_wiki_contents (repo: leynos/weaver) — primary source for crate, tests, Makefile, docs, runtime behavior.
- Ref_ref_search_documentation — search used to confirm dependency/birdcage mention.
⏰ Context from checks skipped due to timeout of 120000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Sourcery review
🔇 Additional comments (36)
Makefile (1)
19-21: LGTM!The test target correctly isolates the single-threaded requirement to
weaver-sandboxwhilst allowing the rest of the workspace to run with full parallelism. This addresses the prior concern about globalRUST_TEST_THREADS=1forcing the entire workspace single-threaded.crates/weaver-sandbox/src/tests/sandbox_unit.rs (1)
12-51: LGTM!Tests thoroughly cover the three preflight rejection scenarios (relative paths, missing paths, unwhitelisted executables) with clear assertions and descriptive panic messages.
crates/weaver-sandbox/src/tests/unit.rs (3)
1-6: LGTM!Module documentation is present. Imports are appropriate for the tests.
35-67: Past review concerns addressed.Tests now cover
allow_networking(),allow_full_environment(), and the behaviour that allowlist mutations have no effect afterallow_full_environment()is called. This resolves the outstanding feedback from sourcery-ai.
69-97: LGTM!Path recording and canonicalisation tests exercise the expected behaviour for read-only, read-write, and non-existent child paths. The use of
tempfile::tempdir()with.expect()is appropriate for test setup.crates/weaver-sandbox/src/tests/env_guard.rs (2)
17-66: LGTM!Tests correctly use
unsafeblocks for environment mutations as required by Rust 2024. The mutex guard serialises tests appropriately, and cleanup at the end of each test ensures isolation.
8-15: The twoENV_MUTEXdefinitions serve independent test contexts and should not be consolidated.env_guard.rsis a standalone unit test file that tests theEnvGuardtype itself and requires its own environment lock for isolation.support/mod.rsprovides shared fixtures for behaviour tests with its own separate lock. These are not duplicates but legitimate independent uses—unit tests cannot depend on behaviour test support modules.Likely an incorrect or invalid review comment.
crates/weaver-sandbox/src/env_guard.rs (4)
1-7: LGTM!Module documentation clearly explains the Rust 2024 policy requiring
unsafeblocks for environment mutations. This addresses the outdated past review comments from sourcery-ai—those suggestions are no longer applicable given that "Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such" per coding guidelines.
13-26: LGTM!The
EnvGuardstruct andcapture()implementation are clean. UsingHashMapfor storage is appropriate, and the#[must_use]attribute correctly signals that discarding the guard defeats its purpose.
28-46: Past review concerns are now resolved.The previous sourcery-ai comments requesting removal of
unsafeblocks are obsolete. Per coding guidelines, Rust 2024 mandatesunsafeforenv::set_varandenv::remove_var. The implementation correctly:
- Snapshots current environment before mutation (avoiding iterator invalidation)
- Removes variables introduced during guard lifetime
- Restores original values
The safety comments adequately document the rationale.
49-53: LGTM!The
Dropimplementation delegates torestore(), ensuring automatic cleanup when the guard goes out of scope.crates/weaver-sandbox/Cargo.toml (1)
1-14: LGTM: Clean workspace-based manifest.The manifest correctly uses workspace-based dependencies and includes appropriate testing infrastructure (rstest, rstest-bdd, tempfile) that aligns with the test scaffolding in the crate.
crates/weaver-sandbox/tests/features/sandbox.feature (1)
1-45: LGTM: Comprehensive BDD scenarios.The feature file provides clear coverage of sandbox isolation behaviours including file access restrictions and environment policy modes (allowlist, isolated, full inheritance). Scenarios are well-structured and align with the implemented step definitions.
crates/weaver-sandbox/src/runtime.rs (2)
9-39: LGTM: Platform-aware runtime roots.The function correctly includes both x86_64 and aarch64 library paths, filters by existence, and canonicalises. The silent error dropping via
.ok()is acceptable for whitelist filtering, where unresolvable paths should be excluded.
41-62: LGTM: Robust thread counting.The Linux implementation provides explicit error handling with descriptive messages, while the non-Linux fallback safely returns 1. The parsing logic is clear and handles malformed data gracefully.
crates/weaver-sandbox/src/lib.rs (1)
1-51: LGTM: Well-documented API surface.The module documentation clearly explains the zero-trust design, restrictive defaults, and single-threaded requirement. The example demonstrates typical usage and includes proper attributes (
no_run). The public API surface is cleanly defined through selective re-exports.crates/weaver-sandbox/src/tests/support/mod.rs (3)
50-145: LGTM: Well-structured test harness.The TestWorld fixture provides clean isolation with TempDir, lazy EnvHandle acquisition, and guaranteed cleanup via Drop. The configuration methods follow a clear pattern and align with the BDD scenarios.
147-161: LGTM: Platform-appropriate binary resolution.The function is correctly cfg-gated with a Linux implementation that searches candidates and a non-Linux variant that provides a clear diagnostic. This aligns with the Linux-centric design documented in the PR.
163-168: LGTM: Simple fixture writer.The helper correctly creates and writes fixture files with clear error messages for test debugging.
crates/weaver-sandbox/src/tests/behaviour.rs (5)
10-68: LGTM: Clean step definitions.The given steps correctly configure TestWorld with single borrows and cover all environment policy modes (isolated, allowlist, full inheritance). Implementation aligns well with the feature file scenarios.
70-73: LGTM: Straightforward launch step.The when step correctly triggers sandbox launch via TestWorld.
75-124: LGTM: Clear assertion steps.The then steps provide specific assertions with descriptive error messages. The stdout checks correctly handle Gherkin-quoted strings with
trim_matches('"').
126-139: LGTM: Robust environment cleanup verification.The step correctly verifies that test-specific values have been restored without assuming the host environment lacks these variables. This addresses the concern from past reviews.
141-144: LGTM: Correct scenario binding.The scenario function correctly binds step definitions to the feature file.
crates/weaver-sandbox/src/profile.rs (5)
8-18: LGTM: Well-designed environment policy enum.The three variants (Isolated, AllowList, InheritAll) clearly express the policy modes, with Isolated as the secure default. Using BTreeSet for the allowlist ensures deterministic ordering.
20-28: LGTM: Clear network policy enum.The binary choice (Deny/Allow) is well-documented, with Deny as the secure default.
30-141: LGTM: Clean profile builder.The SandboxProfile provides a fluent builder API with appropriate defaults (Linux runtime roots, restricted environment and network). The accessors are correctly scoped, and the doc example demonstrates typical usage.
143-149: LGTM: Clear predicate method.The
is_deniedmethod provides a readable way to check network policy state.
151-178: LGTM: Clean policy helpers.The
with_allowedmethod correctly transitions between policy states, andto_exceptionsprovides a clear mapping to birdcage exceptions. Both helpers are appropriately scoped as crate-private.crates/weaver-sandbox/src/sandbox.rs (7)
15-33: LGTM: Clean type aliases and constructor.The type aliases provide a cleaner public API whilst maintaining access to the underlying birdcage types. The Sandbox struct is minimal and focused.
35-57: LGTM: Well-orchestrated spawn flow.The spawn method enforces necessary preflights (single-thread, absolute path, whitelist), captures environment state with EnvGuard, collects birdcage exceptions, and spawns the sandboxed process. The implementation aligns with the documented behaviour.
59-78: LGTM: Clear preflight checks.Both methods provide clear validation with specific error variants. The single-thread check prevents the birdcage assertion panic, and the whitelist check enforces explicit executable authorisation.
80-105: LGTM: Comprehensive exception collection.The method correctly aggregates birdcage exceptions from the profile's policies. The delegation to
environment_policy().to_exceptions()cleanly handles the environment policy logic, and the automatic inclusion of the program's ExecuteAndRead exception is appropriate.
107-114: LGTM: Clear program path validation.The method correctly enforces the absolute path requirement before canonicalising.
116-123: LGTM: Correct path set canonicalisation.The function correctly canonicalises paths and collects them into a BTreeSet for uniqueness and ordering. Ignoring the insert result is appropriate when duplicates after canonicalisation are expected.
125-167: LGTM: Robust path canonicalisation with ancestor fallback.The canonicalise function correctly avoids TOCTOU by relying solely on
fs::canonicalizeand mapping NotFound torebuild_from_existing_ancestor. The rebuild helper correctly walks up to find an existing ancestor, canonicalises it, and reconstructs the full path—addressing the PR objective of allowing non-existent output paths to be whitelisted.Based on learnings, this approach provides focused sandboxing with robust Linux support.
|
@coderabbitai Have the following now been resolved? |
…execution - Introduce caching for canonicalised path sets in SandboxProfile for read-only, read-write, and executable paths. - Update Sandbox to use cached canonicalised paths for validation and exceptions. - Add thread counter injection in Sandbox for testing and production use. - Enforce single-threaded process constraint when spawning sandboxed commands; return errors if multi-threaded or thread count unavailable. - Refactor EnvGuard to safely clear and restore environment variables using helper functions. - Add tests for multi-threaded rejection and environment guard restore on drop. - Improve error handling around path canonicalisation including support for nonexistent future paths. - Minor doc improvements and style cleanup related to sandbox usage and design. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
|
@coderabbitai Have the following now been resolved? |
The fallback logic in the `canonicalise` function to rebuild a missing path from existing ancestors was removed. Now, if a path is missing, an error is always returned immediately without attempting to reconstruct the path from ancestors. This simplifies error handling and code complexity in sandbox path resolution. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
|
@coderabbitai Have the following now been resolved? |
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
|
@sourcery-ai resolve |
…aths Replaced the use of BTreeSet with Vec<PathBuf> for the canonicalised executable, read-only, and read-write path sets in SandboxProfile. This change involved updating related methods to use Vec and changing containment checks to iter().any(). Additionally, moved environment mutex locking logic from env_guard tests into a shared test support module to consolidate and reuse the environment locking mechanism across tests. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 9
♻️ Duplicate comments (3)
docs/weaver-design.md (1)
989-990: Add comma before "so" in result clause.Lines 989–990 omit a comma before "so", which introduces a result clause. Add a comma for grammatical clarity and to separate the independent clause properly.
Apply this diff:
`/usr/lib`, and their architecture-specific variants) are whitelisted for -read-only access by default so dynamically linked binaries remain functional +read-only access by default, so dynamically linked binaries remain functionalcrates/weaver-sandbox/src/tests/env_guard.rs (1)
8-84: Keep EnvGuard tests; they now exercise both explicit and Drop-based restoration.These tests lock the shared env mutex, mutate env under
EnvGuard::capture(), and verify restoration via bothrestore()andDrop, including handling of ephemeral variables. This gives good confidence in EnvGuard’s RAII semantics.crates/weaver-sandbox/src/tests/support/mod.rs (1)
32-34: Add SAFETY documentation for unsafe environment mutation.The
unsafeblock lacks a safety comment explaining why this usage is correct. Per coding guidelines, all unsafe code must be documented clearly.Apply this diff:
fn set_var(&mut self, key: &'static str, value: &str) { + // SAFETY: Environment mutation is unsafe in Rust 2024. This is safe + // because the ENV_MUTEX (via `lock_env()`) serialises all env access + // across test threads, and `EnvGuard::restore` reverts changes on Drop. unsafe { std::env::set_var(key, value) }; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (12)
crates/weaver-sandbox/src/env_guard.rs(1 hunks)crates/weaver-sandbox/src/profile.rs(1 hunks)crates/weaver-sandbox/src/sandbox.rs(1 hunks)crates/weaver-sandbox/src/tests/behaviour.rs(1 hunks)crates/weaver-sandbox/src/tests/env_guard.rs(1 hunks)crates/weaver-sandbox/src/tests/mod.rs(1 hunks)crates/weaver-sandbox/src/tests/sandbox_unit.rs(1 hunks)crates/weaver-sandbox/src/tests/support/env.rs(1 hunks)crates/weaver-sandbox/src/tests/support/mod.rs(1 hunks)crates/weaver-sandbox/src/tests/unit.rs(1 hunks)docs/users-guide.md(1 hunks)docs/weaver-design.md(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Runmake check-fmt,make lint, andmake testbefore committing, which executecargo fmt --workspace -- --check,cargo clippy --workspace --all-targets --all-features -- -D warnings, andcargo test --workspacerespectively.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error, consider usingArcto reduce the amount of data returned.
Write unit and behavioural tests for new functionality; run both before and after making any change.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated withcargo doc.
Prefer immutable data and avoid unnecessarymutbindings.
Handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments in Rust.
Do not usereturnin single-line functions in Rust.
Use predicate functions for conditional criteria with more than two branches in Rust.
Lints must not be silenced except as a last resort; lint rule suppressions must be tightly scoped and include a clear reason.
Preferexpectoverallowin Rust.
Userstestfixtures for shared setup in Rust tests.
Replace duplicated tests with#[rstest(...)]parameterised cases in Rust.
Prefermockallfor mocks/stubs in Rust.
Prefer.expect()over.unwrap()in Rust.
Useconcat!()to combine long string literals in Rust rather than escaping newlines with a backslash.
Prefer single-line function definitions in Rust where appropriate (e.g., `pub fn new(id: u64) -> Self { ...
Files:
crates/weaver-sandbox/src/tests/support/env.rscrates/weaver-sandbox/src/tests/env_guard.rscrates/weaver-sandbox/src/tests/unit.rscrates/weaver-sandbox/src/tests/support/mod.rscrates/weaver-sandbox/src/tests/sandbox_unit.rscrates/weaver-sandbox/src/tests/behaviour.rscrates/weaver-sandbox/src/tests/mod.rscrates/weaver-sandbox/src/env_guard.rscrates/weaver-sandbox/src/sandbox.rscrates/weaver-sandbox/src/profile.rs
⚙️ CodeRabbit configuration file
**/*.rs: * Seek to keep the cognitive complexity of functions no more than 9.
- Adhere to single responsibility and CQRS
- Place function attributes after doc comments.
- Do not use
returnin single-line functions.- Move conditionals with >2 branches into a predicate function.
- Avoid
unsafeunless absolutely necessary.- Every module must begin with a
//!doc comment that explains the module's purpose and utility.- Comments and docs must follow en-GB-oxendict (-ize / -yse / -our) spelling and grammar
- Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.- Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
- Use
rstestfixtures for shared setup and to avoid repetition between tests.- Replace duplicated tests with
#[rstest(...)]parameterised cases.- Prefer
mockallfor mocks/stubs.- Prefer
.expect()over.unwrap()in tests..expect()and.unwrap()are forbidden outside of tests. Errors must be propagated.- Ensure that any API or behavioural changes are reflected in the documentation in
docs/- Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/- Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
- Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such
- For testing of functionality depending upon environment variables, dependency injection and...
Files:
crates/weaver-sandbox/src/tests/support/env.rscrates/weaver-sandbox/src/tests/env_guard.rscrates/weaver-sandbox/src/tests/unit.rscrates/weaver-sandbox/src/tests/support/mod.rscrates/weaver-sandbox/src/tests/sandbox_unit.rscrates/weaver-sandbox/src/tests/behaviour.rscrates/weaver-sandbox/src/tests/mod.rscrates/weaver-sandbox/src/env_guard.rscrates/weaver-sandbox/src/sandbox.rscrates/weaver-sandbox/src/profile.rs
docs/**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
Documentation must use en-GB-oxendict ("-ize" / "-yse" / "-our") spelling and grammar, except for the "LICENSE" file which is left unchanged for community consistency.
docs/**/*.md: Use British English based on Oxford English Dictionary (en-GB-oxendict), including: -ize suffix (realize, organization), -lyse suffix (analyse, paralyse), -our suffix (colour, behaviour), -re suffix (calibre, centre), double 'l' (cancelled, counsellor), maintain 'e' (likeable, liveable), -ogue suffix (analogue, catalogue)
Use US spelling when used in an API, for examplecolor
Use the Oxford comma in documentation: 'ships, planes, and hovercraft' where it aids comprehension
Treat company names as collective nouns in documentation, for example: 'Lille Industries are expanding'
Write headings in sentence case using Markdown headings (#, ##, ###) in order without skipping levels
Follow markdownlint recommendations for Markdown consistency
Provide code blocks and lists using standard Markdown syntax, always providing a language identifier for fenced code blocks (useplaintextfor non-code text)
Use-as the first level bullet and renumber lists when items change in documentation
Prefer inline links using[text](url)or angle brackets around the URL in documentation
Ensure blank lines before and after bulleted lists and fenced blocks in documentation
Ensure tables have a delimiter line below the header row in documentation
Expand any uncommon acronym on first use in documentation, for example, Continuous Integration (CI)
Wrap paragraphs at 80 columns in documentation
Wrap code at 120 columns in documentation
Do not wrap tables in documentation
Use footnotes referenced with[^label]in documentation
Include Mermaid diagrams where they add clarity to documentation
When embedding figures in documentation, useand provide brief alt text describing the content
Add a short description before each Mermaid diagram so screen readers can understand it
Define roa...
Files:
docs/users-guide.mddocs/weaver-design.md
**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
**/*.md: Validate Markdown files usingmake markdownlint.
Runmake fmtafter any documentation changes to format all Markdown files and fix table markup.
Validate Mermaid diagrams in Markdown files by runningmake nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks in Markdown must be wrapped at 120 columns.
Tables and headings in Markdown must not be wrapped.
Use dashes (-) for list bullets in Markdown.
Use GitHub-flavoured Markdown footnotes ([^1]) for references and footnotes.
Files:
docs/users-guide.mddocs/weaver-design.md
⚙️ CodeRabbit configuration file
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-GB-oxendict (-ize / -yse / -our) spelling and grammar
- Headings must not be wrapped.
- Documents must start with a level 1 heading
- Headings must correctly increase or decrease by no more than one level at a time
- Use GitHub-flavoured Markdown style for footnotes and endnotes.
- Numbered footnotes must be numbered by order of appearance in the document.
Files:
docs/users-guide.mddocs/weaver-design.md
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: leynos/weaver PR: 0
File: docs/roadmap.md:0-0
Timestamp: 2025-12-03T19:40:08.813Z
Learning: Implement `weaver-sandbox` crate using `birdcage` for focused scope, prioritising robust Linux support via namespaces and seccomp-bpf
🧬 Code graph analysis (6)
crates/weaver-sandbox/src/tests/env_guard.rs (3)
crates/weaver-sandbox/src/tests/support/env.rs (1)
lock_env(5-10)crates/weaver-sandbox/src/tests/support/mod.rs (1)
set_var(32-34)crates/weaver-sandbox/src/env_guard.rs (1)
capture(31-35)
crates/weaver-sandbox/src/tests/sandbox_unit.rs (3)
crates/weaver-sandbox/src/profile.rs (1)
new(61-72)crates/weaver-sandbox/src/sandbox.rs (2)
new(39-44)with_thread_counter_for_tests(47-55)crates/weaver-sandbox/src/runtime.rs (1)
thread_count(42-62)
crates/weaver-sandbox/src/tests/behaviour.rs (3)
crates/weaver-sandbox/src/profile.rs (1)
new(61-72)crates/weaver-sandbox/src/sandbox.rs (1)
new(39-44)crates/weaver-sandbox/src/tests/support/mod.rs (1)
new(59-77)
crates/weaver-sandbox/src/env_guard.rs (1)
crates/weaver-sandbox/src/tests/support/mod.rs (4)
set_env_var(102-110)set_var(32-34)drop(38-43)drop(138-140)
crates/weaver-sandbox/src/sandbox.rs (4)
crates/weaver-sandbox/src/runtime.rs (1)
thread_count(42-62)crates/weaver-sandbox/src/profile.rs (2)
new(61-72)network_policy(169-171)crates/weaver-sandbox/src/env_guard.rs (1)
capture(31-35)crates/weaverd/src/process/guard.rs (1)
paths(85-87)
crates/weaver-sandbox/src/profile.rs (2)
crates/weaver-sandbox/src/runtime.rs (1)
linux_runtime_roots(9-39)crates/weaver-sandbox/src/sandbox.rs (2)
new(39-44)canonicalised_set(140-147)
🪛 LanguageTool
docs/users-guide.md
[misspelling] ~149-~149: This word is normally spelled as one.
Context: ...wlist before launch; requests made from multi-threaded contexts return a MultiThreaded error...
(EN_COMPOUNDS_MULTI_THREADED)
🔍 Remote MCP Deepwiki
Summary of additional concrete facts relevant to reviewing PR #33
-
Workspace/dependency
- birdcage added as a workspace dependency (v0.8.1) and new workspace member crates/weaver-sandbox introduced.
-
Test execution / Makefile
- Makefile now runs two cargo test invocations: first -p weaver-sandbox with RUST_TEST_THREADS=1 (single-threaded), then workspace tests with --exclude weaver-sandbox (so only sandbox tests are single-threaded).
-
Platform focus / runtime helpers
- Implementation is Linux‑centric:
- linux_runtime_roots() returns canonicalised runtime roots only on Linux (empty Vec on other OSes).
- thread_count() parses /proc/self/status on Linux; on non-Linux it returns Ok(1). Preflight behavior differs off Linux.
- Implementation is Linux‑centric:
-
Path canonicalisation / whitelisting behavior
- canonicalise() rebuilds non-existent child paths from the nearest existing ancestor (rebuild_from_existing_ancestor) so MissingPath is only emitted when no ancestor exists — enables whitelisting output paths.
-
EnvGuard / environment handling & tests
- EnvGuard captures parent environment and restores it on Drop; tests serialize env mutations with a global mutex pattern. Current repo state: multiple test modules define their own OnceLock<Mutex<()>> (reviewer-requested centralisation remains outstanding).
-
Public API surface added
- New crate exports and types: Sandbox, SandboxCommand/Child/Output aliases, SandboxProfile, EnvironmentPolicy, NetworkPolicy, SandboxError and related public methods. spawn() enforces single-thread preflight and canonicalises/validates executables.
-
Tests added
- Unit tests, rstest-bdd behavioural tests, feature scenarios and test support harness (TestWorld, EnvHandle, env lock) were added to cover canonicalisation, env_guard Drop behavior, multi-thread preflight errors, and behavioural scenarios.
-
Outstanding reviewer action items (concrete)
- Centralise the environment mutex used by tests (avoid duplicate OnceLock<Mutex<()>> across modules).
- Consider simplifying canonicalisation logic (replace BTreeSet use and complex rebuild logic with ancestors()/strip_prefix() unless set semantics justified).
- Remove or justify remaining unsafe usage around std::env::set_var/remove_var (these APIs are safe; unsafe must be justified).
⏰ Context from checks skipped due to timeout of 120000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Sourcery review
🔇 Additional comments (7)
crates/weaver-sandbox/src/tests/mod.rs (1)
1-6: Keep centralised test module wiring as-is.The tests module is succinct, documented, and cleanly organises behaviour, env_guard, support, and unit tests. No changes are required.
crates/weaver-sandbox/src/tests/unit.rs (1)
7-97: Retain these profile and policy unit tests; coverage is solid.These tests exercise the key
SandboxProfilebehaviours (runtime roots, env allow-list semantics including “ignore after InheritAll”, network defaults/overrides, and path recording, including non-existent future paths). This aligns well with the public API and previous review feedback.crates/weaver-sandbox/src/tests/support/env.rs (1)
1-10: Keeplock_envas the single shared environment mutex for tests.This module centralises the
OnceLock<Mutex<()>>used to serialise env mutations in tests, which matches the project’s testing policy and removes the earlier duplication across test modules.crates/weaver-sandbox/src/tests/sandbox_unit.rs (1)
21-92: Retain these spawn preflight tests; they cover the important error variants.The tests for relative paths, missing programmes, unwhitelisted executables,
MultiThreaded, andThreadCountUnavailableexercise the key safety guarantees ofSandbox::spawnand the injected thread-count seam. This is an appropriate level of coverage for the sandbox preflight behaviour.crates/weaver-sandbox/src/env_guard.rs (1)
1-54: Keep EnvGuard’s snapshot-and-restore implementation; the unsafe surface is narrow and well-documented.EnvGuard captures the environment as a simple
Vec<(OsString, OsString)>, clears all current variables, then restores the snapshot, with all env mutations funnelled through small unsafe helpers as required by the project’s env policy. This matches the documented intent and the accompanying tests.#!/bin/bash # Run only the EnvGuard tests to double-check behaviour after any future edits. cargo test -p weaver-sandbox env_guardcrates/weaver-sandbox/src/tests/support/mod.rs (1)
1-164: Test support scaffolding is well-structured.The
TestWorldfixture provides clean orchestration for behavioural tests with temporary directory isolation. TheEnvHandlepattern correctly serialises environment mutations via mutex and restores on drop. The#[cfg(target_os = "linux")]guard onresolve_binaryaddresses platform concerns.crates/weaver-sandbox/src/tests/behaviour.rs (1)
1-145: Behavioural test structure is solid.The
#![cfg(target_os = "linux")]module gate correctly restricts these tests to supported platforms. Step definitions are well-organised with clear assertions. Thethen_environment_cleanedstep correctly usesassert_neto verify scenario-specific values are removed without assuming the host environment lacks these variables.
…ucture Added Figure 5.1, a detailed Mermaid class diagram, to the weaver-design documentation. The diagram illustrates the sandbox crate structure and its integration points, including classes such as Sandbox, SandboxProfile, EnvironmentPolicy, NetworkPolicy, and related externals. This enhances understanding of the sandbox component design and relationships. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
… error handling - Introduce a helper method `canonicalised_paths` in `SandboxProfile` to reduce duplicated caching logic for path canonicalisation. - Change `allow_environment_variable` to avoid cloning large environment policies by moving instead. - Enhance `canonicalise` function to handle non-existent paths by rebuilding from nearest existing ancestor, allowing whitelisting of future outputs. - Simplify exception collection by removing redundant executable check. - Improve environment mutation safety in tests. - Refactor test code for clarity and reduce duplication. - Fix minor typo in documentation from "multi-threaded" to "multithreaded". Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
|
@coderabbitai Please suggest a fix for this issue and supply a prompt for an AI coding agent to enable it to apply the fix: crates/weaver-sandbox/src/sandbox.rs Comment on lines +145 to +192 fn canonicalise(path: &Path, require_exists: bool) -> Result<PathBuf, SandboxError> {
match fs::canonicalize(path) {
Ok(resolved) => Ok(resolved),
Err(error) if error.kind() == std::io::ErrorKind::NotFound => {
if require_exists {
return Err(SandboxError::MissingPath {
path: path.to_path_buf(),
});
}
// Rebuild the path from the closest existing ancestor so callers can
// whitelist future outputs (e.g. files that will be created by the
// sandboxed command) under an allowed directory.
let mut components_to_append = Vec::new();
let mut ancestor_opt = path;
while let Some(parent) = ancestor_opt.parent() {
components_to_append.push(ancestor_opt.file_name().map(PathBuf::from));
ancestor_opt = parent;
if ancestor_opt.exists() {
break;
}
}
if !ancestor_opt.exists() {
return Err(SandboxError::MissingPath {
path: path.to_path_buf(),
});
}
let mut rebuilt = fs::canonicalize(ancestor_opt).map_err(|source| {
SandboxError::CanonicalisationFailed {
path: ancestor_opt.to_path_buf(),
source,
}
})?;
for component in components_to_append.into_iter().rev().flatten() {
rebuilt.push(component);
}
Ok(rebuilt)
}
Err(source) => Err(SandboxError::CanonicalisationFailed {
path: path.to_path_buf(),
source,
}),
}
}❌ New issue: Bumpy Road Ahead |
This comment was marked as resolved.
This comment was marked as resolved.
Refactored the path canonicalization process to move the logic for rebuilding a path from its closest existing ancestor into a new helper function `rebuild_from_existing_ancestor`. This improves code clarity and consolidates error handling related to path existence and canonicalization. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (6)
crates/weaver-sandbox/src/profile.rs(1 hunks)crates/weaver-sandbox/src/sandbox.rs(1 hunks)crates/weaver-sandbox/src/tests/behaviour.rs(1 hunks)crates/weaver-sandbox/src/tests/sandbox_unit.rs(1 hunks)crates/weaver-sandbox/src/tests/support/mod.rs(1 hunks)docs/users-guide.md(1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Runmake check-fmt,make lint, andmake testbefore committing, which executecargo fmt --workspace -- --check,cargo clippy --workspace --all-targets --all-features -- -D warnings, andcargo test --workspacerespectively.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error, consider usingArcto reduce the amount of data returned.
Write unit and behavioural tests for new functionality; run both before and after making any change.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated withcargo doc.
Prefer immutable data and avoid unnecessarymutbindings.
Handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments in Rust.
Do not usereturnin single-line functions in Rust.
Use predicate functions for conditional criteria with more than two branches in Rust.
Lints must not be silenced except as a last resort; lint rule suppressions must be tightly scoped and include a clear reason.
Preferexpectoverallowin Rust.
Userstestfixtures for shared setup in Rust tests.
Replace duplicated tests with#[rstest(...)]parameterised cases in Rust.
Prefermockallfor mocks/stubs in Rust.
Prefer.expect()over.unwrap()in Rust.
Useconcat!()to combine long string literals in Rust rather than escaping newlines with a backslash.
Prefer single-line function definitions in Rust where appropriate (e.g., `pub fn new(id: u64) -> Self { ...
Files:
crates/weaver-sandbox/src/tests/sandbox_unit.rscrates/weaver-sandbox/src/tests/behaviour.rscrates/weaver-sandbox/src/sandbox.rscrates/weaver-sandbox/src/tests/support/mod.rscrates/weaver-sandbox/src/profile.rs
⚙️ CodeRabbit configuration file
**/*.rs: * Seek to keep the cognitive complexity of functions no more than 9.
- Adhere to single responsibility and CQRS
- Place function attributes after doc comments.
- Do not use
returnin single-line functions.- Move conditionals with >2 branches into a predicate function.
- Avoid
unsafeunless absolutely necessary.- Every module must begin with a
//!doc comment that explains the module's purpose and utility.- Comments and docs must follow en-GB-oxendict (-ize / -yse / -our) spelling and grammar
- Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.- Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
- Use
rstestfixtures for shared setup and to avoid repetition between tests.- Replace duplicated tests with
#[rstest(...)]parameterised cases.- Prefer
mockallfor mocks/stubs.- Prefer
.expect()over.unwrap()in tests..expect()and.unwrap()are forbidden outside of tests. Errors must be propagated.- Ensure that any API or behavioural changes are reflected in the documentation in
docs/- Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/- Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
- Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such
- For testing of functionality depending upon environment variables, dependency injection and...
Files:
crates/weaver-sandbox/src/tests/sandbox_unit.rscrates/weaver-sandbox/src/tests/behaviour.rscrates/weaver-sandbox/src/sandbox.rscrates/weaver-sandbox/src/tests/support/mod.rscrates/weaver-sandbox/src/profile.rs
docs/**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
Documentation must use en-GB-oxendict ("-ize" / "-yse" / "-our") spelling and grammar, except for the "LICENSE" file which is left unchanged for community consistency.
docs/**/*.md: Use British English based on Oxford English Dictionary (en-GB-oxendict), including: -ize suffix (realize, organization), -lyse suffix (analyse, paralyse), -our suffix (colour, behaviour), -re suffix (calibre, centre), double 'l' (cancelled, counsellor), maintain 'e' (likeable, liveable), -ogue suffix (analogue, catalogue)
Use US spelling when used in an API, for examplecolor
Use the Oxford comma in documentation: 'ships, planes, and hovercraft' where it aids comprehension
Treat company names as collective nouns in documentation, for example: 'Lille Industries are expanding'
Write headings in sentence case using Markdown headings (#, ##, ###) in order without skipping levels
Follow markdownlint recommendations for Markdown consistency
Provide code blocks and lists using standard Markdown syntax, always providing a language identifier for fenced code blocks (useplaintextfor non-code text)
Use-as the first level bullet and renumber lists when items change in documentation
Prefer inline links using[text](url)or angle brackets around the URL in documentation
Ensure blank lines before and after bulleted lists and fenced blocks in documentation
Ensure tables have a delimiter line below the header row in documentation
Expand any uncommon acronym on first use in documentation, for example, Continuous Integration (CI)
Wrap paragraphs at 80 columns in documentation
Wrap code at 120 columns in documentation
Do not wrap tables in documentation
Use footnotes referenced with[^label]in documentation
Include Mermaid diagrams where they add clarity to documentation
When embedding figures in documentation, useand provide brief alt text describing the content
Add a short description before each Mermaid diagram so screen readers can understand it
Define roa...
Files:
docs/users-guide.md
**/*.md
📄 CodeRabbit inference engine (AGENTS.md)
**/*.md: Validate Markdown files usingmake markdownlint.
Runmake fmtafter any documentation changes to format all Markdown files and fix table markup.
Validate Mermaid diagrams in Markdown files by runningmake nixie.
Markdown paragraphs and bullet points must be wrapped at 80 columns.
Code blocks in Markdown must be wrapped at 120 columns.
Tables and headings in Markdown must not be wrapped.
Use dashes (-) for list bullets in Markdown.
Use GitHub-flavoured Markdown footnotes ([^1]) for references and footnotes.
Files:
docs/users-guide.md
⚙️ CodeRabbit configuration file
**/*.md: * Avoid 2nd person or 1st person pronouns ("I", "you", "we")
- Use en-GB-oxendict (-ize / -yse / -our) spelling and grammar
- Headings must not be wrapped.
- Documents must start with a level 1 heading
- Headings must correctly increase or decrease by no more than one level at a time
- Use GitHub-flavoured Markdown style for footnotes and endnotes.
- Numbered footnotes must be numbered by order of appearance in the document.
Files:
docs/users-guide.md
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: leynos/weaver PR: 0
File: docs/roadmap.md:0-0
Timestamp: 2025-12-03T19:40:08.813Z
Learning: Implement `weaver-sandbox` crate using `birdcage` for focused scope, prioritising robust Linux support via namespaces and seccomp-bpf
🧬 Code graph analysis (3)
crates/weaver-sandbox/src/tests/sandbox_unit.rs (2)
crates/weaver-sandbox/src/sandbox.rs (2)
with_thread_counter_for_tests(47-55)new(39-44)crates/weaver-sandbox/src/runtime.rs (1)
thread_count(42-62)
crates/weaver-sandbox/src/tests/behaviour.rs (3)
crates/weaver-sandbox/src/sandbox.rs (1)
new(39-44)crates/weaver-sandbox/src/profile.rs (1)
new(61-72)crates/weaver-sandbox/src/tests/support/mod.rs (1)
new(61-79)
crates/weaver-sandbox/src/profile.rs (2)
crates/weaver-sandbox/src/runtime.rs (1)
linux_runtime_roots(9-39)crates/weaver-sandbox/src/sandbox.rs (2)
new(39-44)canonicalised_set(136-143)
🔍 Remote MCP Ref
Summary of additional facts relevant to code review
- Workspace + dependency: PR adds crates/weaver-sandbox as a workspace member and adds birdcage = "0.8.1" to workspace dependencies (weaver-sandbox wraps birdcage).
- Test/CI changes: Makefile now runs tests in two steps — first RUST_TEST_THREADS=1 cargo test -p weaver-sandbox (scoped single-threaded), then cargo test --exclude weaver-sandbox for the rest of workspace.
- Tests added: unit suites, sandbox_unit, env_guard unit tests, and rstest-bdd behavioural tests plus features/sandbox.feature (covers whitelist, env inheritance/isolation, network defaults).
- Env mutex / test hygiene: PR replaced once_cell Lazy with std::sync::OnceLock for static test mutexes; reviewers flagged duplicated env-lock definitions and requested centralizing a single shared env mutex across tests.
- Env mutability safety / unsafe usage: EnvGuard wraps std::env set/remove in narrow helpers and PR includes docs asserting environment mutations must be inside unsafe blocks on Rust 2024 toolchains; reviewers requested removing unnecessary unsafe or providing concrete justification.
- Path canonicalisation behavior: canonicalise() and rebuild-from-ancestor logic have been iterated — earlier commits added rebuilding from an existing ancestor to allow whitelisting non-existent child paths; later commits removed that fallback (always error on missing path); a refactor extracted rebuild_from_existing_ancestor into a helper. Reviewers requested simplification (ancestors/strip_prefix) and justification for using BTreeSet semantics.
- Thread-count handling: Sandbox enforces single-threaded preflight via thread_count() (Linux: reads /proc/self/status); tests inject a thread-counter seam to deterministically exercise MultiThreaded and ThreadCountUnavailable errors.
- Platform scope: Implementation and many tests are Linux-centric (linux_runtime_roots, /proc parsing, resolve_binary); docs note Linux defaults.
Sources
⏰ Context from checks skipped due to timeout of 120000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: Sourcery review
- GitHub Check: build-test
🔇 Additional comments (7)
docs/users-guide.md (1)
142-153: Documentation accurately describes sandbox defaults.The new section correctly documents the birdcage 0.8.1 wrapper, default policies (networking disabled, environment stripped), absolute path requirement, and multithreaded context handling. Punctuation issues flagged in past reviews have been addressed.
crates/weaver-sandbox/src/tests/sandbox_unit.rs (1)
1-89: Unit tests comprehensively cover sandbox preflight errors.The test suite validates all error paths: relative programme paths, missing paths, unwhitelisted executables, multithreaded contexts, and thread count unavailability. The test-only thread counter injection via
sandbox_with_forced_thread_countenables deterministic testing of runtime conditions.crates/weaver-sandbox/src/tests/support/mod.rs (1)
1-166: Test support infrastructure is well-structured and properly addresses past concerns.
EnvHandlecorrectly coordinates the environment mutex and snapshot, with restoration occurring before the guard releases. TheTestWorldfixture provides comprehensive orchestration for behavioural tests, including temporary directory isolation and helper methods aligned with BDD scenarios.resolve_binaryis properly gated to Linux.crates/weaver-sandbox/src/sandbox.rs (1)
1-184: Sandbox orchestration correctly implements preflight checks and path canonicalisation.The implementation properly enforces single-threaded execution, validates programme whitelisting, and handles both existing and future paths through the
require_existsparameter. Therebuild_from_existing_ancestorhelper usesancestors()/strip_prefixas suggested in past reviews, avoiding manual parent-walking loops.crates/weaver-sandbox/src/tests/behaviour.rs (1)
1-154: Behavioural tests are properly structured and address all past review concerns.The module is correctly gated to Linux, preventing runtime panics on non-Linux hosts. Borrow checker issues have been resolved by extracting field values before mutable borrows. The
then_environment_cleanedstep now verifies test-specific values were removed rather than assuming the host environment lacks those variables, making tests more robust.crates/weaver-sandbox/src/profile.rs (2)
95-105: Environment variable whitelisting correctly avoids cloning allow-lists.Using
std::mem::taketo move the current policy before callingwith_allowedeliminates the unnecessary clone that would occur when growing allow-lists. This addresses the performance concern raised in past reviews.
180-207: Environment policy helpers properly encapsulate multi-branch conditionals.
EnvironmentPolicy::with_allowedencapsulates the three-branch logic for accumulating allowed environment variables, andto_exceptionsmaps policy variants tobirdcage::Exceptiontypes. This follows the coding guideline to extract multi-branch conditionals into helper functions.
…:sync::OnceCell Replaced usage of std::sync::OnceLock with once_cell::sync::OnceCell for caching canonicalized paths in sandbox profile. Updated Cargo.toml and Cargo.lock to add the once_cell dependency. This change simplifies the caching logic by using get_or_try_init directly, improving code clarity and error handling. Additionally added error mapping in sandbox.rs to signal invariant break more precisely. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (3)
crates/weaver-sandbox/Cargo.toml(1 hunks)crates/weaver-sandbox/src/profile.rs(1 hunks)crates/weaver-sandbox/src/sandbox.rs(1 hunks)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.rs
📄 CodeRabbit inference engine (AGENTS.md)
**/*.rs: Runmake check-fmt,make lint, andmake testbefore committing, which executecargo fmt --workspace -- --check,cargo clippy --workspace --all-targets --all-features -- -D warnings, andcargo test --workspacerespectively.
Clippy warnings MUST be disallowed.
Fix any warnings emitted during tests in the code itself rather than silencing them.
Where a function is too long, extract meaningfully named helper functions adhering to separation of concerns and CQRS.
Where a function has too many parameters, group related parameters in meaningfully named structs.
Where a function is returning a large error, consider usingArcto reduce the amount of data returned.
Write unit and behavioural tests for new functionality; run both before and after making any change.
Every module must begin with a module level (//!) comment explaining the module's purpose and utility.
Document public APIs using Rustdoc comments (///) so documentation can be generated withcargo doc.
Prefer immutable data and avoid unnecessarymutbindings.
Handle errors with theResulttype instead of panicking where feasible.
Avoidunsafecode unless absolutely necessary and document any usage clearly.
Place function attributes after doc comments in Rust.
Do not usereturnin single-line functions in Rust.
Use predicate functions for conditional criteria with more than two branches in Rust.
Lints must not be silenced except as a last resort; lint rule suppressions must be tightly scoped and include a clear reason.
Preferexpectoverallowin Rust.
Userstestfixtures for shared setup in Rust tests.
Replace duplicated tests with#[rstest(...)]parameterised cases in Rust.
Prefermockallfor mocks/stubs in Rust.
Prefer.expect()over.unwrap()in Rust.
Useconcat!()to combine long string literals in Rust rather than escaping newlines with a backslash.
Prefer single-line function definitions in Rust where appropriate (e.g., `pub fn new(id: u64) -> Self { ...
Files:
crates/weaver-sandbox/src/sandbox.rscrates/weaver-sandbox/src/profile.rs
⚙️ CodeRabbit configuration file
**/*.rs: * Seek to keep the cognitive complexity of functions no more than 9.
- Adhere to single responsibility and CQRS
- Place function attributes after doc comments.
- Do not use
returnin single-line functions.- Move conditionals with >2 branches into a predicate function.
- Avoid
unsafeunless absolutely necessary.- Every module must begin with a
//!doc comment that explains the module's purpose and utility.- Comments and docs must follow en-GB-oxendict (-ize / -yse / -our) spelling and grammar
- Lints must not be silenced except as a last resort.
#[allow]is forbidden.- Only narrowly scoped
#[expect(lint, reason = "...")]is allowed.- No lint groups, no blanket or file-wide suppression.
- Include
FIXME:with link if a fix is expected.- Where code is only used by specific features, it must be conditionally compiled or a conditional expectation for unused_code applied.
- Use
rstestfixtures for shared setup and to avoid repetition between tests.- Replace duplicated tests with
#[rstest(...)]parameterised cases.- Prefer
mockallfor mocks/stubs.- Prefer
.expect()over.unwrap()in tests..expect()and.unwrap()are forbidden outside of tests. Errors must be propagated.- Ensure that any API or behavioural changes are reflected in the documentation in
docs/- Ensure that any completed roadmap steps are recorded in the appropriate roadmap in
docs/- Files must not exceed 400 lines in length
- Large modules must be decomposed
- Long match statements or dispatch tables should be decomposed by domain and collocated with targets
- Large blocks of inline data (e.g., test fixtures, constants or templates) must be moved to external files and inlined at compile-time or loaded at run-time.
- Environment access (env::set_var and env::remove_var) are always unsafe in Rust 2024 and MUST be marked as such
- For testing of functionality depending upon environment variables, dependency injection and...
Files:
crates/weaver-sandbox/src/sandbox.rscrates/weaver-sandbox/src/profile.rs
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: leynos/weaver PR: 0
File: docs/roadmap.md:0-0
Timestamp: 2025-12-03T19:40:08.813Z
Learning: Implement `weaver-sandbox` crate using `birdcage` for focused scope, prioritising robust Linux support via namespaces and seccomp-bpf
🧬 Code graph analysis (1)
crates/weaver-sandbox/src/sandbox.rs (3)
crates/weaver-sandbox/src/runtime.rs (1)
thread_count(42-62)crates/weaver-sandbox/src/profile.rs (2)
new(63-74)network_policy(148-150)crates/weaver-sandbox/src/env_guard.rs (2)
capture(31-35)drop(51-53)
⏰ Context from checks skipped due to timeout of 120000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: build-test
- GitHub Check: Sourcery review
🔇 Additional comments (22)
crates/weaver-sandbox/Cargo.toml (1)
8-8: Do not removeonce_celldependency—it is still actively used in the implementation.The crate uses
once_cell::sync::OnceCellincrates/weaver-sandbox/src/profile.rs, so the dependency cannot be removed. Whilestd::sync::OnceLockis used in tests, the implementation relies ononce_cell. Keep the dependency declaration as-is.Likely an incorrect or invalid review comment.
crates/weaver-sandbox/src/sandbox.rs (11)
1-14: LGTM!Module documentation present, imports are well-organised and necessary. The
std::fmtimport at line 13 could be grouped with the otherstdimports at lines 3-4 for consistency, but this is minor.
15-26: LGTM!Type aliases provide useful abstraction over
birdcagetypes, and thethread_counterfield enables dependency injection for testing—a sound approach that avoids environment mutation in tests.
28-34: LGTM!Manual
Debugimplementation correctly handles the non-Debugclosure field withfinish_non_exhaustive().
36-55: LGTM!The test seam via
with_thread_counter_for_testsenables deterministic testing without environment mutation. The#[must_use]annotation is appropriate.
57-79: LGTM!The
spawnmethod correctly sequences preflight checks, environment capture, exception collection, and sandbox activation. The explicitdrop(env_guard)at line 77 documents the intent to restore the environment immediately after spawning, which aids readability.
81-90: LGTM!Error handling is correct and the logic properly enforces
birdcage's single-threaded requirement.
92-100: LGTM!Whitelist check is correct. The
iter().any()pattern is appropriate forVec<PathBuf>membership testing.
102-125: LGTM!The
_programparameter is correctly unused sinceensure_program_whitelistedalready guarantees the programme is in the executable whitelist, and the loop overexecutableswill include it. Exception collection is well-structured and delegates policy mapping to helper methods.
127-134: LGTM!Correct validation: programmes must be absolute and must exist.
145-162: LGTM!The
canonicalisefunction correctly implements the dual semantics: programmes must exist (require_exists=true), whilst whitelist paths may reference future outputs under existing ancestors (require_exists=false). The TOCTTOU concern from earlier reviews has been addressed.
164-187: LGTM!The refactor using
Path::ancestors()andstrip_prefix()is clean and correct. The comment at lines 177-179 appropriately documents the defensive error mapping for an invariant that should always hold.crates/weaver-sandbox/src/profile.rs (10)
1-9: LGTM!Module documentation is present and imports are appropriate.
10-20: LGTM!Secure-by-default with
Isolatedas the default variant.BTreeSetprovides ordered and deduplicated environment variable names.
22-30: LGTM!Deny-by-default network policy is the secure choice.
Copyderive is appropriate for this small enum.
32-48: LGTM!The struct design with
OnceCellcaching for canonicalised paths avoids repeated filesystem operations. Documentation clearly explains the restrictive defaults.
50-121: LGTM!Builder pattern is well-implemented with
#[must_use]annotations. Thestd::mem::takeat line 104 correctly avoids cloning the environment policy on eachallow_environment_variablecall.
123-150: LGTM!The accessor methods correctly delegate to
canonicalised_paths, which usesget_or_try_initas requested in prior reviews. Thenetwork_policyaccessor correctly returns a copy sinceNetworkPolicyisCopy.
153-161: LGTM!The
get_or_try_initpattern correctly handles fallible initialization and caching in a single atomic operation, addressing prior review feedback.
163-175: LGTM!The
Defaultimplementation correctly delegates tonew(), andis_denied()is a concise predicate.
177-203: LGTM!The
with_allowedmethod correctly handles all three policy variants, andto_exceptionsproperly maps policies tobirdcage::Exceptiontypes. The factoring of these methods addresses prior review feedback about extracting multi-branch conditionals.
6-6: The recommendation to replaceonce_cell::sync::OnceCellwithstd::sync::OnceLockis not feasible on stable Rust. The code usesget_or_try_init()(line 159), which is only available as an unstable nightly feature (once_cell_try) onstd::sync::OnceLock. WhileOnceLockitself was stabilized in Rust 1.70,get_or_try_initremains nightly-only. Theonce_celldependency is required to support stable Rust builds.Likely an incorrect or invalid review comment.
- Replace explicit loop in `canonicalised_set` with iterator and collect for cleaner code. - Change error mapping in `rebuild_from_existing_ancestor` from `MissingPath` to `CanonicalisationFailed` to better reflect an internal invariant violation rather than a caller error. Co-authored-by: terragon-labs[bot] <terragon-labs[bot]@users.noreply.github.com>
Summary
Changes
Why
How to test
Migration / Notes
Risk & rollback
🌿 Generated by Terry
ℹ️ Tag @terragon-labs to ask questions and address PR feedback
📎 Task: https://www.terragonlabs.com/task/e978d32d-903a-4ae9-b14d-f0f7a4f21136
Summary by Sourcery
Introduce a new weaver-sandbox crate that wraps the birdcage library to provide opinionated sandboxing for Weaver subprocesses, update workspace configuration and docs, and add focused tests for the new sandbox behaviour.
New Features:
Enhancements:
Build:
Documentation:
Tests: