Skip to content

fix: let workers/branches continue when secrets detected in tool output#323

Merged
jamiepine merged 3 commits intomainfrom
fix/worker-secret-leak-continue
Mar 4, 2026
Merged

fix: let workers/branches continue when secrets detected in tool output#323
jamiepine merged 3 commits intomainfrom
fix/worker-secret-leak-continue

Conversation

@jamiepine
Copy link
Member

Summary

  • Workers and branches no longer terminate when leak detection finds a secret pattern in tool output — they log a warning and continue working
  • Added scrub_leaks() function for regex-based redaction of unknown secret patterns on all egress paths (worker results, branch conclusions, status updates)
  • Fixed pre-existing gap where cortex task worker results were completely unscrubbed before channel injection

Problem

When a worker encountered a secret pattern in tool output (e.g. reading a file containing an API key during a security audit), guard_tool_result returned HookAction::Terminate, killing the entire worker with:

cancelled: Tool output contained a secret. Agent terminated to prevent exfiltration.

This was overly aggressive — workers can't communicate with users directly (no reply tool, no messaging), so the terminate-to-prevent-exfiltration rationale doesn't apply. The existing egress scrubbing on worker results, branch conclusions, and status updates already prevents secrets from reaching users.

Approach

Core change: guard_tool_result is now process-type-aware:

  • Worker/Branch → log warning, return Continue
  • Channel/Compactor/Cortex → keep existing Terminate behavior

Defense-in-depth: Since workers/branches now continue past secrets, all egress paths apply both scrubbing layers:

  • Layer 1: exact-match redaction of known secrets from the store (scrub_with_store)
  • Layer 2: regex-based redaction of unknown leak patterns (scrub_leaks) — new
Egress path File Change
Worker result → channel channel_dispatch.rs Added scrub_leaks()
Branch conclusion → channel branch.rs Added scrub_leaks()
Branch failure → channel channel_dispatch.rs Added scrub_leaks() (was unscrubbed)
Worker status → channel set_status.rs Added scrub_leaks()
Cortex task worker → channel injection cortex.rs Added both layers (was completely unscrubbed)

Verification

  • gate-pr passes (373 lib tests, clippy clean, fmt clean)
  • 3 new tests for scrub_leaks() covering single secret, multiple secrets, and clean passthrough

Workers and branches were being terminated when leak detection found a
secret pattern in tool output (e.g. reading a file containing an API
key). This was overly aggressive — these processes cannot communicate
with users directly, so the terminate-to-prevent-exfiltration rationale
doesn't apply.

Changes:
- guard_tool_result now checks process_type: Worker/Branch log a warning
  and continue; Channel/Compactor/Cortex still terminate
- Added scrub_leaks() to scrub.rs for regex-based redaction of unknown
  secret patterns on egress paths
- All egress paths (worker results, branch conclusions, set_status,
  cortex task workers, branch failure messages) now apply both Layer 1
  (exact-match store scrubbing) and Layer 2 (regex leak scrubbing)
- Fixed pre-existing gap: cortex pickup_one_ready_task worker results
  were completely unscrubbed before reaching channel injection
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 4, 2026

Walkthrough

Adds a regex-based redaction pass (scrub_leaks) and applies it after existing store-based scrubbing across agent flows, tool status broadcasting, and hook leak handling; refines leak handling behavior for different process types (non-fatal for Worker/Branch, fatal for Channel/Compactor/Cortex).

Changes

Cohort / File(s) Summary
Secret scrubbing infra
src/secrets/scrub.rs
Adds pub fn scrub_leaks(&str) -> String, a PEM-block regex and plaintext leak patterns redaction; includes new tests covering single/multiple secrets and PEM blocks.
Agent flows (branch/channel/cortex)
src/agent/branch.rs, src/agent/channel_dispatch.rs, src/agent/cortex.rs
Always apply two-layer scrubbing: run existing store-based scrub_with_store then scrub_leaks on branch conclusions, channel/worker results, and cortex worker outputs/errors; capture secret snapshots at spawn time.
Tool status & hooks
src/tools/set_status.rs, src/hooks/spacebot.rs
Apply scrub_leaks after store-based scrubbing for status broadcasts; refine guard_tool_result to log non-fatal leaks for Worker/Branch and keep termination for Channel/Compactor/Cortex; adjust event emissions and leak logging.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately captures the main change: workers/branches now continue when secrets are detected in tool output, rather than terminating.
Description check ✅ Passed The description is comprehensive and directly related to the changeset, explaining the problem, approach, and verification for the secret-handling improvements.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/worker-secret-leak-continue

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Comment on lines 260 to 273
async move {
if let Err(error) = branch.run(&prompt).await {
tracing::error!(branch_id = %branch_id, %error, "branch failed");
// Scrub the failure message in case the error contains secrets
// (e.g. from failed tool calls echoing back prompt content).
let conclusion =
crate::secrets::scrub::scrub_leaks(&format!("Branch failed: {error}"));
let _ = event_tx.send(crate::ProcessEvent::BranchResult {
agent_id,
branch_id,
channel_id,
conclusion: format!("Branch failed: {error}"),
conclusion,
});
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch scrubbing the branch failure path.

One thing: this only applies scrub_leaks(). For consistency with the other egress paths in this PR (and to cover secrets that are in the store), it’d be good to also apply scrub_with_store() here. Since branch.run() consumes branch, you can snapshot secrets before calling it.

Suggested change
async move {
if let Err(error) = branch.run(&prompt).await {
tracing::error!(branch_id = %branch_id, %error, "branch failed");
// Scrub the failure message in case the error contains secrets
// (e.g. from failed tool calls echoing back prompt content).
let conclusion =
crate::secrets::scrub::scrub_leaks(&format!("Branch failed: {error}"));
let _ = event_tx.send(crate::ProcessEvent::BranchResult {
agent_id,
branch_id,
channel_id,
conclusion: format!("Branch failed: {error}"),
conclusion,
});
}
async move {
let secrets_snapshot = branch.deps.runtime_config.secrets.load().clone();
if let Err(error) = branch.run(&prompt).await {
tracing::error!(branch_id = %branch_id, %error, "branch failed");
// Scrub the failure message in case the error contains secrets
// (e.g. from failed tool calls echoing back prompt content).
let raw = format!("Branch failed: {error}");
let scrubbed = if let Some(store) = secrets_snapshot.as_ref() {
crate::secrets::scrub::scrub_with_store(&raw, store)
} else {
raw
};
let conclusion = crate::secrets::scrub::scrub_leaks(&scrubbed);
let _ = event_tx.send(crate::ProcessEvent::BranchResult {
agent_id,
branch_id,
channel_id,
conclusion,
});
}
}

Comment on lines +233 to +243
/// Does NOT check encoded forms (base64, hex, URL-encoded) — those are
/// unlikely to appear in LLM-generated output text.
pub fn scrub_leaks(content: &str) -> String {
let mut result = content.to_string();
for pattern in LEAK_PATTERNS.iter() {
result = pattern
.replace_all(&result, "[LEAKED_SECRET_REDACTED]")
.into_owned();
}
result
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scrub_leaks() currently uses LEAK_PATTERNS, which includes the private-key header regex. That pattern only matches the header line, so this will redact -----BEGIN ... PRIVATE KEY----- but leave the base64 body untouched.

Given this PR is relying on egress scrubbing more heavily, I think it’s worth adding a dedicated multi-line PEM block scrub so we don’t accidentally “hide the header” but still leak the key material.

Suggested change
/// Does NOT check encoded forms (base64, hex, URL-encoded) — those are
/// unlikely to appear in LLM-generated output text.
pub fn scrub_leaks(content: &str) -> String {
let mut result = content.to_string();
for pattern in LEAK_PATTERNS.iter() {
result = pattern
.replace_all(&result, "[LEAKED_SECRET_REDACTED]")
.into_owned();
}
result
}
pub fn scrub_leaks(content: &str) -> String {
static PEM_BLOCK: LazyLock<Regex> = LazyLock::new(|| {
Regex::new(r"(?s)-----BEGIN [^-]+-----.*?-----END [^-]+-----").expect("hardcoded regex")
});
let mut result = PEM_BLOCK
.replace_all(content, "[LEAKED_SECRET_REDACTED]")
.into_owned();
for pattern in LEAK_PATTERNS.iter() {
result = pattern
.replace_all(&result, "[LEAKED_SECRET_REDACTED]")
.into_owned();
}
result
}

Comment on lines 260 to +286
if let Some(leak) = self.scan_for_leaks(result) {
tracing::error!(
process_id = %self.process_id,
tool_name = %tool_name,
leak_prefix = %&leak[..leak.len().min(8)],
"secret leak detected in tool output, terminating agent"
);
return HookAction::Terminate {
reason: "Tool output contained a secret. Agent terminated to prevent exfiltration."
.into(),
};
match self.process_type {
ProcessType::Worker | ProcessType::Branch => {
// Workers and branches cannot communicate with users directly.
// Their egress paths (worker results, branch conclusions,
// status updates) scrub secrets before content reaches the
// channel. Log and continue rather than killing the process.
tracing::warn!(
process_id = %self.process_id,
tool_name = %tool_name,
leak_prefix = %&leak[..leak.len().min(8)],
"secret detected in tool output (non-channel process, continuing)"
);
}
ProcessType::Channel | ProcessType::Compactor | ProcessType::Cortex => {
tracing::error!(
process_id = %self.process_id,
tool_name = %tool_name,
leak_prefix = %&leak[..leak.len().min(8)],
"secret leak detected in tool output, terminating agent"
);
return HookAction::Terminate {
reason:
"Tool output contained a secret. Agent terminated to prevent exfiltration."
.into(),
};
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit on the new “continue” path: leak_prefix still logs the first few chars of the matched secret. Since this can fire in more places now (workers/branches), I’d avoid logging any secret fragment at all.

One option is to log only the leak kind (plaintext vs url/base64/hex-encoded) without the match contents:

Suggested change
if let Some(leak) = self.scan_for_leaks(result) {
tracing::error!(
process_id = %self.process_id,
tool_name = %tool_name,
leak_prefix = %&leak[..leak.len().min(8)],
"secret leak detected in tool output, terminating agent"
);
return HookAction::Terminate {
reason: "Tool output contained a secret. Agent terminated to prevent exfiltration."
.into(),
};
match self.process_type {
ProcessType::Worker | ProcessType::Branch => {
// Workers and branches cannot communicate with users directly.
// Their egress paths (worker results, branch conclusions,
// status updates) scrub secrets before content reaches the
// channel. Log and continue rather than killing the process.
tracing::warn!(
process_id = %self.process_id,
tool_name = %tool_name,
leak_prefix = %&leak[..leak.len().min(8)],
"secret detected in tool output (non-channel process, continuing)"
);
}
ProcessType::Channel | ProcessType::Compactor | ProcessType::Cortex => {
tracing::error!(
process_id = %self.process_id,
tool_name = %tool_name,
leak_prefix = %&leak[..leak.len().min(8)],
"secret leak detected in tool output, terminating agent"
);
return HookAction::Terminate {
reason:
"Tool output contained a secret. Agent terminated to prevent exfiltration."
.into(),
};
}
if let Some(leak) = self.scan_for_leaks(result) {
let leak_kind = leak
.split_once(':')
.map(|(kind, _)| kind)
.unwrap_or("plaintext");
match self.process_type {
ProcessType::Worker | ProcessType::Branch => {
// Workers and branches cannot communicate with users directly.
// Their egress paths (worker results, branch conclusions,
// status updates) scrub secrets before content reaches the
// channel. Log and continue rather than killing the process.
tracing::warn!(
process_id = %self.process_id,
tool_name = %tool_name,
leak_kind = %leak_kind,
"secret detected in tool output (non-channel process, continuing)"
);
}
ProcessType::Channel | ProcessType::Compactor | ProcessType::Cortex => {
tracing::error!(
process_id = %self.process_id,
tool_name = %tool_name,
leak_kind = %leak_kind,
"secret leak detected in tool output, terminating agent"
);
return HookAction::Terminate {
reason:
"Tool output contained a secret. Agent terminated to prevent exfiltration."
.into(),
};
}
}
}

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
src/tools/set_status.rs (1)

39-41: ⚠️ Potential issue | 🟠 Major

Status scrubbing depends on a spawn-time secret snapshot.

SetStatusTool redacts with tool_secret_pairs captured at construction. If secrets rotate or are added during a long-running worker, Layer 1 exact-match redaction can miss them. Consider reading from SecretsStore at call() time (scrub_with_store) before applying scrub_leaks().

Also applies to: 104-107

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/tools/set_status.rs` around lines 39 - 41, SetStatusTool currently
captures secrets at construction via with_tool_secrets into tool_secret_pairs,
which causes redaction to miss rotated/added secrets; update the call()
implementation to fetch the latest secrets from SecretsStore at runtime (use
scrub_with_store to apply the store-backed redaction) before running
scrub_leaks, and refactor with_tool_secrets/tool_secret_pairs usage to either
keep as fallback or remove reliance on the snapshot so call() always consults
SecretsStore first; ensure references to SetStatusTool, with_tool_secrets,
tool_secret_pairs, call(), scrub_with_store, scrub_leaks, and SecretsStore are
used so the live-secret scrub happens at spawn-time execution rather than
construction.
src/agent/channel_dispatch.rs (1)

623-636: ⚠️ Potential issue | 🟠 Major

Panic-derived worker failures bypass both scrub layers.

The panic path wraps panic_message directly into WorkerCompletionError::failed(...), then emits it in completion events without scrub_with_store/scrub_leaks.

🔧 Proposed fix
             Err(panic_payload) => {
                 let panic_message = panic_payload
                     .downcast_ref::<&str>()
                     .map(|message| (*message).to_string())
                     .or_else(|| panic_payload.downcast_ref::<String>().cloned())
                     .unwrap_or_else(|| "unknown panic payload".to_string());
@@
-                Err(WorkerCompletionError::failed(format!(
-                    "worker task panicked: {panic_message}"
-                )))
+                let scrubbed_panic = if let Some(store) = &secrets_store {
+                    crate::secrets::scrub::scrub_with_store(&panic_message, store)
+                } else {
+                    panic_message
+                };
+                let scrubbed_panic = crate::secrets::scrub::scrub_leaks(&scrubbed_panic);
+                Err(WorkerCompletionError::failed(format!(
+                    "worker task panicked: {scrubbed_panic}"
+                )))
             }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/channel_dispatch.rs` around lines 623 - 636, The panic path
captures panic_payload into panic_message and then constructs
WorkerCompletionError::failed and logs it without redaction; update this path to
run panic_message through the existing scrub pipeline (call scrub_with_store and
then scrub_leaks, using the same store/context used elsewhere) before logging
and before passing into WorkerCompletionError::failed so completion events see
the scrubbed string; reference the panic_payload/panic_message handling in the
Err arm, the WorkerCompletionError::failed call, and reuse the existing
scrub_with_store and scrub_leaks helpers to produce the sanitized message for
both tracing::error and the error construction.
src/hooks/spacebot.rs (1)

259-291: ⚠️ Potential issue | 🟡 Minor

Add targeted tests for the new process-type leak matrix.

This security-critical behavior split (Worker/Branch => continue, Channel/Compactor/Cortex => terminate) needs direct tests for both guard_tool_result outcomes and emitted event payload safety.

As per coding guidelines, "For changes in async/stateful paths (worker lifecycle, cancellation, retrigger, recall cache behavior), include explicit race/terminal-state reasoning in the PR summary and run targeted tests in addition to just gate-pr".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hooks/spacebot.rs` around lines 259 - 291, Add targeted unit tests for
guard_tool_result: mock or stub scan_for_leaks to return Some(leak) and assert
that for ProcessType::Worker and ProcessType::Branch the function returns
HookAction::Continue and emits a warning (tracing message) that includes only
the leak_prefix (first ≤8 chars) and not the full secret; and assert that for
ProcessType::Channel, ProcessType::Compactor, and ProcessType::Cortex it returns
HookAction::Terminate with the expected reason string and emits an error-level
tracing record containing the leak_prefix but not the full secret. Also add
tests that any event/emitted payloads produced after guard_tool_result are
sanitized (no full secret content) to validate emitted event payload safety.
Ensure tests reference guard_tool_result, scan_for_leaks, process_type and
HookAction to locate the code under test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/agent/channel_dispatch.rs`:
- Around line 598-605: The redaction uses the spawn-time snapshot variable
secrets_store when calling crate::secrets::scrub::scrub_with_store(&text,
store), so exact-match redaction can miss secrets updated while the worker ran;
replace use of the captured secrets_store with a runtime lookup of the current
secrets store at completion (e.g., fetch the up-to-date store from the
shared/sealed secrets manager or lock the Arc/Mutex that holds the live store)
and then call scrub_with_store with that live store before calling
crate::secrets::scrub::scrub_leaks; apply the same change where secrets_store is
used at lines ~618-619 to ensure exact-match redaction always uses the latest
secrets rather than the spawn-time snapshot.
- Around line 263-267: The branch-failure path only applies redact regexes via
crate::secrets::scrub::scrub_leaks to build conclusion and then sends
ProcessEvent::BranchResult, which can still leak secrets stored at Layer 1;
update the code around the conclusion variable so you first apply the Layer 1
store scrubber (e.g., call the Layer 1 scrub function from the secrets store
module) and then pass that result through scrub_leaks, combining both phases
into the final conclusion before sending crate::ProcessEvent::BranchResult so
both store-level and regex-based scrubbing are applied.

In `@src/agent/cortex.rs`:
- Around line 1340-1342: The cortex event logging still writes raw
error.to_string() into logger.log/cortex_events even though
map_worker_completion_result wraps the error with
WorkerCompletionError::failed(scrub(...)); update the logging calls that use
error or error.to_string() (the occurrence around map_worker_completion_result
and the similar block at 1371-1378) to log the scrubbed message instead—either
extract the scrubbed string from scrub(error.to_string()) before constructing
WorkerCompletionError or use the already-scrubbed error_message returned by
map_worker_completion_result when calling logger.log/cortex_events so no raw
error text is sent to logs.
- Around line 1263-1274: Currently the spawned task captures a one-time
secrets_snapshot so Layer 1 redaction is stale; instead, move a handle to the
live secrets container into the spawned task (e.g. clone
deps.runtime_config.secrets and move that in), remove the pre-captured
secrets_snapshot, and change the scrub closure to fetch the current store on
each invocation (call secrets_handle.load().clone() and pass that to
crate::secrets::scrub::scrub_with_store) before calling scrub_leaks; keep using
the same scrub_with_store and scrub_leaks functions inside the closure and
ensure the tokio::spawn captures the secrets handle rather than a single
snapshot.

In `@src/hooks/spacebot.rs`:
- Around line 259-287: guard_tool_result currently detects leaks for
ProcessType::Worker and ProcessType::Branch but then continues, allowing
on_tool_result to emit ProcessEvent::ToolCompleted with the raw unsanitized
result; modify guard_tool_result (and related handling around
on_tool_result/ProcessEvent::ToolCompleted) so that when scan_for_leaks returns
Some(leak) for Worker or Branch you redact/sanitize the result before continuing
(e.g., call the existing scrub/sanitize routine or replace leaked spans with a
safe placeholder) and return/propagate the sanitized payload (or a HookAction
variant that carries the sanitized result) instead of the original string;
ensure callers of guard_tool_result (and any logic that emits
ProcessEvent::ToolCompleted) use that sanitized value so no raw tool output is
broadcast (also apply the same change to the other occurrence around the 500-537
region).

In `@src/secrets/scrub.rs`:
- Around line 233-243: scrub_leaks currently only replaces plaintext matches
(using LEAK_PATTERNS) so encoded secrets (base64/hex/url-encoded) slip through;
update scrub_leaks to also detect and redact encoded forms by either (a)
generating and replacing encoded variants of each LEAK_PATTERNS entry (base64,
hex, percent-encoding) before returning, or (b) scanning the content for encoded
substrings, decoding them (base64/hex/url-decode) and if the decoded value
matches a LEAK_PATTERNS entry, replace the original encoded substring with
"[LEAKED_SECRET_REDACTED]"; reference scrub_leaks, LEAK_PATTERNS, and
scan_for_leaks to mirror its encoding detection logic so egress redaction is
consistent.

---

Outside diff comments:
In `@src/agent/channel_dispatch.rs`:
- Around line 623-636: The panic path captures panic_payload into panic_message
and then constructs WorkerCompletionError::failed and logs it without redaction;
update this path to run panic_message through the existing scrub pipeline (call
scrub_with_store and then scrub_leaks, using the same store/context used
elsewhere) before logging and before passing into WorkerCompletionError::failed
so completion events see the scrubbed string; reference the
panic_payload/panic_message handling in the Err arm, the
WorkerCompletionError::failed call, and reuse the existing scrub_with_store and
scrub_leaks helpers to produce the sanitized message for both tracing::error and
the error construction.

In `@src/hooks/spacebot.rs`:
- Around line 259-291: Add targeted unit tests for guard_tool_result: mock or
stub scan_for_leaks to return Some(leak) and assert that for ProcessType::Worker
and ProcessType::Branch the function returns HookAction::Continue and emits a
warning (tracing message) that includes only the leak_prefix (first ≤8 chars)
and not the full secret; and assert that for ProcessType::Channel,
ProcessType::Compactor, and ProcessType::Cortex it returns HookAction::Terminate
with the expected reason string and emits an error-level tracing record
containing the leak_prefix but not the full secret. Also add tests that any
event/emitted payloads produced after guard_tool_result are sanitized (no full
secret content) to validate emitted event payload safety. Ensure tests reference
guard_tool_result, scan_for_leaks, process_type and HookAction to locate the
code under test.

In `@src/tools/set_status.rs`:
- Around line 39-41: SetStatusTool currently captures secrets at construction
via with_tool_secrets into tool_secret_pairs, which causes redaction to miss
rotated/added secrets; update the call() implementation to fetch the latest
secrets from SecretsStore at runtime (use scrub_with_store to apply the
store-backed redaction) before running scrub_leaks, and refactor
with_tool_secrets/tool_secret_pairs usage to either keep as fallback or remove
reliance on the snapshot so call() always consults SecretsStore first; ensure
references to SetStatusTool, with_tool_secrets, tool_secret_pairs, call(),
scrub_with_store, scrub_leaks, and SecretsStore are used so the live-secret
scrub happens at spawn-time execution rather than construction.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 27f96b9d-6bd1-4ca7-b673-966b3d727fa0

📥 Commits

Reviewing files that changed from the base of the PR and between 1d8a06a and f8e4000.

📒 Files selected for processing (6)
  • src/agent/branch.rs
  • src/agent/channel_dispatch.rs
  • src/agent/cortex.rs
  • src/hooks/spacebot.rs
  • src/secrets/scrub.rs
  • src/tools/set_status.rs

Comment on lines +598 to +605
// Layer 1: exact-match redaction of known secrets from the store.
// Layer 2: regex-based redaction of unknown secret patterns.
let scrubbed = if let Some(store) = &secrets_store {
crate::secrets::scrub::scrub_with_store(&text, store)
} else {
text
};
let scrubbed = crate::secrets::scrub::scrub_leaks(&scrubbed);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Worker completion redaction still uses a spawn-time store snapshot.

Layer 1 here depends on secrets_store captured before the worker starts. If secrets change during execution, exact-match redaction can miss updated values.

Also applies to: 618-619

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/channel_dispatch.rs` around lines 598 - 605, The redaction uses the
spawn-time snapshot variable secrets_store when calling
crate::secrets::scrub::scrub_with_store(&text, store), so exact-match redaction
can miss secrets updated while the worker ran; replace use of the captured
secrets_store with a runtime lookup of the current secrets store at completion
(e.g., fetch the up-to-date store from the shared/sealed secrets manager or lock
the Arc/Mutex that holds the live store) and then call scrub_with_store with
that live store before calling crate::secrets::scrub::scrub_leaks; apply the
same change where secrets_store is used at lines ~618-619 to ensure exact-match
redaction always uses the latest secrets rather than the spawn-time snapshot.

Comment on lines +1263 to +1274
let secrets_snapshot = deps.runtime_config.secrets.load().clone();
tokio::spawn(async move {
// Helper closure: scrub both known secrets (Layer 1) and unknown leak
// patterns (Layer 2) from text before it reaches channels or events.
let scrub = |text: String| -> String {
let scrubbed = if let Some(store) = secrets_snapshot.as_ref() {
crate::secrets::scrub::scrub_with_store(&text, store)
} else {
text
};
crate::secrets::scrub::scrub_leaks(&scrubbed)
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Cortex task scrubbing uses a stale secrets snapshot.

secrets_snapshot is captured before the worker task starts. For long-running picked-up tasks, secret updates after spawn are invisible to Layer 1 redaction.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/cortex.rs` around lines 1263 - 1274, Currently the spawned task
captures a one-time secrets_snapshot so Layer 1 redaction is stale; instead,
move a handle to the live secrets container into the spawned task (e.g. clone
deps.runtime_config.secrets and move that in), remove the pre-captured
secrets_snapshot, and change the scrub closure to fetch the current store on
each invocation (call secrets_handle.load().clone() and pass that to
crate::secrets::scrub::scrub_with_store) before calling scrub_leaks; keep using
the same scrub_with_store and scrub_leaks functions inside the closure and
ensure the tokio::spawn captures the secrets handle rather than a single
snapshot.

Comment on lines 1340 to 1342
let (error_message, notify, success) = map_worker_completion_result(Err(
WorkerCompletionError::failed(error.to_string()),
WorkerCompletionError::failed(scrub(error.to_string())),
));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Failure logs still persist unsanitized error text.

You scrub the failure text for completion events, but logger.log still stores raw {error}/error.to_string(), which can reintroduce secret leakage in cortex_events.

🔧 Proposed fix
-                let (error_message, notify, success) = map_worker_completion_result(Err(
-                    WorkerCompletionError::failed(scrub(error.to_string())),
-                ));
+                let scrubbed_error = scrub(error.to_string());
+                let (error_message, notify, success) = map_worker_completion_result(Err(
+                    WorkerCompletionError::failed(scrubbed_error.clone()),
+                ));
@@
                 logger.log(
                     "task_pickup_failed",
-                    &format!("Picked-up task #{} failed: {error}", task.task_number),
+                    &format!("Picked-up task #{} failed: {scrubbed_error}", task.task_number),
                     Some(serde_json::json!({
                         "task_number": task.task_number,
                         "worker_id": worker_id.to_string(),
-                        "error": error.to_string(),
+                        "error": scrubbed_error,
                     })),
                 );

Also applies to: 1371-1378

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/cortex.rs` around lines 1340 - 1342, The cortex event logging still
writes raw error.to_string() into logger.log/cortex_events even though
map_worker_completion_result wraps the error with
WorkerCompletionError::failed(scrub(...)); update the logging calls that use
error or error.to_string() (the occurrence around map_worker_completion_result
and the similar block at 1371-1378) to log the scrubbed message instead—either
extract the scrubbed string from scrub(error.to_string()) before constructing
WorkerCompletionError or use the already-scrubbed error_message returned by
map_worker_completion_result when calling logger.log/cortex_events so no raw
error text is sent to logs.

/// Apply shared safety checks for tool output before any downstream handling.
///
/// For channels, a detected secret terminates the agent immediately to prevent
/// exfiltration via the `reply` tool. For workers and branches, secrets are
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing to double-check with the new Worker/Branch “continue” behavior: the model will now see raw secret values in tool output.

Even if workers/branches can’t reply, they can still egress via tools like browser (outbound HTTP) and potentially shell/exec. Might be worth adding an args-level leak guard for those egressy tools (e.g. skip browser calls when scan_for_leaks(args) hits), or at least clarifying that the threat model here is strictly “don’t leak to channel/user”.

…crub ToolCompleted events

- Add PEM_BLOCK regex to scrub_leaks() so full private key blocks are
  redacted (header + body + footer), not just the header line
- Remove leak_prefix from worker/branch warn logs — only log the
  encoding kind (plaintext/url/base64/hex) and length, never secret
  fragments
- Scrub worker/branch ToolCompleted event payloads before they reach the
  SSE dashboard
- Add scrub_with_store() to branch failure path for Layer 1 consistency
- Add PEM block scrubbing test
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
src/hooks/spacebot.rs (1)

287-293: ⚠️ Potential issue | 🟠 Major

Terminate-path logging still leaks secret fragments via leak_prefix.

The Channel/Compactor/Cortex branch logs part of the matched secret. Please log only metadata (kind/length) here as well to avoid persisting secret material in logs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hooks/spacebot.rs` around lines 287 - 293, The log in the branch handling
ProcessType::Channel | ProcessType::Compactor | ProcessType::Cortex currently
emits leak_prefix (part of the secret); change the tracing::error call in that
branch (the block using self.process_id and tool_name) to stop including
leak_prefix and instead log only non-secret metadata such as the secret kind and
its length (e.g., leak_kind and leak_len) so no secret fragments are written to
logs.
src/secrets/scrub.rs (1)

241-255: ⚠️ Potential issue | 🟠 Major

scrub_leaks() misses encoded leak forms that scan_for_leaks() already flags.

scan_for_leaks() detects url/base64/hex-encoded secrets, but scrub_leaks() only redacts plaintext/PEM. In worker/branch “continue” paths, encoded secrets can still egress in payload text.

Please align redaction with detection (or decode-and-replace encoded matches) and add coverage for encoded redaction cases.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/secrets/scrub.rs` around lines 241 - 255, scrub_leaks currently only
redacts PEM_BLOCK and LEAK_PATTERNS but misses encoded forms that scan_for_leaks
flags; update scrub_leaks (the function) to also redact encoded matches by
either (a) applying the same encoded regexes used by scan_for_leaks (e.g.
BASE64_PATTERN, HEX_PATTERN, URL_ENCODED_PATTERN) to replace_with
"[LEAKED_SECRET_REDACTED]" or (b) invoking scan_for_leaks to enumerate matches
and replacing each match (including decoded variants) in the result; ensure you
reference and reuse the existing regex constants (PEM_BLOCK, LEAK_PATTERNS and
the encoded-pattern constants) so encoded/base64/hex/url-encoded secrets are
redacted the same way they are detected.
src/agent/channel_dispatch.rs (1)

252-253: ⚠️ Potential issue | 🟠 Major

Spawn-time secrets snapshots can miss rotations before completion/failure scrubbing.

Both the branch failure path and worker completion/failure path scrub with stores captured before the async run starts. If secrets change during execution, Layer 1 redaction can miss newer values.

Prefer loading the current store at scrub time (inside the async completion/failure block) instead of using a spawn-time snapshot.

Also applies to: 269-274, 608-613, 621-626

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/agent/channel_dispatch.rs` around lines 252 - 253, Currently a spawn-time
snapshot variable secrets_snapshot is captured and used for scrubbing, which can
miss secret rotations; instead, remove the use of the pre-captured
secrets_snapshot and call state.deps.runtime_config.secrets.load().clone()
inside the async completion/failure scrub blocks so the scrub reads the current
store at scrub time (replace occurrences of secrets_snapshot with a fresh load
in the completion/failure handlers and the other similar occurrences).
🧹 Nitpick comments (1)
src/secrets/scrub.rs (1)

360-364: Avoid embedding PEM-looking key blocks directly in test source.

This fixture is likely to keep triggering secret scanners. Prefer constructing the PEM string dynamically at runtime (split/concat header/body/footer) so test intent remains while avoiding static secret-pattern matches in committed code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/secrets/scrub.rs` around lines 360 - 364, The test
scrub_leaks_redacts_full_pem_block currently embeds a PEM-like block literal
which can trigger secret scanners; change the fixture to build the PEM string at
runtime by concatenating or joining separate pieces (e.g., header, body
segments, footer) instead of a single hard-coded block so the test intent
remains but no static secret-pattern appears in source; update the variable in
scrub_leaks_redacts_full_pem_block accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/hooks/spacebot.rs`:
- Around line 548-555: The ToolCompleted branch currently only calls scrub_leaks
on result and misses store-based redaction; fetch the live/current secrets store
(using your existing secrets store accessor, e.g.
crate::secrets::store::current() or equivalent), call
crate::secrets::scrub::scrub_with_store(result, &store) first, then pass that
output into crate::secrets::scrub::scrub_leaks, then truncate with
crate::tools::truncate_output(..., crate::tools::MAX_TOOL_OUTPUT_BYTES) and
finally call self.emit_tool_completed_event_from_capped(tool_name, capped);
ensure you still only do this for matches!(self.process_type,
ProcessType::Worker | ProcessType::Branch).

---

Duplicate comments:
In `@src/agent/channel_dispatch.rs`:
- Around line 252-253: Currently a spawn-time snapshot variable secrets_snapshot
is captured and used for scrubbing, which can miss secret rotations; instead,
remove the use of the pre-captured secrets_snapshot and call
state.deps.runtime_config.secrets.load().clone() inside the async
completion/failure scrub blocks so the scrub reads the current store at scrub
time (replace occurrences of secrets_snapshot with a fresh load in the
completion/failure handlers and the other similar occurrences).

In `@src/hooks/spacebot.rs`:
- Around line 287-293: The log in the branch handling ProcessType::Channel |
ProcessType::Compactor | ProcessType::Cortex currently emits leak_prefix (part
of the secret); change the tracing::error call in that branch (the block using
self.process_id and tool_name) to stop including leak_prefix and instead log
only non-secret metadata such as the secret kind and its length (e.g., leak_kind
and leak_len) so no secret fragments are written to logs.

In `@src/secrets/scrub.rs`:
- Around line 241-255: scrub_leaks currently only redacts PEM_BLOCK and
LEAK_PATTERNS but misses encoded forms that scan_for_leaks flags; update
scrub_leaks (the function) to also redact encoded matches by either (a) applying
the same encoded regexes used by scan_for_leaks (e.g. BASE64_PATTERN,
HEX_PATTERN, URL_ENCODED_PATTERN) to replace_with "[LEAKED_SECRET_REDACTED]" or
(b) invoking scan_for_leaks to enumerate matches and replacing each match
(including decoded variants) in the result; ensure you reference and reuse the
existing regex constants (PEM_BLOCK, LEAK_PATTERNS and the encoded-pattern
constants) so encoded/base64/hex/url-encoded secrets are redacted the same way
they are detected.

---

Nitpick comments:
In `@src/secrets/scrub.rs`:
- Around line 360-364: The test scrub_leaks_redacts_full_pem_block currently
embeds a PEM-like block literal which can trigger secret scanners; change the
fixture to build the PEM string at runtime by concatenating or joining separate
pieces (e.g., header, body segments, footer) instead of a single hard-coded
block so the test intent remains but no static secret-pattern appears in source;
update the variable in scrub_leaks_redacts_full_pem_block accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 710d64a5-420d-4fd4-9a4e-dda582021324

📥 Commits

Reviewing files that changed from the base of the PR and between f8e4000 and 321dd25.

📒 Files selected for processing (3)
  • src/agent/channel_dispatch.rs
  • src/hooks/spacebot.rs
  • src/secrets/scrub.rs

Comment on lines +548 to +555
// event subscribers with multi-MB tool results. For worker/branch
// processes, scrub leak patterns from the event payload so secrets
// don't reach the SSE dashboard.
if matches!(self.process_type, ProcessType::Worker | ProcessType::Branch) {
let scrubbed = crate::secrets::scrub::scrub_leaks(result);
let capped =
crate::tools::truncate_output(&scrubbed, crate::tools::MAX_TOOL_OUTPUT_BYTES);
self.emit_tool_completed_event_from_capped(tool_name, capped);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Worker/Branch ToolCompleted payload scrubbing is missing Layer 1 store redaction.

This path applies scrub_leaks() only. Secrets known to the store but not matching regex can still reach SSE/dashboard via ProcessEvent::ToolCompleted.

Please apply store-based scrubbing first (scrub_with_store) and then scrub_leaks, using a live/current secrets store.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/hooks/spacebot.rs` around lines 548 - 555, The ToolCompleted branch
currently only calls scrub_leaks on result and misses store-based redaction;
fetch the live/current secrets store (using your existing secrets store
accessor, e.g. crate::secrets::store::current() or equivalent), call
crate::secrets::scrub::scrub_with_store(result, &store) first, then pass that
output into crate::secrets::scrub::scrub_leaks, then truncate with
crate::tools::truncate_output(..., crate::tools::MAX_TOOL_OUTPUT_BYTES) and
finally call self.emit_tool_completed_event_from_capped(tool_name, capped);
ensure you still only do this for matches!(self.process_type,
ProcessType::Worker | ProcessType::Branch).

@jamiepine jamiepine merged commit 9eceb96 into main Mar 4, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant