Split channel.rs and standardize adapter metadata keys #271
Conversation
Break up the ~3900-line channel.rs into flat channel_*.rs files:

- channel_attachments.rs: image/text/audio download, base64, transcription
- channel_history.rs: history reconciliation, message formatting, 18 tests
- channel_prompt.rs: TemporalContext, timezone resolution, constants
- channel_dispatch.rs: branch/worker spawning, limits, readiness checks

Add a metadata_keys module in lib.rs with standard keys (server_name, channel_name, message_id, reply_to_message_id, reply_to_text). All adapters now set these alongside platform-specific keys. Consumers read standard keys instead of platform-specific fallback chains:

- channel.rs conversation context uses metadata_keys::SERVER_NAME/CHANNEL_NAME
- extract_message_id reads the standard message_id (was Discord-only)
- retrigger metadata uses metadata_keys::REPLY_TO_MESSAGE_ID
- reply context reads metadata_keys::REPLY_TO_TEXT
- extract_display_name reads channel_name (display formatting pushed to adapters)

The Slack adapter now sets channel_name with a # prefix for channels. The email adapter already sets the 'Email: subject' format. The Discord outbound reply reads the standard key only (fallback removed).
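The standardized keys described above can be sketched as a minimal module. The constant names come from this description; the literal string values and the `insert_channel_name` helper are assumptions for illustration, not the PR's actual code:

```rust
use std::collections::HashMap;

// Sketch of the centralized metadata_keys module. Constant names are from
// the PR description; the string values are assumptions.
pub mod metadata_keys {
    pub const SERVER_NAME: &str = "server_name";
    pub const CHANNEL_NAME: &str = "channel_name";
    pub const MESSAGE_ID: &str = "message_id";
    pub const REPLY_TO_MESSAGE_ID: &str = "reply_to_message_id";
    pub const REPLY_TO_TEXT: &str = "reply_to_text";
}

// An adapter writes under the shared constants alongside its own keys,
// so consumers no longer need platform-specific fallback chains.
pub fn insert_channel_name(metadata: &mut HashMap<String, String>, name: &str) {
    metadata.insert(metadata_keys::CHANNEL_NAME.to_string(), name.to_string());
}

fn main() {
    let mut metadata = HashMap::new();
    insert_channel_name(&mut metadata, "#general");
    // A consumer reads the standard key regardless of which adapter wrote it.
    assert_eq!(
        metadata.get(metadata_keys::CHANNEL_NAME).map(String::as_str),
        Some("#general")
    );
    println!("ok");
}
```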
Walkthrough

This PR refactors the agent module by extracting channel-related functionality into four new specialized modules: channel_attachments for attachment processing, channel_dispatch for spawning background tasks and workers, channel_history for history reconciliation and message formatting, and channel_prompt for temporal context and system prompt assembly. It also introduces a centralized metadata_keys module with standardized keys (SERVER_NAME, CHANNEL_NAME, MESSAGE_ID, REPLY_TO_MESSAGE_ID, REPLY_TO_TEXT) used across all messaging adapters.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 3 passed
src/agent/channel_history.rs
Outdated
    format!("[{display_name}] ({absolute_timestamp}; {relative_text}): {text_content}")
}

pub(crate) fn extract_message_id(message: &InboundMessage) -> Option<u64> {
MESSAGE_ID is set as a string in adapters (e.g. Slack ts, Telegram message_id). as_u64() will always return None and reply threading will regress for numeric platforms. Suggest parsing the string too.
pub(crate) fn extract_message_id(message: &InboundMessage) -> Option<u64> {
    let value = message.metadata.get(crate::metadata_keys::MESSAGE_ID)?;
    if let Some(id) = value.as_u64() {
        return Some(id);
    }
    value.as_str()?.parse().ok()
}
    .get("reply_to_author")
    .and_then(|v| v.as_str())
    .map(|author| {
        let content_preview = message
If not every adapter is populating the standardized reply_to_text key yet, this will drop the quoted preview. Maybe keep a temporary fallback to the legacy key.
let content_preview = message
    .metadata
    .get(crate::metadata_keys::REPLY_TO_TEXT)
    .or_else(|| message.metadata.get("reply_to_text"))
    .and_then(|v| v.as_str())
    .unwrap_or("");
let content = String::from_utf8_lossy(&bytes).into_owned();

// Truncate very large files to avoid blowing up context
let truncated = if content.len() > 50_000 {
&content[..50_000] can panic if 50k lands mid-codepoint. Using floor_char_boundary keeps this safe while still truncating by bytes.
let truncated = if content.len() > 50_000 {
    let boundary = content.floor_char_boundary(50_000);
    format!(
        "{}...\n[truncated — {} bytes total]",
        &content[..boundary],
        content.len()
    )
} else {
    content
};
    };
    (scrubbed, true, true)
}
Err(error) => {
Nice that success results are scrubbed before emitting WorkerComplete. The error path still forwards Worker failed: {error} unsanitized, which can leak tool secrets into channel context.
Err(error) => {
    tracing::error!(worker_id = %worker_id, %error, "worker failed");
    let error_text = format!("Worker failed: {error}");
    let scrubbed = if let Some(store) = &secrets_store {
        crate::secrets::scrub::scrub_with_store(&error_text, store)
    } else {
        error_text
    };
    (scrubbed, true, false)
}
Also minor: worker_duration_seconds is labeled as builtin even when this helper is used by OpenCode workers now.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/messaging/discord.rs (1)
916-935: ⚠️ Potential issue | 🟠 Major

Populate REPLY_TO_MESSAGE_ID for replied Discord messages.

respond() now reads crate::metadata_keys::REPLY_TO_MESSAGE_ID, but this metadata block only stores the reply author and text. Without writing the reply target ID, reply-thread targeting becomes a no-op for replied messages.

🔧 Proposed fix

  if let Some(referenced) = &message.referenced_message {
+     metadata.insert(
+         crate::metadata_keys::REPLY_TO_MESSAGE_ID.into(),
+         referenced.id.get().into(),
+     );
      let reply_author = referenced
          .author
          .global_name
          .as_deref()
          .unwrap_or(&referenced.author.name);

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/messaging/discord.rs` around lines 916 - 935, The reply-metadata block for message.referenced_message populates reply author/text but omits the reply target ID, so respond() cannot target the original message; update the block that handles message.referenced_message to also insert crate::metadata_keys::REPLY_TO_MESSAGE_ID with the referenced message's ID (e.g., referenced.id converted to a string) into metadata so reply-thread targeting works; locate this in the same if let Some(referenced) = &message.referenced_message { ... } section and add the metadata.insert call using the existing metadata map and crate::metadata_keys::REPLY_TO_MESSAGE_ID symbol.
🧹 Nitpick comments (1)
src/messaging/webhook.rs (1)
267-270: Consider adding standardized metadata keys for consistency.

The webhook adapter now sets sender_display_name, but unlike other adapters (Discord, Slack, Telegram, Email, Twitch), it doesn't set crate::metadata_keys::CHANNEL_NAME or crate::metadata_keys::MESSAGE_ID. This may cause extract_display_name in src/conversation/channels.rs to return None for webhook channels.

If webhook channels should appear in the channel list with a display name, consider adding:

  metadata.insert(
      "sender_display_name".into(),
      serde_json::Value::String(request.sender_id.clone()),
  );
+ metadata.insert(
+     crate::metadata_keys::CHANNEL_NAME.into(),
+     serde_json::Value::String(format!("webhook:{}", request.conversation_id)),
+ );

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/messaging/webhook.rs` around lines 267 - 270, The webhook adapter currently inserts "sender_display_name" into metadata via metadata.insert(..., serde_json::Value::String(request.sender_id.clone())), but doesn't set the standardized keys crate::metadata_keys::CHANNEL_NAME and crate::metadata_keys::MESSAGE_ID, which can make extract_display_name in src/conversation/channels.rs return None; update the webhook metadata block to also insert crate::metadata_keys::CHANNEL_NAME (e.g., the webhook channel or request.channel value) and crate::metadata_keys::MESSAGE_ID (e.g., request.message_id or a generated id) as serde_json::Value::String entries so webhook channels surface a display name consistently with Discord/Slack/Telegram/Email/Twitch adapters.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/agent/channel_attachments.rs`:
- Around line 419-423: The slice &content[..50_000] can panic on UTF-8
boundaries; compute a safe byte index using the project's floor_char_boundary
helper (e.g. let end = content.floor_char_boundary(50_000);) and replace
&content[..50_000] with &content[..end] when building the truncated string (the
let truncated = ... block referencing content). Ensure you pass the original
content.len() into the existing message and keep the same format string.
In `@src/agent/channel_dispatch.rs`:
- Around line 500-514: The Err arm of the future.await match formats the error
string without scrubbing secrets, which can leak sensitive data; update the Err
branch (the match handling that sets result_text, notify, success) to run the
error text through the same scrub logic used in the Ok branch (use secrets_store
and crate::secrets::scrub::scrub_with_store) before setting result_text so that
both success and failure payloads are scrubbed prior to emitting completion
events.
- Around line 194-205: The code currently drops send errors by calling .ok() on
state.deps.event_tx.send(crate::ProcessEvent::BranchStarted { ... }) which hides
failures to publish lifecycle events; replace the .ok() usage with explicit
error handling: await the send result, and on Err(logically handle) either log
the error with context using the existing logger (e.g., include agent_id,
branch_id, channel_id and description) or propagate the error from the
surrounding function; apply the same change pattern to the other send sites
mentioned (the other state.deps.event_tx.send(...) calls around the blocks at
the locations you flagged) so no ProcessEvent publish failure is silently
discarded.
- Around line 524-526: The duration metric currently hardcodes the worker type
label as "builtin" which misclassifies OpenCode workers; change the
.with_label_values call to use the actual worker type label variable instead of
the string literal (e.g. use the worker_type or worker_label argument passed
into spawn_worker_task). Update the metric invocation that references
worker_duration_seconds and agent_id so it uses that worker_type variable, and
ensure all call sites (spawn_worker_task and the places that call
spawn_worker_task with worker.run() / worker_span) pass the correct worker type
string ("builtin" or "opencode") into the spawning/metrics path.
In `@src/agent/channel_history.rs`:
- Around line 337-342: The extract_message_id function currently only tries
value.as_u64(), which ignores numeric IDs stored as strings; update
extract_message_id (operating on InboundMessage and metadata key
metadata_keys::MESSAGE_ID) to first try as_u64() and, if that returns None, try
value.as_str().and_then(|s| s.parse::<u64>().ok()) so stringified numeric IDs
are parsed and returned as Some(u64), keeping the same Option<u64> signature and
behavior when parsing fails.
- Around line 800-801: The tests incorrectly import Arc from crate root; replace
any use statements like "use crate::{Arc, InboundMessage};" with "use
std::sync::Arc; use crate::InboundMessage;" (or separate imports) so Arc comes
from std::sync while InboundMessage remains from crate; ensure all occurrences
(e.g., the imports around Arc at the top of channel_history.rs) are updated and
no lingering "crate::Arc" references remain.
In `@src/messaging/twitch.rs`:
- Around line 279-282: extract_message_id() currently only uses
serde_json::Value::as_u64(), which returns None for the adapters that store
MESSAGE_ID as a string (serde_json::Value::String); update extract_message_id()
in channel_history.rs to also handle Value::String by attempting to parse the
string into a u64 (e.g., match on Value::String(s) then s.parse::<u64>()), and
return that parsed value as the message id so reply_target_message_id and
extract_reply_message_id() work across all adapters that stringify MESSAGE_ID;
keep existing as_u64() branch to preserve numeric storage support.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
src/agent.rs, src/agent/channel.rs, src/agent/channel_attachments.rs, src/agent/channel_dispatch.rs, src/agent/channel_history.rs, src/agent/channel_prompt.rs, src/conversation/channels.rs, src/lib.rs, src/messaging/discord.rs, src/messaging/email.rs, src/messaging/slack.rs, src/messaging/telegram.rs, src/messaging/twitch.rs, src/messaging/webhook.rs, src/tools/branch_tool.rs, src/tools/spawn_worker.rs
state
    .deps
    .event_tx
    .send(crate::ProcessEvent::BranchStarted {
        agent_id: state.deps.agent_id.clone(),
        branch_id,
        channel_id: state.channel_id.clone(),
        description: status_label.to_string(),
        reply_to_message_id: *state.reply_target_message_id.read().await,
    })
    .ok();
Do not silently discard ProcessEvent publish failures.
Using .ok() / let _ = here hides dropped-subscriber failures and removes operational visibility for lifecycle events.
🔧 Proposed fix pattern

- state
-     .deps
-     .event_tx
-     .send(crate::ProcessEvent::WorkerStarted {
+ if let Err(error) = state
+     .deps
+     .event_tx
+     .send(crate::ProcessEvent::WorkerStarted {
          agent_id: state.deps.agent_id.clone(),
          worker_id,
          channel_id: Some(state.channel_id.clone()),
          task: task.clone(),
          worker_type: "builtin".into(),
      })
-     .ok();
+ {
+     tracing::debug!(%error, worker_id = %worker_id, "failed to publish WorkerStarted");
+ }

- let _ = event_tx.send(ProcessEvent::WorkerComplete {
+ if let Err(error) = event_tx.send(ProcessEvent::WorkerComplete {
      agent_id,
      worker_id,
      channel_id,
      result: result_text,
      notify,
      success,
- });
+ }) {
+     tracing::debug!(%error, worker_id = %worker_id, "failed to publish WorkerComplete");
+ }

As per coding guidelines: "Don't silently discard errors; use let _ = only on channel sends where the receiver may be dropped; handle, log, or propagate all other errors."
Also applies to: 337-348, 452-463, 529-536
let (result_text, notify, success) = match future.await {
    Ok(text) => {
        // Scrub tool secret values from the result before it reaches
        // the channel. The channel never sees raw secret values.
        let scrubbed = if let Some(store) = &secrets_store {
            crate::secrets::scrub::scrub_with_store(&text, store)
        } else {
            text
        };
        (scrubbed, true, true)
    }
    Err(error) => {
        tracing::error!(worker_id = %worker_id, %error, "worker failed");
        (format!("Worker failed: {error}"), true, false)
    }
Scrub secrets in worker failure text before emitting completion events.
Failure payloads currently bypass secret scrubbing, so sensitive values in error messages can leak into channel-visible history.
🔧 Proposed fix

  Err(error) => {
      tracing::error!(worker_id = %worker_id, %error, "worker failed");
-     (format!("Worker failed: {error}"), true, false)
+     let failure_text = format!("Worker failed: {error}");
+     let scrubbed = if let Some(store) = &secrets_store {
+         crate::secrets::scrub::scrub_with_store(&failure_text, store)
+     } else {
+         failure_text
+     };
+     (scrubbed, true, false)
  }
.worker_duration_seconds
.with_label_values(&[&*agent_id, "builtin"])
.observe(worker_start.elapsed().as_secs_f64());
Use the actual worker type in duration metrics.
The shared completion path hardcodes "builtin", so OpenCode worker durations are misclassified.
🔧 Proposed refactor

  pub(crate) fn spawn_worker_task<F, E>(
      worker_id: WorkerId,
      event_tx: broadcast::Sender<ProcessEvent>,
      agent_id: crate::AgentId,
      channel_id: Option<ChannelId>,
      secrets_store: Option<Arc<crate::secrets::store::SecretsStore>>,
+     worker_type: &'static str,
      future: F,
  ) -> tokio::task::JoinHandle<()>
  @@
      metrics
          .worker_duration_seconds
-         .with_label_values(&[&*agent_id, "builtin"])
+         .with_label_values(&[&*agent_id, worker_type])
          .observe(worker_start.elapsed().as_secs_f64());

Call sites:

  spawn_worker_task(..., secrets_store, "builtin", worker.run().instrument(worker_span));
  spawn_worker_task(..., oc_secrets_store, "opencode", async move { ... }.instrument(worker_span));
use crate::{Arc, InboundMessage};
use chrono::Utc;
🧩 Analysis chain
🏁 Script executed:

  # Check if Arc is exported from crate root
  grep -n "pub.*Arc" src/lib.rs
  # Check the actual imports at lines 800-801 and 874-875
  sed -n '798,802p' src/agent/channel_history.rs
  sed -n '872,876p' src/agent/channel_history.rs
  # Also check if there's a use statement re-exporting Arc at module level
  grep -n "pub use.*Arc" src/lib.rs

Repository: spacedriveapp/spacebot
Length of output: 1249
🏁 Script executed:

  # Check the top-level imports in lib.rs
  head -50 src/lib.rs | grep -n "use\|Arc"

Repository: spacedriveapp/spacebot
Length of output: 263
Tests import non-existent crate::Arc symbol; use std::sync::Arc directly.
crate::Arc is not exported from the crate root. Import Arc from std::sync instead.
Proposed fix

- use crate::{Arc, InboundMessage};
+ use crate::InboundMessage;
+ use std::sync::Arc;

Applies to lines 800-801 and 874-875.
…ry panic in attachment truncation, improve Telegram/webhook channel_name metadata
  };
- metadata.insert("reply_to_content".into(), truncated.into());
+ metadata.insert("reply_to_content".into(), truncated.clone().into());
+ metadata.insert(crate::metadata_keys::REPLY_TO_TEXT.into(), truncated.into());
extract_reply_message_id() now reads metadata_keys::REPLY_TO_MESSAGE_ID, but inbound Discord metadata never sets it for referenced_message, so reply targeting looks like it’ll always be a no-op.
+ metadata.insert(
+     crate::metadata_keys::REPLY_TO_MESSAGE_ID.into(),
+     serde_json::Value::String(referenced.id.get().to_string()),
+ );
  metadata.insert(crate::metadata_keys::REPLY_TO_TEXT.into(), truncated.into());
Summary
- Split channel.rs into flat channel_*.rs modules: attachments, history, prompt, dispatch
- Added a metadata_keys module with standard keys all adapters now set (server_name, channel_name, message_id, reply_to_message_id, reply_to_text)
- Display formatting (Slack # prefix, Email Email: prefix) pushed into adapters so extract_display_name is platform-agnostic

File split: channel_attachments.rs, channel_history.rs, channel_prompt.rs, channel_dispatch.rs, channel.rs

Metadata standardization
Adapters updated: Discord, Telegram, Slack, Twitch, Email, Webhook
Consumers updated:
- channel.rs: conversation context reads SERVER_NAME/CHANNEL_NAME
- channel_history.rs: extract_message_id reads the standard key (was Discord-only); reply_to_text reads the standard key (was a fallback chain)
- channel.rs: retrigger metadata uses REPLY_TO_MESSAGE_ID
- discord.rs: outbound reply reads the standard key only (fallback removed)
- conversation/channels.rs: extract_display_name reads channel_name generically

Note
This refactoring reduces channel.rs from ~3900 lines to ~1800 lines by extracting focused modules for attachments, history reconciliation, prompt building, and dispatch logic. The new metadata_keys module establishes a standardized contract across all 6 adapters (Discord, Telegram, Slack, Twitch, Email, Webhook), eliminating platform-specific fallback chains in consumers. All 18 existing tests are preserved in channel_history.rs.

Written by Tembo for commit a6fe6a9.