Skip to content

feat(conductor): implement conductor task routing and observability features#157

Merged
yacosta738 merged 14 commits into
mainfrom
conductor
Mar 7, 2026
Merged

feat(conductor): implement conductor task routing and observability features#157
yacosta738 merged 14 commits into
mainfrom
conductor

Conversation

@yacosta738
Copy link
Copy Markdown
Contributor

This pull request introduces the initial implementation of a new "conductor" runtime for task orchestration in the agent runtime. The changes add core modules for classification, planning, event handling, runtime management, and performers, as well as integration points for channel-based task submission and testing. The conductor runtime is designed to classify, plan, and execute tasks in a modular and extensible way, with event broadcasting and test coverage for channel ingress.

Conductor Runtime Core Implementation

  • Added the main conductor runtime module (conductor/mod.rs), including runtime activation, task submission, and basic domain inference logic. This provides the foundational structure for orchestrating tasks and managing their lifecycle.
  • Introduced a modular task classification system with rule-based and chainable classifiers (conductor/classifier.rs). This enables the conductor to determine task domains (e.g., Coding, Research) and confidence levels, supporting both rule-based and (future) LLM-based classification.
  • Implemented a planning interface and a local plan model for decomposing tasks into steps, integrated into the conductor runtime.

Performer and Domain Support

  • Added performer stubs for Coding and Browser domains (conductor/performers/coding.rs, conductor/performers/browser.rs). These provide the interface for domain-specific task execution, returning completed status as placeholders. [1] [2]

Event and Configuration Handling

  • Introduced an event bus for broadcasting conductor events (conductor/events.rs), allowing other components to subscribe to task lifecycle events.
  • Added configuration utilities for resolving tick intervals from config or markdown front matter (conductor/config.rs).

Channel Integration and Testing

  • Integrated conductor task routing into channel message processing, allowing /task commands from channels to be submitted to the conductor runtime. Added a test for channel-based task submission and event emission. [1] [2] [3]
  • Added the notify crate dependency, likely for future file or event watching.

These changes lay the groundwork for a modular, event-driven task orchestration system within the agent runtime, with clear extension points for future domain performers, advanced classification, and richer planning strategies.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Mar 7, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds a new in-process Conductor subsystem to agent-runtime: planner/classifier, scheduler/runtime, performer pool with sandboxing/approval, durable task store (in-memory/SQLite), multi‑surface explicit task routing (/task via channels/CLI/gateway/cron), observability/metrics, config schema, prompt hot-reload, workspace isolation, many tests and docs.

Changes

Cohort / File(s) Summary
Core types & config
clients/agent-runtime/src/conductor/types.rs, clients/agent-runtime/src/conductor/config.rs, clients/agent-runtime/src/config/schema.rs, clients/agent-runtime/src/config/mod.rs
New conductor types (TaskId/StepId, TaskRequest, domains, statuses), ConductorConfig and nested planner/concurrency/retry/performer configs, validation, and tick-interval resolution (front-matter precedence).
Planner & classifier
clients/agent-runtime/src/conductor/planner.rs, clients/agent-runtime/src/conductor/classifier.rs, clients/agent-runtime/src/conductor/prompt_watcher.rs
Planner with fast/slow paths, plan validation and cycle detection; rule-based/LLM-chained classifiers; prompt hot-reload watcher.
Runtime, scheduler & events
clients/agent-runtime/src/conductor/mod.rs, clients/agent-runtime/src/conductor/service.rs, clients/agent-runtime/src/conductor/events.rs, clients/agent-runtime/src/conductor/task_store.rs
Active runtime activation/management, ConductorRuntime orchestration (submit→plan→enqueue→dispatch→execute), ConductorServiceCore queuing/backpressure, TaskStore with TransitionLog (InMemory/Failing/SQLite), atomic transitions and recovery, global event bus.
Performers & sandbox
clients/agent-runtime/src/conductor/performers/mod.rs, clients/agent-runtime/src/conductor/performers/{browser,coding,research,system}.rs
Performer trait and domain implementations; ScopedSandboxExecutor enforcing path scope; ApprovalGate; PerformerPool drives step execution with sandbox/approval/timeouts and progress events.
Ingress, gateway, daemon & cron
clients/agent-runtime/src/conductor/sources/mod.rs, clients/agent-runtime/src/channels/mod.rs, clients/agent-runtime/src/gateway/mod.rs, clients/agent-runtime/src/daemon/mod.rs, clients/agent-runtime/src/cron/{scheduler.rs,types.rs}, clients/agent-runtime/src/tools/cron_add.rs, clients/agent-runtime/src/lib.rs, clients/agent-runtime/src/main.rs
Routing helpers for explicit /task messages (channel/CLI/gateway/cron), gateway WS for conductor events, webhook normalization to TaskRequest, daemon supervision for conductor, cron job type support and submission flow, CLI routing to conductor.
Workspace & security
clients/agent-runtime/src/conductor/workspace.rs, clients/agent-runtime/src/conductor/performers/mod.rs
WorkspaceManager for per-task isolated workspaces, sanitize_workspace_leaf and is_within_root checks; path normalization and sandbox scope enforcement utilities.
Observability & metrics
clients/agent-runtime/src/observability/{traits,log,otel,prometheus}.rs
New ObserverEvent/ObserverMetric variants (ConductorStepLifecycle, ConductorApprovalTransition, PlannerLatency, ConductorQueueDepth), recording helper record_conductor_lifecycle, and metric registration/recording in Otel/Prometheus observers.
Tests & QA
clients/agent-runtime/tests/*
~15+ new/updated test modules covering config validation/precedence, planner, scheduler fairness, runtime end‑to‑end, store recovery, security/approval, observability, gateway/cron integration, sources, workspace isolation, serialization, and performance guards.
Docs, spec & archive
openspec/specs/conductor/spec.md, openspec/changes/archive/...
Design/proposal/tasks/verify/archival documents for the Conductor architecture and rollout.
Misc & tooling
clients/agent-runtime/Cargo.toml, clients/web/.../custom.css, clients/web/.../MarketingLayout.astro, clients/agent-runtime/src/onboard/wizard.rs, clients/agent-runtime/src/peripherals/mod.rs, .github/workflows/sonarqube-analysis.yml, lychee.toml
Added dependency notify = "6.1", CSS/layout formatting tweaks, onboarding defaults conductor config, clippy allow on peripheral stub, SonarQube step resilience and Lychee exclusion rule.

Sequence Diagram(s)

sequenceDiagram
    participant Client as Channel/CLI/Gateway
    participant Router as Message Router
    participant Runtime as ConductorRuntime
    participant Planner as Planner
    participant Scheduler as ConductorServiceCore
    participant Pool as PerformerPool
    participant Store as TaskStore
    participant EventBus as Event Bus

    Client->>Router: /task "do X"
    Router->>Runtime: submit_task(TaskRequest)
    Runtime->>Planner: classify(request)
    Planner-->>Runtime: (domain, confidence)
    Runtime->>Store: insert_task(TaskRecord)
    Store-->>Runtime: ok
    Runtime->>EventBus: publish(TaskAccepted)
    par Supervised Worker (loop)
        Runtime->>Planner: plan(request)
        Planner-->>Runtime: TaskPlan
        Runtime->>Scheduler: enqueue steps
        Scheduler-->>Runtime: ReadySteps
        Runtime->>Pool: execute_step(step)
        Pool->>Pool: ApprovalGate.decide()
        Pool->>Pool: ScopedSandbox.run_wrapped(command)
        Pool-->>Runtime: StepStatus
        Runtime->>Store: transition_step(status)
        Store-->>Runtime: ok
        Runtime->>EventBus: publish(StepStateChanged)
    end
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

  • dallay/corvus#126: Overlapping changes to clients/agent-runtime/src/channels/mod.rs affecting explicit task routing and process_channel_message modifications.
  • dallay/corvus#119: Related gateway changes touching clients/agent-runtime/src/gateway/mod.rs (websocket/webhook routing) that intersect with conductor integration.
  • dallay/corvus#125: Changes to channel pre-processing and ingress handling that may conflict with or duplicate routing logic.

Suggested labels

area:rust, risk:high, risk:security, area:ci

Suggested reviewers

  • yuniel-acosta
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch conductor

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Mar 7, 2026

✅ Contributor Report

User: @yacosta738
Status: Passed (12/13 metrics passed)

Metric Description Value Threshold Status
PR Merge Rate PRs merged vs closed 88% >= 30%
Repo Quality Repos with ≥100 stars 0 >= 0
Positive Reactions Positive reactions received 9 >= 1
Negative Reactions Negative reactions received 0 <= 5
Account Age GitHub account age 3053 days >= 30 days
Activity Consistency Regular activity over time 108% >= 0%
Issue Engagement Issues with community engagement 0 >= 0
Code Reviews Code reviews given to others 391 >= 0
Merger Diversity Unique maintainers who merged PRs 2 >= 0
Repo History Merge Rate Merge rate in this repo 90% >= 0%
Repo History Min PRs Previous PRs in this repo 122 >= 0
Profile Completeness Profile richness (bio, followers) 90 >= 0
Suspicious Patterns Spam-like activity detection 1 N/A

Contributor Report evaluates based on public GitHub activity. Analysis period: 2025-03-07 to 2026-03-07

@snyk-io
Copy link
Copy Markdown
Contributor

snyk-io Bot commented Mar 7, 2026

Snyk checks have passed. No issues have been found so far.

Status Scanner Critical High Medium Low Total (0)
🔚 Open Source Security 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

@sentry
Copy link
Copy Markdown

sentry Bot commented Mar 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 54

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
clients/agent-runtime/src/observability/otel.rs (1)

346-356: ⚠️ Potential issue | 🟠 Major

Don’t record conductor metrics as generic request/message metrics.

PlannerLatency is now recorded as corvus.request.latency, and ConductorQueueDepth as corvus.queue.depth, with no attribute to distinguish them from the existing series. That mixes unrelated signals and makes dashboards/alerts unable to separate planner behavior from request latency or conductor depth from message-queue depth.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/observability/otel.rs` around lines 346 - 356, The
current match arm records PlannerLatency and ConductorQueueDepth into the
generic instruments self.request_latency and self.queue_depth which mixes
signals; update the code so PlannerLatency is recorded to a distinct
metric/instrument (e.g., self.planner_latency.record(...)) or include a
distinguishing attribute (e.g., attributes:
[KeyValue::new("component","planner")]) when calling
self.request_latency.record, and likewise record ConductorQueueDepth to a
separate instrument (e.g., self.conductor_queue_depth.record(...)) or add an
attribute like ("source","conductor") when calling self.queue_depth.record;
modify the ObserverMetric handling for PlannerLatency and ConductorQueueDepth
accordingly and ensure the new instruments/attributes are defined where metrics
are initialized.
clients/agent-runtime/src/observability/prometheus.rs (1)

208-225: ⚠️ Potential issue | 🟠 Major

Keep conductor metrics distinct in Prometheus.

PlannerLatency and ConductorQueueDepth are new signals, but this maps them onto the same series as RequestLatency and QueueDepth. After that, dashboards cannot tell agent latency/backlog from conductor latency/backlog, and the last queue update wins.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/observability/prometheus.rs` around lines 208 -
225, The record_metric implementation currently merges PlannerLatency into
request_latency and ConductorQueueDepth into queue_depth; update record_metric
so ObserverMetric::PlannerLatency is observed on self.planner_latency (instead
of self.request_latency) and ObserverMetric::ConductorQueueDepth sets
self.conductor_queue_depth (instead of self.queue_depth), keeping
ObserverMetric::RequestLatency and ObserverMetric::QueueDepth mapped to
self.request_latency and self.queue_depth respectively; modify the match arms in
record_metric to reference the planner_latency and conductor_queue_depth fields
so conductor and planner metrics are recorded as distinct Prometheus series.
clients/agent-runtime/src/cron/types.rs (1)

4-31: ⚠️ Potential issue | 🟠 Major

Fix serde rename attribute to prevent silent round-trip failures for ConductorTask.

With #[serde(rename_all = "lowercase")], the new ConductorTask variant serializes as "conductortask", but as_str() and parse() use "conductor_task". If a CronJob is serialized to JSON and deserialized back, ConductorTask will silently become Shell (the default fallback in parse()).

Fix
-#[serde(rename_all = "lowercase")]
+#[serde(rename_all = "snake_case")]
 pub enum JobType {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/cron/types.rs` around lines 4 - 31, The serde
rename attribute is wrong for JobType: #[serde(rename_all = "lowercase")] makes
ConductorTask serialize as "conductortask" but as_str()/parse() expect
"conductor_task", causing silent round-trip failures; fix by making
serialization match parsing (either change the enum attribute to
#[serde(rename_all = "snake_case")] or add an explicit #[serde(rename =
"conductor_task")] on the ConductorTask variant so JobType::ConductorTask
serializes/deserializes as "conductor_task" and stays consistent with
JobType::as_str() and JobType::parse().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@clients/agent-runtime/Cargo.toml`:
- Line 126: The dependency pin for the notify crate (currently "notify =
\"6.1\"") should be evaluated and updated to a stable >=7.x if compatible:
change the Cargo.toml entry for notify to a newer semver (e.g., "7" or "7.x"),
run cargo update and cargo tree to see transitive changes, then run cargo build
/ cargo test and specifically exercise PromptHotReload-related code paths to
catch behavioral or API differences; if compilation fails, adjust usages in the
PromptHotReload implementation to match notify 7.x API changes (callbacks,
watcher creation, and event types) or pin back to 6.x with a clear comment
explaining the stability reason.

In `@clients/agent-runtime/src/channels/mod.rs`:
- Around line 471-490: The code currently looks up ctx.channels_by_name before
calling crate::conductor::submit_task, causing /task messages to be dropped if
the channel lookup fails; change the flow in the route_explicit_task_message
handler to call crate::conductor::submit_task(*request) first, capture the
resulting ack string (handle Ok(Submitted), Ok(RuntimeInactive), and Err(error)
the same way), then perform a best-effort lookup of
ctx.channels_by_name.get(&msg.channel) and, if present, send the
SendMessage::new(ack, &msg.reply_target) via target_channel.send(...).await;
ensure submit_task receives the owned request (clone or move as needed) so the
task is always queued even when no reply channel exists.
- Around line 2365-2368: The test activates a process-global conductor runtime
via Config::default() and crate::conductor::activate_runtime(&cfg) which mutates
ACTIVE_RUNTIME but never tears it down; add a cleanup step by implementing a
deactivation API (e.g., crate::conductor::deactivate_runtime or returning a RAII
guard from activate_runtime such as RuntimeGuard) that clears ACTIVE_RUNTIME,
and invoke that in the test teardown (or replace the direct call to
activate_runtime with a scoped guard) so each test isolates runtime state and
prevents order-dependent failures.

In `@clients/agent-runtime/src/conductor/events.rs`:
- Around line 10-13: The publish function drops serialization errors silently;
update pub fn publish(event: &ConductorEventEnvelope) to return a Result<(), E>
(e.g., Result<(), serde_json::Error> or a crate error type) and propagate or
handle failures: call serde_json::to_string(event) and on Err return Err with
the serde error (and/or emit a redacted warning via the logger including minimal
event context), and also check the RESULT/return value of
EVENT_BUS.send(serialized) and return an Err or log if that send fails; ensure
references to ConductorEventEnvelope, publish, and EVENT_BUS are used so the
caller can observe and handle failures instead of swallowing them.

In `@clients/agent-runtime/src/conductor/mod.rs`:
- Around line 189-193: The function absolute_workspace_root currently swallows a
current_dir() error and falls back to ".", which can produce a relative path and
weaken the sandbox; change absolute_workspace_root to return a Result<PathBuf,
std::io::Error> (or your crate's error type) instead of PathBuf, remove
unwrap_or_else, call std::env::current_dir()? and on error propagate the error
(e.g., using ?), then join workspace_dir and return Ok(path); update all callers
of absolute_workspace_root to handle the Result and surface the error so runtime
activation fails rather than silently using ".".

In `@clients/agent-runtime/src/conductor/performers/browser.rs`:
- Around line 14-19: The browser performer's execute currently returns
Ok(StepStatus::Completed) which falsely signals success; change execute in the
browser performer (the async fn execute taking &PlannedStepForExecution and
&PerformerContext) to fail explicitly instead of returning StepStatus::Completed
— for example return a Err(...) with a clear error (e.g.,
anyhow::anyhow!("browser performer not implemented")) or Ok(StepStatus::Failed)
so downstream steps do not observe phantom success.

In `@clients/agent-runtime/src/conductor/performers/coding.rs`:
- Around line 14-20: The coding performer's execute function currently ignores
the PlannedStepForExecution and PerformerContext and always returns
Ok(StepStatus::Completed); change it to "fail closed" instead: in the execute
method for the coding performer, return a failure (either Err with a clear error
message or Ok(StepStatus::Failed)) indicating the performer is not implemented,
so coding steps are not silently marked completed until real execution behavior
is added; reference execute, PlannedStepForExecution, PerformerContext and
StepStatus when making this change.

In `@clients/agent-runtime/src/conductor/performers/mod.rs`:
- Around line 252-266: The normalize_path function can yield a relative or empty
PathBuf when there are more ParentDir components than available segments (e.g.,
"/a/../../b"); update normalize_path to either return a Result<PathBuf, Error>
or ensure it anchors to root so excessive ".." do not silently produce a
relative path: detect when Component::ParentDir would pop an empty normalized
buffer and either return an error (propagate to callers like
assert_command_scope) or treat the path as rooted (e.g., keep a leading root
component) so the returned PathBuf is always absolute and unambiguous; update
callers (e.g., assert_command_scope) to handle the new Result or rely on the
guaranteed absolute return.
- Around line 109-119: PerformerContext::new currently drops the mpsc receiver
causing progress_tx sends to fail; modify the PerformerContext struct to include
a progress_rx field (e.g., progress_rx: mpsc::Receiver<...>) and change new() to
create the channel with let (progress_tx, progress_rx) = mpsc::channel(64) and
store progress_rx instead of discarding it, or alternatively change new() to
accept an existing receiver parameter; update any constructor call sites
accordingly and ensure types match the existing progress event type referenced
by progress_tx.
- Around line 57-89: The current assert_command_scope implementation (involving
assert_command_scope, strip_wrapping_quotes, normalize_path, allowed_root) is
vulnerable because naive split_whitespace misses quoted args, shell
metacharacters and env expansion; replace the heuristic tokenizer with a proper
shell-aware parser (e.g., use a shell/tokenizer crate such as shell-words or a
vetted tokenizer) to produce real argument tokens before validation, or
alternatively explicitly reject any command containing shell metacharacters or
expansion characters (e.g., ; | & && || $ ` > < \n ( ) { } *) before
tokenization; once you have proper tokens, still trim quotes, resolve env
expansions only if intentionally allowed, normalize paths with normalize_path,
resolve relative paths against allowed_root, and bail if any candidate does not
start_with allowed_root or if any token contains forbidden constructs.

In `@clients/agent-runtime/src/conductor/performers/research.rs`:
- Around line 14-19: The current no-op implementation of execute in the Research
performer returns Ok(StepStatus::Completed) which incorrectly advances the task
graph; change the execute method on the performer to return an explicit failure
instead (e.g., return Err(anyhow!("research performer unimplemented") or an
equivalent error) rather than Ok(StepStatus::Completed)) so downstream steps do
not see phantom success; update the execute signature body that takes
PlannedStepForExecution and PerformerContext to produce an error Result instead
of completing.

In `@clients/agent-runtime/src/conductor/planner.rs`:
- Around line 227-247: detect_cycle builds an adjacency map of owned Strings and
then has_cycle stores/clones those Strings into visiting/visited which causes
allocations; change the representation to avoid cloning by either (a) building
adjacency as HashMap<&str, Vec<&str>> using references to step.id and
step.depends_on (ensuring the lifetimes tie to the TaskPlan passed in) so
has_cycle works with &str keys, or (b) map each step.id to a numeric index
(e.g., HashMap<String, usize>) and use Vec<Vec<usize>> plus HashSet<usize> for
visiting/visited; update detect_cycle, the adjacency construction, and the
has_cycle signature/usage accordingly to eliminate per-traversal String
allocations.
- Around line 124-131: The code silently falls back from WatchedPromptSource to
FilePromptSource when WatchedPromptSource::new fails; change the prompt_source
creation so that you capture the Err from WatchedPromptSource::new, log a
warning including the error details before falling back to FilePromptSource::new
(use process logger or the crate logger), and if the configuration explicitly
requests hot-reload (e.g., a field like config.hot_reload or config.watch),
instead propagate or return the error rather than silently falling back; update
the match handling around WatchedPromptSource::new and the logic that sets
prompt_source to reference these symbols (WatchedPromptSource::new,
FilePromptSource::new, prompt_source, config.prompt_path) so the failure is
either logged or becomes a hard error when hot-reload is required.
- Around line 99-108: The FilePromptSource::load_prompt implementation performs
blocking I/O (fs::read_to_string) which will block the Tokio runtime; change the
PromptSource.load_prompt API to be async (e.g., async fn load_prompt(&self) ->
Result<String>) and update FilePromptSource::load_prompt to use
tokio::fs::read_to_string(...).await (trimming the result as before), and then
update callers such as Planner::plan to await load_prompt; alternatively, if you
cannot change the trait signature, wrap the blocking read in
tokio::task::spawn_blocking and await that inside an async load_prompt
implementation — reference PromptSource, FilePromptSource::load_prompt, and
Planner::plan when making the edits.

In `@clients/agent-runtime/src/conductor/prompt_watcher.rs`:
- Around line 60-65: In latest_prompt(), replace the manual mapping clone
pattern with the idiomatic .cloned(): when locking self.current_prompt, use
.cloned() on the resulting guard instead of .map(|value| value.clone()) so the
function returns the owned String the same way but with clearer, more concise
code; locate the change in the latest_prompt method that accesses
self.current_prompt.

In `@clients/agent-runtime/src/conductor/service.rs`:
- Around line 101-105: The enqueue method currently drops ReadyStep for unknown
TaskDomain (e.g., TaskDomain::Composite); change pub fn enqueue(&mut self, step:
ReadyStep) to return Result<(), EnqueueError> (or Result<(), String>) and,
inside, if let Some(queue) = self.queues.get_mut(&step.domain) {
queue.push_back(step); Ok(()) } else { return Err(format!("no queue for domain:
{:?}", step.domain)); } — include a clear error type/message referencing
ReadyStep and TaskDomain and update callers to handle the Result (or
alternatively add a process_logger.warn call inside the else branch if you
prefer logging rather than returning an error).
- Around line 320-327: The current single-pass loop over plan.steps skips steps
whose dependencies aren't yet met (using
self.store.dependencies_completed(&task_id, &planned.depends_on)?) and never
retries them, so dependent steps never run; change the logic to repeatedly
iterate over plan.steps until no more progress is made: keep completed_steps (or
a new progress counter/flag), loop while completed_steps < plan.steps.len(), in
each pass attempt to execute any step whose dependencies are now satisfied
(using the same self.store.dependencies_completed check), mark steps as
completed when done and increment completed_steps, and break with an error if a
full pass makes zero progress to avoid an infinite loop. Ensure you reference
and update the existing completed_steps variable and the plan.steps collection
so previously skipped steps get retried.
- Around line 177-192: The pop_eligible function currently drains and rebuilds
the entire queue even when the front element is eligible; change it to
short-circuit by first checking the front element: get_mut the queue for the
domain, if queue.front() exists and is_eligible(&front.status, now_epoch_ms)
then simply return queue.pop_front() immediately; otherwise fall back to the
existing loop that drains the queue into a pending VecDeque to find the first
eligible item, reassign *queue = pending, and return the selected ReadyStep.
Reference symbols: pop_eligible, TaskDomain, ReadyStep, is_eligible, and
self.queues.
- Around line 337-344: PlannedStepForExecution is always constructed with risk:
RiskLevel::Low which bypasses the PerformerPool::execute_step approval gate; fix
by populating the risk from the planned step or an assessment function: add a
risk field to PlannedStep (or call a helper like assess_risk(domain, command))
and set step.risk = planned.risk (or the returned RiskLevel) when constructing
PlannedStepForExecution in service.rs, ensuring RiskLevel is used consistently
so PerformerPool::execute_step can trigger approvals for non-Low risks.

In `@clients/agent-runtime/src/conductor/task_store.rs`:
- Around line 181-184: insert_task currently overwrites existing tasks and
dependencies_completed treats missing deps as "not completed", which can erase
state or strand tasks; change insert_task (TaskStore::insert_task) to first
check if tasks contains the given task.id and return an Err if duplicate, and
update dependencies_completed to treat any dependency id not present in the
store as an invalid dependency (returning Err or false with an explicit error)
instead of silently treating it as incomplete; ensure both functions return a
clear Result error variant/message (e.g., DuplicateTaskId or UnknownDependency)
so callers can surface invalid task data rather than silently replacing or
stalling tasks.
- Around line 246-257: The reconcile_restart implementation mutates step.status
directly (setting Running/Scheduled -> Queued) without emitting the persistent
transition log; instead call the existing transition helper (transition_step)
for each affected step so the transition is recorded before state change, using
the same transition type you use for re-queueing; do the same fix for the other
similar block (the dependency-cancel path around the 286-299 area) so steps
moved to dependency-cancelled are persisted via transition_step rather than only
in-memory mutation; locate reconcile_restart, transition_step, and uses of
StepStatus::Queued / StepStatus::DependencyCancelled to apply the change.
- Around line 32-37: TaskRecordBuilder::build currently panics by unwrapping
missing id/description and StepId parsing and silently overwrites duplicate step
ids; change build to return Result<TaskRecord, TaskRecordBuildError> (or
similar) instead of panicking, validate that id: Option<TaskId> and description:
Option<String> are Some and return Err when missing, parse each step id from the
steps Vec<(String, TaskDomain, Vec<String>)> using fallible parsing (map_err and
return Err on parse failure) rather than expect, and when constructing the steps
HashMap check for existing keys (contains_key) and return an Err on duplicates
instead of overwriting; update TaskRecordBuilder::default() usage or remove
#[derive(Default)] if it encourages invalid empty builders.
- Around line 103-106: Replace panicking .lock().expect("...poisoned") calls
with fallible locking and propagate errors: for each occurrence (e.g., the block
using self.transitions.lock().expect("transition log mutex
poisoned").push(record.clone())), change .lock().expect(...) to
.lock().map_err(|_| anyhow::anyhow!("<mutex name> lock poisoned"))? (or
equivalent Context) and then perform the push; do this for all mutexes mentioned
so the surrounding functions return Err on poisoning rather than panic, using
the ? operator to propagate the error.

In `@clients/agent-runtime/src/conductor/types.rs`:
- Around line 4-32: Change TaskId and StepId to require validation on
deserialization by removing automatic Deserialize derive and adding
#[serde(try_from = "String")] to each struct; implement TryFrom<String> for
TaskId and StepId that calls their existing new(...) constructors (returning
anyhow::Error on failure) so serde uses TryFrom during deserialization and
validation is enforced; keep Serialize derived as before and ensure as_str() and
other methods remain unchanged.

In `@clients/agent-runtime/src/conductor/workspace.rs`:
- Around line 23-29: The current lexical starts_with check can be bypassed by
symlinks; fix by resolving real filesystem paths with canonicalization and
re-checking after any creation: canonicalize and store the canonical
workspace_root, and in create_workspace() compute workspace =
self.workspace_root.join(leaf), create the directory (std::fs::create_dir_all)
only to ensure it exists, then canonicalize the created workspace path and
verify the canonicalized workspace starts_with the canonicalized workspace_root
(if it does not, remove what you just created and return an error); do the same
canonicalize-and-compare approach inside is_within_workspace() and any other
checks (use the function names create_workspace(), is_within_root(),
is_within_workspace(), and the workspace_root/workspace variables to locate the
changes).
- Around line 18-29: The workspace path currently uses only the sanitized hint
which causes collisions; change the logic in the function that builds the
workspace path so the final directory always includes the task ID (e.g., join
the sanitized hint and task_id into a subpath) rather than replacing it—use
sanitize_workspace_leaf(hint) if present and then append task_id.as_str() (or
default to task_id when hint is None), then run
is_within_root(&self.workspace_root, &workspace) and
std::fs::create_dir_all(&workspace) as before to validate and create the
directory.

In `@clients/agent-runtime/src/cron/scheduler.rs`:
- Around line 120-136: run_conductor_task_job currently skips scheduler policy
checks; update it to enforce the same gates as Shell jobs by calling the
scheduler/policy helpers (e.g., can_act and is_rate_limited) before
building/submitting work and only calling record_action after a successful
submit. Specifically, in run_conductor_task_job call the same can_act(...) and
is_rate_limited(...) checks used by the Shell job handler (mirror the logic in
the Shell path), return the appropriate false/string message when those checks
fail, then proceed to build_conductor_task_request and call
crate::conductor::submit_task; if submit returns Submitted then call
record_action(...) and return success, otherwise do not record action and return
the existing error/runtime-inactive messages. Ensure you reference
run_conductor_task_job, build_conductor_task_request,
crate::conductor::submit_task, and the scheduler policy methods
can_act/is_rate_limited/record_action so the behavior matches Shell jobs.

In `@clients/agent-runtime/src/gateway/mod.rs`:
- Around line 1718-1751: The conductor branch currently returns early on
RuntimeInactive or Err without releasing the previously-recorded idempotency
key; update the non-submitted arms of the match on crate::conductor::submit_task
(the RuntimeSubmitOutcome::RuntimeInactive and Err(error) branches) to call
IdempotencyStore::remove(...) on the stored key (via whatever field holds the
store in state, e.g., state.idempotency_store) before returning the 503/500
responses so the key is unreserved and retries are not treated as duplicates;
ensure you use the same key value that was recorded and handle any removal
errors/logging as appropriate.
- Line 1362: The conduit for CONDUCTOR_EVENTS_WS_PATH is upgrading sockets
without any auth/origin checks; before calling on_upgrade in the
handle_conductor_events_ws flow, enforce the same admin bearer auth and origin
guard used elsewhere in the gateway (reuse the existing admin/auth/origin
middleware or guard functions) so only authorized origins and admin principals
can reach the upgrade. Locate handle_conductor_events_ws and the route bound to
CONDUCTOR_EVENTS_WS_PATH, perform the admin token validation and origin check
first, and only call on_upgrade if those checks pass (return an appropriate
401/403/400 response otherwise).

In `@clients/agent-runtime/src/main.rs`:
- Around line 891-905: When a CLI message uses the /task command while the
conductor feature is disabled, return an explicit error instead of letting it
fall through to normal agent handling; detect this by checking message_text for
the task route (or the route_cli_message result) and config.conductor.enabled,
and if the command is a task and config.conductor.enabled is false, return
Err(anyhow!(...)) with a clear message like "Conductor is disabled; start
`corvus daemon` to use /task". Update the block that calls
crate::conductor::sources::route_cli_message (and the surrounding if let
Some(message_text) handler) so that CliRouteOutcome::Task is guarded by
config.conductor.enabled and produces the explicit error when disabled before
any fallback.

In `@clients/agent-runtime/src/observability/log.rs`:
- Around line 155-167: ObserverEvent arms ConductorStepLifecycle (task_id,
step_id, status) and ConductorApprovalTransition (task_id, step_id, decision)
currently log status and decision verbatim; update the backend logging in log.rs
to redact those sensitive fields before emitting logs (e.g., map or replace
status/decision with a redacted placeholder or call an existing redact helper)
so the info! calls for "conductor.step" and "conductor.approval" never include
raw status/decision values even if callers send them directly; keep task_id and
step_id as-is.

In `@clients/agent-runtime/src/observability/otel.rs`:
- Around line 194-196: The match arm in otel.rs that currently routes
ObserverEvent::ConductorStepLifecycle and
ObserverEvent::ConductorApprovalTransition into the no-op branch must be
replaced with minimal telemetry emission: for
ObserverEvent::ConductorStepLifecycle and
ObserverEvent::ConductorApprovalTransition, create a short span or event using
the existing tracer/helpers in this module (use the same pattern as other
ObserverEvent arms), attach key attributes (e.g., step_id, approval_id,
state/status, timestamp) and record the event/attribute values, then end the
span; in other words, remove these variants from the empty branch and emit a
concise span/event for each using the module’s tracer APIs so the OTLP pipeline
captures the new step/approval flows.

In `@clients/agent-runtime/src/tools/cron_add.rs`:
- Around line 203-231: The ConductorTask branch currently calls
cron::add_agent_job and then only mutates the returned CronJob.job_type (and
drops any caller-supplied delivery by passing None), so persist the correct job
type and delivery into the scheduler instead of just rewriting the local struct:
call or add a dedicated constructor/creator (e.g., a cron::add_conductor_job or
cron::create_job_with_type) or perform an explicit update/save after
cron::add_agent_job that sets job_type = JobType::ConductorTask and preserves
the delivery parameter from args, and return the persisted CronJob; ensure
validation of command/prompt follows Tool trait schema and use
Self::error_result for failures without panics.

In `@clients/agent-runtime/tests/conductor_gateway_cron_integration.rs`:
- Around line 54-55: Add assertions for the new parse contract: update the tests
around JobType::parse to include checks that JobType::parse("conductor_task")
returns JobType::ConductorTask (and also JobType::parse("conductor") returns
JobType::ConductorTask if that alias is supported) in addition to the existing
"shell" and "agent" assertions; specifically modify the assertions near the
existing JobType::parse("shell")/("agent") lines and the ones around lines 86-87
to include these new checks so regressions to the conductor_task parsing will be
caught by the integration suite.

In `@clients/agent-runtime/tests/conductor_observability.rs`:
- Around line 64-80: The test currently only checks event count and statuses but
not metric payloads; after acquiring concrete.metrics (the Mutex for emitted
metrics) add assertions that the three emitted metrics are the expected queue
depths by asserting matches! on concrete.metrics entries for
ObserverMetric::ConductorQueueDepth(3), ObserverMetric::ConductorQueueDepth(2)
and ObserverMetric::ConductorQueueDepth(1). Locate the metrics lock usage
(concrete.metrics.lock()) near the existing event assertions and add these three
assertions to validate that record_conductor_lifecycle produced the correct
ConductorQueueDepth variants.

In `@clients/agent-runtime/tests/conductor_performance_guards.rs`:
- Around line 40-43: The timing assertion using start.elapsed() <
Duration::from_millis(10) is too tight for CI; change the test to either (a)
assert the planner took the fast-path directly (inspect the planner.plan result
or its metadata from planner.plan(&request) to check the fast-path flag/enum) or
(b) relax the timing check to a more forgiving threshold (e.g., increase to a
larger Duration with ample headroom) and keep the runtime/block_on usage
(tokio::runtime::Runtime::new, runtime.block_on(planner.plan(&request)),
start/Instant) unchanged so the test is robust across CI variance.

In `@clients/agent-runtime/tests/conductor_planner.rs`:
- Around line 108-113: Remove the flaky wall-clock assertion by deleting the
Instant timing code and the assert that checks elapsed <
Duration::from_millis(10); keep the functional assertions that plan.steps.len()
== 1 and calls.load(Ordering::SeqCst) == 0 (which prove the slow model was
skipped). Also remove the now-unused Instant import. Update the test block
around planner.plan(&request(...)) accordingly (references: planner.plan,
request, calls).

In `@clients/agent-runtime/tests/conductor_workspace_isolation.rs`:
- Around line 21-26: The test
workspace_paths_are_sanitized_and_prevent_traversal only asserts that the
sanitized result doesn’t contain ".." or '/' — add a Windows-specific traversal
input and assertion: call sanitize_workspace_leaf with a Windows-style path like
"..\\prod-secrets" (or r"..\prod-secrets") and assert the returned sanitized
string does not contain ".." and does not contain a backslash ('\\'); keep the
existing Unix-style check for '/' and the existing call to
sanitize_workspace_leaf so both separators and traversal forms are covered.

In
`@openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/apply-progress.md`:
- Around line 135-138: Update the "Remaining items" section under the "Remaining
items" heading to replace the optimistic "None in `tasks.md`" with a concise
list of follow-ups: enumerate known limitations/constraints, any technical debt
introduced during implementation, monitoring/alerting gaps, and documentation
gaps (user docs, runbooks, operational guides); reference `tasks.md` where
specific follow-up tickets should be tracked and add a short line indicating
ownership/priority for each item so they can be triaged later.
- Around line 54-83: Update the TDD evidence entries under "GREEN focused
matrix", "GREEN targeted remediation evidence (second pass)", and "Broader
validation" to include verifiable metadata: for each `cargo test ...` and `make
test` / `make build` result add the CI pipeline URL, ISO 8601 timestamp of the
run, and the short commit hash used for the test; place these details inline
after each Result line (e.g., after "PASS" or "all focused conductor suites
pass") and ensure every command listed (including `cargo test
conductor_ws_e2e_event_delivery_meets_latency_and_payload_contract` and `cargo
test conductor_runtime_ingress_`) has the same three pieces of metadata so
readers can reproduce and verify the claims.
- Around line 7-26: Update the progress document to reference the existing
rollback procedure in design.md under "## Rollback Strategy" (add a
cross-reference sentence and link to that section) and append a short
threat/risk assessment for the new security controls: describe attacker goals
and assumptions, failure modes, and mitigation for ScopedSandboxExecutor
(least-privilege enforcement), conductor/workspace.rs (path-sanitization and
root-boundary escape risk), and performer isolation (tokio::spawn + bounded
timeout and panic isolation); include a one-paragraph summary of operational
steps to revert changes (disable conductor via config, preserve task records,
revert worker registration) and note residual risks and monitoring/alerts to
detect sandbox/workspace escapes.

In
`@openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/design.md`:
- Around line 237-256: Summary: The archive table overstates delivered work by
listing files and integrations not actually included in this PR; update the
record to reflect only shipped artifacts or clearly mark planned items. Fix:
open
openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/design.md
and edit the table rows referencing
clients/agent-runtime/src/conductor/service.rs, task_store.rs, performers/*.rs,
sources/*.rs, workspace.rs (if missing), and the gateway/cron/CLI wiring rows
(clients/agent-runtime/src/gateway/mod.rs, src/cron/*, src/main.rs,
src/daemon/mod.rs changes that weren’t shipped) — either remove those rows or
change the Action/Description to "Planned" or "Proposed" (not
"Create"/"Modify"), and leave only the files actually added/modified in this PR
(e.g., mod.rs, types.rs, planner.rs, config/schema.rs if present) so the archive
accurately matches the diff.

In
`@openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/exploration.md`:
- Around line 1-47: Change the top-level heading from H2 to H1 (make the first
line start with a single "#") and ensure there is a blank line before and after
each major section heading (e.g., "Current State", "Affected Areas",
"Approaches", "Recommendation", "Risks", "Ready for Proposal") so markdownlint
rules MD041/MD022 are satisfied; edit the headings in this document (the initial
heading and the listed section titles) to add an empty line above and below each
heading line.

In
`@openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/proposal.md`:
- Around line 137-138: The proposal references the wrong exploration file path;
in the proposal.md entry that currently links to
openspec/changes/2026-03-07-conductor-architecture/exploration.md, update the
link to point to the exploration.md that lives under the archive directory for
this PR
(openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/exploration.md)
so the dependency trail is correct.

In
`@openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/specs/conductor/spec.md`:
- Around line 7-9: Update the archived spec sentence to only claim the explicit
entry point actually added by this PR: change the runtime requirement to accept
explicit task submissions via the `/task` channel command and keep non-task
conversational traffic on existing AgentLoop paths; remove references to CLI
task commands, gateway task APIs, and cron `ConductorTask` from this archived
document, or alternatively move the broader routing contract into a separate
(non-archived) proposal document so the archived spec only reflects the
implemented `/task` channel ingress.
- Around line 83-141: The archived spec currently overstates guarantees; edit
the three top-level requirement sections ("Requirement: Fair Scheduling, Bounded
Concurrency, and Backpressure", "Requirement: Latency Protections via Planner
Fast Path and Bounded Slow Path", and "Requirement: Durable State Transitions
and Deterministic Crash Recovery") and their scenarios so they only assert
behaviors actually implemented by this PR (e.g., keep any concrete fast-path
planner behavior or concurrency cap changes that this PR introduces), and remove
or demote broader MUST-level guarantees (replace with SHOULD/NOT or move into a
"Deferred / Future Work" subsection) for items like global/per-domain caps,
deterministic SQLite WAL atomicity, and exhaustive backpressure semantics; also
add a short note that remaining claims are deferred to a follow-up spec or
implementation task and reference the specific scenario headings ("Scenario:
Concurrency caps are enforced...", "Scenario: Fast path avoids planner network
call", "Scenario: Crash recovery re-queues interrupted running steps") so
reviewers can verify which behaviors remain.
- Around line 153-154: Update the compound adjective to use a hyphen: change the
phrase "existing endpoint request/response behavior MUST remain backward
compatible" so that "backward-compatible" is hyphenated; locate the sentence
containing "existing endpoint request/response behavior" (and the adjacent line
"AND new Conductor endpoints MUST be additive rather than breaking existing
routes.") and replace "backward compatible" with "backward-compatible" to follow
proper compound-adjective style.
- Around line 194-223: The spec section "Performer Execution with Timeout,
Retry, and Panic Safety" asserts MUST-level guarantees that the current Coding
and Browser performer implementations (placeholders returning completed status)
don't provide; either implement the behaviors or change the spec to reflect they
are future requirements: update the header and scenarios to indicate these are
"Future/Planned Requirements" or weaker wording (e.g., SHOULD/SHALL be targeted)
and add a note referencing the current stub state for the Coding and Browser
performers, and remove actionable language that implies runtime uses JoinHandle
+ catch_unwind/AssertUnwindSafe; ensure symbols mentioned (Performer Execution
with Timeout, Retry, and Panic Safety, Scenario: Step timeout cancels execution,
Scenario: Step retry with exponential backoff, Scenario: Performer panic is
isolated and recorded, JoinHandle, AssertUnwindSafe, catch_unwind, Coding
performer, Browser performer) are updated so the docs align with the actual
code/PR summary.
- Around line 224-270: The archived spec section currently asserts guarantees
beyond this PR (chat progress delivery, WebSocket `/api/conductor/events` SLA,
ObserverEvent/telemetry integrations, and hot-reload semantics for
CONDUCTOR.md/Planner/tick_interval_ms precedence); restrict the text to only
what the PR implements or split these into separate follow-up requirements:
remove or demote scenarios mentioning chat progress delivery, the 1s WebSocket
delivery guarantee, explicit ObserverEvent/prometheus/otel/logging MUSTs, and
the hot-reload timing/precedence assertions for CONDUCTOR.md and Planner, or
mark them as "future work"/"TBD" with links to issues; keep references to
ConductorEvent, ObserverEvent, `/api/conductor/events`, CONDUCTOR.md, Planner,
and tick_interval_ms in the comments so reviewers can re-add precise guarantees
in follow-ups.

In
`@openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/tasks.md`:
- Around line 39-67: The checklist incorrectly marks later conductor phases as
completed starting at the "6.3 Implement WebSocket event stream" item; revert
the completed markers for items 6.3–8.3 to unchecked so the docs reflect the
narrower PR scope—specifically change the [x] to [ ] for the items referencing
"6.3 Implement WebSocket event stream", "6.4 Implement CONDUCTOR.md hot-reload",
"6.5 Add performance validation tests/bench hooks", "6.6 Run phased verification
hooks", "6.7 Final refactor and hardening pass", and all Phase 7 and Phase 8
items (7.1–7.5, 8.1–8.3) so the file's checklist (tasks.md) accurately
represents work remaining.

In
`@openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/verify-report.md`:
- Line 29: Update the sentence in verify-report.md that currently reads
"residual coverage gaps are downgraded to warnings due partial (not absent)
evidence" to the corrected wording "residual coverage gaps are downgraded to
warnings due to partial (not absent) evidence" to fix the typo and improve
clarity; locate the sentence in the archive note (the line containing "New
suites added for runtime path, workspace isolation, and config precedence;
residual coverage gaps...") and replace the phrase "due partial" with "due to
partial" exactly as suggested.
- Around line 15-20: Update the "Runtime execution evidence" block to include
explicit entries for the Rust format and lint gates: add lines for `cargo fmt
--all -- --check` and `cargo clippy --all-targets -- -D warnings` showing
PASS/FAIL or mark them as SKIPPED with a short reason; modify the same section
that currently lists the test/build/coverage commands so the new entries appear
alongside the existing `cargo test`/`make` lines and clearly state results (or
skipped + justification) for the formatting (`cargo fmt`) and lint (`cargo
clippy`) checks.
- Line 20: The coverage gate is incorrectly using the aggregate Kotlin report
(make test-coverage -> koverHtmlReport) which pulls in the unrelated composeApp
module (7.1%); update the CI/reporting so the conductor change only checks the
Rust crate coverage for clients/agent-runtime/src/conductor or explicitly
document that composeApp is external. Specifically, remove or exclude composeApp
from the make test-coverage aggregation for this PR, or add a separate coverage
step that targets the Rust crate (cargo test + coverage) and update
verify-report.md to note that composeApp is out-of-scope when leaving the
aggregate check in place; references: make test-coverage, koverHtmlReport,
composeApp, and clients/agent-runtime/src/conductor.

In `@openspec/specs/conductor/spec.md`:
- Around line 149-154: Update the requirement phrase "MUST remain backward
compatible" to hyphenated form "MUST remain backward-compatible" in the Scenario
block titled "Existing gateway and admin contracts remain stable" (the line that
currently reads "THEN existing endpoint request/response behavior MUST remain
backward compatible") so the wording matches the spec's hyphenation style.

---

Outside diff comments:
In `@clients/agent-runtime/src/cron/types.rs`:
- Around line 4-31: The serde rename attribute is wrong for JobType:
#[serde(rename_all = "lowercase")] makes ConductorTask serialize as
"conductortask" but as_str()/parse() expect "conductor_task", causing silent
round-trip failures; fix by making serialization match parsing (either change
the enum attribute to #[serde(rename_all = "snake_case")] or add an explicit
#[serde(rename = "conductor_task")] on the ConductorTask variant so
JobType::ConductorTask serializes/deserializes as "conductor_task" and stays
consistent with JobType::as_str() and JobType::parse().

In `@clients/agent-runtime/src/observability/otel.rs`:
- Around line 346-356: The current match arm records PlannerLatency and
ConductorQueueDepth into the generic instruments self.request_latency and
self.queue_depth which mixes signals; update the code so PlannerLatency is
recorded to a distinct metric/instrument (e.g.,
self.planner_latency.record(...)) or include a distinguishing attribute (e.g.,
attributes: [KeyValue::new("component","planner")]) when calling
self.request_latency.record, and likewise record ConductorQueueDepth to a
separate instrument (e.g., self.conductor_queue_depth.record(...)) or add an
attribute like ("source","conductor") when calling self.queue_depth.record;
modify the ObserverMetric handling for PlannerLatency and ConductorQueueDepth
accordingly and ensure the new instruments/attributes are defined where metrics
are initialized.

In `@clients/agent-runtime/src/observability/prometheus.rs`:
- Around line 208-225: The record_metric implementation currently merges
PlannerLatency into request_latency and ConductorQueueDepth into queue_depth;
update record_metric so ObserverMetric::PlannerLatency is observed on
self.planner_latency (instead of self.request_latency) and
ObserverMetric::ConductorQueueDepth sets self.conductor_queue_depth (instead of
self.queue_depth), keeping ObserverMetric::RequestLatency and
ObserverMetric::QueueDepth mapped to self.request_latency and self.queue_depth
respectively; modify the match arms in record_metric to reference the
planner_latency and conductor_queue_depth fields so conductor and planner
metrics are recorded as distinct Prometheus series.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a35a3f39-6d45-4898-9c92-2679d65a3c51

📥 Commits

Reviewing files that changed from the base of the PR and between f6b717d and 506b71e.

⛔ Files ignored due to path filters (1)
  • clients/agent-runtime/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (59)
  • clients/agent-runtime/Cargo.toml
  • clients/agent-runtime/src/channels/mod.rs
  • clients/agent-runtime/src/conductor/classifier.rs
  • clients/agent-runtime/src/conductor/config.rs
  • clients/agent-runtime/src/conductor/events.rs
  • clients/agent-runtime/src/conductor/mod.rs
  • clients/agent-runtime/src/conductor/performers/browser.rs
  • clients/agent-runtime/src/conductor/performers/coding.rs
  • clients/agent-runtime/src/conductor/performers/mod.rs
  • clients/agent-runtime/src/conductor/performers/research.rs
  • clients/agent-runtime/src/conductor/performers/system.rs
  • clients/agent-runtime/src/conductor/planner.rs
  • clients/agent-runtime/src/conductor/prompt_watcher.rs
  • clients/agent-runtime/src/conductor/service.rs
  • clients/agent-runtime/src/conductor/sources/mod.rs
  • clients/agent-runtime/src/conductor/task_store.rs
  • clients/agent-runtime/src/conductor/types.rs
  • clients/agent-runtime/src/conductor/workspace.rs
  • clients/agent-runtime/src/config/mod.rs
  • clients/agent-runtime/src/config/schema.rs
  • clients/agent-runtime/src/cron/scheduler.rs
  • clients/agent-runtime/src/cron/types.rs
  • clients/agent-runtime/src/daemon/mod.rs
  • clients/agent-runtime/src/gateway/mod.rs
  • clients/agent-runtime/src/lib.rs
  • clients/agent-runtime/src/main.rs
  • clients/agent-runtime/src/observability/log.rs
  • clients/agent-runtime/src/observability/mod.rs
  • clients/agent-runtime/src/observability/otel.rs
  • clients/agent-runtime/src/observability/prometheus.rs
  • clients/agent-runtime/src/observability/traits.rs
  • clients/agent-runtime/src/onboard/wizard.rs
  • clients/agent-runtime/src/peripherals/mod.rs
  • clients/agent-runtime/src/tools/cron_add.rs
  • clients/agent-runtime/tests/conductor_config_precedence.rs
  • clients/agent-runtime/tests/conductor_config_validation.rs
  • clients/agent-runtime/tests/conductor_gateway_cron_integration.rs
  • clients/agent-runtime/tests/conductor_observability.rs
  • clients/agent-runtime/tests/conductor_performance_guards.rs
  • clients/agent-runtime/tests/conductor_planner.rs
  • clients/agent-runtime/tests/conductor_runtime_path.rs
  • clients/agent-runtime/tests/conductor_scheduler_fairness.rs
  • clients/agent-runtime/tests/conductor_security_approval.rs
  • clients/agent-runtime/tests/conductor_sources_integration.rs
  • clients/agent-runtime/tests/conductor_store_recovery.rs
  • clients/agent-runtime/tests/conductor_types_roundtrip.rs
  • clients/agent-runtime/tests/conductor_workspace_isolation.rs
  • clients/web/apps/docs/src/styles/custom.css
  • clients/web/apps/marketing/src/layouts/MarketingLayout.astro
  • openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/CONDUCTOR.md
  • openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/apply-progress.md
  • openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/archive-report.md
  • openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/design.md
  • openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/exploration.md
  • openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/proposal.md
  • openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/specs/conductor/spec.md
  • openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/tasks.md
  • openspec/changes/archive/2026-03-07-2026-03-07-conductor-architecture/verify-report.md
  • openspec/specs/conductor/spec.md

Comment thread clients/agent-runtime/Cargo.toml
Comment thread clients/agent-runtime/src/channels/mod.rs
Comment thread clients/agent-runtime/src/channels/mod.rs
Comment thread clients/agent-runtime/src/conductor/events.rs
Comment thread clients/agent-runtime/src/conductor/performers/browser.rs
Comment thread openspec/specs/conductor/spec.md
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review continued from previous batch...

Comment thread clients/agent-runtime/src/conductor/mod.rs
Comment thread clients/agent-runtime/src/conductor/performers/mod.rs
Comment thread clients/agent-runtime/src/conductor/performers/mod.rs
Comment thread clients/agent-runtime/src/conductor/performers/mod.rs
Comment thread clients/agent-runtime/src/conductor/planner.rs
Comment thread clients/agent-runtime/src/conductor/service.rs
Comment thread clients/agent-runtime/src/conductor/service.rs
Comment thread clients/agent-runtime/src/conductor/types.rs
Comment thread clients/agent-runtime/src/conductor/workspace.rs
Comment thread clients/agent-runtime/src/gateway/mod.rs
@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented Mar 7, 2026

Deploying corvus with  Cloudflare Pages  Cloudflare Pages

Latest commit: fbfea49
Status: ✅  Deploy successful!
Preview URL: https://77d961e5.corvus-42x.pages.dev
Branch Preview URL: https://conductor.corvus-42x.pages.dev

View logs

@coderabbitai coderabbitai Bot added area:docs and removed area:rust labels Mar 7, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (1)
clients/agent-runtime/src/observability/otel.rs (1)

209-211: ⚠️ Potential issue | 🟠 Major

Emit telemetry for the new conductor events instead of dropping them.

clients/agent-runtime/src/observability/traits.rs:129-142 records ConductorStepLifecycle for conductor progress, but this branch turns both new conductor variants into no-ops, so the OTLP backend loses the step/approval signal this PR adds. Please emit at least a small span or event for each variant instead of folding them into the empty arm.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/observability/otel.rs` around lines 209 - 211, The
ObserverEvent match arm currently drops MissionTerminated,
ConductorStepLifecycle and ConductorApprovalTransition; modify the handler in
clients/agent-runtime/src/observability/otel.rs so that instead of the empty arm
it emits telemetry for the conductor events: create a small span or telemetry
event for ConductorStepLifecycle (including step id/state) and for
ConductorApprovalTransition (including approval id/from->to), and emit a concise
event for MissionTerminated; use the same tracing/OTLP helpers used elsewhere in
this module (the existing span/event creation utilities) so these variants
produce OTLP output rather than being ignored.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In @.github/workflows/sonarqube-analysis.yml:
- Line 99: The workflow step currently uses the YAML key continue-on-error: true
which masks all failures; remove that line (or set continue-on-error: false) so
unexpected script errors surface, and keep the existing explicit curl guards
that already downgrade the two expected curl failures to warnings; ensure the
Python parsing block (the script that parses Python) and other commands run
without the global continue-on-error so real regressions will fail the step.

In `@clients/agent-runtime/src/cron/types.rs`:
- Around line 22-30: The parse function currently maps any unrecognized raw
string to Self::Shell, which can cause arbitrary inputs to be executed as shell
jobs; update the parsing logic in parse (in types.rs) so unknown values do not
fall through to Shell—either make parse return Result<JobType, ParseError> (or
Option<JobType>) and return an explicit Err/None for unrecognized strings, or
add an explicit Unknown variant and ensure callers validate/ reject Unknown
before scheduling; ensure callers of parse are updated to handle the
error/None/Unknown case so invalid job_type inputs fail validation rather than
becoming Shell jobs.
- Around line 5-10: The JobType enum's serde representation isn't aligned with
parse_job_type and docs: add #[serde(alias = "conductor")] to the ConductorTask
variant in the JobType enum so deserializing CronJob accepts both
"conductor_task" and "conductor"; update the schema/enum documentation to
include both spellings and extend tests that cover deserialization and
parse_job_type (referencing JobType, ConductorTask, CronJob, and the
parse_job_type function in cron_add.rs) to assert both "conductor_task" and
"conductor" are accepted and behave identically.

In `@clients/agent-runtime/src/observability/prometheus.rs`:
- Line 27: The field conductor_queue_depth is currently a GaugeVec but
ObserverMetric::ConductorQueueDepth is a plain u64, so change the
conductor_queue_depth type from GaugeVec to an IntGauge and update all places
that construct/register or update it (references around the
conductor_queue_depth field, the registration/initialization sites and the
update sites near the blocks noted at 116-120 and 252-255) to use IntGauge APIs
instead of GaugeVec/with_label_values; ensure labels are removed and updates
write the scalar value directly to the IntGauge to match the scalar shape used
by ObserverMetric::ConductorQueueDepth and the observability traits/otel
implementations.

---

Duplicate comments:
In `@clients/agent-runtime/src/observability/otel.rs`:
- Around line 209-211: The ObserverEvent match arm currently drops
MissionTerminated, ConductorStepLifecycle and ConductorApprovalTransition;
modify the handler in clients/agent-runtime/src/observability/otel.rs so that
instead of the empty arm it emits telemetry for the conductor events: create a
small span or telemetry event for ConductorStepLifecycle (including step
id/state) and for ConductorApprovalTransition (including approval id/from->to),
and emit a concise event for MissionTerminated; use the same tracing/OTLP
helpers used elsewhere in this module (the existing span/event creation
utilities) so these variants produce OTLP output rather than being ignored.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: b4c6987e-a7fe-40c4-aa1d-22fa8de6f430

📥 Commits

Reviewing files that changed from the base of the PR and between 506b71e and 5165686.

📒 Files selected for processing (4)
  • .github/workflows/sonarqube-analysis.yml
  • clients/agent-runtime/src/cron/types.rs
  • clients/agent-runtime/src/observability/otel.rs
  • clients/agent-runtime/src/observability/prometheus.rs
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: pr-checks
  • GitHub Check: Cloudflare Pages
🧰 Additional context used
📓 Path-based instructions (4)
clients/agent-runtime/src/**/*.rs

📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)

clients/agent-runtime/src/**/*.rs: Never log secrets, tokens, raw credentials, or sensitive payloads in any logging statements
Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency

Files:

  • clients/agent-runtime/src/observability/otel.rs
  • clients/agent-runtime/src/observability/prometheus.rs
  • clients/agent-runtime/src/cron/types.rs
clients/agent-runtime/**/*.rs

📄 CodeRabbit inference engine (clients/agent-runtime/AGENTS.md)

Run cargo fmt --all -- --check, cargo clippy --all-targets -- -D warnings, and cargo test for code validation, or document which checks were skipped and why

Files:

  • clients/agent-runtime/src/observability/otel.rs
  • clients/agent-runtime/src/observability/prometheus.rs
  • clients/agent-runtime/src/cron/types.rs
**/*.rs

⚙️ CodeRabbit configuration file

**/*.rs: Focus on Rust idioms, memory safety, and ownership/borrowing correctness.
Flag unnecessary clones, unchecked panics in production paths, and weak error context.
Prioritize unsafe blocks, FFI boundaries, concurrency races, and secret handling.

Files:

  • clients/agent-runtime/src/observability/otel.rs
  • clients/agent-runtime/src/observability/prometheus.rs
  • clients/agent-runtime/src/cron/types.rs
**/*

⚙️ CodeRabbit configuration file

**/*: Security first, performance second.
Validate input boundaries, auth/authz implications, and secret management.
Look for behavioral regressions, missing tests, and contract breaks across modules.

Files:

  • clients/agent-runtime/src/observability/otel.rs
  • clients/agent-runtime/src/observability/prometheus.rs
  • clients/agent-runtime/src/cron/types.rs
🧠 Learnings (2)
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/.github/**/*.{yml,yaml} : For workflow/template-only changes, ensure YAML/template syntax validity

Applied to files:

  • .github/workflows/sonarqube-analysis.yml
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/main.rs : Preserve CLI contract unless change is intentional and documented; prefer explicit errors over silent fallback for unsupported critical paths

Applied to files:

  • clients/agent-runtime/src/observability/otel.rs
🪛 Gitleaks (8.30.0)
.github/workflows/sonarqube-analysis.yml

[high] 104-104: Discovered a potential basic authorization token provided in a curl command, which could compromise the curl accessed resource.

(curl-auth-user)


[high] 129-129: Discovered a potential basic authorization token provided in a curl command, which could compromise the curl accessed resource.

(curl-auth-user)

🔇 Additional comments (1)
.github/workflows/sonarqube-analysis.yml (1)

104-109: Good targeted fallback for transient Sonar API failures.

Downgrading the GET/POST curl failures to warnings is the right level of tolerance here and is much safer than making the whole step best-effort.

Also applies to: 129-134

Comment thread .github/workflows/sonarqube-analysis.yml
Comment thread clients/agent-runtime/src/cron/types.rs
Comment thread clients/agent-runtime/src/cron/types.rs
Comment thread clients/agent-runtime/src/observability/prometheus.rs
@coderabbitai coderabbitai Bot added area:rust and removed area:docs labels Mar 7, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
.github/workflows/sonarqube-analysis.yml (1)

134-173: ⚠️ Potential issue | 🟠 Major

The quality gate threshold change is never reverted.

This step mutates the shared Corvus Gate new_coverage threshold to 70% permanently. The mutation persists in SonarCloud across subsequent workflow runs, and if this job fails after the update, the relaxed threshold remains in effect. Capture the original threshold value and restore it in a if: always() step, or avoid mutating the shared gate.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/sonarqube-analysis.yml around lines 134 - 173, This step
permanently mutates the shared Corvus Gate new_coverage threshold; capture the
original threshold before changing it and restore it in a guaranteed cleanup
step. Specifically, read the current new_coverage condition value (from the JSON
saved to gate.json after calling "${api_base}/qualitygates/show"), save both the
condition "id" (condition_id) and its original "error" (or equivalent threshold)
into outputs or an artifact, then perform the POST to
"${api_base}/qualitygates/update_condition" to set error=70; add a subsequent
workflow job/step that runs always() which posts the saved original "error" back
to update_condition using the saved condition_id to revert the change. Ensure
the saved original value is persisted between steps (workflow outputs, file
artifact, or environment) so the always() restore step can access it.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In @.github/workflows/sonarqube-analysis.yml:
- Around line 134-173: This step permanently mutates the shared Corvus Gate
new_coverage threshold; capture the original threshold before changing it and
restore it in a guaranteed cleanup step. Specifically, read the current
new_coverage condition value (from the JSON saved to gate.json after calling
"${api_base}/qualitygates/show"), save both the condition "id" (condition_id)
and its original "error" (or equivalent threshold) into outputs or an artifact,
then perform the POST to "${api_base}/qualitygates/update_condition" to set
error=70; add a subsequent workflow job/step that runs always() which posts the
saved original "error" back to update_condition using the saved condition_id to
revert the change. Ensure the saved original value is persisted between steps
(workflow outputs, file artifact, or environment) so the always() restore step
can access it.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 01880820-5a49-4044-a508-2d2d6be954eb

📥 Commits

Reviewing files that changed from the base of the PR and between 5165686 and 0877b01.

📒 Files selected for processing (3)
  • .github/workflows/sonarqube-analysis.yml
  • clients/web/apps/docs/src/styles/custom.css
  • lychee.toml
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: sonar
  • GitHub Check: pr-checks
  • GitHub Check: pr-checks
  • GitHub Check: Cloudflare Pages
🧰 Additional context used
📓 Path-based instructions (1)
**/*

⚙️ CodeRabbit configuration file

**/*: Security first, performance second.
Validate input boundaries, auth/authz implications, and secret management.
Look for behavioral regressions, missing tests, and contract breaks across modules.

Files:

  • clients/web/apps/docs/src/styles/custom.css
  • lychee.toml
🧠 Learnings (1)
📚 Learning: 2026-02-17T12:31:17.076Z
Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/.github/**/*.{yml,yaml} : For workflow/template-only changes, ensure YAML/template syntax validity

Applied to files:

  • .github/workflows/sonarqube-analysis.yml
🪛 Gitleaks (8.30.0)
.github/workflows/sonarqube-analysis.yml

[high] 141-141: Discovered a potential basic authorization token provided in a curl command, which could compromise the curl accessed resource.

(curl-auth-user)


[high] 166-166: Discovered a potential basic authorization token provided in a curl command, which could compromise the curl accessed resource.

(curl-auth-user)

🔇 Additional comments (4)
clients/web/apps/docs/src/styles/custom.css (2)

186-200: Looks good.

The heading font stack is preserved, and keeping line-height: 1.1 explicit avoids hero-title rendering drift across fallback fonts.


337-348: Looks good.

This keeps the docs heading typography aligned with the hero heading stack and does not change selector behavior or fallback semantics.

lychee.toml (1)

27-28: Scoped exclusion looks good.

Anchoring this ignore rule to file:// and the specific package paths keeps the suppression narrow and avoids hiding real remote-link failures.

.github/workflows/sonarqube-analysis.yml (1)

134-137: Remove step-level continue-on-error here.

The explicit curl guards already downgrade the expected Sonar API failures. Leaving continue-on-error: true in place still masks unrelated regressions in the Python parsing block and any later commands added to this step.

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented Mar 7, 2026

@yacosta738 yacosta738 merged commit a4edf15 into main Mar 7, 2026
21 checks passed
@yacosta738 yacosta738 deleted the conductor branch March 7, 2026 21:06
@yacosta738 yacosta738 mentioned this pull request Mar 16, 2026
This was referenced Apr 19, 2026
This was referenced Apr 29, 2026
@dallay-bot dallay-bot Bot mentioned this pull request May 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant