Skip to content

feat(service): add linger option for service install and restart commands#71

Merged
yacosta738 merged 5 commits into
mainfrom
feature/update-linger-command
Feb 23, 2026
Merged

feat(service): add linger option for service install and restart commands#71
yacosta738 merged 5 commits into
mainfrom
feature/update-linger-command

Conversation

@yacosta738
Copy link
Copy Markdown
Contributor

@yacosta738 yacosta738 commented Feb 23, 2026

This pull request introduces several improvements and new features to the Corvus agent runtime, with a focus on service management, update notifications, and enhanced documentation. The most significant changes are the addition of a service restart command, support for Linux linger modes, and automatic update checks. The changes also improve error handling and clarify documentation for users.

Service management enhancements:

  • Added a Restart command to the ServiceCommands enum, allowing users to restart the background service without manually stopping and starting it. [1] [2] [3]
  • Introduced the --linger option for the service install command, enabling Linux users to keep the service running after logout or reboot. This includes support for enabling/disabling linger and status reporting, with clear warnings for unsupported OSes. [1] [2] [3]
  • Updated service status output to display the current linger state on Linux, improving visibility for users.

Update notification improvements:

  • Added automatic update checks and notifications to the agent, daemon, and status commands, with an option to disable via CORVUS_DISABLE_UPDATE_CHECK. [1] [2] [3] [4] [5]

Documentation and usability:

  • Improved documentation in README.md to reflect new commands (restart, --linger), clarify service lifecycle management, and explain update notifications. [1] [2]
  • Enhanced error handling for gateway port conflicts, providing actionable guidance to users when a port is already in use.

These changes collectively make Corvus easier to manage, more robust, and clearer for end users.

Summary by CodeRabbit

  • New Features

    • Automatic update notices shown before agent/daemon/status commands (disable via CORVUS_DISABLE_UPDATE_CHECK); checks run with a short timeout and won’t block startup
    • Added service restart command and per-install "linger" option with Linux linger reporting in status
    • Health registry can now clear individual components
  • Bug Fixes

    • One-time clearer warning when gateway/service ports are already in use
    • HTTP response headers now display actual values (sensitive headers still redacted)
  • Documentation

    • Quick Start and service command docs updated with lifecycle, linger and restart examples

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Feb 23, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds an update-checking module and integrates update notices into agent/daemon/status; extends service CLI with per-install Linux linger modes and a Restart command; implements Linux linger handling in service install/status and restart; improves daemon logging for gateway port conflicts; fixes HTTP response header display and README docs.

Changes

Cohort / File(s) Summary
Service API & CLI
clients/agent-runtime/src/lib.rs, clients/agent-runtime/src/main.rs
Introduce ServiceLingerMode; change ServiceCommands::Install to Install { linger }; add Restart; re-export service types from lib; wire update notice printing into agent/daemon/status; add internal update module.
Service Implementation
clients/agent-runtime/src/service/mod.rs
Handle Install { linger } and Restart; change install signature to accept linger; implement Linux linger helpers (apply_linux_linger_mode, linux_linger_state, current_username), status reporting, restart fallback, and unsupported-linger warnings; add tests.
Update Checking Module
clients/agent-runtime/src/update/mod.rs
New module with pub async fn maybe_print_update_notice(config: &Config); TTL-cached state (24h) persisted to disk; fetch latest release from primary/fallback GitHub endpoints with timeouts and normalization/comparison; respects CORVUS_DISABLE_UPDATE_CHECK; includes unit tests.
Daemon Supervisor & Health
clients/agent-runtime/src/daemon/mod.rs, clients/agent-runtime/src/health/mod.rs
Detect gateway AddrInUse via is_addr_in_use_error, emit a one-time restart hint on conflict, unify error-string handling for component failures; add health::clear_component to remove components; add tests/guards.
HTTP Tooling & Tests
clients/agent-runtime/src/tools/http_request.rs, clients/agent-runtime/src/gateway/mod.rs
Fix response header display to use actual header values (still redact sensitive keys); update tests to match new header formatting; minor test formatting cleanups.
Docs
clients/agent-runtime/README.md
Document update checks for agent/daemon/status, introduce CORVUS_DISABLE_UPDATE_CHECK, expand Quick Start with linger and restart examples.

Sequence Diagram(s)

sequenceDiagram
    participant CLI
    participant Runtime as Agent/Daemon Runtime
    participant Update as Update Module
    participant Cache as Local Cache
    participant GH as GitHub API

    CLI->>Runtime: run command (agent / daemon / status)
    Runtime->>Update: maybe_print_update_notice(config)
    alt CORVUS_DISABLE_UPDATE_CHECK set
        Update-->>Runtime: no-op
    else
        Update->>Cache: load_state()
        Cache-->>Update: state (fresh or stale)
        alt state fresh
            Update-->>Runtime: notice_from_state / no notice
        else
            Update->>GH: fetch primary release endpoint
            GH-->>Update: version or error
            alt primary failed
                Update->>GH: fetch fallback endpoint
                GH-->>Update: version or error
            end
            Update->>Update: normalize & compare versions
            Update->>Cache: save_state(updated)
            Update-->>Runtime: print notice if update_available
        end
    end
    Runtime-->>CLI: continue execution
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested reviewers

  • yuniel-acosta
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The PR title accurately summarizes the main features: adding a linger option for service install and a restart command. It clearly conveys the primary changes from the changeset.
Description check ✅ Passed The PR description is comprehensive and well-structured, covering service management enhancements, update notification improvements, and documentation updates with supporting links. All required template sections are adequately addressed.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/update-linger-command

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Feb 23, 2026

✅ Contributor Report

User: @yacosta738
Status: Passed (12/13 metrics passed)

Metric Description Value Threshold Status
PR Merge Rate PRs merged vs closed 88% >= 30%
Repo Quality Repos with ≥100 stars 0 >= 0
Positive Reactions Positive reactions received 9 >= 1
Negative Reactions Negative reactions received 0 <= 5
Account Age GitHub account age 3040 days >= 30 days
Activity Consistency Regular activity over time 108% >= 0%
Issue Engagement Issues with community engagement 0 >= 0
Code Reviews Code reviews given to others 365 >= 0
Merger Diversity Unique maintainers who merged PRs 3 >= 0
Repo History Merge Rate Merge rate in this repo 88% >= 0%
Repo History Min PRs Previous PRs in this repo 51 >= 0
Profile Completeness Profile richness (bio, followers) 90 >= 0
Suspicious Patterns Spam-like activity detection 1 N/A

Contributor Report evaluates based on public GitHub activity. Analysis period: 2025-02-23 to 2026-02-23

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
clients/agent-runtime/src/tools/http_request.rs (1)

305-313: ⚠️ Potential issue | 🟡 Minor

Pre-existing: response header values are never rendered — key is printed twice.

The iterator binds (k, _), silently discarding the HeaderValue. The non-sensitive branch then formats the key as both the name and the "value", producing output like content-type: "content-type" instead of content-type: "application/json".

🐛 Proposed fix
-let headers_text = response_headers
-    .map(|(k, _)| {
-        let is_sensitive = k.as_str().to_lowercase().contains("set-cookie");
-        if is_sensitive {
-            format!("{}: ***REDACTED***", k.as_str())
-        } else {
-            format!("{}: {:?}", k.as_str(), k.as_str())
-        }
-    })
+let headers_text = response_headers
+    .map(|(k, v)| {
+        let is_sensitive = k.as_str().to_lowercase().contains("set-cookie");
+        if is_sensitive {
+            format!("{}: ***REDACTED***", k.as_str())
+        } else {
+            format!("{}: {}", k.as_str(), v.to_str().unwrap_or("***BINARY***"))
+        }
+    })
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/tools/http_request.rs` around lines 305 - 313, The
map closure currently drops the header value by binding `(k, _)`, causing the
key to be printed twice; change the closure to bind `(k, v)` so you can render
the actual header value, keep the existing is_sensitive check, and format the
non-sensitive branch with the header value (e.g., use
v.to_str().unwrap_or_else(|_| format!("{:?}", v)) or format!("{:?}", v)) instead
of k.as_str(); update the closure where the iterator is mapped (the `(k, _)`
binding and is_sensitive logic) so headers print as `"name: value"` with
sensitive values redacted.
♻️ Duplicate comments (1)
clients/agent-runtime/src/main.rs (1)

92-117: ServiceLingerMode and ServiceCommands are duplicated from lib.rs.

Already noted in the lib.rs review — consider re-exporting from the library crate (like HardwareCommands on line 79) to keep a single source of truth.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/main.rs` around lines 92 - 117, ServiceLingerMode
and ServiceCommands are duplicated here; remove these local definitions and
re-use the single source from the library crate by importing/re-exporting the
types instead (mirror how HardwareCommands is referenced). Replace the local
enum declarations with a use or pub use of the library's ServiceLingerMode and
ServiceCommands so all code relies on the definitions in lib.rs, ensuring any
arg/value_enum derives remain provided by the shared types.
🧹 Nitpick comments (6)
clients/agent-runtime/src/lib.rs (1)

74-79: ServiceLingerMode is duplicated between lib.rs and main.rs.

This enum (and the updated ServiceCommands) is defined identically in both crate roots. While this is a necessary pattern when the same module (service/mod.rs) is compiled under both the library and binary crate, the definitions must be kept in sync manually. Consider having the binary re-export from the library (like HardwareCommands and PeripheralCommands in main.rs line 79) to eliminate the duplication and reduce the risk of divergence.

♻️ Re-export from the library instead of duplicating

In main.rs, replace the local definitions with re-exports:

-#[derive(Debug, Clone, Copy, ValueEnum, Serialize, Deserialize, PartialEq, Eq)]
-enum ServiceLingerMode {
-    Keep,
-    On,
-    Off,
-}
-
-#[derive(Subcommand, Debug)]
-enum ServiceCommands {
-    /// Install daemon service unit for auto-start and restart
-    Install {
-        /// Linux only: keep user service active without an interactive session
-        #[arg(long, value_enum, default_value_t = ServiceLingerMode::Keep)]
-        linger: ServiceLingerMode,
-    },
-    /// Start daemon service
-    Start,
-    /// Restart daemon service
-    Restart,
-    /// Stop daemon service
-    Stop,
-    /// Check daemon service status
-    Status,
-    /// Uninstall daemon service unit
-    Uninstall,
-}
+pub use corvus::{ServiceCommands, ServiceLingerMode};

This mirrors the existing pattern at line 79 for HardwareCommands and PeripheralCommands.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/lib.rs` around lines 74 - 79, Remove the duplicated
ServiceLingerMode (and the identically duplicated ServiceCommands) from main.rs
and re-export the definitions from the library crate instead (the versions
defined in lib.rs), following the existing pattern used for HardwareCommands and
PeripheralCommands; update main.rs to use pub use (or equivalent) to reference
the ServiceLingerMode and ServiceCommands from the library so there is a single
authoritative definition in lib.rs and no duplicated enum/command definitions in
main.rs.
clients/agent-runtime/src/service/mod.rs (1)

169-173: current_username() relies on USER/LOGNAME environment variables.

These env vars are typically available in interactive sessions but may be absent in non-interactive contexts (e.g., cron, some CI environments). Since service install and service status are interactive CLI commands, this should be fine in practice. If you ever need to support non-interactive invocations, consider falling back to whoami command output or the nix crate's getuid/getpwuid.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/service/mod.rs` around lines 169 - 173,
current_username() currently only reads USER/LOGNAME env vars which can be
missing in non-interactive contexts; update it to fall back when those vars are
absent by attempting to run the system "whoami" command (or use the nix crate's
getuid/getpwuid to resolve the login name) and return that result, ensuring any
command execution errors are converted into the same Result::Err with context;
reference the current_username function to implement the fallback path and
preserve the existing context message on final failure.
clients/agent-runtime/src/main.rs (1)

608-613: Update check adds up to ~2s latency on cold cache to the agent startup path.

When the version check cache is stale or missing, maybe_print_update_notice performs an HTTP request with a 2s timeout before the agent starts. This is tolerable given the 24h cache TTL (most runs will be instant), but it does add observable latency on first use or after cache expiry.

If this becomes a user experience concern, consider spawning the update check concurrently (e.g., tokio::spawn) and printing the notice after the agent completes, or running it in the background entirely.

Based on learnings: "Keep startup path lean and avoid heavy initialization in command parsing flow."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/main.rs` around lines 608 - 613, The startup path
blocks on update::maybe_print_update_notice(&config).await causing up to ~2s
latency before agent::run starts when the cache is stale; change it to run the
update check concurrently so agent::run(config, ...) starts immediately—spawn
the check via tokio::spawn (or similar) calling
update::maybe_print_update_notice cloned with needed args and handle/log its
result asynchronously (optionally printing the notice after agent::run
completes), ensuring no await on maybe_print_update_notice in the main startup
path.
clients/agent-runtime/src/update/mod.rs (2)

136-161: Synchronous file I/O (std::fs) called from async context.

load_state and save_state use blocking std::fs::read_to_string / std::fs::write / std::fs::create_dir_all. These are called transitively from the async check_for_update. For a tiny JSON file on a startup-only code path, the practical impact is negligible, but it technically blocks the Tokio runtime thread.

If you'd like to align with the async-first pattern used elsewhere in the codebase (e.g., tokio::fs in daemon/mod.rs line 120/134), consider switching to tokio::fs — though this is low priority.

As per coding guidelines, "Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/update/mod.rs` around lines 136 - 161, The
synchronous file I/O in load_state and save_state blocks the Tokio runtime; make
them async and switch from std::fs to tokio::fs (e.g.,
tokio::fs::read_to_string, tokio::fs::create_dir_all, tokio::fs::write) so they
return Result<Option<VersionCheckState>> and Result<()> as async fns, keeping
the existing anyhow::Context usage for error messages; update the call site
(check_for_update) to .await these new async functions and adjust
signatures/uses accordingly (preserve VersionCheckState, path handling, and the
same error messages).

219-236: Pre-release suffixes are silently stripped during comparison.

parse_semverish strips everything after - or +, so 1.0.0-beta.1 is treated as equal to 1.0.0. This means a pre-release tagged as GitHub's "latest" could produce a false "update available" notice (or miss a real update). This is likely acceptable for your use case since GitHub "latest" typically points to stable releases, but worth noting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/update/mod.rs` around lines 219 - 236,
parse_semverish currently strips any pre-release/build suffix (splitting on '-'
or '+') so versions like 1.0.0-beta.1 are treated identical to 1.0.0; update
parse_semverish to preserve the pre-release part instead of discarding it (for
example return a tuple including an Option<String> pre-release field, e.g.,
(u64,u64,u64, Option<String>)), and then adjust the version-comparison logic
that consumes parse_semverish to treat pre-release versions with lower
precedence than the same core version without a pre-release (or explicitly
reject/ignore pre-release versions depending on desired behavior) so comparisons
correctly distinguish 1.0.0 from 1.0.0-beta.1.
clients/agent-runtime/src/daemon/mod.rs (1)

162-169: Helpful diagnostic for port conflicts.

The guidance to use corvus service restart is useful for users encountering this after upgrades. Two minor notes:

  1. e.to_string() is called on both line 163 and line 165. You could bind it once and reuse it to avoid the duplicate allocation in the error path.
  2. The string match on "Address already in use" is best-effort since OS error messages can vary by locale/platform, but this is acceptable for a diagnostic hint.
♻️ Reuse the error string
             Err(e) => {
-                    crate::health::mark_component_error(name, e.to_string());
-                    tracing::error!("Daemon component '{name}' failed: {e}");
-                    if name == "gateway" && e.to_string().contains("Address already in use") {
+                    let err_msg = e.to_string();
+                    crate::health::mark_component_error(name, &err_msg);
+                    tracing::error!("Daemon component '{name}' failed: {err_msg}");
+                    if name == "gateway" && err_msg.contains("Address already in use") {
                         tracing::warn!(
                             "Gateway port is already in use. This usually means another daemon/gateway instance is already running. If this happened after an upgrade, run `corvus service restart` instead of starting a second daemon process."
                         );
                     }

As per coding guidelines, "Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/daemon/mod.rs` around lines 162 - 169, The error
path repeatedly calls e.to_string()—bind a single local (e.g., let err_str =
e.to_string()) before using it, then pass err_str to
crate::health::mark_component_error(name, err_str.clone() or &err_str as
appropriate) and use err_str in tracing::error! and the contains check for
"Address already in use" to avoid duplicate allocations; update the block around
the Err(e) match arm (referencing name and e) to reuse that single string
variable.
ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between f667e03 and a29a676.

📒 Files selected for processing (8)
  • clients/agent-runtime/README.md
  • clients/agent-runtime/src/daemon/mod.rs
  • clients/agent-runtime/src/gateway/mod.rs
  • clients/agent-runtime/src/lib.rs
  • clients/agent-runtime/src/main.rs
  • clients/agent-runtime/src/service/mod.rs
  • clients/agent-runtime/src/tools/http_request.rs
  • clients/agent-runtime/src/update/mod.rs
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@clients/agent-runtime/src/update/mod.rs`:
- Around line 59-92: The outer tokio::time::timeout wrapping
fetch_latest_release_version() in check_for_update causes the loop inside
fetch_latest_release_version (which uses a reqwest::Client per-request timeout)
to not reach fallback endpoints when the first endpoint is slow; remove the
outer tokio::time::timeout and call fetch_latest_release_version().await
directly (or alternatively increase the timeout to VERSION_CHECK_TIMEOUT_SECS *
2 + 1 if you prefer keeping it), ensuring existing error handling around the
returned Result remains unchanged; refer to check_for_update,
fetch_latest_release_version, VERSION_CHECK_TIMEOUT_SECS, tokio::time::timeout,
reqwest::Client, and RELEASE_ENDPOINTS to locate and update the code.

---

Outside diff comments:
In `@clients/agent-runtime/src/tools/http_request.rs`:
- Around line 305-313: The map closure currently drops the header value by
binding `(k, _)`, causing the key to be printed twice; change the closure to
bind `(k, v)` so you can render the actual header value, keep the existing
is_sensitive check, and format the non-sensitive branch with the header value
(e.g., use v.to_str().unwrap_or_else(|_| format!("{:?}", v)) or format!("{:?}",
v)) instead of k.as_str(); update the closure where the iterator is mapped (the
`(k, _)` binding and is_sensitive logic) so headers print as `"name: value"`
with sensitive values redacted.

---

Duplicate comments:
In `@clients/agent-runtime/src/main.rs`:
- Around line 92-117: ServiceLingerMode and ServiceCommands are duplicated here;
remove these local definitions and re-use the single source from the library
crate by importing/re-exporting the types instead (mirror how HardwareCommands
is referenced). Replace the local enum declarations with a use or pub use of the
library's ServiceLingerMode and ServiceCommands so all code relies on the
definitions in lib.rs, ensuring any arg/value_enum derives remain provided by
the shared types.

---

Nitpick comments:
In `@clients/agent-runtime/src/daemon/mod.rs`:
- Around line 162-169: The error path repeatedly calls e.to_string()—bind a
single local (e.g., let err_str = e.to_string()) before using it, then pass
err_str to crate::health::mark_component_error(name, err_str.clone() or &err_str
as appropriate) and use err_str in tracing::error! and the contains check for
"Address already in use" to avoid duplicate allocations; update the block around
the Err(e) match arm (referencing name and e) to reuse that single string
variable.

In `@clients/agent-runtime/src/lib.rs`:
- Around line 74-79: Remove the duplicated ServiceLingerMode (and the
identically duplicated ServiceCommands) from main.rs and re-export the
definitions from the library crate instead (the versions defined in lib.rs),
following the existing pattern used for HardwareCommands and PeripheralCommands;
update main.rs to use pub use (or equivalent) to reference the ServiceLingerMode
and ServiceCommands from the library so there is a single authoritative
definition in lib.rs and no duplicated enum/command definitions in main.rs.

In `@clients/agent-runtime/src/main.rs`:
- Around line 608-613: The startup path blocks on
update::maybe_print_update_notice(&config).await causing up to ~2s latency
before agent::run starts when the cache is stale; change it to run the update
check concurrently so agent::run(config, ...) starts immediately—spawn the check
via tokio::spawn (or similar) calling update::maybe_print_update_notice cloned
with needed args and handle/log its result asynchronously (optionally printing
the notice after agent::run completes), ensuring no await on
maybe_print_update_notice in the main startup path.

In `@clients/agent-runtime/src/service/mod.rs`:
- Around line 169-173: current_username() currently only reads USER/LOGNAME env
vars which can be missing in non-interactive contexts; update it to fall back
when those vars are absent by attempting to run the system "whoami" command (or
use the nix crate's getuid/getpwuid to resolve the login name) and return that
result, ensuring any command execution errors are converted into the same
Result::Err with context; reference the current_username function to implement
the fallback path and preserve the existing context message on final failure.

In `@clients/agent-runtime/src/update/mod.rs`:
- Around line 136-161: The synchronous file I/O in load_state and save_state
blocks the Tokio runtime; make them async and switch from std::fs to tokio::fs
(e.g., tokio::fs::read_to_string, tokio::fs::create_dir_all, tokio::fs::write)
so they return Result<Option<VersionCheckState>> and Result<()> as async fns,
keeping the existing anyhow::Context usage for error messages; update the call
site (check_for_update) to .await these new async functions and adjust
signatures/uses accordingly (preserve VersionCheckState, path handling, and the
same error messages).
- Around line 219-236: parse_semverish currently strips any pre-release/build
suffix (splitting on '-' or '+') so versions like 1.0.0-beta.1 are treated
identical to 1.0.0; update parse_semverish to preserve the pre-release part
instead of discarding it (for example return a tuple including an Option<String>
pre-release field, e.g., (u64,u64,u64, Option<String>)), and then adjust the
version-comparison logic that consumes parse_semverish to treat pre-release
versions with lower precedence than the same core version without a pre-release
(or explicitly reject/ignore pre-release versions depending on desired behavior)
so comparisons correctly distinguish 1.0.0 from 1.0.0-beta.1.

Comment thread clients/agent-runtime/src/update/mod.rs
@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented Feb 23, 2026

Deploying corvus-plugins with  Cloudflare Pages  Cloudflare Pages

Latest commit: 1fedaff
Status: ✅  Deploy successful!
Preview URL: https://786740db.corvus-plugins.pages.dev
Branch Preview URL: https://feature-update-linger-comman.corvus-plugins.pages.dev

View logs

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (2)
clients/agent-runtime/src/daemon/mod.rs (1)

163-164: Unnecessary clone() — pass &err_str instead.

mark_component_error accepts impl ToString, which &String satisfies. Cloning allocates a second String that is immediately dropped after the call.

♻️ Proposed fix
-                    let err_str = e.to_string();
-                    crate::health::mark_component_error(name, err_str.clone());
-                    tracing::error!("Daemon component '{name}' failed: {err_str}");
+                    let err_str = e.to_string();
+                    crate::health::mark_component_error(name, &err_str);
+                    tracing::error!("Daemon component '{name}' failed: {err_str}");

As per coding guidelines: "Avoid unnecessary allocations, clones, and blocking operations to maintain performance and efficiency."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/daemon/mod.rs` around lines 163 - 164, The code
creates err_str and unnecessarily clones it when calling
crate::health::mark_component_error(name, err_str.clone()); instead pass a
reference to avoid allocating a second String: call mark_component_error with
&err_str (or borrow err_str directly) since mark_component_error accepts impl
ToString; update the call site in mod.rs where err_str is defined and used to
remove .clone() and pass the borrowed &err_str.
clients/agent-runtime/src/service/mod.rs (1)

169-189: No unit tests for current_username, apply_linux_linger_mode, or linux_linger_state.

current_username is fully testable with manipulated environment variables; the others require a Linux + loginctl environment but could at least have compile-time coverage on CI.

Example for current_username:

#[test]
fn current_username_reads_user_env() {
    // Safety: only safe in single-threaded tests; use serial_test crate if needed
    std::env::set_var("USER", "testuser");
    assert_eq!(current_username().unwrap(), "testuser");
    std::env::remove_var("USER");
}

#[test]
fn current_username_falls_back_to_logname() {
    std::env::remove_var("USER");
    std::env::set_var("LOGNAME", "loguser");
    assert_eq!(current_username().unwrap(), "loguser");
    std::env::remove_var("LOGNAME");
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/service/mod.rs` around lines 169 - 189, Add unit
tests for current_username and compile-time/CI coverage for
apply_linux_linger_mode and linux_linger_state: write tests that set and remove
environment variables using std::env::set_var/std::env::remove_var and assert
current_username() returns the expected string (e.g., set USER and assert
"testuser", remove USER set LOGNAME and assert "loguser"), ensure tests run
single-threaded or use the serial_test crate to avoid env races, and add
conditional #[cfg(target_os = "linux")] tests for apply_linux_linger_mode and
linux_linger_state (and #[cfg(not(target_os = "linux"))] stub tests) so CI
compiles those symbols even if loginctl is unavailable. Ensure tests clean
up/restore environment state after running and place them under the same module
so they reference current_username, apply_linux_linger_mode, and
linux_linger_state directly.
ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between a29a676 and ce9d4ca.

📒 Files selected for processing (5)
  • clients/agent-runtime/src/daemon/mod.rs
  • clients/agent-runtime/src/main.rs
  • clients/agent-runtime/src/service/mod.rs
  • clients/agent-runtime/src/tools/http_request.rs
  • clients/agent-runtime/src/update/mod.rs
🚧 Files skipped from review as they are similar to previous changes (2)
  • clients/agent-runtime/src/update/mod.rs
  • clients/agent-runtime/src/tools/http_request.rs
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@clients/agent-runtime/src/main.rs`:
- Around line 583-591: The detached tokio::spawn for
update::maybe_print_update_notice causes the notice to be dropped in
single-message (--message) mode and to interleave with agent I/O in interactive
mode; instead, call and await update::maybe_print_update_notice(&update_config)
before entering agent::run (or the interactive loop) so the notice completes
synchronously, and if you want to bound startup latency use
tokio::time::timeout(Duration::from_millis(500),
update::maybe_print_update_notice(&update_config)).await and handle the timeout
result gracefully; update the main() branch where update_config is cloned (and
ensure similar behavior for Commands::Daemon and Commands::Status) to keep
output coordinated.
- Line 605: The call to update::maybe_print_update_notice(&config).await blocks
daemon startup; change it so the update check is non-blocking for the daemon
path by either spawning it as a background task or skipping it for daemon runs:
locate the call (update::maybe_print_update_notice) invoked before daemon::run
and replace the await with a fire-and-forget tokio::spawn (or equivalent
runtime::spawn) of an async closure that calls
update::maybe_print_update_notice(config.clone()) so the HTTP round-trip does
not stall startup, or add a config/flag check (e.g., detecting the Daemon
command) to not call it at all for daemon runs.

In `@clients/agent-runtime/src/service/mod.rs`:
- Around line 25-38: The code mutates system linger state before installation
and can leave it changed if install_linux(config) fails; change the call order
so install_linux(config) is invoked first and only after it returns Ok
apply_linux_linger_mode(linger)? is called, or alternatively ensure
apply_linux_linger_mode(linger)? is reverted on error by wrapping it in a
transactional pattern; update the install function (referencing install,
install_linux, and apply_linux_linger_mode and the linger parameter) to perform
the linger change only after a successful install or to roll back the linger
change if install_linux fails.
- Around line 208-227: The linux_linger_state() helper mis-parses legacy systemd
output and treats non-zero loginctl exits as success; update
linux_linger_state() to (1) call loginctl in a way that checks the child process
exit status (or change run_capture to return Err when status is non-zero) so
failures propagate instead of returning Ok(""), and (2) normalize the output by
handling both bare "yes"/"no" and "Key=Value" forms (e.g., if output contains
'=', split on '=' and use the right-hand side, then trim and compare
case-insensitively to "yes"). Ensure these changes are reflected where
linux_linger_state() is used (e.g., status()) so unavailable/error cases are
propagated instead of being reported as false.

---

Nitpick comments:
In `@clients/agent-runtime/src/daemon/mod.rs`:
- Around line 163-164: The code creates err_str and unnecessarily clones it when
calling crate::health::mark_component_error(name, err_str.clone()); instead pass
a reference to avoid allocating a second String: call mark_component_error with
&err_str (or borrow err_str directly) since mark_component_error accepts impl
ToString; update the call site in mod.rs where err_str is defined and used to
remove .clone() and pass the borrowed &err_str.

In `@clients/agent-runtime/src/service/mod.rs`:
- Around line 169-189: Add unit tests for current_username and compile-time/CI
coverage for apply_linux_linger_mode and linux_linger_state: write tests that
set and remove environment variables using
std::env::set_var/std::env::remove_var and assert current_username() returns the
expected string (e.g., set USER and assert "testuser", remove USER set LOGNAME
and assert "loguser"), ensure tests run single-threaded or use the serial_test
crate to avoid env races, and add conditional #[cfg(target_os = "linux")] tests
for apply_linux_linger_mode and linux_linger_state (and #[cfg(not(target_os =
"linux"))] stub tests) so CI compiles those symbols even if loginctl is
unavailable. Ensure tests clean up/restore environment state after running and
place them under the same module so they reference current_username,
apply_linux_linger_mode, and linux_linger_state directly.

Comment thread clients/agent-runtime/src/main.rs
Comment thread clients/agent-runtime/src/main.rs Outdated
Comment thread clients/agent-runtime/src/service/mod.rs
Comment thread clients/agent-runtime/src/service/mod.rs
@cloudflare-workers-and-pages
Copy link
Copy Markdown

cloudflare-workers-and-pages Bot commented Feb 23, 2026

Deploying corvus with  Cloudflare Pages  Cloudflare Pages

Latest commit: 1fedaff
Status: ✅  Deploy successful!
Preview URL: https://c208d7a8.corvus-42x.pages.dev
Branch Preview URL: https://feature-update-linger-comman.corvus-42x.pages.dev

View logs

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (3)
clients/agent-runtime/src/daemon/mod.rs (2)

166-170: Missing test for the gateway port-conflict warning path.

The new if name == "gateway" && … branch has no test coverage. At a minimum, a test that spawns the supervisor with name = "gateway" and an error carrying the port-conflict string should assert that the health state and restart count are set correctly.

🧪 Example test skeleton
#[tokio::test]
async fn supervisor_sets_error_on_gateway_addr_in_use() {
    let handle = spawn_component_supervisor("gateway", 1, 1, || async {
        Err(anyhow::anyhow!(std::io::Error::from(
            std::io::ErrorKind::AddrInUse
        )))
    });

    tokio::time::sleep(Duration::from_millis(50)).await;
    handle.abort();
    let _ = handle.await;

    let snapshot = crate::health::snapshot_json();
    let component = &snapshot["components"]["gateway"];
    assert_eq!(component["status"], "error");
    assert!(component["restart_count"].as_u64().unwrap_or(0) >= 1);
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/daemon/mod.rs` around lines 166 - 170, Add a new
async test that exercises the gateway port-conflict branch by using the existing
spawn_component_supervisor helper to spawn a supervisor with name "gateway" and
a task that returns an error containing "Address already in use" (e.g., an
AddrInUse io::Error); after a short sleep abort the handle, call
crate::health::snapshot_json(), and assert that
snapshot["components"]["gateway"]["status"] == "error" and that
snapshot["components"]["gateway"]["restart_count"] is >= 1; this ensures the if
name == "gateway" && err_str.contains("Address already in use") branch in
daemon::mod.rs is covered.

166-170: Missing test coverage for the new gateway port-conflict branch.

The new if name == "gateway" && err_str.contains("Address already in use") path has no corresponding test. At a minimum, a test that spawns the supervisor with name = "gateway" and an Err whose message contains "Address already in use" would confirm the health state and restart count behave correctly (and, once tracing is captured, that the warning fires).

🧪 Example skeleton
#[tokio::test]
async fn supervisor_logs_hint_on_gateway_addr_in_use() {
    let handle = spawn_component_supervisor("gateway", 1, 1, || async {
        Err(anyhow::anyhow!("Address already in use"))
    });

    tokio::time::sleep(Duration::from_millis(50)).await;
    handle.abort();
    let _ = handle.await;

    let snapshot = crate::health::snapshot_json();
    let component = &snapshot["components"]["gateway"];
    assert_eq!(component["status"], "error");
    assert!(component["last_error"]
        .as_str()
        .unwrap_or("")
        .contains("Address already in use"));
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/daemon/mod.rs` around lines 166 - 170, Add a
tokio::test that exercises the new gateway branch by calling
spawn_component_supervisor("gateway", ...) with a closure that returns
Err(anyhow::anyhow!("Address already in use")), sleep briefly, abort the handle,
then call crate::health::snapshot_json() and assert the "gateway" component has
status "error" and that its "last_error" contains "Address already in use"; also
assert the restart count/health fields update as expected and (optionally)
capture tracing to assert the specific warning message was emitted. Ensure the
test name is descriptive (e.g., supervisor_logs_hint_on_gateway_addr_in_use) and
uses the same helper functions (spawn_component_supervisor,
crate::health::snapshot_json) referenced in the diff.
clients/agent-runtime/src/service/mod.rs (1)

191-206: current_username() called unnecessarily for ServiceLingerMode::Keep.

current_username() runs a USER/LOGNAME env lookup (and potentially spawns whoami) even when linger == Keep, where the result is immediately discarded. Move the call into the On/Off arms.

♻️ Proposed refactor
 fn apply_linux_linger_mode(linger: crate::ServiceLingerMode) -> Result<()> {
-    let user = current_username()?;
     match linger {
         crate::ServiceLingerMode::Keep => Ok(()),
         crate::ServiceLingerMode::On => {
+            let user = current_username()?;
             run_checked(Command::new("loginctl").args(["enable-linger", &user]))?;
             println!("✅ Enabled linger for user '{user}'");
             Ok(())
         }
         crate::ServiceLingerMode::Off => {
+            let user = current_username()?;
             run_checked(Command::new("loginctl").args(["disable-linger", &user]))?;
             println!("✅ Disabled linger for user '{user}'");
             Ok(())
         }
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/service/mod.rs` around lines 191 - 206, The call to
current_username() is done unconditionally in apply_linux_linger_mode even when
ServiceLingerMode::Keep is returned immediately; move the current_username()
invocation into the On and Off match arms so it's only executed when needed
(i.e., call current_username() at the start of the On arm and reuse its Result
or ? to propagate errors, and likewise in the Off arm), leaving the Keep branch
to return Ok(()) without calling current_username().
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@clients/agent-runtime/src/daemon/mod.rs`:
- Around line 166-170: The warning currently matches the OS-specific string
"Address already in use" and is emitted every retry; change it to detect
AddrInUse by walking the anyhow error chain and downcasting to std::io::Error to
check io::ErrorKind::AddrInUse (do this check before calling e.to_string() or
otherwise ensure the original error is available), and add a deduplication guard
(e.g., a local bool or a static AtomicBool like GATEWAY_PORT_IN_USE_WARNED) so
the warning inside the branch handling name == "gateway" is only logged once per
process.
- Around line 166-170: Replace the fragile OS-specific substring check with an
error-kind match against std::io::ErrorKind::AddrInUse when handling gateway
startup failures (i.e., where the code currently checks name == "gateway" &&
err_str.contains("Address already in use")), and add a boolean flag inside the
supervisor closure (e.g., gateway_port_conflict_logged) that is set the first
time you log the warning so subsequent retry/backoff cycles do not re-emit the
same message; ensure the branch still targets the "gateway" name and includes
the original descriptive message when the error kind matches AddrInUse.

---

Nitpick comments:
In `@clients/agent-runtime/src/daemon/mod.rs`:
- Around line 166-170: Add a new async test that exercises the gateway
port-conflict branch by using the existing spawn_component_supervisor helper to
spawn a supervisor with name "gateway" and a task that returns an error
containing "Address already in use" (e.g., an AddrInUse io::Error); after a
short sleep abort the handle, call crate::health::snapshot_json(), and assert
that snapshot["components"]["gateway"]["status"] == "error" and that
snapshot["components"]["gateway"]["restart_count"] is >= 1; this ensures the if
name == "gateway" && err_str.contains("Address already in use") branch in
daemon::mod.rs is covered.
- Around line 166-170: Add a tokio::test that exercises the new gateway branch
by calling spawn_component_supervisor("gateway", ...) with a closure that
returns Err(anyhow::anyhow!("Address already in use")), sleep briefly, abort the
handle, then call crate::health::snapshot_json() and assert the "gateway"
component has status "error" and that its "last_error" contains "Address already
in use"; also assert the restart count/health fields update as expected and
(optionally) capture tracing to assert the specific warning message was emitted.
Ensure the test name is descriptive (e.g.,
supervisor_logs_hint_on_gateway_addr_in_use) and uses the same helper functions
(spawn_component_supervisor, crate::health::snapshot_json) referenced in the
diff.

In `@clients/agent-runtime/src/service/mod.rs`:
- Around line 191-206: The call to current_username() is done unconditionally in
apply_linux_linger_mode even when ServiceLingerMode::Keep is returned
immediately; move the current_username() invocation into the On and Off match
arms so it's only executed when needed (i.e., call current_username() at the
start of the On arm and reuse its Result or ? to propagate errors, and likewise
in the Off arm), leaving the Keep branch to return Ok(()) without calling
current_username().
ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between ce9d4ca and cf709fc.

📒 Files selected for processing (3)
  • clients/agent-runtime/src/daemon/mod.rs
  • clients/agent-runtime/src/main.rs
  • clients/agent-runtime/src/service/mod.rs

Comment thread clients/agent-runtime/src/daemon/mod.rs Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (4)
clients/agent-runtime/src/service/mod.rs (3)

509-557: Environment save/restore in tests is not panic-safe.

If an assertion (or any code between set_env and the restore block) panics, the original environment values are never restored. Subsequent tests holding ENV_LOCK will see corrupted state. A small RAII guard would fix this:

💡 Sketch of an EnvGuard helper
struct EnvGuard {
    key: &'static str,
    original: Option<String>,
}

impl EnvGuard {
    fn set(key: &'static str, value: &str) -> Self {
        let original = std::env::var(key).ok();
        unsafe { std::env::set_var(key, value) };
        Self { key, original }
    }
    fn remove(key: &'static str) -> Self {
        let original = std::env::var(key).ok();
        unsafe { std::env::remove_var(key) };
        Self { key, original }
    }
}

impl Drop for EnvGuard {
    fn drop(&mut self) {
        match &self.original {
            Some(v) => unsafe { std::env::set_var(self.key, v) },
            None => unsafe { std::env::remove_var(self.key) },
        }
    }
}

This would simplify each test to:

let _guard = ENV_LOCK.lock().unwrap();
let _u = EnvGuard::set("USER", "testuser");
let _l = EnvGuard::set("LOGNAME", "loguser");
assert_eq!(current_username().unwrap(), "testuser");
// restore happens automatically on drop
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/service/mod.rs` around lines 509 - 557, The tests
manipulate environment variables with set_env/remove_env around assertions but
are not panic-safe, so if a test panics the original env is not restored and
ENV_LOCK-protected state is corrupted; create an RAII helper (e.g., EnvGuard)
and replace the manual save/restore in the tests
(current_username_prefers_user_env and
current_username_uses_logname_when_user_missing) with EnvGuard::set /
EnvGuard::remove instances so the Drop impl always restores the original value;
keep using ENV_LOCK for synchronization and reference current_username, set_env,
and remove_env usages when locating where to swap in the guard.

161-167: maybe_warn_unsupported_linger_mode prints to stdout — consider eprintln! for warnings.

Warnings are conventionally written to stderr so they don't interfere with machine-parseable stdout output. Using eprintln! (or tracing::warn!) would align with the convention used elsewhere in the daemon.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/service/mod.rs` around lines 161 - 167, The warning
in maybe_warn_unsupported_linger_mode currently prints to stdout with println!;
change it to emit the warning to stderr (use eprintln!) or use the project's
logging/tracing facility (e.g., tracing::warn!) so warnings don't pollute
stdout; update the function to call eprintln! (or tracing::warn!) with the same
message and ensure any tests or callers expecting stdout are adjusted
accordingly.

40-43: Simple stop-then-start restart — works but differs from native restart semantics on Linux.

systemctl --user restart corvus.service would be atomic and avoid the brief window between stop and start. The current approach is fine for a CLI tool (and is more portable across the OS branches), but worth noting if you want to tighten the restart window on Linux.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/service/mod.rs` around lines 40 - 43, The restart
function currently does stop(config)? then start(config) which leaves a small
downtime window; change restart(config: &Config) to, on Linux when systemd is
available, invoke an atomic systemctl --user restart corvus.service (e.g. spawn
a Command calling "systemctl" with args ["--user","restart","corvus.service"]
and propagate errors) and fall back to the existing stop(config)?; start(config)
behavior on other platforms or if systemctl fails; update restart to reference
the restart function itself and use Config only for the fallback path as needed.
clients/agent-runtime/src/daemon/mod.rs (1)

304-325: Test uses the real component name "gateway", which shares state with the global health registry.

The health registry is a process-wide singleton. Using "gateway" here means this test writes to the same slot that a real daemon (or future test) would use. Today no other test collides, but parallel test runs or a new test targeting "gateway" will produce flaky failures. Consider a unique name like "gateway-addr-in-use-test" and adjusting the assertions accordingly — the name == "gateway" branch in the supervisor is the only part that actually requires the literal "gateway" string, but the health-state assertions don't.

Actually, the supervisor itself checks name == "gateway" to decide whether to inspect for AddrInUse, so using "gateway" is intentional and necessary to exercise that code path. Given that constraint, the current approach is acceptable but worth noting.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/daemon/mod.rs` around lines 304 - 325, The test
supervisor_logs_hint_on_gateway_addr_in_use currently writes to the global
health registry under the real component name "gateway", risking cross-test
collisions; keep calling spawn_component_supervisor("gateway", ...) to exercise
the name == "gateway" AddrInUse branch but avoid global-state leakage by
isolating or cleaning the health registry after the test — e.g., call a teardown
helper to remove or reset the component entry (use
crate::health::clear_component("gateway") or
crate::health::reset()/crate::health::remove(...) if available) or alternatively
change the health assertions to target a unique component key (e.g.,
"gateway-addr-in-use-test") if you modify the supervisor registration, and
ensure snapshot checks (crate::health::snapshot_json and the component variable)
reference that unique key.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@clients/agent-runtime/src/daemon/mod.rs`:
- Around line 304-325: The test supervisor_logs_hint_on_gateway_addr_in_use
currently writes to the global health registry under the real component name
"gateway", risking cross-test collisions; keep calling
spawn_component_supervisor("gateway", ...) to exercise the name == "gateway"
AddrInUse branch but avoid global-state leakage by isolating or cleaning the
health registry after the test — e.g., call a teardown helper to remove or reset
the component entry (use crate::health::clear_component("gateway") or
crate::health::reset()/crate::health::remove(...) if available) or alternatively
change the health assertions to target a unique component key (e.g.,
"gateway-addr-in-use-test") if you modify the supervisor registration, and
ensure snapshot checks (crate::health::snapshot_json and the component variable)
reference that unique key.

In `@clients/agent-runtime/src/service/mod.rs`:
- Around line 509-557: The tests manipulate environment variables with
set_env/remove_env around assertions but are not panic-safe, so if a test panics
the original env is not restored and ENV_LOCK-protected state is corrupted;
create an RAII helper (e.g., EnvGuard) and replace the manual save/restore in
the tests (current_username_prefers_user_env and
current_username_uses_logname_when_user_missing) with EnvGuard::set /
EnvGuard::remove instances so the Drop impl always restores the original value;
keep using ENV_LOCK for synchronization and reference current_username, set_env,
and remove_env usages when locating where to swap in the guard.
- Around line 161-167: The warning in maybe_warn_unsupported_linger_mode
currently prints to stdout with println!; change it to emit the warning to
stderr (use eprintln!) or use the project's logging/tracing facility (e.g.,
tracing::warn!) so warnings don't pollute stdout; update the function to call
eprintln! (or tracing::warn!) with the same message and ensure any tests or
callers expecting stdout are adjusted accordingly.
- Around line 40-43: The restart function currently does stop(config)? then
start(config) which leaves a small downtime window; change restart(config:
&Config) to, on Linux when systemd is available, invoke an atomic systemctl
--user restart corvus.service (e.g. spawn a Command calling "systemctl" with
args ["--user","restart","corvus.service"] and propagate errors) and fall back
to the existing stop(config)?; start(config) behavior on other platforms or if
systemctl fails; update restart to reference the restart function itself and use
Config only for the fallback path as needed.
ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between cf709fc and ee8e1db.

📒 Files selected for processing (2)
  • clients/agent-runtime/src/daemon/mod.rs
  • clients/agent-runtime/src/service/mod.rs

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (3)
clients/agent-runtime/src/daemon/mod.rs (1)

321-343: Test name doesn't match what is actually asserted.

supervisor_logs_hint_on_gateway_addr_in_use asserts health state (error status, restart count, last_error string) but does not verify the tracing::warn! emission. The name implies the warning is the subject under test. Renaming to something like supervisor_sets_error_state_on_gateway_addr_in_use would better reflect the actual assertions and avoid confusion when the test is read in isolation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/daemon/mod.rs` around lines 321 - 343, The test
named supervisor_logs_hint_on_gateway_addr_in_use currently asserts health state
(component status, restart_count, last_error) but does not check for any
tracing::warn! emission; rename the test function to a name that reflects its
actual assertions (for example
supervisor_sets_error_state_on_gateway_addr_in_use) and update any references to
the old test name; ensure the new name appears on the async test function
declaration (supervisor_logs_hint_on_gateway_addr_in_use ->
supervisor_sets_error_state_on_gateway_addr_in_use) and keep the body
(HealthComponentGuard, spawn_component_supervisor call, and the snapshot
assertions) unchanged.
clients/agent-runtime/src/service/mod.rs (2)

461-502: EnvGuard::drop relies on an external lock invariant the type doesn't enforce; lock poisoning can cascade test failures.

Two related concerns with the test helpers:

  1. EnvGuard doesn't hold ENV_LOCK itself. Its Drop impl calls set_env/remove_env which carry the SAFETY requirement that ENV_LOCK is held by the caller. Any future test that constructs an EnvGuard without first locking will silently violate this contract. Consider making EnvGuard hold the MutexGuard internally, or at minimum make EnvGuard::set/remove accept &MutexGuard<'_, ()> as proof the lock is held.

  2. ENV_LOCK.lock().unwrap() poisons on panic. If a test panics while holding the lock, every subsequent test calling .unwrap() also panics, obscuring the original failure. Use .unwrap_or_else(|e| e.into_inner()) to recover from a poisoned lock.

♻️ Suggested fix for poisoning resilience
-        let _guard = ENV_LOCK.lock().unwrap();
+        let _guard = ENV_LOCK.lock().unwrap_or_else(|e| e.into_inner());

(apply to both current_username_prefers_user_env and current_username_uses_logname_when_user_missing)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/service/mod.rs` around lines 461 - 502, EnvGuard
currently relies on the external invariant that ENV_LOCK is held, which is
unsafe and brittle; change EnvGuard to capture the lock by storing the
MutexGuard from ENV_LOCK (or alternately make EnvGuard::set and EnvGuard::remove
accept a &MutexGuard<'_, ()> parameter) so Drop can call set_env/remove_env
safely, and update call sites to pass the guard or construct EnvGuard after
locking. Also make all ENV_LOCK.lock() usages resilient to poisoning by
replacing .unwrap() with .unwrap_or_else(|e| e.into_inner()) so a panicked test
doesn't permanently poison the lock (apply this to tests like
current_username_prefers_user_env and
current_username_uses_logname_when_user_missing).

40-52: Silent fallback path on Linux could benefit from a diagnostic log.

When systemctl --user restart fails, the code silently falls back to stop + start without any indication that the primary path failed. While the user eventually sees "✅ Service stopped / started", the reason for the two-step path is invisible.

♻️ Suggested improvement
 fn restart(config: &Config) -> Result<()> {
     if cfg!(target_os = "linux") {
         if run_checked(Command::new("systemctl").args(["--user", "restart", "corvus.service"]))
             .is_ok()
         {
             println!("✅ Service restarted");
             return Ok(());
+        } else {
+            tracing::debug!("systemctl restart failed; falling back to stop + start");
         }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@clients/agent-runtime/src/service/mod.rs` around lines 40 - 52, The restart
function currently calls run_checked(Command::new("systemctl").args(["--user",
"restart", "corvus.service"])) and silently falls back to stop() + start() on
failure; change it to capture the Result from run_checked, and when it is Err
include a diagnostic log (e.g., log::warn! or eprintln!) that prints a clear
message plus the error/output from the failed systemctl invocation (referencing
restart, run_checked, and the Command::new("systemctl").args(...) call), then
proceed with the existing stop(config)?; start(config) fallback so the fallback
remains but is no longer silent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@clients/agent-runtime/src/daemon/mod.rs`:
- Around line 321-343: The test named
supervisor_logs_hint_on_gateway_addr_in_use currently asserts health state
(component status, restart_count, last_error) but does not check for any
tracing::warn! emission; rename the test function to a name that reflects its
actual assertions (for example
supervisor_sets_error_state_on_gateway_addr_in_use) and update any references to
the old test name; ensure the new name appears on the async test function
declaration (supervisor_logs_hint_on_gateway_addr_in_use ->
supervisor_sets_error_state_on_gateway_addr_in_use) and keep the body
(HealthComponentGuard, spawn_component_supervisor call, and the snapshot
assertions) unchanged.

In `@clients/agent-runtime/src/service/mod.rs`:
- Around line 461-502: EnvGuard currently relies on the external invariant that
ENV_LOCK is held, which is unsafe and brittle; change EnvGuard to capture the
lock by storing the MutexGuard from ENV_LOCK (or alternately make EnvGuard::set
and EnvGuard::remove accept a &MutexGuard<'_, ()> parameter) so Drop can call
set_env/remove_env safely, and update call sites to pass the guard or construct
EnvGuard after locking. Also make all ENV_LOCK.lock() usages resilient to
poisoning by replacing .unwrap() with .unwrap_or_else(|e| e.into_inner()) so a
panicked test doesn't permanently poison the lock (apply this to tests like
current_username_prefers_user_env and
current_username_uses_logname_when_user_missing).
- Around line 40-52: The restart function currently calls
run_checked(Command::new("systemctl").args(["--user", "restart",
"corvus.service"])) and silently falls back to stop() + start() on failure;
change it to capture the Result from run_checked, and when it is Err include a
diagnostic log (e.g., log::warn! or eprintln!) that prints a clear message plus
the error/output from the failed systemctl invocation (referencing restart,
run_checked, and the Command::new("systemctl").args(...) call), then proceed
with the existing stop(config)?; start(config) fallback so the fallback remains
but is no longer silent.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Disabled knowledge base sources:

  • Linear integration is disabled

You can enable these sources in your CodeRabbit configuration.

📥 Commits

Reviewing files that changed from the base of the PR and between ee8e1db and 1fedaff.

📒 Files selected for processing (3)
  • clients/agent-runtime/src/daemon/mod.rs
  • clients/agent-runtime/src/health/mod.rs
  • clients/agent-runtime/src/service/mod.rs

@yacosta738 yacosta738 merged commit 7af6bf3 into main Feb 23, 2026
15 checks passed
@yacosta738 yacosta738 deleted the feature/update-linger-command branch February 23, 2026 10:35
@yacosta738 yacosta738 mentioned this pull request Mar 16, 2026
This was referenced Apr 19, 2026
This was referenced Apr 29, 2026
@dallay-bot dallay-bot Bot mentioned this pull request May 3, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant