fix(rook): implement operational doctor diagnostics by yacosta738 · Pull Request #709 · dallay/corvus

yacosta738 · 2026-04-28T06:27:19Z

Related Issues

fixes #679

Summary

- Implement production-focused `rook doctor` diagnostics with structured pass/warn/fail output and actionable operator guidance.
- Reuse the shared effective config and startup-equivalent database readiness checks, including inbound auth validation and dashboard asset validation.
- Add doctor-focused test coverage, sync the gateway spec, and archive the completed OpenSpec change.

Tested Information

cargo fmt --all -- --check
cargo clippy --all-targets -- -D warnings
cargo test

Documentation Impact

Docs updated in:
- openspec/specs/gateway/spec.md
- openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/verify-report.md
- openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/proposal.md
- openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/design.md
- openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/tasks.md
- openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/specs/gateway/spec.md
No docs update required because:
I verified the documentation matches the current behavior.

Breaking Changes

None.

Checklist

I have checked that there isn’t already a PR solving the same problem.
I have read the Contributing Guidelines.
I ensured my code follows the project's style guidelines.
I have added or updated tests that prove my fix is effective or that my feature works.
I have updated the documentation, or I explained above why no documentation update is needed.
I verified the documentation matches the current behavior.
I have documented any breaking changes in the Breaking Changes section.
I have linked the related issue (if any).

coderabbitai · 2026-04-28T06:27:33Z

📝 Walkthrough

Summary by CodeRabbit

New Features
- Enhanced rook doctor with structured pass/warn/fail diagnostics, summaries, details and actionable guidance.
- Startup-readiness checks: database open/migration readiness with remediation guidance, effective bind-target reporting, inbound-auth state (redacted) and dashboard asset availability.
- New startup-readiness snapshot API (`diagnose_startup_readiness()`) exposed for diagnostics.
Tests
- End-to-end and unit tests covering success and failure scenarios, token redaction, and asset overrides.

Walkthrough

Refactors effective-config assembly and validation into a reusable seam, adds startup-readiness diagnostics for DB/registry/server/assets/auth, updates doctor to emit structured check results (summary/guidance/details) and advisory checks, and adds tests and docs for deterministic local-first operational diagnostics.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Config & bind helpers `clients/rook/src/config/mod.rs` | Split validation into `validate_non_auth()` + auth validation; added `assemble_effective_config()` and refactored `load_effective_config()` to validate after assembly; added `effective_bind_target()` and `InboundAuthOperatorState::summary()`. |
| Startup readiness (DB & registry) `clients/rook/src/db/mod.rs`, `clients/rook/src/registry/mod.rs` | Added `DbStartupReadiness` type, `SqliteDb::check_startup_readiness()` and `readiness_guidance()`; exposed `RookRegistry::check_startup_readiness()` delegating to DB. |
| Server diagnostics API `clients/rook/src/server/mod.rs` | Added `StartupReadinessSnapshot` and `diagnose_startup_readiness()` which computes bind target, requests DB readiness, checks assets and inbound-auth operator state. |
| Doctor CLI & reporting `clients/rook/src/doctor.rs`, `clients/rook/src/main.rs` | Reworked `DoctorCheckResult` → structured `summary/guidance/details`; `DoctorReport` gains `advisory_checks`; doctor now uses `assemble_effective_config()` and `diagnose_startup_readiness()`; rendering/ensure semantics updated and tests adjusted. |
| Dashboard assets override (tests) `clients/rook/src/dashboard/mod.rs` | Added thread-safe `OnceLock<Mutex<Option<bool>>>` override and `AssetsReadyOverrideGuard` for test-controlled `assets_ready()` behavior. |
| Tests `clients/rook/tests/doctor_operational_diagnostics.rs`, `clients/rook/src/db/mod.rs` (tests) | Added end-to-end doctor tests covering pass/fail/advisory scenarios; added DB startup-readiness unit tests and guidance assertions. |
| Spec & docs `openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/*`, `openspec/specs/gateway/spec.md` | Added proposal, design, tasks, verify report and updated gateway spec mandating a shared effective-config/doctor seam, structured reporting, and secret-redaction constraints. |

Sequence Diagram

sequenceDiagram
    actor User
    participant Doctor as "rook doctor"
    participant Config as "RookConfig / assemble_effective_config"
    participant Server as "diagnose_startup_readiness"
    participant Registry as "RookRegistry"
    participant DB as "SqliteDb"
    participant Dashboard as "DashboardAssets"

    User->>Doctor: invoke
    Doctor->>Config: assemble_effective_config(input)
    Config->>Config: apply defaults/file/env/cli (no validation)
    Config->>Config: validate_non_auth()
    Config->>Config: inbound_auth.validate()
    Config-->>Doctor: effective RookConfig
    Doctor->>Server: diagnose_startup_readiness(&config)
    Server->>Config: effective_bind_target()
    Server->>Registry: check_startup_readiness(db_path)
    Registry->>DB: check_startup_readiness(path)
    DB->>DB: open/create + migrations
    DB-->>Registry: DbStartupReadiness {opened,guidance}
    Registry-->>Server: DbStartupReadiness
    Server->>Dashboard: assets_ready()
    Dashboard-->>Server: bool
    Server-->>Doctor: StartupReadinessSnapshot
    Doctor->>Doctor: build DoctorReport (summary/guidance/details + advisory_checks)
    Doctor-->>User: render_report() + exit code
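
The assemble-then-validate ordering shown in the diagram can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `ConfigLayer`, `EffectiveConfig`, the default host and port, and the validation rules are all hypothetical stand-ins. Only the ordering (merge defaults/file/env/CLI first, validate afterward) comes from the walkthrough.

```rust
/// Hypothetical partial config representing one source (file, env, or CLI).
#[derive(Default, Clone)]
struct ConfigLayer {
    host: Option<String>,
    port: Option<u16>,
}

/// Hypothetical fully resolved config.
#[derive(Debug, PartialEq)]
struct EffectiveConfig {
    host: String,
    port: u16,
}

/// Merge layers in precedence order (later layers win) without validating,
/// so a diagnostic tool can still report what the effective values were.
fn assemble_effective_config(layers: &[ConfigLayer]) -> EffectiveConfig {
    // Assumed defaults, for illustration only.
    let mut cfg = EffectiveConfig {
        host: "127.0.0.1".to_string(),
        port: 4141,
    };
    for layer in layers {
        if let Some(host) = &layer.host {
            cfg.host = host.clone();
        }
        if let Some(port) = layer.port {
            cfg.port = port;
        }
    }
    cfg
}

/// Validation runs as a separate step after assembly (rules are assumptions).
fn validate_non_auth(cfg: &EffectiveConfig) -> Result<(), String> {
    if cfg.host.trim().is_empty() {
        return Err("host must not be empty".to_string());
    }
    if cfg.port == 0 {
        return Err("port must be non-zero".to_string());
    }
    Ok(())
}
```

Keeping assembly free of validation means a caller such as a doctor command can still print the resolved values when validation later fails.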

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

fix(rook): add config export and env overrides #707: Touches same config resolution / load_effective_config surface and may overlap on assemble/validate behavior.

Suggested reviewers

yuniel-acosta

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 47.92% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | Title follows Conventional Commit format with 'fix' prefix, clear imperative description, and is 51 characters, well under the 72-character limit. |
| Description check | ✅ Passed | PR description is well-structured, includes related issue reference, summary, testing steps, documentation impact, breaking changes section, and completed checklist per template. |
| Linked Issues check | ✅ Passed | All code changes directly address `#679` objectives: database readiness checks, effective config validation, inbound auth validation, dashboard asset validation, structured pass/warn/fail output, and both happy-path and failure test coverage implemented. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to `#679` requirements: config refactoring (validation split, effective assembly), database readiness API, dashboard asset override, doctor diagnostics enhancement, server startup readiness snapshot, and associated tests and documentation; no unrelated changes detected. |


cloudflare-workers-and-pages · 2026-04-28T06:27:50Z

Deploying corvus with Cloudflare Pages

Latest commit:	`78f3476`
Status:	✅ Deploy successful!
Preview URL:	https://8af1c729.corvus-42x.pages.dev
Branch Preview URL:	https://rook.corvus-42x.pages.dev


github-actions · 2026-04-28T06:28:40Z

✅ Contributor Report

User: @yacosta738
Status: Passed (12/13 metrics passed)

| Metric | Description | Value | Threshold | Status |
| --- | --- | --- | --- | --- |
| PR Merge Rate | PRs merged vs closed | 92% | >= 30% | ✅ |
| Repo Quality | Repos with ≥100 stars | 0 | >= 0 | ✅ |
| Positive Reactions | Positive reactions received | 9 | >= 1 | ✅ |
| Negative Reactions | Negative reactions received | 0 | <= 5 | ✅ |
| Account Age | GitHub account age | 3104 days | >= 30 days | ✅ |
| Activity Consistency | Regular activity over time | 108% | >= 0% | ✅ |
| Issue Engagement | Issues with community engagement | 0 | >= 0 | ✅ |
| Code Reviews | Code reviews given to others | 604 | >= 0 | ✅ |
| Merger Diversity | Unique maintainers who merged PRs | 2 | >= 0 | ✅ |
| Repo History Merge Rate | Merge rate in this repo | 93% | >= 0% | ✅ |
| Repo History Min PRs | Previous PRs in this repo | 289 | >= 0 | ✅ |
| Profile Completeness | Profile richness (bio, followers) | 90 | >= 0 | ✅ |
| Suspicious Patterns | Spam-like activity detection | 1 | N/A | ❌ |

Contributor Report evaluates based on public GitHub activity. Analysis period: 2025-04-28 to 2026-04-28

coderabbitai

Actionable comments posted: 9

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@clients/rook/src/config/mod.rs`:
- Around line 178-180: The effective_bind_target method constructs "host:port"
naively causing IPv6 ambiguity; update effective_bind_target to detect if
self.host contains ':' (or is a literal IPv6) and, when so, wrap the host in
square brackets before appending ":{port}" so IPv6 like "::1" becomes
"[::1]:4141"; modify the function that returns String (effective_bind_target) to
perform this check using self.host and self.port and return the bracketed form
for IPv6 and unchanged host:port for IPv4/hostname.
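
The bracketing rule described for `effective_bind_target` could look roughly like the sketch below. It assumes a free function named `format_bind_target` (hypothetical; the real method reads `self.host` and `self.port`), and the already-bracketed check is an extra assumption not stated in the comment.

```rust
/// Hypothetical free-function version of the bind-target formatter.
/// A colon in the host is treated as an IPv6 literal and wrapped in brackets,
/// unless the host already arrives bracketed (assumed behavior).
fn format_bind_target(host: &str, port: u16) -> String {
    if host.contains(':') && !host.starts_with('[') {
        format!("[{host}]:{port}")
    } else {
        format!("{host}:{port}")
    }
}
```

With this rule, `format_bind_target("::1", 4141)` yields `[::1]:4141`, while IPv4 addresses and hostnames keep the plain `host:port` form.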

In `@clients/rook/src/dashboard/mod.rs`:
- Around line 33-37: The current code calls .lock().expect("assets override lock
should work") on the Mutex inside ASSETS_READY_OVERRIDE which will panic if the
lock is poisoned; instead recover the inner guard on poison to avoid crashing
diagnostics: replace the .lock().expect(...) usage (and the analogous
occurrences around the 49-51 range) with handling that calls
.lock().unwrap_or_else(|poison| poison.into_inner()) (or the equivalent
map_or_else on the result) so you still get the locked guard even if the mutex
was poisoned, then continue using .as_ref() as before.
- Around line 47-52: The current set_assets_ready_override mutates
process-global ASSETS_READY_OVERRIDE and requires manual reset, which leaks
state; replace this with a scoped guard pattern: introduce an
AssetsReadyOverrideGuard (or similar) whose constructor acquires
ASSETS_READY_OVERRIDE, sets the override value, and whose Drop implementation
resets the override to None, then change callers to use
AssetsReadyOverrideGuard::new(value) (or provide a convenience
with_scoped_assets_ready_override) instead of calling set_assets_ready_override
directly; update or add tests to use the guard so overrides are automatically
cleared even if panics occur and remove or deprecate the standalone
set_assets_ready_override to avoid accidental global leaks.
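
The poison-recovery idea from the first dashboard comment can be illustrated with a small standalone sketch. The static mirrors the `OnceLock<Mutex<Option<bool>>>` shape named in the walkthrough, but `slot`, `read_override`, and `poison_for_demo` are hypothetical helpers written only for this example.

```rust
use std::sync::{Mutex, OnceLock};
use std::thread;

static ASSETS_READY_OVERRIDE: OnceLock<Mutex<Option<bool>>> = OnceLock::new();

/// Hypothetical accessor for the lazily initialized override slot.
fn slot() -> &'static Mutex<Option<bool>> {
    ASSETS_READY_OVERRIDE.get_or_init(|| Mutex::new(None))
}

/// Read the override without panicking when the mutex is poisoned:
/// `PoisonError::into_inner` hands back the guard so the value stays readable.
fn read_override() -> Option<bool> {
    *slot()
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Demo only: poison the mutex by panicking in another thread while holding it.
fn poison_for_demo() {
    let _ = thread::spawn(|| {
        let mut guard = slot()
            .lock()
            .unwrap_or_else(|poisoned| poisoned.into_inner());
        *guard = Some(false);
        panic!("simulated panic while holding the override lock");
    })
    .join();
}
```

After `poison_for_demo()` runs, `slot().is_poisoned()` is true, yet `read_override()` still returns the last written value instead of panicking, which is the behavior the comment asks for in diagnostics code.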

In `@clients/rook/src/doctor.rs`:
- Around line 257-266: The advisory block in render logic (the loop over
report.advisory_checks) only emits name, status and summary but omits actionable
fields; update the loop that builds lines for advisory checks (iterating over
report.advisory_checks in doctor.rs) to also append the advisory check's details
and guidance when present—e.g., after the existing formatted summary line add
conditional lines for check.details and check.guidance (or
render_details/render_guidance helpers if available) so warn/fail advisories
surface actionable text to operators.
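
A rough sketch of the requested rendering change, under stated assumptions: the `DoctorCheckResult` fields follow the summary/guidance/details structure named in the walkthrough, but their exact types, the `status` representation, and the output format are all guesses made for illustration.

```rust
/// Hypothetical shape of a check result; field types are assumptions.
struct DoctorCheckResult {
    name: String,
    status: &'static str, // e.g. "pass", "warn", or "fail"
    summary: String,
    details: Vec<String>,
    guidance: Option<String>,
}

/// Render advisory checks, including details and guidance when present,
/// so warn/fail advisories carry actionable text.
fn render_advisory_checks(checks: &[DoctorCheckResult]) -> Vec<String> {
    let mut lines = Vec::new();
    for check in checks {
        lines.push(format!("[{}] {}: {}", check.status, check.name, check.summary));
        for detail in &check.details {
            lines.push(format!("  detail: {detail}"));
        }
        if let Some(guidance) = &check.guidance {
            lines.push(format!("  guidance: {guidance}"));
        }
    }
    lines
}
```

Checks with no details and no guidance still render as a single line, so passing advisories stay compact.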

In `@clients/rook/tests/doctor_operational_diagnostics.rs`:
- Around line 38-57: The test
doctor_enabled_inbound_auth_without_token_reports_inbound_auth_failure relies on
default DB behavior causing nondeterministic failures; create and initialize a
temporary DB (as used by the happy-path test) and inject its path into the
environment (e.g., set ROOK_DB_PATH or the same config key the suite uses)
before calling run_with_config_path(None, &env) so the test isolates
inbound-auth errors; ensure you reference the temp DB lifecycle in the test and
remove any reliance on the default DB path so the inbound_auth check is the only
failure source.
- Around line 85-93: The test sets global state with
rook::dashboard::set_assets_ready_override(Some(false)) but restores it only
after operations, which won't run if expect_err panics; replace the manual
restore with a panic-safe drop guard: capture the previous override
(rook::dashboard::get_assets_ready_override() or call set_assets_ready_override
to read), then create a small RAII guard struct (e.g., AssetsReadyOverrideGuard)
that sets the override to Some(false) on creation and restores the previous
value in Drop, and use that guard around the run_with_config_path/ensure_success
calls so the original state is restored even on panic; update the test to remove
the final set_assets_ready_override(None) and rely on the guard's Drop.

In
`@openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/proposal.md`:
- Line 5: The sentence in the proposal contains nonstandard phrasing "needs
answered"; update that phrase to a standard form such as "needs to be answered"
(or "must be answered") so the line reads e.g. "`rook doctor` already exists,
but today it only covers a narrow subset of the operational questions an
operator needs to be answered before putting Rook into service." Locate and edit
the phrase "needs answered" in the proposal text and replace with the chosen
standard phrasing, preserving the rest of the sentence.

In
`@openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/state.yaml`:
- Around line 4-19: Update the top-level status key in the state.yaml so it
matches the completed phases: change the top-level "status" value from "planned"
to "completed" (or "archived" if you want to indicate archival) so the top-level
"status" aligns with the "phases" mapping where explore, propose, spec, design,
tasks, apply, and verify are all "completed".

In
`@openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/verify-report.md`:
- Line 1: The markdown's first heading uses level 2 ("## Verification Report")
which violates markdownlint MD041; change that first heading to a top-level
heading by replacing "## Verification Report" with "# Verification Report" so
the document begins with an H1.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 6b8a7e26-045c-4955-b65a-7f3bc0179168

📥 Commits

Reviewing files that changed from the base of the PR and between 83a43f5 and a5184e5.

📒 Files selected for processing (15)

clients/rook/src/config/mod.rs
clients/rook/src/dashboard/mod.rs
clients/rook/src/db/mod.rs
clients/rook/src/doctor.rs
clients/rook/src/main.rs
clients/rook/src/registry/mod.rs
clients/rook/src/server/mod.rs
clients/rook/tests/doctor_operational_diagnostics.rs
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/design.md
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/proposal.md
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/specs/gateway/spec.md
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/state.yaml
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/tasks.md
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/verify-report.md
openspec/specs/gateway/spec.md

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)

GitHub Check: report / Contributor Quality Report
GitHub Check: sonar
GitHub Check: pr-checks
GitHub Check: submit-gradle
GitHub Check: Cloudflare Pages

🧰 Additional context used

📓 Path-based instructions (3)

**/*

⚙️ CodeRabbit configuration file

**/*: Security first, performance second.
Validate input boundaries, auth/authz implications, and secret management.
Look for behavioral regressions, missing tests, and contract breaks across modules.

Files:

openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/state.yaml
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/verify-report.md
clients/rook/src/registry/mod.rs
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/proposal.md
clients/rook/src/dashboard/mod.rs
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/design.md
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/tasks.md
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/specs/gateway/spec.md
clients/rook/src/main.rs
clients/rook/src/server/mod.rs
clients/rook/tests/doctor_operational_diagnostics.rs
clients/rook/src/db/mod.rs
clients/rook/src/config/mod.rs
openspec/specs/gateway/spec.md
clients/rook/src/doctor.rs

**/*.{md,mdx}

⚙️ CodeRabbit configuration file

**/*.{md,mdx}: Verify technical accuracy and that docs stay aligned with code changes.
For user-facing docs, check EN/ES parity or explicitly note pending translation gaps.

Files:

openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/verify-report.md
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/proposal.md
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/design.md
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/tasks.md
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/specs/gateway/spec.md
openspec/specs/gateway/spec.md

**/*.rs

⚙️ CodeRabbit configuration file

**/*.rs: Focus on Rust idioms, memory safety, and ownership/borrowing correctness.
Flag unnecessary clones, unchecked panics in production paths, and weak error context.
Prioritize unsafe blocks, FFI boundaries, concurrency races, and secret handling.

Files:

clients/rook/src/registry/mod.rs
clients/rook/src/dashboard/mod.rs
clients/rook/src/main.rs
clients/rook/src/server/mod.rs
clients/rook/tests/doctor_operational_diagnostics.rs
clients/rook/src/db/mod.rs
clients/rook/src/config/mod.rs
clients/rook/src/doctor.rs

🧠 Learnings (5)

📚 Learning: 2026-02-17T12:31:17.076Z

Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Include threat/risk notes and rollback strategy for security, runtime, and gateway changes; add or update tests for boundary checks and failure modes

Applied to files:

openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/proposal.md
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/specs/gateway/spec.md

📚 Learning: 2026-02-17T12:31:17.076Z

Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools,config}/**/*.rs : Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable

Applied to files:

openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/specs/gateway/spec.md
clients/rook/src/main.rs
openspec/specs/gateway/spec.md

📚 Learning: 2026-02-17T12:31:17.076Z

Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/main.rs : Preserve CLI contract unless change is intentional and documented; prefer explicit errors over silent fallback for unsupported critical paths

Applied to files:

clients/rook/src/main.rs

📚 Learning: 2026-02-17T12:31:17.076Z

Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/**/*.rs : Run `cargo fmt --all -- --check`, `cargo clippy --all-targets -- -D warnings`, and `cargo test` for code validation, or document which checks were skipped and why

Applied to files:

clients/rook/src/main.rs
clients/rook/tests/doctor_operational_diagnostics.rs
clients/rook/src/doctor.rs

📚 Learning: 2026-02-17T12:31:17.076Z

Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/channels/**/*.rs : Implement `Channel` trait in `src/channels/` with consistent `send`, `listen`, and `health_check` semantics and cover auth/allowlist/health behavior with tests

Applied to files:

clients/rook/src/main.rs

🪛 LanguageTool

openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/proposal.md

[style] ~5-~5: The double modal “needs answered” is nonstandard (only accepted in certain dialects). Consider “to be answered”.
Context: ...operational questions an operator needs answered before putting Rook into service. It va...

(NEEDS_FIXED)

openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/specs/gateway/spec.md

[style] ~66-~66: Consider a more concise word here.
Context: ...tion, or other network-dependent checks in order to determine overall success. Doctor cove...

(IN_ORDER_TO_PREMIUM)

🪛 markdownlint-cli2 (0.22.1)

openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/verify-report.md

[warning] 1-1: First line in a file should be a top-level heading

(MD041, first-line-heading, first-line-h1)

openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/design.md

[warning] 1-1: First line in a file should be a top-level heading

(MD041, first-line-heading, first-line-h1)

🔇 Additional comments (4)

clients/rook/src/main.rs (1)

733-736: Stronger doctor contract assertions look good.

These updates improve validation of structured detail output and failure wording while preserving secret-safety checks.

Also applies to: 777-810

clients/rook/src/registry/mod.rs (1)

61-63: Startup-readiness seam delegation is clean.

Keeping readiness evaluation in SqliteDb avoids duplication and keeps registry behavior cohesive.

clients/rook/src/server/mod.rs (1)

82-104: Startup readiness snapshot extraction is well-factored.

This keeps doctor/startup checks aligned without coupling diagnostics to socket bind flow.

clients/rook/src/db/mod.rs (1)

121-151: DB startup-readiness diagnostics and failure-mode tests are solid.

The readiness probe and guidance helper are backed by practical failing-scenario coverage, which improves operator actionability.

Also applies to: 293-372

coderabbitai · 2026-04-28T06:37:12Z

+pub fn set_assets_ready_override(value: Option<bool>) {
+    *ASSETS_READY_OVERRIDE
+        .get_or_init(|| Mutex::new(None))
+        .lock()
+        .expect("assets override lock should work") = value;
+}


⚠️ Potential issue | 🟠 Major

Global override API is leak-prone and can cause order-dependent test failures.

set_assets_ready_override relies on manual reset, so a panic/early-return can leave stale process-global state that affects other tests.

Suggested refactor (scoped override guard)

+#[doc(hidden)]
+pub struct AssetsReadyOverrideGuard {
+    previous: Option<bool>,
+}
+
+impl Drop for AssetsReadyOverrideGuard {
+    fn drop(&mut self) {
+        if let Ok(mut slot) = ASSETS_READY_OVERRIDE
+            .get_or_init(|| Mutex::new(None))
+            .lock()
+        {
+            *slot = self.previous;
+        }
+    }
+}
+
+#[doc(hidden)]
+pub fn scoped_assets_ready_override(value: Option<bool>) -> AssetsReadyOverrideGuard {
+    let mut slot = ASSETS_READY_OVERRIDE
+        .get_or_init(|| Mutex::new(None))
+        .lock()
+        .unwrap_or_else(|poisoned| poisoned.into_inner());
+    let previous = *slot;
+    *slot = value;
+    AssetsReadyOverrideGuard { previous }
+}

As per coding guidelines "Security first, performance second... Look for behavioral regressions, missing tests, and contract breaks across modules."


coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@clients/rook/tests/doctor_operational_diagnostics.rs`:
- Around line 4-16: The temp DB created in initialized_db_env via
RookRegistry::open is never removed; change initialized_db_env to create the DB
inside a tempfile-managed resource (e.g., tempfile::NamedTempFile or
tempfile::TempDir), use its path with RookRegistry::open, and keep the
tempfile/TempDir guard alive for the duration of the test (for example return
the guard alongside the HashMap or wrap both in a small struct) so the OS file
is removed on drop; ensure the ROOK_DB_PATH value still points to the
guard.path().to_string_lossy() result and use the initialized_db_env and
RookRegistry::open symbols to locate the change.
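
One way to picture the guard-alongside-env pattern, using only the standard library: `TempDbDir` below is a hypothetical stand-in for `tempfile::TempDir`, the `label` parameter and directory naming are assumptions, and the empty-file write replaces the real `RookRegistry::open` call, which this sketch does not attempt to reproduce.

```rust
use std::collections::HashMap;
use std::fs;
use std::path::PathBuf;

/// Hypothetical stdlib-only stand-in for `tempfile::TempDir`:
/// owns a directory and removes it when dropped.
struct TempDbDir {
    path: PathBuf,
}

impl TempDbDir {
    fn new(label: &str) -> std::io::Result<Self> {
        // Directory naming is an assumption made for this sketch.
        let path = std::env::temp_dir()
            .join(format!("rook-doctor-{label}-{}", std::process::id()));
        fs::create_dir_all(&path)?;
        Ok(Self { path })
    }

    fn db_path(&self) -> PathBuf {
        self.path.join("rook.db")
    }
}

impl Drop for TempDbDir {
    fn drop(&mut self) {
        // Best-effort cleanup; errors are ignored during drop.
        let _ = fs::remove_dir_all(&self.path);
    }
}

/// Return the env map together with the guard so the caller keeps the
/// directory alive for the duration of the test.
fn initialized_db_env(label: &str) -> std::io::Result<(TempDbDir, HashMap<String, String>)> {
    let dir = TempDbDir::new(label)?;
    // The real helper would initialize the DB via `RookRegistry::open`;
    // here an empty file stands in for it.
    fs::write(dir.db_path(), b"")?;
    let mut env = HashMap::new();
    env.insert(
        "ROOK_DB_PATH".to_string(),
        dir.db_path().to_string_lossy().into_owned(),
    );
    Ok((dir, env))
}
```

A test would bind the guard to a named variable (not `_`, which drops immediately) so the file survives until the test body finishes.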


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f556d42b-dec3-4a3b-bc05-40a506ee5078

📥 Commits

Reviewing files that changed from the base of the PR and between a5184e5 and 14e1721.

📒 Files selected for processing (7)

clients/rook/src/config/mod.rs
clients/rook/src/dashboard/mod.rs
clients/rook/src/doctor.rs
clients/rook/tests/doctor_operational_diagnostics.rs
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/proposal.md
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/state.yaml
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/verify-report.md

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: pr-checks
GitHub Check: sonar
GitHub Check: Cloudflare Pages

🧰 Additional context used

📓 Path-based instructions (3)

**/*

⚙️ CodeRabbit configuration file

**/*: Security first, performance second.
Validate input boundaries, auth/authz implications, and secret management.
Look for behavioral regressions, missing tests, and contract breaks across modules.

Files:

openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/state.yaml
clients/rook/src/dashboard/mod.rs
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/verify-report.md
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/proposal.md
clients/rook/src/config/mod.rs
clients/rook/tests/doctor_operational_diagnostics.rs
clients/rook/src/doctor.rs

**/*.rs

⚙️ CodeRabbit configuration file

**/*.rs: Focus on Rust idioms, memory safety, and ownership/borrowing correctness.
Flag unnecessary clones, unchecked panics in production paths, and weak error context.
Prioritize unsafe blocks, FFI boundaries, concurrency races, and secret handling.

Files:

clients/rook/src/dashboard/mod.rs
clients/rook/src/config/mod.rs
clients/rook/tests/doctor_operational_diagnostics.rs
clients/rook/src/doctor.rs

**/*.{md,mdx}

⚙️ CodeRabbit configuration file

**/*.{md,mdx}: Verify technical accuracy and that docs stay aligned with code changes.
For user-facing docs, check EN/ES parity or explicitly note pending translation gaps.

Files:

openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/verify-report.md
openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/proposal.md

🧠 Learnings (5)

📚 Learning: 2026-02-17T12:31:17.076Z

Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/main.rs : Preserve CLI contract unless change is intentional and documented; prefer explicit errors over silent fallback for unsupported critical paths

Applied to files:

clients/rook/src/dashboard/mod.rs
clients/rook/tests/doctor_operational_diagnostics.rs

📚 Learning: 2026-02-17T12:31:17.076Z

Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/{security,gateway,tools,config}/**/*.rs : Do not silently weaken security policy or access constraints; keep default behavior secure-by-default with deny-by-default where applicable

Applied to files:

clients/rook/src/dashboard/mod.rs

📚 Learning: 2026-02-17T12:31:17.076Z

Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/src/tools/**/*.rs : Implement `Tool` trait in `src/tools/` with strict parameter schema, validate and sanitize all inputs, and return structured `ToolResult` without panics in runtime path

Applied to files:

clients/rook/src/dashboard/mod.rs

📚 Learning: 2026-02-17T12:31:17.076Z

Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Include threat/risk notes and rollback strategy for security, runtime, and gateway changes; add or update tests for boundary checks and failure modes

Applied to files:

openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/proposal.md

📚 Learning: 2026-02-17T12:31:17.076Z

Learnt from: CR
Repo: dallay/corvus PR: 0
File: clients/agent-runtime/AGENTS.md:0-0
Timestamp: 2026-02-17T12:31:17.076Z
Learning: Applies to clients/agent-runtime/**/*.rs : Run `cargo fmt --all -- --check`, `cargo clippy --all-targets -- -D warnings`, and `cargo test` for code validation, or document which checks were skipped and why

Applied to files:

clients/rook/tests/doctor_operational_diagnostics.rs
clients/rook/src/doctor.rs

🔇 Additional comments (15)

clients/rook/src/doctor.rs (5)

257-275: Advisory checks now render full details.

Previous review concern addressed: advisory checks now render details and guidance fields, not just name/status/summary.

49-93: Solid early-return on config failure.

Config validation failure returns immediately with actionable guidance rather than proceeding with subsequent checks that would fail or produce confusing output. The effective bind target is still reported even on failure for debugging context.

95-121: Startup-equivalent DB readiness check.

Using server::diagnose_startup_readiness() ensures doctor validates the same DB initialization path that serve uses, preventing false confidence from a weaker read-only check.

171-211: Secret-safe inbound auth reporting.

The inbound_auth_check_result function uses auth_state.summary() which returns only static strings like "enabled with token configured" — the actual token value is never included in summary, guidance, or details fields.

123-144: Assets check provides actionable guidance.

Clear failure messaging with specific remediation steps ("rebuild the production binary with embedded dashboard assets") helps operators understand what action to take.

openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/verify-report.md (1)

1-81: LGTM!

The verification report is well-structured with clear completeness metrics, command outcomes, and spec compliance evidence. The previous MD041 heading issue has been addressed.

openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/proposal.md (1)

1-121: LGTM!

The proposal is comprehensive with clear scope boundaries, risk assessment, and rollback strategy. The previous grammar issue ("needs answered") has been corrected.

openspec/changes/archive/2026-04-28-rook-doctor-operational-diagnostics-679/state.yaml (1)

1-29: LGTM!

The previous status inconsistency has been resolved. Top-level status now correctly shows completed to match all completed phases.

clients/rook/src/dashboard/mod.rs (1)

30-61: LGTM! Past issues addressed.

The poison-safe lock handling and RAII guard pattern resolve previous review concerns. The guard sets override on construction and clears to None on drop, ensuring cleanup even on panic.

Note: Nested guards would interfere (inner drop clears outer's value), but since this is #[doc(hidden)] and test-only, that's acceptable for current usage.
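
The guard behaviour can be sketched roughly as follows. Only the name `AssetsReadyOverrideGuard` comes from this PR; the static, the `lock_override` helper, and the `assets_ready_override` accessor are assumptions made for illustration.

```rust
use std::sync::{Mutex, MutexGuard};

// Test-only override slot; `None` means "use the real asset check".
static ASSETS_READY_OVERRIDE: Mutex<Option<bool>> = Mutex::new(None);

// Poison-safe access: if another thread panicked while holding the lock,
// recover the inner value instead of propagating the panic.
fn lock_override() -> MutexGuard<'static, Option<bool>> {
    ASSETS_READY_OVERRIDE
        .lock()
        .unwrap_or_else(|poisoned| poisoned.into_inner())
}

/// Sets the override on construction and clears it on drop.
pub struct AssetsReadyOverrideGuard(());

impl AssetsReadyOverrideGuard {
    pub fn new(ready: bool) -> Self {
        *lock_override() = Some(ready);
        AssetsReadyOverrideGuard(())
    }
}

impl Drop for AssetsReadyOverrideGuard {
    fn drop(&mut self) {
        // Runs during unwinding too, so a panicking test still cleans up.
        *lock_override() = None;
    }
}

/// Reads the current override (hypothetical accessor).
pub fn assets_ready_override() -> Option<bool> {
    *lock_override()
}
```

Because `drop` always writes `None` rather than restoring a previous value, dropping an inner guard also clears an outer guard's override, which is the nesting limitation noted above.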

clients/rook/tests/doctor_operational_diagnostics.rs (2)

37-58: LGTM!

Previous review concerns addressed: test now uses initialized_db_env() to isolate inbound-auth failure from unrelated DB conditions.

83-105: LGTM!

Previous panic-safety concern resolved. The test now uses AssetsReadyOverrideGuard for RAII-based cleanup instead of manual set_assets_ready_override(None).

clients/rook/src/config/mod.rs (4)

178-184: LGTM!

IPv6 bind-target formatting issue from previous review is resolved. The implementation correctly wraps IPv6 addresses in brackets.
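
For context, a small sketch of bracketed IPv6 formatting. The function name `format_bind_target` and its signature are assumptions; the actual code at lines 178-184 is not shown in this thread.

```rust
use std::net::IpAddr;

/// Formats a host and port as a bind target, wrapping IPv6 literals in
/// brackets so the port separator is unambiguous.
pub fn format_bind_target(host: &str, port: u16) -> String {
    match host.parse::<IpAddr>() {
        Ok(IpAddr::V6(addr)) => format!("[{addr}]:{port}"),
        // IPv4 literals and hostnames need no brackets.
        _ => format!("{host}:{port}"),
    }
}
```

Without the brackets, `::1:8080` would be ambiguous between an address and an address-plus-port pair.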

235-258: Clean validation split.

Separating validate_non_auth() from validate() allows doctor to run diagnostics on partially-valid configs and still report useful auth-specific failures separately. Good design for progressive diagnostics.
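
A rough sketch of how such a split could look. The `Config` fields, the `validate_auth` helper, and the error strings are hypothetical; only `validate_non_auth()` and `validate()` are named in this review.

```rust
/// Simplified stand-in for the gateway configuration (hypothetical fields).
pub struct Config {
    pub port: u16,
    pub auth_enabled: bool,
    pub auth_token: Option<String>,
}

impl Config {
    /// Checks everything except inbound auth, so diagnostics can continue
    /// on a config whose only problem is auth-related.
    pub fn validate_non_auth(&self) -> Result<(), String> {
        if self.port == 0 {
            return Err("port must be non-zero".to_string());
        }
        Ok(())
    }

    /// Auth-only checks (hypothetical helper).
    fn validate_auth(&self) -> Result<(), String> {
        let has_token = self.auth_token.as_deref().is_some_and(|t| !t.is_empty());
        if self.auth_enabled && !has_token {
            return Err("inbound auth is enabled but no token is configured".to_string());
        }
        Ok(())
    }

    /// Full validation: non-auth checks first, then auth.
    pub fn validate(&self) -> Result<(), String> {
        self.validate_non_auth()?;
        self.validate_auth()
    }
}
```

A diagnostics command can call `validate_non_auth()` first and report the auth result as its own check, while a server startup path keeps calling `validate()`.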

872-880: Secret-safe operator state reporting.

summary() returns only static strings ("disabled", "enabled with token configured", etc.) — token value never exposed. This aligns with the spec requirement to "confirm that required credentials/config are present without leaking secret values."
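
The idea can be illustrated with a short sketch. The enum `InboundAuthState`, its variants, and the third summary string are assumptions; only the strings "disabled" and "enabled with token configured" appear in this review.

```rust
/// Hypothetical auth state; the token lives inside the variant but is never
/// formatted into any user-facing string.
pub enum InboundAuthState {
    Disabled,
    EnabledWithToken(String),
    // Hypothetical variant standing in for the "etc." in the comment above.
    EnabledMissingToken,
}

impl InboundAuthState {
    /// Returns a `&'static str`, so the secret cannot leak through this path:
    /// the return type cannot carry runtime data such as the token.
    pub fn summary(&self) -> &'static str {
        match self {
            Self::Disabled => "disabled",
            Self::EnabledWithToken(_) => "enabled with token configured",
            Self::EnabledMissingToken => "enabled but no token configured",
        }
    }
}
```

Returning `&'static str` instead of `String` turns the no-leak property into something the type signature documents, rather than something each call site has to remember.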

587-601: Shared config assembly seam.

assemble_effective_config provides the unvalidated config assembly that both load_effective_config and doctor can use. This ensures doctor evaluates the same effective configuration as serve, preventing drift.

coderabbitai · 2026-04-28T08:50:38Z

+async fn initialized_db_env() -> HashMap<String, String> {
+    let db_path = std::env::temp_dir().join(format!(
+        "rook-doctor-operational-{}.db",
+        uuid::Uuid::new_v4()
+    ));
+    rook::registry::RookRegistry::open(&db_path.to_string_lossy())
+        .await
+        .expect("test database should initialize");
+    HashMap::from([(
+        "ROOK_DB_PATH".to_string(),
+        db_path.to_string_lossy().to_string(),
+    )])
+}


🧹 Nitpick | 🔵 Trivial

Temp DB files persist after tests.

The RookRegistry::open() call creates the DB but there's no cleanup. These temp files accumulate over test runs.

Consider wrapping with a guard or using tempfile::NamedTempFile for automatic cleanup, though this is low priority for test code.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@clients/rook/tests/doctor_operational_diagnostics.rs` around lines 4 - 16, the temp DB created in initialized_db_env via RookRegistry::open is never removed. Change initialized_db_env to create the DB inside a tempfile-managed resource (e.g., tempfile::NamedTempFile or tempfile::TempDir), pass its path to RookRegistry::open, and keep the tempfile/TempDir guard alive for the duration of the test (for example, return the guard alongside the HashMap or wrap both in a small struct) so the file is removed on drop. Ensure the ROOK_DB_PATH value still points to the guard.path().to_string_lossy() result. Use the initialized_db_env and RookRegistry::open symbols to locate the change.
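
As a std-only illustration of the cleanup guard suggested here, the sketch below removes the file on drop. `TempDbFile` is a hypothetical helper, not part of this PR; the tempfile crate's `NamedTempFile` or `TempDir` would serve the same purpose.

```rust
use std::fs;
use std::path::{Path, PathBuf};

/// Owns a path under the system temp directory and deletes the file on drop.
pub struct TempDbFile {
    path: PathBuf,
}

impl TempDbFile {
    /// Reserves a path; the file itself is created later by whoever opens it.
    pub fn new(file_name: &str) -> Self {
        Self {
            path: std::env::temp_dir().join(file_name),
        }
    }

    pub fn path(&self) -> &Path {
        &self.path
    }
}

impl Drop for TempDbFile {
    fn drop(&mut self) {
        // Ignore errors: the file may never have been created.
        let _ = fs::remove_file(&self.path);
    }
}
```

A test helper would return this guard together with the environment map so the file lives exactly as long as the test. If the database engine writes sidecar files next to the main file, a directory-level guard such as `tempfile::TempDir` would be the safer choice.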

sonarqubecloud · 2026-04-28T09:15:54Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

fix(rook): implement operational doctor diagnostics

a5184e5

github-actions Bot added the size/xl label (denotes an extra large change size) Apr 28, 2026

coderabbitai Bot added area:rust area:docs risk:high risk:security labels Apr 28, 2026

coderabbitai Bot reviewed Apr 28, 2026

yacosta738 added 2 commits April 28, 2026 09:59

Merge branch 'main' into rook

6b88154

fix(rook): address review findings

14e1721

coderabbitai Bot removed area:rust area:docs risk:high risk:security labels Apr 28, 2026

Merge branch 'main' into rook

78f3476

coderabbitai Bot reviewed Apr 28, 2026

yacosta738 merged commit 6e92609 into main Apr 28, 2026
17 checks passed

yacosta738 deleted the rook branch April 28, 2026 09:29

This was referenced Apr 28, 2026

feat(rook): add production observability metrics #713

Merged

feat(rook): document operational health probes #764

Merged
