
feat(rook): operational parity – undercover mode, debug diagnostics, multi-provider routing controls #792

Merged
yacosta738 merged 2 commits into main from feature/operational-parity-538 on May 6, 2026

@yacosta738
Contributor

Related Issues

Closes #538


Summary

Implements operational parity for the Rook gateway (#538): undercover/redaction mode, debug diagnostics, and multi-provider routing controls.

What changed and why:

  • OperationalConfig (config/mod.rs) — new struct { undercover: bool, debug_diagnostics: bool } with TOML parsing, env overlay (ROOK_UNDERCOVER, ROOK_DEBUG_DIAGNOSTICS), partial overlay support, and a safe export view. Defaults: both false.
  • Config propagation: OperationalConfig threads through RookConfig → ServerConfig → GatewayState and AdminState so every layer has access without global state.
  • emit_gateway_debug / debug_diagnostics_enabled (gateway/handlers.rs) — safe debug emission helpers that gate on debug_diagnostics flag. Instrumented on both buffered and streaming execution paths. All events use normalize_model_label / normalize_vendor_label — bounded, never raw prompts, bodies, tokens, credentials, or paths.
  • /status endpoint — extended OperatorStatusView with nested OperationalStatusView { undercover, debug_diagnostics, redaction_baseline: "always_on" }. Redaction baseline is a static string — it cannot be toggled off.
  • Spec: openspec/specs/gateway/spec.md updated with the requirements from 🛡️ Operational Parity: Undercover, Multi-Provider, and Debugging #538 as source of truth.
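
For readers who want a concrete picture of the config layering described above, here is a minimal sketch, not the code from config/mod.rs. The two flags and the env variable names come from the summary; the fields of PartialOperationalConfig, the `overlay` method, `parse_bool_flag`, `overlay_from_env`, and the accepted boolean spellings are all assumptions made for illustration.

```rust
/// Operational flags from the summary; both default to false via `Default`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct OperationalConfig {
    pub undercover: bool,
    pub debug_diagnostics: bool,
}

/// Assumed shape of a partial overlay: `None` means "keep the current value".
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub struct PartialOperationalConfig {
    pub undercover: Option<bool>,
    pub debug_diagnostics: Option<bool>,
}

impl OperationalConfig {
    /// Apply an overlay field by field (hypothetical helper).
    pub fn overlay(self, partial: PartialOperationalConfig) -> Self {
        Self {
            undercover: partial.undercover.unwrap_or(self.undercover),
            debug_diagnostics: partial
                .debug_diagnostics
                .unwrap_or(self.debug_diagnostics),
        }
    }
}

/// Assumed boolean syntax for env values; the real parser may differ.
/// Unrecognized values yield `None` so they never flip a flag by accident.
pub fn parse_bool_flag(raw: &str) -> Option<bool> {
    match raw.trim().to_ascii_lowercase().as_str() {
        "1" | "true" | "yes" | "on" => Some(true),
        "0" | "false" | "no" | "off" => Some(false),
        _ => None,
    }
}

/// Build an overlay from a lookup function, so the sketch can be exercised
/// without mutating the process environment.
pub fn overlay_from_env<F>(lookup: F) -> PartialOperationalConfig
where
    F: Fn(&str) -> Option<String>,
{
    PartialOperationalConfig {
        undercover: lookup("ROOK_UNDERCOVER").and_then(|v| parse_bool_flag(&v)),
        debug_diagnostics: lookup("ROOK_DEBUG_DIAGNOSTICS")
            .and_then(|v| parse_bool_flag(&v)),
    }
}
```

In a real process the lookup would likely be `|key| std::env::var(key).ok()`; taking it as a parameter is a design choice of this sketch that keeps tests deterministic.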

Tested Information

  • 409 lib tests pass: cargo test --manifest-path clients/rook/Cargo.toml --lib
  • 1 pre-existing failure (metrics_route_counts_upstream_success_http_error_and_route_rejected_outcomes) confirmed broken on main before this branch — unrelated to these changes.
  • New regression tests added:
    • operational_config_propagates_to_admin_state — verifies config flows from ServerConfig through to AdminState
    • emit_gateway_debug_is_silent_when_debug_diagnostics_disabled — verifies no debug events fire when flag is off
  • cargo check clean; pre-commit and pre-push hooks pass.

Documentation Impact


Breaking Changes

None. All changes are additive:

  • New config fields default to false — existing deployments behave identically without config changes.
  • OperatorStatusView gains a new nested operational field — existing consumers ignoring unknown fields are unaffected.

Checklist

  • I have checked that there isn't already a PR solving the same problem.
  • I have read the Contributing Guidelines.
  • I ensured my code follows the project's style guidelines.
  • I have added or updated tests that prove my fix is effective or that my feature works.
  • I have updated the documentation, or I explained above why no documentation update is needed.
  • I verified the documentation matches the current behavior.
  • I have documented any breaking changes in the Breaking Changes section.
  • I have linked the related issue (if any).

…multi-provider routing controls

- Add OperationalConfig { undercover, debug_diagnostics } with TOML parsing,
  env overlay (ROOK_UNDERCOVER, ROOK_DEBUG_DIAGNOSTICS), partial overlay,
  and export view
- Thread OperationalConfig through RookConfig → ServerConfig → GatewayState
  and AdminState
- Add emit_gateway_debug / debug_diagnostics_enabled helpers in
  gateway/handlers.rs; instrument buffered and streaming execution paths
- Sanitise tracing warn/info with normalize_model_label on all hot paths
- Expose OperationalStatusView { undercover, debug_diagnostics,
  redaction_baseline: always_on } nested in OperatorStatusView via /status
- Add regression tests:
  operational_config_propagates_to_admin_state (server/mod.rs)
  emit_gateway_debug_is_silent_when_debug_diagnostics_disabled (gateway/handlers.rs)
- Update openspec/specs/gateway/spec.md with #538 requirements

Closes #538
@coderabbitai
Contributor

coderabbitai Bot commented May 6, 2026

Warning

Rate limit exceeded

@yacosta738 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 43 minutes and 27 seconds before requesting another review.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 40ea8b32-b5b0-4932-b0a4-3d7563269dfa

📥 Commits

Reviewing files that changed from the base of the PR and between 6ad9a2b and 6daf45c.

📒 Files selected for processing (5)
  • clients/rook/src/admin/handlers.rs
  • clients/rook/src/admin/types.rs
  • clients/rook/src/config/mod.rs
  • clients/rook/src/gateway/handlers.rs
  • clients/rook/src/server/mod.rs
📝 Walkthrough

Walkthrough

This PR implements operational control infrastructure for Rook by introducing an OperationalConfig struct with undercover and debug_diagnostics flags. The config is wired through the configuration system (defaults, file loading, environment variables, export views), propagated into runtime state (AdminState, GatewayState, ServerConfig), and used to gate conditional debug logging in gateway handlers. Specification sections define undercover/redaction boundaries, multi-provider controls, and debugging workflows.
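
The gating described in the walkthrough can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in gateway/handlers.rs: the real helper logs through tracing and takes more arguments, whereas this version writes to a caller-supplied `Vec<String>` so it stays dependency-free. `normalize_label`, its 32-character bound, and the trimmed-down `GatewayState` are hypothetical stand-ins.

```rust
#[derive(Debug, Clone, Copy, Default)]
pub struct OperationalConfig {
    pub undercover: bool,
    pub debug_diagnostics: bool,
}

/// Stand-in for the gateway state; the real GatewayState carries far more.
pub struct GatewayState {
    pub operational: OperationalConfig,
}

/// Hypothetical bounded normalizer: keeps a short, lowercase, ASCII-safe
/// slug so raw input never reaches the log line verbatim.
pub fn normalize_label(raw: &str) -> String {
    const MAX_LEN: usize = 32; // assumed bound
    let cleaned: String = raw
        .chars()
        .filter(|c| c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.'))
        .take(MAX_LEN)
        .collect();
    if cleaned.is_empty() {
        "unknown".to_string()
    } else {
        cleaned.to_ascii_lowercase()
    }
}

/// Guard that every debug emission consults first.
pub fn debug_diagnostics_enabled(state: &GatewayState) -> bool {
    state.operational.debug_diagnostics
}

/// Emit a debug event only when the flag is on. The sink parameter is an
/// assumption of this sketch; it replaces the structured logger.
pub fn emit_gateway_debug(
    state: &GatewayState,
    sink: &mut Vec<String>,
    event: &str,
    model: &str,
) {
    if !debug_diagnostics_enabled(state) {
        return;
    }
    sink.push(format!(
        "gateway.debug event={} model={}",
        event,
        normalize_label(model)
    ));
}
```

Passing the sink explicitly also makes the "silent when disabled" property directly assertable: call the emitter with the flag off and check that the sink is still empty.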

Changes

Operational Controls Infrastructure

| Layer | File(s) | Summary |
|---|---|---|
| Config Types | clients/rook/src/config/mod.rs | New OperationalConfig (with undercover, debug_diagnostics), PartialOperationalConfig, and OperationalExportView structs; wired through RookConfig with defaults, partial overlays, environment parsing (ROOK_UNDERCOVER, ROOK_DEBUG_DIAGNOSTICS), and export views with redaction_baseline: "always_on". |
| Admin Status Types | clients/rook/src/admin/types.rs | New OperationalStatusView struct added with undercover, debug_diagnostics, and redaction_baseline fields; integrated into OperatorStatusView. |
| State Propagation | clients/rook/src/admin/mod.rs, clients/rook/src/admin/handlers.rs, clients/rook/src/gateway/mod.rs, clients/rook/src/server/mod.rs | OperationalConfig field added to AdminState, GatewayState, and ServerConfig; operational field populated in admin handler's status view construction from state and fixed baseline. |
| Conditional Debug Logging | clients/rook/src/gateway/handlers.rs | New debug_diagnostics_enabled() helper and emit_gateway_debug() function; gateway events (route_resolved, upstream_attempt_start/success/failure, upstream_stream_start/success/failure) conditionally logged when diagnostics enabled. |
| Export & Tests | clients/rook/src/main.rs | Test assertion added to verify redaction_baseline: "always_on" in config export; operational defaults and file/env loading tested in config tests; test state construction updated to initialize operational field. |
| Specification | openspec/specs/gateway/spec.md | New sections define operational undercover/redaction boundary, multi-provider controls, agent debugging/logging workflows, and rollout boundaries with security implications. |

Sequence Diagram

sequenceDiagram
    participant Config as Config System
    participant State as Runtime State
    participant Gateway as Gateway Handler
    participant Logger as Structured Log

    Config->>Config: Load OperationalConfig from defaults,<br/>files, environment variables
    Config->>State: Propagate to ServerConfig,<br/>GatewayState, AdminState
    State->>Gateway: expose debug_diagnostics flag
    Gateway->>Gateway: check debug_diagnostics_enabled()
    alt debug_diagnostics enabled
        Gateway->>Logger: emit_gateway_debug(route_resolved,<br/>upstream_attempt_success, etc.)
    else debug_diagnostics disabled
        Gateway->>Gateway: skip debug logging
    end
    Gateway->>State: include operational in<br/>OperatorStatusView for admin

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • dallay/corvus#627: Introduced GatewayState struct that this PR extends with operational field for debug diagnostics control.
  • dallay/corvus#768: Added operator status types and admin handlers that this PR augments with OperationalStatusView and operational config wiring.
  • dallay/corvus#707: Centralized config export pipeline that this PR extends with OperationalExportView and redaction baseline logic.

Suggested labels

area:rust, area:docs, risk:security, risk:high

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ⚠️ Warning | Title exceeds 72-character limit at 100 characters and uses Conventional Commit prefix correctly but violates the stated length requirement. | Shorten title to ≤72 characters; consider: 'feat(rook): operational parity – undercover and debug diagnostics' or similar concise variant. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 37.21% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description check | ✅ Passed | Description comprehensively covers all template sections: related issues (#538), detailed summary, tested information with test names and cargo output, documentation impact, breaking changes (none), and completed checklist. |
| Linked Issues check | ✅ Passed | Code changes implement all four acceptance criteria from #538: undercover/redaction behavior defined and enforced via OperationalConfig, multi-provider routing controls in config/mod.rs and propagated through layers, debug diagnostics workflows and logging in gateway/handlers.rs, and security tradeoffs documented in openspec/specs/gateway/spec.md. |
| Out of Scope Changes check | ✅ Passed | All changes directly serve #538 objectives: OperationalConfig, propagation through RookConfig/ServerConfig/GatewayState/AdminState, debug emission helpers, /status endpoint extension, and spec documentation align strictly with linked issue scope. |

@cloudflare-workers-and-pages

cloudflare-workers-and-pages Bot commented May 6, 2026

Deploying corvus with Cloudflare Pages

Latest commit: 6daf45c
Status: ✅  Deploy successful!
Preview URL: https://121f0421.corvus-42x.pages.dev
Branch Preview URL: https://feature-operational-parity-5.corvus-42x.pages.dev

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
clients/rook/src/config/mod.rs (1)

979-986: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Redact db_path when undercover mode is active.

Line 985 still exports the full filesystem path even when operational.undercover is true. The new undercover boundary explicitly treats local paths as sensitive operator output, so rook config export will leak deployment layout instead of failing closed.

Proposed fix
         Self {
             host: config.host.clone(),
             port: config.port,
             enable_tui: config.enable_tui,
-            db_path: config.db_path.display().to_string(),
+            db_path: if config.operational.undercover {
+                "[redacted]".to_string()
+            } else {
+                config.db_path.display().to_string()
+            },
             operational: OperationalExportView::from(&config.operational),
             inbound_auth: InboundAuthExportView {

As per coding guidelines, **/*: Security first, performance second. Validate input boundaries, auth/authz implications, and secret management.
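
The proposed fix can also be read as a small standalone function, which makes the fail-closed behavior easy to unit test. This is a sketch only: `export_db_path` and the `REDACTED` constant are hypothetical names, and the placeholder text simply mirrors the "[redacted]" string from the diff above.

```rust
use std::path::Path;

pub struct OperationalConfig {
    pub undercover: bool,
    pub debug_diagnostics: bool,
}

/// Placeholder used when undercover mode hides local paths
/// (same literal as in the proposed diff).
pub const REDACTED: &str = "[redacted]";

/// Return the exported form of `db_path`: the real path normally,
/// a fixed placeholder when undercover mode is active.
pub fn export_db_path(db_path: &Path, operational: &OperationalConfig) -> String {
    if operational.undercover {
        REDACTED.to_string()
    } else {
        db_path.display().to_string()
    }
}
```

Keeping the branch in one helper means any other export field that carries a local path could reuse the same rule, though whether other fields need it is not something this review establishes.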

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@clients/rook/src/config/mod.rs` around lines 979 - 986, The export currently
always includes the full filesystem path from config.db_path in
RookConfigExportView::from_config; change it to redact that value when the
source config indicates undercover mode by checking
config.operational.undercover and returning a safe placeholder (e.g.
"<redacted>" or an empty string) instead of config.db_path.display().to_string()
so local paths are not leaked; update the RookConfigExportView::from_config
function to branch on config.operational.undercover (referencing RookConfig,
OperationalExportView::from, and the db_path field).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@clients/rook/src/admin/handlers.rs`:
- Around line 279-283: The OperationalStatusView currently exposes
state.operational.undercover in the /status response but there is no runtime
enforcement; either remove undercover from the status payload or implement
enforcement where redaction happens. To fix: locate OperationalStatusView
construction in handlers.rs (the block setting operational:
OperationalStatusView { undercover: state.operational.undercover, ... }) and
either delete the undercover field and related struct member, or add enforcement
in the redaction/request flow (where requests are redacted or metadata is
filtered) to consult state.operational.undercover and apply the intended
behavior; update the OperationalStatusView definition and any callers/tests
accordingly (search for OperationalStatusView, state.operational, and redaction
code paths) so the status accurately reflects implemented behavior.

In `@clients/rook/src/config/mod.rs`:
- Around line 1061-1067: The Default implementation for OperationalConfig
currently sets undercover: true which violates the new operational config
contract; update the impl Default for OperationalConfig (the default() function)
to set undercover to false (and keep debug_diagnostics as false) so both new
flags default to false, ensuring operator-visible/exported config remains
unchanged for non-opted-in users.

In `@clients/rook/src/gateway/handlers.rs`:
- Around line 1787-1816: The test
emit_gateway_debug_is_silent_when_debug_diagnostics_disabled currently only
asserts debug_diagnostics_enabled and never calls emit_gateway_debug; update the
test to actually exercise emit_gateway_debug by hooking a tracing subscriber (or
test tracing collector) before calling super::emit_gateway_debug(&state, ...),
then assert that no "gateway.debug" event is recorded when
state.operational.debug_diagnostics is false; alternatively, if you prefer the
current assertions only, rename the test to reflect it only checks
debug_diagnostics_enabled and fix the stale inline comment about undercover
default (change or remove the mention of undercover=true) and keep references to
test_state(), OperationalConfig, debug_diagnostics_enabled, and
emit_gateway_debug so reviewers can find the code.
- Around line 571-577: The current emit_gateway_debug call that logs
"upstream_stream_success" fires when the upstream SSE opens but before the
stream completes; change this to emit a neutral event like
"upstream_stream_opened" here (replace the call at the emit_gateway_debug
invocation), then move the terminal "success" vs "failure" emission into the SSE
completion and error handlers where the stream abort is detected (the code path
referenced around the abort at line ~1279) so that a "upstream_stream_success"
is only emitted on clean completion and a corresponding failure event is emitted
on errors/abort; apply the same change to the similar block around lines 590-595
so both openings are logged neutrally and terminal outcomes are recorded from
the completion/error hooks.

In `@clients/rook/src/server/mod.rs`:
- Around line 2692-2720: Replace the current test body in
operational_config_propagates_to_admin_state so it exercises the real
propagation path: create the OperationalConfig, open the in-memory registry
(RookRegistry::open_in_memory), call build_app_with_registry_and_startup_state
(or the app bootstrap helper used in this module) to construct the running app
with that registry and startup state, then call the app's status endpoint (e.g.
GET /api/status) or otherwise retrieve the AdminState from the running app and
assert AdminState.operational.undercover and
AdminState.operational.debug_diagnostics match the original OperationalConfig
values instead of instantiating AdminState directly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7f3f9710-2891-46e8-8900-eb5a449128e7

📥 Commits

Reviewing files that changed from the base of the PR and between cfb9b3d and 6ad9a2b.

📒 Files selected for processing (9)
  • clients/rook/src/admin/handlers.rs
  • clients/rook/src/admin/mod.rs
  • clients/rook/src/admin/types.rs
  • clients/rook/src/config/mod.rs
  • clients/rook/src/gateway/handlers.rs
  • clients/rook/src/gateway/mod.rs
  • clients/rook/src/main.rs
  • clients/rook/src/server/mod.rs
  • openspec/specs/gateway/spec.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: sonar
  • GitHub Check: pr-checks
  • GitHub Check: semgrep-cloud-platform/scan
  • GitHub Check: Cloudflare Pages
🧰 Additional context used
📓 Path-based instructions (3)
**/*.rs

⚙️ CodeRabbit configuration file

**/*.rs: Focus on Rust idioms, memory safety, and ownership/borrowing correctness.
Flag unnecessary clones, unchecked panics in production paths, and weak error context.
Prioritize unsafe blocks, FFI boundaries, concurrency races, and secret handling.

Files:

  • clients/rook/src/main.rs
  • clients/rook/src/admin/mod.rs
  • clients/rook/src/gateway/mod.rs
  • clients/rook/src/admin/types.rs
  • clients/rook/src/gateway/handlers.rs
  • clients/rook/src/server/mod.rs
  • clients/rook/src/admin/handlers.rs
  • clients/rook/src/config/mod.rs
**/*

⚙️ CodeRabbit configuration file

**/*: Security first, performance second.
Validate input boundaries, auth/authz implications, and secret management.
Look for behavioral regressions, missing tests, and contract breaks across modules.

Files:

  • clients/rook/src/main.rs
  • clients/rook/src/admin/mod.rs
  • clients/rook/src/gateway/mod.rs
  • clients/rook/src/admin/types.rs
  • clients/rook/src/gateway/handlers.rs
  • openspec/specs/gateway/spec.md
  • clients/rook/src/server/mod.rs
  • clients/rook/src/admin/handlers.rs
  • clients/rook/src/config/mod.rs
**/*.{md,mdx}

⚙️ CodeRabbit configuration file

**/*.{md,mdx}: Verify technical accuracy and that docs stay aligned with code changes.
For user-facing docs, check EN/ES parity or explicitly note pending translation gaps.

Files:

  • openspec/specs/gateway/spec.md

Comment thread clients/rook/src/admin/handlers.rs
Comment thread clients/rook/src/config/mod.rs
Comment on lines +571 to +577
    emit_gateway_debug(
        state,
        "upstream_stream_success",
        &request.model,
        &metric_context,
        Some("success"),
    );
⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Delay upstream_stream_success until the SSE actually finishes.

This event is emitted as soon as the upstream stream opens, but Line 1279 shows the stream can still abort mid-flight. In that case diagnostics will report success even though the stream never completed, and there is no corresponding terminal failure event for the abort. Emit a neutral “stream opened” event here, then record success/failure from the completion/error hook instead.

As per coding guidelines, **/*: Look for behavioral regressions, missing tests, and contract breaks across modules.

Also applies to: 590-595
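
One way to picture the suggested split between a neutral open event and a terminal outcome is a small tracker like the one below. It is a sketch under assumptions, not the handler code: `StreamDiagnostics` and `StreamOutcome` are hypothetical types, events are collected in a `Vec` instead of being logged, and only the event names come from this thread.

```rust
/// Terminal outcome of an upstream stream (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum StreamOutcome {
    Completed,
    Aborted,
}

/// Records lifecycle events for one stream so that success is only ever
/// reported after the stream has actually finished.
pub struct StreamDiagnostics {
    events: Vec<&'static str>,
    finished: bool,
}

impl StreamDiagnostics {
    /// Called when the upstream stream opens: records a neutral event,
    /// making no claim yet about how the stream will end.
    pub fn open() -> Self {
        Self {
            events: vec!["upstream_stream_opened"],
            finished: false,
        }
    }

    /// Called from the completion or error hook. Only the first call
    /// records a terminal event; later calls are ignored.
    pub fn finish(&mut self, outcome: StreamOutcome) {
        if self.finished {
            return;
        }
        self.finished = true;
        self.events.push(match outcome {
            StreamOutcome::Completed => "upstream_stream_success",
            StreamOutcome::Aborted => "upstream_stream_failure",
        });
    }

    /// Events recorded so far, in emission order.
    pub fn events(&self) -> &[&'static str] {
        &self.events
    }
}
```

The `finished` flag is a design choice of this sketch: it guarantees at most one terminal event per stream even if both the error path and the completion path run.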

Comment thread clients/rook/src/gateway/handlers.rs
Comment thread clients/rook/src/server/mod.rs
- config: fix OperationalConfig default – undercover now false (was true)
- config: redact db_path in export view when undercover mode is active
- admin: remove undercover from OperationalStatusView – no enforcement exists;
  expose only debug_diagnostics and redaction_baseline in /status
- gateway: rename upstream_stream_success → upstream_stream_opened at stream
  open site; success/failure semantics belong at completion, not open
- gateway: rename guard-only test to debug_diagnostics_enabled_guard_returns_correct_value;
  fix stale comment that referenced undercover=true default
- server: rewrite operational_config_propagates_to_admin_state to exercise
  the real ServerConfig → build_app_with_registry → /api/status path
- config: update test name and assertion to match new undercover=false default
@sonarqubecloud

sonarqubecloud Bot commented May 6, 2026

@yacosta738 yacosta738 merged commit 355cc75 into main May 6, 2026
17 checks passed
@yacosta738 yacosta738 deleted the feature/operational-parity-538 branch May 6, 2026 13:14