- add dashboard app (Next.js + Tailwind), components, hooks, and API client
- include dashboard Dockerfile + env example; update gitignore and main Dockerfile
- remove legacy static dashboard.html
- extend Rust backend/CLI to expose status, metrics, epochs, and compressible-related data
- refactor compressible/config/processor helpers, queue pressure + validation, and related telemetry/metrics
Caution: review failed (failed to post review comments).

📝 Walkthrough

This PR introduces a Next.js-based dashboard frontend for the Forester service with corresponding backend API enhancements. The dashboard provides real-time monitoring of epochs, trees, metrics, and compressible accounts. The API server is refactored to stream metrics and compressible data via watch channels rather than generating them on-demand, improving scalability. ForesterStatus is expanded with aggregate statistics. A new CLI

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    actor User
    participant Dashboard as Next.js Dashboard
    participant APIServer as Forester API Server
    participant MetricsProvider as Background Metrics Provider
    participant PrometheusDB as Prometheus
    participant RPC as Solana RPC
    User->>Dashboard: Loads /metrics page
    Dashboard->>APIServer: GET /metrics/json
    par Parallel Background Tasks
        MetricsProvider->>MetricsProvider: Periodically runs (10s interval)
        MetricsProvider->>PrometheusDB: Query transaction_rate,<br/>transactions_processed,<br/>forester_balances
        MetricsProvider->>MetricsProvider: Parse & aggregate results
        MetricsProvider->>MetricsProvider: Store in watch channel
    and
        MetricsProvider->>RPC: (fallback) Get local metrics<br/>if Prometheus unavailable
    end
    APIServer->>APIServer: Read latest snapshot<br/>from watch channel
    APIServer-->>Dashboard: Return MetricsSnapshot JSON
    Dashboard->>Dashboard: Render transaction rates,<br/>balances, queue depths
    Dashboard-->>User: Display metrics dashboard
```
```mermaid
sequenceDiagram
    actor User
    participant Dashboard as Next.js Dashboard
    participant APIServer as Forester API Server
    participant StatusHandler as Status Route Handler
    participant CompressibleProvider as Background<br/>Compressible Provider
    participant RPC as Solana RPC
    participant Trackers as In-Memory Trackers
    User->>Dashboard: Loads /compressible page
    Dashboard->>APIServer: GET /compressible
    par Concurrent Sources
        CompressibleProvider->>CompressibleProvider: Runs periodically
        alt Trackers Available
            CompressibleProvider->>Trackers: Read cached counts<br/>(CTokens, Mints, PDAs)
        else Trackers Unavailable
            CompressibleProvider->>RPC: Query getProgramAccounts<br/>with pagination & filters
            CompressibleProvider->>CompressibleProvider: Count results
        end
        CompressibleProvider->>CompressibleProvider: Store in watch channel
    and
        APIServer->>APIServer: Read latest snapshot<br/>from channel
    end
    APIServer-->>Dashboard: Return CompressibleSnapshot
    Dashboard->>Dashboard: Render account counts
    Dashboard-->>User: Display compressible panel
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~90 minutes

Rationale: This diff introduces substantial, heterogeneous changes across multiple domains: (1) a complete Next.js dashboard frontend with configuration, 11 components, 3 pages, hooks, utilities, and TypeScript types; (2) a significant API server refactoring changing core async patterns and request flow; (3) extended ForesterStatus and related structs affecting multiple query paths; (4) supporting changes to compressible, processors, metrics, and CLI. While individual file changes are sometimes localized, the cumulative scope, API-surface modifications (e.g., spawn_api_server signature, struct field additions), and interconnected logic across frontend/backend layers demand careful review of interaction boundaries, error handling consistency, and type correctness throughout.
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 33
Caution: some comments are outside the diff and can't be posted inline due to platform limitations.
⚠️ Outside diff range comments (7)
forester/src/processor/v1/tx_builder.rs (1)
131-141: ⚠️ Potential issue | 🟠 Major

Overly broad "not found" error swallowing — this will silently discard real failures.
The check `err_str.to_lowercase().contains("not found")` matches far more than indexer record-not-found errors. Any error message containing "not found" — connection not found, method not found, key not found, socket not found, etc. — will be silently treated as an empty batch and the work items will be skipped.

Additionally, note that the first condition (`"record not found"`) is logically redundant since the second condition (`"not found"`) already subsumes it, and `.to_lowercase()` is called twice on the same string.

I'd recommend tightening this to match only the specific indexer error you expect:
Proposed fix
```diff
-        let err_str = e.to_string();
-        return if err_str.to_lowercase().contains("record not found")
-            || err_str.to_lowercase().contains("not found")
-        {
+        let err_str = e.to_string().to_lowercase();
+        return if err_str.contains("record not found") {
```

If you genuinely need to catch a broader class of indexer "not found" responses, consider matching on a structured error variant rather than substring matching on the message. Error messages are not a stable API contract — an upstream library changing its wording will silently alter your control flow.
forester/src/processor/v2/root_guard.rs (1)
96-101: 🧹 Nitpick | 🔵 Trivial

Consider adding a test for the non-warning uninitialized path.

This test exercises the case where `indexer_root != onchain_root` (triggers the warning log). There's no test for `expected_root == 0` with `indexer_root == onchain_root`, which is the "clean" uninitialized startup. It's a one-liner to add and would round out coverage of both branches inside the new block:

Suggested additional test
```rust
#[test]
fn proceeds_when_expected_is_zero_and_roots_agree() {
    assert_eq!(
        reconcile_roots(root(0), root(5), root(5)),
        RootReconcileDecision::Proceed
    );
}
```

forester/src/rollover/operations.rs (2)
64-72: ⚠️ Potential issue | 🟡 Minor

Const generic `26` on Line 65 can silently diverge from `STATE_MERKLE_TREE_HEIGHT`.

Line 72 now correctly uses `STATE_MERKLE_TREE_HEIGHT`, but the const generic parameter on Line 65 is still a hardcoded `26`. If the constant ever changes, this will compile fine but produce incorrect tree reads at runtime — a subtle and dangerous mismatch.

I understand that Rust's stable const generics don't allow `FOO as usize` in generic position, so you may not be able to use the constant directly. At minimum, add a compile-time or debug assertion to keep these in sync:

Proposed safeguard
```diff
+    // Safety: const generic must match the tree height constant.
+    assert_eq!(STATE_MERKLE_TREE_HEIGHT as usize, 26, "STATE_MERKLE_TREE_HEIGHT changed — update the const generic on get_concurrent_merkle_tree");
     let merkle_tree = get_concurrent_merkle_tree::<StateMerkleTreeAccount, R, Poseidon, 26>(
```
sdk-libs/client/src/indexer/tree_info.rs(theheight()method also hardcodes26forStateV1/AddressV1), so this is a codebase-wide consistency concern worth tracking.
104-111: ⚠️ Potential issue | 🟡 Minor

Same const-generic / named-constant mismatch for AddressV1.

Line 105 hardcodes `26` as a const generic for `get_indexed_merkle_tree`, while Line 111 now uses `ADDRESS_MERKLE_TREE_HEIGHT`. Same divergence risk as the `StateV1` path above — please add an equivalent assertion here too.

Proposed safeguard
```diff
+    assert_eq!(ADDRESS_MERKLE_TREE_HEIGHT as usize, 26, "ADDRESS_MERKLE_TREE_HEIGHT changed — update the const generic on get_indexed_merkle_tree");
     let merkle_tree = get_indexed_merkle_tree::<AddressMerkleTreeAccount, R, Poseidon, usize, 26, 16>(
```

forester/src/compressible/subscriber.rs (2)
159-168: ⚠️ Potential issue | 🟠 Major

Backoff reset on `StreamClosed` defeats its own purpose and produces misleading logs.

Two problems here:

1. Misleading log: You increment `attempt` on line 160 and log it on line 162, then immediately reset it to 0 on line 168. Every `StreamClosed` will always log "attempt 1", making it impossible to tell if you're in a reconnect storm.

2. No backoff for rapid stream closures: If the server accepts the WebSocket but immediately drops it (e.g., overloaded, auth issue, rate-limited), `run_connection` returns `StreamClosed` every time. Because you unconditionally reset `current_delay` and `attempt`, you'll reconnect every `initial_delay` (1 s) forever — effectively a tight loop that hammers the server.

The comment says "Reset backoff after a successful connection," which is the right idea — but `StreamClosed` doesn't mean the connection was healthy. A more robust approach: track how long the connection was alive, and only reset backoff if it survived past some threshold (e.g., > 30 s). Otherwise, keep the exponential backoff.

Proposed fix — reset only for long-lived connections
```diff
 Ok(ConnectionResult::StreamClosed) => {
     attempt += 1;
     warn!(
         "{} connection lost (attempt {}), reconnecting in {:?}...",
         self.config.name, attempt, current_delay
     );
-
-    // Reset backoff after a successful connection
-    current_delay = self.reconnect_config.initial_delay;
-    attempt = 0;
+    // Note: backoff is NOT reset here. If you want to reset
+    // after a long-lived connection, track connection duration
+    // in run_connection() and return it with StreamClosed.
 }
```
run_connectionand return it with theStreamClosedvariant:enum ConnectionResult { Shutdown, StreamClosed { connected_duration: Duration }, }Then in the caller:
```rust
Ok(ConnectionResult::StreamClosed { connected_duration }) => {
    if connected_duration > Duration::from_secs(30) {
        // Connection was stable — reset backoff
        current_delay = self.reconnect_config.initial_delay;
        attempt = 0;
    } else {
        attempt += 1;
    }
    // ...
}
```
144-193: ⚠️ Potential issue | 🟠 Major

The `Err` path correctly applies exponential backoff — nice contrast to the `StreamClosed` path.

Worth noting: the backoff computation on lines 189-192 runs for both `StreamClosed` and `Err` arms, but since `StreamClosed` resets `current_delay` on line 167 before reaching it, the exponential increase after `StreamClosed` just computes `initial_delay * multiplier` — so on a second consecutive `StreamClosed` you'd sleep `initial_delay` (reset) then set delay to `initial_delay * 2`, but then the next `StreamClosed` resets it again. The backoff never actually grows. This reinforces the concern above.

forester/src/processor/v2/helpers.rs (1)
435-473: 🧹 Nitpick | 🔵 Trivial

Good direction — condvar with timeout is the right move here. Replacing the old `thread::sleep` polling with `Condvar::wait_timeout` is a meaningful improvement: lower CPU overhead and faster wake-up when data actually arrives. The 120s overall cap is a sensible safety net. A couple of things to tighten up, though:

1. Double lock acquisition creates a TOCTOU gap and unnecessary contention.
Each loop iteration acquires `available_elements` at line 439 (read + drop), then re-acquires the same lock at line 467 for the condvar wait. Between those two acquisitions, the background producer could update `available_elements` and call `ready.notify_all()` (line 702). Since we're not waiting on the condvar at that moment, the notification is lost, and we sleep for the full 50ms poll interval before rechecking.

You can eliminate this gap by holding the guard across the check and the wait:
♻️ Suggested restructure to single lock acquisition per iteration
```diff
 loop {
-    let available = *lock_recover(
-        &self.available_elements,
-        "streaming_address_queue.available_elements",
-    );
-    if available >= batch_end {
-        return available;
-    }
-
     let complete = *lock_recover(
         &self.fetch_complete,
         "streaming_address_queue.fetch_complete",
     );
-    if complete {
-        return available;
-    }
-    if start.elapsed() > POLL_TIMEOUT {
-        tracing::warn!(
-            "wait_for_batch timed out after {:?} waiting for {} elements (available: {})",
-            POLL_TIMEOUT,
-            batch_end,
-            available
-        );
-        return available;
-    }
-
-    // Use condvar wait with timeout instead of thread::sleep to avoid
-    // blocking the thread and to wake up promptly when data arrives.
-    let guard = lock_recover(
+    let guard = lock_recover(
         &self.available_elements,
         "streaming_address_queue.available_elements",
     );
-    let _ = self
-        .data_ready
-        .wait_timeout(guard, std::time::Duration::from_millis(50));
+    let available = *guard;
+
+    if available >= batch_end || complete {
+        return available;
+    }
+
+    if start.elapsed() > POLL_TIMEOUT {
+        tracing::warn!(
+            "wait_for_batch timed out after {:?} waiting for {} elements (available: {})",
+            POLL_TIMEOUT,
+            batch_end,
+            available
+        );
+        return available;
+    }
+
+    // Hold the guard into wait_timeout so the notify between
+    // "check" and "wait" cannot be lost.
+    let (_guard, _timeout_result) = self
+        .data_ready
+        .wait_timeout(guard, std::time::Duration::from_millis(50))
+        .unwrap_or_else(|e| {
+            tracing::warn!("Poisoned condvar (recovering): streaming_address_queue.data_ready");
+            e.into_inner()
+        });
 }
```
fetch_completeuses a different mutex, so it still needs its own lock — but moving it before theavailable_elementslock keeps the critical window tight. The condvar wait atomically releases theavailable_elementsguard, so no notification from the producer (which locksavailable_elements→ updates → drops → notifies) can slip through unobserved.2. Stale
availableon timeout return (minor).When the timeout fires at line 455, the
availablevalue was read at the top of the iteration (line 439) and could be out of date. Inget_batch_data, this stale value feeds into thestart >= availableearly-return check (line 479), which could produce a spuriousNoneeven though data exists. If you adopt the restructured loop above, this is resolved naturally sinceavailableis read from the held guard right before the timeout check.
🤖 Fix all issues with AI agents
In `@forester/dashboard/Dockerfile`:
- Around line 17-18: Combine the two consecutive Docker RUN layers into a single
RUN to reduce image layers and satisfy DL3059: replace the separate RUN addgroup
--system --gid 1001 nodejs and RUN adduser --system --uid 1001 nextjs with one
RUN that executes both commands (using shell chaining such as &&) so both
addgroup and adduser run in the same layer and preserve exit-on-failure
behavior.
- Around line 23-26: Add a Docker HEALTHCHECK instruction to the Dockerfile to
probe the running Next.js server (after the existing USER nextjs / EXPOSE 3000 /
ENV PORT=3000 / CMD ["node","server.js"] lines): implement a lightweight HTTP
probe (e.g., wget or curl against http://localhost:3000/ or a dedicated /health
route if available) with sensible parameters (interval, timeout, start-period
and retries) so the container runtime can mark the container unhealthy when the
server is unresponsive.
In `@forester/dashboard/src/app/metrics/page.tsx`:
- Around line 22-24: The empty-state check for isEmpty currently only inspects
metrics.transactions_processed_total and metrics.forester_balances; update it to
also consider metrics.queue_lengths (and optionally metrics.transaction_rate) so
that isEmpty truly reflects no data across all metric groups. Locate the isEmpty
declaration and include Object.keys(metrics.queue_lengths).length === 0 (and
Object.keys(metrics.transaction_rate).length === 0 if you want to treat
transaction_rate as contributing to non-empty) in the combined boolean
expression. Ensure you use the same metrics object keys (metrics.queue_lengths,
metrics.transaction_rate) and keep the logical AND semantics so the page only
treats it as empty when all relevant metric maps are empty.
In `@forester/dashboard/src/app/page.tsx`:
- Around line 45-52: Replace the array index key in the warnings rendering (the
warnings.map callback) with the warning string itself: use the iteration value
(the warning string variable passed as w) as the React key in the div (ensuring
it's unique/deterministic), or fall back to a stable derived id if you have any
risk of duplicates; update the key prop in the map inside the component that
renders warnings (the warnings.map callback) accordingly.
- Around line 29-34: The literal 1000 should be replaced with a named constant
to clarify intent and make tuning easier: add a top-level constant like
REGISTRATION_WARNING_SLOT_THRESHOLD (or similar) in page.tsx and use it in the
check inside the block that currently references
status.slots_until_next_registration (the if that also checks
status.registration_epoch_foresters.length). Replace the magic number with that
constant, choose a descriptive name and add a brief comment describing what the
threshold represents.
In `@forester/dashboard/src/components/ErrorState.tsx`:
- Around line 6-18: The current ErrorState component renders an "Error" UI even
when the error prop is falsy; change the logic in ErrorState so that when error
is undefined/null it returns null (not the error block) to avoid showing a false
error to users, and add a one-line comment above that early return explaining
it’s intentionally returning null because callers should guard rendering (or, if
you prefer to keep a defensive fallback, keep the UI but add a comment
clarifying it’s for the rare case where isError is true but error is undefined);
reference the ErrorState function and its props error and fallbackMessage when
making this change.
- Around line 20-23: The current fragile detection in ErrorState uses isApiError
computed from error.name === "ApiError" || error.message.startsWith("Forester
API returned"); remove the fallback string-prefix check and rely on the
structured discriminator (error.name === "ApiError") or add a typed property
(e.g., error.type === "ApiError" or a custom ApiError class) when creating
errors; update the isApiError logic (and any code that constructs errors) to use
that structured field instead of error.message.startsWith to avoid brittle
message parsing.
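A sketch of the structured-discriminator approach suggested above, assuming a custom `ApiError` class is introduced wherever the client currently builds "Forester API returned ..." messages (the `status` field is illustrative):

```typescript
// Hypothetical ApiError class; construct this at error-creation time
// instead of encoding the error kind in the message string.
class ApiError extends Error {
  readonly status: number;
  constructor(status: number, message: string) {
    super(message);
    this.name = "ApiError";
    this.status = status;
  }
}

// Structured check — no brittle error.message.startsWith(...) parsing.
function isApiError(error: unknown): error is ApiError {
  return error instanceof ApiError;
}
```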
In `@forester/dashboard/src/components/ForesterList.tsx`:
- Around line 20-24: Replace the array index key with the unique forester
identifier: in the foresters.map callback (where the code currently uses
key={i}), use f.authority as the React key instead; update the JSX return in the
map (the element with key={...}) to key={f.authority} so React can correctly
reconcile items when the list changes.
In `@forester/dashboard/src/components/MetricsPanel.tsx`:
- Around line 66-73: The parameter name in the formatSol utility is misleading:
rename the function parameter in utils.ts from lamports to a clearer name like
sol or amount (e.g., change function formatSol(lamports: number) to
formatSol(sol: number)) and update any internal variable/JSdoc references inside
formatSol to use the new name; keep the function behavior unchanged and ensure
all call sites (e.g., the MetricsPanel usage formatSol(balance)) continue to
work without changes.
In `@forester/dashboard/src/components/ProgressBar.tsx`:
- Line 14: The computed percent "pct" can go negative when "value" < 0; update
the calculation in ProgressBar.tsx (the pct = ... expression using value and
max) to clamp to the [0,100] range by wrapping the existing min(...) with
Math.max(0, ...), e.g. compute the ratio as before and then apply Math.max(0,
Math.min(...)) so negative widths become 0 and values above 100 remain capped.
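The clamp can be sketched as a small helper. The function name and the `max <= 0` guard are illustrative additions, not code from the PR:

```typescript
// Clamp to [0, 100] so a negative value cannot yield a negative CSS
// width, and overshoot past max stays capped at 100.
function progressPercent(value: number, max: number): number {
  if (max <= 0) return 0; // avoid division by zero
  return Math.max(0, Math.min(100, (value / max) * 100));
}
```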
In `@forester/dashboard/src/components/Sidebar.tsx`:
- Around line 23-37: The nav highlight logic uses exact match (pathname ===
item.href) so parent items don't highlight for sub-routes; update the
computation of active (used when mapping navItems and passed to the Link
rendering) to treat the root "/" specially and use startsWith for other items —
e.g., compute active as item.href === "/" ? pathname === "/" :
pathname.startsWith(item.href) (replace the current pathname === item.href check
in the Sidebar mapping where item.href, pathname, and navItems are referenced).
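The suggested computation, extracted as a small predicate for clarity (the helper name is hypothetical; in the component this would be inline in the `navItems` map):

```typescript
// "/" matches only exactly; any other nav item also highlights for its
// sub-routes (e.g., /trees/abc keeps /trees active).
function isActiveNavItem(pathname: string, href: string): boolean {
  return href === "/" ? pathname === "/" : pathname.startsWith(href);
}
```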
In `@forester/dashboard/src/components/StatusBadge.tsx`:
- Around line 6-16: Add a short inline comment next to the StatusBadge component
(near the color prop and the className interpolation in StatusBadge) warning
that Tailwind/JIT only picks up literal class names and that constructing
classes dynamically (e.g., `bg-${severity}-500`) will be purged in production;
reference callers like batchStateColor in utils.ts as examples of where full
literals must exist and instruct future authors to pass predefined literal class
strings into the color prop or update the Tailwind safelist if dynamic
generation is required.
In `@forester/dashboard/src/components/TreeBatchDetail.tsx`:
- Around line 17-30: The three batch fields are inconsistently formatted:
formatNumber is used for info.zkp_batch_size but not for
info.input_pending_batches or info.output_pending_batches; update the JSX in
TreeBatchDetail (the <p> elements that render info.input_pending_batches and
info.output_pending_batches) to wrap their values with the formatNumber helper
so all three fields use formatNumber(info.<field>) for consistent numeric
formatting.
In `@forester/dashboard/src/components/TreeTable.tsx`:
- Around line 68-78: The filter buttons in TreeTable are missing an explicit
type and will default to "submit" inside forms; update the button element
rendered in the map (the one using key={f}, onClick={() => setFilter(f)} and
rendering {f === "all" ? "All" : f}) to include type="button" so clicking it
won't submit any surrounding form.
- Around line 216-242: The "+N" badge uses a fixed 8 instead of the actual
number of items shown, causing wrong counts near the end; compute the
remaining/hidden count using the computed start and visible slice (e.g., const
hidden = Math.max(0, schedule.length - (start + visible.length))) and render
+{hidden} instead of +{schedule.length - 8}; update references to schedule,
start, visible, and the span that outputs "+{...}" accordingly.
- Line 207: Move the stray "import { Fragment } from 'react'" into the top
import block alongside the existing React imports in the TreeTable component so
all React imports are colocated; then remove the duplicate import on line 207.
Locate the import usage by checking the TreeTable.tsx file and adjust the top
import statements (where React is imported) to include Fragment, ensuring there
are no duplicate React imports remaining.
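The "+N" badge fix from the second item above can be sketched as a pure helper (name and signature are illustrative; in the component the inputs would be `schedule.length`, `start`, and `visible.length`):

```typescript
// Items hidden past the visible window; clamped so the badge never shows
// a negative count near the end of the schedule.
function hiddenCount(scheduleLength: number, start: number, visibleLength: number): number {
  return Math.max(0, scheduleLength - (start + visibleLength));
}
```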
In `@forester/dashboard/src/hooks/useForesterStatus.ts`:
- Around line 7-13: The magic number 0.46 in the refreshInterval calculation
should be replaced with a named constant (e.g., SLOT_DURATION_SECONDS or
FORESTER_SLOT_DURATION_SEC) declared at the top of useForesterStatus.ts; update
the expression in refreshInterval (the function using
data.slots_until_next_light_slot) to multiply by that constant instead of 0.46
and adjust any +500 or ms conversion logic if needed; ensure the constant name
documents the unit (seconds or ms) and consider aligning its value with the
backend's SLOT_DURATION_MS for consistency.
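A sketch of the named-constant version. Treating 0.46 as seconds-per-slot, the `+500` ms slack, and the constant/function names are all assumptions based on the comment above:

```typescript
// Assumed seconds per slot (the hook's current 0.46 magic number);
// ideally kept in sync with the backend's SLOT_DURATION_MS.
const SLOT_DURATION_SECONDS = 0.46;

// Refresh shortly after the next light slot boundary, with 500 ms slack.
function refreshIntervalMs(slotsUntilNextLightSlot: number): number {
  return slotsUntilNextLightSlot * SLOT_DURATION_SECONDS * 1000 + 500;
}
```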
In `@forester/dashboard/src/lib/api.ts`:
- Around line 38-39: The current code returns res.json() directly so a 200 OK
with non-JSON body causes an opaque SyntaxError; wrap the JSON parse in a
try/catch (or first check Content-Type) around the res.json() call, attempt
res.text() on failure, and throw a new Error that includes the request context
(HTTP status and URL) and the response body/snippet to aid debugging; update the
function that calls res.json() (refer to res and res.json() in the file) to
produce this enriched error instead of propagating the raw SyntaxError.
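One way to sketch the enriched error: a parse helper that takes the status, URL, and body text (the real code would call `res.text()` first and pass the body in; the helper name and message format are hypothetical):

```typescript
// Parse a response body that should be JSON; on failure, throw an error
// carrying the HTTP status, URL, and a body snippet instead of letting
// a context-free SyntaxError escape.
function parseJsonBody<T>(status: number, url: string, bodyText: string): T {
  try {
    return JSON.parse(bodyText) as T;
  } catch {
    const snippet = bodyText.slice(0, 200);
    throw new Error(`Non-JSON response (HTTP ${status}) from ${url}: ${snippet}`);
  }
}
```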
In `@forester/dashboard/src/lib/utils.ts`:
- Around line 6-9: The parameter name in formatSol is misleading: it is named
lamports but actually receives SOL (e.g., ForesterInfo.balance_sol), which can
cause incorrect future fixes; rename the parameter in the function signature and
all callers from lamports to sol (keep the type number | null | undefined) and
preserve the current logic (no /1e9 conversion) so formatSol(sol) returns "-"
for null/undefined or `${sol.toFixed(4)} SOL`; update references to formatSol to
use the new parameter name where applicable.
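The renamed helper, preserving the behavior the comment describes (value is already SOL, so no lamports-to-SOL division):

```typescript
// Value is already denominated in SOL (e.g., ForesterInfo.balance_sol),
// so no /1e9 conversion is applied.
function formatSol(sol: number | null | undefined): string {
  if (sol === null || sol === undefined) return "-";
  return `${sol.toFixed(4)} SOL`;
}
```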
In `@forester/src/api_server.rs`:
- Around line 381-384: The CORS setup currently uses
warp::cors().allow_any_origin() which is unsafe for production; update the code
that constructs the cors variable to read a new configuration option (e.g.,
cors_origin: Option<String> or Vec<String>) or reuse the existing
allow_public_bind flag to choose between allow_any_origin() for local/public
testing and a restricted .allow_origin(...) call for production. Locate the cors
builder in api_server.rs where cors is defined and replace the unconditional
allow_any_origin() with a conditional: if a cors_origin config is present use
warp::cors().allow_origin(...) (or map multiple origins) and permit only
necessary methods/headers, otherwise fall back to allow_any_origin() only when
allow_public_bind is true.
- Around line 85-97: CompressibleTrackers and CompressibleDashboardState are
duplicates; remove CompressibleDashboardState and replace its uses with
CompressibleTrackers (which already derives Clone), update ApiServerConfig to
hold CompressibleTrackers instead of CompressibleDashboardState, and change the
place that converted CompressibleDashboardState into tracker fields to simply
clone the CompressibleTrackers (e.g., replace the manual ctoken/pda/mint mapping
with a .clone() of CompressibleTrackers); ensure all references and imports for
CompressibleDashboardState are removed and compile.
In `@forester/src/compressible/bootstrap_helpers.rs`:
- Around line 493-496: The reqwest Client is given a 30s per-request timeout
which races with the module-level tokio timeout used by send_rpc_request
(RPC_REQUEST_TIMEOUT); remove the client-level timeout to avoid double-timing
and rely on send_rpc_request's timeout instead—i.e., delete the
.timeout(Duration::from_secs(30)) call in the Client::builder() code that
constructs the client so the client is built without that per-request timeout
while leaving send_rpc_request and RPC_REQUEST_TIMEOUT unchanged.
- Around line 488-559: The non-localhost pagination in count_program_accounts
duplicates logic found in bootstrap_v2_api (payload construction, paginationKey
handling, sleep, break conditions); refactor by extracting a shared paginator
function (e.g., paginate_rpc_pages) that accepts parameters like rpc client,
rpc_url, method name ("getProgramAccountsV2"), program_id, optional dataSlice,
filters, PAGE_SIZE, DEFAULT_PAGINATION_DELAY_MS and a per-page callback or
collector; have paginate_rpc_pages use extract_accounts_array and
extract_pagination_cursor for page parsing and return collected results or
invoke the callback for each page, then update count_program_accounts to call
this paginator (and similarly update bootstrap_v2_api to reuse it) to remove
duplication and centralize retry/rate-limit behavior.
In `@forester/src/epoch_manager.rs`:
- Around line 926-927: The code casts slots_to_wait (u64) to u32 when computing
wait_duration (slot_duration() * slots_to_wait as u32), which can silently
truncate large values; change this to clamp/saturate the value before casting
(e.g., let slots_to_wait_clamped = slots_to_wait.min(u32::MAX as u64) and use
slot_duration() * (slots_to_wait_clamped as u32)) or otherwise compute the
multiplied Duration via checked arithmetic so it cannot wrap; update the
multiplication at the sleep call in epoch_manager.rs (the slot_duration() * ...
expression) and mirror the same safe clamping approach used in
calculate_remaining_time_or_default / the prewarm timeout sites.
In `@forester/src/forester_status.rs`:
- Around line 588-589: The match on tree.tree_type returns a 7-tuple
(fullness_percentage, next_index, capacity, height, threshold, queue_length,
v2_queue_info) which hurts readability; create a small struct (e.g.,
ParsedTreeMetrics) with named fields matching those seven items, update each
match arm to return ParsedTreeMetrics instead of a tuple, and then destructure
or access fields from that struct where fullness_percentage, next_index,
capacity, height, threshold, queue_length, and v2_queue_info are currently used;
update any function signatures or variable bindings that expect the tuple to use
ParsedTreeMetrics (or its fields) to preserve type correctness.
- Around line 688-691: The threshold computation in the StateV2 and AddressV2
arms recomputes 1u64 << height and uses plain multiplication, causing
inconsistency with V1; change both to compute threshold_val from the
already-calculated capacity variable using
capacity.saturating_mul(merkle_tree.metadata.rollover_metadata.rollover_threshold)
/ 100 so the multiplication is overflow-safe and you don't recalculate 1u64 <<
height (refer to capacity, threshold_val,
merkle_tree.metadata.rollover_metadata.rollover_threshold, StateV2, AddressV2,
height, saturating_mul).
In `@forester/src/main.rs`:
- Around line 147-166: The Dashboard branch (Commands::Dashboard) currently
awaits ctrl_c().await then calls api_server_handle.shutdown() with no
forced-exit fallback; update it to reuse the existing spawn_shutdown_handler (or
implement equivalent) so that after the first Ctrl+C you trigger
api_server_handle.shutdown() and also spawn a watcher that on a second Ctrl+C or
after a timeout calls std::process::exit(1); specifically replace the direct
ctrl_c().await + api_server_handle.shutdown() sequence with a call into
spawn_shutdown_handler (or wrap api_server_handle.shutdown() in a
shutdown-with-timeout helper) to ensure a forced exit if graceful shutdown
hangs.
In `@forester/src/metrics.rs`:
- Around line 343-382: The code swallows Prometheus query errors by only
handling Ok(...) for tx_total, tx_rate, last_run, balances, and queues; update
each branch that currently uses if let Ok(ref v) = ... to also handle the Err(e)
case and log the error (e.g., warn! or debug!) with context (which query failed
and error details) before falling back to the zero/default values; make these
changes around the blocks that populate transactions_processed_total,
transaction_rate, last_run_timestamp, forester_balances, and queue_lengths and
reuse extract_label_values where applicable so failures are logged but existing
fallback behavior remains.
In `@forester/src/processor/v2/helpers.rs`:
- Around line 471-473: The call to self.data_ready.wait_timeout(guard,
Duration::from_millis(50)) currently discards a potential PoisonError; update
this to handle the Result consistently with the file's lock_recover pattern by
matching the return value, calling lock_recover (or equivalent recovery) if
Err(PoisonError) is returned, and logging or propagating the recovered guard;
specifically modify the wait_timeout invocation tied to the guard produced by
lock_recover so that Err(poison) is handled (recover the mutex and continue
using the recovered guard) rather than silently ignored.
In `@forester/src/processor/v2/proof_worker.rs`:
- Line 3: Replace the hard-coded MAX_CONCURRENT_PROOFS constant with a
configurable field on ProverConfig (e.g., max_concurrent_proofs: usize)
defaulting to 64, then use that field inside spawn_proof_workers and any
functions in proof_worker.rs that currently reference MAX_CONCURRENT_PROOFS;
update spawn_proof_workers' parameter usage to read config.max_concurrent_proofs
(or fall back to 64 if None) and remove or deprecate the constant so deployments
can tune concurrency via ProverConfig.
- Around line 182-186: The code stores the Result from
semaphore.clone().acquire_owned().await into permit and then moves that Result
into the spawned task as _permit, which means the actual OwnedSemaphorePermit
isn’t held and errors are silently ignored; change the await to unwrap/expect
(or match and early-return/log) so you obtain an OwnedSemaphorePermit before
spawning. Concretely, call semaphore.clone().acquire_owned().await.expect("...")
(or handle Err explicitly) so permit is an OwnedSemaphorePermit, then move that
permit into the tokio::spawn closure and bind it to _permit to keep the permit
alive for the task.
In `@forester/src/slot_tracker.rs`:
- Around line 13-16: The slot duration is hardcoded to 400ms in SLOT_DURATION_MS
and exposed via slot_duration(), but the frontend uses a different value
(460ms); to avoid divergence either extract a shared canonical constant (e.g.,
move SLOT_DURATION_MS/slot_duration to a shared config crate or an exported API
consumed by the dashboard) or, at minimum, add a clear comment above
SLOT_DURATION_MS stating it is the canonical slot duration (400ms) and call out
the frontend's 460ms polling approximation and where to change it; update
references to SLOT_DURATION_MS/slot_duration() accordingly so all code points
use the single source of truth.
In `@forester/src/telemetry.rs`:
- Line 42: The file_layer created with
fmt::Layer::new().with_writer(non_blocking) should explicitly disable ANSI
escape codes; update the file_layer construction (symbol: file_layer, function:
fmt::Layer::new()) to call .with_ansi(false) so logs written via the
non_blocking writer do not contain terminal ANSI sequences.
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 nextjs
🧹 Nitpick | 🔵 Trivial
Consolidate consecutive RUN instructions.
These two RUN commands can be merged into one layer, reducing image size slightly and addressing the Hadolint DL3059 hint.
♻️ Proposed fix
-RUN addgroup --system --gid 1001 nodejs
-RUN adduser --system --uid 1001 nextjs
+RUN addgroup --system --gid 1001 nodejs && \
+    adduser --system --uid 1001 nextjs
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
🧰 Tools
🪛 Hadolint (2.14.0)
[info] 18-18: Multiple consecutive RUN instructions. Consider consolidation.
(DL3059)
🤖 Prompt for AI Agents
In `@forester/dashboard/Dockerfile` around lines 17 - 18, Combine the two
consecutive Docker RUN layers into a single RUN to reduce image layers and
satisfy DL3059: replace the separate RUN addgroup --system --gid 1001 nodejs and
RUN adduser --system --uid 1001 nextjs with one RUN that executes both commands
(using shell chaining such as &&) so both addgroup and adduser run in the same
layer and preserve exit-on-failure behavior.
USER nextjs
EXPOSE 3000
ENV PORT=3000
CMD ["node", "server.js"]
🧹 Nitpick | 🔵 Trivial
Consider adding a HEALTHCHECK instruction.
Both Trivy and Checkov flag the missing HEALTHCHECK. For a production dashboard container, a simple health check helps orchestrators (Docker Compose, ECS, K8s with Docker health probes) detect when the Next.js server is unresponsive.
🩺 Example HEALTHCHECK
 USER nextjs
 EXPOSE 3000
 ENV PORT=3000
+HEALTHCHECK --interval=30s --timeout=5s --start-period=10s --retries=3 \
+  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/ || exit 1
 CMD ["node", "server.js"]
Note: node:20-alpine includes wget (via BusyBox). The health command uses the shell form because `||` is not interpreted inside the JSON exec form — it would be passed to wget as a literal argument. Adjust the endpoint to a dedicated health route if one exists.
🧰 Tools
🪛 Checkov (3.2.334)
[low] 1-26: Ensure that HEALTHCHECK instructions have been added to container images
(CKV_DOCKER_2)
🤖 Prompt for AI Agents
In `@forester/dashboard/Dockerfile` around lines 23 - 26, Add a Docker HEALTHCHECK
instruction to the Dockerfile to probe the running Next.js server (after the
existing USER nextjs / EXPOSE 3000 / ENV PORT=3000 / CMD ["node","server.js"]
lines): implement a lightweight HTTP probe (e.g., wget or curl against
http://localhost:3000/ or a dedicated /health route if available) with sensible
parameters (interval, timeout, start-period and retries) so the container
runtime can mark the container unhealthy when the server is unresponsive.
const isEmpty =
  Object.keys(metrics.transactions_processed_total).length === 0 &&
  Object.keys(metrics.forester_balances).length === 0;
🧹 Nitpick | 🔵 Trivial
Consider including queue_lengths in the empty-state check.
The emptiness check only looks at transactions_processed_total and forester_balances. If the API returns data solely in queue_lengths or transaction_rate, the page will still show the MetricsPanel — which is probably fine since MetricsPanel conditionally renders those sections. Just flagging in case the intent was to detect "truly no data at all."
🤖 Prompt for AI Agents
In `@forester/dashboard/src/app/metrics/page.tsx` around lines 22 - 24, The
empty-state check for isEmpty currently only inspects
metrics.transactions_processed_total and metrics.forester_balances; update it to
also consider metrics.queue_lengths (and optionally metrics.transaction_rate) so
that isEmpty truly reflects no data across all metric groups. Locate the isEmpty
declaration and include Object.keys(metrics.queue_lengths).length === 0 (and
Object.keys(metrics.transaction_rate).length === 0 if you want to treat
transaction_rate as contributing to non-empty) in the combined boolean
expression. Ensure you use the same metrics object keys (metrics.queue_lengths,
metrics.transaction_rate) and keep the logical AND semantics so the page only
treats it as empty when all relevant metric maps are empty.
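A minimal sketch of the broader check suggested above. The MetricsLike shape is illustrative — the real MetricsSnapshot type lives in the dashboard's API client — but the combined logical-AND semantics are what the comment describes:

```typescript
// Hypothetical shape mirroring the metric groups referenced in the review.
type MetricsLike = {
  transactions_processed_total: Record<string, number>;
  transaction_rate: Record<string, number>;
  forester_balances: Record<string, number>;
  queue_lengths: Record<string, number>;
};

// True only when every metric group is empty, so the page shows the
// empty state only when there is truly no data at all.
function isMetricsEmpty(m: MetricsLike): boolean {
  return (
    Object.keys(m.transactions_processed_total).length === 0 &&
    Object.keys(m.transaction_rate).length === 0 &&
    Object.keys(m.forester_balances).length === 0 &&
    Object.keys(m.queue_lengths).length === 0
  );
}
```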
if (
  status.slots_until_next_registration < 1000 &&
  status.registration_epoch_foresters.length === 0
) {
  warnings.push("Registration closing soon with no foresters registered");
}
🧹 Nitpick | 🔵 Trivial
Magic number 1000 for registration slot threshold.
This threshold controls when the "Registration closing soon" warning appears. Consider extracting it as a named constant to make the intent clearer and easier to tune:
+const REGISTRATION_WARNING_SLOT_THRESHOLD = 1000;
+
if (
- status.slots_until_next_registration < 1000 &&
+ status.slots_until_next_registration < REGISTRATION_WARNING_SLOT_THRESHOLD &&
status.registration_epoch_foresters.length === 0
 )
🤖 Prompt for AI Agents
In `@forester/dashboard/src/app/page.tsx` around lines 29 - 34, The literal 1000
should be replaced with a named constant to clarify intent and make tuning
easier: add a top-level constant like REGISTRATION_WARNING_SLOT_THRESHOLD (or
similar) in page.tsx and use it in the check inside the block that currently
references status.slots_until_next_registration (the if that also checks
status.registration_epoch_foresters.length). Replace the magic number with that
constant, choose a descriptive name and add a brief comment describing what the
threshold represents.
{warnings.map((w, i) => (
  <div
    key={i}
    className="bg-amber-50 border border-amber-200 rounded-lg px-4 py-3 text-sm text-amber-800"
  >
    {w}
  </div>
))}
🧹 Nitpick | 🔵 Trivial
Consider using the warning string as key instead of array index.
Since warning messages are unique strings derived deterministically from the status, they make better React keys than array indices. This is a minor nit — with a small, non-reorderable list it won't cause bugs, but it's slightly more idiomatic:
- {warnings.map((w, i) => (
- <div key={i}
+ {warnings.map((w) => (
+ <div key={w}
🤖 Prompt for AI Agents
In `@forester/dashboard/src/app/page.tsx` around lines 45 - 52, Replace the array
index key in the warnings rendering (the warnings.map callback) with the warning
string itself: use the iteration value (the warning string variable passed as w)
as the React key in the div (ensuring it’s unique/deterministic), or fall back
to a stable derived id if you have any risk of duplicates; update the key prop
in the map inside the component that renders warnings (the warnings.map
callback) accordingly.
let _ = self
    .data_ready
    .wait_timeout(guard, std::time::Duration::from_millis(50));
🛠️ Refactor suggestion | 🟠 Major
Handle poisoned condvar consistently with lock_recover pattern.
Throughout this file, mutex poisoning is handled gracefully via lock_recover. But wait_timeout on a poisoned mutex will return Err(PoisonError), and the current let _ = ... silently discards it. If the mutex was poisoned between the lock_recover at line 467 and the wait_timeout at line 471, the error is swallowed without logging.
For consistency with the rest of the file's poison-recovery strategy:
🛡️ Proposed fix
- let _ = self
+ let (_guard, _timeout_result) = self
.data_ready
- .wait_timeout(guard, std::time::Duration::from_millis(50));
+ .wait_timeout(guard, std::time::Duration::from_millis(50))
+ .unwrap_or_else(|poisoned| {
+ tracing::warn!("Poisoned condvar (recovering): streaming_address_queue.data_ready");
+ poisoned.into_inner()
+    });
🤖 Prompt for AI Agents
In `@forester/src/processor/v2/helpers.rs` around lines 471 - 473, The call to
self.data_ready.wait_timeout(guard, Duration::from_millis(50)) currently
discards a potential PoisonError; update this to handle the Result consistently
with the file's lock_recover pattern by matching the return value, calling
lock_recover (or equivalent recovery) if Err(PoisonError) is returned, and
logging or propagating the recovered guard; specifically modify the wait_timeout
invocation tied to the guard produced by lock_recover so that Err(poison) is
handled (recover the mutex and continue using the recovered guard) rather than
silently ignored.
@@ -1,5 +1,7 @@
 use std::{sync::Arc, time::Duration};

 const MAX_CONCURRENT_PROOFS: usize = 64;
🧹 Nitpick | 🔵 Trivial
Consider making the concurrency limit configurable via ProverConfig.
MAX_CONCURRENT_PROOFS = 64 is a reasonable default, but the optimal value depends on the prover server's capacity, which can vary across deployments. ProverConfig is already threaded into spawn_proof_workers — adding a field with a default of 64 would let operators tune this without a recompile.
🤖 Prompt for AI Agents
In `@forester/src/processor/v2/proof_worker.rs` at line 3, Replace the hard-coded
MAX_CONCURRENT_PROOFS constant with a configurable field on ProverConfig (e.g.,
max_concurrent_proofs: usize) defaulting to 64, then use that field inside
spawn_proof_workers and any functions in proof_worker.rs that currently
reference MAX_CONCURRENT_PROOFS; update spawn_proof_workers' parameter usage to
read config.max_concurrent_proofs (or fall back to 64 if None) and remove or
deprecate the constant so deployments can tune concurrency via ProverConfig.
  let permit = semaphore.clone().acquire_owned().await;
  // Spawn immediately so we don't block receiving the next job
- // while waiting for HTTP submission
+ // while waiting for HTTP submission. Semaphore bounds concurrency.
  tokio::spawn(async move {
      let _permit = permit;
🧹 Nitpick | 🔵 Trivial
Unwrap the semaphore permit to avoid silently ignoring errors.
acquire_owned() returns Result<OwnedSemaphorePermit, AcquireError>. Right now, permit is a Result and on line 186 _permit holds that Result, not the actual permit. If acquire_owned ever errors (semaphore closed), the task would run without holding a permit—bypassing the concurrency bound silently. In practice AcquireError can't fire here because the Arc<Semaphore> is never dropped, but making the intent explicit is much cleaner and will surface bugs if the code evolves.
Proposed fix
- let permit = semaphore.clone().acquire_owned().await;
+ let permit = semaphore
+ .clone()
+ .acquire_owned()
+ .await
+ .expect("semaphore closed unexpectedly");
// Spawn immediately so we don't block receiving the next job
// while waiting for HTTP submission. Semaphore bounds concurrency.
tokio::spawn(async move {
        let _permit = permit;
🤖 Prompt for AI Agents
In `@forester/src/processor/v2/proof_worker.rs` around lines 182 - 186, The code
stores the Result from semaphore.clone().acquire_owned().await into permit and
then moves that Result into the spawned task as _permit, which means the actual
OwnedSemaphorePermit isn’t held and errors are silently ignored; change the
await to unwrap/expect (or match and early-return/log) so you obtain an
OwnedSemaphorePermit before spawning. Concretely, call
semaphore.clone().acquire_owned().await.expect("...") (or handle Err explicitly)
so permit is an OwnedSemaphorePermit, then move that permit into the
tokio::spawn closure and bind it to _permit to keep the permit alive for the
task.
  const SLOT_DURATION_MS: u64 = 400;

  pub fn slot_duration() -> Duration {
-     Duration::from_nanos(solana_sdk::genesis_config::GenesisConfig::default().ns_per_slot() as u64)
+     Duration::from_millis(SLOT_DURATION_MS)
Hardcoded 400ms slot duration — reasonable, but note the frontend inconsistency.
400ms is Solana's target slot time, so this is a sensible default. However, there's a discrepancy with the dashboard frontend: in forester/dashboard/src/hooks/useForesterStatus.ts (line 9), the refresh interval calculation uses 0.46 * 1000 (460ms) per slot. While both are approximations and the frontend usage is just for polling cadence, having two different "slot duration" constants across the codebase is a maintenance hazard.
Consider extracting a shared constant or at least adding a comment here noting the canonical value, so future readers know which to reference.
🤖 Prompt for AI Agents
In `@forester/src/slot_tracker.rs` around lines 13 - 16, The slot duration is
hardcoded to 400ms in SLOT_DURATION_MS and exposed via slot_duration(), but the
frontend uses a different value (460ms); to avoid divergence either extract a
shared canonical constant (e.g., move SLOT_DURATION_MS/slot_duration to a shared
config crate or an exported API consumed by the dashboard) or, at minimum, add a
clear comment above SLOT_DURATION_MS stating it is the canonical slot duration
(400ms) and call out the frontend's 460ms polling approximation and where to
change it; update references to SLOT_DURATION_MS/slot_duration() accordingly so
all code points use the single source of truth.
let (non_blocking, guard) = tracing_appender::non_blocking(file_appender);
let _ = LOG_GUARD.set(guard);

let file_layer = fmt::Layer::new().with_writer(non_blocking);
File layer should disable ANSI escape codes.
The stdout_layer correctly sets .with_ansi(true) for terminal coloring, but the file_layer doesn't set .with_ansi(false). By default, fmt::Layer may include ANSI codes depending on the writer, and writing them to log files produces garbled output when viewed in editors or ingested by log aggregation tools.
Proposed fix
- let file_layer = fmt::Layer::new().with_writer(non_blocking);
+ let file_layer = fmt::Layer::new().with_writer(non_blocking).with_ansi(false);
🤖 Prompt for AI Agents
In `@forester/src/telemetry.rs` at line 42, The file_layer created with
fmt::Layer::new().with_writer(non_blocking) should explicitly disable ANSI
escape codes; update the file_layer construction (symbol: file_layer, function:
fmt::Layer::new()) to call .with_ansi(false) so logs written via the
non_blocking writer do not contain terminal ANSI sequences.