[pipeline] Shared Arc<RootProvider> has no reconnect path — all pipeline stages fail permanently on WS drop #175

@obchain

Refs #42

PR: feat(cli): wire scanner → router → builder → simulator pipeline (feat/17-cli-e2e-pipeline)
Commit: latest on feat/17-cli-e2e-pipeline
File: crates/charon-cli/src/main.rs (adapter_ws and price_cache_ws construction)

Problem:

PR #42 consolidates four WS connections into fewer shared Arc<RootProvider> handles. The BlockListener has its own reconnect loop (PR #32); the shared provider has none.

If the WebSocket connection underlying the shared provider drops (network interruption, BSC node restart, Cloudflare WS timeout):

  • adapter.fetch_positions() returns RPC errors every block.
  • router.route() (Aave PoolDataProvider query) fails.
  • simulator.simulate() (eth_call) fails.
  • price_cache.refresh() fails.

The pipeline logs RPC errors every 3 seconds indefinitely. There is no backoff, no provider reconnect attempt, and no supervisor to restart the provider. BlockListener reconnects and resumes sending block events, so the drain loop stays active but all pipeline stages fail on every tick. With no Prometheus metrics yet (PR #50 pending), no alert fires.

Impact: Any network interruption leaves the bot in a permanently degraded state that only a manual restart clears. The only signal visible to the operator is warn-level log output.

Fix:

  • Wrap provider creation in a reconnect-aware factory matching BlockListener's backoff pattern.
  • Or: on consecutive RPC failures (e.g. 3 in a row within one block interval), trigger controlled shutdown so Docker restart policy recovers the process cleanly.
  • Document the reconnect strategy in a follow-up issue if deferring.
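
The first two options can be sketched with std-only Rust. This is a minimal illustration, not code from the repo: `Backoff`, `FailureTracker`, and `Action` are hypothetical names, the 500 ms base and 30 s cap are assumed values (BlockListener's actual backoff parameters are not shown in this issue), and the tracker counts consecutive failures only, without enforcing the "within one block interval" window.

```rust
use std::time::Duration;

/// Exponential backoff schedule for reconnect attempts.
/// Hypothetical helper; parameters are assumptions, not BlockListener's.
#[derive(Debug, Clone)]
pub struct Backoff {
    base: Duration,
    max: Duration,
    attempt: u32,
}

impl Backoff {
    pub fn new(base: Duration, max: Duration) -> Self {
        Self { base, max, attempt: 0 }
    }

    /// Delay before the next attempt: base * 2^attempt, capped at `max`.
    pub fn next_delay(&mut self) -> Duration {
        let factor = 2u32.saturating_pow(self.attempt);
        self.attempt = self.attempt.saturating_add(1);
        self.base.saturating_mul(factor).min(self.max)
    }

    /// Call after a successful reconnect so the next outage starts at `base`.
    pub fn reset(&mut self) {
        self.attempt = 0;
    }
}

/// What the drain loop should do after observing an RPC result.
#[derive(Debug, PartialEq, Eq)]
pub enum Action {
    Continue,
    Shutdown,
}

/// Counts consecutive RPC failures; any success resets the count.
pub struct FailureTracker {
    threshold: u32,
    consecutive: u32,
}

impl FailureTracker {
    pub fn new(threshold: u32) -> Self {
        Self { threshold, consecutive: 0 }
    }

    /// Returns `Shutdown` once `threshold` failures occur in a row.
    pub fn record<T, E>(&mut self, result: &Result<T, E>) -> Action {
        match result {
            Ok(_) => {
                self.consecutive = 0;
                Action::Continue
            }
            Err(_) => {
                self.consecutive += 1;
                if self.consecutive >= self.threshold {
                    Action::Shutdown
                } else {
                    Action::Continue
                }
            }
        }
    }
}

fn main() {
    // Option 1: delays a reconnect-aware factory would sleep between attempts.
    let mut backoff = Backoff::new(Duration::from_millis(500), Duration::from_secs(30));
    let delays: Vec<Duration> = (0..4).map(|_| backoff.next_delay()).collect();
    println!("reconnect delays: {:?}", delays);

    // Option 2: stand-in RPC results fed to the tracker, as the drain loop might.
    let mut tracker = FailureTracker::new(3);
    let results: [Result<(), &str>; 4] =
        [Err("ws closed"), Ok(()), Err("ws closed"), Err("ws closed")];
    for r in &results {
        println!("{:?} -> {:?}", r, tracker.record(r));
    }
}
```

In the real pipeline the factory would presumably rebuild the provider and swap it behind the shared handle after each delay, while the tracker's `Shutdown` would map to a non-zero process exit so the Docker restart policy takes over. Both of those wiring steps are assumptions and are left out here.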

Metadata

Assignees

No one assigned

Labels

  • bug: Something isn't working
  • layer:rust: Rust crates (core / scanner / protocols / executor / cli)
  • pr-review: Findings from PR review process
  • priority:p1-core: Core MVP scope
  • status:ready: Scoped and ready to pick up
