
Implementing replication #148

@cowlicks

This issue tracks the implementation of replication in hypercore. I've started on it, but it requires some deeper changes.

Hypercore replication requires a Hypercore to handle an open-ended set of peer connections simultaneously, processing protocol messages from each peer while performing core operations (creating and verifying Merkle proofs, reading and writing blocks) in response. The target design implements Stream for Hypercore directly, so replication runs as structured concurrency: no spawned tasks, no shared ownership — the Hypercore drives its own peers when polled.

Problem

A Peer lives inside Hypercore (in self.peers). When a Peer handles messages from its remote counterpart, it must call methods on Hypercore such as core.create_proof(..).await or core.verify_and_apply_proof(..).await. But almost all of the needed methods are async and return a future that borrows &mut self. A natural representation is to store the in-flight future in the Peer:

struct Peer {
    protocol: Protocol,
    pending_proof: Option<Pin<Box<dyn Future<Output = Result<Option<Proof>, _>>>>>,
    // ...
}

This is impossible with the current API. Hypercore::create_proof is an async fn taking &mut self, so its future has type impl Future + '_: it borrows &mut Hypercore for its entire duration. Storing that future inside self.peers[i] would create a self-referential struct, which Rust's borrow checker correctly rejects. As a consequence, at most one async core operation can be in flight at any time, and it cannot be stored anywhere; it must be driven to completion before returning from poll_next. This makes it impossible to correctly implement the replication message loop in a poll-based design.

The root cause is that the &mut self borrow propagates upward from the bottom of the storage stack:

  • RandomAccess (via #[async_trait]) desugars async fn read(&mut self) into a method returning a boxed future that is tied to the lifetime of the &mut self borrow
  • Storage holds Box<dyn StorageTraits> and calls these methods, so its methods need &mut self
  • Oplog, MerkleTree, BlockStore, and Bitfield call through Storage, inheriting &mut self
  • Hypercore::create_proof and verify_and_apply_proof call all of the above, so it is &mut self all the way up
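
The propagation described in the list above can be reproduced in a few lines. The following is a minimal sketch, not the crate's code: RandomAccessMut, Memory, Storage::read_header, and block_on are hypothetical names, and the trait method is a hand-written approximation of what #[async_trait] generates, with a simplified return type.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hand-written approximation of what `#[async_trait]` produces for
/// `async fn read(&mut self, offset: u64, length: u64) -> Vec<u8>`.
/// The trait name and simplified return type are assumptions for this sketch.
trait RandomAccessMut {
    fn read<'a>(
        &'a mut self,
        offset: u64,
        length: u64,
    ) -> Pin<Box<dyn Future<Output = Vec<u8>> + Send + 'a>>;
}

struct Memory {
    bytes: Vec<u8>,
}

impl RandomAccessMut for Memory {
    fn read<'a>(
        &'a mut self,
        offset: u64,
        length: u64,
    ) -> Pin<Box<dyn Future<Output = Vec<u8>> + Send + 'a>> {
        // The future captures `self`, so it cannot outlive the `'a` borrow.
        Box::pin(async move {
            let start = offset as usize;
            self.bytes[start..start + length as usize].to_vec()
        })
    }
}

/// Hypothetical layer above the trait object, standing in for `Storage`.
struct Storage {
    inner: Box<dyn RandomAccessMut + Send>,
}

impl Storage {
    // Calling a `&mut self` method forces `&mut self` here as well, and the
    // returned future borrows `self` for as long as it exists.
    async fn read_header(&mut self) -> Vec<u8> {
        self.inner.read(0, 4).await
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, for examples only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// This does not compile, because the first future still holds the borrow:
//
//     let a = storage.read_header();
//     let b = storage.read_header(); // error[E0499]: second mutable borrow
//     block_on(a);
```

Running one operation at a time works, for example block_on(storage.read_header()), but two futures from the same Storage can never coexist, which is the restriction a Peer runs into.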

The Fix

Change RandomAccess to take &self and return 'static futures. Implementations achieve this by managing their mutable state internally. Plan for changing the existing implementations:

  • RandomAccessMemory: all operations are already purely synchronous (in-memory Vec<u8> manipulation with no I/O). Wrap the state in Arc<std::sync::Mutex<>> and return std::future::ready(result).
  • RandomAccessDisk: performs genuine async file I/O, so the file handle and metadata move into Arc<async_lock::Mutex<DiskInner>>.
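
As a rough illustration of the in-memory case in the plan above, the sketch below shows a &self trait whose methods return 'static futures. It assumes a reduced trait with only read and write, a String error type, and a BoxFuture alias; none of these are the crate's actual signatures, and block_on and demo are helpers for the example only.

```rust
use std::future::{ready, Future};
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

/// Owned, `'static` boxed future (alias is an assumption of this sketch).
type BoxFuture<T> = Pin<Box<dyn Future<Output = T> + Send + 'static>>;

/// Hypothetical `&self` version of the trait, reduced to two methods.
trait RandomAccess {
    fn write(&self, offset: u64, data: &[u8]) -> BoxFuture<Result<(), String>>;
    fn read(&self, offset: u64, length: u64) -> BoxFuture<Result<Vec<u8>, String>>;
}

/// Mutable state lives behind `Arc<Mutex<..>>`, so `&self` is enough.
#[derive(Clone, Default)]
struct RandomAccessMemory {
    inner: Arc<Mutex<Vec<u8>>>,
}

impl RandomAccess for RandomAccessMemory {
    fn write(&self, offset: u64, data: &[u8]) -> BoxFuture<Result<(), String>> {
        // The work is done eagerly here; the future only carries the result.
        let mut bytes = self.inner.lock().unwrap();
        let start = offset as usize;
        let end = start + data.len();
        if bytes.len() < end {
            bytes.resize(end, 0);
        }
        bytes[start..end].copy_from_slice(data);
        let result: Result<(), String> = Ok(());
        Box::pin(ready(result))
    }

    fn read(&self, offset: u64, length: u64) -> BoxFuture<Result<Vec<u8>, String>> {
        let bytes = self.inner.lock().unwrap();
        let start = offset as usize;
        let end = start + length as usize;
        let result = bytes
            .get(start..end)
            .map(|slice| slice.to_vec())
            .ok_or_else(|| format!("read {}..{} out of bounds", start, end));
        Box::pin(ready(result))
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Minimal busy-polling executor, for examples only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Two read futures coexist and are driven after both were created.
fn demo() -> (Vec<u8>, Vec<u8>) {
    let mem = RandomAccessMemory::default();
    block_on(mem.write(0, b"hello world")).unwrap();
    let first = mem.read(0, 5);
    let second = mem.read(6, 5);
    (block_on(first).unwrap(), block_on(second).unwrap())
}
```

One consequence of the ready-based approach, visible in the sketch, is that the operation takes effect when the method is called rather than when the future is polled. A disk-backed implementation would instead move a clone of its Arc into an async block and do the I/O at poll time.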

With RandomAccess returning 'static futures, Storage fields change from Box<dyn StorageTraits> to Arc<dyn StorageTraits + Send + Sync>, its methods become &self, and the 'static property propagates up through Oplog, MerkleTree, BlockStore, and Bitfield. Hypercore's async methods become &self returning owned futures.

Peer can now legally hold a pending Pin<Box<dyn Future + 'static>>, Stream::poll_next can drive it each poll without conflicting with the borrow of self.peers, and the replication message loop becomes straightforward to implement.
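
A minimal sketch of that end state, under several assumptions: Core stands in for Hypercore, the "proof" is just a block length, Replicator::poll_next is an inherent method shaped like Stream::poll_next (to avoid depending on the futures crate), and the requested field is a hypothetical stand-in for an incoming protocol message.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Wake, Waker};

type BoxFuture<T> = Pin<Box<dyn Future<Output = T> + Send + 'static>>;

/// Hypothetical, heavily simplified stand-in for `Hypercore`.
struct Core {
    blocks: Arc<Mutex<Vec<Vec<u8>>>>,
}

impl Core {
    /// Stand-in for `create_proof`: takes `&self`, returns an owned future.
    /// The "proof" is only the block's length, to keep the sketch short.
    fn create_proof(&self, index: usize) -> BoxFuture<Option<usize>> {
        let blocks = Arc::clone(&self.blocks);
        Box::pin(async move {
            let guard = blocks.lock().unwrap();
            let proof = guard.get(index).map(|block| block.len());
            proof
        })
    }
}

struct Peer {
    /// Hypothetical stand-in for a request received from the remote peer.
    requested: Option<usize>,
    /// In-flight core operation, stored across polls.
    pending_proof: Option<BoxFuture<Option<usize>>>,
}

struct Replicator {
    core: Core,
    peers: Vec<Peer>,
}

impl Replicator {
    /// Yields `(peer index, proof)` events. Ending the stream when idle is a
    /// simplification; a real loop would wait for further messages.
    fn poll_next(&mut self, cx: &mut Context<'_>) -> Poll<Option<(usize, Option<usize>)>> {
        for (i, peer) in self.peers.iter_mut().enumerate() {
            if let Some(index) = peer.requested.take() {
                // `self.core` is borrowed immutably while `self.peers` is
                // borrowed mutably; the returned future borrows neither.
                peer.pending_proof = Some(self.core.create_proof(index));
            }
            if let Some(fut) = peer.pending_proof.as_mut() {
                if let Poll::Ready(proof) = fut.as_mut().poll(cx) {
                    peer.pending_proof = None;
                    return Poll::Ready(Some((i, proof)));
                }
            }
        }
        if self.peers.iter().all(|p| p.pending_proof.is_none()) {
            Poll::Ready(None)
        } else {
            Poll::Pending
        }
    }
}

struct NoopWake;
impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Drives the replicator to completion with a no-op waker (example only).
fn drain(rep: &mut Replicator) -> Vec<(usize, Option<usize>)> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut events = Vec::new();
    loop {
        match rep.poll_next(&mut cx) {
            Poll::Ready(Some(event)) => events.push(event),
            Poll::Ready(None) => return events,
            Poll::Pending => continue,
        }
    }
}
```

Because each stored future owns an Arc clone of the state it needs, several peers can have operations in flight at once, and no task is spawned to drive them.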
