Skip to content

Add cumulative XOR field to CheckPoint for early-termination merge #2033

@evanlinjin

Description

@evanlinjin

Problem

When a chain source produces an update for LocalChain, the update is typically seeded from local's existing checkpoints and shares a long suffix with local — only the tip (and, in reorg cases, a short prefix near the tip) differs. The current merge walks every checkpoint in the update (unless if the underlying pointer matches) to find the point of agreement, even though the bulk of the update is usually redundant with what local already has.

We want an O(1) signal, usable at each step of the merge walk, that tells us "the suffix from this checkpoint back matches local" so we can terminate early.

Proposal

Add a cumulative_xor: [u8; 32] field to each CheckPoint, defined as the XOR of this checkpoint's block hash with all block hashes below it back to the chain's base. Maintained incrementally on insert() as parent.cumulative_xor ^ self.hash.

During merge, when walking update and local in lockstep at matching heights, if update_cp.cumulative_xor == local_cp.cumulative_xor, the suffixes agree and the walk can terminate.

Behavior

  • When update and local share a suffix with matching heights (the expected case for updates seeded from local), the walk terminates as soon as cumulative XORs match.
  • When update and local sample at different heights — e.g., the update originated from a differently-structured source — the XORs will not match even if the underlying blockchains agree, and the merge falls through to the full walk. This is a performance miss, not a correctness issue.
  • The field is an early-termination hint. The merge remains structurally correct by walk; XOR only determines when the walk can stop.

Testing

  • Cumulative XOR correctly computed on construction and maintained across insert().
  • Equal suffixes (same hashes at same heights) produce equal cumulative XOR at the top of the suffix.
  • No-reorg case: update's parent checkpoint matches local's tip — walk terminates immediately after the new tip.
  • Reorg case: divergence at tip, suffix below agrees — walk terminates at the first matching height below the reorg.
  • Differently-sampled chains: XORs differ, merge falls through to full walk, result is correct.
  • Edge cases: empty chain, single-block chain, update whose base differs from local's base.

Benchmarks

Numbers for the common case (frequent sync, long shared suffix) and the cold-catchup case (short shared suffix) to quantify the win and the per-insert overhead against baseline.

Open questions

  • Under CheckPoint<H> generalization: XOR block hashes in both the lightweight and verified paths so fingerprints are comparable across representations, or keep semantics path-specific?
  • Is the 32-byte-per-checkpoint memory cost acceptable across all bdk_core usage, or should this be opt-in via a wrapper type?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    Status

    Discussion

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions