Design block-by-block API

_Originally posted by @evanlinjin in https://github.com/bitcoindevkit/bdk/issues/1172#issuecomment-1857636718_

### Requirements

1. Ability to insert **ONLY** *relevant* checkpoints. *Relevant* checkpoints are those of blocks
which contain relevant transactions. This is important for block-by-block syncing, e.g. a full node
without CBF, a CBF node (we still want to filter out false positives), and silent payments (in the
future).

2. Ability for the block-source to handle reorgs (mid-sync) without requesting data from 
`bdk::Wallet`.

### Why `apply_block_connected_to` and `apply_block_assume_connected` Do Not Satisfy the Requirements

Both of these methods apply all block checkpoints, regardless of whether the blocks contain relevant
transactions. To solve this, we could add another method, `apply_block_relevant_assume_connected`,
that does not apply checkpoints of blocks containing no relevant transactions. However, this cannot
handle reorgs mid-sync in an elegant way.

Let's assume we have `apply_block_relevant_assume_connected` which filters checkpoints, and the
following scenario plays out:

```
block_height        | 1 | 2 | 3 |
emitted_initially     A   B   C
relevant              A       C
emitted_post_reorg    A   B'
```

`emitted_initially` are the checkpoints that the chain-source has emitted. `relevant` are the
checkpoints that end up being stored in `LocalChain`. `emitted_post_reorg` are the checkpoints that
the chain-source re-emits due to the reorg. As can be seen, `A, B'` (the update) cannot connect with
`A, C`, because the update says nothing about height 3 and we cannot tell whether `C` still belongs
to the same chain as `B'`. We need `A, B', C'` as the update.
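To make the failure concrete, here is a minimal sketch of the connection check. It is a simplified model, not the actual `LocalChain` logic: `Chain`, `chain` and `can_connect` are hypothetical names, hashes are short labels, and the rule assumed is that the original checkpoint directly above the highest point of agreement must be replaced by the update whenever the update extends past that point.

```rust
use std::collections::BTreeMap;

/// Hypothetical simplified chain: block height -> block hash (a short label here).
type Chain = BTreeMap<u32, &'static str>;

/// Builds a `Chain` from `(height, hash)` pairs.
fn chain(checkpoints: &[(u32, &'static str)]) -> Chain {
    checkpoints.iter().copied().collect()
}

/// Simplified connection rule (an assumption made for illustration).
fn can_connect(original: &Chain, update: &Chain) -> bool {
    // Highest height at which both chains hold the same hash.
    let agreement = original
        .iter()
        .rev()
        .find(|(height, hash)| update.get(*height) == Some(*hash))
        .map(|(height, _)| *height);
    let Some(agreement) = agreement else {
        return false; // nothing in common, so the update cannot connect
    };

    // Does the update introduce anything above the point of agreement?
    let update_extends = update.range(agreement + 1..).next().is_some();

    match original.range(agreement + 1..).next() {
        // The original ends at the point of agreement: the update simply extends it.
        None => true,
        // The original continues above the point of agreement. If the update also
        // continues, it must cover that height. Its hash there necessarily differs
        // (otherwise the point of agreement would be higher), so it invalidates
        // the original checkpoint.
        Some((height, _)) => !update_extends || update.contains_key(height),
    }
}
```

With the original chain `[1:A, 3:C]`, the update `[1:A, 2:B']` is rejected by this check, while `[1:A, 2:B', 3:C']` is accepted because it replaces height 3.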

### My proposal

```rust
/// Introduces transactions of the given block to the wallet.
///
/// Only relevant transactions are inserted. Transactions are inserted alongside their anchors.
fn introduce_block_txs(&mut self, block: &Block, height: u32) { todo!() }

/// Introduces a chain `tip` to the wallet as a `CheckPoint`.
///
/// This updates the `last_synced_to_height: u32` parameter to the height of `tip` if `tip` can
/// connect with the internal `LocalChain`.
///
/// This method attempts to insert the `tip`, but only if it contains relevant transactions.
fn introduce_tip(&mut self, tip: CheckPoint) -> Result<(), CannotConnectError> { todo!() }
```

The chain-source emitter is responsible for emitting full blocks and checkpoints (that connect the
current block to previously emitted blocks).

The block is first processed by `introduce_block_txs`. This inserts relevant transactions and
associated anchors into the wallet.

Then we call `introduce_tip`. If the block contains relevant transactions, the `LocalChain` is
updated with this new tip (and only the tip, since we want to skip irrelevant checkpoints). If the
tip is irrelevant, we only update the `last_synced_to_height: u32` value.
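The two-step flow can be sketched with mock types. Everything here is hypothetical and heavily simplified: `MockBlock` and `MockWallet` stand in for the real block and wallet, relevance is modelled as "pays to a script id we own", the tip is passed as a plain height and hash instead of a `CheckPoint`, and the connection check and error handling of `introduce_tip` are left out.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for a block: a hash label plus the script ids it pays to.
struct MockBlock {
    hash: &'static str,
    pays_to: Vec<u32>,
}

/// Hypothetical stand-in for the wallet state touched by the two methods.
#[derive(Default)]
struct MockWallet {
    owned_scripts: BTreeSet<u32>,
    /// Blocks that anchor at least one relevant transaction.
    anchors: BTreeSet<(u32, &'static str)>,
    /// Only checkpoints of relevant blocks end up here.
    local_chain: BTreeMap<u32, &'static str>,
    last_synced_to_height: u32,
}

impl MockWallet {
    /// Records an anchor if the block contains something relevant to us.
    fn introduce_block_txs(&mut self, block: &MockBlock, height: u32) {
        if block.pays_to.iter().any(|script| self.owned_scripts.contains(script)) {
            self.anchors.insert((height, block.hash));
        }
    }

    /// Inserts the tip only if it is relevant; always advances the synced height.
    /// (Connection checks are omitted in this sketch.)
    fn introduce_tip(&mut self, height: u32, hash: &'static str) {
        if self.anchors.contains(&(height, hash)) {
            self.local_chain.insert(height, hash);
        }
        self.last_synced_to_height = height;
    }
}

/// Feeds every emitted block through both steps, in order.
fn sync(wallet: &mut MockWallet, emissions: &[(u32, MockBlock)]) {
    for (height, block) in emissions {
        wallet.introduce_block_txs(block, *height);
        wallet.introduce_tip(*height, block.hash);
    }
}
```

Syncing blocks 1 to 4 where only blocks 1 and 3 are relevant leaves `local_chain` with heights 1 and 3, while `last_synced_to_height` ends up at 4.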

#### How does `introduce_tip` work?

Imagine a situation where the emitter has emitted block height 1 (with hash `A`) and height 2 (with
hash `B`). `1:A` is considered relevant and `2:B` is not. The state of the wallet's `LocalChain`
would be a single checkpoint `[1:A]`.

If the next emission is `3:C` and it contains relevant transactions, the `tip` input of
`introduce_tip` may contain the update chain `[1:A, 2:B, 3:C]` (but of course, we only want to
insert `3:C` and not `2:B`). The logic of `introduce_tip` will iterate the update chain backwards
to determine whether `3:C` can connect with the wallet chain (in this case, it can via `1:A`). With
this knowledge, `introduce_tip` will create a trimmed update chain `[1:A, 3:C]`, which is then
applied so that only the new tip is inserted.

`introduce_tip` also needs to keep checkpoints that are needed to invalidate original checkpoints.
For example, with an original chain `[1:A, 3:C, 4:D]` and an update chain `[1:A, 3:C, 4:D', 5:E]`,
`introduce_tip` still needs to keep height 4 in the update even though it is not the tip. The
trimmed update will be `[3:C, 4:D', 5:E]`.
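The backwards walk and trimming could look roughly like the sketch below. It is only an illustration under assumptions: `Chain`, `chain` and `trim_update` are hypothetical names, chains are modelled as height-to-hash maps, and the tip is assumed to have already been found relevant by the caller.

```rust
use std::collections::BTreeMap;

/// Hypothetical simplified chain: block height -> block hash (a short label here).
type Chain = BTreeMap<u32, &'static str>;

/// Builds a `Chain` from `(height, hash)` pairs.
fn chain(checkpoints: &[(u32, &'static str)]) -> Chain {
    checkpoints.iter().copied().collect()
}

/// Walks the update backwards from its tip and keeps only:
/// - the tip itself,
/// - checkpoints that conflict with (and therefore invalidate) an original checkpoint,
/// - the first checkpoint that agrees with the original chain.
///
/// Returns `None` if the update never meets the original chain.
fn trim_update(original: &Chain, update: &Chain) -> Option<Chain> {
    let tip_height = *update.keys().next_back()?;
    let mut trimmed = Chain::new();

    for (&height, &hash) in update.iter().rev() {
        match original.get(&height) {
            // Point of agreement: keep it and stop walking.
            Some(&original_hash) if original_hash == hash => {
                trimmed.insert(height, hash);
                return Some(trimmed);
            }
            // Same height, different hash: needed to invalidate the original.
            Some(_) => {
                trimmed.insert(height, hash);
            }
            // The new tip is always kept.
            None if height == tip_height => {
                trimmed.insert(height, hash);
            }
            // Irrelevant intermediate checkpoint: skip it.
            None => {}
        }
    }
    None
}
```

Applied to the two examples above, this yields `[1:A, 3:C]` for the original `[1:A]`, and `[3:C, 4:D', 5:E]` for the original `[1:A, 3:C, 4:D]`.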

#### Some tips cannot connect, but we don't return error

Consider a scenario with an original chain `[1:A, 3:C, 5:E]` (where the original tip is at height 5).
If there is a 2-block reorg, the chain-source emitter will be tempted to emit the block at height 4
(as that is the earliest reorged block). The update chain may be `[1:A, 2:B, 3:C, 4:D']`. This cannot
connect because we cannot know whether `4:D'` and `5:E` belong to the same chain. However, it is not
the end of the world, as the next block emitted will be at height 5. So we just ignore this and wait
for the next emission instead of returning an error.
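A rough sketch of this "defer instead of error" behaviour follows. The names `Chain`, `chain` and `apply_or_defer` are hypothetical, the connection rule is a simplified assumption rather than the real `LocalChain` logic, and the second update is assumed to be already trimmed with `5:E'` being relevant.

```rust
use std::collections::BTreeMap;

/// Hypothetical simplified chain: block height -> block hash (a short label here).
type Chain = BTreeMap<u32, &'static str>;

/// Builds a `Chain` from `(height, hash)` pairs.
fn chain(checkpoints: &[(u32, &'static str)]) -> Chain {
    checkpoints.iter().copied().collect()
}

/// Returns the merged chain, or `None` when the update cannot connect yet and
/// the caller should simply wait for the next emission.
fn apply_or_defer(original: &Chain, update: &Chain) -> Option<Chain> {
    // Highest height at which both chains hold the same hash.
    let agreement = original
        .iter()
        .rev()
        .find(|(height, hash)| update.get(*height) == Some(*hash))
        .map(|(height, _)| *height)?;

    let update_extends = update.range(agreement + 1..).next().is_some();

    // If both chains continue above the point of agreement, the update must
    // replace the next original checkpoint. Otherwise we cannot tell whether
    // the two belong to the same chain, so we defer.
    if let Some((height, _)) = original.range(agreement + 1..).next() {
        if update_extends && !update.contains_key(height) {
            return None;
        }
    }

    let mut merged = original.clone();
    if update_extends {
        // Original checkpoints above the point of agreement are invalidated.
        merged.retain(|height, _| *height <= agreement);
    }
    merged.extend(update.range(agreement + 1..).map(|(height, hash)| (*height, *hash)));
    Some(merged)
}
```

With the original chain `[1:A, 3:C, 5:E]`, the update `[1:A, 2:B, 3:C, 4:D']` is deferred, and the following trimmed update `[3:C, 5:E']` connects and replaces `5:E`.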

#### Initiating syncs

Because we only include checkpoints of blocks which contain relevant transactions, we need somewhere
else to track the last-synced-to height. This value is used when creating a new instance of a
chain-source emitter. We need to track the last-synced-to height in the wallet's changeset.

```rust
impl Emitter {
    /// We use `last_cp` for reorg detection. Otherwise, we start emitting from
    /// `last_height - assume_final_depth` where the first emission will connect to `last_cp`.
    fn new(last_height: u32, last_cp: CheckPoint) -> Self { todo!() }
}
```
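As a small illustration of the start-height arithmetic, here is a sketch with mock types. `MockEmitter`, `MockCheckPoint`, the `start_height` field and the value of `ASSUME_FINAL_DEPTH` are all assumptions made for this example; the subtraction saturates so that a wallet synced to a very low height starts from genesis instead of underflowing.

```rust
/// Hypothetical depth below which blocks are assumed final (value chosen for illustration).
const ASSUME_FINAL_DEPTH: u32 = 6;

/// Hypothetical stand-in for `CheckPoint`: `(height, hash label)`.
type MockCheckPoint = (u32, &'static str);

struct MockEmitter {
    /// Height of the first block that will be emitted.
    start_height: u32,
    /// Highest relevant checkpoint known to the wallet, used for reorg detection.
    last_cp: MockCheckPoint,
}

impl MockEmitter {
    fn new(last_height: u32, last_cp: MockCheckPoint) -> Self {
        Self {
            // Re-emit the last few blocks so that a reorg near the tip is noticed.
            start_height: last_height.saturating_sub(ASSUME_FINAL_DEPTH),
            last_cp,
        }
    }
}
```

For example, with `last_height = 100` the emitter in this sketch would start at height 94, and with `last_height = 3` it would start at height 0.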

#### Optimizing `introduce_tip`

Because we are only inserting checkpoints with relevant transactions, inserted checkpoints will be
few and far apart.

For example, if the original chain is `[1:A, 4:D]` and the next relevant checkpoint is at height
4000, we need 3996 (4000 - 4) iterations just to find out whether the checkpoint at height 4000 can
connect to `4:D`.

A solution is to cache the most recent irrelevant checkpoints. For example, when we introduce the
checkpoint at height 5 (which is irrelevant), we cache it and associate it with our highest relevant
checkpoint `4:D`. We keep doing this so that when we get to height 4000, we can iterate from the
introduced `tip` and find that its previous node (at height 3999) is the same as the cached
checkpoint at height 3999. We also know that the cached checkpoint at height 3999 is connected to
`4:D`, so we can safely create a trimmed update of `[4:D, 4000:X]`.
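The caching idea can be sketched as follows. This is a simplified model under assumptions: `TipCache`, `Outcome` and `Cp` are hypothetical names, hashes are plain `u64` labels, only the single most recent irrelevant checkpoint is cached, and the caller passes the tip's previous checkpoint explicitly. On a cache miss the real logic would presumably fall back to the full backwards walk, which is not shown.

```rust
/// Hypothetical checkpoint: `(height, hash label)`.
type Cp = (u32, u64);

#[derive(Debug, PartialEq)]
enum Outcome {
    /// Irrelevant tip; remembered as connected to the highest relevant checkpoint.
    Cached,
    /// Relevant tip; the trimmed update to apply to the wallet's chain.
    Trimmed(Vec<Cp>),
    /// The tip does not build on what we cached; a full backwards walk is needed.
    CacheMiss,
}

struct TipCache {
    highest_relevant: Cp,
    /// Most recent irrelevant checkpoint known to connect to `highest_relevant`.
    latest_irrelevant: Option<Cp>,
}

impl TipCache {
    fn new(highest_relevant: Cp) -> Self {
        Self { highest_relevant, latest_irrelevant: None }
    }

    /// The checkpoint the next emission is expected to build on.
    fn head(&self) -> Cp {
        self.latest_irrelevant.unwrap_or(self.highest_relevant)
    }

    /// Checks connectivity in a single comparison instead of walking thousands of nodes.
    fn introduce(&mut self, prev: Cp, tip: Cp, relevant: bool) -> Outcome {
        if prev != self.head() {
            return Outcome::CacheMiss;
        }
        if relevant {
            let trimmed = vec![self.highest_relevant, tip];
            self.highest_relevant = tip;
            self.latest_irrelevant = None;
            Outcome::Trimmed(trimmed)
        } else {
            self.latest_irrelevant = Some(tip);
            Outcome::Cached
        }
    }
}
```

Starting from `4:D` and introducing the irrelevant heights 5 to 3999 one by one, the relevant tip at height 4000 then produces the trimmed update `[4:D, 4000:X]` after a single comparison against the cached checkpoint at height 3999.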

### Changes to Bitcoind RPC chain source

We need to emit `CheckPoint`s alongside blocks.


            
