feat(electrum): optimize merkle proof validation with batching by LagginTimes · Pull Request #1957 · bitcoindevkit/bdk

LagginTimes · 2025-05-15T19:05:40Z

Replaces #1908, originally authored by @Keerthi421.
Fixes #1891.

Description

This PR optimizes sync/full_scan performance by batching and caching key RPC calls to slash network round-trips and eliminate redundant work.

Key improvements:

Gather all blockchain.transaction.get_merkle calls into a single batch_call request.
Use batch_script_get_history instead of many individual script_get_history calls.
Use batch_block_header to fetch all needed block headers in one call rather than repeatedly calling block_header.
Introduce a cache of transaction anchors to skip re-validating already confirmed transactions.
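
A minimal sketch of the anchor-cache idea from the last item, not the code in this PR. `Txid`, `BlockHash`, `Anchor` and `AnchorCache` are simplified, hypothetical stand-ins, and keying the cache by (txid, block hash) is an assumption. Only the cache misses would go into the batched Merkle proof request.

```rust
use std::collections::HashMap;

// Simplified stand-ins for `bitcoin::Txid` and `bitcoin::BlockHash` (assumptions).
type Txid = [u8; 32];
type BlockHash = [u8; 32];

/// Hypothetical anchor: the block that confirms a transaction.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Anchor {
    height: u32,
    hash: BlockHash,
}

/// Hypothetical cache keyed by (txid, block hash). If a reorg replaces the
/// block at a given height, the hash differs and the lookup misses instead of
/// returning a stale anchor.
#[derive(Default)]
struct AnchorCache {
    inner: HashMap<(Txid, BlockHash), Anchor>,
}

impl AnchorCache {
    /// Records an anchor once its Merkle proof has been validated.
    fn insert(&mut self, txid: Txid, anchor: Anchor) {
        self.inner.insert((txid, anchor.hash), anchor);
    }

    /// Splits candidates into anchors that are already cached and entries
    /// that still need a Merkle proof from the server.
    fn partition(
        &self,
        candidates: &[(Txid, u32, BlockHash)],
    ) -> (Vec<(Txid, Anchor)>, Vec<(Txid, u32, BlockHash)>) {
        let mut cached = Vec::<(Txid, Anchor)>::new();
        let mut to_validate = Vec::<(Txid, u32, BlockHash)>::new();
        for &(txid, height, hash) in candidates {
            match self.inner.get(&(txid, hash)) {
                Some(anchor) => cached.push((txid, *anchor)),
                None => to_validate.push((txid, height, hash)),
            }
        }
        (cached, to_validate)
    }
}
```

With a warm cache, `to_validate` is empty for already confirmed transactions, so no proof request is needed for them.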

Anchor Caching Performance Improvements

Results suggest a significant speedup with a warmed-up cache. Tested on a local Electrum server with:

$ cargo bench -p bdk_electrum --bench test_sync

Results before this PR (https://github.com/LagginTimes/bdk/tree/1957-master-branch):

sync_with_electrum      time:   [1.3702 s 1.3732 s 1.3852 s]

Results after this PR:

sync_with_electrum      time:   [851.31 ms 853.26 ms 856.23 ms]

Batch Call Performance Improvements

No persisted data was carried over between runs, so each test started with cold caches and measured only raw batching performance. Tested with example_electrum from https://github.com/LagginTimes/bdk/tree/example_electrum_timing with the following parameters:

$ example_electrum init "tr([62f3f3af/86'/1'/0']tpubDD4Kse29e47rSP5paSuNPhWnGMcdEDAuiG42LEd5yaRDN2CFApWiLTAzxQSLS7MpvxrpxvRJBVcjhVPRk7gec4iWfwvLrEhns1LA4h7i3c2/0/*)#cn4sudyq"
$ example_electrum scan tcp://signet-electrumx.wakiyamap.dev:50001

Results before this PR:

FULL_SCAN TIME: 8.145874476s

Results after this PR (using this PR's bdk_electrum_client.rs):

FULL_SCAN TIME: 2.594050112s

Changelog notice

Add transaction anchor cache to prevent redundant network calls.
Batch Merkle proof, script history, and header requests.

Checklists

All Submissions:

I've signed all my commits
I followed the contribution guidelines
I ran cargo fmt and cargo clippy before committing

New Features:

I've added tests for the new feature
I've added docs for the new feature

Bugfixes:

This pull request breaks the existing API
I've added tests to reproduce the issue which are now passing
I'm linking the issue being fixed by this PR

evanlinjin

Thanks for moving this forward.

This is not a full review, but I think it's enough to push this PR in a good direction.

evanlinjin · 2025-05-16T11:22:06Z

+        // Batch validate all collected transactions.
+        if !txs_to_validate.is_empty() {
+            let proofs = self.batch_fetch_merkle_proofs(&txs_to_validate)?;
+            self.batch_validate_merkle_proofs(tx_update, proofs)?;
        }


Instead of having every populate_with_{} method call this internally, it will be more efficient and make more logical sense if we extract this so that we only call it at the end of full_scan and sync.

In other words, populate_with_{} should no longer fetch anchors. Instead, they should either mutate or return a list of (Txid, BlockId) pairs, for which we then try to fetch anchors in a separate step.

It will be even better if full txs are fetched in a separate step too.
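
A rough sketch of the suggested split, under stated assumptions: `MockServer`, `collect_pending` and `fetch_anchors` are hypothetical names, `Txid` is a stand-in integer, and the mock simply echoes requests instead of returning real Merkle proofs. The collection step only records (txid, height) pairs; a single batch call at the end of sync/full_scan resolves them.

```rust
use std::cell::Cell;
use std::collections::BTreeSet;

// Stand-ins for real types (assumptions for illustration only).
type Txid = u64;
type Height = u32;

/// Hypothetical mock server that counts batch round-trips.
struct MockServer {
    round_trips: Cell<usize>,
}

impl MockServer {
    /// Pretends to fetch one Merkle proof per request in a single round-trip.
    fn batch_get_merkle(&self, reqs: &[(Txid, Height)]) -> Vec<(Txid, Height)> {
        self.round_trips.set(self.round_trips.get() + 1);
        reqs.to_vec()
    }
}

/// Collection step: records confirmed txs without touching the network.
fn collect_pending(history: &[(Txid, i32)], pending: &mut BTreeSet<(Txid, Height)>) {
    for &(txid, height) in history {
        // Heights 0 and -1 are reserved for unconfirmed txs.
        if height <= 0 {
            continue;
        }
        // The set deduplicates txs seen through several scripts or outpoints.
        pending.insert((txid, height as Height));
    }
}

/// Final step: one batch request for everything that was collected.
fn fetch_anchors(server: &MockServer, pending: &BTreeSet<(Txid, Height)>) -> Vec<(Txid, Height)> {
    if pending.is_empty() {
        return Vec::new();
    }
    let reqs: Vec<(Txid, Height)> = pending.iter().copied().collect();
    server.batch_get_merkle(&reqs)
}
```

Calling `collect_pending` from several populate steps and `fetch_anchors` once keeps the number of round-trips independent of how many populate steps ran.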

Partially resolved. This is the next TODO: "It will be even better if full txs are fetched in a separate step too."

~~This will likely be included in a separate PR.~~
Fetching all full txs in a batch call at the beginning of sync actually ended up doubling sync time.

@LagginTimes did you figure out why though?

evanlinjin · 2025-05-26T01:39:19Z

@LagginTimes could you provide the benchmark results in the PR description and compare it to results before the changes in this PR?

notmandatory · 2025-05-26T16:24:01Z

Based on the above benchmark results, it looks like this change is 1s faster on sync. Is that due to a small test size? Do we expect it to make more of a difference for wallets with many addresses?

evanlinjin · 2025-06-04T01:26:39Z

@LagginTimes ~~can you provide the code you used to test with a remote electrum server (instead of the testenv)?~~

Edit: how about we just test with the example-cli with a pre-populated signet wallet?

I suggested writing benchmarks with the assumption that local io (against testenv) would be slower than allocating memory (collecting requests into vec before batch requesting), however that assumption seems incorrect.

ValuedMammal · 2025-06-25T13:21:54Z

These are my criterion results after benching this PR. 👍

sync_with_electrum      time:   [37.325 ms 38.220 ms 38.805 ms]                              
                        change: [-85.756% -85.556% -85.308%] (p = 0.00 < 0.05)
                        Performance has improved.

ValuedMammal · 2025-06-25T13:23:15Z

In 838c247:

error: this file contains an unclosed delimiter
   --> crates/electrum/src/bdk_electrum_client.rs:724:3
    |
32  | impl<E: ElectrumApi> BdkElectrumClient<E> {
    |                                           - unclosed delimiter
...
504 |             for &(txid, height, hash) in chunk {
    |                                                - this delimiter might not be properly closed...
...
547 |         }
    |         - ...as it matches this but it has different indentation
...
724 | }

ValuedMammal

ACK 8086e1e

evanlinjin

Here are some suggestions to improve the benchmark.

I've implemented them in this commit: evanlinjin@1ba324c

I also added a benchmark for testing sync without cache. Feel free to cherry-pick this commit if you agree with the changes.

* Actually use different spks
* Do not benchmark applying updates (only fetching/constructing)
* Have two benches: One with cache, one without.
* Remove `black_box`.

oleonardolima

tACK 156cbab

It looks good to me. I ran the example with the same descriptor but a different server (and no TLS). Here are the results:

master @ `63923c63dc5dbd7850ae8fa4f4d1b832170fe957`
FULL_SCAN TIME: 26.977265542s
SYNC TIME: 23.786322458s

merkle_batching @ `156cbab67f4ff91276f9f03749944f4c46210f7f`
FULL_SCAN TIME: 3.85442175s
SYNC TIME: 4.124651292s

oleonardolima · 2025-07-02T21:17:33Z

+        let histories = self
+            .inner
+            .batch_script_get_history(unique_spks.iter().map(|spk| spk.as_script()))?;


question to self: does this batch call guarantee order?

If it doesn't, then it is a major bug in bdk_client since the API implies that the response order correlates with the requests.
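
A small sketch of how a caller might pair batch responses with their requests when relying on positional order. `pair_by_position` is a hypothetical helper, not part of any library discussed here. It can only detect a length mismatch; a reordered response of the same length would go unnoticed, which is why the ordering guarantee matters.

```rust
/// Pairs each request with the response at the same index.
/// Returns an error if the server answered with a different number of items.
fn pair_by_position<Req, Resp>(
    requests: Vec<Req>,
    responses: Vec<Resp>,
) -> Result<Vec<(Req, Resp)>, String> {
    if requests.len() != responses.len() {
        return Err(format!(
            "batch response length mismatch: sent {}, received {}",
            requests.len(),
            responses.len()
        ));
    }
    Ok(requests.into_iter().zip(responses).collect())
}
```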

oleonardolima · 2025-07-02T21:55:13Z

                    // Returned heights 0 & -1 are reserved for unconfirmed txs.
                    Ok(height) if height > 0 => {
-                        self.validate_merkle_for_anchor(tx_update, txid, height)?;
+                        pending_anchors.push((tx.0, height));


nit: use res.tx_hash for consistency with the branch below.

evanlinjin

ACK 156cbab

Styling tips for readability:

I find that it is more readable to initialize a variable with its type rather than relying on it being implied later on. I.e. prefer let mut result = Vec::<(Txid, ConfirmationBlockTime)>::new() over let mut result = Vec::new(). The latter requires the reader to use an LSP or scan ahead first to figure out the type.
Avoid overly-nested logic. Sometimes it is better to use continue instead of nesting the logic in an if clause.
debug_assert!s help us to catch bugs. Don't be afraid to add them where panic/expect would not be appropriate.
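
The three tips could look roughly like this in practice. This is an illustrative sketch only: `ConfirmedAt` is a simplified, hypothetical stand-in for `ConfirmationBlockTime`, `Txid` is a plain integer, and the input tuple layout (txid, height, time) is assumed.

```rust
// Stand-in types (assumptions for illustration only).
type Txid = u64;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ConfirmedAt {
    height: u32,
    time: u64,
}

/// Keeps only confirmed entries from a history of (txid, height, time).
fn collect_confirmed(history: &[(Txid, i32, u64)]) -> Vec<(Txid, ConfirmedAt)> {
    // Tip 1: state the element type at initialization.
    let mut result = Vec::<(Txid, ConfirmedAt)>::new();
    for &(txid, height, time) in history {
        // Tip 2: `continue` early instead of nesting the main logic in an `if`.
        // Heights 0 and -1 are reserved for unconfirmed txs.
        if height <= 0 {
            continue;
        }
        // Tip 3: check an invariant in debug builds without panicking in release.
        debug_assert!(time > 0, "confirmed tx should have a block time");
        result.push((txid, ConfirmedAt { height: height as u32, time }));
    }
    result
}
```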

LagginTimes requested a review from evanlinjin May 15, 2025 19:06

LagginTimes self-assigned this May 15, 2025

evanlinjin requested changes May 16, 2025

LagginTimes marked this pull request as draft May 20, 2025 18:06

LagginTimes force-pushed the merkle_batching branch 2 times, most recently from d69907b to 149807c Compare May 21, 2025 18:42

evanlinjin reviewed May 23, 2025

notmandatory added the module-blockchain label May 23, 2025

notmandatory added this to BDK Chain May 23, 2025

notmandatory moved this to In Progress in BDK Chain May 23, 2025

LagginTimes force-pushed the merkle_batching branch 2 times, most recently from dc08959 to bf38a8e Compare May 25, 2025 18:30

LagginTimes marked this pull request as ready for review May 25, 2025 18:48

LagginTimes requested review from ValuedMammal and evanlinjin May 25, 2025 18:48

evanlinjin added this to the Wallet 2.0.0 milestone May 26, 2025

notmandatory modified the milestones: Wallet 2.0.0, Wallet 2.1.0 May 26, 2025

LagginTimes marked this pull request as draft May 26, 2025 23:18

LagginTimes force-pushed the merkle_batching branch from de14241 to bb26525 Compare May 27, 2025 18:16

LagginTimes force-pushed the merkle_batching branch 2 times, most recently from 90a0018 to 591b51a Compare June 8, 2025 17:32

jp1ac4 mentioned this pull request Jun 9, 2025

Bump bdk dependencies to latest wizardsardine/liana#1742

LagginTimes marked this pull request as ready for review June 10, 2025 09:30

LagginTimes force-pushed the merkle_batching branch from 591b51a to 70495e2 Compare June 10, 2025 10:21

ValuedMammal reviewed Jun 10, 2025

ValuedMammal reviewed Jun 10, 2025
