Use the new script_sync logic for RPC backend #466
rajarshimaitra wants to merge 19 commits into bitcoindevkit:master
Blockchain calls sync logic rather than the other way around. Sync logic is captured in script_sync.rs.
Use BTrees to store ordered sets rather than HashSets -> VecDeque
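As a rough illustration of the ordered-set idea in this commit message (not the code from script_sync.rs), a `BTreeSet` both deduplicates and iterates in ascending order, so no separate queue is needed. The integer keys and the `ordered_requests` helper are hypothetical stand-ins.

```rust
use std::collections::BTreeSet;

/// Collects pending request keys into an ordered set.
/// Duplicates are dropped and iteration is always in ascending order.
/// (`ordered_requests` is an illustrative helper, not part of bdk.)
fn ordered_requests(keys: &[u32]) -> Vec<u32> {
    let set: BTreeSet<u32> = keys.iter().copied().collect();
    set.into_iter().collect()
}
```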
This updates the rpc backend sync logic to use the script_sync pattern added in bitcoindevkit#461. A new stop_gap parameter is added to the rpc blockchain config, which determines when the sync loop terminates.
bitcoind is run with the txindex flag in blockchain tests to support get_raw_transaction rpc calls.
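A minimal sketch of how a stop gap can terminate a sync loop, assuming the common reading "stop after `stop_gap` consecutive scripts without history". The exact rule used by script_sync may differ, and `scripts_scanned` is a hypothetical helper.

```rust
/// Returns how many scripts get scanned before the loop stops:
/// scanning ends after `stop_gap` consecutive scripts without history.
/// `has_history[i]` is a stand-in for "script i has at least one tx".
fn scripts_scanned(has_history: &[bool], stop_gap: usize) -> usize {
    let mut unused_run = 0;
    for (index, used) in has_history.iter().enumerate() {
        if *used {
            // A used script resets the run of unused scripts.
            unused_run = 0;
        } else {
            unused_run += 1;
            if unused_run >= stop_gap {
                return index + 1;
            }
        }
    }
    has_history.len()
}
```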
I think this is an issue. At the moment the rpc client doesn't need the bitcoin node to have txindex enabled. This means a received tx may have an unknown fee because not all the prevouts are known, but I think that's an acceptable tradeoff. @LLFourn Is it possible to "partially satisfy"
    .iter()
    .filter(|tx_result| {
        // Filter out txs related to the script_pubkey
        if let Some(address) = &tx_result.detail.address {
I don't think we can rely on this to filter txs. What happens, for example, if we have a tx with 2 owned outputs?
I am not sure what's the best way to handle this, since at this stage we don't have the raw tx, but I think we should operate on the raw tx to check for the presence of the script_pubkey.
I think in that case it will just match both script_pubkeys. listtransactions() gives an entry for each owned address, so the satisfier will have the same txid for both scripts. In fact this is already happening in some of the tests, and I guess the underlying db update logic already handles that.
That said, I do think the current filter logic is inadequate, and the tests are failing because of it. See #466 (comment)
> listtransactions() will give an entry corresponding to each owned address.

I didn't know about this, so it should be OK then.
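The behaviour discussed above can be sketched with a simplified model. `ListEntry` and `txids_by_address` are hypothetical stand-ins, not the bitcoincore-rpc types: because the wallet reports one entry per owned address, a tx with two owned outputs ends up under both addresses.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Simplified stand-in for one `listtransactions` entry (hypothetical type).
struct ListEntry {
    txid: &'static str,
    address: Option<&'static str>,
}

/// Groups txids by owned address. A tx with two owned outputs shows up
/// under both addresses because it has one entry per address.
fn txids_by_address(
    entries: &[ListEntry],
    owned: &BTreeSet<&'static str>,
) -> BTreeMap<&'static str, BTreeSet<&'static str>> {
    let mut map: BTreeMap<&'static str, BTreeSet<&'static str>> = BTreeMap::new();
    for entry in entries {
        if let Some(addr) = entry.address {
            // Entries for foreign addresses (or without one) are skipped.
            if owned.contains(addr) {
                map.entry(addr).or_default().insert(entry.txid);
            }
        }
    }
    map
}
```

Note that in this model an entry whose only address is foreign is skipped entirely, which is the gap raised later in the thread for spends without change.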
            related_txs
        })
        .collect::<Vec<_>>();
I launched `cargo test --features test-rpc -- test_sync_bump_fee_remove_change -- --nocapture` with a `dbg!(&satisfier)` here and noticed the bumped tx is not filtered out (causing the test to fail).
Yes. The reason it's happening is that, for a tx without a change address, listtransactions() will only have an entry with the recipient's address. That is an out-of-wallet address in those two failing tests, so the tx is not picked up by the current simple filter.
I think this has to be fixed first anyway. My current plan is to take the difference between the listtransactions() and related_txs sets, and fetch the prev_outs of the txs in that difference. Then match the addresses in their prev_outs and add them to the appropriate list in the satisfier.
Hopefully that can be done with get_transaction(), because the sought-after inputs should always be "owned" utxos, so their txs should be found. This also avoids too many get_transaction() calls: we only fetch for txs with "owned inputs + not owned outputs". "Owned output" txs will already have been filtered in the first address match.
I hope this will work and at least fix the test failures. Also looking for more thoughts or suggestions.
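The plan above can be sketched roughly as follows, with plain strings standing in for txids and addresses. `assign_unmatched` and the `prevout_addresses` closure are hypothetical; the closure stands in for a wallet lookup such as get_transaction() and is not a real RPC binding.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Assigns txs that the address filter missed (e.g. a spend without change)
/// to the owned addresses found among their previous outputs.
/// `prevout_addresses` is a stand-in for the costly prev_out lookup.
fn assign_unmatched<F>(
    all_txids: &BTreeSet<String>,
    related: &BTreeSet<String>,
    owned: &BTreeSet<String>,
    prevout_addresses: F,
) -> BTreeMap<String, BTreeSet<String>>
where
    F: Fn(&str) -> Vec<String>,
{
    let mut by_address: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    // Only txs in the set difference need the costly prev_out fetch.
    for txid in all_txids.difference(related) {
        for addr in prevout_addresses(txid.as_str()) {
            if owned.contains(&addr) {
                by_address.entry(addr).or_default().insert(txid.clone());
            }
        }
    }
    by_address
}
```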
LLFourn left a comment
@RCasatta the TxOuts are used to calculate the fee and also how much the wallet sent, by checking whether we own the script_pubkey. I'm not sure what the impact of this will be, or how accurate fee and sent were before on the rpc backend.
@rajarshimaitra I can't concept ACK this since I'm not really experienced with the existing rpc sync logic. Can you elaborate on what the disadvantages of the previous approach were and how this improves on it? In my mind bitcoind keeps track of which transactions are associated with the script pubkeys owned by the wallet. Filtering through all the transactions again and again, looking for things associated with a script, duplicates the effort that bitcoind has already done.
It's possible you could do something similar to script_sync but without the ScriptReq part. Instead you start with a DescriptorReq, which is satisfied by getting all the txids associated with that descriptor, and then it moves on to the TxReq stage. That seems more natural and would remove the strange stop_gap where there shouldn't be one (if I understand correctly).
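One way to picture the staged flow suggested here is a small state machine. This is only a sketch: the `Request` enum, its variants, and `satisfy_descriptor` are hypothetical names chosen for illustration and are not the types in script_sync.rs.

```rust
/// Hypothetical request stages for an RPC sync, loosely modelled on the
/// staged requests of script_sync but without a per-script stage.
enum Request {
    /// Ask the backend for every txid tied to the wallet descriptor.
    Descriptor,
    /// Ask the backend for the full data of these txids.
    Tx(Vec<String>),
    /// Nothing left to ask; the sync is complete.
    Finish,
}

impl Request {
    /// Advances from the descriptor stage given the txids the backend returned.
    /// Any other stage is passed through unchanged.
    fn satisfy_descriptor(self, txids: Vec<String>) -> Request {
        match self {
            Request::Descriptor if !txids.is_empty() => Request::Tx(txids),
            Request::Descriptor => Request::Finish,
            other => other,
        }
    }
}
```

With this shape there is no per-script loop, so no stop_gap is needed to decide when to stop asking.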
    .map(|txin| {
        // check if the prev_out is in db
        if let Some(txout) =
            db.get_previous_output(&txin.previous_output)?
I think blockchain logic should not do this. Rather, when you pass None for the Option<Vec<TxOut>> while satisfying the request, the TxReq can look it up in the database to see if it can salvage it.
Ah, that's pretty neat. But what if the prev_out is not in the db? Will that cause any problems? This filter is here because, to satisfy the TxReq, I need to know whether I have to fetch the prev_out, which is costly, so I don't want to do that for all the inputs. Thus the filter.
As @RCasatta mentioned here #466 (comment), if it's possible to partially satisfy the TxReq without all the prev_outs, then the txindex requirement problem will also be solved.
You're right, my thinking was just wrong.
Thanks @LLFourn for the look.
Currently the sync logic is fetching the tx list ( I did this experiment just to see whether a generic db update logic like
But it's entirely possible that trying to do the syncs in a self-similar structure is actually less optimized. And I tried this out just to find that out. 😄
That's interesting. Though I don't want to build another custom sync logic just for RPC because that will break all those "unifying" points above.
I probably didn't get this one. As far as I understand the
As you can see from the optionality of the field fee in (Now that I read the comment on the fee field, I think the "offline" part is wrong. An always-online bitcoin full node without txindex doesn't know the fee of a received wallet transaction because it doesn't know its previous outputs)
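To make the point about the optional fee concrete, here is a minimal sketch of a fee calculation that yields `None` as soon as one previous output value is unknown. The `fee` helper and its value-only inputs are illustrative assumptions, not bdk's actual API.

```rust
/// Computes the fee of a transaction in satoshis, or `None` when any
/// previous output value is unknown (e.g. a received tx on a node
/// without txindex).
fn fee(prevout_values: &[Option<u64>], output_values: &[u64]) -> Option<u64> {
    // Summing `Option<u64>` items gives `None` if any single item is `None`.
    let total_in: u64 = prevout_values.iter().copied().sum::<Option<u64>>()?;
    let total_out: u64 = output_values.iter().sum();
    // `checked_sub` also returns `None` if outputs exceed inputs.
    total_in.checked_sub(total_out)
}
```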
Thanks for the detailed explanation @rajarshimaitra.
I don't think it'd have to be that bad. Keeping the decoupled structure while re-using most of the logic would be an improvement and achieve the properties you were interested in.
Right, this is my point. The fact that unnecessary parameters are showing up indicates to me that this solution doesn't quite fit. I think you can come up with a more precise solution by re-using most of the logic but removing the
The stop_gap parameter has no logical place in the RPC sync logic; it's just a config variable for `start_sync()`. Removed from the RPC blockchain config.
Script request handling has been refactored from looping primarily over the requested scripts to looping primarily over the listtransactions() result. This lets us better decide when to make the costly input fetch, namely when the transaction doesn't have an owned output. An additional cache stores the fetched inputs, which are then reused in the Tx request handling, reducing rpc get_transaction() calls. This fixes the current test failure of RBF replacement without change.
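The refactor described in this commit message might look roughly like the sketch below. It is a simplified model, not the PR's code: `ListEntry`, `collect_related`, and the `fetch_input_addresses` closure (a stand-in for the get_transaction() calls) are all hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Simplified stand-in for one `listtransactions` entry (hypothetical type).
struct ListEntry {
    txid: String,
    address: Option<String>,
}

/// Walks the `listtransactions` result once. Entries paying an owned address
/// are matched directly; the rest trigger one input fetch per txid, and the
/// fetched inputs are cached for later reuse.
fn collect_related<F>(
    entries: &[ListEntry],
    owned: &BTreeSet<String>,
    input_cache: &mut BTreeMap<String, Vec<String>>,
    mut fetch_input_addresses: F,
) -> BTreeSet<String>
where
    F: FnMut(&str) -> Vec<String>,
{
    let mut related = BTreeSet::new();
    for entry in entries {
        if related.contains(&entry.txid) {
            continue; // already matched through an earlier entry
        }
        let owned_output = entry
            .address
            .as_ref()
            .map_or(false, |addr| owned.contains(addr));
        if owned_output {
            related.insert(entry.txid.clone());
            continue;
        }
        // No owned output: fetch the inputs, but only once per txid.
        let inputs = input_cache
            .entry(entry.txid.clone())
            .or_insert_with(|| fetch_input_addresses(entry.txid.as_str()));
        if inputs.iter().any(|addr| owned.contains(addr)) {
            related.insert(entry.txid.clone());
        }
    }
    related
}
```

The cache is passed in by the caller so that a later stage, such as the Tx request handling mentioned above, can read the same fetched inputs instead of asking the node again.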
Made a few refactoring changes. Instead of trying to create something novel, I tried one last time to work with the existing script requests.
I think this has at least reached the point I had as my goal at the start. We can use the existing So I think I will drag it out of draft now. There might be more potential improvements, but this is now ready for review.
Some good news: with bitcoin/bitcoin#23319, input fetching can be done a lot faster.
Now that #501 is in you'll need to rebase on
Documenting a few Bitcoin Core wallet PRs that we will benefit from in RPC sync:
Damn, I wish I saw this PR before working on the
Hey, we are in the process of releasing BDK 1.0, which will under the hood work quite differently from the current BDK. For this reason, I'm closing all the PRs that don't really apply anymore. If you think this is a mistake, feel free to rebase on master and re-open!
Description
Recently in #461 a new syncing framework was introduced for `esplora` and `electrum` backends. Among many other optimization improvements, `script_sync` can also provide easy extension into other kinds of backend. All the database management is handled by `script_sync`, and the blockchain backend just has to satisfy `sync requests` with appropriate `satisfiers`.

In this PR I attempted to extend the same framework to the RPC backend, which I hope will allow us to optimize the sync logic in RPC even further.
Notes to the reviewers
This is a WIP PR as there are 2 blockchain test failures which I am still investigating. Opening this up for early reviews and comments. I am sure there is still much room for improvement. Also, any eyes on debugging the test failures would be very helpful.
Test error log
This PR builds on #461, so only the last 2 commits need to be reviewed.
A new optional `stop_gap` field is added in `RpcBlockchain`, which determines the sync loop termination. The default value is 20.
In `blockchain_tests::TestClient`, the `bitcoind` instance is created with `txindex=1`, which is required to call `get_raw_transactions()` in core RPC. I would like to do it without txindex as before, but not all previous outputs are found by `get_transaction()` and some tests were failing due to this.

Checklists
All Submissions:
`cargo fmt` and `cargo clippy` before committing