[wallet] Add branch and bound coin selection#148
[wallet] Add branch and bound coin selection#148afilini merged 6 commits intobitcoindevkit:masterfrom
Conversation
murchandamus
left a comment
There was a problem hiding this comment.
This is coming along nicely! I think the tests with two UTXO are a bit too easy to actually uncover some of the issues around Branch and Bound. I would like to suggest that you increase the available UTXO at least into the range of 10 to see some interesting behaviors. Also, since the target window is feerate dependent, you might want to add a couple more tests with hier feerates.
Personally, I would prefer if a PR this large was composed of multiple commits. A few spots could use a bit more explanation, I think it may be hard to follow along in a few spots if one isn't as familiar with the intended outcome as myself.
|
|
||
| // Calculated as: | ||
| // prev_txid (32 bytes) + prev_vout (4 bytes) + script_len (1 bytes) + script len (0 bytes) + sequence (4 bytes) | ||
| const TXIN_DEFAULT_WEIGHT: usize = 32 + 4 + 1 + 4; |
There was a problem hiding this comment.
From what I read this only supports native segwit inputs. Is that the intention?
There was a problem hiding this comment.
Not exactly. This should be the weight needed to build the txin without the satisfaction weight, i.e., should be standard for every input. One thing this comment made me notice is that I shouldn't add script_len here, as that's part of the satisfaction_weight (and it's currently counted in every other CS implemented, I'll fix this in a separate commit) :)
| while let Some(false) = current_selection.last() { | ||
| current_selection.pop(); | ||
| curr_available_value += | ||
| may_use_utxos[current_selection.len()].effective_value as u64; |
There was a problem hiding this comment.
Is the current_selection.len() right here? If you are exploring a branch that has omitted multiple UTXO, I don't think the selection's length needs to match the position in the available UTXO anymore.
There was a problem hiding this comment.
current_selection is a vector of booleans, and current_selection[i] contains true if we're using may_use_utxos[i], false otherwise. (This is not very idiomatic Rust, I'm currently looking at a way to refactor it).
So current_selection.len() is always <= than may_use_utxos.len(), but current_selection[i] matches may_use_utxos[i] for i in 0..current_selection.len()
For example:
current_selection: [true, false, false]
may_use_utxos (only the effective_values): [1, 2, 3, 4, 5]
if you current_selection.pop() you are removing current_selection[2],
and current_selection.len() becomes 2;
at this point you have to add may_use_utxos[2]
| assert_eq!(result.txin.len(), 2); | ||
| assert_eq!(result.selected_amount, 300_000); | ||
| assert_eq!(result.fee_amount, 186.0); |
There was a problem hiding this comment.
The BnB algorithm never allows change to be created. In the case that no changeless solution is found, it is supposed to fallback to another selection method.
Since the two available inputs are 100,000 and 200,000 satoshis, this should result in a failure for BnB. Since 50,000 sats is much larger than the fee for two inputs at 1 sat/vB and the permitted excess.
There was a problem hiding this comment.
Yeah at the moment what's happening is that BnB is giving back TotalTriesExceeded (even if this is not exactly true: BnB tried everything and didn't find a suitable solution, it's not that there weren't enough tries!), single random draw is correctly giving back 300_000.
I have updated the tests so they test the functions bnb and single_random_draw (instead of testing only coin_select)
| assert_eq!(result.txin.len(), 2); | ||
| assert_eq!(result.selected_amount, 300_000); | ||
| assert_eq!(result.fee_amount, 186.0); |
| fn test_bnb_exact_match() { | ||
| let seed = [0; 32]; | ||
| let mut rng: StdRng = SeedableRng::from_seed(seed); | ||
| let database = MemoryDatabase::default(); | ||
|
|
||
| for _i in 0..200 { | ||
| let mut may_use_utxos = generate_random_utxos(&mut rng, 16); | ||
| let target_amount = sum_random_utxos(&mut rng, &mut may_use_utxos); | ||
| let result = BranchAndBoundCoinSelection::new(0) | ||
| .coin_select( | ||
| &database, | ||
| vec![], | ||
| may_use_utxos, | ||
| FeeRate::from_sat_per_vb(0.0), | ||
| target_amount, | ||
| 0.0, | ||
| ) | ||
| .unwrap(); | ||
| assert_eq!(result.selected_amount, target_amount); | ||
| } | ||
| } |
There was a problem hiding this comment.
This is a good sanity check. I would like to propose another. Generate 200 random UTXOs and make the target the sum of the effective value of two of them. You should only get a results with one or two imputs from that. I think that might uncover at least one bug in the above code (I think there is an issue in the backtrack). :)
Alternatively, I would like to suggest a test with a known set of available UTXOs where it e.g. uses the first and fifth largest UTXO of the set to construct the target.
There was a problem hiding this comment.
I wrote the first test you suggested and it's unsurprisingly failing (TotalTriesExceeded), but I don't get why it should pass. 200 UTXOs are really a lot (with values ranging from 0 sats to 2BTC), how could the bnb find the exact match with only 100_000 tries?
There was a problem hiding this comment.
What is the complexity of the BnB approach? O(n^2)? (If so, it should definitely pass :))
There was a problem hiding this comment.
The whole search space of BnB is O(2^n), but the bounding essentially makes it go to linear complexity for some parts. Finding an exact match with two inputs can be done in less than n^2 tries though. I think we will need a few more bounding mechanisms. :)
009349b to
6151657
Compare
|
I've updated the code, should be little bit better now. As I was saying in the comment, the test with 200 UTXOs is currently failing (throws TotalTriesExceeded). I'll dig into it :) |
5711f08 to
3702fdc
Compare
|
I updated the PR so it's not a draft anymore. |
|
Okay, I'll review it again. :) |
84d883d to
4d7257d
Compare
afilini
left a comment
There was a problem hiding this comment.
Even though there are a few things that could probably be expressed in a more "rust-like" style, I would go ahead and merge this so that we can prepare a tag for the next beta release, and then work on more improvements in separate PRs.
Do you have any more comments @xekyo before I go ahead and merge this?
|
I've made a smaller PR here that will conflict a bit with this. All in all it looks easier to merge that one first and then this one. I don't mind updating this PR if that works. Not a big issue either way. As a side note I think it would be good if the B&B coin selection algorithm go in its own file in a coin_selection directory. Presumably there could be more than one coin selection algorithm going forward. |
|
I'm cool with merging #159 before this one :) |
4d7257d to
84a8988
Compare
|
@danielabrozzoni, @afilini it looks like there are some complications to that PR that requires some further thinking so I'd say go for this one :) |
84a8988 to
5ebda55
Compare
Replace all the occurences of `serialize(&txin)` with TXIN_DEFAULT_WEIGHT.
5ebda55 to
9f31ad1
Compare
| struct OutputGroup { | ||
| utxo: UTXO, | ||
| // weight needed to satisfy the UTXO, as described in `Descriptor::max_satisfaction_weight` | ||
| satisfaction_weight: usize, | ||
| // Amount of fees for spending a certain utxo, calculated using a certain FeeRate | ||
| fee: f32, | ||
| // The effective value of the UTXO, i.e., the utxo value minus the fee for spending it | ||
| effective_value: i64, | ||
| } |
There was a problem hiding this comment.
Just a bit background info: OutputGroups in Bitcoin Core collect all unspents associated with one address. They are used to prevent spending only one UTXO if there are multiple associated with the same address, and to rather only spend all or none of them, so that there is no additional privacy leak.
| .collect(); | ||
|
|
||
| // Mapping every (UTXO, usize) to an output group. | ||
| // Filtering UTXOs with an effective_value < 0, as the fee paid for |
There was a problem hiding this comment.
Nit: Strictly speaking, I think you're filtering effective_value <= 0.
| let mut rng: StdRng = SeedableRng::from_seed(seed); | ||
| let fee_rate = FeeRate::from_sat_per_vb(0.0); | ||
|
|
||
| for _ in 0..200 { |
There was a problem hiding this comment.
Am I understanding correctly, that you're performing this test 200 times? Would that perhaps be the reason why it timed out with a larger number of UTXOs? If I understand that correctly, have you tried running it with ~500 UTXOs just once?
|
One optimization that I had added was to skip a UTXO's inclusion branch when it exactly matches the one that was just backtracked. E.g. A = 100 Would then have the exploration: { [A], [AB], [ABC], [ABCD], …, [ABC_…], [AB__…], A__D…]. I.e. once B gets omitted, C gets skipped, because the prefix [AB…] and [AC…] would be equivalent. The observation was that people often send round values and this was an easy way to skip a bunch of equivalent solutions. |
Fixes #48