AddressIndex improvements: LastUnused, FirstUnused, and get_batch_unused_addresses()#546
nickfarrow wants to merge 5 commits into bitcoindevkit:master
Force-pushed from 7b36e47 to 5daf23a.
|
approach ACK. Needs rebase. |
Force-pushed from 1f4da05 to 8bf61de.
|
Code changes look great, but you'll need to do another rebase. Can you also add a signing key to GitHub and sign your commits when you rebase? |
Force-pushed from f8875d8 to 05aeb23.
|
force push updated CI |
rajarshimaitra
left a comment
tACK 05aeb23
Below are some suggested modifications.
    // use the first address
    crate::populate_test_db!(
        wallet.database.borrow_mut(),
        testutils! (@tx ( (@external descriptors, 0) => 25_000 ) (@confirmations 1)),
        Some(100),
    );

    assert_eq!(
        wallet.get_address(FirstUnused).unwrap().to_string(),
        "tb1q4er7kxx6sssz3q7qp7zsqsdx4erceahhax77d7"
    );
I would like to see a test here where we derive multiple addresses, use some of them, and get back a previously unused one when calling again. That would correctly test the intended behavior. Right now it's just testing the vanilla situation.
Hmm, I actually don't know a better test to write than this one. With the batch unused you can write a more complicated test, but with FirstUnused there's not much you can do.
Something like: derive a bunch of addresses, use only some of them selectively so that address gaps are simulated, then check that the first unused is returned correctly. Am I missing some detail why that can't work?
It can work, but I don't get why the gaps would affect the algorithm that finds the first unused. I don't think this would find a problem with the algorithm that the current test wouldn't find.
It's not that the gaps would affect the algorithm, but to confirm that the behavior we intend here is actually happening. This can be checked in a single test for both first and last unused. Once the behavior is pinned, we can decide later which one to use when, or whether to keep both.
LLFourn
left a comment
Thanks @nickfarrow. Tests LGTM. See comments.
|
Hi, please rebase to pickup changes in #596. Thanks! |
Signed-off-by: nickfarrow <nick@nickfarrow.com>
Signed-off-by: nickfarrow <nick@nickfarrow.com>
Signed-off-by: nickfarrow <nick@nickfarrow.com>
* get_batch_unused_addresses loops through address indexes `from_front = true` (for firstUnused) or `false` (for lastUnused).
* Relies on database having up to date script_pubkeys in such a manner that script_pks.len() == self.fetch_index(keychain)
* 1 script pubkey per address index?
* Must work with current_address_index = 0
Signed-off-by: nickfarrow <nick@nickfarrow.com>
    .list_transactions(true)?
    .iter()
    // Return whether this address has been used in a transaction
    fn has_address_been_used(&self, script_pk: &Script) -> bool {
This actually checks a Script, not an address.
    let check_indexes = if from_front {
        (0..=current_address_index).collect::<Vec<_>>()
    } else {
        (0..=current_address_index).rev().collect::<Vec<_>>()
Is there a better way to do this?
Return an impl DoubleEndedIterator from the method instead.
I am still not comfortable with the current state and approach of the PR. Some behaviours are not maintained: for example, get_batch(n, false, keychain) returns the list of addresses in reverse order, not in ascending order of index.
Also, wouldn't it be better to mark used addresses directly in the database? Knowing which addresses are used and saving that data seems more useful to me than figuring it out by matching against the entire transaction list every time we ask for an unused address.
This would also simplify the LastUnused and FirstUnused fetching logic.
    /// Return vector of n unused addresses from the [`KeychainKind`].
    /// If less than n unused addresses are returned, the rest will be populated by new addresses.
    /// The unused addresses returned are in order of oldest in keychain first, with increasing index.
This does not match the implementation right now: if from_front is false, the addresses are returned in reverse order.
    .list_transactions(true)?
    .iter()
    // Return whether this address has been used in a transaction
    fn has_address_been_used(&self, script_pk: &Script) -> bool {
This would be better named is_scriptpubkey_used.
    pub fn get_batch_unused_addresses(
        &self,
        n: usize,
        from_front: bool,
        keychain: KeychainKind,
    ) -> Result<Vec<AddressInfo>, Error> {
I am not comfortable with this API. get_batch_unused_addresses should not be concerned with front or back. That's an implementation detail of LastUnused or FirstUnused and should not be exposed in the public API.
This also breaks the doc comment above: the order is not maintained anymore.
It would be better to handle the first-or-last logic in the respective functions than in the batch function, which is more generic.
The correct thing is to return an impl DoubleEndedIterator over unused addresses, I think.
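A minimal sketch of that suggestion, not BDK's actual API: the names `unused_indexes`, `first_unused` and `last_unused`, and the `used` set that stands in for the wallet database, are all hypothetical. Returning `impl DoubleEndedIterator` keeps ascending order as the only order the method knows about and leaves the front-or-back choice to the caller, so no `from_front` flag is needed.

```rust
use std::collections::HashSet;

/// Hypothetical helper: yields the unused derivation indexes in `0..=current_index`
/// in ascending order. `used` stands in for whatever lookup the wallet database offers.
fn unused_indexes<'a>(
    current_index: u32,
    used: &'a HashSet<u32>,
) -> impl DoubleEndedIterator<Item = u32> + 'a {
    (0..=current_index).filter(move |i| !used.contains(i))
}

/// `FirstUnused` takes from the front of the iterator.
fn first_unused(current_index: u32, used: &HashSet<u32>) -> Option<u32> {
    unused_indexes(current_index, used).next()
}

/// `LastUnused` takes from the back of the same iterator.
fn last_unused(current_index: u32, used: &HashSet<u32>) -> Option<u32> {
    unused_indexes(current_index, used).next_back()
}
```

A caller that wants a batch can apply `.take(n)` to the same iterator and still receive indexes in ascending order.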
    for i in check_indexes {
        // if we have made a pubkey at this index, check whether the address has been used.
        if i < script_pubkeys.len() {
            let script_pk = &script_pubkeys[i];
            if self.has_address_been_used(script_pk) {
                continue;
            }
        }
        if let Ok(unused_address) = self
            .get_descriptor_for_keychain(keychain)
            .as_derived(i as u32, &self.secp)
            .address(self.network)
            .map(|address| AddressInfo {
                address,
                index: i as u32,
                keychain,
            })
            .map_err(|_| Error::ScriptDoesntHaveAddressForm)
        {
            unused_addresses.push(unused_address);
        }

        if unused_addresses.len() >= n {
            break;
        }
    }
Try using iterator chains (iter, filter, map and so on) instead of the loop. Much of this code can be simplified.
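One possible shape of that simplification, written against stand-in types so that it runs without BDK. `AddressInfo` here is a reduced copy with a string address, and `is_used` and `derive` are hypothetical closures that replace the database lookup and the descriptor derivation. Two things differ from the loop above and are choices of this sketch only: derivation errors are returned instead of being silently skipped, and nothing is topped up with new addresses when fewer than `n` unused ones exist.

```rust
/// Reduced stand-in for BDK's `AddressInfo` (the real one holds an `Address` and a `KeychainKind`).
#[derive(Debug, Clone, PartialEq)]
struct AddressInfo {
    index: u32,
    address: String,
}

/// Collect up to `n` unused addresses in ascending index order.
/// `is_used` and `derive` are hypothetical hooks for the database and the descriptor.
fn batch_unused<U, D, E>(
    current_index: u32,
    n: usize,
    is_used: U,
    derive: D,
) -> Result<Vec<AddressInfo>, E>
where
    U: Fn(u32) -> bool,
    D: Fn(u32) -> Result<String, E>,
{
    (0..=current_index)
        .filter(|&i| !is_used(i)) // skip indexes already seen in a transaction
        .take(n) // stop as soon as `n` candidates are found
        .map(|i| derive(i).map(|address| AddressInfo { index: i, address }))
        .collect() // short-circuits on the first derivation error
}
```

Because the chain is lazy, `take(n)` gives the same early exit as the `break` in the loop.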
    assert_eq!(
        wallet
            .get_batch_unused_addresses(3, true, KeychainKind::External)
            .unwrap(),
        vec![
            AddressInfo {
                index: 0,
                address: Address::from_str("tb1q6yn66vajcctph75pvylgkksgpp6nq04ppwct9a")
                    .unwrap(),
                keychain: KeychainKind::External,
            },
            AddressInfo {
                index: 2,
                address: Address::from_str("tb1qzntf2mqex4ehwkjlfdyy3ewdlk08qkvkvrz7x2")
                    .unwrap(),
                keychain: KeychainKind::External,
            },
            AddressInfo {
                index: 3,
                address: Address::from_str("tb1q32a23q6u3yy89l8svrt80a54h06qvn7gnuvsen")
                    .unwrap(),
                keychain: KeychainKind::External,
            }
        ]
    );
}
We also need to assert that FirstUnused and LastUnused work as intended:
- Get 5 new addresses
- Use only index 1 and 3
- get_batch(3) should return index 0, 2, 4. current index should still be at 4.
- get_first_unused() should return 0
- get_last_unused() should return 4
- get_batch(4) should return 0,2,4,5, and current index should be at 5.
Adding a check of the derivation index here would be good.
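The scenario listed above can be sketched against a toy model before wiring it into the real wallet tests. `ToyKeychain` and its methods are hypothetical and are not BDK APIs. The sketch assumes that indexes 1 and 3 are the used ones, since that is what the expected results (0, 2 and 4 unused) imply. It also checks the derivation index after each batch call.

```rust
use std::collections::BTreeSet;

/// Toy stand-in for a wallet keychain: it only tracks the derivation index and the used indexes.
#[derive(Default)]
struct ToyKeychain {
    /// Highest index derived so far, `None` before the first address.
    current_index: Option<u32>,
    used: BTreeSet<u32>,
}

impl ToyKeychain {
    /// Derive a fresh address and return its index.
    fn new_address(&mut self) -> u32 {
        let next = self.current_index.map_or(0, |i| i + 1);
        self.current_index = Some(next);
        next
    }

    /// Unused indexes among those already derived, in ascending order.
    fn unused(&self) -> Vec<u32> {
        match self.current_index {
            Some(cur) => (0..=cur).filter(|i| !self.used.contains(i)).collect(),
            None => Vec::new(),
        }
    }

    /// Up to `n` unused indexes, topped up with new addresses when too few exist.
    fn batch_unused(&mut self, n: usize) -> Vec<u32> {
        let mut batch: Vec<u32> = self.unused().into_iter().take(n).collect();
        while batch.len() < n {
            batch.push(self.new_address());
        }
        batch
    }
}

/// The scenario from the review, with indexes 1 and 3 marked as used.
fn review_scenario() -> ToyKeychain {
    let mut kc = ToyKeychain::default();
    for _ in 0..5 {
        kc.new_address(); // derives indexes 0..=4
    }
    kc.used.extend([1, 3]);
    assert_eq!(kc.batch_unused(3), vec![0, 2, 4]);
    assert_eq!(kc.current_index, Some(4)); // no new address was needed
    assert_eq!(kc.unused().first(), Some(&0)); // FirstUnused
    assert_eq!(kc.unused().last(), Some(&4)); // LastUnused
    assert_eq!(kc.batch_unused(4), vec![0, 2, 4, 5]);
    assert_eq!(kc.current_index, Some(5)); // exactly one new address was derived
    kc
}
```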
    // if we have made a pubkey at this index, check whether the address has been used.
    if i < script_pubkeys.len() {
        let script_pk = &script_pubkeys[i];
        if self.has_address_been_used(script_pk) {
Here we iterate over the entire transaction list for each spk. For wallets with many transactions this can cause massive overhead.
A better way would be to handle a Vec<Script> in the has_address_been_used function: call list_transactions only once, select all the spks that haven't been used, and return them as a Vec.
Yep, good idea. Note this would then also check more addresses than necessary (not breaking early when finding n).
If this were stored in the database it could also break early, since it's just a fast read.
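A rough sketch of the single-pass idea, using reduced stand-in types rather than BDK's. `Script` is modelled as raw bytes, `Tx` is a hypothetical struct holding only output scripts, and "used" is assumed to mean that the script appears in some transaction output. The set is built once, so each membership check afterwards is a hash lookup instead of a scan over every transaction.

```rust
use std::collections::HashSet;

/// Stand-in for `bitcoin::Script`: just the raw script bytes.
type Script = Vec<u8>;

/// Hypothetical, reduced transaction type.
struct Tx {
    /// Script pubkeys of this transaction's outputs.
    output_scripts: Vec<Script>,
}

/// Walk the transaction list once and collect every output script into a set.
fn used_scripts(txs: &[Tx]) -> HashSet<&Script> {
    txs.iter().flat_map(|tx| tx.output_scripts.iter()).collect()
}

/// Indexes of `script_pubkeys` that never appear in a transaction output.
fn unused_script_indexes(script_pubkeys: &[Script], txs: &[Tx]) -> Vec<usize> {
    let used = used_scripts(txs); // built once, before any script is checked
    script_pubkeys
        .iter()
        .enumerate()
        .filter(|(_, spk)| !used.contains(spk))
        .map(|(i, _)| i)
        .collect()
}
```

Building the set always costs one full pass over the transactions. The early exit mentioned in the reply can still be kept for the second step by returning the filtered iterator lazily instead of collecting it.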
|
@nickfarrow also, next time try to rebase on top of master instead of fetching and merging a specific commit. :) |
|
I had to push this PR to the next release so the team can focus on #593, and after that one you'll probably need to rebase again. But then I promise we'll work on getting this one in. :-) |
|
Yep, I think this one needs some bigger discussion first around whether it is worthwhile to mark used addresses directly in the database, as @rajarshimaitra suggested. If this were the case, this could be simplified to a function This PR's
Not sure what changed with the old commits. It's possible I added signoff lines to the previous commits by mistake, which may have updated them, sorry. Will take care next time. |
|
Upon further thoughts on this I have this rough idea of how it can be done:
Pro: Cons: I am willing to work on fleshing an impl out if this has Approach Acks.. |
|
Keep in mind we also have the key/value db, we don't have tables and columns there. We can add a flag to mark a script as used (similarly to how I suggested adding a flag for scripts that we've already setup rather than relying on just the derivation index), but getting the list of unused addresses will still require scanning |
LLFourn
left a comment
This PR is almost ready and provides something that is pretty useful. The main bits of work here are:
- To restore the previous (somewhat nonsensical) behaviour of LastUnused
- To add tests to check the wallet's derivation index is correct after calling batch unused (can just try and get a new address after and check its index)
As @rajarshimaitra mentions the best way to implement is to index things properly which in bdk is currently done in the database backend. In bdk_core I've done indexing of unused addresses. Since I was the one who requested this feature and I'm focused on bdk_core we could simply close this PR and wait until it lands. Does anyone else want this feature presently? @nickfarrow what do you think?
|
ACK on @LLFourn that this can go in as it is without much further changes.. I will check the behavior once again.. If this lands through |
* remove batch getting `n` unused addresses, just get them all (much simpler)
* use next_back() and next() for last and first unused
* test more cases for get_unused_address_indexes
* inline functions and simplified next addr
* create HashSet of txn scripts before checking unused
* add firstunused testcase for repeated unused
|
Is this one OK to add this to the |
|
IMO this PR is suboptimal because of #701. I think it should be fixed first to make the code in this PR make sense. @nickfarrow? |
|
Yep agree it may as well wait for improvements to |
|
Hey, we are in the process of releasing BDK 1.0, which will under the hood work quite differently from the current BDK. For this reason, I'm closing all the PRs that don't really apply anymore. If you think this is a mistake, feel free to rebase on master and re-open! |
Description
- AddressIndex::LastUnused to look back further than current_index
- AddressIndex::FirstUnused
- get_batch_unused_addresses

Notes to the reviewers
Builds upon #522
Currently BDK supports address indexing via LastUnused, which will return the address with current_index if it is unused, otherwise it will return a New address.
With this current logic, if you get two new addresses A1 and A2 and use A2, then LastUnused will give you a New address rather than the unused A1.
In order to more consistently utilize unused addresses I've added a new function get_unused_key_indexes(keychain) which returns a vector of indexes for the unused addresses in that keychain. Making use of this function, LastUnused now returns the most recent address that has not yet been used, and New if all addresses have been used.
In some cases it may be desirable to utilize unused addresses that reside earlier in the keychain, i.e. AddressIndex::FirstUnused in this PR. FirstUnused has the same caveat as LastUnused: that if the wallet has not yet detected an address has been used, it could return a used address.
Additionally a new public function get_batch_unused_addresses allows for retrieval of N unused addresses at once, prioritizing unused addresses first, then populating the remaining with New addresses (like FirstUnused).
For example: if a wallet builds a transaction involving many New internal addresses but that transaction is never broadcast, then all of these addresses can now easily be used in a later transaction via get_batch_unused_addresses.

Checklists
All Submissions:
- cargo fmt and cargo clippy before committing
New Features:
- CHANGELOG.md
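To make the LastUnused change from the notes concrete, here is a toy comparison of the behaviour before and after, as the description states it. The `Pick` enum and both function names are hypothetical, and used addresses are modelled as a plain set of indexes rather than a wallet database. With A1 at index 0 and A2 at index 1, and only A2 used, the old logic asks for a new address while the new logic returns A1.

```rust
use std::collections::HashSet;

/// What `LastUnused` resolves to in this toy model.
#[derive(Debug, PartialEq)]
enum Pick {
    /// Reuse an already derived address at this index.
    Existing(u32),
    /// Derive a new address at this index.
    New(u32),
}

/// Behaviour before the PR, as described: only the address at `current_index` is considered.
fn last_unused_before(current_index: u32, used: &HashSet<u32>) -> Pick {
    if used.contains(&current_index) {
        Pick::New(current_index + 1)
    } else {
        Pick::Existing(current_index)
    }
}

/// Behaviour proposed in the PR: look back through all derived indexes
/// and return the most recent one that is still unused.
fn last_unused_after(current_index: u32, used: &HashSet<u32>) -> Pick {
    (0..=current_index)
        .rev()
        .find(|i| !used.contains(i))
        .map_or(Pick::New(current_index + 1), Pick::Existing)
}
```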