[reliable payments] switch result store clean-up#3131
Conversation
|
Is it worth rebasing this and see how much code is left? |
|
Should maybe wait for MPP changes to determine whether this is still needed. |
|
Do we still care about this? |
|
As stated in #3703 the bucket takes up a good chunk of space unnecessarily, so I think it would make sense to finish this one. |
|
@halseth put it back on the milestone, looks like it's gonna take quite a rebase... |
9c0c647 to
5466458
Compare
|
Rebased. Ended up doing it in a much simpler way, that also gets rid of the gap from the original PR... |
| } | ||
|
|
||
| var toClean [][]byte | ||
| if err := networkResults.ForEach(func(k, _ []byte) error { |
There was a problem hiding this comment.
This could be further simplified by using a kvdb.RwCursor. Using the cursor there's no need to collect and delete but items can be deleted on the fly.
There was a problem hiding this comment.
Nice, didn't know about that! Done 👍
| t.Fatalf("unable to get result: %v", err) | ||
| } | ||
| if i >= 2 && err != ErrPaymentIDNotFound { | ||
| t.Fatalf("expected ErrPaymentIDNotFound, got %v", err) |
There was a problem hiding this comment.
Similarly, you could use require.Error.
There was a problem hiding this comment.
I don't know if that can be used to match on the exact error?
There was a problem hiding this comment.
this should do it: require.Error(t, ErrPaymentIDNotFound, err)
There was a problem hiding this comment.
Couldn't get it to work. Looks like require.Error only checks that it is non-nil.
| return err | ||
| } | ||
|
|
||
| for _, a := range payment.HTLCs { |
There was a problem hiding this comment.
It seems to me that p which is an InFlightPayment has an Attempts member which holds the same info? (Could be I'm wrong here lacking deeper knowledge of the underlying code.)
There was a problem hiding this comment.
I deleted that field in the first commit, since it was not used for anything else :)
There was a problem hiding this comment.
Need to upgrade my memory to be able the handle three consecutive commits :)
5466458 to
02bd06a
Compare
bhandras
left a comment
There was a problem hiding this comment.
A few nits, otherwise LGTM 💡
There was a problem hiding this comment.
nit: could be
if numDeleted > 0 {
log.Infof("Removed %d...)
}
| t.Fatalf("unable to get result: %v", err) | ||
| } | ||
| if i >= 2 && err != ErrPaymentIDNotFound { | ||
| t.Fatalf("expected ErrPaymentIDNotFound, got %v", err) |
There was a problem hiding this comment.
this should do it: require.Error(t, ErrPaymentIDNotFound, err)
| return err | ||
| } | ||
|
|
||
| for _, a := range payment.HTLCs { |
There was a problem hiding this comment.
Need to upgrade my memory to be able the handle three consecutive commits :)
02bd06a to
44b71b4
Compare
cfromknecht
left a comment
There was a problem hiding this comment.
LGTM, only one comment about deleting under a cursor. i think this is okay, just wanted to bring it up just in case
There was a problem hiding this comment.
didn't we have an issue in the past where deleting while iterating caused issues? maybe i'm misremembering the issue tho...
There was a problem hiding this comment.
You're right, this is still a bug: etcd-io/bbolt#146
When you mention it I remember we had to work around it in btcwallet
44b71b4 to
d52eabc
Compare
That let us clean up handed off payment results, as they will never be queried again.
1ff47c6 to
90a59fe
Compare
This commit makes GetPaymentResult return a channel that should be closed by the caller when the result from the result channel is handled. This instructs the Switch that it is safe to delete the result from itsdatabase.
We make the router close this channel after it has successfully recorded the result, or decided it can re-attemptThis is a follow-up from #2762 and is not critical for the operation of
lnd, but the network results would never be deleted from the Switch's store, causing them to take up space. This fixes that by deleting them when we know for sure that the router has handled the result, and won't need it anymore.There is still a gap that can happen if we crash after the router has acked the result, but before the switch has deleted the payment, but in the worst case it will just leave a result lingering in the store.2020 update: we do the clean up simply by syncing the control tower with the network result store on startup, garbage collecting the never-to-be-read-again results.