Skip to content

routing+channeldb: mpp bucket structure#4000

Merged
Roasbeef merged 16 commits into
lightningnetwork:masterfrom
joostjager:payment-raw-failure
Mar 10, 2020
Merged

routing+channeldb: mpp bucket structure#4000
Roasbeef merged 16 commits into
lightningnetwork:masterfrom
joostjager:payment-raw-failure

Conversation

@joostjager
Copy link
Copy Markdown
Contributor

@joostjager joostjager commented Feb 13, 2020

Update database structure to accommodate for multiple htlcs per payment. This is a preparation for mpp multi-shard sending (#3970 and #3967).

Additionally this pr adds a raw failure field to the payment htlc bucket. This is required to make SendToRoute reliable across restarts. That work will be done in a separate pr, but the db change itself is in this pr to prevent an additional migration later on. Also having the per-htlc failure reasons available helps with debugging issues in general.

Both changes are visible in the output of ListPayments. That will now show all the attempted htlcs including the failed ones and their failure reasons.

@joostjager joostjager force-pushed the payment-raw-failure branch 3 times, most recently from 2376822 to e375741 Compare February 19, 2020 16:40
@joostjager joostjager force-pushed the payment-raw-failure branch 3 times, most recently from 7a195eb to df57625 Compare February 20, 2020 09:14
@joostjager joostjager changed the title routing+channeldb: store htlc failure routing+channeldb: mpp bucket structure Feb 20, 2020
@joostjager joostjager force-pushed the payment-raw-failure branch 16 times, most recently from 21b708a to ff91bc0 Compare February 21, 2020 10:03
@joostjager joostjager marked this pull request as ready for review February 21, 2020 12:48
@joostjager joostjager removed the request for review from Roasbeef February 21, 2020 12:49
@joostjager
Copy link
Copy Markdown
Contributor Author

PR is ready for first pass. Migration commit needs to be squashed with db changes commit

Copy link
Copy Markdown
Contributor

@cfromknecht cfromknecht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joostjager pr is looking really close! my main comment rn is that creation time is still left in unix time, while settle and fail times are in nanoseconds. imo we should convert them in this migration

Comment thread routing/router_test.go
Comment thread channeldb/mp_payment.go Outdated
Comment thread routing/control_tower.go Outdated
Comment thread channeldb/payment_control.go Outdated
Comment thread channeldb/migration13/migration.go
Comment thread channeldb/payments.go Outdated
Comment thread channeldb/payment_control.go Outdated
Comment thread channeldb/mp_payment.go
@joostjager joostjager force-pushed the payment-raw-failure branch 6 times, most recently from c98c58d to a95bf0d Compare March 9, 2020 10:36
@joostjager
Copy link
Copy Markdown
Contributor Author

Removed the "0x"+ part from the migration. Replaced by migtest.Hex helper function

joostjager and others added 11 commits March 9, 2020 11:43
MPPaymentCreationInfo and PaymentCreationInfo are identical except for
the naming of CreationTime/CreationDate.
To better distinguish payments from HTLCs, we rename the attempt info
struct to HTLCAttemptInfo. We also embed it into the HTLCAttempt struct,
to avoid having to duplicate this information.

The paymentID term is renamed to attemptID.
Previously this was tested as a white box. Database access methods were
duplicated as test code and compared to the return value of the code
under test. This approaches leads to brittle test because it relies
heavily on implementation details. This commit changes this and prepares
for additional test coverage being added in later commits.
Duplicate payments is legacy that we keep alive for accounting purposes.
This commit isolates the deserialization logic for duplicate payments in
its own file, so that regular payment logic and db structure can evolve
without needing to handle/migrate the legacy data.
In a later commit, htlc raw failure messages will be exposed through the
main rpc. This is a preparation for that.
These functions will (indirectly) be called by the main rpc server and
can no longer stay conditionally compiled.
Preparation for marshalling wire errors as part of rpc payment lookups.
Preparation for when we need to return errors in a next commit.
Removes this unnecessary dependency allowing migration code to use
utility functions from channeldb/migtest.
@joostjager joostjager force-pushed the payment-raw-failure branch 2 times, most recently from 4428cac to 2dfb7dc Compare March 9, 2020 10:47
@joostjager joostjager requested a review from cfromknecht March 9, 2020 10:52
Copy link
Copy Markdown
Contributor

@halseth halseth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No major comments, PR looks solid 👍

Comment thread channeldb/payment_control.go Outdated
Comment thread channeldb/payment_control.go Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If htlc.Settle != nil we will never allow failing the payment.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that correct? We have the preimage, so how can the payment be failed?

Comment thread channeldb/payments.go
if err != nil {
return err
}
attemptInfo.AttemptID = aid
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cfromknecht Thoughts?

Comment thread channeldb/mp_payment.go
// Write failure. If there is no failure message, write an empty
// byte slice.
var messageBytes bytes.Buffer
if f.Message != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the message have to come before the reason in the serialized code?

Comment thread channeldb/migration13/migration.go Outdated
Comment thread channeldb/migration13/migration.go Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably easier to create a time struct from the bytes, then write out as UnixNano.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that easier? Conversion from secs to nanosecs isn't more than this multiplication.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you commented on this because of the stray timeBytes[0]=0 statement?

joostjager and others added 5 commits March 9, 2020 18:31
This commit converts the database structure of a payment so that it can
not just store the last htlc attempt, but all attempts that have been
made. This is a preparation for mpp sending.

In addition to that, we now also persist the fail time of an htlc. In a
later commit, the full failure reason will be added as well.

A key change is made to the control tower interface. Previously the
control tower wasn't aware of individual htlc outcomes. The payment
remained in-flight with the latest attempt recorded, but an outcome was
only set when the payment finished. With this commit, the outcome of
every htlc is expected by the control tower and recorded in the
database.

Co-authored-by: Johan T. Halseth <johanth@gmail.com>
This commit extends the htlc fail info with the full failure reason that
was received over the wire. In a later commit, this info will also be
exposed on the rpc interface. Furthermore it serves as a building block
to make SendToRoute reliable across restarts.
This commit adds test helper code to dump, restore and verify the
low-level bbolt database structure.
This commit migrates the payments in the database to a new structure
that allows for multiple htlcs per payments. The migration introduces a
new sub-bucket that contains a list of htlcs and moves the old single
htlc into that.
Adds a new rpc field to the payment htlc proto message and populates it
with data that is now stored in the db.
Copy link
Copy Markdown
Contributor

@cfromknecht cfromknecht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM this looks ready to land ⚡️

Copy link
Copy Markdown
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🔰

Ran the migration on a few nodes (including those known to have duplicate payments), and the resulting diff in the ListPayments output is as expected (addition of the new fields).

Left some additional comments, but these are non-blocking and can be resolved in follow up PRs.

Comment thread lnrpc/rpc.proto
}


message Failure {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking, but I'd prefer to just copy over the protos compared to breaking users of the existing routerrpc package (some wallets now use this exclusively).

copy(newCreationInfo, creationInfo)

// Convert to nano seconds.
timeBytes := newCreationInfo[32+8 : 32+8+8]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non-blocking but this section could be made more readable by getting ride of the magic numbers and using types from the time package.

return nil, errors.New("unknown htlc failure reason")
}

return rpcFailure, nil
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There're a few fields that we appear to not properly be setting (might as well give all the information we have):

                    "failure": {
                        "code": "TEMPORARY_CHANNEL_FAILURE",
                        "channel_update": {
                            "signature": "2bc91eb05e6fe9e286e0e09372efdaafec4a62d224bdb7db7d23ea431722ac801ca39a9dd6827fe6d8913e806a1e6ed43ec7cae13a4e3bb8ffb40efbec6fdc11",
                            "chain_hash": "43497fd7f826957108f4a30fd9cec3aeba79972084e90ead01ea330900000000",
                            "chan_id": "1832711061194407937",
                            "timestamp": 1583766633,
                            "message_flags": 1,
                            "channel_flags": 1,
                            "time_lock_delta": 15,
                            "htlc_minimum_msat": "1000",
                            "base_fee": 0,
                            "fee_rate": 0,
                            "htlc_maximum_msat": "16609154000",
                            "extra_opaque_data": null
                        },
                        "htlc_msat": "0",
                        "onion_sha_256": null,
                        "cltv_expiry": 0,
                        "flags": 0,
                        "failure_source_index": 1,
                        "height": 0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants