Skip to content

lnrpc/routerrrpc+routing: fix send to route error propagation#5479

Merged
Roasbeef merged 3 commits into
lightningnetwork:masterfrom
Roasbeef:fix-send-to-route-error-propagation
Jul 7, 2021
Merged

lnrpc/routerrrpc+routing: fix send to route error propagation#5479
Roasbeef merged 3 commits into
lightningnetwork:masterfrom
Roasbeef:fix-send-to-route-error-propagation

Conversation

@Roasbeef
Copy link
Copy Markdown
Member

@Roasbeef Roasbeef commented Jul 7, 2021

In this commit, we fix a regression introduced by a recent bug fix in
this area. Before this change, we'd inspect the error returned by
processSendError, and then fail the payment from the PoV of mission
control using the returned error.

A recent refactoring removed processSendError and combined the logic
with tryApplyChannelUpdate in order to introduce a new
handleSendError method that consolidates the logic within the
shardHandler. Along the way, the behavior of the prior check was
replicated in the form of a new internal failPayment closure. However,
the new function closure ends up returning a channeldb.FailureReason
instance, which is actually an error.

In the wild, when SendToRoute fails due to an error at the
destination, then this new logic caused the handleSendErorr method to
fail with an error, returning an unstructured error back to the caller,
instead of the usual payment failure details.

We fix this by no longer checking the handleSendErorr for an error as
normal. The handleSendErorr function as is will always return an error
of type *channeldb.FailureReason, therefore we don't need to treat it
as a normal error. Instead, we check for the type of error returned, and
update the control tower state accordingly.

With this commit, the test added in the prior commit now passes.

Fixes #5477.

@Roasbeef Roasbeef added this to the 0.13.1 milestone Jul 7, 2021
@Roasbeef Roasbeef requested a review from yyforyongyu July 7, 2021 02:59
Comment thread routing/payment_lifecycle.go Outdated
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to ReportPaymentFail below, we just count this as an internal failure reason. This lets the method always return *channeldb.FailureReason.

The downside is that we lose this error context from the PoV, of the caller.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer the internalErrorReason. To me errors come from other packages/services do seem to be "internal" when we are viewing it as the user of the routing. It's probably better (haven't thought through it) though we could wrap the error inside the internal error and returns it back to the caller.

Comment thread routing/router.go Outdated
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the method should maybe just return *channeldb.FailureReason directly so it's clear that it can't return a "real" generic error.

Comment thread routing/router_test.go Outdated
Comment thread routing/router_test.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: there's a typo in the commit message,

As is, this test will fail as a recent refactoring causes us to return a
channeldb.FailureReason error, since the newly added handleSendError
code path in the SendToRoute method will return the raw error, ratehr rather
than the shardError, which is of the expected type.

Comment thread routing/router_test.go Outdated
Comment thread routing/router.go Outdated
Comment thread channeldb/payment_control.go Outdated
Copy link
Copy Markdown
Member

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix! LGTM, just a few nits.

I haven't dug into the details about how errors are handled here, just a thought. It's probably better if we could make package-specific error types so that the raw error is never returned. Maybe a errors.go under every package, and when returning an error inside the package, it must use the defined types and no more if err != nil { return err}. This would have the effect of forcing us to look into the errors returned and assign the appropriate type, so error-type-aware...

Comment thread routing/router.go Outdated
Comment thread routing/payment_lifecycle.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer the internalErrorReason. To me errors come from other packages/services do seem to be "internal" when we are viewing it as the user of the routing. It's probably better (haven't thought through it) though we could wrap the error inside the internal error and returns it back to the caller.

@Roasbeef
Copy link
Copy Markdown
Member Author

Roasbeef commented Jul 7, 2021

It's probably better if we could make package-specific error types so that the raw error is never returned.

Yeah generally we have this, like the set of errors in routing/errors.go. The issue here however, was the ambiguity w.r.t if an error was actual a fatal error, or an "internal error" to eventually be passed up and/or ignored.

@Roasbeef Roasbeef force-pushed the fix-send-to-route-error-propagation branch 2 times, most recently from 2c040ad to 4be44ec Compare July 7, 2021 22:01
Roasbeef added 3 commits July 7, 2021 15:31
…hance error

In this commit, we modify the existing `TestSendToRouteStructuredError`
test to return an error that doesn't trigger the second chance logic.
Otherwise, we'll get a nil failure result from the mission control
interpretation, meaning we won't exercise the full code path. Instead,
we use a terminal error to ensure that the expected code path is hit.

As is, this test will fail as a recent refactoring causes us to return a
`channeldb.FailureReason` error, since the newly added `handleSendError`
code path in the `SendToRoute` method will return the raw error, rather
than the `shardError`, which is of the expected type.
In this commit, we fix a regression introduced by a recent bug fix in
this area. Before this change, we'd inspect the error returned by
`processSendError`, and then fail the payment from the PoV of mission
control using the returned error.

A recent refactoring removed `processSendError` and combined the logic
with `tryApplyChannelUpdate` in order to introduce a new
`handleSendError` method that consolidates the logic within the
`shardHandler`. Along the way, the behavior of the prior check was
replicated in the form of a new internal `failPayment` closure. However,
the new function closure ends up returning a `channeldb.FailureReason`
instance, which is actually an `error`.

In the wild, when `SendToRoute` fails due to an error at the
destination, then this new logic caused the `handleSendErorr` method to
fail with an error, returning an unstructured error back to the caller,
instead of the usual payment failure details.

We fix this by no longer checking the `handleSendErorr` for an error as
normal. The `handleSendErorr` function as is will always return an error
of type `*channeldb.FailureReason`, therefore we don't need to treat it
as a normal error. Instead, we check for the type of error returned, and
update the control tower state accordingly.

With this commit, the test added in the prior commit now passes.

Fixes lightningnetwork#5477.
@Roasbeef Roasbeef force-pushed the fix-send-to-route-error-propagation branch from 4be44ec to 079ab1c Compare July 7, 2021 22:31
@Roasbeef Roasbeef merged commit 583dee9 into lightningnetwork:master Jul 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Payment failures are no longer populated as structured errors, payment failure details not returned

2 participants