Skip to content

routing: retry upon max hops exceeded error.#2380

Closed
Bluetegu wants to merge 1 commit into
lightningnetwork:masterfrom
Bluetegu:hopexceeded-retry
Closed

routing: retry upon max hops exceeded error.#2380
Bluetegu wants to merge 1 commit into
lightningnetwork:masterfrom
Bluetegu:hopexceeded-retry

Conversation

@Bluetegu
Copy link
Copy Markdown
Contributor

@Bluetegu Bluetegu commented Dec 27, 2018

The shortest path (fee wise) may exceed the max number of hops. Currently, the find path algorithm bails out in this case, and as a result payment attempts between source and destination fail.

Instead attempt to find an alternate path with higher fee that can be used to route payments to destination.

@Roasbeef Roasbeef added enhancement Improvements to existing features / behaviour routing P3 might get fixed, nice to have bug fix labels Jan 2, 2019
@Bluetegu Bluetegu force-pushed the hopexceeded-retry branch 4 times, most recently from bf54072 to 372ffad Compare January 9, 2019 17:37
Comment thread routing/pathfind.go Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From #2444:

I think this point is too late to check max hops, because the search has already finished. It would be better to already check HopLimit when extending the bestNode in the main loop above. So don't extend any paths that are already HopLimit hops. That way, the algorithm will continue the search with more costly paths that may stay below the limit. Similar to how the fee limit is checked in processEdge.

I'm not sure I follow. Are you saying that PR#2380 does not fix the bug? It does fix the simple scenario I added in TestNewRoutePathTooLong to check that the bug was fixed. I am not aware of a graph in which the fix will not find a valid path, if such exists, but I haven't proved mathematically that such a graph doesn't exists either. Let me know if you have such a graph in mind.

Suppose the following graph:

Route 1: A -> x1 -> x2 -> x... -> x20 -> B (cost 1)
Route 2: A -> y1 -> y2 -> y... -> y20 -> B (cost 2)
Route 3: A -> B (cost 3)

FindPath will come up with Route 1, because it is the lowest cost one. It exceeds the hoplimit, so it invokes k-shortest paths. That comes up with Route 2 (second best). But it also exceeds the hop limit. The final result is that no route is found, while Route 3 is a perfectly fine alternative.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fixed that by including all shortest paths in the K search and filtered the long ones only at end. I believe this addresses any number of > HopLimit paths in the graph. It's becoming more elegant too, which is always a good sign :-). Thanks!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means that worst case you are iterating over all possible paths. If you iterate over all possible paths, you are indeed guaranteed to find the optimum, but with a very different time complexity than Dijkstra's.

Currently the algorithms used in lnd are Dijkstra and Yen's which have proofs. In my opinion, if we invent a new algorithm, it should have a proof too. Proof that it is optimal indeed and what the time/memory complexity of it is. Next step would be to evaluate the complexity to see whether it is acceptable (many problems are easily solvable given infinite time, but very hard in practice).

You can go down that path, but is it worth it? Too many hops isn't a practical problem at the moment. If the only way to pay is through a twenty hop route, you should look at opening a channel to a better routing node.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can go down that path, but is it worth it? Too many hops isn't a practical problem at the moment. If the only way to pay is through a twenty hop route, you should look at opening a channel to a better routing node.

The problem is that a malicious attacker may add low fee hops in the middle of the network and black-hole payments from a set of sources to a set of destinations.

Currently the algorithms used in lnd are Dijkstra and Yen's which have proofs. In my opinion, if we invent a new algorithm, it should have a proof too. Proof that it is optimal indeed and what the time/memory complexity of it is.

Current code has:
findPath: Dijkstra + fail upon lowest fee > hopLimit
findPaths: Dijkstra + Yen

With the fix we have
findPath: Dijkstra + Yen upon lowest fee > hop Limit
findPaths: Dijkstra + Yen

So I'm not sure what further analysis is required.

This means that worst case you are iterating over all possible paths. If you iterate over all possible paths, you are indeed guaranteed to find the optimum, but with a very different time complexity than Dijkstra's.

I didn't understand that. It iterates on all shortest paths, ala Yen's algorithm, until if finds a shortest path that is below hop limit.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that a malicious attacker may add low fee hops in the middle of the network and black-hole payments from a set of sources to a set of destinations.

What exactly has this to do with the hop limit? Or do you mean not black hole a payment, but prevent a route from being found?

With the fix we have findPath: Dijkstra + Yen upon lowest fee > hop Limit

It isn't Yen anymore, because it now contains recursion back to findKPaths.

I didn't understand that. It iterates on all shortest paths, ala Yen's algorithm, until if finds a shortest path that is below hop limit.

Yes, but it isn't bounded anymore. Previously, we gave it a max number of routes to return. Suppose this graph:

Nodes a1, .., a20, b1, ..., b20, ..., z1, ... z20
Node a1 is connected to b1, b2, ... , b20
Node a2 is connected to b1, b2, ... , b20
...
Node a20 is connected to b1, b2, ... , b20
Node b1 is connected to c1, c2, ..., c20
etc
Start node S is connected to a1, ... a20
End node E is connected from z1, ... z20

The only route within hop limit is S -> REAL -> E, but it is the most expensive route possible.

Your algorithm will now explore all possible routes through the network of nodes created. There are 20^26 possibilities for this.

This is what I mean with time complexity.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What exactly has this to do with the hop limit? Or do you mean not black hole a payment, but prevent a route from being found?

Yes, its a DoS attack in which the malicious attacker adds say 18 low fee hops between two major hubs, and as a result multiple nodes connected to these hub can no longer find paths to multiple destinations.

It isn't Yen anymore, because it now contains recursion back to findKPaths.

Please note that stopAtMaxHopsExceeded was added to avoid recursion. findKPaths is called only upon first time findPath finds lowest fee route is > hop limit.

Your algorithm will now explore all possible routes through the network of nodes created. There are 20^26 possibilities for this.

This implies that the attacker adds an enormous number of nodes and edges, which is impractical.
If this is a concern, although I personally don't think it is, we can easily add a limit on the number of too-long shortest-fee paths found, and bail out in findKPaths if it goes beyond that (i.e. we know how many too-long paths are in the shortestPaths slice at each iteration).

Are you suggesting that it is better not to fix this problem at all?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, its a DoS attack in which the malicious attacker adds say 18 low fee hops between two major hubs, and as a result multiple nodes connected to these hub can no longer find paths to multiple destinations.

Ok, I see.

Please note that stopAtMaxHopsExceeded was added to avoid recursion. findKPaths is called only upon first time findPath finds lowest fee route is > hop limit.

My point is that it is a new algorithm without proof.

This implies that the attacker adds an enormous number of nodes and edges, which is impractical.

It is not an enormous amount. Even if it would create 40 nodes (2 * 20), the number of combinations is already 2^20. So the attack vector you describe above isn't removed. Sender's path finding will go in a practically endless k-shortest loop.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not an enormous amount. Even if it would create 40 nodes (2 * 20), the number of combinations is already 2^20. So the attack vector you describe above isn't removed. Sender's path finding will go in a practically endless k-shortest loop.

I added a unit test to test this scenario, please see TestPathTooLongDoSAttack. It computes the correct path in time (under 15 seconds). The algorithm doesn't run on all combinations before it reaches the correct higher-fee path.

The reason in short is the search is breadth first. It doesn't look that way, as it runs on all possible spurs of the shortest path found, but it only pops one of them as the next shortest path, allowing the algorithm to find the correct path before popping additional candidate paths from the first found too-long shortest fee path(s).

I did not provide a proof here. I can try and provide one if needed. I believe the test does show that the fix is much more robust than it seems.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The running time really depends on the choice of fees. There is an optimum where the fees forces a maximum number of rounds. Suppose you do 2*20 nodes. On the first layer you use fee 2 and 3, on the 2nd layer fee 4 and 5, on the 3rd layer 8 and 9, on the nth layer 2^n and 2^n+1. It will still make the algo go through all 2^20 cases.

Checking the hop limit inside the Dijkstra algorithm will solve this case much more efficient.

@Bluetegu Bluetegu force-pushed the hopexceeded-retry branch 5 times, most recently from 0628324 to fc5dd54 Compare January 14, 2019 17:52
The shortest path (fee wise) may exceed the max number of hops. Attempt to
find an alternate path with higher fee instead.
@Roasbeef Roasbeef closed this Oct 8, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug fix enhancement Improvements to existing features / behaviour P3 might get fixed, nice to have routing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants