Strict channel enabling#2091
Conversation
fcee865 to
a6b5089
Compare
There was a problem hiding this comment.
should maybe filter by EligibleToForward()?
There was a problem hiding this comment.
Ah, good idea! Should the same filtering be done for HasActiveLink below?
There was a problem hiding this comment.
yes it seems so. in fact, i think we may be sending out channel announcements prematurely in some cases then
There was a problem hiding this comment.
I think it'd be useful to have these be distinctly configurable. The naming is somewhat confusing since the config option is overloaded
There was a problem hiding this comment.
would you say it is sufficient to rename it? Making them distinct will complicate the logic a bit, and also seems they should be symmetric.
There was a problem hiding this comment.
hmm my thinking was that they would be asymmetric, or at least that the ability to configure them as such would be useful even if the defaults were the same
There was a problem hiding this comment.
still seems like overloading the parameter is confusing and that they should be separate
There was a problem hiding this comment.
this is also somewhat of a breaking change from the user's perspective. prior to this change, their channels will be reenabled immediately. after applying this change, their channels won't be routable for 20 mins. this behavior happens w/o making any changes to the config because we're overloading the param
There was a problem hiding this comment.
It should help with the health of the network, and in the future nodes might get punished (by getting used less for routing) for not disabling unreliable channels.
There was a problem hiding this comment.
It should help with the health of the network
I agree, which to me indicates that the defaults should be symmetric. though i still don't see a reason for not giving users the option to choose asymmetric timeouts if they want, esp if the code paths for enable/disable will be distinct anyway.
misnaming/overloading parameters has led to a great deal of confusion previously, imo we should try keep argument names as self-descriptive as possible, since most users will never read lnd -h
There was a problem hiding this comment.
Agree on changing the parameter name to something more descriptive.
But I'm reluctant to making it possible to set them asymmetrically, as we want to discourage that (I think?), and also clutters up the config.
There was a problem hiding this comment.
I still think we want to have this logic w/in the peer. Depending on the timing of the channelStatusWatcher, it could result in the reenable happening at 2x the specified timeout. I think this is okay for disables, but not something we want for enabling a channel
There was a problem hiding this comment.
#2080 still seems like it should go in, and then thread the reenable timeout to override the default
There was a problem hiding this comment.
I think the logic within the channelStatusWatcher should ensure that an active channel will stay disabled for at most InactiveChanInterval * 5 / 4 ?
The reason we want this for enabling as well is to attempt making the network converge to a state where nodes going offline often will have their channels disabled. Eventually we might even consider enabling them with an exponential backoff.
There was a problem hiding this comment.
I think the logic within the channelStatusWatcher should ensure that an active channel will stay disabled for at most InactiveChanInterval * 5 / 4 ?
Indeed, it is 1.25x the specified timeout
From my understanding, the channelStatusWatcher is effectively long-polling, and is susceptible to false positives where a peer is online for say 5 mins at a time, but disconnects and reconnects quickly so each time we think it's enabled.
IMO we should only reenable if the connection is actually stable for 20 (or w/e) minutes, which can't really be deduced via long polling. #2080 fixes this by ensuring we always do so only after the prescribed duration since the connection was initialized. We avoid this currently by immediately sending a reenable, so the watcher is never actually detecting the disable -> enable transition.
FWIW, I think long polling is sufficient for detecting disables, as being down four consecutive times should be an indicator that the channel isn't reliable. For enabling however, passing four times isn't really sufficient for determining if the connection is actually stable.
There was a problem hiding this comment.
Eventually we might even consider enabling them with an exponential backoff.
Yeah was pondering this as well, I think it'd be a good idea
There was a problem hiding this comment.
IMO we should only reenable if the connection is actually stable for 20 (or w/e) minutes, which can't really be deduced via long polling.
Totally agree, I like this idea! I would say if we do this we should also make this true for disables; if you cannot maintain a stable connection for 20 minutes, we'll disable you. I think the code would be easier to maintain and extend if they share the logic here.
To allow nodes to do infrequent restarts, we could say something like they must keep a rolling 18.5 out of 20 min uptime, with each disconnect counting as minimum 1 min downtime say ( I haven't given this too much thought yet). I imagine @valentinewallace's channel event service could come in handy here :)
cdcf292 to
8b2d113
Compare
|
The disable bit decoupling is now split into a separate PR, to reduce the size of this one, and since they are really two different discussions: #2115 |
The local path finding is now decoupled from the ChannelUpdate disable flags in the channel graph, so we don't have to immediately enable the channels when a peer connects. Instead we let the server's watchChannelStatus goroutine enable the channel after it is been active for a duration of inactivechantimeout.
8b2d113 to
05d2e92
Compare
|
Replaced by #2080. |
This PR enables strict channel enabling in
lnd. This means that similarly to how we disable channels, we require a disabled link to stay active for 20 minutes (default) before re-enabling it.This is done in two steps:
The path finding algorithm is modified to take into account an extra set of active local channels, that it will use when considering whether to ignore local channels. This improves the current path finding logic when usingSplit into path finding: Disable bit decoupling #2115mission control, as we will no longer attempt to route over local channels that are inactive.This essentially decouples the graph's disabled status from the local path finding, as the local channel status will be more up-to-date than what is found in the graph.
Builds on #2115
Replaces #2080