peer: send reenable updates after 20m by cfromknecht · Pull Request #2080 · lightningnetwork/lnd

cfromknecht · 2018-10-22T20:49:40Z

This commit adds a short delay before sending
out channel reenable messages. The aim is to
prevent flooding the network if we have
flapping peers. The approach ensures that
reenabling will only occur within a peer's
lifecycle, which can prevent lingering
goroutines from interfering with
enable/disable attempts across different
connections with the same peer.

Roasbeef · 2018-10-24T01:19:05Z

announceChanStatus will already delay this, and only enable if the state hasn't changed in the last 20 minutes. In addition, this is an incomplete solution as we need to decouple enable/disable in our local graph, from the announcement we send out to the network. We should immediately enable within our graph, but delay the announcement until we reach a stable point. cc @halseth

halseth · 2018-10-24T04:24:18Z

Will be taking this over to finish the decoupling.

cfromknecht · 2018-10-24T23:13:15Z

@Roasbeef correct, this does not address the decoupling of local graph from network gossip, but neither is that the intention of this PR.

The purpose of this PR is:

1) to not enable immediately, to avoid flooding in case the peer is unstable.
2) to remove the stray goroutine by ensuring reenabling happens within the peer lifecycle, as this can properly "cancel" the reenable if the peer disconnects before stability is reached.

> announceChanStatus will already delay this, and only enable if the state hasn't changed in the last 20 minutes

where does this happen? to me, it looks like announceChanStatus signs and sends a new update to the gossiper immediately, which is then only delayed by 30s?

> this is an incomplete solution as we need to decouple enable/disable in our local graph, from the announcement we send out to the network

again, that is mostly orthogonal to this PR. decoupling them allows us greater flexibility in picking timeouts for each. After decoupling, we are free to increase this timeout before sending out a new announcement.

Additionally, just letting channelStatusWatcher reenable has the downside that it may take up to almost twice the intended timeout before the channel is actually reenabled, whereas this will reliably reenable after the chosen timeout.

cfromknecht · 2018-10-25T00:35:17Z

> downside that it may take up to almost twice the intended timeout

reread through the code, and it's actually 1.25x

cfromknecht · 2018-10-31T04:27:55Z

After discussion w/ @halseth irl, seems like this should go in before #2091 as it is more accurate than allowing reenabling through long polling

halseth · 2018-10-31T13:36:29Z

Yep, #2115 needs to go in first though.

halseth · 2018-11-09T09:22:01Z

With #2115 merged I think this PR can replace #2091 if we set the channelReannounceDelay to InactiveChanTimeout (or create a new config option).

cfromknecht · 2018-11-11T00:45:22Z

Created a new config option called ChanStatusInterval, PTAL

halseth · 2018-11-13T16:45:00Z

is nilling necessary?

shaves some CPU cycles from select, figured why not?

adds some CPU cycles in my head at least

Why btw will it save CPU cycles? I think maybe that is only for tickers.

because the runtime doesn't have to acquire a lock to check if a value is being sent on a nil channel, it just skips it. it saves CPU cycles on every subsequent select

    package main

    import (
        "fmt"
        "time"
    )

    func main() {
        const n = 1000000
        const numChans = 10

        chans := make([]chan int, numChans)
        for i := range chans {
            chans[i] = make(chan int)
        }

        // Time n selects over ten non-nil channels.
        start := time.Now()
        for i := 0; i < n; i++ {
            select {
            case <-chans[0]:
            case <-chans[1]:
            case <-chans[2]:
            case <-chans[3]:
            case <-chans[4]:
            case <-chans[5]:
            case <-chans[6]:
            case <-chans[7]:
            case <-chans[8]:
            case <-chans[9]:
            default:
            }
        }
        fmt.Printf("non-nil: %v\n", time.Since(start))

        // Nil out every channel and time the same select again.
        for i := range chans {
            chans[i] = nil
        }

        start = time.Now()
        for i := 0; i < n; i++ {
            select {
            case <-chans[0]:
            case <-chans[1]:
            case <-chans[2]:
            case <-chans[3]:
            case <-chans[4]:
            case <-chans[5]:
            case <-chans[6]:
            case <-chans[7]:
            case <-chans[8]:
            case <-chans[9]:
            default:
            }
        }
        fmt.Printf("nil: %v\n", time.Since(start))
    }

running this gives, OMM:

    non-nil: 521.206479ms
    nil: 155.758026ms

In case anyone was interested, I modified the demo to show how the latency is linearly dependent on the number of nil channels:

    package main

    import (
        "fmt"
        "time"
    )

    func main() {
        const n = 1000000
        const numChans = 10

        chans := make([]chan int, numChans)
        for i := range chans {
            chans[i] = make(chan int)
        }

        // testNNil nils out the first ii channels, then times n selects.
        testNNil := func(ii int) {
            for i := range chans[:ii] {
                chans[i] = nil
            }

            start := time.Now()
            for i := 0; i < n; i++ {
                select {
                case <-chans[0]:
                case <-chans[1]:
                case <-chans[2]:
                case <-chans[3]:
                case <-chans[4]:
                case <-chans[5]:
                case <-chans[6]:
                case <-chans[7]:
                case <-chans[8]:
                case <-chans[9]:
                default:
                }
            }
            fmt.Printf("%d-nil: %v\n", ii, time.Since(start))
        }

        // Run with 0 through numChans channels set to nil.
        for i := 0; i <= numChans; i++ {
            testNNil(i)
        }
    }

outputs:

    0-nil: 529.313487ms
    1-nil: 508.630059ms
    2-nil: 492.725965ms
    3-nil: 462.589964ms
    4-nil: 437.568966ms
    5-nil: 408.646654ms
    6-nil: 358.446526ms
    7-nil: 301.32741ms
    8-nil: 244.178853ms
    9-nil: 195.43525ms
    10-nil: 160.556439ms

Wow, nice! TIL :)

halseth · 2018-11-13T16:45:39Z

needs a rebase

halseth · 2018-12-05T12:49:27Z

Needs a rebase now that #2115 is in!

Adds a configuration option for specifying delay between detecting active/inactive channels and sending an update to the network. The default interval is 20m.

wpaulino

Travis failed with:

    lnd_test.go:108: lnd finished with error (stderr):
        exit status 1
        unknown flag `inactivechantimeout'
        unknown flag `inactivechantimeout'
        
    --- FAIL: TestLightningNetworkDaemon/send_update_disable_channel (36.50s)

wpaulino · 2018-12-06T08:02:23Z

s/dispatchStatuses/reenableTimeout

This commit adds a configurable delay before sending out channel reenable messages. The aim is to prevent flooding the network if we have flapping peers. The approach ensures that reenabling will only occur within a peer's lifecycle, which can prevent lingering goroutines from interfering with enable/disable attempts across different connections with the same peer.

halseth · 2018-12-07T08:08:53Z

Another rebase needed 🤣

Roasbeef · 2018-12-15T01:13:05Z

+	// reenableTimeout will fire once after the configured channel status
+	// interval  has elapsed. This will trigger us to sign new channel
+	// updates and broadcast them with the "disabled" flag unset.
+	reenableTimeout := time.After(cfg.ChanStatusInterval)


Shouldn't it be possible for the existing watchChannelStatus method to also take on this responsibility? As is right now, it's possible for both of these goroutines to conflict, and cause a double-ish re-enable for channels.

Generally, I find the existing logic in watchChannelStatus hard to follow w.r.t the precise delay between enable and disable. I wouldn't be against abstracting it further to isolate to its own package so we can properly unit test it (with necessary interfaces added).

we decided on not using long-polling for reenable as it can lead to false positives. agree with the comments on the existing logic, i will extend this PR to refactor that to only be responsible for 1) dampening reenables and 2) disable still-enabled channels after the configured interval via long polling

I think it would be cleanest to make the long polling only responsible for disabling channels.

Since we poll to disable, and require 20 minutes (configurable) steady connection to enable, I don't think they should conflict?

cfromknecht · 2019-02-19T21:59:58Z

Closing since this was replaced by #2411

halseth mentioned this pull request Oct 24, 2018

Strict channel enabling #2091

Closed

cfromknecht added the p2p (Code related to the peer-to-peer behaviour), P3 (might get fixed, nice to have), and gossip labels Oct 31, 2018

cfromknecht force-pushed the delay-channel-enable branch from 6dbce84 to 1a1873e Compare November 11, 2018 00:44

cfromknecht force-pushed the delay-channel-enable branch 2 times, most recently from 113541f to c8d0367 Compare November 11, 2018 00:47

halseth suggested changes Nov 13, 2018

cfromknecht force-pushed the delay-channel-enable branch 2 times, most recently from 7a6e779 to 8df4db5 Compare November 15, 2018 01:00

cfromknecht changed the title ~~peer: send reenable updates after 5s~~ peer: send reenable updates after 20m Nov 15, 2018

cfromknecht force-pushed the delay-channel-enable branch 2 times, most recently from 45b6d0d to a04fffd Compare December 6, 2018 03:18

config+server: rename InactiveChanTimeout to ChanStatusInterval

0d8dcf8

Adds a configuration option for specifying delay between detecting active/inactive channels and sending an update to the network. The default interval is 20m.

cfromknecht force-pushed the delay-channel-enable branch from a04fffd to 7ed95a7 Compare December 6, 2018 03:20

wpaulino reviewed Dec 6, 2018

Comment thread peer.go Outdated

cfromknecht reacted with thumbs up emoji

cfromknecht added 2 commits December 6, 2018 16:03

lnd_test: pass new --chanstatusinterval flag

97f4ff0

cfromknecht force-pushed the delay-channel-enable branch from 7ed95a7 to 97f4ff0 Compare December 7, 2018 00:26

Roasbeef reviewed Dec 15, 2018

halseth mentioned this pull request Dec 21, 2018

peer: only send send enable update for disabled chans #2017

Closed

cfromknecht mentioned this pull request Jan 4, 2019

netann: channel status manager #2411

Merged

5 tasks

cfromknecht closed this Feb 19, 2019
