peer: send reenable updates after 20m #2080
Will be taking this over to finish the decoupling.
@Roasbeef correct, this does not address the decoupling of local graph from network gossip, but neither is that the intention of this PR. The purpose of this PR is
where does this happen? to me, it looks like
again, that is mostly orthogonal to this PR. decoupling them allows us greater flexibility in picking timeouts for each. After decoupling, we are free to increase this timeout before sending out a new announcement. Additionally, just letting
reread through the code, and it's actually 1.25x
Yep, #2115 needs to go in first though.
6dbce84 to 1a1873e
Created a new config option called
113541f to c8d0367
shaves some CPU cycles from select, figured why not?
adds some CPU cycles in my head at least
Why btw will it save CPU cycles? I think maybe that is only for tickers.
because the runtime doesn't have to acquire a lock to check if a value is being sent on a nil channel, it just skips it. it saves CPU cycles on every subsequent select
package main

import (
	"fmt"
	"time"
)

func main() {
	const n = 1000000
	const numChans = 10

	// Start with every channel allocated (non-nil) but never written to.
	chans := make([]chan int, numChans)
	for i := range chans {
		chans[i] = make(chan int)
	}

	// Time n non-blocking selects over ten non-nil channels.
	start := time.Now()
	for i := 0; i < n; i++ {
		select {
		case <-chans[0]:
		case <-chans[1]:
		case <-chans[2]:
		case <-chans[3]:
		case <-chans[4]:
		case <-chans[5]:
		case <-chans[6]:
		case <-chans[7]:
		case <-chans[8]:
		case <-chans[9]:
		default:
		}
	}
	fmt.Printf("non-nil: %v\n", time.Since(start))

	// Set every channel to nil and repeat the same measurement.
	for i := range chans {
		chans[i] = nil
	}
	start = time.Now()
	for i := 0; i < n; i++ {
		select {
		case <-chans[0]:
		case <-chans[1]:
		case <-chans[2]:
		case <-chans[3]:
		case <-chans[4]:
		case <-chans[5]:
		case <-chans[6]:
		case <-chans[7]:
		case <-chans[8]:
		case <-chans[9]:
		default:
		}
	}
	fmt.Printf("nil: %v\n", time.Since(start))
}
Running this gives, on my machine:
non-nil: 521.206479ms
nil: 155.758026ms
In case anyone is interested, I modified the demo to show how the latency falls roughly linearly with the number of nil channels:
package main

import (
	"fmt"
	"time"
)

func main() {
	const n = 1000000
	const numChans = 10

	chans := make([]chan int, numChans)
	for i := range chans {
		chans[i] = make(chan int)
	}

	// testNNil sets the first ii channels to nil, then times n
	// non-blocking selects over all ten channels.
	testNNil := func(ii int) {
		for i := range chans[:ii] {
			chans[i] = nil
		}
		start := time.Now()
		for i := 0; i < n; i++ {
			select {
			case <-chans[0]:
			case <-chans[1]:
			case <-chans[2]:
			case <-chans[3]:
			case <-chans[4]:
			case <-chans[5]:
			case <-chans[6]:
			case <-chans[7]:
			case <-chans[8]:
			case <-chans[9]:
			default:
			}
		}
		fmt.Printf("%d-nil: %v\n", ii, time.Since(start))
	}

	// Sweep from zero nil channels up to all ten.
	for i := 0; i < 11; i++ {
		testNNil(i)
	}
}
outputs:
0-nil: 529.313487ms
1-nil: 508.630059ms
2-nil: 492.725965ms
3-nil: 462.589964ms
4-nil: 437.568966ms
5-nil: 408.646654ms
6-nil: 358.446526ms
7-nil: 301.32741ms
8-nil: 244.178853ms
9-nil: 195.43525ms
10-nil: 160.556439ms
7a6e779 to 8df4db5
Needs a rebase now that #2115 is in!
45b6d0d to a04fffd
Adds a configuration option for specifying the delay between detecting active/inactive channels and sending an update to the network. The default interval is 20m.
a04fffd to 7ed95a7
wpaulino left a comment
Travis failed with:
lnd_test.go:108: lnd finished with error (stderr):
exit status 1
unknown flag `inactivechantimeout'
unknown flag `inactivechantimeout'
--- FAIL: TestLightningNetworkDaemon/send_update_disable_channel (36.50s)
s/dispatchStatuses/reenableTimeout
This commit adds a configurable delay before sending out channel reenable messages. The aim is to prevent flooding the network if we have flapping peers. The approach ensures that reenabling will only occur within a peer's lifecycle, which can prevent lingering goroutines from interfering with enable/disable attempts across different connections with the same peer.
7ed95a7 to 97f4ff0
Another rebase needed 🤣
// reenableTimeout will fire once after the configured channel status
// interval has elapsed. This will trigger us to sign new channel
// updates and broadcast them with the "disabled" flag unset.
reenableTimeout := time.After(cfg.ChanStatusInterval)
Shouldn't it be possible for the existing watchChannelStatus method to also take on this responsibility? As is right now, it's possible for both of these goroutines to conflict, and cause a double-ish re-enable for channels.
Generally, I find the existing logic in watchChannelStatus hard to follow w.r.t. the precise delay between enable and disable. I wouldn't be against abstracting it further and isolating it in its own package so we can properly unit test it (with the necessary interfaces added).
We decided against using long polling for reenable, as it can lead to false positives. Agree with the comments on the existing logic; I will extend this PR to refactor it so that it is only responsible for 1) dampening reenables and 2) disabling still-enabled channels after the configured interval via long polling.
I think it would be cleanest to make the long polling only responsible for disabling channels.
Since we poll to disable, and require 20 minutes (configurable) steady connection to enable, I don't think they should conflict?
Closing since this was replaced by #2411.
This commit adds a short delay before sending
out channel reenable messages. The aim is to
prevent flooding the network if we have
flapping peers. The approach ensures that
reenabling will only occur within a peer's
lifecycle, which can prevent lingering
goroutines from interfering with
enable/disable attempts across different
connections with the same peer.