Fix case where Opening Channels get stuck forever.#8406
Conversation
|
Important Auto Review SkippedAuto reviews are limited to the following labels: llm-review. Please add one of these labels to enable auto reviews. Please check the settings in the CodeRabbit UI or the To trigger a single review, invoke the WalkthroughThe recent update to the Lightning Network Daemon (lnd) introduces new configuration parameters to improve channel management and error handling. Notably, Changes
Assessment against linked issues
Poem
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
e62f5f1 to
c61b346
Compare
There was a problem hiding this comment.
I wouldn't put test-only (or rather -mostly?) information in the public sample configuration. Assuming developers are clever enough to find those settings, I'd rather avoid confusing "plebs" with this. The defaults are fine, I guess?
There was a problem hiding this comment.
I agree. Maybe we can make them a development only config by adding the flags to DevConfig in lncfg/dev_integration.go?
There was a problem hiding this comment.
lemmie know what you think with the current approach.
ellemouton
left a comment
There was a problem hiding this comment.
Really great catch and great test additions 🔥 LGTM!
|
@coderabbitai review |
There was a problem hiding this comment.
Review Status
Actionable comments generated: 5
Configuration used: CodeRabbit UI
Commits
Files that changed from the base of the PR and between 0a29b37 and c61b346a630bf015a5b7e7977742f79fdaea6e3f.Files selected for processing (10)
- config.go (2 hunks)
- docs/release-notes/release-notes-0.18.0.md (1 hunks)
- itest/list_on_test.go (1 hunks)
- itest/lnd_psbt_test.go (2 hunks)
- lncfg/config.go (2 hunks)
- lntest/harness_assertion.go (1 hunks)
- peer/brontide.go (1 hunks)
- peer/brontide_test.go (3 hunks)
- sample-lnd.conf (1 hunks)
- server.go (1 hunks)
Additional comments: 16
lncfg/config.go (1)
- 20-26: The addition of
DefaultReservationTimeoutandDefaultZombieSweeperIntervalconstants introduces new default durations for channel management. Ensure that these values are appropriate and consistent with the intended behavior described in the PR.docs/release-notes/release-notes-0.18.0.md (1)
- 68-70: The release notes have been updated to include a fix for the removal of failed channels. This entry should be verified to ensure it accurately reflects the changes made and the issues addressed in the PR.
Verification successful
Verifying the Release Notes Entry
The release notes entry for the fix related to the removal of failed channels in the Lightning Network Daemon (lnd) is as follows:
The comment requests verification to ensure that this entry accurately reflects the changes made and the issues addressed in the PR. To verify this, I retrieved the title and body of the PR linked in the release notes using the provided shell script.
The output of the script shows that the PR title is "Fix case where Opening Channels get stuck forever." The body of the PR explains the issue and the changes made to address it. It also mentions the addition of an integration test to ensure the successful removal of a failed channel opening.
The PR title and body align with the release notes entry, confirming that the release notes accurately reflect the changes made and the issues addressed in the PR.
Therefore, the release notes entry for the fix related to the removal of failed channels in the Lightning Network Daemon (lnd) is accurate.
Conclusion
The release notes entry for the fix related to the removal of failed channels in the Lightning Network Daemon (lnd) accurately reflects the changes made and the issues addressed in the PR.
Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Verify that the linked PR in the release notes correctly addresses the issue. curl -s https://api.github.com/repos/lightningnetwork/lnd/pulls/8406 | jq '.title,.body'Length of output: 1472
itest/list_on_test.go (1)
- 565-568: A new test case named "fail funding flow psbt" has been added. Verify that this test case is included in the test suite and that it effectively tests the failure of the PSBT funding flow as intended.
itest/lnd_psbt_test.go (2)
- 1352-1357: The arguments
--reservationtimeout=1sand--zombiesweeperinterval=1sare set to ensure the remote peer fails the funding process quickly. This is a logical approach for testing failure scenarios.- 1386-1391: The error handling in
ht.ReceiveOpenChannelErrorusesfmt.Errorfto create an expected error, but thefmtpackage is not used elsewhere in the test. This is the correct usage offmt.Errorffor simulating an expected error.peer/brontide_test.go (1)
- 1457-1547: The new test function
TestRemovePendingChannelhas been added to ensure that pending channels can be removed successfully and that theremovePendingChannelchannel is initialized properly. This test function appears to be well-structured and covers the necessary assertions to validate the behavior of adding and then removing a pending channel. The use ofwait.NoErrorfrom thelntest/waitpackage is appropriate for waiting on conditions within a timeout period.lntest/harness_assertion.go (1)
- 292-300: The function
ReceiveOpenChannelErrorcorrectly waits for an expected error during the open channel flow and uses an internal functionreceiveOpenChannelUpdateto receive updates. Ensure that the error message includes sufficient context for debugging.config.go (7)
- 349-354: The
Configstruct has been updated with new fieldsReservationTimeoutandZombieSweeperInterval. Ensure that these new fields are properly documented, initialized, and used throughout the codebase.- 349-354: The
BackupFilePathfield has been moved within theConfigstruct. Ensure that this change does not affect the initialization or usage of this field, potentially causing file path issues.- 349-354: The descriptions for
UnsafeDisconnect,UnsafeReplay, andMaxPendingChannelshave been updated. Confirm that these changes are accurate and reflect the intended functionality.- 580-588: Default values for
ReservationTimeoutandZombieSweeperIntervalare set in theDefaultConfigfunction. Verify that these defaults are sensible and align with the system's expectations.Verification successful
It's clear that the default values for
ReservationTimeoutandZombieSweeperIntervalare set in thelncfg/config.gofile. TheDefaultReservationTimeoutis set to 10 minutes, and theDefaultZombieSweeperIntervalis set to 1 minute. These values seem sensible for the system's expectations, as they provide reasonable timeouts for channel reservations and zombie sweeper intervals.Given this information, the defaults appear to be aligned with the system's expectations. Therefore, the verification is successful.
Conclusion
The default values for
ReservationTimeoutandZombieSweeperIntervalare set to 10 minutes and 1 minute, respectively, in thelncfg/config.gofile, and they align with the system's expectations. The verification is successful.* 577-591: > 📝 **NOTE** > This review was outside the diff hunks and was mapped to the diff hunk with the greatest overlap. Original lines [349-588]Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Check for the default values of ReservationTimeout and ZombieSweeperInterval. rg --type go "DefaultReservationTimeout|DefaultZombieSweeperInterval"Length of output: 513
The
ValidateConfigfunction has been updated. Ensure that the new configuration fields are validated correctly and that any new constraints or checks are properly implemented.
- 580-588: The
LoadConfigfunction has been updated. Ensure that the new configuration fields are loaded correctly and that any new logic does not introduce errors or misconfigurations.- 580-588: The
ValidateConfigfunction has been updated. Ensure that the new configuration fields are validated correctly and that any new constraints or checks are properly implemented.peer/brontide.go (1)
- 509-511: Initialization of the
removePendingChannelchannel is correct and aligns with the PR objectives to fix hanging channels.server.go (1)
- 1439-1440: The
newServerfunction now uses theZombieSweeperIntervalandReservationTimeoutvalues from thecfgparameter instead of hardcoding them. This change allows for more flexibility and customization of these parameters.
Add a unit test for the removal of a pending channel.
dc7f070 to
b67cb6f
Compare
|
@coderabbitai review |
There was a problem hiding this comment.
Review Status
Actionable comments generated: 4
Configuration used: CodeRabbit UI
Commits
Files that changed from the base of the PR and between e1259cd and b67cb6fe5efb64669def2f846550f6177d32e3ba.Files selected for processing (11)
- docs/release-notes/release-notes-0.17.4.md (1 hunks)
- funding/manager.go (4 hunks)
- itest/list_on_test.go (1 hunks)
- itest/lnd_psbt_test.go (2 hunks)
- lncfg/config.go (2 hunks)
- lncfg/dev.go (2 hunks)
- lncfg/dev_integration.go (2 hunks)
- lntest/harness_assertion.go (1 hunks)
- peer/brontide.go (1 hunks)
- peer/brontide_test.go (3 hunks)
- server.go (2 hunks)
Files skipped from review as they are similar to previous changes (7)
- itest/list_on_test.go
- itest/lnd_psbt_test.go
- lncfg/config.go
- lntest/harness_assertion.go
- peer/brontide.go
- peer/brontide_test.go
- server.go
Additional comments: 8
lncfg/dev.go (2)
- 5-7: The import statement is correct and follows Go conventions.
- 27-30: The method
GetReservationTimeoutis correctly retrieving the default reservation timeout.docs/release-notes/release-notes-0.17.4.md (1)
- 22-26: The bug fix entry in the release notes is clear and informative, correctly summarizing the changes made in the PR.
lncfg/dev_integration.go (4)
- 5-7: The import statement is correct and follows Go conventions.
- 23-24: The new fields
ReservationTimeoutandZombieSweeperIntervalin theDevConfigstruct are well-defined and include descriptive comments.- 32-38: The method
GetReservationTimeoutcorrectly provides a default value if the field is not set. However, ensure that the default value aligns with the expected behavior in the integration environment.- 41-47: There is a typo in the method name
GeTZombieSweeperInterval; it should beGetZombieSweeperInterval. Additionally, the method correctly provides a default value if the field is not set. Verify that the default value is appropriate for the integration environment.- func (d *DevConfig) GeTZombieSweeperInterval() time.Duration { + func (d *DevConfig) GetZombieSweeperInterval() time.Duration {funding/manager.go (1)
- 4851-4851: The check for PSBT funding reservations should be documented to explain why these are not pruned.
ef06433 to
2cb21f0
Compare
guggero
left a comment
There was a problem hiding this comment.
Great catch! Just one small detail, otherwise looks good to me 💯
There was a problem hiding this comment.
Also make sure there is an actual value in the dev config?
E.g. if f.cfg.Dev != nil && f.cfg.Dev.ReservationTimeout != 0 {?
Same with the sweeper interval below.
There was a problem hiding this comment.
Or use the f.cfg.Dev.GetReservationTimeout() method that always returns a non-zero value?
There was a problem hiding this comment.
Good idea, I will restructure to use yy's approach.
There was a problem hiding this comment.
nice itest! I think there's no more cleanup needed here?
There was a problem hiding this comment.
yes I think so, the nodes are shut down when the context of the testcase cancels and if the test fails the CleanUp of the harness shuts them down.
2cb21f0 to
4065171
Compare
This adds an itest for a failed funding flow by our peer.
4065171 to
13e557d
Compare
|
@coderabbitai resolve |
…opening-issue Fix case where Opening Channels get stuck forever.
Fixes #8362 and I also #8251
So basically the
removePendingChannelchannel was not initialized causing ourreservationCoordinatorto get stuck forever. So after one call of thefailFundingFlowwe would not be able to accept or open any new channels.It did not panic because
nilchannels are acutally quite useful in golang to merge 2 channels for more information read:https://medium.com/justforfunc/why-are-there-nil-channels-in-go-9877cc0b2308
I will add an itest to test the successful removal of a failed channel opening.
Summary by CodeRabbit
New Features
Bug Fixes
Tests
Documentation