splice: Reestablish when commit or sig sends fail#6840
Conversation
95105a3 to
da8aa04
Compare
vincenzopalazzo
left a comment
There was a problem hiding this comment.
Looks like that test are failing
test_splice passed 1 out of the required 1 times. Success!
===End Flaky Test Report===
=========================== short test summary info ============================
FAILED tests/test_opening.py::test_rbf_reconnect_tx_construct - TimeoutError: Unable to find "[re.compile('dualopend daemon died before signed PSBT returned')]" in logs.
FAILED tests/test_splicing_disconnect.py::test_splice_disconnect_sig - TimeoutError: Unable to find "[re.compile('CHANNELD_AWAITING_SPLICE to CHANNELD_NORMAL')]" in logs.
===== 2 failed, 105 passed, 582 skipped, 33 warnings in 699.05s (0:11:39) ======
497d677 to
166c3ae
Compare
1e41866 to
50bcd89
Compare
rustyrussell
left a comment
There was a problem hiding this comment.
I'm going to accept this since we're close to release and it only effects experimental. However, it's a giant blob of unreviewable changes, which is not nice :(
At least one thing is wrong, though.
553db0f to
80d177b
Compare
8fdab93 to
c1ffe15
Compare
ksedgwic
left a comment
There was a problem hiding this comment.
Investigated the "commitment mismatch" VLS was seeing with tests/test_splicing_disconnect.py::test_splice_disconnect_commit, looks like this is not a bug and we can easily handle in VLS: https://gitlab.com/lightning-signer/vls-hsmd/-/merge_requests/132#note_1654982750
This PR looks good from VLS
Adds tests for when the connection fails during 1) splice tx_signature 2) splice commitment_signed Fleshed out the reestablish flow for these two cases and implemented the fixes to make these reestablish flows work. Part of this work required changing commit process for splices: Now we send a single commit_part for the splice where previously we sent all commits, and accordingly, we no longer revoke in response. Changelog-Fixed: Implemented splicing restart logic for tx_signature and commitment_signed. Splice commitments are reworked in a manner incompatible with the last version.
c1ffe15 to
b54af9f
Compare
|
Squashed fixup commit |
|
Ack b54af9f |
There was a problem hiding this comment.
Bunch of nits, but nothing major. Nice job getting this turned around so quickly.
More tests would be nice??
Also echo what @rustyrussell said about more commits please. I need more commit inflation!!
ACK b54af9f
| AMOUNT_MSAT(10), | ||
| AMOUNT_SAT(1111), | ||
| 0, | ||
| false, |
There was a problem hiding this comment.
nit: probably worth testing this field works in the db tests (eg set to true once)
|
|
||
| db_bind_s64(stmt, inflight->funding->splice_amnt); | ||
| db_bind_int(stmt, inflight->i_am_initiator); | ||
| db_bind_int(stmt, inflight->force_sign_first); |
There was a problem hiding this comment.
nit: we usually do a "type conversion" btw int + bool
db_bind_int(stmt, inflight->remote_tx_sigs ? 1 : 0);
| struct bitcoin_signature last_sig; | ||
| bool i_am_initiator; | ||
| bool force_sign_first; | ||
| bool is_locked; |
|
|
||
| inflight = last_inflight(peer); | ||
|
|
||
| if (inflight && (!inflight->last_tx || !inflight->remote_tx_sigs)) { |
There was a problem hiding this comment.
nit: add comment from spec here?
| bool skip_commitments, | ||
| enum tx_role our_role) | ||
| bool send_commitments, | ||
| bool recv_commitments, |
There was a problem hiding this comment.
nit: maybe call this 'expect_rcv_commitments'?
| del l1.daemon.opts['dev-disconnect'] | ||
|
|
||
| # Should reconnect, and reestablish the splice. | ||
| l1.start() |
There was a problem hiding this comment.
you could also just call connect again?
| inv = l2.rpc.invoice(10**2, '3', 'no_3') | ||
| l1.rpc.pay(inv['bolt11']) | ||
|
|
||
| # Check that the splice doesn't generate a unilateral close transaction |
There was a problem hiding this comment.
is there another signal we can check for?
| funds_result = l1.rpc.fundpsbt("109000sat", "slow", 166, excess_as_change=True) | ||
|
|
||
| result = l1.rpc.splice_init(chan_id, 100000, funds_result['psbt']) | ||
| print("l1 splice_update") |
There was a problem hiding this comment.
can we get rid of the prints? 😁
|
|
||
| l2.daemon.wait_for_log(r'dev_disconnect: \+WIRE_COMMITMENT_SIGNED') | ||
|
|
||
| print("Killing l2 without sending WIRE_COMMITMENT_SIGNED") |
There was a problem hiding this comment.
nit: assert we're in the state expected before killing + reconnecting
|
|
||
| # Check that the splice doesn't generate a unilateral close transaction | ||
| time.sleep(5) | ||
| assert l1.db_query("SELECT count(*) as c FROM channeltxs;")[0]['c'] == 0 |
There was a problem hiding this comment.
one other test i would add tests the tx-abort pathway?
Adds tests for when the connection fails during
Fleshed out the reestablish flow for these two cases and implemented the fixes to make these reestablish flows work.
Fixes #6703
Changelog-Fixed: Implemented splicing restart logic for tx_signature and commitment_signed.