Retransmission with HTLC meta-issue #172

@rustyrussell

Background

When nodes reconnect or restart, they need some way to see what the other end received so they can re-sync state. In the normal (or shutting-down) state, this naturally follows the batching done by commit_signature/revoke_and_ack messages.

The c-lightning prototype used a scheme where init messages contained a single counter: the total number of commit_signature and revoke_and_ack messages the sender had received. On disconnect, a node would also forget any updates for which it had not received commit_signature.

In Milan, @adiabat argued for simply retransmitting and discarding duplicates, rather than using an explicit ack number. More recently, @pm47 asked to avoid the compulsory discard and to require an exact retransmission of previous messages; @rustyrussell instead asked for a strict superset. But further consideration has raised issues with these approaches.

The Uncertain Signature Problem

  • A sends update1, then commitment_signed, then disconnects.
  • B is in one of three possible states:
      • B1: received nothing. Still at previous commit.
      • B2: received update1. Has one update pending.
      • B3: received update1 and commitment_signed. Has sent revoke_and_ack.

Now, when A reconnects, it does an exact retransmission:

  • A sends: update1 commitment_signed
  • B1 is fine. B2 is fine if it ignores the duplicate update1. B3 either considers the commitment_signed to have changed nothing (currently illegal), or, if it lets that pass, finds the signature is bad (it expects A to be using the next_per_commitment_point it sent in revoke_and_ack).
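
The three outcomes can be replayed with a toy model. This is only a sketch under loud assumptions: `NodeB`, `replay`, and the integer `commit_index` (standing in for the per-commitment point) are hypothetical names invented for illustration, and no real signatures are involved.

```python
from dataclasses import dataclass, field


@dataclass
class NodeB:
    """Toy model of B. commit_index stands in for the per-commitment point."""
    commit_index: int = 0                        # commitment B currently holds
    pending: list = field(default_factory=list)  # received, not yet committed
    committed: set = field(default_factory=set)  # already in B's commitment

    def recv_update(self, update):
        if update in self.pending or update in self.committed:
            return "duplicate ignored"
        self.pending.append(update)
        return "ok"

    def recv_commitment_signed(self, signed_for_index, allow_empty=False):
        if not self.pending and not allow_empty:
            return "rejected: empty commit"
        # A must sign B's *next* commitment, i.e. commit_index + 1.
        if signed_for_index != self.commit_index + 1:
            return "rejected: bad signature"
        self.committed.update(self.pending)
        self.pending.clear()
        self.commit_index += 1
        return "revoke_and_ack"


def replay(b, allow_empty=False):
    """A's exact retransmission: update1, then a signature for commitment 1."""
    return [b.recv_update("update1"),
            b.recv_commitment_signed(1, allow_empty)]


states = {
    "B1": NodeB(),
    "B2": NodeB(pending=["update1"]),
    "B3": NodeB(commit_index=1, committed={"update1"}),
}
for name, b in states.items():
    print(name, replay(b))
```

In this model B1 and B2 both end by sending revoke_and_ack, while B3 rejects the commit as empty; calling `replay` on a fresh B3 with `allow_empty=True` moves the failure to the signature check instead.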

There is also the case where A adds another change (e.g. a fee change, or another update) to the retransmission.

Possible solutions:

  • Insist on an exact retransmission and allow an empty commit: as a special case, an "empty" commitment uses the previous per-commitment-point (and the recipient replies with the same revoke_and_ack as before).
  • Allow changed transmission (must be a superset!), and if the signature check fails, retry it against the commitment built with the previous per_commitment_point; if that succeeds, reply with the same revoke_and_ack as before.
  • Send the explicit counter of updates + revocations so we don't encounter this situation.
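
The explicit-counter option can be sketched as follows. This is a minimal illustration, not any implementation's actual code: `messages_to_retransmit` and the `sent_batches` layout (one list per commitment_signed or revoke_and_ack we sent, with the updates that preceded it) are assumptions, and it relies on the peer discarding uncommitted updates on disconnect, as the c-lightning prototype described in Background did.

```python
def messages_to_retransmit(sent_batches, peer_received_count):
    """Return the messages the peer is missing, oldest first.

    sent_batches: hypothetical log of batches we sent, each a list of
        message names ending in commitment_signed or revoke_and_ack.
    peer_received_count: the counter the peer reports on reconnect.
    """
    if not 0 <= peer_received_count <= len(sent_batches):
        raise ValueError("peer counter out of range")
    missing = sent_batches[peer_received_count:]
    return [msg for batch in missing for msg in batch]


a_sent = [["update1", "commitment_signed"]]

# B1 and B2 both report 0 (B2 has forgotten its uncommitted update1).
print(messages_to_retransmit(a_sent, 0))
# B3 reports 1, so A has nothing to resend and the ambiguity never arises.
print(messages_to_retransmit(a_sent, 1))
```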

The Persistence Problem

It's important that an optimal implementation only be required to remember state at the minimal number of points, as a robust implementation will need to synchronously write to disk(s). A node must remember when it receives revoke_and_ack (to create penalty transactions later), and when it sends commitment_signed (as it is committed to the HTLCs at that point, so it must remember them), so these are the minimal "sync" points possible.
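
A rough sketch of the two sync points, assuming an in-memory list as a stand-in for synchronous disk writes; the `Channel` class and its method names are hypothetical and exist only to count where a write would be forced.

```python
class Channel:
    """Toy channel that syncs only at the two points named above."""

    def __init__(self):
        self.disk = []        # stand-in for synchronously written storage
        self.unsynced = []    # updates sent but not yet committed (memory only)
        self.sync_count = 0

    def _sync(self, record):
        self.disk.append(record)
        self.sync_count += 1

    def send_update(self, update):
        self.unsynced.append(update)   # deliberately no disk write

    def send_commitment_signed(self):
        # We are now committed to these updates, so they must survive a crash.
        self._sync(("commitment_signed", tuple(self.unsynced)))
        self.unsynced = []

    def recv_revoke_and_ack(self, revocation_secret):
        # Needed later to build penalty transactions.
        self._sync(("revoke_and_ack", revocation_secret))

    def restart(self):
        self.unsynced = []    # anything held only in memory is lost


ch = Channel()
ch.send_update("add_htlc_1")
ch.send_update("update_fee")
ch.send_commitment_signed()
ch.recv_revoke_and_ack("secret_0")
ch.send_update("add_htlc_2")   # sent, but never committed
ch.restart()
print(ch.sync_count, ch.disk, ch.unsynced)
```

Here five protocol events cost only two syncs, and the price is that `add_htlc_2` is gone after the restart.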

Thus, requiring a node to persistently remember updates it has sent but not yet committed to is a poor idea. Much of this can be reconstructed: we have to remember incoming HTLCs and fulfills/fails which were going to the reconnecting peer anyway, so we can just re-send them. However, we would not normally remember fee changes we have not committed to: requiring this to be recorded on sending update_fee adds a disk sync. Nor would we normally remember the order in which we sent the updates, which is imperative for the update_add_htlc id fields to match.
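
To see why send order matters, here is a small sketch that assumes ids are handed out sequentially in the order update_add_htlc messages are sent. The helper names `assign_ids` and `mismatched` and the `hash_*` labels are made up for illustration.

```python
def assign_ids(payment_hashes, next_id):
    """Give the n-th update_add_htlc in send order the id next_id + n."""
    return {h: next_id + n for n, h in enumerate(payment_hashes)}


def mismatched(original_order, replay_order, next_id):
    """HTLCs whose id after a reordered replay differs from the original."""
    before = assign_ids(original_order, next_id)
    after = assign_ids(replay_order, next_id)
    return sorted(h for h in before if before[h] != after.get(h))


# The same three HTLCs, reconstructed after restart in a different order.
print(mismatched(["hash_a", "hash_b", "hash_c"],
                 ["hash_a", "hash_c", "hash_b"],
                 next_id=5))
```

Any HTLC that moved position gets a different id on replay, so a peer that kept the original updates would see ids that no longer match.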

ECLAIR seems to require remembering the state and not rolling back. c-lightning (the old, pre-Milan daemon) used reconstruction on reconnect/restart, but assumed the other side would roll back and used a total counter, and thus didn't have an issue if order or fees changed. lnd goes even further, and doesn't even remember the id across reconnections: HTLCs are implicitly renumbered from 0 at that point. I don't find this 8-byte on-disk saving convincing: once HTLCs are no longer in the commitment transaction the ID can indeed be forgotten, but so can the amount and routing information: only the cltv and RIPEMD of the payment hash need to be remembered for creating the penalty transaction.
