Use length-delimited, byte-aligned codecs #1442
```diff
   ("remoteNextCommitInfo" | either(boolean, waitingForRevocationCodec, publicKey)) ::
   ("commitInput" | inputInfoCodec) ::
-  ("remotePerCommitmentSecrets" | ShaChain.shaChainCodec) ::
+  ("remotePerCommitmentSecrets" | byteAligned(ShaChain.shaChainCodec)) ::
```
The shaChainCodec isn't byte aligned, and there may be room for optimization there. cc @sstone
I think we should make it byte-aligned, but the question is whether to do it "outside" (what you just did), or inside the shachain codec (use bool8 instead of bool)?
Apparently we can be more optimal, the codec should naturally be byte-aligned. Since this is sensitive code I'd rather leave that optimisation to another PR.
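The alignment issue under discussion can be illustrated with a small self-contained sketch (the `BitWriter` below is hypothetical, standing in for scodec's bit-level encoder, and is not eclair code): a single-bit boolean, like scodec's `bool`, shifts every following field off byte boundaries, while a boolean padded to a full byte, like `bool(8)`, keeps the stream aligned.

```scala
// Hypothetical minimal bit-writer, only to illustrate byte alignment.
final class BitWriter {
  private var bits = Vector.empty[Boolean]
  def writeBit(b: Boolean): Unit = bits = bits :+ b
  def writeByte(x: Int): Unit = (7 to 0 by -1).foreach(i => writeBit(((x >> i) & 1) == 1))
  def size: Int = bits.size
  def isByteAligned: Boolean = bits.size % 8 == 0
}

// A 1-bit boolean followed by a byte: 9 bits total, not byte-aligned.
val misaligned = new BitWriter
misaligned.writeBit(true) // like scodec's single-bit `bool`
misaligned.writeByte(0xab)
println(s"1-bit bool then a byte: ${misaligned.size} bits, aligned = ${misaligned.isByteAligned}")

// Padding the boolean to a full byte (like `bool(8)`) keeps everything aligned.
val aligned = new BitWriter
aligned.writeByte(1)
aligned.writeByte(0xab)
println(s"8-bit bool then a byte: ${aligned.size} bits, aligned = ${aligned.isByteAligned}")
```

This is also why wrapping a misaligned codec in `byteAligned(...)` from the outside works: it pads the encoded bits up to the next byte boundary.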
```diff
-  .typecase(0x03, localCodec) // backward compatible
   .typecase(0x01, provide(Origin.Local(UNKNOWN_UUID, None)))
   .typecase(0x02, relayedCodec)
   .typecase(0x03, localCodec)
```
I removed the compatibility codec since the new codec will never use it.
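A toy sketch of discriminator-based decoding in the spirit of scodec's `discriminated[...].by(uint8).typecase(...)` may help here; all names and the byte layout below are illustrative, not eclair's actual types. The point mirrored from the diff above is that tag `0x01` carries no payload, so the decoder just provides a constant (like scodec's `provide`) for legacy payments that had no identifier.

```scala
// Illustrative stand-ins for the Origin hierarchy.
sealed trait Origin
case class LocalOrigin(id: java.util.UUID) extends Origin
case class RelayedOrigin(amountIn: Long, amountOut: Long) extends Origin

val UNKNOWN_UUID = java.util.UUID.fromString("00000000-0000-0000-0000-000000000000")

// Helpers for a toy big-endian encoding of Longs.
def readLong(b: Array[Byte], off: Int): Long =
  (0 until 8).foldLeft(0L)((acc, i) => (acc << 8) | (b(off + i) & 0xffL))

def longBytes(x: Long): Array[Byte] =
  (56 to 0 by -8).map(s => ((x >> s) & 0xff).toByte).toArray

// Tag-based dispatch: 0x01 has no payload and yields a constant value.
def decodeOrigin(tag: Int, payload: Array[Byte]): Origin = tag match {
  case 0x01 => LocalOrigin(UNKNOWN_UUID) // legacy: no identifier stored
  case 0x02 => RelayedOrigin(readLong(payload, 0), readLong(payload, 8))
  case 0x03 => LocalOrigin(new java.util.UUID(readLong(payload, 0), readLong(payload, 8)))
}

println(decodeOrigin(0x01, Array.emptyByteArray))
println(decodeOrigin(0x02, longBytes(1000L) ++ longBytes(900L)))
```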
My plan for tests would be to generate a bunch of channel data before this patch (maybe with scalacheck generators @t-bast?) and their java object checksum. I would then apply this change, read-then-write-then-read-again the data, and make sure the three checksums are equal. I think that would bring solid assurance that we don't have a regression.

So I validated the change by following the procedure described in my previous comment. I'm using an sqlite table with two columns: scodec-serialized and java-serialized, generated on a pre-migration branch. Then on the post-migration branch, I run the tests below. I ended up using three data sets. This is the reference data generation test (run on the pre-migration branch):
```scala
test("nonreg length-delimited reference data generation") {
  val sqlite = DriverManager.getConnection(s"jdbc:sqlite:${new File("endurance_before.sqlite")}")
  sqlite.createStatement().executeUpdate("CREATE TABLE data (scodec BLOB NOT NULL, java BLOB NOT NULL)")
  val read_sqlite = DriverManager.getConnection("jdbc:sqlite::resource:eclair.sqlite.endurance.bak")
  val rs = read_sqlite.createStatement().executeQuery("SELECT data FROM local_channels")
  var count = 0
  while (rs.next()) {
    val scodec = BitVector(rs.getBytes("data"))
    val d = ChannelCodecs.stateDataCodec.decode(scodec).require.value
    val bos = new ByteArrayOutputStream()
    val oos = new ObjectOutputStream(bos)
    oos.writeObject(d)
    oos.close()
    val java = bos.toByteArray
    val statement = sqlite.prepareStatement("INSERT INTO data VALUES (?, ?)")
    statement.setBytes(1, scodec.toByteArray)
    statement.setBytes(2, java)
    statement.execute()
    count = count + 1
  }
  println(s"processed $count channels")
}
```

This is the actual test:

```scala
test("nonreg length-delimited migration") {
  val sqlite = DriverManager.getConnection(s"jdbc:sqlite:${new File("endurance_before.sqlite")}")
  val rs = sqlite.createStatement().executeQuery("SELECT * FROM data")
  var count = 0
  while (rs.next()) {
    val scodec = ByteVector(rs.getBytes(1)).bits
    val java = rs.getBytes(2)
    // normalize waitingSince on both sides before comparing
    val d_scodec = ChannelCodecs.stateDataCodec.decode(scodec).require.value match {
      case d: DATA_WAIT_FOR_FUNDING_CONFIRMED => d.copy(waitingSince = 42000)
      case d: DATA_CLOSING => d.copy(waitingSince = 123456789)
      case d => d
    }
    val bis = new ByteArrayInputStream(java)
    val ois = new ObjectInputStream(bis)
    val d_java = ois.readObject().asInstanceOf[HasCommitments] match {
      case d: DATA_WAIT_FOR_FUNDING_CONFIRMED => d.copy(waitingSince = 42000)
      case d: DATA_CLOSING => d.copy(waitingSince = 123456789)
      case d => d
    }
    assert(d_scodec === d_java)
    val scodec2 = ChannelCodecs.stateDataCodec.encode(d_scodec).require
    // the new encoding intentionally differs from the old one, but must be byte-aligned
    assert(scodec != scodec2)
    assert(scodec2.size % 8 == 0)
    val d_scodec2 = ChannelCodecs.stateDataCodec.decode(scodec2).require.value
    assert(d_scodec2 == d_java)
    count = count + 1
  }
  println(s"processed $count channels")
}
```

It makes little sense to commit the migration tests since they run on two different versions of the code and only apply at the time of the migration, so I prefer documenting them here.
```scala
// We used to ignore unknown trailing fields, and assume that channel_update size was known. This is not true anymore,
// so we need to tell the codec where to stop, otherwise all the remaining part of the data will be decoded as unknown
// fields. Fortunately, we can easily tell what size the channel_update will be.
val noUnknownFieldsChannelUpdateSizeCodec: Codec[Int] = peek( // we need to take a peek at a specific byte to know what size the message will be, and then rollback to read the full message
```

Got rid of this compatibility code.
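A minimal self-contained sketch of the length-delimited framing this PR moves to (not eclair's actual codec code, which uses scodec combinators such as `variableSizeBytes`): a length prefix tells the decoder exactly where each message ends, so unknown trailing fields can no longer swallow the rest of the stream, and size-peeking hacks like the one above become unnecessary.

```scala
// Illustrative 2-byte big-endian length prefix framing.
def frame(payload: Array[Byte]): Array[Byte] =
  Array(((payload.length >> 8) & 0xff).toByte, (payload.length & 0xff).toByte) ++ payload

// Returns (message, remaining stream): decoding stops exactly at the delimiter.
def unframe(bytes: Array[Byte]): (Array[Byte], Array[Byte]) = {
  val len = ((bytes(0) & 0xff) << 8) | (bytes(1) & 0xff)
  (bytes.slice(2, 2 + len), bytes.drop(2 + len))
}

// A message that carries unknown trailing fields, followed by the next record.
val update = Array[Byte](1, 2, 3) ++ Array[Byte](9, 9) // known fields + unknown trailing fields
val stream = frame(update) ++ Array[Byte](42)          // next record follows immediately
val (msg, rest) = unframe(stream)
println(s"message = ${msg.toSeq}, rest = ${rest.toSeq}")
```

The unknown trailing fields stay inside the framed message, and the byte after the frame (`42`) is untouched for the next decoder.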
```scala
  ("amountOut" | millisatoshi)).as[Origin.Relayed]

// this is for backward compatibility to handle legacy payments that didn't have identifiers
val UNKNOWN_UUID = UUID.fromString("00000000-0000-0000-0000-000000000000")
```
Got rid of this compatibility code.
At least it doesn't hurt to keep them. The way I see it, the goal is to completely remove the code of these legacy codecs (for example, X releases after the one where we release the new codecs). We can remove these tests at that point. If that's the goal, maybe we can put a comment in the code to indicate that strategy.
I think the testing strategy is sound, and at a high level the code looks good. Could you squash/rebase and then we can review thoroughly?

There is a price though: we have several accessible codecs for the same object types, which is a bit error-prone. But we don't use those codecs that often, and the risk of using a legacy one by mistake is quite low.
(force-pushed 6a31dcd to 06207d1)
Done. I need to re-run the tests. It just occurred to me that #1141 had touched the codecs, so a bit more testing is probably not a bad thing.
t-bast left a comment:
It's surprisingly less painful to review than I had anticipated :)
I hope I didn't miss anything, the migration tests look solid.
6e3414b was less straightforward than I expected. Being in an inner object of the normal codecs, the legacy codecs have transparent access to all codecs, which leads to hidden dependencies. I realize now that copy-pasting the old file would have been the safest approach. I started like that, but reverted because it prevented me from sharing some codecs... which I now see wasn't a good idea. WDYT? I've regenerated the random data to support static remote key and successfully re-ran the migration tests.
Tried a different isolation with d052666, and re-ran the migration tests. I think that design is better.
t-bast left a comment:
LGTM once you've re-done the E2E tests!
(force-pushed d052666 to b90a9e0)
Isolated the legacy codecs in a separate file, and restricted their visibility to "package" in order to reduce the risk of using those codecs by mistake. Also restricted the codecs to `decodeOnly` for the same reason.
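The decode-only restriction can be sketched as a simple wrapper (the `SimpleCodec` trait and names below are illustrative, not scodec's actual API): decoding legacy data keeps working, while any attempt to encode with a legacy codec fails loudly.

```scala
// Illustrative codec interface: encode and decode may each fail.
trait SimpleCodec[A] {
  def encode(a: A): Either[String, Array[Byte]]
  def decode(bytes: Array[Byte]): Either[String, A]
}

// Wrapper that keeps decoding but turns encoding into a hard error.
def decodeOnly[A](underlying: SimpleCodec[A]): SimpleCodec[A] = new SimpleCodec[A] {
  def encode(a: A) = Left("legacy codec is decode-only")
  def decode(bytes: Array[Byte]) = underlying.decode(bytes)
}

// Toy codec for a single Int, big-endian, 4 bytes.
val intCodec: SimpleCodec[Int] = new SimpleCodec[Int] {
  def encode(a: Int) = Right(Array(24, 16, 8, 0).map(s => ((a >> s) & 0xff).toByte))
  def decode(bytes: Array[Byte]) = Right(bytes.take(4).foldLeft(0)((acc, b) => (acc << 8) | (b & 0xff)))
}

val legacyIntCodec = decodeOnly(intCodec)
println(legacyIntCodec.decode(intCodec.encode(7).toOption.get)) // Right(7)
println(legacyIntCodec.encode(7))                               // Left(legacy codec is decode-only)
```

Combined with package-private visibility, this makes accidentally persisting data in the legacy format a compile-time or immediate runtime failure rather than a silent regression.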
(force-pushed b90a9e0 to 7a67349)

All LN protocol messages must be stored length-delimited, because they may have arbitrary trailing data.

Took the opportunity to restore byte alignment, which was mainly broken by the use of the single-bit `bool` codec.

The goal is to play it safe and have simple codecs, so the existing codecs have been kept and moved to a `Legacy` inner object.

This is an alternative to #1239.