Skip to content

TLV parsing#2501

Merged
rustyrussell merged 21 commits into
ElementsProject:masterfrom
niftynei:dual-funded-tlvs
Apr 3, 2019
Merged

TLV parsing#2501
rustyrussell merged 21 commits into
ElementsProject:masterfrom
niftynei:dual-funded-tlvs

Conversation

@niftynei
Copy link
Copy Markdown
Collaborator

@niftynei niftynei commented Mar 26, 2019

Have the wire generators be TLV savvy. Adds parsers for nested structs etc.

Addition of a new TLV type requires adding a new file gen_{tlv_name}_csv to the wires. The parsers will identify TLV fields from the _csv file, and then find the appropriate additional file to parse.

Also generates the structs necessary to figure out whether or not these fields have been included. In general, you can check for a TLV type's inclusion via whether or not the returned tlv struct's message field is null.

A few notes:

  • the size of any TLV individual message is capped at u8 a varint.
  • we fail if an even numbered TLV type message is not understood by the parser

@niftynei niftynei changed the title TLV parsing WIP: TLV parsing Mar 28, 2019
@niftynei niftynei force-pushed the dual-funded-tlvs branch 5 times, most recently from c1a37c0 to 174e214 Compare March 29, 2019 00:20
@niftynei niftynei changed the title WIP: TLV parsing TLV parsing Mar 29, 2019
@niftynei niftynei added the v1.1 label Mar 29, 2019
Copy link
Copy Markdown
Contributor

@rustyrussell rustyrussell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor issues only. I'm happy if you want to keep the structs for now, since we can always change the API to open-code the fields if it proves a problem, and I don't want to create any roadblocks.

Comment thread wire/gen_peer_wire_csv Outdated
funding_locked,36
funding_locked,0,channel_id,32
funding_locked,32,next_per_commitment_point,33
open_channel2,56
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is OK for now, but I think we're going to have to move to a system where we have experimental CSV files (which get compiled in if EXPERIMENTAL defined).

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and forth on whether or not to include these. Really they're only here because I needed something to test with 🌵

Comment thread tools/generate-wire.py Outdated
self.is_tlv = False

# field name appended with '+' means this field contains a tlv
if name.endswith('+'):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't going to work well in the spec; let's use name.endswith('tlv')?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated!!

Comment thread tools/generate-wire.py
""" prints fromwire function definition for a TLV message.
these are significantly different in that they take in a struct
to populate, instead of fields, as well as a length to read in
"""
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an interesting choice, and I'm not sure it's the correct one (we changed from structs to individual parameters early on in our wire parsing).

We could parse TLVs just like we do a formatted field, except everything has to be a ptr-to-ptr so we can set it to NULL if it doesn't appear.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, i think either of these is a valid way to do this. i had the impression that most TLV messages would be rather involved field wise, but it's also possible they're mostly one-off fields...

Comment thread tools/generate-wire.py
tlv__type_impl_towire_fields = """\tif ({tlv_name}->{name}) {{
\t\ttlv_msg = tal_arr(ctx, u8, 0);
\t\ttowire_u16(p, {enum});
\t\ttowire_u8(p, {enum});
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except I was wrong and it (and type) should be varint :)

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated!!

Comment thread tools/generate-wire.py
\t\t}}
\t}}
\tif (!*p) {{
\tif (!*p || start_len - *plen != *len) {{
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not entirely clear to me that this is correct. For onion, at least, we should on first type=0 and ignore any remaining. Seems consistent to do that elsewhere, too.

Copy link
Copy Markdown
Collaborator Author

@niftynei niftynei Apr 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm you're right. there's also the 'ordering' aspect of the proposed TLV spec that this implementation is missing at the moment. will update.

niftynei added 19 commits April 2, 2019 19:28
Version 1.1 of the lightning-rfc spec introduces TLVs for optional
data fields. This starts the process of updating our auto-gen'd
wireformat parsers to be able to understand TLV fields.

The general way to declare a new TLV field is to add a '+' to the
end of the fieldname. All field type declarations for that TLV set
should be added to a file in the same directory by the name
`gen_<field_name>_csv`.

Note that the FIXME included in this commit is difficult to fix, as
we currently pass in the csv files via stdin (so there's no easy
way to ascertain the originating directory of file)
'.c' wire format files include case statements to print the names
of enums. Include such methods for the enums pertaining to
tlv's as well.
Add tlv-messages to the general messages set so that their parsing
messages get printed out.

FIXME: figure out how to account for partial message length processing?
Since messages in TLV's are optional, the ideal way to deal with
them is to have a 'master struct' object for every defined tlv, where
the presence or lack of a field can be determined via the presence
(or lack thereof) of a struct for each of the optional message
types.

In order to do this, appropriately, we need a struct for every
TLV message. The next commit will make use of these.

Note that right now TLV message structs aren't namespaced to the
TLV they belong to, so there's the potential for collision. This
should be fixed when/where it occurs (should fail to compile).
construct structs for the TLV's. these will be the 'return type'
for tlv fields in parent messages (so to speak)
clean up basetype parsing code a bit
much better than statically calculating the sizeof
let's let the fromwire__tlv methods allocate the tlv-objects and
return them. we also want to initialize all of their underlying
messages to NULL, and fail if we discover a duplicate mesage type.

if parsing fails, instead of returning a struct we return NULL.

Suggested-By: @rustyrussell
fail if a message type is even and it's not included. otherwise,
continue with the next message type.
passing back a null TLV was crashing here, because we tried to
dereference a null pointer. instead, we put it into a temporary
struct that we can check for NULL-ness, before assigning to the
passed in pointer.
need to pass in a pointer to the array so that when we advance
the array in the subcalls, it advances in the parent. oops
otherwise they'll get cleaned up when the message is free'd.
it's nbd either way, but this seems tighter.
include includes for TLV _csv files
Updated to match what the CSV generator in the RFC repo actually
outputs, see lightning/bolts#597
niftynei added 2 commits April 2, 2019 19:30
so we can put and pull bitcoin 'var_int' length types from the
wire.

for more info on variable integers, see http://learnmeabitcoin.com/glossary/varint
TLV's use var_int's for messages sizes, both internally and
in the top level (you should really stack a var_int inside a var_int!!)

this updates our automagick generator code to understand 'var_ints'
@niftynei
Copy link
Copy Markdown
Collaborator Author

niftynei commented Apr 3, 2019

@rustyrussell fixed up to exclude any and all references to 'dual funding' :)

if you're happy with this, i think we're good to merge for now. here's a short list of things that need to be improved in a follow up PR:

  • 0x0 type stops parsing
  • ordering of messages as per spec
  • onion types need to be zero padded (?)
  • the print_towire/fromwire's for the devtool/decodemsg utility need to be amended

@rustyrussell
Copy link
Copy Markdown
Contributor

@rustyrussell fixed up to exclude any and all references to 'dual funding' :)

if you're happy with this, i think we're good to merge for now. here's a short list of things that need to be improved in a follow up PR:

* 0x0 type stops parsing

I think at the moment it will break parsing, being considered even :)

* ordering of messages as per spec

I think this Just Works if the spec lists them in ascending order But probably good to check!

* onion types need to be zero padded (?)

Naah, I think what will happen is we'll memcpy it in, so remainder will be zero.

* the print_towire/fromwire's for the devtool/decodemsg utility need to be amended

Def. nice to have.

Copy link
Copy Markdown
Contributor

@rustyrussell rustyrussell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack b58df84

Comment thread wire/fromwire.c
if (*max < 1) {
fromwire_fail(cursor, max);
return 0;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this test is redundant!

@rustyrussell rustyrussell merged commit bad0ac6 into ElementsProject:master Apr 3, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants