Update decoding of element/data segments in spec interpreter#1440
Update decoding of element/data segments in spec interpreter#1440alexcrichton wants to merge 1 commit intoWebAssembly:mainfrom
Conversation
This commit updates the spec interpreter after the merge of the bulk memory proposal to align with the textual specification for the encoding of data and element segments. In original MVP wasm a data/element segmented started with a leb128 for the memory/table index, but this leb128 was repurposed as a flags byte in the bulk-memory/reference-types proposals since it was always zero in practice (and never an over-long zero such as `"\80\00"`). The spec interpreter, however, hasn't been updated and was still reading a u32 for the flags byte, so this commit updates it to instead read a single byte. The tests have been updated accordinatly. Tests for overlong or invalid index encodings were updated to use an encoding that explicitly specifies the index (the prefix `"\02"` byte for both data and element segments). New tests were added to ensure that an overlong encoding of 0, which was previously valid, is no longer valid.
|
I assume this is to address #1439? PR looks good, but perhaps let's wait for that discussion to conclude. |
|
While #1439 is somewhat related this is intended to fix the interpreter to match the current text spec. The text spec does not mention a leb for the leading byte discriminant for element/data segments but the spec interpreter uses a leb. This means that parsers which strictly implement the textual spec cannot pass the current spec tests and this PR is trying to fix that. I recall that the decision at the time when this came up was to intentionally break any modules using overlong encodings for index 0. I found WebAssembly/bulk-memory-operations#153 as some discussion of this but I don't recall where other discussion happened. I thought, though, that #1439 was "intended breakage" and action wasn't necessary. |
|
With #1461 landed, this PR should no longer be needed. Closing. |
This commit updates the spec interpreter after the merge of the bulk
memory proposal to align with the textual specification for the encoding
of data and element segments. In original MVP wasm a data/element
segmented started with a leb128 for the memory/table index, but this
leb128 was repurposed as a flags byte in the bulk-memory/reference-types
proposals since it was always zero in practice (and never an over-long
zero such as
"\80\00"). The spec interpreter, however, hasn't beenupdated and was still reading a u32 for the flags byte, so this commit
updates it to instead read a single byte.
The tests have been updated accordinatly. Tests for overlong or invalid
index encodings were updated to use an encoding that explicitly
specifies the index (the prefix
"\02"byte for both data and elementsegments). New tests were added to ensure that an overlong encoding of
0, which was previously valid, is no longer valid.