Update testsuite; various lexing/parsing fixes #482
sbc100
left a comment
lgtm. The changes to the .y are mostly opaque to me.
  src/prebuilt/wast-lexer-gen.cc: src/wast-lexer.cc
-     re2c --no-generation-date -bc -o $@ $<
+     re2c --no-generation-date -bc8 -o $@ $<
Why are these flags set in the CMakeLists.txt as well as the Makefile here?
The Makefile generates the prebuilt lexer/parser. The CMakeLists.txt one is used if re2c/bison are on your machine and new enough. It'd be nice to remove the prebuilt, but I don't know if I wanna require those tools for windows, since it's kind of a pain.
  (module
    (func (result i32)
-     block i32
+     block (result i32)
So is this new result syntax required? Or is the old format still ok too?
Yes, the new syntax is required.
The purpose is to allow for future language extensions where the block signature is more complex (potentially with params and results).
Lexer changes:
* Switch re2c parser to UTF-8 parser. This can almost be done "for
free" with a flag, but required a bit of work to allow us to catch
malformed UTF-8 as well.
* Change the re2c fill value to 0xff, since it's never a valid UTF-8 byte.
* Allow for more reserved tokens (basically any ASCII aside from
parentheses, double-quote, and semi-colon)
* Remove "infinity" from lexer; only "inf" is allowed now.
* Change definition of EOF token; it was implemented incorrectly. The
correct way to handle it is to only return it from FILL when there is no
more data to fill.
* \r is a valid escape.
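
As a rough illustration of why 0xff works as a fill value, the sketch below checks a byte buffer for well-formed UTF-8. It is not the re2c-generated lexer from this PR; `IsValidUtf8` is a hypothetical helper written only for this example. The point it shows is that 0xff is rejected both as a lead byte and as a continuation byte, so padding with it can never be mistaken for real input.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of wabt): returns true if
// [data, data + size) is well-formed UTF-8.
bool IsValidUtf8(const uint8_t* data, size_t size) {
  size_t i = 0;
  while (i < size) {
    uint8_t lead = data[i];
    size_t len;
    uint32_t cp;
    if (lead < 0x80) {
      len = 1; cp = lead;
    } else if (lead >= 0xc2 && lead <= 0xdf) {
      len = 2; cp = lead & 0x1f;
    } else if (lead >= 0xe0 && lead <= 0xef) {
      len = 3; cp = lead & 0x0f;
    } else if (lead >= 0xf0 && lead <= 0xf4) {
      len = 4; cp = lead & 0x07;
    } else {
      // 0x80..0xc1 and 0xf5..0xff (including 0xff) never start a sequence.
      return false;
    }
    if (size - i < len) return false;  // Truncated sequence.
    for (size_t k = 1; k < len; ++k) {
      uint8_t cont = data[i + k];
      // Continuation bytes must look like 10xxxxxx; 0xff does not.
      if ((cont & 0xc0) != 0x80) return false;
      cp = (cp << 6) | (cont & 0x3f);
    }
    if (len == 3 && cp < 0x800) return false;                        // Overlong.
    if (len == 4 && (cp < 0x10000 || cp > 0x10ffff)) return false;   // Overlong or out of range.
    if (cp >= 0xd800 && cp <= 0xdfff) return false;                  // Surrogate.
    i += len;
  }
  return true;
}
```

A generated lexer encodes the same rules in its state machine rather than in a loop like this, but the byte ranges it accepts should be the same.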
Parser changes:
* Changes to match the spec parser:
- block signatures use (result <type>) syntax
- func/global/table/memory can have multiple inline exports
- inline imports are handled in func definition instead of import
definition
- allow for inline modules (i.e. no "(module ...)" s-expr required)
* Remove FuncField. This was previously used for parsing
params/results/locals, but it's less code to just parse
right-recursive (i.e. backward) and insert everything at the front.
This requires reversing the indexes in the BindingHash too.
* Remove the nasty macros `APPEND_FIELD_TO_LIST`,
`APPEND_ITEM_TO_VECTOR`, `APPEND_INLINE_EXPORT`, and
`CHECK_IMPORT_ORDERING`. This behavior is all handled by
`append_module_fields` now.
* All inline imports/exports are handled by returning additional
ModuleFields in a list. This removes the need for `OptionalExport`,
`ExportedFunc`, `ExportedGlobal`, `ExportedTable`, and
`ExportedMemory`.
* Use "_opt" suffix instead of "non_empty_" prefix, e.g.:
- text_list => text_list_opt, non_empty_text_list => text_list
* The locations changed for some symbols; typically they use the name
following the LPAR now, e.g. (import
                              ^^^^^^
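
The "parse backward, insert at the front, then reverse the binding indexes" idea from the FuncField bullet can be sketched roughly as follows. This is a minimal illustration under assumed names: `ParamList`, `PrependParam`, and `ReverseBindings` are hypothetical and are not the types or functions used in this PR. A right-recursive grammar rule sees the last item first, so each item is pushed to the front, and any name recorded along the way holds an index counted from the back until it is flipped at the end.

```cpp
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a param/result/local list plus its name bindings.
struct ParamList {
  std::deque<std::string> types;                      // Source order once complete.
  std::unordered_map<std::string, size_t> bindings;   // Name -> index.
};

// Called once per item, last item first (as a right-recursive rule would).
void PrependParam(ParamList* list, const std::string& type,
                  const std::string& name) {
  if (!name.empty()) {
    // At this point the index is counted from the back of the final list.
    list->bindings[name] = list->types.size();
  }
  list->types.push_front(type);
}

// Once the whole list is built, flip every index so it counts from the front.
void ReverseBindings(ParamList* list) {
  const size_t count = list->types.size();
  for (auto& entry : list->bindings) {
    entry.second = count - 1 - entry.second;
  }
}
```

For `(param $a i32) (param i64) (param $c f32)`, the calls arrive in the order `$c`, unnamed, `$a`; after `ReverseBindings`, `$a` maps to 0 and `$c` maps to 2.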
Yeah, I tried to keep this minimal, but the changes to the spec parser kind of required a rewrite. The code is much cleaner this way, and the test coverage is pretty good so ¯_(ツ)_/¯
wast2wasm was recently updated to only support the former: WebAssembly/wabt#482