Test that invalid UTF-8 byte sequences are rejected. #450
sunfishcode wants to merge 2 commits into master
|
Looks good to me as far as the test cases go. However, on reflection, it is most natural to treat these as decoding errors and to have the AST itself represent names as vectors of code points. As you suspect, that implies that in the text format such errors will have to manifest as immediate syntax errors. So, to make the textual tests (the import/export ones) work, you'll either need to turn them into individual .fail. files or (preferably) express them in binary format like the custom section test, whose decoding happens separately from parsing. Either way, until the spec interpreter implements the UTF-8 restriction, these tests will break CI on Travis (and potentially break downstream users who run the test suite on waterfalls). For that reason alone I would suggest holding off on landing until the spec has caught up. I'll try to get to it later this week. |
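To illustrate what "decoding errors" and "names as vectors of code points" could mean in practice, here is a minimal Python sketch. It is not the spec interpreter's code; the function name `decode_name` and the error messages are hypothetical. It turns a name's raw bytes into a list of code points and raises on malformed input, so no invalid name could ever reach the AST.

```python
from typing import List


def decode_name(data: bytes) -> List[int]:
    """Decode a name's bytes into code points; raise ValueError on malformed UTF-8.

    Hypothetical helper for illustration only.
    """
    code_points = []
    i = 0
    while i < len(data):
        b0 = data[i]
        # Pick sequence length, initial payload bits, and the smallest code
        # point that legitimately needs this length (to catch overlong forms).
        if b0 < 0x80:
            length, cp, min_cp = 1, b0, 0x0
        elif 0xC2 <= b0 <= 0xDF:
            length, cp, min_cp = 2, b0 & 0x1F, 0x80
        elif 0xE0 <= b0 <= 0xEF:
            length, cp, min_cp = 3, b0 & 0x0F, 0x800
        elif 0xF0 <= b0 <= 0xF4:
            length, cp, min_cp = 4, b0 & 0x07, 0x10000
        else:
            # 0x80..0xC1 and 0xF5..0xFF can never start a sequence.
            raise ValueError(f"invalid lead byte 0x{b0:02x} at offset {i}")
        if i + length > len(data):
            raise ValueError(f"truncated sequence at offset {i}")
        for b in data[i + 1 : i + length]:
            if b & 0xC0 != 0x80:
                raise ValueError(f"invalid continuation byte 0x{b:02x}")
            cp = (cp << 6) | (b & 0x3F)
        if cp < min_cp:
            raise ValueError(f"overlong encoding at offset {i}")
        if 0xD800 <= cp <= 0xDFFF:
            raise ValueError(f"surrogate code point at offset {i}")
        if cp > 0x10FFFF:
            raise ValueError(f"code point out of range at offset {i}")
        code_points.append(cp)
        i += length
    return code_points
```

With a decoder of this shape, a binary-format test only has to confirm that decoding the module fails, without involving the text parser.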
|
See #454 for the interpreter implementation. After having implemented it I noticed that a few cases aren't tested in this PR:
It would be great if you could include a few tests for those as well. |
|
I may have missed it, but it looks like there are no tests ensuring that UTF-8 strings with a BOM fail to parse. It seems like that would be a good test too. As a nit, it might also be useful to note briefly on each test why the UTF-8 should fail to validate. |
|
This is superseded by #468, which I believe addresses all the feedback here. |
WebAssembly/design#1016 in the design repository requires implementations to validate that import/export names are UTF-8. This PR contains test coverage for this feature only.
The tests may need to be modified depending on whether UTF-8 validation is implemented as a syntactic constraint, or on other details; however, they should offer a good starting point.
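For readers unfamiliar with the kinds of byte sequences involved, the Python sketch below lists a few well-known categories of malformed UTF-8, each paired with a short reason. The byte strings are illustrative examples and are not taken from the tests in this PR. The names `INVALID_NAMES` and `is_valid_utf8` are hypothetical, and Python's built-in strict `utf-8` codec stands in for whatever validator an implementation actually uses.

```python
# Illustrative malformed sequences, keyed by why each one is invalid.
# These are assumptions for demonstration, not the PR's actual test cases.
INVALID_NAMES = {
    "stray continuation byte": b"\x80",
    "truncated 2-byte sequence": b"\xc2",
    "overlong encoding of '/'": b"\xc0\xaf",
    "UTF-16 surrogate U+D800": b"\xed\xa0\x80",
    "code point above U+10FFFF": b"\xf4\x90\x80\x80",
    "byte that never appears in UTF-8": b"\xff",
}


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data is well-formed UTF-8 per Python's strict decoder."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


for reason, raw in INVALID_NAMES.items():
    status = "accepted" if is_valid_utf8(raw) else "rejected"
    print(f"{raw.hex(' ')}: {reason}: {status}")
```

Keeping the reason next to each byte string also makes it easier to see at a glance which category a failing test belongs to.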