Skip to content

Add tests for invalid UTF-8 encoding in the text format#474

Merged
binji merged 2 commits intomasterfrom
utf8-invalid-text
Oct 13, 2017
Merged

Add tests for invalid UTF-8 encoding in the text format#474
binji merged 2 commits intomasterfrom
utf8-invalid-text

Conversation

@sunfishcode
Copy link
Member

This adds a ".fail.wast" test file for each of the 179 invalid UTF-8 byte sequences created for the other UTF-8 tests. This provides some test coverage for validation on the text parsing side (because it matters for some consumers), however it's an awkwardly large number of files, and I don't know if it's worth the hassle. In any case, here's the PR in case anyone's interested.

@rossberg
Copy link
Member

I'm torn. The more tests the better, but it is a large number of files. OTOH, once #471 lands, we probably want similar tests checking the UTF-8 encoding of the input itself (i.e., invalid UTF-8 occurring directly, not escaped, in either strings or comments). Perhaps a solution is to introduce a subdirectory for everything Unicode related?

@AndrewScheidecker
Copy link
Contributor

Couldn't the need for the .fail.wast files be eliminated by adding an (assert_unparseable "...") that embeds the unparseable modules in strings?

@rossberg
Copy link
Member

@AndrewScheidecker, ha, interesting idea. Let me think about that.

@rossberg
Copy link
Member

rossberg commented Jun 6, 2017

PR #475 landed, so these tests can now be written in just one file.

@rossberg
Copy link
Member

@sunfishcode, any chance you can adapt this PR?

@sunfishcode
Copy link
Member Author

I still do intend to do this, though it's not a high priority right now.

@binji binji force-pushed the utf8-invalid-text branch from 9998f29 to 92c1c91 Compare October 12, 2017 19:06
@binji
Copy link
Member

binji commented Oct 12, 2017

Pushed a change to refactor these. Hope you don't mind! :-)

Copy link
Member

@rossberg rossberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks. LGTM (without looking at all the tests individually).

@rossberg
Copy link
Member

When you merge, can you make sure to update the title/description?

@sunfishcode
Copy link
Member Author

@binji That looks like all that needs to be done here then. Thanks!

@binji binji changed the title Add ".fail.wast" versions of the malformed UTF-8 tests. Add tests for invalid UTF-8 encoding in the text format Oct 13, 2017
@binji binji merged commit b3c6413 into master Oct 13, 2017
@binji binji deleted the utf8-invalid-text branch October 13, 2017 17:54
dhil pushed a commit to dhil/webassembly-spec that referenced this pull request Nov 13, 2023
remove duplicated token definition in parser
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants