Roundtrip non-malformed test cases through wasmparser / wasmprinter#2148
Conversation
|
This is the last PR in my series. I think it is something of an extension of #2146 so no objections to landing that first. |
crates/wasmparser/src/lib.rs
Outdated
| // https://github.com/WebAssembly/reference-types | ||
| @reference_types { | ||
| TypedSelect { ty: $crate::ValType } => visit_typed_select (arity 3 -> 1) | ||
| TypedSelect { tys: Vec<$crate::ValType> } => visit_typed_select (arity select -> select) |
There was a problem hiding this comment.
To me this is another example which is sort of a step-too-far in following the spec exactly as-is. Primarily from a performance perspective this means that any time this instruction is parsed it ends up doing a heap allocation where previously it didn't, and the only purpose is to check the length of the vector and ensure that it's one.
This is I think another example of where the spec is written in a way that doesn't really consider the consequence whether something is malformed or invalid. To me this would ideally result in a change to the binary grammar of the spec to say that a non-1-length typed select is a binary syntax error rather than a validation error.
For now could the select.wast test be excluded from the round-trip of assert_invalid directives?
There was a problem hiding this comment.
Fair enough... what do you think of the just-pushed approach (where visit_typed_select_multi is another visitor function off to the side)?
I really would love to keep the invariant that every parseable module can be roundtripped (especially since we're so close, and my group is trying to write a Wasm text editor that enforces the syntax), but I agree it shouldn't affect performance.
There was a problem hiding this comment.
Clever solution! I'm down for keeping that. Mind adding a comment to TypedSelectMulti as to its purpose though? (e.g. it's always an invalid instruction but it's there to round-trip the binary format)
| #[cfg(feature = "component-model")] | ||
| component: ComponentState, | ||
| custom_section_place: Option<&'static str>, | ||
| custom_section_place: Option<(&'static str, usize)>, |
There was a problem hiding this comment.
This new usize variable is to me pretty subtle and is something I'd prefer to avoid. Locally I see that this test is what motivated this change. Would it make sense to exclude module binary from both binary roundtripping and text roundtripping? (for similar reasons effectively). That would result in there being no more need for this change?
There was a problem hiding this comment.
Yeah, the issue is that if a section is empty, you don't want to update the "after xxx" custom section place because nothing got printed (and so the module won't roundtrip if you do this twice).
Given that this gets us to 100% binary-to-binary roundtrippability of all modules that originated in text, and 100% text-to-text roundtrippability of all modules that originated in binary, I would love to find a way to keep that invariant -- it feels like a really nice property to have and ensure. E.g. it did successfully catch a bug in that the wasmprinter was printing after element instead of after elem.
Are you open to some alternative implementation strategy here? We basically just need to make sure that the custom section place doesn't get updated if the section was empty -- I don't have any strong feelings about how we do it.
There was a problem hiding this comment.
Ok that all makes sense, and I agree it's nice to catch the after element bug as well. Given all that let's leave this as-is, but can you add a comment here indicating what the two fields of the Option are and how the line number is being used to detect empty sections?
58969d7 to
0d74fd0
Compare
Handle multi-value typed select, even though this is currently invalid (but not malformed). Not handled: invalid components that use forward-referring name references (which the component parser will reject)
0d74fd0 to
5c973b2
Compare
| if tys.len() != 1 { | ||
| bail!(self.offset, "invalid result arity"); | ||
| } | ||
| self.visit_typed_select(tys.pop().unwrap()) |
There was a problem hiding this comment.
Could this debug_assert! that tys.len() != 1 and then always return with bail!?
src/bin/wasm-tools/dump.rs
Outdated
| match stringify!($visit).strip_prefix("visit_").unwrap() { | ||
| "typed_select_multi" => "typed_select", | ||
| other => other, | ||
| }, |
There was a problem hiding this comment.
Since this is just a debugging utility I think it's reasonable to print typed_select_multi in this case
| #[cfg(feature = "component-model")] | ||
| component: ComponentState, | ||
| custom_section_place: Option<&'static str>, | ||
| custom_section_place: Option<(&'static str, usize)>, |
There was a problem hiding this comment.
Ok that all makes sense, and I agree it's nice to catch the after element bug as well. Given all that let's leave this as-is, but can you add a comment here indicating what the two fields of the Option are and how the line number is being used to detect empty sections?
crates/wasmparser/src/lib.rs
Outdated
| // https://github.com/WebAssembly/reference-types | ||
| @reference_types { | ||
| TypedSelect { ty: $crate::ValType } => visit_typed_select (arity 3 -> 1) | ||
| TypedSelect { tys: Vec<$crate::ValType> } => visit_typed_select (arity select -> select) |
There was a problem hiding this comment.
Clever solution! I'm down for keeping that. Mind adding a comment to TypedSelectMulti as to its purpose though? (e.g. it's always an invalid instruction but it's there to round-trip the binary format)
|
🙏 🎆 Thank you! |
|
And of course thank you for working on this (and working with me on the review)! |
This PR modifies wasmprinter and wasm-encoder to support roundtripping all non-malformed examples in the testsuite. The main changes were to
ensure that component printer doesn't use forward-referring name references in the text format (these are rejected by the component parser)It then modifies the
wastsubcommand to run eachassert_invalidtest roundtrip through the text format and back.