cranelift: Stop printing value aliases in CLIF functions by jameysharp · Pull Request #8223 · bytecodealliance/wasmtime

jameysharp · 2024-03-22T18:46:27Z

This is a follow-up to #8214. That PR made the CLIF printer resolve aliases before printing any value, which means now the aliases are never referenced.

I expected plenty of tests to change due to removing the aliases from the printed representation of CLIF functions, but there was one surprise: in tests/disas/icall-loop.wat, this didn't just remove the aliases, but also changed the value numbering.

As it turns out, in that one function, the last value in the function that Wasmtime generated was an alias (v28 -> v3). The disas test harness has Wasmtime print the CLIF it generated, then the harness re-parses that CLIF, and passes the re-parsed CLIF to Cranelift's optimization passes. Since after this change the aliases are not printed, re-parsing the printed form doesn't round-trip perfectly, and specifically the last allocated value number changes. As a result, when legalization and optimization introduce new instructions, they get different result value numbers.

My conclusion is that this is fine. The disas tests don't need to produce exactly the same value numbers that the full Wasmtime to Cranelift pipeline would produce. This process does produce the same instructions in the same order, and I think that's all that matters.

This is a follow-up to bytecodealliance#8214. That PR made the CLIF printer resolve aliases before printing any value, which means now the aliases are never referenced. I expected plenty of tests to change due to removing the aliases from the printed representation of CLIF functions, but there was one surprise: in tests/disas/icall-loop.wat, this didn't just remove the aliases, but also changed the value numbering. As it turns out, in that one function, the last value in the function that Wasmtime generated was an alias (v28 -> v3). The disas test harness has Wasmtime print the CLIF it generated, then the harness re-parses that CLIF, and passes the re-parsed CLIF to Cranelift's optimization passes. Since after this change the aliases are not printed, re-parsing the printed form doesn't round-trip perfectly, and specifically the last allocated value number changes. As a result, when legalization and optimization introduce new instructions, they get different result value numbers. My conclusion is that this is fine. The disas tests don't need to produce exactly the same value numbers that the full Wasmtime to Cranelift pipeline would produce. This process does produce the same instructions in the same order, and I think that's all that matters.

bjorn3 · 2024-03-22T18:55:17Z

If you do a debug print of some Value you have somewhere in your code which happens to be an alias and then try to cross-reference it with the printed clif ir, you wouldn't be able to find the value anymore with this PR, right? Would it instead be possible to have places that print clif ir explicitly mutate the clif ir to resolve all aliases? (likely with a helper function for this on func.dfg) That way you have control over if aliases are printed or not.

cfallin · 2024-03-22T19:12:51Z

Perhaps we can have a verbose mode to printing (a flag carried on the writer object, not a Cranelift setting, to be clear) that prints all aliases still, but in a dump at the bikeshed_choice({beginning, end}) of the function body?

I like the cleanup here, but at the same time I'm also sympathetic to the view that this is "changing the defined values" visible to the reader, differently than the earlier use-aliases-in-args change; I can also easily imagine how this would become confusing when debugging. I think the above would largely address that: if one is println-debugging and wants to dump the function body before some algorithm starts to traverse values, one can dump it "verbosely" to include the aliases and such.

jameysharp · 2024-03-22T19:17:19Z

That's true, although you could also call dfg.resolve_aliases on values you want to display in your debug-print statements, which I would think would be more useful. Do you feel strongly that it's important to be able to find all Values represented in the CLIF serialization? I could be persuaded but I don't yet quite understand why it would be worth keeping.

The egraph pass already has the side effect of resolving all aliases, so I also considered having the printer only print aliases which are actually used somewhere. But since #8214 makes the printer never show any alias uses, I thought that would be a waste of time.

cfallin · 2024-03-22T19:27:09Z

I guess it comes down to: are the aliases semantically part of the CLIF still, or not? I think we should decide that, and then make the text representation follow. I'm concerned with not just debug printing but also "machine-readable bugs", so to speak: some other logic might reference a value somewhere along with text-serialized CLIF, and if the value has disappeared from the CLIF (or been replaced by some other new operator reusing the value number) then that's a subtle and hard-to-debug issue.

If aliases are semantically part of CLIF, then any random Value one comes across anywhere should be reasonable to print, and look for in the CLIF; if they aren't, then resolve_aliases is a more or less mandatory step to take before exposing a value "externally".

I think given the above bug potential, I am actually leaning toward "aliases are still part of CLIF"; that means we should print them, maybe even unconditionally, to avoid confusion about "variants"; and if we want to make them not part of CLIF, we should find ways to "hide" unsanitized (unresolved-aliases) Values. Separately, for a goal of "print clean-looking CLIF", we could do what @bjorn3 mentions and run a mutation (resolve all aliases in the instruction args, and delete alias definitions) before printing CLIF.

bjorn3 · 2024-03-22T19:54:17Z

are the aliases semantically part of the CLIF still, or not?

IMO yes as you can change a value back from an alias to a regular instruction or change it to be an alias for a different value. If changing a value to be an alias for another value was a permanent change where modifying the value would actually modify the aliased value instead, only then would I consider aliases to not be semantically part of clif ir as the alias and it's target are indistinguishable modulo the Eq trait impl (it would likely need to be removed in that case as unlike PartialEq, Eq requires a value to be equal to itself).

cfallin · 2024-03-22T20:04:49Z

Right, so given that, the line that makes sense to me is: altering values we print in arguments is fine because chasing aliases to the root is semantically equivalent; and also that's a very useful thing when reading the code; but removing the definitions of the aliases is removing another "output", or meaningful element of the code.

(I could also see an argument for removing the implicit alias-chasing in printing if we instead have a "cleanup alias resolution pass" before printing output that also deletes defs; that's fine because we're sequestering all the mutation away into a thing that does exactly the mutation it says on the tin, so to speak. But the above middle ground seems like a good compromise too, to me.)

jameysharp · 2024-03-22T20:34:05Z

Alright, I'm convinced enough that aliases are part of the meaning of CLIF.

But in that case I think we should revert #8214 and fully go the direction that @bjorn3 suggested instead, where if someone wants aliases resolved they should do it before printing.

In addition, I want the aliases deleted once all their uses have been resolved, as Chris suggests. It looks like the way to do that is to update the alias' type to INVALID and its referent to Value::reserved_value(). I believe that will make the printer skip those values, without needing to introduce a "verbose" flag on printing.

And I think I would like this alias resolution pass to happen in all the filetest-style harnesses, but that could be a separate discussion. I could go either way on whether Wasmtime's --emit-clif option should do alias resolution or not, but as that frontend rarely introduces aliases I guess it makes sense to just go ahead and print them, so that the round-trip done by the disas test harness is lossless.

cfallin · 2024-03-22T20:50:41Z

That sounds like a good approach to me!

This reverts commit 6c51848, "Cranelift: resolve value aliases when printing CLIF functions (bytecodealliance#8214)" In discussion on bytecodealliance#8223, we decided to handle this a different way. So I've introduced a method on DataFlowGraph which eliminates all value aliases, and we can call it when necessary. In this PR, I've specifically called it in disas tests. In these tests, value aliases are just clutter that distracts from understanding what code will actually be generated. They may change due to changes in legalization or optimization, but that doesn't signal anything about whatever the test was intended to check. There are other places we would probably be better off doing a whole-function rewrite once rather than calling `resolve_aliases` a bunch of times, such as the egraph pass and lowering. I have not changed those in this PR.

jameysharp · 2024-03-26T00:19:30Z

The new helper for rewriting away aliases is in #8238 and the rest of the relevant changes are coming after that. I'm dropping this PR in favor of that.

This reverts commit 6c51848, "Cranelift: resolve value aliases when printing CLIF functions (bytecodealliance#8214)", then applies the same effect a different way. In discussion on bytecodealliance#8223, we decided to handle this a different way. So I've introduced a method on DataFlowGraph which eliminates all value aliases, and we can call it when necessary. If that function is not called then the CLIF printer should print value aliases the same way it did before bytecodealliance#8214. In this PR, I've specifically called it in disas tests. The changes to write.rs and frontend.rs are from the revert, while the changes to disas.rs are new. In these tests, value aliases are just clutter that distracts from understanding what code will actually be generated. They may change due to changes in legalization, but that doesn't signal anything about whatever the test was intended to check. Because the new helper deletes aliases after it's done resolving them, those aliases now disappear from the test expectations. Aside from that, the test expectations are unchanged compared to what bytecodealliance#8214 did.

This reverts commit 6c51848, "Cranelift: resolve value aliases when printing CLIF functions (#8214)", then applies the same effect a different way. In discussion on #8223, we decided to handle this a different way. So I've introduced a method on DataFlowGraph which eliminates all value aliases, and we can call it when necessary. If that function is not called then the CLIF printer should print value aliases the same way it did before #8214. In this PR, I've specifically called it in disas tests. The changes to write.rs and frontend.rs are from the revert, while the changes to disas.rs are new. In these tests, value aliases are just clutter that distracts from understanding what code will actually be generated. They may change due to changes in legalization, but that doesn't signal anything about whatever the test was intended to check. Because the new helper deletes aliases after it's done resolving them, those aliases now disappear from the test expectations. Aside from that, the test expectations are unchanged compared to what #8214 did.

jameysharp requested review from a team as code owners March 22, 2024 18:46

jameysharp requested review from alexcrichton and cfallin and removed request for a team March 22, 2024 18:46

alexcrichton removed their request for review March 22, 2024 19:29

github-actions bot added the cranelift Issues related to the Cranelift code generator label Mar 22, 2024

jameysharp closed this Mar 26, 2024

jameysharp deleted the no-print-aliases branch March 26, 2024 00:19

jameysharp mentioned this pull request Mar 26, 2024

Resolve value aliases in disas tests #8239

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cranelift: Stop printing value aliases in CLIF functions#8223

cranelift: Stop printing value aliases in CLIF functions#8223
jameysharp wants to merge 1 commit intobytecodealliance:mainfrom
jameysharp:no-print-aliases

jameysharp commented Mar 22, 2024

Uh oh!

bjorn3 commented Mar 22, 2024

Uh oh!

cfallin commented Mar 22, 2024

Uh oh!

jameysharp commented Mar 22, 2024

Uh oh!

cfallin commented Mar 22, 2024

Uh oh!

bjorn3 commented Mar 22, 2024

Uh oh!

cfallin commented Mar 22, 2024

Uh oh!

jameysharp commented Mar 22, 2024

Uh oh!

cfallin commented Mar 22, 2024

Uh oh!

jameysharp commented Mar 26, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

jameysharp commented Mar 22, 2024

Uh oh!

bjorn3 commented Mar 22, 2024

Uh oh!

cfallin commented Mar 22, 2024

Uh oh!

jameysharp commented Mar 22, 2024

Uh oh!

cfallin commented Mar 22, 2024

Uh oh!

bjorn3 commented Mar 22, 2024

Uh oh!

cfallin commented Mar 22, 2024

Uh oh!

jameysharp commented Mar 22, 2024

Uh oh!

cfallin commented Mar 22, 2024

Uh oh!

jameysharp commented Mar 26, 2024

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants