Skip to content

Remove the use of the union-find structure during elaboration#7922

Merged
elliottt merged 2 commits intobytecodealliance:mainfrom
elliottt:trevor/remove-union-find-from-elaborator
Feb 13, 2024
Merged

Remove the use of the union-find structure during elaboration#7922
elliottt merged 2 commits intobytecodealliance:mainfrom
elliottt:trevor/remove-union-find-from-elaborator

Conversation

@elliottt
Copy link
Member

@elliottt elliottt commented Feb 12, 2024

Remove the UnionFind argument to Elaborator::new, and from the Elaborator structure, relying instead on the value_to_best_value table when computing canonical values. Running sightglass on the spideromonkey benchmark showed no difference in performance between main and this branch (compile time or run time).

Additionally, we compared the assembly produced with and without this change, and found the difference in generated code to be mostly negligible (lea instead of add, or a move between registers added or removed). The difference in the length of the disassembled output was only +15 lines, which out of 2212651 is pretty good.

Remove the UnionFind argument to `Elaborator::new`, and from the
`Elaborator` structure, relying instead on the `value_to_best_value`
table when computing canonical values.

Co-authored-by: Jamey Sharp <jsharp@fastly.com>
Co-authored-by: L. Pereira <lpereira@fastly.com>
@elliottt elliottt marked this pull request as ready for review February 12, 2024 23:46
@elliottt elliottt requested a review from a team as a code owner February 12, 2024 23:46
@elliottt elliottt requested review from cfallin and removed request for a team February 12, 2024 23:46
Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking this on!

It's reassuring to see no changes to the compile-tests' outputs (EDIT: and so little in real code). I remembered one bit of the original thinking after our discussion earlier: IIRC, we had wanted to track elaborated values by canonical-value (union-find result) because otherwise, elaborating a use of an "early" node in the class will not necessarily see a later node that was union'd into the class -- it will result in redundant elaboration.

I suspect this mostly doesn't show up in practice because eager rewriting means that all subsequent uses are of the latest (highest-numbered) rewrite, if any rewrites occurred, so the best-value propagation can give the best for the whole class at that node. But it'd be good to verify that is what you all had been thinking as well, and add a comment describing this somewhere. What do you think?

@elliottt
Copy link
Member Author

The goal that we had in removing the use of the union find structure was to enforce the assumption that it was valid to refer to an inner node of the union tree as a subset of an eclass. Using the union find results from the rewriting pass over the whole function during elaboration may end up merging results from different branches in the dominator tree into the same eclass, which would mean that we might canonicalize to a node higher in the union tree than we originally intended. If that situation occurred, we would be at the mercy of the cost function for correctness. I think that with the changes you proposed to how we manage access to the gvn map as well, we'll be able to relax the new guideline we introduced for using subsume when the RHS forgets values from the LHS in egraph rules.

As to where we would add a comment, I'm not sure the best place to document this. Do we already document our use of sub-trees of the union tree to name subsets of the eclass? I think that's a super elegant implementation detail that's worth documenting, and we could call out careful management of eclasses in the same place.

@cfallin
Copy link
Member

cfallin commented Feb 13, 2024

OK, yeah, this is confirming to me why we didn't see the scoping issue originally: I think I had been imagining the canonical value to be the "first" and hence dominate all others in the eclass, but, indeed, you're correct that out-of-order canonical values can cause references that require the cost function (or the domtree-range scoping described in #7891) to handle. Thanks for "elaborating" on that!

Perhaps we could add to the comment on value_to_elaborated_value?

Incidentally, this gives me another thought: if we could somehow canonicalize toward dominating values (up the domtree), this problem would also go away. I'm not sure how to do that efficiently though, so I think this plus domtree-range scoping (to avoid delicate subsume rules) feels like the right way to make all this fully robust.

@github-actions github-actions bot added the cranelift Issues related to the Cranelift code generator label Feb 13, 2024
@elliottt elliottt added this pull request to the merge queue Feb 13, 2024
Merged via the queue into bytecodealliance:main with commit cf14a89 Feb 13, 2024
@elliottt elliottt deleted the trevor/remove-union-find-from-elaborator branch February 13, 2024 19:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift Issues related to the Cranelift code generator

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants