Tweak MPMS materialization to handle some edge cases #624
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR adds a couple of tweaks to the MPMS materialization scheme to try to handle the two major cases of arrays not being materialized when they should be (i.e., they contribute a lot of excess computation), as found when experimenting with mirgecom's gas-in-box with the flop counting from #623. The two cases are:
len(materialized_predecessors) < 2) that are used by a large number of materialized nodes. In mirgecom simulations, this occurs most frequently on powers of temperature, where the subexpression only depends on one materialized array (temperature) and has a single computational node (**), but due to a large amount of reuse (up to 30x) it contributes a lot of flops. I attempted to fix this by modifying the MPMS materialization criterion slightly (nsuccessors > 1 and len(materialized_predecessors) > 1changed tonsuccessors > 1 and nsuccessors + len(materialized_predecessors) >= 4). This may have the side effect of materializing some operations in the DAG that don't strictly need to be (Reshapes). Not sure yet how much this matters.Results for
gas-in-box:Results for
wave(which wasn't that bad to begin with):I'd still like to try this on the full prediction case to see if it behaves any differently from gas-in-box, and also to see how these changes affect timestep time.
Note 1: I moved the materialization code to a new module, because I was originally planning to add this as a separate materializer. I ultimately decided to tweak the existing one instead, but I left the new module. I figure
transform/__init__.pyis too crowded anyway, it could use some splitting up.Note 2: Due to the module change, it's likely best to read (and merge) this PR commit by commit.