Skip to content

Enable sliding window in registers#5815

Merged
dsharletg merged 27 commits intomasterfrom
dsharletg/slide-registers
Mar 24, 2021
Merged

Enable sliding window in registers#5815
dsharletg merged 27 commits intomasterfrom
dsharletg/slide-registers

Conversation

@dsharletg
Copy link
Contributor

This PR enables an alternative strategy for sliding window (used when the producer of a sliding window is store_in(MemoryType::Register)). The alternative strategy works by shifting the producer such that the previously computed values are copied from the previous iteration, and the newly computed values are always computed in the same place. This avoids non-constant addresses, which are a prerequisite for storing things in registers (on most targets).

This mostly fixes #1820, though it currently requires bending over backwards in the schedule a bit for sliding vectorized dimensions.

@abadams contributed heavily to the ideas implemented here.

This PR has some other related changes:

  • Treat if_then_else intrinsics like select when partitioning loops.
  • After we're done with using boxes_touched, rebase some loops to have a min of 0, to enable stronger simplifications.
  • More simplifier rules.

@steven-johnson steven-johnson requested a review from abadams March 18, 2021 22:44
@dsharletg
Copy link
Contributor Author

Ping on a code review, please take a look at this when you get a chance @abadams. I've confirmed that this does address the use case we had in mind well.

switch (type) {
case ForType::Extern:
case ForType::GPUBlock:
case ForType::GPUThread:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gpu loops absolutely need to be rebased to zero, and there's a pass inside FuseGPUThreadLoops.cpp that does it. Is that mutator now redundant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That mutator works a little differently. It substitutes loop_var + min, rather than makes a new let, and that seems like maybe something that might matter for another pass/logic (there's a lot of stuff happening in FuseGPUThreadLoops). I would also be careful about changing when loops get rebased to 0.

Maybe this PR should make it so rebase_loops_to_zero accepts a set of ForType that get rebased, and use that in FuseGPUThreadLoops? That would at least keep the rebasing happening at the same time, and the only change would be substitute vs. let.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the new mutator is just more correct than the old one. I guess the old one happens earlier though, so we can't just rely on the new one to do the mutation. Your proposal sounds good, but I have no strong feelings one way or the other.

Copy link
Contributor Author

@dsharletg dsharletg Mar 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the duplication here is minimal and there are non-minimal risks in messing with this, so I'll save it for a separate PR (will file an issue).

rewrite(select(y, z, x + c0) < x + c1, y && (z < x + c1), c0 >= c1) ||
rewrite(select(y, z, x + c0) < x + c1, !y || (z < x + c1), c0 < c1) ||

rewrite(c0 < select(x, c1, c2), select(x, fold(c0 < c1), fold(c0 < c2))) ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These rules all formally verify ok

@abadams
Copy link
Member

abadams commented Mar 23, 2021

Generally lgtm. I think the substitute call is fishy though.

@steven-johnson
Copy link
Contributor

The OSX failure is unrelated (will be fixed by #5841), should be good to land

@dsharletg dsharletg merged commit 92dfc82 into master Mar 24, 2021
@dsharletg dsharletg deleted the dsharletg/slide-registers branch March 24, 2021 17:57
@alexreinking alexreinking added this to the v12.0.0 milestone May 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Sliding window / storage folding can't express rotating through a set of registers

4 participants