
Roll forward "AliasAnalysis handles slices. (#1281)". #1357

Merged
wujingyue merged 5 commits into main from wjy/slice on Nov 21, 2023

Conversation

@wujingyue
Collaborator

No description provided.

@wujingyue wujingyue changed the base branch from fix1353 to wjy/rollforward November 20, 2023 22:35
@wujingyue wujingyue requested a review from jjsjann123 November 20, 2023 22:35
@wujingyue
Collaborator Author

!build

// But don't reset next_non_broadcast_is_non_contiguous. Say we slice a
// TV of shape [4,1,7],[t,t,t] to a TV of shape [4,1,5]. The output
// contiguity should be [f,n,t].
} else if (next_non_broadcast_is_non_contiguous) {
Collaborator

The change here does make sense to me, but I'm not following the comment.

TV of shape [4,1,7],[t,t,t]
don't reset next_non_broadcast_is_non_contiguous.

The old code calls continue for broadcast dimensions, so we are not skipping anything here, are we?

Or is it rather that the broadcast dimension here is not viewed as such, since its contiguity flag is marked as true?

Contributor

Note this is over the [f,n,t] of the output tensor, Jie, not the [t,t,t] of the input tensor.
(But still a valid point; the code does seem to be the same.)

Collaborator

I see how this code could fix the first example below, where we are slicing a broadcast dimension.

But the second example the comment refers to doesn't make sense to me at all.

Collaborator Author

I added this comment because I almost wrote something like

    if (out_layout.allocation_domain[i]->isBroadcast()) {
      out_layout.contiguity[i] = std::nullopt;
      next_non_broadcast_is_non_contiguous = false;
    } else ...

which is wrong. We shouldn't reset next_non_broadcast_is_non_contiguous here when deciding the contiguity of a broadcast dimension.

On second thought, I feel it's pretty obvious because the variable name says it all, so I may just remove this comment.
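To make the flag's behavior concrete, here is a hypothetical Python sketch of the inner-to-outer contiguity scan; this is my illustration, not nvFuser's actual C++, and the `(is_broadcast, is_sliced)` encoding is an assumption for the example:

```python
def slice_contiguity(dims):
    """Sketch of deriving output contiguity after a slice.

    dims: list of (is_broadcast, is_sliced) flags over the output
    allocation domain, outermost to innermost.
    Returns per-dimension contiguity: True/False, or None for broadcast.
    """
    out = [None] * len(dims)
    next_non_broadcast_is_non_contiguous = False
    # Scan from the innermost dimension outward.
    for i in reversed(range(len(dims))):
        is_broadcast, is_sliced = dims[i]
        if is_broadcast:
            out[i] = None
            # Do NOT reset next_non_broadcast_is_non_contiguous here:
            # a broadcast dimension is transparent to the stride check.
        elif next_non_broadcast_is_non_contiguous:
            out[i] = False
            next_non_broadcast_is_non_contiguous = is_sliced
        else:
            out[i] = True
            next_non_broadcast_is_non_contiguous = is_sliced
    return out

# Slicing [4,1,7] (contiguity [t,t,t]) to [4,1,5]: only the innermost
# dimension is sliced; the middle dimension is a broadcast.
print(slice_contiguity([(False, False), (True, False), (False, True)]))
# [False, None, True], i.e. [f,n,t]
```

If the broadcast branch also reset the flag (the version the author says they almost wrote), the outermost dimension would incorrectly come out contiguous: [t,n,t] instead of [f,n,t].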

Collaborator Author

But the second example where the comment is referring doesn't make sense to me at all.

How so? You think the contiguity ought to be something else?

Collaborator

Ha, now I see where the comment is coming from.

I was somehow reading too much into the example and how it translates to the code. Never mind my poor reading.

Contributor

@tfogal tfogal left a comment

LGTM.

I'd love it if you could take some time to educate me on how the alias analysis is used, though! See comment.

    FusionExecutorCache fec(std::move(fusion));
    at::Tensor in_tensor = at::randn({4, 1, 7}).cuda();
    at::Tensor out_tensor = fec.runFusionWithInputs({in_tensor})[0];
    EXPECT_TRUE(out_tensor.is_alias_of(in_tensor));
Contributor

You would know better than me, but this confused me. I'm assuming that 7 is the fastest dimension in {4,1,7}, and it might be that I just have this backwards :-)

But if I have it the right way: I think of two tensors as aliases when they use the same base address for the data. Wouldn't the out tensor need a stride of 7 (despite the dim being 5) in the fastest dim for that to be true? Does the alias analysis perhaps guarantee that out indeed gets a stride of 7? If not, where did my logic go awry?

Collaborator

Think about this in terms of Python indexing. Any basic indexing (slice included) can be done via a metadata update, as a view into the original storage; you don't need a copy.

After the slice, the stride on each dimension remains the same; we only update the shape (and the offset, when it applies).

Collaborator

    >>> import numpy as np
    >>> arr = np.arange(28).reshape(4, 1, 7)
    >>> b = arr[:, :, 0:5]
    >>> arr.strides
    (56, 56, 8)
    >>> b.strides
    (56, 56, 8)
    >>> arr.ctypes.data
    94247211410336
    >>> b.ctypes.data
    94247211410336

Collaborator Author

I'm assuming that 7 is the fastest dimension in {4,1,7}

Right.

I think of two tensors as aliases when they use the same base address for the data

No, they alias when they share the storage buffer; the base addresses don't have to be the same. I believe @jjsjann123 answered this question. See also AliasTest.SliceViewPermute, where all three outputs are aliases but only the first one uses the same base address.
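As a NumPy analogy (my illustration, not from the PR): a slice that starts at a nonzero offset shares the same underlying buffer as the original tensor, but its base address differs by the offset, and it is still a view:

```python
import numpy as np

arr = np.arange(28).reshape(4, 1, 7)
b = arr[:, :, 2:7]  # slice starting at element offset 2 in the fastest dim

# b is a view: same underlying buffer, different base address.
assert np.shares_memory(arr, b)
assert b.ctypes.data == arr.ctypes.data + 2 * arr.itemsize
# Strides are unchanged; only shape (and offset) were updated.
assert b.strides == arr.strides
print(b.shape)  # (4, 1, 5)
```

This is the same "metadata update" view semantics described above, just with a starting offset, which is why aliasing can't be defined by equal base addresses alone.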

@wujingyue wujingyue requested a review from jjsjann123 November 21, 2023 00:13
@wujingyue wujingyue changed the base branch from wjy/rollforward to main November 21, 2023 01:38
@wujingyue
Collaborator Author

!build

@wujingyue wujingyue merged commit 3b9f421 into main Nov 21, 2023
@wujingyue wujingyue deleted the wjy/slice branch November 21, 2023 05:13