Description
The slice implementation in nvFuser has no runtime check for whether the slice range extends beyond the size of the dimension being sliced. For an out-of-range slice, PyTorch returns a zero-element tensor, while nvFuser returns a tensor with the full size of the requested slice(s) (2x2 in the example below).
Example:
import torch
from nvfuser import FusionDefinition, DataType

acts = [
    torch.randn(5, 5, device='cuda'),
]

def legal(fd: FusionDefinition) -> None:
    T0 = fd.from_pytorch(acts[0])
    # Slice range [6, 8) lies entirely outside each dimension of size 5.
    T1 = fd.ops.slice(T0, start_indices=[6, 6], end_indices=[8, 8], strides=[1, 1])
    fd.add_output(T1)

with FusionDefinition() as fd:
    legal(fd)

out = fd.execute(acts)
print(out[0].size(), out[0].stride())

# Same slice in eager mode for comparison.
out_eager = acts[0][6:8, 6:8]
print(out_eager.size(), out_eager.stride())
Output:
$ python test.py
tensor([[0., 0.],
[0., 0.]], device='cuda:0')
torch.Size([2, 2]) (2, 1)
tensor([], device='cuda:0', size=(0, 0))
torch.Size([0, 0]) (5, 1)
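For reference, the eager-mode shape above matches ordinary Python slicing semantics, where start and end are clamped to the dimension size. The snippet below is a minimal sketch of that rule using only the standard library. `expected_slice_shape` is a hypothetical helper written for this issue, not part of PyTorch or nvFuser, and it assumes positive strides.

```python
def expected_slice_shape(shape, starts, ends, strides):
    """Per-dimension output sizes under Python-style slicing (positive strides assumed)."""
    out = []
    for size, start, end, step in zip(shape, starts, ends, strides):
        # slice.indices() clamps start and end into [0, size] for a positive step.
        lo, hi, st = slice(start, end, step).indices(size)
        out.append(len(range(lo, hi, st)))
    return tuple(out)


# The out-of-range slice from the example: every dimension collapses to 0.
print(expected_slice_shape((5, 5), [6, 6], [8, 8], [1, 1]))
# A partially out-of-range slice is truncated rather than rejected.
print(expected_slice_shape((5, 5), [3, 3], [8, 8], [1, 1]))
```

With these inputs the first call gives `(0, 0)`, which is what eager mode reports, and the second gives `(2, 2)`.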
Naoya's comments on the feasibility of a runtime check:
I think we would need some runtime check. Currently we are ignoring a pattern like:
t0: [I0]
t1 = slice(t0, {{0, 1}})
t1: [I1]
Here, the size of I1 is just 1, so it should be marked as a broadcast domain.
But on the codegen side, we are not doing this.
In this particular case, this should be trivial, as we can obviously find that the extent is 1. More generally, though, whether the extent is 1 or not depends on the extent of the sliced domain as well as the start and end offsets.
So, we need to do some check at run time using the runtime info.
And that also seems to apply to the case above.
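To make the dependency on runtime values concrete, here is a rough sketch of the kind of check being discussed. It is only an illustration, not nvFuser code: `sliced_extent` and `classify_sliced_domain` are hypothetical names, and the sketch assumes non-negative start and end offsets with a stride of 1.

```python
def sliced_extent(input_extent, start, end):
    """Extent of the sliced domain after clamping to the input extent.

    Assumes 0 <= start, 0 <= end, and stride 1 (assumptions for this sketch).
    """
    clamped_start = min(start, input_extent)
    # The end can never fall before the clamped start, so the extent is never negative.
    clamped_end = max(clamped_start, min(end, input_extent))
    return clamped_end - clamped_start


def classify_sliced_domain(input_extent, start, end):
    """Label the output domain using values only known at run time."""
    extent = sliced_extent(input_extent, start, end)
    if extent == 0:
        return "empty"      # e.g. the [6:8] slice of a size-5 dimension
    if extent == 1:
        return "broadcast"  # e.g. slice(t0, {{0, 1}}) when I0 has at least 1 element
    return "iteration"


print(classify_sliced_domain(5, 6, 8))
print(classify_sliced_domain(5, 0, 1))
print(classify_sliced_domain(0, 0, 1))
```

The last call shows why the `{{0, 1}}` pattern is not always trivially extent 1: if the input extent is 0 at run time, the result is empty rather than a broadcast.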