In some cases, `broadcast_in_dim` computes the wrong output shape:
```python
from nvfuser import DataType, FusionDefinition
import torch
from torch.testing import make_tensor

a = make_tensor((1, 0), device='cuda', dtype=torch.float32)

with FusionDefinition() as fd:
    T0 = fd.define_tensor(symbolic_sizes=[1, -1], contiguity=[None, True], dtype=DataType.Float, is_cpu=False)
    T1 = fd.ops.broadcast_in_dim(T0, output_shape=[0, 0], broadcast_dims=[0, 1])
    fd.add_output(T1)

nvfout, = fd.execute([a])

# This fails because nvfout.shape is (1, 0)
assert nvfout.shape == torch.Size((0, 0))
```

The Fusion seems to be constructed properly:
```
Inputs:
  T0_g[ bS0{1}, iS1{i1} ], float
Outputs:
  T2_g[ bS4{0}, iS5{i1} ], float

%kernel_math {
T1_l[ bS2{1}, iS3{i1} ]
   = Set( T0_g[ bS0{1}, iS1{i1} ] )
T2_g[ bS4{0}, iS5{i1} ] = expand( T1_l[ bS2{1}, iS3{i1} ], {0, i1} )
}
```
Note that changing the first dimension of `a` to 0 and setting `symbolic_sizes=[-1, -1], contiguity=[True, True]` gives the correct behavior, so the problem appears to be specific to expanding a broadcast dimension to size zero. This might be an issue with the remove-empty pass checking `extent` instead of `getMaybeExpandedExtent`.
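For reference, the expected semantics here match standard NumPy/PyTorch broadcasting: a size-1 dimension may be broadcast to any extent, including zero (PyTorch's `expand` behaves the same way for singleton dims). A minimal NumPy sketch (CPU-only, no nvFuser needed) of the shape the repro's assert expects:

```python
import numpy as np

# A (1, 0) array, analogous to the input tensor `a` in the repro above.
a = np.empty((1, 0), dtype=np.float32)

# Broadcasting the size-1 leading dimension to extent 0 is legal and
# yields an empty (0, 0) array -- the shape the failing assert expects.
out = np.broadcast_to(a, (0, 0))
assert out.shape == (0, 0)
```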