Description
Currently, scheduling a reduction on a tensor with zero elements results in an unhandled SIGFPE, caused by a 0/0 division at https://github.com/NVIDIA/Fuser/blob/main/csrc/scheduler/reduction.cpp#L558 (where n_elems = target_blocks = 0).
Here is a minimal reproduction:
    // Test that zero-element tensors do not break the reduction scheduler
    TEST_F(NVFuserTest, FusionReduceZeroElementTensor_CUDA) {
      auto fusion = std::make_unique<Fusion>();
      FusionGuard fg(fusion.get());

      // One extent is 0, so the tensor holds no elements
      std::vector<int64_t> input_shape{3, 4, 0, 5};

      auto tv0 = makeSymbolicTensor(4);
      fusion->addInput(tv0);
      // Reduce along axis 1 (extent 4), not along the zero-size axis
      auto tv1 = sum(tv0, {1});
      fusion->addOutput(tv1);

      auto options = at::TensorOptions().dtype(at::kFloat).device(at::kCUDA, 0);
      at::Tensor at_x = at::randn(input_shape, options);

      FusionExecutorCache executor_cache(std::move(fusion));
      auto outputs = executor_cache.runFusionWithInputs({at_x});

      // Reference result: reduce along the same axis as the fusion
      auto t2 = at_x.sum({1});

      // `fusion` was moved into the cache above, so fusion.get() would be
      // nullptr here; use the cache's Fusion instead
      auto reduction_params =
          getReductionHeuristics(executor_cache.fusion(), {at_x});
      TORCH_CHECK(reduction_params, "Reduction schedule was not generated!");
      scheduleReduction(executor_cache.fusion(), *reduction_params);

      testValidate(
          executor_cache.fusion(), outputs, {at_x}, {t2}, __LINE__, __FILE__);
    }

Note that the reduction does not need to be along the zero-size dimension to trigger the crash. We have a test for zero-size tensors in the presence of reductions, but it does not actually reduce the zero-size tensor:
Line 5984 in 13a90b6:

    TEST_F(NVFuserTest, FusionZeroSizeTensorReduction_CUDA) {
I tried guarding a few of the ceilDiv calls in outerReductionHeuristic. That avoids the SIGFPE but leads to a kernel segfault instead, so I'm posting this issue for some discussion before I attempt a fix. We may just need to check numel at the beginning of that function and short-circuit there if it is zero.
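
To make the short-circuit idea concrete, here is a minimal standalone sketch. It is not nvFuser code: `SketchReductionParams`, `sketchReductionHeuristic`, and the `max_threads_per_block` default are hypothetical names and values chosen for illustration, and the real heuristic computes far more than this. The only point is that checking the element count before the first `ceilDiv` keeps every divisor non-zero.

```cpp
#include <algorithm>
#include <cstdint>
#include <functional>
#include <numeric>
#include <optional>
#include <vector>

// Integer ceiling division. Dividing by b == 0 is undefined behavior and
// typically raises SIGFPE, which is the failure mode described in this issue.
inline int64_t ceilDiv(int64_t a, int64_t b) {
  return (a + b - 1) / b;
}

// Hypothetical stand-in for the scheduler's launch parameters.
struct SketchReductionParams {
  int64_t blocks = 1;
  int64_t threads_per_block = 1;
};

// Product of all extents: 0 if any extent is 0, 1 for an empty shape.
inline int64_t numel(const std::vector<int64_t>& shape) {
  return std::accumulate(
      shape.begin(), shape.end(), int64_t{1}, std::multiplies<int64_t>());
}

// Hypothetical heuristic (assumes max_threads_per_block > 0). Returns
// std::nullopt for zero-element inputs so the caller can skip scheduling.
inline std::optional<SketchReductionParams> sketchReductionHeuristic(
    const std::vector<int64_t>& shape,
    int64_t max_threads_per_block = 256) {
  const int64_t n_elems = numel(shape);
  if (n_elems == 0) {
    // Short-circuit before any division can see a zero divisor.
    return std::nullopt;
  }
  SketchReductionParams params;
  // n_elems > 0 here, so threads_per_block >= 1 and the division is safe.
  params.threads_per_block = std::min(n_elems, max_threads_per_block);
  params.blocks = ceilDiv(n_elems, params.threads_per_block);
  return params;
}
```

With this shape of API the caller still has to decide what to do with the empty case. If the reduced axis is not the zero-size one, the output is also empty and there is nothing to launch. If the reduction is along the zero-size axis, the output is non-empty and would need to be filled with the reduction's identity (0 for a sum).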