Fix getPointwiseHeuristics with empty tensors#369
Conversation
Note that this is not really safe as it will set zero_ = false even if it's still zero because of zero input numerator.
|
After removing the failing assertions, I'm hitting
This is a known issue being addressed in #358 and comes from the broadcast coming from the resize. If instead, we use a size-2 slice instead of size-1, the test passes. I don't think it's necessarily safe to just uncomment the failing check in |
| if (ext->isConstInt() && ext->evaluateInt() == 0) { | ||
| // A size-zero extent ID will be predicated out always, so it should | ||
| // not affect the projected extent calculation | ||
| continue; |
There was a problem hiding this comment.
@zasdfgbnm I know you've been looking at this code lately also, in relation to #393 . I am trying to avoid bad behavior when an empty TensorView with size zero extent is encountered. I currently am just pretending it doesn't exist here for the purposes of computing the projected extent. But I'm a little uncertain of whether that is enough, or if it's even safe. I think I have an idea of how this class supposed to work but I'd appreciate you having a look if you have some time.
There was a problem hiding this comment.
I think we should clean up this entire thing with expr simplifier, and I am thinking about doing so after #393.
There was a problem hiding this comment.
I think for the case of zero extent, as long as this does not throw an error, whatever value it returns does not matter, because we will not do any memory read/write anyway. Am I correct?
There was a problem hiding this comment.
If a tensor has zero elements, should we just stop propagation and assume that tensor is OK with arbitrary vectorization size?
There was a problem hiding this comment.
I am refactoring this part of code for allocation domain support. So I don't want you to invest too much time digging into the codebase on how to do it best. The workaround here makes sense if it can let the test pass.
|
Closing in favor of #449 |
A number of issues have come up when trying to process empty tensors (i.e. ones with at least one non-reduction axis with extent of zero) during scheduling and lowering. See: #264 #369 #269. Additionally, we now assume extents are positive (#440). Along with #543, this PR makes that a reality by removing all intermediate empty tensors. This PR: - Marks a `Fusion` as dynamic if dynamic reshapes/resizes exist or if _any_ alive `TensorView` has a static size-zero extent or a dynamic extent, since it might be empty. **This is means only Fusions with nothing but concrete non-zero sizes are static now.** That is, even if all static shapes are provided, it will be marked as a dynamic Fusion and those `TensorView`s will be modified during concretization. - Adds a pass done during `getConcretizationInfo()` that collects a vector of empty tensors which are not fusion inputs. It does not traverse their definitions, since there is nothing to compute for an empty tensor. - During concretization, sets the size-0 extents of identified empty tensors to constant 0. When encountering a new set of input sizes/scalars, we evaluate a minimal set of `Val`s (those that appear in dynamic extents), and only proceed with removing branches if any of these are zero. So there is a rather quick path to re-using concretizations in the common case where none of the extents are zero. Even after #543, this PR does not guarantee that all tensors present in the Fusion during scheduling have non-zero extent. It does guarantee that any remaining empty tensors are either fusion inputs or outputs, and that empty tensors will have constant 0 extents in any empty dimensions. Stripping empty inputs and outputs from the Fusion could potentially be done at segmentation, but should only be done if it does not result in additional kernels being launched; that is left for another PR (see #448). Fixes #365 and fixes #264. This replaces PRs #369 and #269. --------- Co-authored-by: Naoya Maruyama <naoyam@users.noreply.github.com>
Fixes #365.