Use GetMetaData for stride computation #649
GetMetaData for stride computation
// Forward traverse from rFactor domain to allocation domain, compute frontier
// sizes and strides, validate that splits are divisible and merges are
// contiguous, and update active_ids_ correspondingly.
class ForwardTraverseFromRFactorToAlloc {
Moved to tensor_metadata.cpp unchanged.
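To illustrate the "merges are contiguous" check named in the class comment, here is a minimal sketch. `mergeSizeAndStride` is a hypothetical helper written for this example, not nvFuser's implementation, and it assumes the simplest contiguity rule: the outer stride must equal the inner size times the inner stride. The real traversal may treat special cases (for example size-1 dimensions) differently.

```cpp
#include <cstdint>
#include <stdexcept>
#include <utility>

// (size, stride) of a single dimension.
using SizeStride = std::pair<int64_t, int64_t>;

// Hypothetical helper, not part of nvFuser: merge an outer and an inner
// dimension into one. The merge is only representable with a single stride
// when the outer dimension steps over exactly one full inner dimension.
SizeStride mergeSizeAndStride(SizeStride outer, SizeStride inner) {
  if (outer.second != inner.first * inner.second) {
    throw std::invalid_argument("merge is not contiguous");
  }
  // The merged dimension covers both extents and keeps the inner stride.
  return {outer.first * inner.first, inner.second};
}
```

Under this assumption, merging (size 3, stride 35) with (size 5, stride 7) gives (size 15, stride 7), while an outer stride of 40 would be rejected.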
};

// Similar to ForwardTraverseFromRFactorToAlloc, but in the opposite direction.
class BackwardTraverseFromRFactorToAlloc {
Moved to tensor_metadata.cpp unchanged.
// is [I1, I2], and the tensor's size is [15] and stride is [7], and the extent
// of I2 is 5, then the resulting size will be [3, 5] and stride will be [35, 7]
std::vector<std::pair<int64_t, int64_t>>
inferAndValidateAllocationSizesAndStrides(
Moved to tensor_metadata.cpp slightly changed.
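The arithmetic in the comment above (size [15], stride [7], extent of I2 equal to 5) can be reproduced with a small sketch. `splitSizeAndStride` is a hypothetical name introduced only for illustration; it is not the signature of `inferAndValidateAllocationSizesAndStrides`, and it handles a single split rather than a full domain traversal.

```cpp
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper, not part of nvFuser: split one dimension, given as
// (size, stride), into an outer and an inner dimension, where the inner
// dimension has extent `inner_extent`.
std::vector<std::pair<int64_t, int64_t>> splitSizeAndStride(
    int64_t size, int64_t stride, int64_t inner_extent) {
  // Mirror the "splits are divisible" validation mentioned in the source.
  if (inner_extent <= 0 || size % inner_extent != 0) {
    throw std::invalid_argument("split is not divisible");
  }
  int64_t outer_extent = size / inner_extent;
  // The inner dimension keeps the original stride; one step of the outer
  // dimension skips a whole inner dimension.
  return {{outer_extent, stride * inner_extent}, {inner_extent, stride}};
}
```

Calling `splitSizeAndStride(15, 7, 5)` yields the pairs (3, 35) and (5, 7), matching the sizes [3, 5] and strides [35, 7] in the comment.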
!build
!build
jacobhinkle left a comment
LGTM. The changes are mostly mechanical, due to the name change and moving code. I noted one method and one member of TensorArg that are no longer used, but since you'll be removing all ArgAbstract subclasses soon anyway, cleaning up the interface is unimportant.
TORCH_INTERNAL_ASSERT(
    (size_t)instance_.nAllocationDims() == sizes_strides.size());
for (auto i : c10::irange((int64_t)sizes_strides.size())) {
  alloc_sizes.at(i) = sizes_strides.at(i).first;
Since this is removed, where does alloc_sizes get set now?
OK I see; it's not used anymore since it is all done in metadata. So this means we can also get rid of alloc_sizes along with getAllocSize().
Yes, this change breaks TensorArg, but it is not used so who cares if it is broken.
@zasdfgbnm Could you remind me again what the logical stride means? Is it just the strides computed from the logical sizes of a logically contiguous tensor?
Logical strides are just the strides in terms of the rFactor domain, that is, the "raw" strides from PyTorch using
The definition of Tensor in runtime has been changed as
That is, we are now more explicit about whether we are referring to the size/stride of the allocation domain or of the rFactor domain. On the host, the PolymorphicValue of tensor metadata now stores all four of logical_size, logical_stride, alloc_size, and alloc_stride.
The utility that converts sizes and strides with respect to the rFactor domain into those of the allocation domain was inferAndValidateAllocationSizesAndStrides. This function has been moved to tensor_metadata.cpp and is now private to that file. The new way to compute sizes and strides of the allocation domain is:
I need to change SchedulerRuntimeInfo::SchedulerRuntimeInfo, getKernelArgument, and validateAlignedVectorizedFusionInputOutput to use this new approach.
With this change, getTensorArg and KernelArgumentHolder::getBuffer are no longer used. We should be ready to remove all the subclasses of ArgAbstract, but I am not doing that cleanup in this PR; it is left for the next PR.