[VM] Memory alignment check for set_input in Virtual Machine
#11391
include/tvm/runtime/ndarray.h
Outdated
 * If AbilityOfZeroCopyForDLTensor is true a NDArray is created
 * using the memory allocated by an external source.
 * Responsibility for memory retaining lies with the external source.
 * Otherwise new NDArray is created, the data is copied from the DLTensor.
This says data will be copied but the \brief says data will not be copied. Which is it? My reading of the code below is that data will be copied if AbilityOfZeroCopyForDLTensor is false.
I don't think we should change the semantics of this function. How about we just raise an error if the AbilityOfZeroCopyForDLTensor is false?
No, we should not raise an error, for the following reason. This PR develops the 'set_input' method of VirtualMachine. The method has two parts: one handles input given as an NDArray, the other handles input given as a DLTensor. In both cases it tries to do zero copy if it can; otherwise a real copy is used. This was first implemented for NDArray (which automatically has no alignment problem) and later extended to DLTensor, but alignment was not checked in the previous PR. If we cannot use zero copy, we should still fall back to the usual copy to avoid a method failure. There are no separate 'set_input' and 'set_input_zero_copy' methods for the VM. I discussed this on the previous PR; it is the design of the VM. It means that 'set_input' should work stably in both cases, with and without copying.
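The fallback described in this comment (share the buffer when it is aligned, copy into an aligned buffer otherwise) can be sketched as follows. This is a minimal illustration using only `ctypes`, not TVM's implementation: `ALIGNMENT`, `aligned_buffer`, and `set_input_sketch` are hypothetical names, and the value 64 is an assumed stand-in for `tvm::runtime::kAllocAlignment`.

```python
import ctypes

ALIGNMENT = 64  # assumption: stand-in for tvm::runtime::kAllocAlignment


def is_aligned(address: int, alignment: int = ALIGNMENT) -> bool:
    """True if the address is a multiple of the required alignment."""
    return address % alignment == 0


def aligned_buffer(nbytes: int, alignment: int = ALIGNMENT):
    """Allocate nbytes whose start address is a multiple of alignment."""
    # Over-allocate, then take a view that starts at the next aligned address.
    raw = (ctypes.c_char * (nbytes + alignment))()
    offset = (-ctypes.addressof(raw)) % alignment
    return (ctypes.c_char * nbytes).from_buffer(raw, offset)


def set_input_sketch(src):
    """Return (buffer, copied): zero copy if src is aligned, else a real copy."""
    if is_aligned(ctypes.addressof(src)):
        return src, False  # share the caller's memory
    dst = aligned_buffer(ctypes.sizeof(src))
    ctypes.memmove(dst, src, ctypes.sizeof(src))
    return dst, True  # caller's memory was unaligned, so it was copied
```

With this shape, the caller never sees a failure for unaligned input; the only observable difference is whether the returned buffer aliases the source.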
What you are proposing here is a change to the semantics of FromExternalDLTensor. Because this is a public and important API I suggest you do not change it and instead put the logic into set_input.
If NDArray needs its underlying DLTensor to be aligned, then we should also add a check to FromExternalDLTensor to make sure that the underlying data is aligned.
@tqchen maybe you can provide some feedback here.
I agree that FromExternalDLTensor in this case should check alignment,
Hello @tkonolige and @tqchen! I've updated the public API; its functionality should now be more correct.
@vvchernov I see that you are still changing the semantics of
Hello @tkonolige! I've skipped
tkonolige
left a comment
Thanks @KJlaccHoeUM9l!
Hello @tkonolige! The CI tests passed successfully. Could you approve it?
@vvchernov I have approved it :). Unfortunately I am not a committer. You'll need someone who is to approve it.
tmoreau89
left a comment
LGTM, thanks for the review @tkonolige
Prior to this commit, any use of `tvm.nd.from_dlpack` to create a strided `NDArray`, or an `NDArray` whose alignment was less than `tvm::runtime::kAllocAlignment`, would raise an error. As a result, views into larger arrays, which are unlikely to be aligned and compact, could only be shared when copied into an aligned and compact buffer.
This commit moves the compact/aligned check from the `NDArray` class into the generated TIR code as part of DLTensor unpacking. These checks were initially introduced in apache#11391, to avoid segfaults caused by use of non-aligned buffers in code intended for aligned buffers. The new checks provide the same safeguard, as the alignment is checked prior to use, but allow the alignment requirement to be relaxed on a per-buffer basis.
This approach also removes a potential bug resulting from compile-time configuration of `tvm::runtime::kAllocAlignment`, first introduced in apache#13307. Since TVM supports cross-compiling, the installation of TVM used to compile a kernel may assume a larger value of `kAllocAlignment` than is provided by the runtime installation of TVM. By validating the alignment within the generated kernel, rather than as part of the runtime, this potential inconsistency would be caught.
This check is also restricted to targets whose `void*` opaque pointer can be interpreted as a pointer to the data array. For example, no such check applies on Vulkan, as the `void*` is a pointer to a struct that contains additional bookkeeping.
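To illustrate why a view into a larger array usually fails both checks, here is a small sketch of a compactness test and an alignment test on DLTensor-style metadata. The helper names and the value `K_ALLOC_ALIGNMENT = 64` are assumptions for illustration only, and the alignment test here is applied to the pointer plus byte offset, which may differ from what the generated TIR actually checks. Strides are counted in elements, as in DLPack, and `None` stands for a DLTensor with no strides (compact row-major).

```python
from typing import Optional, Sequence

K_ALLOC_ALIGNMENT = 64  # assumption: illustrative value of kAllocAlignment


def is_compact(shape: Sequence[int], strides: Optional[Sequence[int]]) -> bool:
    """True if strides (in elements) describe a dense row-major layout."""
    if strides is None:
        return True
    expected = 1
    # Walk from the innermost dimension outwards.
    for extent, stride in zip(reversed(shape), reversed(strides)):
        if extent != 1 and stride != expected:
            return False
        expected *= extent
    return True


def is_aligned(data_ptr: int, byte_offset: int = 0,
               alignment: int = K_ALLOC_ALIGNMENT) -> bool:
    """True if the first element sits on an alignment boundary."""
    return (data_ptr + byte_offset) % alignment == 0


# A 4x4 float32 array at a hypothetical aligned address.
base_ptr, itemsize = 0x1000, 4
full_shape, full_strides = (4, 4), (4, 1)

# The view a[1:, 1:] keeps the parent's strides and starts 5 elements in.
view_shape, view_strides = (3, 3), (4, 1)
view_offset = (1 * 4 + 1) * itemsize  # 20 bytes
```

Here the full array passes both checks, while the view is neither compact (its row stride is 4 but its row length is 3) nor aligned (20 bytes past a 64-byte boundary).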
An earlier PR added the ability to skip copying data when creating input `NDArray` tensors if the source input is in `DLTensor` format.
However, when adding this functionality, memory alignment was not checked, as it was done for GraphExecutor.
In view of this, runtime errors (`Segmentation fault (core dumped)`) are possible, because TVM uses aligned memory.
This PR adds this check.