[VirtualMachine] Zero copy in set_input when input is DLTensor #11003

vvchernov · 2022-04-13T19:04:05Z

I observed that the VirtualMachine::SetInputTensorWithIndex(...) method has a discrepancy between its description and its implementation (see also the description of VirtualMachine::SetInput(...), which assumes zero copy where possible and uses this method). If the source input is a DLTensor, it always creates a new NDArray and copies the data into it, even if the devices are the same. The excess copying reduces performance for models with multiple inputs. This PR fixes the issue.
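To make the copy-versus-share decision concrete, here is a minimal toy model in plain Python. It is not TVM code: `Device`, `Tensor`, and `set_input_tensor` are hypothetical stand-ins, and the only thing it illustrates is the rule described above, namely that the caller's buffer is shared when the devices match and copied otherwise.

```python
from array import array
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for a DLPack-style device descriptor."""
    device_type: str
    device_id: int = 0


@dataclass
class Tensor:
    """Hypothetical stand-in for a tensor: a device plus a flat buffer."""
    device: Device
    data: array


def set_input_tensor(vm_device: Device, tensor: Tensor) -> Tensor:
    """Share the caller's buffer when devices match; otherwise copy it."""
    if tensor.device == vm_device:
        # Zero copy: the VM-side tensor aliases the caller's storage.
        return Tensor(vm_device, tensor.data)
    # Devices differ: allocate new storage and copy the elements over.
    return Tensor(vm_device, array(tensor.data.typecode, tensor.data))


cpu = Device("cpu")
gpu = Device("cuda")
src = Tensor(cpu, array("f", [1.0, 2.0, 3.0]))

shared = set_input_tensor(cpu, src)   # same device: no copy
copied = set_input_tensor(gpu, src)   # different device: copy
```

In this sketch `shared.data` is the very same object as `src.data`, while `copied.data` is a separate buffer with equal contents.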

Note: I have a remark about the current design. VirtualMachine has only the set_input Python method, and the same method is used inside the run and invoke methods when they are given input args. There is no set_input_zero_copy. In the description I observed that set_input tries to avoid copying if possible. Theoretically, we can have a problem if set_input is used, the input tensors are released after that, and then run or invoke is launched. As far as I know, GraphExecutor does not have this problem.
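The hazard in this note can be sketched with a toy example. `ToyVM` below is hypothetical and is not the TVM API. Python keeps a buffer alive while it is referenced, so the sketch shows the milder aliasing form of the problem: a write made by the caller after set_input is visible at run time. With a non-owning DLTensor in C++, releasing the buffer in that window would instead leave a dangling pointer.

```python
from array import array


class ToyVM:
    """Hypothetical VM that just sums its input when run."""

    def __init__(self, copy_inputs: bool):
        self.copy_inputs = copy_inputs
        self._input = None

    def set_input(self, buf: array) -> None:
        # Either snapshot the caller's buffer or alias it (zero copy).
        self._input = array(buf.typecode, buf) if self.copy_inputs else buf

    def run(self) -> float:
        return sum(self._input)


data = array("d", [1.0, 2.0, 3.0])

zero_copy_vm = ToyVM(copy_inputs=False)
copying_vm = ToyVM(copy_inputs=True)
zero_copy_vm.set_input(data)
copying_vm.set_input(data)

# The caller reuses its buffer between set_input and run.
data[0] = 100.0

zero_copy_result = zero_copy_vm.run()  # sees the caller's later write
copying_result = copying_vm.run()      # still sees the original values
```

The trade-off is the one discussed in this thread: the copying variant is insulated from later changes to the caller's buffer but pays for an extra copy on every input.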

jwfromm

I like this change a lot and your comments are excellent.

AndrewZhaoLuo

Hmm, I'm not sure about this change, it seems like a major change in invariants. I would rather you make a new method like "set_input_zero_copy" and expose that to the user to use.

AndrewZhaoLuo

Gonna block this until I get another set of opinions on it. @mbs-octoml @altanh ?

vvchernov · 2022-04-15T17:31:11Z

Hello @AndrewZhaoLuo! As you can see from my note in the PR, I said much the same, but currently the set_input method of VirtualMachine already does zero copy for NDArray input. That means that if I add set_input_zero_copy, I should change set_input into a method that always copies the external input to an internal one (and performance is reduced for people who already use it in their code). No problem, I can do it; to me it is the more reasonable approach. What do you think, @jwfromm?

mbs-octoml

LGTM. Just a comment nit. Thanks, copy overhead has been troubling me lately so I'm glad you're ahead of me.

mbs-octoml · 2022-04-15T18:11:26Z

src/runtime/vm/vm.cc

-    std::vector<int64_t> shape;
-    for (int64_t i = 0; i < tensor->ndim; i++) {
-      shape.push_back(tensor->shape[i]);
+    if (dev.device_type == tensor->device.device_type &&


Thank you, this is a great change.

Could you update vm.py's set_input doc string to clearly state the by-copy vs by-ref semantics? Thanks.

vvchernov · 2022-04-15

Hello @mbs-octoml! I've added the description and a device id check for NDArray. But it looks like the internal copying mechanism does not take the device id into account. Please see my changes.

mbs-octoml · 2022-04-18

Thanks for that. Yeah, the codebase is not at all 'device_id clean' and will require an audit to find all the places it is either ignored or defaulted to '0'. One step at a time.
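As a small illustration of why the device id matters in such a check, the sketch below compares two devices with and without it. The `Device` tuple and both helper functions are hypothetical and are not taken from the TVM codebase.

```python
from typing import NamedTuple


class Device(NamedTuple):
    """Hypothetical device descriptor: device type plus device ordinal."""
    device_type: str
    device_id: int


def same_device_type_only(a: Device, b: Device) -> bool:
    # Ignores device_id, so two different GPUs look identical.
    return a.device_type == b.device_type


def same_device(a: Device, b: Device) -> bool:
    # Compares both fields, so cuda:0 and cuda:1 are told apart.
    return a.device_type == b.device_type and a.device_id == b.device_id


cuda0 = Device("cuda", 0)
cuda1 = Device("cuda", 1)

type_only_match = same_device_type_only(cuda0, cuda1)
full_match = same_device(cuda0, cuda1)
```

Here the type-only comparison treats `cuda0` and `cuda1` as the same device, which is the kind of case where sharing a buffer instead of copying it would be wrong.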

AndrewZhaoLuo · 2022-04-15T18:15:49Z

> Hello @AndrewZhaoLuo! As you can see from my note in the PR, I said much the same, but currently the set_input method of VirtualMachine already does zero copy for NDArray input. That means that if I add set_input_zero_copy, I should change set_input into a method that always copies the external input to an internal one (and performance is reduced for people who already use it in their code). No problem, I can do it; to me it is the more reasonable approach. What do you think, @jwfromm?

After talking to MBS, your change does in fact match the intended semantics better.

AndrewZhaoLuo · 2022-04-15T18:42:44Z

Just please cover the nit

…e#11003)
* method of creating of NDArray from external DLTensor was implemented
* set input without copying for DLTensor source
* code clean up
* update description and comments after review
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>

vvchernov force-pushed the vc/vm_set_input_zero_copy branch 2 times, most recently from a7f7a67 to c21e22b Compare April 13, 2022 19:22

vvchernov changed the title ~~WIP: [VirtualMachine] Zero copy in set_input when input is DLTensor~~ [VirtualMachine] Zero copy in set_input when input is DLTensor Apr 14, 2022

Valery Chernov added 3 commits April 14, 2022 16:58

method of creating of NDArray from external DLTensor was implemented

d782877

set input without copying for DLTensor source

8d58230

code clean up

ea05472

vvchernov force-pushed the vc/vm_set_input_zero_copy branch from 2086bba to ea05472 Compare April 14, 2022 13:59

jwfromm approved these changes Apr 14, 2022

AndrewZhaoLuo reviewed Apr 15, 2022

AndrewZhaoLuo requested changes Apr 15, 2022

mbs-octoml approved these changes Apr 15, 2022

AndrewZhaoLuo approved these changes Apr 15, 2022

update description and comments after review

c5606d6

vvchernov force-pushed the vc/vm_set_input_zero_copy branch from 0ad16f6 to c5606d6 Compare April 15, 2022 19:45

AndrewZhaoLuo merged commit fafabc9 into apache:main Apr 15, 2022

KJlaccHoeUM9l mentioned this pull request May 20, 2022

[VM] Memory alignment check for set_input in Virtual Machine #11391

Merged

vvchernov deleted the vc/vm_set_input_zero_copy branch May 28, 2022 11:09

vvchernov mentioned this pull request May 28, 2022

[VM] check DLManagedTensor for conditions to construct NDArray #11504

Merged

tqchen mentioned this pull request Jan 1, 2023

[RUNTIME] Memory Safety Issue on NDArrary::FromExternalDLTensor #13678

Closed
