Conversation

@sunshinemyson


workflow:
    1. Define the special patterns that can be fused by the NPU in Python.
    2. Generate the compiled model with the BYOC implementation.
    3. Execute the compiled model with a simple runtime implementation.

Signed-off-by: xiang.zhang <xiang.zhang@verisilicon.com>
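Step 1 of this workflow is what TVM's BYOC pattern tables express. As a rough, self-contained sketch of the idea (the pattern names and the matcher below are hypothetical illustrations, not the actual TVM pattern-language API), a fusable pattern is essentially an op sequence that the NPU backend claims for offload:

```python
# Illustrative sketch of a BYOC-style pattern table: each entry maps a
# composite-function name to the op chain the NPU can fuse and execute.
# A "graph" here is simplified to a list of op names in dataflow order.
FUSABLE_PATTERNS = {
    # hypothetical names, modeled after BYOC pattern tables such as "vsi_npu.qnn_conv2d"
    "vsi_npu.qnn_conv2d": ["qnn.conv2d", "nn.bias_add", "qnn.requantize"],
    "vsi_npu.qnn_dense": ["qnn.dense", "qnn.requantize"],
}


def match_pattern(ops):
    """Return the name of the first pattern whose op sequence prefixes `ops`,
    or None when no fusable pattern applies."""
    for name, seq in FUSABLE_PATTERNS.items():
        if ops[: len(seq)] == seq:
            return name
    return None
```

In the real flow, matched subgraphs are partitioned out and handed to the VSI NPU codegen (step 2), while unmatched ops stay on the default TVM backend.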
@sunshinemyson
Author

Hi @mbaret ,

Could you help review our PR for new NPU support? Thanks

@areusch
Contributor

areusch commented Sep 21, 2021

@comaniac would you like to help review this too?

Contributor

@areusch areusch left a comment

@sunshinemyson thanks for the PR. did an initial review. could you post up an RFC (eg. open a PR to github.com/apache/tvm-rfcs) or pre-RFC to discuss.tvm.ai giving the general approach and some background on your accelerator so we can better understand this PR? it's quite large and some context would be helpful. you might also consider committing just a couple ops initially so we can see/validate the approach, then accept the rest as follow-on PRs.

@@ -0,0 +1,30 @@
if(NOT USE_VSI_NPU STREQUAL "OFF")
Contributor

Can you update cmake/config.cmake to contain the default and an example docstring? Also, can you make this work for the case where USE_VSI_NPU is not set? I think STREQUAL "OFF" would fail in that case (but I am not a CMake expert).

Author

I'll fix it.
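For reference, TVM backends conventionally declare the default in cmake/config.cmake with a short docstring comment; a sketch along those lines (exact wording illustrative) would be:

```cmake
# Whether to enable the VeriSilicon NPU BYOC codegen and runtime.
# Possible values:
# - ON: enable the VSI NPU backend
# - OFF: disable the VSI NPU backend (default)
set(USE_VSI_NPU OFF)
```

With the default declared, the guard can simply be `if(USE_VSI_NPU)`, which also behaves sensibly when the variable is empty or unset (an unset variable name in `STREQUAL` is compared as the literal string, which is why the reviewer expects `NOT USE_VSI_NPU STREQUAL "OFF"` to misfire).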

endif()

set(OVXLIB_API_ATTR "__attribute__\(\(visibility\(\"default\"\)\)\)")
add_definitions(-DOVXLIB_API=${OVXLIB_API_ATTR})
Contributor

can you add this to a specific target? same for the below definition

Author

this is not required. Will remove it.


import tvm._ffi

tvm._ffi._init_api("relay.vsi_npu.support", __name__) No newline at end of file
Contributor

nit: add a newline at the end of the file. here and elsewhere.

Author

will fix it.

@@ -0,0 +1,97 @@
# VeriSilicon NPU solution on TVM
Contributor

would you like to contribute some of this content as a tutorial in tutorials/ somewhere?

Author

I'll try to use the GitHub wiki if possible.

cd target_runtime_build
cp ../cmake/config.cmake ./
# add set(USE_VSI_NPU ON) to config.cmake, you can do it with cmake command option too
cmake -DCMAKE_BUILD_TYPE=Debug -DTIM_VX_INSTALL_DIR=<full_path_to_tim_vx_target_build_install_dir> \
Contributor

just wondering how come you do a Debug build to deploy?

Author

We don't use Debug builds in deployment. I'll refine it.
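Concretely, the README command above would presumably switch to a Release build for deployment; a sketch of the adjusted invocation (the install-dir placeholder is kept from the README):

```shell
# Use an optimized Release build for deployment; Debug is for development only
cmake -DCMAKE_BUILD_TYPE=Release \
      -DTIM_VX_INSTALL_DIR=<full_path_to_tim_vx_target_build_install_dir> \
      ..
```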

// TODO
};

inline int32_t ConvertAxis(int32_t axisIn, uint32_t dimNum) {
Contributor

can you add comments about what this is converting from and to?

Author

Will fix.
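For readers of the thread: such a comment would state the two axis conventions involved. Relay indexes axes outermost-first (NCHW-style), while TIM-VX describes tensors with the innermost dimension first, so a likely reading of ConvertAxis (an assumption based on the name and signature, not confirmed by the diff) is a mirror of the axis index, sketched here in Python:

```python
def convert_axis(axis_in: int, dim_num: int) -> int:
    """Hypothetical sketch: map a Relay axis (outermost-first, possibly
    negative) onto a reversed, innermost-first layout such as TIM-VX's."""
    if axis_in < 0:
        axis_in += dim_num          # normalize negatives, e.g. -1 -> dim_num - 1
    return dim_num - axis_in - 1    # mirror the index across the rank
```

For a rank-4 tensor this sends axis 0 to 3 and axis 3 (or -1) to 0, which is the behavior the requested comment should pin down.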

});
};

std::shared_ptr<tvx::Tensor> createVxOPerand(TensorInfoTable tensor_info,
Contributor

Can you follow the C++ style guide here https://google.github.io/styleguide/cppguide.html#Function_Names? e.g. CreateVxOperand

Author

Will fix.

return specs;
}

using namespace backend;
Contributor

Author

Will fix.


return relay.testing.init.create_workload(net)

print("Testing {0: <50}".format("ADD"), end="")
Contributor

Suggest using the logging package for compatibility with pytest.

Author

Actually, we didn't use pytest in the test scripts. Do we need to refactor all the tests with pytest?

verify_vsi_result(inputs, quantize, [], data_shape, data_shape, output_dtype)

if __name__ == "__main__":
#test_qnn_add()
Contributor

replace all of this with:
sys.exit(pytest.main([__file__] + sys.argv[1:]))
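Putting the two review suggestions together, a pytest-compatible test file would look roughly like this sketch (the test name is modeled on the thread's test_qnn_add; the body is a placeholder, since the real test calls verify_vsi_result):

```python
# Sketch of a pytest-style test: logging replaces bare print() so pytest
# can capture and route the output.
import logging

logging.basicConfig(level=logging.INFO)
LOG = logging.getLogger("test_vsi_npu")


def test_qnn_add():
    LOG.info("Testing %-50s", "ADD")
    # the real test would call verify_vsi_result(inputs, quantize, ...) here
    assert 1 + 1 == 2  # placeholder assertion standing in for result checks
```

The reviewer's `if __name__ == "__main__": sys.exit(pytest.main([__file__] + sys.argv[1:]))` stanza then replaces the hand-written main block, so running the file directly is equivalent to invoking pytest on it.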

@comaniac
Contributor

Echoing @areusch: it's impossible to review a PR with 5k lines of code changes.

@masahi
Member

masahi commented Sep 22, 2021

@areusch @comaniac I'm happy to help review this PR and the RFC (I'm familiar with the BYOC flow)

@masahi masahi self-assigned this Sep 22, 2021
@comaniac
Contributor

Thanks @masahi. There is no doubt that you are very familiar with BYOC, and I would be comfortable merging the PR if you approve.

Meanwhile, following the process of filing an RFC with an upstream plan that gradually adds features is always the better practice. Small PRs easily get second or even third reviews, which makes the final framework more robust and bug-free. I would rather spend one hour every day reviewing and merging one small or medium-size PR than spend days or even months iterating on a single huge PR.

@sunshinemyson
Author

> @sunshinemyson thanks for the PR. did an initial review. could you post up an RFC (eg. open a PR to github.com/apache/tvm-rfcs) or pre-RFC to discuss.tvm.ai giving the general approach and some background on your accelerator so we can better understand this PR? it's quite large and some context would be helpful. you might also consider committing just a couple ops initially so we can see/validate the approach, then accept the rest as follow-on PRs.

https://discuss.tvm.apache.org/t/byoc-add-verisilicons-npu-support/11097?u=sven

zhouheng.zheng and others added 4 commits November 26, 2021 22:45
* Add readme for VSI-NPU

Signed-off-by: xiang.zhang <xiang.zhang@verisilicon.com>

* Update README.VSI.md

Fix wrong env variable name

* pytorch model support

* add dropout op

* add type constraint for RemoveClipAfterRequantize

* fix Conv2D op for depthwise conv2d

Co-authored-by: xiang.zhang <xiang.zhang@verisilicon.com>
Co-authored-by: zhouheng.zheng <zhouheng.zheng@ouotlook.com>
1. fix per-axis quantized conv2d/dense export issue on vsi_npu

model: pytorch quantized googlenet
@areusch
Contributor

areusch commented Mar 24, 2022

hey @sunshinemyson what's the status of this PR? is it something you want to contribute still? we have a weekly TVM Community Meeting now which could be a great forum to present your design in a high-bandwidth setting.

@areusch
Contributor

areusch commented Apr 8, 2022

closing due to inactivity, but feel free to reopen!

@areusch areusch closed this Apr 8, 2022
@ChingHan0921

Hello, we are using the ai-benchmark model (mobilenet_v3_quant.tflite) for inference testing, but we are unable to run the inference successfully. May I ask if there are any plans to support inference with MobileNetV3?
