Support verisilicon's NPU with BYOC framework #9046
Conversation
workflow:
1. Define the patterns that can be fused by the NPU, in Python.
2. Generate the compiled model through the BYOC implementation.
3. Execute the compiled model with a simple runtime implementation.
Signed-off-by: xiang.zhang <xiang.zhang@verisilicon.com>
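The first step of the workflow above could be sketched as follows. In real TVM BYOC code these would be dataflow patterns registered through a pattern table; here they are shown as plain data so the structure is visible without a TVM install, and all pattern names and op sequences are illustrative assumptions, not the PR's actual patterns:

```python
# Sketch of step 1: declaring which op sequences a hypothetical
# "vsi_npu" codegen claims it can fuse. Shown as plain
# (pattern_name, op_sequence) pairs rather than tvm.relay dataflow
# patterns; every name below is an assumption for illustration.

def vsi_npu_pattern_table():
    """Return (pattern_name, op_sequence) pairs the codegen claims."""
    return [
        ("vsi_npu.qnn_conv2d", ["qnn.conv2d", "nn.bias_add", "qnn.requantize"]),
        ("vsi_npu.qnn_dense", ["qnn.dense", "nn.bias_add", "qnn.requantize"]),
    ]
```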
Hi @mbaret, could you help review our PR for new NPU support? Thanks

@comaniac would you like to help review this too?
areusch left a comment
@sunshinemyson thanks for the PR. did an initial review. could you post up an RFC (eg. open a PR to github.com/apache/tvm-rfcs) or pre-RFC to discuss.tvm.ai giving the general approach and some background on your accelerator so we can better understand this PR? it's quite large and some context would be helpful. you might also consider committing just a couple ops initially so we can see/validate the approach, then accept the rest as follow-on PRs.
```cmake
if(NOT USE_VSI_NPU STREQUAL "OFF")
```
Can you update cmake/config.cmake to contain the default and an example docstring? Also, can you make this work for the case where USE_VSI_NPU is not set? I think STREQUAL "OFF" would fail in that case (but I am not a CMake expert).
I'll fix it.
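A config.cmake entry along the lines requested might look like this (the docstring wording is an assumption). Note that if `USE_VSI_NPU` is undefined, it expands to an empty string, so `NOT "" STREQUAL "OFF"` evaluates true and would wrongly enable the backend; a plain `if(USE_VSI_NPU)` guard avoids that, since an unset variable is falsy:

```cmake
# Whether to enable VeriSilicon NPU support via BYOC.
# Possible values: ON / OFF (default OFF).
set(USE_VSI_NPU OFF)
```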
cmake/modules/contrib/VsiNpu.cmake (outdated)
```cmake
endif()

set(OVXLIB_API_ATTR "__attribute__\(\(visibility\(\"default\"\)\)\)")
add_definitions(-DOVXLIB_API=${OVXLIB_API_ATTR})
```
can you add this to a specific target? same for the below definition
this is not required. Will remove it.
```python
import tvm._ffi

tvm._ffi._init_api("relay.vsi_npu.support", __name__)
```
(no newline at end of file)
nit: add a newline at the end of the file. here and elsewhere.
will fix it.
```
# Versilicon NPU solution on TVM
```
would you like to contribute some of this content as a tutorial in tutorials/ somewhere?
I'll try to use GitHub wiki if possible.
```shell
cd target_runtime_build
cp ../cmake/config.cmake ./
# add set(USE_VSI_NPU ON) to config.cmake; you can do it with a cmake command option too
cmake -DCMAKE_BUILD_TYPE=Debug -DTIM_VX_INSTALL_DIR=<full_path_to_tim_vx_target_build_install_dir> \
```
just wondering how come you do a Debug build to deploy?
We don't use debug in deployment. I'll refine it.
```cpp
  // TODO
};

inline int32_t ConvertAxis(int32_t axisIn, uint32_t dimNum) {
```
can you add comments about what this is converting from and to?
Will fix.
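For reference, one plausible reading of such a helper, sketched in Python: many NPU runtimes in the TIM-VX style store dimensions in reversed (innermost-first) order relative to the framework's C-order layout, so an axis index must be mirrored. The exact semantics of the PR's ConvertAxis are an assumption here:

```python
def convert_axis(axis: int, dim_num: int) -> int:
    """Map a C-order (NCHW-style) axis to the reversed dimension
    order assumed to be used by the NPU runtime.

    Negative axes are normalized first, following NumPy conventions.
    This is a hypothetical reconstruction, not the PR's exact code.
    """
    if axis < 0:
        axis += dim_num
    assert 0 <= axis < dim_num, "axis out of range"
    return dim_num - axis - 1
```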
```cpp
  });
};

std::shared_ptr<tvx::Tensor> createVxOPerand(TensorInfoTable tensor_info,
```
Can you follow the C++ style guide here https://google.github.io/styleguide/cppguide.html#Function_Names? e.g. CreateVxOperand
Will fix.
```cpp
  return specs;
}

using namespace backend;
```
prefer to avoid this https://google.github.io/styleguide/cppguide.html#Namespaces
Will fix.
```python
    return relay.testing.init.create_workload(net)

print("Testing {0: <50}".format("ADD"), end="")
```
suggest to use logging package for compat with pytest
Actually, we didn't use pytest in the test scripts. I think we need to refactor all the tests with pytest?
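A logging-based variant of the print banner, as suggested, could look like this (the logger name is an assumption; the helper returns the message so it can be asserted on in tests):

```python
import logging

logger = logging.getLogger("test_vsi_npu")  # hypothetical logger name

def report(op_name: str) -> str:
    """Log the same fixed-width test banner the print() call produced,
    and return the message for use in assertions."""
    msg = "Testing {0: <50}".format(op_name)
    logger.info(msg)
    return msg
```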
```python
verify_vsi_result(inputs, quantize, [], data_shape, data_shape, output_dtype)

if __name__ == "__main__":
    # test_qnn_add()
```
replace all of this with:
```python
sys.exit(pytest.main([__file__] + sys.argv[1:]))
```
Echo @areusch. It's impossible to review a PR with 5k code changes.
Thanks @masahi. There's no doubt that you are very familiar with BYOC, and I would be comfortable merging the PR if you approve. Meanwhile, following the process of filing an RFC with an upstream plan that gradually adds features is always better practice. Small PRs easily get second or even third reviews, and can make the final framework more robust and bug-free. I would rather, and be happy to, spend one hour every day reviewing and merging one small/medium size PR, instead of spending days or even months on a PR with many iterations.
https://discuss.tvm.apache.org/t/byoc-add-verisilicons-npu-support/11097?u=sven |
* Add readme for VSI-NPU
* Update README.VSI.md: fix wrong env variable name
* pytorch model support
* add dropout op
* add type constraint for RemoveClipAfterRequantize
* fix Conv2D op for depthwise conv2d
* fix per-axis quantized conv2d/dense export issue on vsi_npu; model: pytorch quantized googlenet

Signed-off-by: xiang.zhang <xiang.zhang@verisilicon.com>
Co-authored-by: xiang.zhang <xiang.zhang@verisilicon.com>
Co-authored-by: zhouheng.zheng <zhouheng.zheng@ouotlook.com>
hey @sunshinemyson what's the status of this PR? is it something you want to contribute still? we have a weekly TVM Community Meeting now which could be a great forum to present your design in a high-bandwidth setting.
closing due to inactivity, but feel free to reopen!
Hello, we are using the ai-benchmark model (mobilenet_v3_quant.tflite) for inference testing, but we are unable to run the inference successfully. May I ask if there are any plans to support inference with MobileNetV3?