[microTVM] Gemmini code generation using microTVM #13770
Conversation
Thanks for contributing to TVM! Please refer to the contributing guidelines https://tvm.apache.org/docs/contribute/ for useful information and tips. Please request code reviews from Reviewers by @-ing them in a comment. Generated by tvm-bot

@fzi-peccia FYI, I plan to review this PR this week.
mehrdadh left a comment
I did a first pass.
apps/microtvm/gemmini/README.md
Outdated
@@ -0,0 +1,3 @@
This directory contains code to generate code for the Gemmini accelerator using microTVM. These tests are then executed on the Spike RISC-V ISA simulator.

In order to use this correctly, the Spike simulator has to be installed. This can be done by following the steps found on the Chipyard repository.
Link to the instructions is missing.
@@ -0,0 +1,68 @@
include $(abs_top_srcdir)/Makefrag
Could you consolidate all the Makefiles into a single Makefile.template and specialize it based on the project type in the generate_project step? A sketch of that approach is shown below.
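For illustration, a minimal sketch of what that consolidation could look like, reusing the string.Template pattern already used for model.h below; the file names and template keys here are hypothetical, not the PR's actual code:

```python
from pathlib import Path
from string import Template


def generate_makefile(template_dir: Path, project_dir: Path, project_type: str) -> None:
    # Hypothetical per-project-type values; the shared Makefile.template
    # would contain placeholders such as $main_source and $extra_cflags.
    template_values = {
        "main_source": f"{project_type}_main.c",
        "extra_cflags": "-DGEMMINI_DEBUG" if project_type == "host_driven" else "",
    }
    makefile_template = Template((template_dir / "Makefile.template").read_text())
    (project_dir / "Makefile").write_text(makefile_template.substitute(template_values))
```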
with open(source_dir / "model.h", "w") as f:
    f.write(model_h_template.substitute(template_values))

# Arduino ONLY recognizes .ino, .ccp, .c, .h
Remove? This comment looks like a leftover from the Arduino template.
| """Changes all #include statements in project_dir to be relevant to their | ||
| containing file's location. | ||
|
|
||
| Arduino only supports includes relative to a file's location, so this |
Fix the function description; it still refers to Arduino rather than Gemmini. For context, a sketch of what such a helper does is below.
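A minimal sketch of what an include-rewriting helper like this typically does; the function name and extension list are assumptions, not the PR's actual code:

```python
import os
import re
from pathlib import Path

INCLUDE_RE = re.compile(r'#include\s+"([^"]+)"')


def rewrite_includes(project_dir: Path) -> None:
    """Rewrite quoted #include paths so each one is relative to the
    file that contains it, instead of the project root."""
    for src in project_dir.rglob("*"):
        if not src.is_file() or src.suffix not in {".c", ".h", ".cpp"}:
            continue

        def _fix(match: re.Match) -> str:
            header = project_dir / match.group(1)
            if not header.exists():
                # Leave system or already-correct includes untouched.
                return match.group(0)
            rel = os.path.relpath(header, src.parent)
            return f'#include "{Path(rel).as_posix()}"'

        src.write_text(INCLUDE_RE.sub(_fix, src.read_text()))
```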
cmake/modules/contrib/Gemmini.cmake
Outdated
@@ -0,0 +1,117 @@
if(USE_MICRO)
I think this should be gated by a separate flag that is disabled by default, e.g. USE_GEMMINI.
@@ -0,0 +1,395 @@
{
The tutorial files should move to somewhere under gallery/how_to/. You also need to convert them to .py files written in Sphinx format. We now support notebook generation and Google Colab, so you can even add cells that install all the dependencies and run the tutorial in Colab. A skeleton of that format is sketched below.
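For reference, gallery tutorials in TVM are plain .py files whose RST docstring and comment blocks are rendered by Sphinx; a minimal skeleton (title and prose here are placeholders) looks roughly like this:

```python
"""
Running a network on the Gemmini accelerator with microTVM
===========================================================
This placeholder paragraph is rendered as the tutorial's opening text
by Sphinx.
"""

######################################################################
# Install dependencies
# --------------------
# Comment blocks like this one become prose sections in the rendered
# page; when the tutorial is converted to a notebook (e.g. for Google
# Colab), they become text cells and the code below becomes code cells.

import tvm
from tvm import relay

######################################################################
# Build the model
# ---------------
# Ordinary Python code goes here and is rendered as a code cell.
print(tvm.__version__)
```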
Thanks for the feedback, I am on it
mehrdadh left a comment
I did another pass. Thanks @fzi-peccia!
@@ -0,0 +1,74 @@
# Licensed to the Apache Software Foundation (ASF) under one
It is highly recommended to use CMake instead of a plain Makefile here, to keep the build cross-platform compatible.
ENV = Environment.instance()


def create_header_file(
Why not reuse the existing create_header_file function in TVM?
I changed it to use the standard create_header_file from tvm.micro.testing.utils, but I changed one line in it to generate a #define instead of a const. I think this should be a #define, but if that is not the case, I will need to keep using my own create_header_file. A sketch of the difference follows.
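To illustrate the difference being discussed, a simplified sketch of such a header generator that emits the tensor length as a #define rather than a const; this is not the actual tvm.micro.testing.utils code, and the names are illustrative:

```python
import pathlib

import numpy as np


def create_header_file(name: str, tensor: np.ndarray, output_dir: str) -> None:
    # Emit the length as a #define so it can size static arrays at
    # compile time; the alternative is `const size_t <name>_len = ...;`.
    header = pathlib.Path(output_dir) / f"{name}.h"
    values = ", ".join(str(v) for v in tensor.flatten())
    header.write_text(
        f"#define {name.upper()}_LEN {tensor.size}\n"
        # Assumes an integer dtype such as int8, which maps to int8_t.
        f"static const {tensor.dtype}_t {name}[{name.upper()}_LEN] = {{ {values} }};\n"
    )
```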
Thanks for the feedback @mehrdadh. I will work on these changes this week and let you know when everything is applied.
Hi @mehrdadh, all tests have passed except these two:
Added an integration that generates C code able to execute neural networks on the Gemmini accelerator. Information about this can be found on this post. A rough sketch of how the integration is driven is shown below.
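Purely as an illustration of the flow described above, a rough sketch using the microTVM project API; the "gemmini" template-project name and the empty options dict are assumptions based on this PR, not confirmed API:

```python
import tvm
import tvm.micro
from tvm import relay

# Toy Relay module standing in for a real network.
data = relay.var("data", shape=(1, 16), dtype="int8")
mod = tvm.IRModule.from_expr(relay.Function([data], relay.nn.relu(data)))

# Build with the C backend; a real flow would rely on the Gemmini-specific
# code generation added by this PR.
with tvm.transform.PassContext(opt_level=3):
    lowered = relay.build(mod, target="c")

# "gemmini" as a template-project name is an assumption based on this PR.
template = tvm.micro.get_microtvm_template_projects("gemmini")
project = tvm.micro.generate_project(template, lowered, "./gemmini_project", {})
project.build()  # the resulting binary would then run on the Spike simulator
```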