Issue8717 x86 dws conv2d schedule #9092
- Improved implementation of gemm function for conv2d - Removed %4 restriction for channels - Added test case to verify SMLAD intrinsic speed acceleration Signed-off-by: Sergey Smirnov <Sergey@grovety.com>
- Improved implementation of gemm function for conv2d - Removed %4 restriction for channels - Added test case to verify SMLAD intrinsic speed acceleration Signed-off-by: Sergey Smirnov <Sergey@grovety.com>
…-grovety/tvm into update-arm-simd-intrinsic
…-grovety/tvm into update-arm-simd-intrinsic
    return s

def schedule_depthwise_conv2d_nhwc(outs):
@junrushao1994 @mbrookhart is this location ok? i believe this schedule is generic. could you take a look at it?
It looks like x86 always packs depthwise conv to NCHWc, so this will never get hit on an x86 machine. That makes it feel a little misplaced. Any reason not to put it here? https://github.com/apache/tvm/blob/main/python/tvm/topi/generic/conv2d.py
I think that's where a lot of the multi-CPU specialization ends up, i.e., all of the int8 kernels for ARM and x86 go through there.
@sergey-grovety could you add a test to exercise this schedule?
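A test of this kind usually compares the scheduled operator's output against a plain reference. As a rough sketch of what depthwise conv2d in NHWC layout computes, here is a pure-Python reference. The function name `depthwise_conv2d_nhwc_ref` is hypothetical, and the sketch assumes stride 1, no padding, no dilation, a channel multiplier of 1, and a simplified `[KH][KW][C]` kernel layout; it is not the test added in this PR.

```python
def depthwise_conv2d_nhwc_ref(data, kernel):
    """Reference depthwise conv2d on nested lists (hypothetical helper).

    data is [N][H][W][C]; kernel is [KH][KW][C].
    Assumes stride 1, no padding, no dilation, channel multiplier 1.
    """
    n, h, w, c = len(data), len(data[0]), len(data[0][0]), len(data[0][0][0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = h - kh + 1, w - kw + 1
    out = [[[[0] * c for _ in range(ow)] for _ in range(oh)] for _ in range(n)]
    for b in range(n):
        for y in range(oh):
            for x in range(ow):
                for ch in range(c):
                    # Each channel is convolved only with its own filter slice.
                    acc = 0
                    for dy in range(kh):
                        for dx in range(kw):
                            acc += data[b][y + dy][x + dx][ch] * kernel[dy][dx][ch]
                    out[b][y][x][ch] = acc
    return out


# Tiny made-up input: 1 image, 2x2 spatial, 2 channels.
data = [[[[1, 10], [2, 20]], [[3, 30], [4, 40]]]]
kernel = [[[1, 0], [1, 0]], [[1, 1], [1, 1]]]
result = depthwise_conv2d_nhwc_ref(data, kernel)
print(result)
```

Because channels never mix, the innermost loops touch a single channel index, which is what distinguishes this from a regular conv2d.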
…-grovety/tvm into issue8717-x86-DwsConv2d-schedule
areusch
left a comment
thanks @sergey-grovety i'd like to get @mbrookhart or @junrushao1994 's feedback on schedule placement, then we can merge this.
if file["path"] in {"CMSIS/DSP/Include", "CMSIS/DSP/Include/dsp", "CMSIS/NN/Include"}:
    include_trees.update({file["path"]: file["sha"]})

for path, sha in include_trees.items():
do you mind factoring this into a helper function in python/tvm/contrib/download.py? it's fairly complex so would prefer if we can reuse it if needed.
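One possible shape for the filtering part of such a helper, as a minimal sketch. The name `select_include_trees` and the sample entries are hypothetical, and it assumes each entry is a dict with `path` and `sha` keys as in the snippet under review. It is not the code that ended up in python/tvm/contrib/download.py.

```python
def select_include_trees(tree_entries, wanted_paths):
    """Map each wanted path to its sha (hypothetical helper, not a TVM API)."""
    wanted = set(wanted_paths)
    return {
        entry["path"]: entry["sha"]
        for entry in tree_entries
        if entry["path"] in wanted
    }


# Made-up entries for illustration only; the sha values are placeholders.
entries = [
    {"path": "CMSIS/DSP/Include", "sha": "aaa111"},
    {"path": "CMSIS/DSP/Source", "sha": "bbb222"},
    {"path": "CMSIS/NN/Include", "sha": "ccc333"},
]
include_trees = select_include_trees(
    entries, ["CMSIS/DSP/Include", "CMSIS/DSP/Include/dsp", "CMSIS/NN/Include"]
)
for path, sha in include_trees.items():
    print(path, sha)
```

Keeping the filtering separate from any network access makes it easy to unit-test with canned entries like the ones above.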
…m/sergey-grovety/tvm into issue8717-x86-DwsConv2d-schedule" This reverts commit e927567, reversing changes made to 0ccb5a0.
This reverts commit 32ede71.
fix format
move schedule_depthwise_conv2d_nhwc to generic conv2d, add test for schedule_depthwise_conv2d_nhwc
fix test_export_model_library_format_workspace
use x86 depthwise_conv2d_nhwc schedule for arm_cpu
Add x86 schedule for depthwise_conv2d_nhwc
# Conflicts:
# python/tvm/relay/op/strategy/arm_cpu.py
47dbcb8 to e64aea9
…chedule_depthwise_conv2d_nhwc
fix format
Revert "fix test_export_model_library_format_workspace"
added a missing comma
…7-x86-DwsConv2d-schedule
areusch
left a comment
thanks @sergey-grovety!
Merging due to separate reports of problems with WinMacBuild.
* [microTVM] Update support for ARMv7m intrinsic
  - Improved implementation of gemm function for conv2d
  - Removed %4 restriction for channels
  - Added test case to verify SMLAD intrinsic speed acceleration
  Signed-off-by: Sergey Smirnov <Sergey@grovety.com>
* [microTVM] Update support for ARMv7m intrinsic
  - Improved implementation of gemm function for conv2d
  - Removed %4 restriction for channels
  - Added test case to verify SMLAD intrinsic speed acceleration
  Signed-off-by: Sergey Smirnov <Sergey@grovety.com>
* Issue 8717 Add schedule for depthwise_conv2d_nhwc
* Implemented discussed changes.
* Removed unnecessary test files.
* Formatting fixed.
* Formatting fixed2.
* Formatting fixed3.
* Formatting fixed4.
* Formatting fixed5.
* Fixed test time result checking.
* Check rebuild.
* Formatting fixed.
* Formatting fixed.
* Add default DepthwiseConv2D schedule in NHWC layout for arm cpu
* Fixed micro model library test. Checking size reduced to 16 bytes from 2466816.
* Revert "Merge branch 'update-arm-simd-intrinsic' of https://github.com/sergey-grovety/tvm into issue8717-x86-DwsConv2d-schedule"
  This reverts commit e927567, reversing changes made to 0ccb5a0.
* Revert "fix test_export_model_library_format_workspace"
  This reverts commit 32ede71.
  fix format
  move schedule_depthwise_conv2d_nhwc to generic conv2d, add test for schedule_depthwise_conv2d_nhwc
  fix test_export_model_library_format_workspace
  use x86 depthwise_conv2d_nhwc schedule for arm_cpu
  Add x86 schedule for depthwise_conv2d_nhwc
  # Conflicts:
  # python/tvm/relay/op/strategy/arm_cpu.py
* move schedule_depthwise_conv2d_nhwc to generic conv2d, add test for schedule_depthwise_conv2d_nhwc
  fix format
  Revert "fix test_export_model_library_format_workspace"
  added a missing comma
* Revert wrong merge changes
* empty commit to force pipeline restart
* Add condition to use compute_at for generic schedule_depthwise_conv2d_nhwc
Co-authored-by: Sergey Smirnov <Sergey.Smirnov@mir.dev>
Co-authored-by: Alex-grovety <Alexey.Yazev@mir.dev>
Issue #8717
Add x86 schedule for depthwise_conv2d_nhwc
use x86 depthwise_conv2d_nhwc schedule for arm_cpu