Schedule Transferability between Intel and ARM CPU targets #5340

Relevant discuss post - https://discuss.tvm.ai/t/topi-using-x86-schedules-for-arm-conv2d/6365

Currently, TVM has different conv2d schedules for ARM and Intel CPUs. The discuss post listed above shows that, for many TFLite networks, running the Intel conv2d NCHWc schedule on ARM gives better end-to-end latency than the ARM NCHW conv2d spatial pack schedule.

However, this is just one opportunity, and there are more ideas that we should pursue. This issue lists those ideas, which came out of the discussion in the above post. Anybody interested can pick them up.

- [ ] Try ARM winograd NCHW schedule on Intel servers.
- [ ] Write NCHWc winograd for Intel servers to comply with the existing NCHWc conv2d implementation - work started by @ajtulloch - https://github.com/apache/incubator-tvm/pull/2111
- [ ] Work on NHWC ARM schedule - tuning and optimization - work started by @jackwish - https://github.com/apache/incubator-tvm/pull/3859
- [ ] Investigate NHWC vs NCHWc schedule - NCHWc can introduce data layout transforms. Check whether NHWC conv2d can achieve the same performance as NCHWc conv2d and thereby eliminate the data layout conversion overhead.
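
For readers less familiar with the layouts in the last item, here is a minimal NumPy sketch of what an NCHW to NCHWc transform does to a tensor. It is only an illustration, not TVM's implementation: the helper names and the block size of 4 are assumptions chosen for the example.

```python
import numpy as np

def nchw_to_nchwc(x, c_block):
    """Pack an NCHW tensor into NCHW[c_block]c (hypothetical helper)."""
    n, c, h, w = x.shape
    assert c % c_block == 0, "channel count must be divisible by the block size"
    # Split C into (C_outer, c_block), then move c_block to the innermost axis.
    return x.reshape(n, c // c_block, c_block, h, w).transpose(0, 1, 3, 4, 2)

def nchwc_to_nchw(x):
    """Unpack NCHW[x]c back to plain NCHW (hypothetical helper)."""
    n, c_outer, h, w, c_block = x.shape
    return x.transpose(0, 1, 4, 2, 3).reshape(n, c_outer * c_block, h, w)

def nchw_to_nhwc(x):
    """NHWC only needs the channel axis moved last; no blocking is involved."""
    return x.transpose(0, 2, 3, 1)

# Small example tensor: batch 1, 8 channels, 2x2 spatial.
x = np.arange(1 * 8 * 2 * 2, dtype=np.float32).reshape(1, 8, 2, 2)
packed = nchw_to_nchwc(x, c_block=4)   # block size 4 is an arbitrary choice
restored = nchwc_to_nchw(packed)
nhwc = nchw_to_nhwc(x)
```

When a graph is fed NCHW or NHWC data, packing and unpacking steps like these have to run around the NCHWc conv2d operators. That extra work is the conversion overhead a well-performing NHWC schedule would avoid.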

@FrozenGene @masahi @tqchen 
