This repository was archived by the owner on Nov 17, 2023. It is now read-only.

[v1.x] [MKLDNN] 2 Conv Tests Failed with MKLDNN + ACL #20265

@Zha0q1

Description


I was trying to build MXNet 1.x with MKLDNN and ACL (Arm Compute Library) integration on an Arm instance, using this cmake config file to integrate MKLDNN with ACL. The build performed very well and should benefit MXNet users considerably: I got a 3-4X speedup on Resnet with inputs of shape (16/64, 3, 512, 512) compared to MKLDNN without ACL integration. However, two operator unit tests failed on this build:

test_operator.test_convolution_grouping 
test_operator.test_convolution_independent_gradients
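
For readers unfamiliar with the first test: as far as I understand it, test_convolution_grouping checks that a convolution with num_group > 1 matches slicing the input channels and filters per group, convolving each slice separately, and concatenating the results. The sketch below illustrates that property only. It is pure Python and 1-D, and the helper names (conv1d, grouped_conv1d, split_and_concat) are hypothetical; it is not the MXNet test and does not exercise MKLDNN or ACL.

```python
def conv1d(x, w):
    """Plain 'valid' cross-correlation.
    x: [C_in][L], w: [C_out][C_in][K] -> [C_out][L - K + 1]."""
    c_in, length, k = len(x), len(x[0]), len(w[0][0])
    return [
        [sum(filt[c][j] * x[c][t + j] for c in range(c_in) for j in range(k))
         for t in range(length - k + 1)]
        for filt in w
    ]


def grouped_conv1d(x, w, num_group):
    """Grouped convolution computed by direct indexing.
    w: [C_out][C_in // num_group][K]; output channel o only sees the
    input channels belonging to its group."""
    in_per = len(x) // num_group
    out_per = len(w) // num_group
    length, k = len(x[0]), len(w[0][0])
    out = []
    for o, filt in enumerate(w):
        g = o // out_per  # group that output channel o belongs to
        out.append([
            sum(filt[c][j] * x[g * in_per + c][t + j]
                for c in range(in_per) for j in range(k))
            for t in range(length - k + 1)
        ])
    return out


def split_and_concat(x, w, num_group):
    """Reference path: slice per group, convolve each slice, concatenate."""
    in_per = len(x) // num_group
    out_per = len(w) // num_group
    out = []
    for g in range(num_group):
        x_g = x[g * in_per:(g + 1) * in_per]
        w_g = w[g * out_per:(g + 1) * out_per]
        out.extend(conv1d(x_g, w_g))
    return out


if __name__ == "__main__":
    # 4 input channels, 4 output channels, 2 groups, kernel size 2.
    x = [[1, 2, 3, 4], [0, 1, 0, 1], [2, 2, 2, 2], [1, 0, -1, 0]]
    w = [[[1, 0], [0, 1]], [[1, 1], [1, 1]],
         [[1, -1], [2, 0]], [[0, 1], [1, 0]]]
    print(grouped_conv1d(x, w, 2) == split_and_concat(x, w, 2))
```

If the two paths disagree on a given backend, the grouped code path is the likely suspect, which is the kind of mismatch the failing test would report.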

I tried multiple MKLDNN versions (v1.x currently points to MKLDNN release 2.0 beta 10; I also tried releases 2.1 and 2.2) and ACL versions (20.08, from Aug 2020, and 21.2, the latest), and the failures persisted.

This makes me suspect an integration issue between MXNet and MKLDNN (more likely) or between MKLDNN and ACL. Could someone from the Intel team share some insights on this?

Would you help triage? @szha @leezu

CC @josephevans @mseth10 @waytrue17 @sandeep-krishnamurthy
