
[VTA][Relay] Reshape mismatch and Compile fail when running yolo-v3-tiny model in tutorial deploy_detection.py #7301

@codgeek

Description


Steps to reproduce:
Set set(USE_VTA_FSIM ON) in build/config.cmake, then build TVM with VTA enabled on the host.

cd ${TVM_HOME} && python vta/tutorials/frontend/legacy/deploy_detection.py

Description:
I am using the newest code (as of 'Fri Jan 15 07:59:28 2021') with a PYNQ-Z1 board. The error occurs inside graph_pack() when running yolo-v3-tiny in deploy_detection.py.

File "vta/tutorials/frontend/legacy/deploy_detection.py", line 243, in <module>
    stop_name_idx=pack_dict[MODEL_NAME][3],
  File "tvm/vta/python/vta/top/graphpack.py", line 599, in graph_pack
    expr = run_opt_pass(expr, transform.InferType())
  File "tvm/vta/python/vta/top/graphpack.py", line 30, in run_opt_pass
    mod = opt_pass(mod)
  File "tvm/python/tvm/ir/transform.py", line 127, in call
    return _ffi_transform_api.RunPass(self, mod)
  File "tvm/python/tvm/ffi/ctypes/packed_func.py", line 237, in call
    raise get_last_ffi_error()
File "tvm/src/relay/analysis/type_solver.cc", line 622
TVMError: 
  Check failed: false == false: [00:55:37]  tvm/src/relay/op/tensor/transform.cc:703: 
  Check failed: oshape_sum == data_shape_sum (172380 vs. 173056) : Input tensor shape and reshaped shape are not compatible, 
reshape data_shape:[1, 1, 16, 16, 26, 26], oshape:[1, 255, 26, 26]

I've added more shape information to the error messages in src/relay/op/tensor/transform.cc as follows:

diff --git a/src/relay/op/tensor/transform.cc b/src/relay/op/tensor/transform.cc
index ecfde359d..7e150e2c9 100644
--- a/src/relay/op/tensor/transform.cc
+++ b/src/relay/op/tensor/transform.cc
@@ -386,7 +386,7 @@ bool TransposeRel(const Array<Type>& types, int num_inputs, const Attrs& attrs,
   // check dimension match
   ICHECK(!axes.defined() || static_cast<int>(axes.size()) == ndim)
       << "Dimension mismatch: axes has " << axes.size() << " elements"
-      << ", but data.ndim = " << ndim;
+      << ", but data.ndim = " << ndim << ", transpose data_shape:" << data->shape << ", axes:" << axes;
   // construct int_axes
   std::vector<int> int_axes;
   int_axes.reserve(ndim);
@@ -701,7 +701,7 @@ bool ReshapeRel(const Array<Type>& types, int num_inputs, const Attrs& attrs,
   }
   if (!found_dynamic) {
     ICHECK_EQ(oshape_sum, data_shape_sum)
-        << "Input tensor shape and reshaped shape are not compatible";
+        << "Input tensor shape and reshaped shape are not compatible" << ", reshape data_shape:" << data_shape << ", oshape:" << oshape;;
   }
 
   reporter->Assign(types[1], TensorType(oshape, data->dtype));

Problem:
A Relay IR expression is trying to reshape a source tensor of shape [1, 1, 16, 16, 26, 26] into shape [1, 255, 26, 26], but the packed higher dimensions multiply out to 16 * 16 = 256 channels.
I've tried to find the offending op: the %162-th op in the compiled module has an output shape of (1, 255, 26, 26) (see the IR excerpt below). The real question is why its channel dimension was turned into 256 at an earlier stage.
Hoping developers familiar with VTA or Relay can offer some help, thanks a lot!
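
To make the numbers in the error message concrete, here is a quick arithmetic check (my own sketch, not code from the tutorial) of the two element counts compared by ReshapeRel:

# Sanity-check the element counts printed by the ICHECK_EQ in ReshapeRel.
import numpy as np

data_shape = (1, 1, 16, 16, 26, 26)   # packed input shape reported in the error
oshape = (1, 255, 26, 26)             # reshape target shape reported in the error

print(int(np.prod(data_shape)))       # 173056 (16 * 16 = 256 "channels" per 26x26 plane)
print(int(np.prod(oshape)))           # 172380 (255 channels per 26x26 plane)
print(int(np.prod(data_shape)) - int(np.prod(oshape)))  # 676 == 26 * 26, exactly one extra channel plane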

            print("quant modules:", mod)
            # Perform graph packing and constant folding for VTA target
            mod = graph_pack(
                mod["main"],
                env.BATCH,
                env.BLOCK_OUT,
                env.WGT_WIDTH,
                start_name=pack_dict[MODEL_NAME][0],
                stop_name=pack_dict[MODEL_NAME][1],
                start_name_idx=pack_dict[MODEL_NAME][2],
                stop_name_idx=pack_dict[MODEL_NAME][3],
            )
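
As a debugging aid (my own sketch, using only the standard relay.transform.InferType pass, nothing specific to this tutorial), the type-annotated IR can be printed right before graph_pack so the failing reshape can be located by its %-index in the text dump:

# Dump the type-annotated quantized module before graph_pack; search the
# output for reshape(..., newshape=[-1, 255, 26, 26]) to find the offending op.
from tvm import relay

typed_mod = relay.transform.InferType()(mod)  # 'mod' is the quantized module above
print(typed_mod["main"])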

%160 = (%156, %157, %159);
%161 = concatenate(%160, axis=2) /* ty=Tensor[(1, 3, 85, 26, 26), float32] */;
%162 = reshape(%161, newshape=[-1, 255, 26, 26]) /* ty=Tensor[(1, 255, 26, 26), float32] */;
%163 = cast(%111, dtype="int8") /* ty=Tensor[(1, 256, 13, 13), int8] */;
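
For reference, the reshape target's channel count here is 3 * 85 = 255, which does not divide evenly by 16, while the packed data_shape in the error carries 16 * 16 = 256 channels. A small alignment check (my own sketch; the BLOCK_OUT value of 16 is an assumption based on the default vta_config.json):

# Relate the channel counts in the IR above to the VTA block size.
block_out = 16               # assumed default env.BLOCK_OUT (LOG_BLOCK_OUT = 4)
channels = 3 * 85            # from %161's output (1, 3, 85, 26, 26) -> 255
print(channels % block_out)  # 15: 255 channels do not tile evenly into blocks of 16
print(16 * 16)               # 256: the channel count appearing in the packed data_shape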

Related changes:
After some searching, these are the related issues I've found.
@fprotopapa fixed another issue, #7059, in the pack_dict configuration by changing the cast index from 185 to 186. 👍 Did your deploy_detection.py run successfully after this change? Looking forward to hearing about your results, many thanks!

@huajsj added yolo-v3-tiny support in this commit

Some additional configuration details:
I've verified deploy_detection.py on my MacBook Pro (Darwin Kernel Version 16.4.0) and on an Ubuntu Linux server (Ubuntu 18.04.2 LTS, GNU/Linux 5.4.0-58-generic x86_64); the same problem occurs on both.

target: Xilinx PYNQ-Z1

host: MacBook Pro
sys-platform: x86_64, macOS-10.12.3-x86_64-i386-64bit
os-version: Darwin Kernel Version 16.4.0

python: 3.6
llvm: 10.0.0
cmake: 3.15.3
tvm: built from the newest code on 'Fri Jan 15 07:59:28 2021' (https://github.com/apache/tvm/tree/c9474639dd3761b78a457ab274603d87a3dcf9b8)
