[Bug] TVM generates wrong results for ssd300_vgg16 #13296

@saurabh-shandilya

Expected behavior

I am trying to run PyTorch's ssd300_vgg16 model through TVM's VirtualMachine. I followed the instructions in #10050 to get past the initial problems, and the network now runs without errors. However, the results are wrong: they do not match the PyTorch reference output and are far off. The code is based on the Mask R-CNN example provided at https://tvm.apache.org/docs/how_to/deploy_models/deploy_object_detection_pytorch.html. I have tried both the llvm and cuda targets, and the results are wrong on both.

PyTorch reference output:
(tensor([[2.4417e+02, 2.2165e+02, 3.6217e+02, 2.9035e+02],
[2.9954e+02, 2.6913e+02, 4.6809e+02, 3.8972e+02],
[1.9778e+02, 1.9647e+02, 2.5620e+02, 3.2252e+02],
[2.7028e+01, 2.0530e+02, 1.0533e+02, 3.5209e+02],
[3.4073e+02, 2.0001e+02, 4.3230e+02, 3.7326e+02],
[1.4415e+02, 2.0797e+02, 1.6326e+02, 2.6495e+02],
........],
grad_fn=),

tensor([0.9727, 0.9540, 0.8788, 0.8229, 0.8031, 0.7723, 0.5995, 0.5492, 0.5456,
0.5138, 0.3374, 0.2810, 0.1931, 0.1798, 0.1736, 0.1671, 0.1607, 0.1527,
0.1512, 0.1445, 0.1230, 0.1188, 0.1182, 0.1161, 0.1160, 0.1159, 0.1152,
0.1151, 0.1129, 0.1125, 0.1120, 0.1094, 0.1079, 0.1076, 0.1070, 0.1059,
0.1059, 0.1057, 0.1047, 0.1025, 0.1004, 0.0993, 0.0989, 0.0981, 0.0975,
0.0968, 0.0963, 0.0961, 0.0941, 0.0940, 0.0936, 0.0934, 0.0930, 0.0925,
0.0918, 0.0908, 0.0902, 0.0902, 0.0901, 0.0899, 0.0893, 0.0889, 0.0874,
...], grad_fn=),
tensor([ 3, 2, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 3, 1, 1, 10, 1, 1,
3, 1, 1, 1, 3, 1, 1, 3, 1, 1, 1, 85, 10, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 10, 1, 1, 3, 3, 1, 1, 1, 1, 1, 1, 1,
1, 1, 2, 1, 1, 1, 1, 10, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1,
10, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1,
1, 10, 1, 3, 1, 1, 1, 1, 1, 1, 2, 1, 2, 2, 3, 3, 10, 1,
1, 2, 3, 3, 3, 3, 1, 10, 1, 10, 1, 3, 3, 1, 3, 3, 1, 1,
1, 1, 10, 1, 10, 3, 10, 1, 1, 3, 3, 1, 1, 1, 1, 1, 3, 1,
1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 10, 10, 1, 1, 1, 2, 10, 10,
3, 1, 1, 1, 1, 3, 1, 10, 1, 1, 1, 1, 1, 1, 1, 1, 3, 3,
1, 1, 3, 10, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 10, 1, 1, 1,
1, 3]))

Actual behavior

boxes = array([[ 0.8428591, 0. , 19.342596 , 20.993477 ],
[ 0.8428591, 0. , 19.342596 , 20.993477 ],
[ 0. , 0. , 28.269907 , 35.20406 ],
[ 0. , 0. , 28.269907 , 35.20406 ],
[ 0.8428591, 0. , 19.342596 , 20.993477 ],
[ 0. , 0. , 28.269907 , 35.20406 ],
[ 0.8428591, 0. , 19.342596 , 20.993477 ],
[ 0.8428591, 0. , 19.342596 , 20.993477 ],
[ 0.8428591, 0. , 19.342596 , 20.993477 ],
[ 0. , 0. , 28.269907 , 35.20406 ],
....], dtype=float32), <tvm.nd.NDArray shape=(120,), cuda(0)>

score:
array([0.00351303, 0.00333744, 0.00265404, 0.0024982 , 0.00247583,
0.00239549, 0.00220695, 0.00211583, 0.00201931, 0.00191237,
0.00188276, 0.00166292, 0.00166005, 0.00162623, 0.00150306,
0.00141596, 0.00134664, 0.00130558, 0.001287 , 0.00128138,
0.00121021, 0.00118284, 0.00116823, 0.00111245, 0.00106239,
0.00103616, 0.00102645, 0.00102028, 0.00101472, 0.00100795,
0.00100033, 0.00097783, 0.00094324, 0.00092061, 0.00091043,
....],
dtype=float32), <tvm.nd.NDArray shape=(120,), cuda(0)>

classes:
array([ 1, 84, 62, 1, 62, 84, 3, 85, 47, 3, 77, 72, 44, 10, 75, 85, 72,
51, 47, 67, 10, 44, 49, 46, 73, 31, 74, 46, 50, 64, 8, 86, 31, 32,
67, 81, 28, 15, 48, 33, 37, 16, 64, 27, 78, 6, 79, 9, 27, 86, 15,
28, 76, 8, 90, 57, 63, 60, 70, 55, 13, 2, 52, 38, 61, 16, 59, 43,
87, 41, 6, 43, 88, 54, 2, 53, 39, 13, 17, 58, 40, 18, 4, 14, 35,
41, 34, 35, 20, 42, 82, 7, 40, 42, 56, 5, 80, 37, 65, 36, 4, 36,
24, 21, 89, 22, 19, 25, 11, 23, 83, 30, 29, 68, 66, 71, 45, 12, 26,
69], dtype=int64)]
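
To put a number on the mismatch between the two outputs above, a small check along these lines could be used. This is only a sketch: `compare_detections`, its tolerances, and the variable names are hypothetical and not part of TVM or torchvision. It assumes both results have already been converted to plain Python lists (for example via `.detach().numpy().tolist()` on the PyTorch side and `.numpy().tolist()` on the TVM side) and that both are sorted by descending score. The sample values are the first three detections copied from the outputs in this report.

```python
def compare_detections(ref, got, top_k=3, box_tol=1.0, score_tol=1e-2):
    """Return a list of mismatch descriptions for the top_k detections.

    ref and got are (boxes, scores, labels) triples of plain Python lists,
    both assumed to be sorted by descending score. The tolerances are
    arbitrary illustrative values, not recommended thresholds.
    """
    ref_boxes, ref_scores, ref_labels = ref
    got_boxes, got_scores, got_labels = got
    problems = []
    n = min(top_k, len(ref_scores), len(got_scores))
    for i in range(n):
        if int(ref_labels[i]) != int(got_labels[i]):
            problems.append(f"detection {i}: label {ref_labels[i]} vs {got_labels[i]}")
        if abs(ref_scores[i] - got_scores[i]) > score_tol:
            problems.append(
                f"detection {i}: score {ref_scores[i]:.4f} vs {got_scores[i]:.4f}"
            )
        # Report each box at most once if any coordinate is out of tolerance.
        if any(abs(r - g) > box_tol for r, g in zip(ref_boxes[i], got_boxes[i])):
            problems.append(f"detection {i}: box {ref_boxes[i]} vs {got_boxes[i]}")
    return problems


# First three detections from the PyTorch reference output in this report.
torch_out = (
    [[244.17, 221.65, 362.17, 290.35],
     [299.54, 269.13, 468.09, 389.72],
     [197.78, 196.47, 256.20, 322.52]],
    [0.9727, 0.9540, 0.8788],
    [3, 2, 1],
)

# First three detections from the TVM output in this report.
tvm_out = (
    [[0.8428591, 0.0, 19.342596, 20.993477],
     [0.8428591, 0.0, 19.342596, 20.993477],
     [0.0, 0.0, 28.269907, 35.20406]],
    [0.00351303, 0.00333744, 0.00265404],
    [1, 84, 62],
)

problems = compare_detections(torch_out, tvm_out)
for line in problems:
    print(line)
```

Running the same check after each change to the conversion script (for instance with a different target or with some optimization passes disabled) would show whether boxes, scores, or labels diverge first.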

Environment

tvm 0.9.dev0
Windows
torch 1.10.2
torchvision 0.11.3

Steps to reproduce

Take the code from
https://tvm.apache.org/docs/how_to/deploy_models/deploy_object_detection_pytorch.html
and change the torch_model to torchvision.models.detection.ssd300_vgg16.

Any help to fix this is appreciated. Thanks!

Labels: needs-triage, type: bug
