My environment:
Linux ziran-pc 5.5.6-1-MANJARO #1 SMP Mon Feb 24 09:24:51 UTC 2020 x86_64 GNU/Linux
CUDA Version: 10.2
Python 3.8.1
gcc (Arch Linux 9.2.1+20200130-2) 9.2.1 20200130
Here is my code, which uses the resnet18v1 ONNX model.
import onnx
import tvm
from tvm import relay
from tvm.relay.quantize import quantize
from tvm.contrib.debugger import debug_runtime

# Assumption: ctx is not defined in the original snippet; a CUDA context is presumed.
ctx = tvm.gpu(0)

resnetv1 = onnx.load('models/resnet18v1.onnx')
input_blob = resnetv1.graph.input[0]
# Read the static input shape from the ONNX graph.
input_shape = tuple(d.dim_value for d in input_blob.type.tensor_type.shape.dim)
shape_dict = {input_blob.name: input_shape}
mod_resnetv1, params_resnetv1 = relay.frontend.from_onnx(resnetv1, shape_dict)
mod_q_resnetv1 = quantize(mod_resnetv1, params_resnetv1)
graph, mod, params = relay.build_module.build(mod_q_resnetv1, target='cuda', params=params_resnetv1)

val_data = get_val_data()  # my own validation data loader (not shown)
for i, batch in enumerate(val_data):
    if i > 0:
        break  # profile a single batch only
    data, categories = batch['data'], batch['label']
    m = debug_runtime.create(graph, mod, ctx, dump_root='tvmdbg')
    m.set_input('data', tvm.nd.array(data.astype('float32')))
    m.run()
    tvm_out = m.get_output(0, tvm.nd.empty((1, 1000), 'float32')).asnumpy()
Output with TVM at commit "[Fix] Fix get_valid_count flaky test for cuda (#4901)":
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:92: Iteration: 0
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #0 fused_nn_conv2d_multiply_add_nn_relu: 1685.52 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #1 fused_nn_max_pool2d_1: 32.843 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #2 fused_multiply_round_clip_cast: 13.9443 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #3 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036_: 320.88 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #4 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__3: 321.255 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #5 fused_multiply_round_clip_cast_cast_left_shift_multiply_add_right_shift_cast_add_2320814265661055830_: 16.196 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #6 fused_cast_25: 12.0867 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #7 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__1: 319.658 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #8 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__4: 322.954 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #9 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__2: 15.1093 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #10 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__2: 63.3707 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #11 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__2: 482.38 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #12 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__5: 508.352 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #13 fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__2: 12.5682 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #14 fused_cast_24: 10.7158 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #15 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__3: 506.871 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #16 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__6: 510.042 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #17 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__1: 12.363 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #18 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__1: 77.2029 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #19 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__4: 691.62 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #20 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__7: 532.286 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #21 fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__1: 10.7689 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #22 fused_cast_23: 9.9673 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #23 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__5: 538.167 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #24 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__8: 540.056 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #25 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948_: 11.4951 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #26 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588_: 104.663 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #27 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__6: 962.534 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #28 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__9: 1023.26 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #29 fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089_: 9.9758 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #30 fused_cast_22: 9.3292 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #31 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__7: 1025.56 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #32 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__10: 1024.85 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #33 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast: 10.0607 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #34 fused_nn_global_avg_pool2d_cast_multiply: 12.0975 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #35 fused_nn_batch_flatten_nn_batch_flatten_multiply: 9.2545 us/iter
[22:55:22] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #36 fused_nn_dense_nn_bias_add: 21.2773 us/iter
Node Name Ops Time(us) Time(%) Shape Inputs Outputs
--------- --- -------- ------- ----- ------ -------
fused_nn_conv2d_multiply_add_nn_relu fused_nn_conv2d_multiply_add_nn_relu 1685.52 14.294 (1, 64, 112, 112) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__7 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__7 1025.56 8.697 (1, 512, 7, 7) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__10 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__10 1024.85 8.691 (1, 512, 7, 7) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__9 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__9 1023.26 8.678 (1, 512, 7, 7) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__6 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__6 962.534 8.163 (1, 512, 7, 7) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__4 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__4 691.62 5.865 (1, 256, 14, 14) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__8 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__8 540.056 4.58 (1, 256, 14, 14) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__5 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__5 538.167 4.564 (1, 256, 14, 14) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__7 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__7 532.286 4.514 (1, 256, 14, 14) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__6 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__6 510.042 4.325 (1, 128, 28, 28) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__5 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__5 508.352 4.311 (1, 128, 28, 28) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__3 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__3 506.871 4.299 (1, 128, 28, 28) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__2 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__2 482.38 4.091 (1, 128, 28, 28) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__4 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__4 322.954 2.739 (1, 64, 56, 56) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__3 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__3 321.255 2.724 (1, 64, 56, 56) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036_ fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036_ 320.88 2.721 (1, 64, 56, 56) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__1 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__1 319.658 2.711 (1, 64, 56, 56) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588_ fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588_ 104.663 0.888 (1, 512, 7, 7) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__1 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__1 77.203 0.655 (1, 256, 14, 14) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__2 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__2 63.371 0.537 (1, 128, 28, 28) 4 1
fused_nn_max_pool2d_1 fused_nn_max_pool2d_1 32.843 0.279 (1, 64, 56, 56) 1 1
fused_nn_dense_nn_bias_add fused_nn_dense_nn_bias_add 21.277 0.18 (1, 1000) 3 1
fused_multiply_round_clip_cast_cast_left_shift_multiply_add_right_shift_cast_add_2320814265661055830_ fused_multiply_round_clip_cast_cast_left_shift_multiply_add_right_shift_cast_add_2320814265661055830_ 16.196 0.137 (1, 64, 56, 56) 2 1
fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__2 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__2 15.109 0.128 (1, 64, 56, 56) 2 1
fused_multiply_round_clip_cast fused_multiply_round_clip_cast 13.944 0.118 (1, 64, 56, 56) 1 1
fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__2 fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__2 12.568 0.107 (1, 128, 28, 28) 2 1
fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__1 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__1 12.363 0.105 (1, 128, 28, 28) 2 1
fused_nn_global_avg_pool2d_cast_multiply fused_nn_global_avg_pool2d_cast_multiply 12.097 0.103 (1, 512, 1, 1) 1 1
fused_cast_25 fused_cast_25 12.087 0.103 (1, 64, 56, 56) 1 1
fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948_ fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948_ 11.495 0.097 (1, 256, 14, 14) 2 1
fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__1 fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__1 10.769 0.091 (1, 256, 14, 14) 2 1
fused_cast_24 fused_cast_24 10.716 0.091 (1, 128, 28, 28) 1 1
fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast 10.061 0.085 (1, 512, 7, 7) 2 1
fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089_ fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089_ 9.976 0.085 (1, 512, 7, 7) 2 1
fused_cast_23 fused_cast_23 9.967 0.085 (1, 256, 14, 14) 1 1
fused_cast_22 fused_cast_22 9.329 0.079 (1, 512, 7, 7) 1 1
fused_nn_batch_flatten_nn_batch_flatten_multiply fused_nn_batch_flatten_nn_batch_flatten_multiply 9.254 0.078 (1, 512) 1 1
Total_time - 11791.534 - - - -
Output with TVM at commit "[Relay][AutoTVM] Relay op strategy (#4644)":
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:92: Iteration: 0
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #0 fused_nn_conv2d_multiply_add_nn_relu: 4584.26 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #1 fused_nn_max_pool2d_1: 30.2865 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #2 fused_multiply_round_clip_cast: 14.6314 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #3 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036_: 5281.79 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #4 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__3: 5251.26 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #5 fused_multiply_round_clip_cast_cast_left_shift_multiply_add_right_shift_cast_add_2320814265661055830_: 19.2247 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #6 fused_cast_25: 12.4631 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #7 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__1: 5161.39 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #8 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__4: 5320.71 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #9 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__2: 107.187 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #10 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__2: 59.8113 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #11 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__2: 426.696 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #12 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__5: 9036.95 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #13 fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__2: 18.7588 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #14 fused_cast_24: 13.5717 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #15 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__3: 9323.67 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #16 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__6: 9690.43 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #17 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__1: 76.843 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #18 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__1: 70.4272 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #19 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__4: 596.825 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #20 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__7: 9047.68 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #21 fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__1: 56.8034 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #22 fused_cast_23: 10.0938 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #23 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__5: 8854.5 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #24 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__8: 9212.74 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #25 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948_: 14.1323 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #26 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588_: 93.6364 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #27 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__6: 843.468 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #28 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__9: 11918 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #29 fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089_: 56.1085 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #30 fused_cast_22: 10.012 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #31 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__7: 11729.8 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #32 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__10: 12051.1 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #33 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast: 38.601 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #34 fused_nn_global_avg_pool2d_cast_multiply: 22.1764 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #35 fused_nn_batch_flatten_nn_batch_flatten_multiply: 9.9415 us/iter
[22:43:06] /home/ziran/repositories/incubator-tvm/src/runtime/graph/debug/graph_runtime_debug.cc:97: Op #36 fused_nn_dense_nn_bias_add: 22.5578 us/iter
Node Name Ops Time(us) Time(%) Shape Inputs Outputs
--------- --- -------- ------- ----- ------ -------
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__10 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__10 12051.1 10.119 (1, 512, 7, 7) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__9 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__9 11918.0 10.008 (1, 512, 7, 7) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__7 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__7 11729.8 9.85 (1, 512, 7, 7) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__6 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__6 9690.43 8.137 (1, 128, 28, 28) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__3 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__3 9323.67 7.829 (1, 128, 28, 28) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__8 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__8 9212.74 7.736 (1, 256, 14, 14) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__7 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__7 9047.68 7.597 (1, 256, 14, 14) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__5 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__5 9036.95 7.588 (1, 128, 28, 28) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__5 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__5 8854.5 7.435 (1, 256, 14, 14) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__4 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__4 5320.71 4.468 (1, 64, 56, 56) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036_ fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036_ 5281.79 4.435 (1, 64, 56, 56) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__3 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__3 5251.26 4.41 (1, 64, 56, 56) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__1 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__1 5161.39 4.334 (1, 64, 56, 56) 4 1
fused_nn_conv2d_multiply_add_nn_relu fused_nn_conv2d_multiply_add_nn_relu 4584.26 3.849 (1, 64, 112, 112) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__6 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__6 843.468 0.708 (1, 512, 7, 7) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__4 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__4 596.825 0.501 (1, 256, 14, 14) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__2 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_nn_relu_cas_14207774232819154036__2 426.696 0.358 (1, 128, 28, 28) 4 1
fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__2 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__2 107.187 0.09 (1, 64, 56, 56) 2 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588_ fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588_ 93.636 0.079 (1, 512, 7, 7) 4 1
fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__1 fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948__1 76.843 0.065 (1, 128, 28, 28) 2 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__1 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__1 70.427 0.059 (1, 256, 14, 14) 4 1
fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__2 fused_nn_conv2d_cast_multiply_add_right_shift_clip_cast_multiply_add_cast_multip_12768018879016187588__2 59.811 0.05 (1, 128, 28, 28) 4 1
fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__1 fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__1 56.803 0.048 (1, 256, 14, 14) 2 1
fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089_ fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089_ 56.108 0.047 (1, 512, 7, 7) 2 1
fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast 38.601 0.032 (1, 512, 7, 7) 2 1
fused_nn_max_pool2d_1 fused_nn_max_pool2d_1 30.287 0.025 (1, 64, 56, 56) 1 1
fused_nn_dense_nn_bias_add fused_nn_dense_nn_bias_add 22.558 0.019 (1, 1000) 3 1
fused_nn_global_avg_pool2d_cast_multiply fused_nn_global_avg_pool2d_cast_multiply 22.176 0.019 (1, 512, 1, 1) 1 1
fused_multiply_round_clip_cast_cast_left_shift_multiply_add_right_shift_cast_add_2320814265661055830_ fused_multiply_round_clip_cast_cast_left_shift_multiply_add_right_shift_cast_add_2320814265661055830_ 19.225 0.016 (1, 64, 56, 56) 2 1
fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__2 fused_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multiply_ad_12564017943341662089__2 18.759 0.016 (1, 128, 28, 28) 2 1
fused_multiply_round_clip_cast fused_multiply_round_clip_cast 14.631 0.012 (1, 64, 56, 56) 1 1
fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948_ fused_cast_cast_left_shift_multiply_add_right_shift_cast_add_nn_relu_cast_multip_3103932645001264948_ 14.132 0.012 (1, 256, 14, 14) 2 1
fused_cast_24 fused_cast_24 13.572 0.011 (1, 128, 28, 28) 1 1
fused_cast_25 fused_cast_25 12.463 0.01 (1, 64, 56, 56) 1 1
fused_cast_23 fused_cast_23 10.094 0.008 (1, 256, 14, 14) 1 1
fused_cast_22 fused_cast_22 10.012 0.008 (1, 512, 7, 7) 1 1
fused_nn_batch_flatten_nn_batch_flatten_multiply fused_nn_batch_flatten_nn_batch_flatten_multiply 9.941 0.008 (1, 512) 1 1
Total_time - 119088.537 - - - -
In addition, accuracy after the Relay op strategy commit (#4644) is close to zero on the ILSVRC2012_img_val dataset.
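To see which ops account for the slowdown, the two debug-runtime logs above can be compared op by op. The following is a minimal sketch that assumes each log has been saved as text and that every timing line keeps the `Op #N name: T us/iter` shape shown above. The helper names `parse_op_times` and `slowdowns` are hypothetical and are not part of TVM.

```python
import re

# Matches debug-runtime lines such as
# "... Op #3 fused_nn_conv2d_...: 320.88 us/iter"
OP_LINE = re.compile(r"Op #(\d+) (\S+): ([0-9.]+) us/iter")


def parse_op_times(log_text):
    """Return {op_index: (op_name, microseconds)} parsed from a log."""
    times = {}
    for line in log_text.splitlines():
        match = OP_LINE.search(line)
        if match:
            index, name, usec = match.groups()
            times[int(index)] = (name, float(usec))
    return times


def slowdowns(before_log, after_log):
    """List (op_name, before_us, after_us, ratio), largest ratio first."""
    before = parse_op_times(before_log)
    after = parse_op_times(after_log)
    rows = []
    # Only compare ops that appear in both logs under the same index.
    for index in sorted(before.keys() & after.keys()):
        name, t_before = before[index]
        _, t_after = after[index]
        rows.append((name, t_before, t_after, t_after / t_before))
    return sorted(rows, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    # Two timing lines copied from each log above (path prefix omitted).
    before_sample = (
        "Op #0 fused_nn_conv2d_multiply_add_nn_relu: 1685.52 us/iter\n"
        "Op #1 fused_nn_max_pool2d_1: 32.843 us/iter\n"
    )
    after_sample = (
        "Op #0 fused_nn_conv2d_multiply_add_nn_relu: 4584.26 us/iter\n"
        "Op #1 fused_nn_max_pool2d_1: 30.2865 us/iter\n"
    )
    for name, t_before, t_after, ratio in slowdowns(before_sample, after_sample):
        print(f"{name}: {t_before} us -> {t_after} us ({ratio:.2f}x)")
```

Matching ops by index rather than by name alone sidesteps the long hashed kernel names. It relies on both builds producing the same op ordering, which holds for the two logs in this report.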