@zxybazh zxybazh commented Aug 30, 2022

This PR follows up on #12127. It updates a critical local read cache (d) in the data_pack block and schedules the kernel parts when they are available.

This change should bring MetaSchedule (MS) performance in line with AutoTVM for NCHW Conv2d on CUDA. Benchmarking results will follow, and the dispatch priority change will come in a separate PR.

CC @vinx13 @junrushao

cc @Hzfengsy @junrushao1994

@zxybazh zxybazh marked this pull request as ready for review August 30, 2022 12:14
zxybazh commented Aug 30, 2022

Following up with CUDA benchmarking results on a GeForce RTX 3070. All data layouts are NCHW, with padding (1, 1) and kernel size (3, 3). The workload is a single conv2d function, dispatched to nn.contrib_conv2d_winograd_without_weight_transform for both AutoTVM and MetaSchedule.

| Data Shape | Kernel Layout | Kernel Shape | Winograd | MS trials | AutoTVM trials | MS Perf (ms) | AutoTVM Perf (ms) | Perf Compare |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| (1, 512, 7, 7) | OIHW | (512, 512, 3, 3) | Yes | 2048 | 1024 | 0.05053524796891558 | 0.05030714765801846 | -0.4513687378% |
| (2, 64, 56, 56) | OIHW | (64, 64, 3, 3) | Yes | 2048 | 1024 | 0.040951505223880594 | 0.04111647433704021 | 0.4028401698% |
| (2, 48, 56, 56) | OIHW | (48, 48, 3, 3) | Yes | 2048 | 1024 | 0.029897891745708238 | 0.030240458663465385 | 1.1457895449% |
| (1, 64, 28, 28) | OIHW | (64, 64, 3, 3) | Yes | 2048 | 1024 | 0.010250015296394941 | 0.010370220173567484 | 1.1727287589% |
| (1, 128, 28, 28) | OIHW | (128, 128, 3, 3) | Yes | 2048 | 1024 | 0.026061056047912482 | 0.02657971123398702 | 1.9901541408% |
| (1, 64, 56, 56) | OIHW | (64, 64, 3, 3) | Yes | 2048 | 1024 | 0.022871235249414843 | 0.023843754776970795 | 4.2521513025% |
| (1, 128, 14, 14) | OIHW | (128, 128, 3, 3) | Yes | 2048 | 1024 | 0.011837556120361336 | 0.012637039015519788 | 6.7537833572% |
| (1, 256, 14, 14) | OIHW | (256, 256, 3, 3) | Yes | 2048 | 1024 | 0.028101132421289355 | 0.03071125413188647 | 9.2883150453% |
| (1, 80, 73, 73) | OIHW | (192, 80, 3, 3) | Yes | 2048 | 1024 | 0.2016123825065274 | 0.22382275871015564 | 11.0163750497% |
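
The comment does not define the Perf Compare column. The reported values appear consistent with (AutoTVM − MS) / MS, so a positive percentage would mean MetaSchedule produced the faster kernel. The sketch below checks that reading against two rows of the table. The function name `perf_compare_percent` is hypothetical, and the formula is inferred from the numbers rather than stated by the author.

```python
def perf_compare_percent(ms_latency_ms: float, autotvm_latency_ms: float) -> float:
    """Relative latency gap in percent, using the MetaSchedule time as the base.

    Assumption: this is inferred from the table values, not documented in the PR.
    Positive result: MetaSchedule is faster. Negative result: AutoTVM is faster.
    """
    return (autotvm_latency_ms - ms_latency_ms) / ms_latency_ms * 100.0


# (data shape, MS latency in ms, AutoTVM latency in ms), copied from the table.
rows = [
    ("(1, 512, 7, 7)", 0.05053524796891558, 0.05030714765801846),
    ("(1, 80, 73, 73)", 0.2016123825065274, 0.22382275871015564),
]

for shape, ms_latency, autotvm_latency in rows:
    gap = perf_compare_percent(ms_latency, autotvm_latency)
    print(f"{shape}: {gap:+.4f}%")
```

Under this reading, MetaSchedule is marginally slower only on the (1, 512, 7, 7) workload and matches or beats AutoTVM on the rest, with twice as many tuning trials.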

@github-actions github-actions bot requested a review from Hzfengsy August 30, 2022 22:45
zxybazh commented Aug 30, 2022

@tvm-bot rerun

@zxybazh zxybazh force-pushed the bugfix/2022-08-30/fix-winograd branch from b3cdbdf to bbe5c05 Compare August 30, 2022 22:55
@Hzfengsy Hzfengsy merged commit f7cc992 into apache:main Aug 31, 2022
xinetzone pushed a commit to daobook/tvm that referenced this pull request Nov 25, 2022