Add CMake option to enable saturation checker for ConvSymKernelAvx2 by yihonglyu · Pull Request #24220 · microsoft/onnxruntime

yihonglyu · 2025-03-27T21:24:54Z

Description

This PR adds a new CMake option: onnxruntime_ENABLE_CONVSYMKERNELAVX2_SAT_CHECKER. When enabled, this option activates a saturation checker for the VPMADDUBSW instruction used in the ConvSymKernelAvx2 path.

The checker works by calling a helper function before each VPMADDUBSW instruction. This function simulates the computation using C++ and intrinsics with higher-precision types (int32_t) to detect whether the result exceeds the bounds of int16_t (i.e., greater than INT16_MAX or less than INT16_MIN).
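The check described above can be sketched in scalar C++. This is a minimal illustration, not the PR's helper: the function name `CountSaturatingLanes` and its signature are hypothetical, and it uses plain loops rather than intrinsics. It relies only on the documented behavior of VPMADDUBSW, which multiplies unsigned bytes from one operand by signed bytes from the other and adds each adjacent pair of products into a saturated int16_t lane.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not the function added by the PR).
// Models one 256-bit VPMADDUBSW: 32 unsigned bytes in `u`, 32 signed bytes
// in `s`, producing 16 lanes. Each lane is u[2i]*s[2i] + u[2i+1]*s[2i+1].
// The sum is computed in int32_t so it cannot overflow, then compared
// against the int16_t range. Returns the number of lanes that would saturate.
inline int CountSaturatingLanes(const std::uint8_t* u, const std::int8_t* s) {
    int saturated = 0;
    for (std::size_t lane = 0; lane < 16; ++lane) {
        const std::int32_t p0 = std::int32_t(u[2 * lane]) * std::int32_t(s[2 * lane]);
        const std::int32_t p1 = std::int32_t(u[2 * lane + 1]) * std::int32_t(s[2 * lane + 1]);
        const std::int32_t sum = p0 + p1;
        if (sum > INT16_MAX || sum < INT16_MIN) {
            ++saturated;
        }
    }
    return saturated;
}
```

For example, with every unsigned byte equal to 255 and every signed byte equal to 127, each lane sum is 64770, which exceeds INT16_MAX, so all 16 lanes are reported.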

By default, the checker logs a warning only once per inference session. However, the logic can be easily extended to print more frequently if needed. Developers can also reuse this pattern to implement similar saturation checks for other instructions.
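One way to get a thread-safe warning that fires once per session is an atomic counter that is reset when a run starts. The sketch below is an assumption about the general shape, not the PR's code: the namespace `sketch` and the names `ReportSaturationOnce` and `ResetSaturationCount` are hypothetical, and the warning text is invented for illustration.

```cpp
#include <atomic>
#include <cstdio>

namespace sketch {

// Hypothetical accessor for a process-wide counter of saturation events.
inline std::atomic<int>& SaturationCount() {
    static std::atomic<int> count{0};
    return count;
}

// Records one saturation event. Only the caller that observes the counter
// at zero prints the warning, so concurrent threads log at most once
// between resets. Returns true if this call printed the warning.
inline bool ReportSaturationOnce() {
    if (SaturationCount().fetch_add(1, std::memory_order_relaxed) == 0) {
        std::fprintf(stderr, "warning: VPMADDUBSW saturation detected\n");
        return true;
    }
    return false;
}

// Assumed to be called at the start of each run so the next event warns again.
inline void ResetSaturationCount() {
    SaturationCount().store(0, std::memory_order_relaxed);
}

}  // namespace sketch
```

To print more often, the comparison against zero could be replaced by a threshold or a modulus on the counter value.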

Motivation and Context

On some models running with AVX2 (instead of AVX-VNNI), we've observed accuracy degradation due to saturation in vectorized instructions. This saturation checker provides a way to debug and detect those cases by reporting potential overflow in intermediate computations.

Copilot

Copilot reviewed 5 out of 9 changed files in this pull request and generated no comments.

Files not reviewed (4)

cmake/CMakeLists.txt: Language not supported
cmake/onnxruntime_mlas.cmake: Language not supported
onnxruntime/core/mlas/lib/amd64/ConvSymKernelAvx2.asm: Language not supported
onnxruntime/core/mlas/lib/x86_64/ConvSymKernelAvx2.S: Language not supported

yihonglyu force-pushed the yilyu/dbg-ConvSymKernelAvx2-saturate branch from f57282d to a22b284 Compare April 3, 2025 01:24

yihonglyu changed the title ~~Yilyu/dbg conv sym kernel avx2 saturate~~ Add ConvSymKernelAvx2 assembly saturation checker Apr 3, 2025

yihonglyu marked this pull request as ready for review April 3, 2025 23:02

yihonglyu requested a review from a team as a code owner April 3, 2025 23:02

yihonglyu added 9 commits April 12, 2025 05:09

Draft

8f76e3d

[Draft] Reset counter for every run

224edb8

Add support

25baa3e

- Support cmake option.
- Thread safe warning.
- Define CheckSaturation macro

Change to option ENABLE_CONVSYMKERNELAVX2_SAT_CHECKER

f0fe5b7

Revise saturation_check.cpp

d7df3b3

Format ConvSymKernelAvx2.S

2c48a47

Use func instead of use var saturation_count directly

d18bfc2

Remove unused comments of ConvSymKernelAvx2.S

6b87c61

Add support for Windows

40f9145

yihonglyu force-pushed the yilyu/dbg-ConvSymKernelAvx2-saturate branch from 7bdc0a2 to 40f9145 Compare April 12, 2025 05:29

Fix reset_saturation_count in multi-target build

0a84554

yihonglyu changed the title ~~Add ConvSymKernelAvx2 assembly saturation checker~~ Add CMake option to enable saturation checker for ConvSymKernelAvx2 Apr 15, 2025

Remove unused extern

b445620

yihonglyu requested review from Copilot and yuslepukhin April 15, 2025 23:15

Copilot AI reviewed Apr 15, 2025

yihonglyu requested review from fajin-corp, liqunfu and snnn April 15, 2025 23:16

yihonglyu added 4 commits April 16, 2025 01:31

Consider vpmaddubsw operand in CheckSaturation (Linux)

7cb171f

Consider vpmaddubsw operand in Windows CheckSaturation

416b9f2

Revise comments as well

Revise ConvSymKernelAvx2.S comments

e2816d1

Remove unused headers in saturation_check.cpp

4f9ca09

liqunfu approved these changes Apr 23, 2025

yihonglyu merged commit 7e0ee2b into main Apr 28, 2025
84 of 89 checks passed

yihonglyu deleted the yilyu/dbg-ConvSymKernelAvx2-saturate branch April 28, 2025 06:46

yihonglyu merged 15 commits into main from yilyu/dbg-ConvSymKernelAvx2-saturate

3 participants
