Added SVE SIMD flag for CPU kernels #330
hrushitfujitsu wants to merge 3 commits into huggingface:main
    check_for_sve(HAVE_SVE)
    if(HAVE_SVE)
      set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=armv8.2-a+sve")
    endif()
This will end up compiling all source files with -march=armv8.2-a+sve whenever the compiler supports it, which means the CPU kernels will fail on every ARM64 device that lacks SVE.
I think the best way to tackle this is to have some code with the default flags that does a CPU feature check and dispatches to SVE/non-SVE paths (even if the non-SVE path just raises an error message). The kernel with the SVE path can then be compiled with the -march=armv8.2-a+sve flag.
Here is a similar case where a kernel is compiled to work with AVX512 and non-AVX512:
https://github.com/huggingface/kernels-community/blob/04a14c8356fa6020746ef47d1fd63ac4c7b5978d/rmsnorm/build.toml#L8
https://github.com/huggingface/kernels-community/blob/04a14c8356fa6020746ef47d1fd63ac4c7b5978d/rmsnorm/build.toml#L19
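To make the suggested dispatch concrete, here is a minimal C++ sketch, not taken from kernel-builder or this PR. It assumes Linux on AArch64, where the kernel reports SVE through `getauxval(AT_HWCAP)`. The names `cpu_has_sve`, `scale_generic`, `scale_sve`, and `scale` are hypothetical. In a real build, `scale_sve` would live in its own source file compiled with -march=armv8.2-a+sve, while the dispatcher is compiled with the default flags.

```cpp
#include <cstddef>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>  // getauxval, AT_HWCAP
#ifndef HWCAP_SVE
#define HWCAP_SVE (1UL << 22)  // bit value from the Linux arm64 uapi headers
#endif
#endif

// Runtime feature check. Compiled with default flags so it runs on any ARM64 CPU.
// On non-Linux or non-AArch64 targets this sketch simply reports "no SVE".
inline bool cpu_has_sve() {
#if defined(__linux__) && defined(__aarch64__)
    return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
#else
    return false;
#endif
}

// Portable fallback path (hypothetical example kernel).
void scale_generic(float* data, std::size_t n, float factor) {
    for (std::size_t i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}

// Stand-in for the SVE path. Assumption: the real version would use SVE
// intrinsics and be built in a separate translation unit with the SVE flag.
// Here it forwards to the generic code so the sketch compiles everywhere.
void scale_sve(float* data, std::size_t n, float factor) {
    scale_generic(data, n, factor);
}

// Dispatcher: checks the CPU once, then routes every call to the right path.
void scale(float* data, std::size_t n, float factor) {
    static const bool has_sve = cpu_has_sve();
    if (has_sve) {
        scale_sve(data, n, factor);
    } else {
        scale_generic(data, n, factor);
    }
}
```

Calling `scale(buf, len, 2.0f)` then works on both SVE and non-SVE machines, because only the body of the SVE translation unit ever contains SVE instructions. The non-SVE branch could equally raise an error, as mentioned above.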
The current CPU backend does not include support for Arm SVE. This PR adds an SVE check to kernel-builder so that SVE support is detected and enabled for CPU kernels during the CMake configuration phase.
This is needed to support the SVE implementation of the Mamba sequential scan algorithm: huggingface/transformers#38185