This repository was archived by the owner on Jan 27, 2026. It is now read-only.

Added SVE SIMD flag for CPU kernels #330

Closed
hrushitfujitsu wants to merge 3 commits into huggingface:main from MonakaResearch:sve_check

@hrushitfujitsu hrushitfujitsu commented Dec 17, 2025

The current CPU backend does not support Arm SVE. This PR adds an SVE check to kernel-builder so that SVE support in CPU kernels is enabled and validated during the CMake configuration phase.

This is needed to support the SVE implementation of the Mamba sequential scan algorithm: huggingface/transformers#38185

Comment on lines +46 to +49
check_for_sve(HAVE_SVE)
if(HAVE_SVE)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -march=armv8.2-a+sve")
endif()
Member

@danieldk danieldk Dec 17, 2025


This will end up compiling all source files with armv8.2-a+sve if the compiler supports it, which means CPU kernels will fail on all ARM64 devices without SVE.

I think the best way to tackle this is to have some code with the default flags that does a CPU feature check and dispatches to SVE/non-SVE paths (even if the non-SVE path just raises an error message). The kernel with the SVE path can then be compiled with the -march=armv8.2-a+sve flag.

Here is a similar case where a kernel is compiled to work with AVX512 and non-AVX512:

https://github.com/huggingface/kernels-community/blob/04a14c8356fa6020746ef47d1fd63ac4c7b5978d/rmsnorm/build.toml#L8
https://github.com/huggingface/kernels-community/blob/04a14c8356fa6020746ef47d1fd63ac4c7b5978d/rmsnorm/build.toml#L19
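
The dispatch approach suggested above could look roughly like the following minimal sketch. It is not code from this PR or from kernel-builder: the names `cpu_has_sve`, `sum_generic`, `sum_sve`, and `sum_dispatch` are hypothetical, and the runtime check assumes Linux on AArch64, where `getauxval(AT_HWCAP)` reports the `HWCAP_SVE` bit. On any other platform the check simply returns false. In a real build, `sum_sve` would live in its own source file compiled with `-march=armv8.2-a+sve`, while everything else keeps the default flags.

```cpp
#include <cstddef>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>   // getauxval, AT_HWCAP
#include <asm/hwcap.h>  // HWCAP_SVE
#endif

// Runtime check: true only when the running CPU reports SVE support.
// Assumption: Linux on AArch64; every other platform reports "no SVE".
bool cpu_has_sve() {
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP_SVE)
  return (getauxval(AT_HWCAP) & HWCAP_SVE) != 0;
#else
  return false;
#endif
}

// Hypothetical portable kernel, compiled with the default flags.
float sum_generic(const float* x, std::size_t n) {
  float acc = 0.0f;
  for (std::size_t i = 0; i < n; ++i) acc += x[i];
  return acc;
}

// Hypothetical SVE kernel. In a real project this function would be
// defined in a separate translation unit built with
// -march=armv8.2-a+sve and would use SVE intrinsics; here it is a
// stand-in that forwards to the portable loop so the sketch builds
// everywhere.
float sum_sve(const float* x, std::size_t n) {
  return sum_generic(x, n);
}

// Dispatcher, compiled with the default flags so it is safe to run on
// any ARM64 device. Only the SVE path is ever entered on SVE hardware.
float sum_dispatch(const float* x, std::size_t n) {
  return cpu_has_sve() ? sum_sve(x, n) : sum_generic(x, n);
}
```

Because only the dispatcher and the portable kernel are built with the baseline flags, a binary built this way would still load and run on ARM64 devices without SVE, which is the failure mode the review comment points out.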
