[Unity] Bump fpA_intB_gemm #16244
Updated preprocessing and the submodule to support 3D weights for MoE.
The updates to the CUTLASS kernels made in TVM PR apache#16244 require symbols provided in CUDA 7.5+. While the CUDA architecture is specified by setting `NVCC_FLAGS` in the `CMakeLists.txt` for each kernel, CMake 3.18+ also sets it based on the `CMAKE_CUDA_ARCHITECTURES` value. If that variable is not set, CMake explicitly passes NVIDIA's default compute capability of 5.2, *even if* an architecture has already been specified in `NVCC_FLAGS`. Because the kernels cannot compile for compute capability 5.2, this causes compilation errors. Setting `CMAKE_CUDA_ARCHITECTURES` to `OFF` stops CMake from adding 5.2 as a target architecture. See https://cmake.org/cmake/help/latest/policy/CMP0104.html for details on CMake's policy for CUDA architecture flags. See https://cmake.org/cmake/help/latest/policy/CMP0104.html for the default CUDA architecture for each version of CUDA.
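The workaround described in the commit message above can be sketched as a small CMake fragment. This is a minimal illustration, not the actual change made in TVM: the project name `example_kernels`, the source file `kernel.cu`, and the `sm_80` target are all hypothetical, and the architecture is passed through `CMAKE_CUDA_FLAGS` here rather than through the `NVCC_FLAGS` variable the kernels themselves use.

```cmake
cmake_minimum_required(VERSION 3.18)  # 3.18+ enables policy CMP0104
project(example_kernels LANGUAGES CXX CUDA)

# A false value means CMake adds no architecture flags of its own.
# Targets created after this line pick it up through their
# CUDA_ARCHITECTURES property, so the compiler's default architecture
# is not appended to the command line.
set(CMAKE_CUDA_ARCHITECTURES OFF)

# The architecture is then chosen only by the flags given explicitly.
# sm_80 is an assumed example target, not a value taken from the PR.
set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -gencode arch=compute_80,code=sm_80")

# Hypothetical kernel library built with the flags above.
add_library(example_kernels STATIC kernel.cu)
```

Setting the variable before `add_library` matters in this sketch, because the per-target `CUDA_ARCHITECTURES` property is initialized from `CMAKE_CUDA_ARCHITECTURES` at the moment the target is created.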
@vinx13 How can I build with fpA_intB_gemm on a V100? Is there an on/off build option for fpA_intB_gemm? I get a build error when compiling TVM on a V100 (SM 70).
@JiaqingFu What error did you encounter? sm70 should be supported. The build of fpA_intB_gemm is controlled by option
@vinx13 Thanks for the reply. With the same config.cmake, `echo set(CMAKE_CUDA_ARCHITECTURES 86) >> config.cmake` works for the 3060;
@vinx13 I solved it by updating the cutlass commit id and the cutlass_fpA_intB_gemm commit id.

Updated preprocessing and the submodule to support 3D weights for MoE.
cc @masahi