A8w8 asm codegen and tune #1161
Pull Request Overview
This PR adds ASM A8W8 bpreshuffle int8 codegen and integrates it into the tuning system. The changes extend the existing A8W8 bpreshuffle GEMM implementation to support int8 quantization alongside fp8, introducing new ASM kernels and updating the tuning framework to handle multiple quantization data types.
- Adds ASM int8 kernel configuration and codegen for A8W8 bpreshuffle GEMM
- Refactors tuning framework to support both fp8 and int8 quantization methods via q_dtype_w parameter
- Updates kernel selection logic and API signatures to support new ASM kernels
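For readers unfamiliar with the A8W8 scheme, the following is a minimal pure-Python reference sketch of an int8 GEMM with per-row scales, of the kind a test might compare a kernel against. It assumes symmetric quantization with one scale per activation row (token) and one per weight row (output channel). The function names `quantize_per_row_int8` and `gemm_a8w8_reference` are hypothetical and are not taken from this PR or from the aiter API.

```python
def quantize_per_row_int8(rows):
    """Symmetric int8 quantization with one scale per row (assumed scheme)."""
    quantized, scales = [], []
    for row in rows:
        max_abs = max(abs(v) for v in row)
        # Avoid dividing by zero for an all-zero row.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        quantized.append([max(-127, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return quantized, scales


def gemm_a8w8_reference(xq, wq, x_scale, w_scale):
    """out[m][n] = (sum_k xq[m][k] * wq[n][k]) * x_scale[m] * w_scale[n]."""
    out = []
    for m, x_row in enumerate(xq):
        out_row = []
        for n, w_row in enumerate(wq):
            acc = sum(a * b for a, b in zip(x_row, w_row))  # integer accumulation
            out_row.append(acc * x_scale[m] * w_scale[n])  # dequantize
        out.append(out_row)
    return out


# Usage: quantize both operands, then compare against the float result.
x = [[1.0, -2.0], [0.5, 0.25]]
w = [[1.0, 1.0], [2.0, -1.0]]  # stored as (N, K), so the product is x @ w^T
xq, xs = quantize_per_row_int8(x)
wq, ws = quantize_per_row_int8(w)
approx = gemm_a8w8_reference(xq, wq, xs, ws)
```

The result is only approximately equal to the float product, since rounding to int8 loses precision; a real kernel would also cast the output to the requested dtype (for example bf16).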
Reviewed Changes
Copilot reviewed 14 out of 17 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| hsa/gfx942/i8gemm/i8gemm_bf16_perTokenI8.csv | New kernel configuration for int8 ASM kernels |
| hsa/gfx942/i8gemm/codegen.py | Code generator for ASM i8gemm kernel configurations |
| csrc/py_itfs_cu/asm_gemm_a8w8.cu | Major refactoring of ASM GEMM interface with kernel selection logic |
| csrc/include/rocm_ops.hpp | Updated Python binding parameters for new ASM interface |
| csrc/include/asm_gemm_a8w8.h | Updated function signature for new parameters |
| csrc/ck_gemm_a8w8_bpreshuffle/gen_instances.py | Added filtering for int8 dtype in tuning |
| csrc/ck_gemm_a8w8_bpreshuffle/gemm_a8w8_bpreshuffle_tune.py | Major refactoring to support both fp8 and int8 tuning |
| csrc/ck_gemm_a8w8_bpreshuffle/gemm_a8w8_bpreshuffle_tune.cu | Updated to support BFloat16 output |
| csrc/ck_gemm_a8w8_bpreshuffle/README.md | Documentation updates for new q_dtype_w parameter |
| aiter/utility/base_tuner.py | Base tuner improvements for result handling |
| aiter/ops/gemm_op_a8w8.py | Updated GEMM operations to use new configuration system |
| aiter/jit/optCompilerConfig.json | Added blob generation command for i8gemm |
| aiter/configs/asm_a8w8_gemm.csv | Updated ASM kernel configurations |
| aiter/configs/a8w8_bpreshuffle_untuned_gemm.csv | Added q_dtype_w column and int8 test cases |
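As an illustration of how a `q_dtype_w` column lets one untuned-shapes file drive both fp8 and int8 tuning, here is a small sketch that groups shapes by that column. Only the `q_dtype_w` column name comes from this PR; the `M`, `N`, `K` column names, the dtype spellings `fp8` and `int8`, and the helper `group_shapes_by_q_dtype` are assumptions made for the example and may not match the real CSV or tuner code.

```python
import csv
import io
from collections import defaultdict


def group_shapes_by_q_dtype(csv_text):
    """Group (M, N, K) shapes by the value in the q_dtype_w column."""
    groups = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        dtype = row["q_dtype_w"].strip()
        groups[dtype].append((int(row["M"]), int(row["N"]), int(row["K"])))
    return dict(groups)


# Hypothetical file contents; the real CSV layout may differ.
SAMPLE = """M,N,K,q_dtype_w
1,4096,4096,fp8
32,4096,4096,int8
64,8192,4096,int8
"""

groups = group_shapes_by_q_dtype(SAMPLE)
for dtype, shapes in groups.items():
    print(dtype, shapes)
```

Grouping up front would let a tuner pick the candidate kernel set once per quantization type instead of re-checking the dtype for every shape.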
Motivation
Update the A8W8 bpreshuffle ASM code and add it to the tuning flow.
Technical Details
Test Plan
python op_tests/test_gemm_a8w8.py
aiter/csrc/ck_gemm_a8w8_bpreshuffle/gemm_a8w8_bpreshuffle_tune.py
Test Result
Submission Checklist