Conversation

@amd-ruitang3
Contributor

No description provided.

Copilot AI review requested due to automatic review settings July 23, 2025 02:49
Contributor

Copilot AI left a comment

Pull Request Overview

This PR introduces code generation for ASM FMOE (Fused Mixture of Experts) kernel configurations by adding CSV configuration files and Python codegen scripts. The changes support various data types (fp16, bf16), quantization schemes (Int8, Fp8, blockscale), and activation functions (gelu, silu) for the gfx942 GPU architecture.

  • Addition of 19 CSV configuration files containing kernel specifications for the different FMOE variants
  • A Python codegen script that converts the CSV configurations into a C++ header file containing kernel metadata
  • Updated function signatures to support passing an activation parameter
  • Build-system integration to generate the ASM FMOE configurations
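The CSV-to-header flow summarized above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual `hsa/gfx942/fmoe/codegen.py`: the column names (`knl_name`, `dtype`, `act`) and the `FmoeKernelInfo` struct name are assumptions for the sake of the example.

```python
import csv

# Hypothetical sketch of a CSV -> C++ header codegen step.
# Column names ("knl_name", "dtype", "act") are illustrative assumptions,
# not the schema actually used by the PR's CSV files.

HEADER = "// Auto-generated; do not edit.\n#pragma once\n\n"

def gen_header(csv_paths, out_path):
    entries = []
    for path in sorted(csv_paths):
        with open(path, newline="") as f:
            for row in csv.DictReader(f):
                # Emit one initializer per kernel configuration row.
                entries.append(
                    '    {{"{knl_name}", "{dtype}", "{act}"}},'.format(**row)
                )
    with open(out_path, "w") as f:
        f.write(HEADER)
        f.write("static const FmoeKernelInfo g_fmoe_kernels[] = {\n")
        f.write("\n".join(entries))
        f.write("\n};\n")
```

Keeping the kernel metadata in CSV and regenerating the header at build time avoids hand-maintaining a large table of variants in C++.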

Reviewed Changes

Copilot reviewed 24 out of 417 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| hsa/gfx942/fmoe/*.csv | Configuration files defining kernel parameters for the various FMOE data type and activation combinations |
| hsa/gfx942/fmoe/codegen.py | Python script that generates a C++ header file from the CSV configuration data |
| csrc/include/moe_op.h | Updated function signatures to include the activation parameter; improved formatting |
| aiter/ops/moe_op.py | Added the activation parameter to the fmoe function signature |
| aiter/jit/optCompilerConfig.json | Updated build configuration to include FMOE codegen in blob generation |
| aiter/fused_moe_bf16_asm.py | Added the activation parameter to the fmoe function call |

@valarLip merged commit ce8a2dc into main on July 31, 2025 (14 of 18 checks passed)
@valarLip deleted the asm_fmoe_codegen branch July 31, 2025 08:44