ahmed-bsod/gemm a8w8 gluon #1684

ahmed-bsod · 2025-12-18T16:06:27Z

Motivation

Adds gluon kernels for GEMM_A8W8 and for GEMM_A8W8 with preshuffled weights.

Test Plan

Ran the `test_gemm_a8w8.py` script to confirm that all functional tests pass.

Test Result

Tests passed 🔥

Copilot

Pull request overview

This PR adds gluon kernel implementations for GEMM_A8W8 operations, including both a standard version and a preshuffled weight version. The implementation supports int8 and FP8 (e4m3/e5m2) input types with various output types (bf16, fp16, fp32, int32).
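For readers unfamiliar with the operation, the following is a minimal pure-Python sketch of what an A8W8 GEMM computes for the int8 case. It is not the kernel from this PR. The scaling scheme (one scale per activation row, one per weight row) and the names `quantize_int8` and `gemm_a8w8_reference` are assumptions made for illustration, and the result is returned as Python floats rather than cast to bf16/fp16/fp32/int32.

```python
def quantize_int8(row):
    """Symmetric quantization of one row; returns (int8 values, scale).

    Hypothetical helper: the scaling scheme is an assumption, not taken
    from the PR.
    """
    amax = max(abs(v) for v in row) or 1.0  # avoid a zero scale for all-zero rows
    scale = amax / 127.0
    q = [max(-128, min(127, round(v / scale))) for v in row]
    return q, scale


def gemm_a8w8_reference(a_q, a_scale, w_q, w_scale):
    """out[m][n] = a_scale[m] * w_scale[n] * sum_k a_q[m][k] * w_q[n][k].

    a_q is M x K, w_q is N x K (one weight row per output column).
    Python ints do not overflow, which stands in for the wide integer
    accumulator a real kernel would use.
    """
    out = []
    for m, a_row in enumerate(a_q):
        out_row = []
        for n, w_row in enumerate(w_q):
            acc = sum(x * y for x, y in zip(a_row, w_row))  # integer accumulate
            out_row.append(acc * a_scale[m] * w_scale[n])   # dequantize
        out.append(out_row)
    return out


a = [[0.5, -1.0, 0.25], [2.0, 0.0, -0.5]]
w = [[1.0, 0.5, -0.5], [-0.25, 0.75, 1.0]]
a_q, a_scale = zip(*(quantize_int8(r) for r in a))
w_q, w_scale = zip(*(quantize_int8(r) for r in w))
out = gemm_a8w8_reference(a_q, a_scale, w_q, w_scale)
```

The result should match the float product `a @ w.T` up to quantization error, which is the kind of tolerance check a functional test for such kernels typically applies.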

- Added two new gluon kernel variants: gemm_a8w8 and gemm_a8w8_preshuffle for AMD CDNA4 architecture
- Extended test coverage to validate all three implementations (triton, gluon, gluon_shuffle) across multiple data types and configurations
- Updated benchmark suite to support gluon implementations with command-line flags for easy performance comparison
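To give a sense of how parametrizing over the implementation widens the test matrix, here is a small sketch that enumerates the combinations named in the overview. The implementation names come from the text above; the dtype labels and the `build_cases` helper are hypothetical, and the actual test file may well skip combinations that the full cross product includes.

```python
import itertools

IMPLS = ("triton", "gluon", "gluon_shuffle")   # names from the PR overview
IN_DTYPES = ("int8", "fp8_e4m3", "fp8_e5m2")   # labels are assumptions
OUT_DTYPES = ("bf16", "fp16", "fp32", "int32")  # labels are assumptions


def build_cases(impls=IMPLS, in_dtypes=IN_DTYPES, out_dtypes=OUT_DTYPES):
    """Return one dict per (implementation, input dtype, output dtype) case."""
    return [
        {"impl": impl, "in_dtype": i, "out_dtype": o, "id": f"{impl}-{i}-{o}"}
        for impl, i, o in itertools.product(impls, in_dtypes, out_dtypes)
    ]


cases = build_cases()
```

Each entry could then feed a test-framework parametrization, with the `id` field used as the readable case name.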

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| `aiter/ops/triton/gluon/gemm_a8w8.py` | New file implementing gluon-based GEMM_A8W8 kernels with standard and preshuffled weight variants |
| `op_tests/triton_tests/gemm/basic/test_gemm_a8w8.py` | Extended test suite to parametrize over implementation types and added support for int32 output dtype |
| `op_tests/op_benchmarks/triton/bench_gemm_a8w8.py` | Added command-line arguments for gluon and shuffle flags to enable performance benchmarking |
| `aiter/ops/triton/configs/gemm/gluon/gfx950-GEMM-A8W8.json` | Configuration file with tuned kernel parameters for gfx950 architecture |
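The benchmark change could look roughly like the sketch below, which maps two boolean flags onto the three implementation names used in the tests. The flag spellings `--gluon` and `--shuffle`, the helper names, and the rule that shuffle requires gluon are all assumptions for illustration; the actual arguments in `bench_gemm_a8w8.py` may differ.

```python
import argparse


def parse_args(argv=None):
    """Parse hypothetical benchmark flags (spellings are assumptions)."""
    parser = argparse.ArgumentParser(description="GEMM A8W8 benchmark (sketch)")
    parser.add_argument("--gluon", action="store_true",
                        help="benchmark the gluon kernel instead of triton")
    parser.add_argument("--shuffle", action="store_true",
                        help="use the preshuffled-weight gluon kernel")
    return parser.parse_args(argv)


def select_impl(args):
    """Map the flags onto one of: triton, gluon, gluon_shuffle."""
    if args.shuffle and not args.gluon:
        # Assumed rule for this sketch: only the gluon path has a shuffled variant.
        raise ValueError("--shuffle requires --gluon in this sketch")
    if args.gluon:
        return "gluon_shuffle" if args.shuffle else "gluon"
    return "triton"
```

With this shape, running without flags selects the existing triton path, so default benchmark behaviour stays unchanged.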



ahmed-bsod requested review from a team and Copilot December 18, 2025 16:06

ahmed-bsod force-pushed the ahmed-bsod/gemm_a8w8_gluon branch from fa00faf to ce1e6fa Compare December 18, 2025 16:06

Copilot started reviewing on behalf of ahmed-bsod December 18, 2025 16:07

ahmed-bsod changed the title ~~Ahmed bsod/gemm a8w8 gluon~~ ahmed-bsod/gemm a8w8 gluon Dec 18, 2025

Copilot AI reviewed Dec 18, 2025


ahmed-bsod force-pushed the ahmed-bsod/gemm_a8w8_gluon branch 2 times, most recently from af5de67 to acc815a Compare December 18, 2025 16:43

ahmed-bsod requested review from cagrikymk and lburzawa December 18, 2025 16:46

ahmed-bsod force-pushed the ahmed-bsod/gemm_a8w8_gluon branch 6 times, most recently from d069611 to 52fcd9c Compare December 19, 2025 17:56

ahmed-bsod added 2 commits January 2, 2026 12:10

added gluon kernels for normal gemm_a8w8 and preshuffled gemm_a8w8

6a5ec91

updated the gemm_aw8w8 test and bench scripts to support running gluon kernels

ea4a722

ahmed-bsod force-pushed the ahmed-bsod/gemm_a8w8_gluon branch from 52fcd9c to ea4a722 Compare January 2, 2026 17:10
