@zufayu (Contributor) commented Jul 24, 2025

No description provided.

Copilot AI review requested due to automatic review settings July 24, 2025 06:28

Copilot AI (Contributor) left a comment

Pull Request Overview

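This pull request fixes a critical bug in the splitK selection logic for GEMM operations. The main issue was the use of the XOR operator (^) instead of a left bit shift (<<) when calculating splitK from the log2_k_split parameter. In C and C++, ^ is bitwise XOR rather than exponentiation, so the old expression did not yield a power of two.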

  • Fixes incorrect bitwise XOR operation that was producing wrong splitK values
  • Simplifies conditional logic in kernel selection heuristics
  • Corrects log2 calculation implementation and dictionary storage format
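
To illustrate the difference between the two operators, here is a minimal Python sketch (Python's ^ and << have the same meaning as in C++). The helper names split_k_from_log2 and log2_of_split_k are hypothetical, and the base operand 1 in the XOR comparison is an assumption, since the exact expression in asm_gemm_a4w4.cu is not shown in this conversation.

```python
def split_k_from_log2(log2_k_split: int) -> int:
    """Return 2**log2_k_split using a left shift (hypothetical helper)."""
    if log2_k_split < 0:
        raise ValueError("log2_k_split must be non-negative")
    return 1 << log2_k_split


def log2_of_split_k(split_k: int) -> int:
    """Return the exact integer log2 of a power-of-two split_k (hypothetical helper)."""
    # A positive power of two has exactly one bit set.
    if split_k <= 0 or split_k & (split_k - 1):
        raise ValueError("split_k must be a positive power of two")
    return split_k.bit_length() - 1


if __name__ == "__main__":
    for log2_k_split in range(4):
        shifted = 1 << log2_k_split  # intended: 1, 2, 4, 8
        xored = 1 ^ log2_k_split     # assumed buggy form: 1, 0, 3, 2
        print(log2_k_split, shifted, xored)
```

Under these assumptions, the two expressions agree only for log2_k_split = 0, and the XOR form can even produce 0, which is not a valid split count. The two helpers are inverses of each other for power-of-two values, which is the property a log2-encoded lookup would rely on.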

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| op_tests/test_gemm_a4w4.py | Uncomments log2_k_split parameter to enable testing of the fixed functionality |
| csrc/py_itfs_cu/asm_gemm_a4w4.cu | Fixes splitK calculation bug, simplifies selection logic, and corrects log2 computation |

@valarLip valarLip merged commit 715a237 into main Jul 24, 2025
13 checks passed
@valarLip valarLip deleted the a4w4_asm_pro_max branch July 24, 2025 10:33
cagrikymk pushed a commit that referenced this pull request Jul 30, 2025
Co-authored-by: zufayu <zufayu@amd.com>

4 participants