Conversation

@Zzz9990 (Contributor) commented Dec 25, 2025

Motivation

Technical Details

Test Plan

Test Result

Submission Checklist

exp_bias: Optional[Tensor] = None,
activation: Optional[int] = 0,
block_m: Optional[int] = 32,
b_nt_type: Optional[int] = 0,
Collaborator:

Make the args signatures aligned; for example, align with line 343.

activation: Optional[int] = 0,
block_m: Optional[int] = 32,
b_nt_type: Optional[int] = 0,
split_k: Optional[int] = 1,
Collaborator:

"b_nt_type, split_k" or "split_k,b_nt_type" ?



@functools.lru_cache(maxsize=2048)
def get_ksplit(token, topk, expert, inter_dim, model_dim):
Collaborator:

Duplicate of line 514?
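For context on why a duplicate definition matters here: `functools.lru_cache` memoizes per unique argument tuple, and a second, identical `def` would silently shadow the first function together with its warm cache. A sketch with a hypothetical split-k heuristic (the threshold and formula are illustrative, not the PR's actual logic):

```python
import functools

@functools.lru_cache(maxsize=2048)
def get_ksplit_sketch(token: int, topk: int, expert: int,
                      inter_dim: int, model_dim: int) -> int:
    # Hypothetical heuristic: split the k dimension more aggressively
    # when the per-expert workload is small, to keep the GPU occupied.
    work = token * topk * inter_dim // max(expert, 1)
    return 4 if work < (1 << 16) else 1

# Repeated calls with the same shape hit the cache instead of recomputing.
get_ksplit_sketch(128, 2, 8, 4096, 4096)
get_ksplit_sketch(128, 2, 8, 4096, 4096)
```

`get_ksplit_sketch.cache_info()` confirms the second call is a cache hit, which is exactly the state a shadowing redefinition would throw away.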


if split_k > 1:
if activation == ActivationType.Silu:
aiter.silu_and_mul(out, tmp_out.to(out.dtype))
Collaborator:

Remove `.to(out.dtype)`?
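For reference, `silu_and_mul` computes `SiLU(x[:d]) * x[d:]` over the gate and up halves of the fused FFN projection; the reviewer's point is presumably that the op already writes into `out` with `out`'s dtype, so the explicit cast may be redundant. A pure-Python sketch of the math only (not the aiter kernel):

```python
import math

def silu(x: float) -> float:
    # SiLU(x) = x * sigmoid(x)
    return x / (1.0 + math.exp(-x))

def silu_and_mul(x: list[float]) -> list[float]:
    # Split the 2*d-wide input in half: apply SiLU to the first half
    # (gate) and multiply elementwise by the second half (up).
    d = len(x) // 2
    return [silu(x[i]) * x[d + i] for i in range(d)]
```

A quick check: `silu_and_mul([1.0, 2.0])` yields a single value, `SiLU(1.0) * 2.0`.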

if max_diff_exp_sums < tolerance:
print(
f" exp_sums TEST PASSED: Max difference ({max_diff_exp_sums:.6e}) < tolerance ({tolerance})"
f"? exp_sums TEST PASSED: Max difference ({max_diff_exp_sums:.6e}) < tolerance ({tolerance})"
Collaborator:

...

head_dim=head_dim,
)

print(f" [{idx}/{total}] {kernel_type} test passed")
Collaborator:

revert these?
