support sparse attn mtp #649

Open

jiayyu wants to merge 13 commits into main from sparse_mtp

Conversation

@jiayyu (Contributor) commented Apr 25, 2026

Motivation

Technical Details

Test Plan

Test Result

Submission Checklist

Copilot AI review requested due to automatic review settings April 25, 2026 13:47
Copilot AI left a comment

Pull request overview

This PR extends the ROCm AITER MLA sparse-attention path to support MTP (multi-token prediction) decode/verify by introducing per-token sparse metadata handling and a new Triton kernel that gathers sparse KV page indices for the per-token layout.

Changes:

  • Adjust sparse indexer/decode token accounting to handle max_seqlen_q > 1 in MTP verify.
  • Add a second set of persistent MLA metadata buffers for sparse MTP (per-token layout) and wire them into decode + cudagraph capture paths.
  • Add a Triton kernel to gather sparse KV indices for MTP per-token layout; update MLA decode path to select correct metadata/indices.
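The gather step above can be illustrated with a pure-NumPy reference. This is a hypothetical sketch of what such a kernel computes (the PR's actual implementation is a Triton kernel; the function name, argument shapes, and the -1 padding convention here are assumptions, not the PR's API): for each decode token, map the sparse indexer's selected token positions through that token's page table into physical KV slot ids.

```python
import numpy as np

def gather_sparse_kv_pages(topk_token_idx, block_table, page_size):
    """Reference (non-Triton) semantics for a sparse-KV gather.

    topk_token_idx: [num_decode_tokens, topk] absolute token positions
        selected by the sparse indexer for each decode token (-1 = padding).
    block_table:    [num_decode_tokens, max_pages] KV page ids for the
        sequence each decode token belongs to.
    Returns [num_decode_tokens, topk] physical slot ids
        (page_id * page_size + in-page offset), keeping -1 for padding.
    """
    num_tokens, topk = topk_token_idx.shape
    out = np.full((num_tokens, topk), -1, dtype=np.int64)
    for t in range(num_tokens):
        for k in range(topk):
            pos = topk_token_idx[t, k]
            if pos < 0:
                continue  # padded slot: fewer than topk tokens selected
            page = block_table[t, pos // page_size]
            out[t, k] = page * page_size + pos % page_size
    return out
```

In a real Triton kernel each program instance would handle one decode token and do this lookup with masked loads/stores; the per-token (rather than per-request) first dimension is what distinguishes the MTP layout, since each request contributes `max_seqlen_q` decode tokens during verify.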

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| atom/models/deepseek_v2.py | Fix decode token counting for the sparse indexer under MTP verify (`batch_size * max_seqlen_q`). |
| atom/model_ops/attentions/aiter_mla.py | Introduce sparse-MTP persistent buffers and a per-token metadata path; update decode/cudagraph capture plumbing for sparse MTP. |
| atom/model_ops/attention_mla.py | Route sparse MTP decode through per-token metadata and the new Triton gather kernel for KV indices. |
| atom/model_engine/model_runner.py | Minor KV-cache sizing/cudagraph init tweaks related to total layers and sparse indptr reset. |
| .github/benchmark/models_accuracy.json | Formatting-only change (newline at EOF). |
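The deepseek_v2.py fix boils down to one piece of accounting: under MTP verify, each request contributes `max_seqlen_q` query tokens (draft tokens plus the bonus token) rather than one, so per-token sparse buffers must be sized accordingly. A minimal illustration (the function name is hypothetical, not the PR's code):

```python
def num_decode_tokens(batch_size: int, max_seqlen_q: int) -> int:
    """Under MTP verify every request carries max_seqlen_q query tokens,
    so the sparse indexer must allocate per-token state for all of them,
    not one per request."""
    return batch_size * max_seqlen_q

# Plain decode: one query token per request.
print(num_decode_tokens(8, 1))  # 8
# MTP verify with, e.g., 3 draft tokens + 1 bonus token per request.
print(num_decode_tokens(8, 4))  # 32
```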


Comment thread atom/model_ops/attentions/aiter_mla.py
Copilot AI review requested due to automatic review settings April 28, 2026 03:03
Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.



Comment thread atom/model_ops/attentions/aiter_mla.py
Comment thread atom/model_ops/attentions/aiter_mla.py Outdated
Copilot AI review requested due to automatic review settings April 28, 2026 05:32
Copilot AI left a comment


Pull request overview

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.



Comment thread atom/model_ops/attentions/aiter_mla.py
Comment thread atom/model_ops/attentions/aiter_mla.py Outdated
Copilot AI review requested due to automatic review settings April 28, 2026 12:25
@valarLip valarLip requested a review from junhaha666 April 30, 2026 03:39