
Adjust MiniMax MI355X block size for TP8 EP8#1228

Open
jiacao-amd wants to merge 1 commit into SemiAnalysisAI:main from jiacao-amd:minimax-block16-tp8ep8-block32

Conversation

@jiacao-amd
Collaborator

Summary

  • default MiniMax MI355X vLLM runs use block size 16 with the shuffled KV cache layout enabled
  • special-case TP8/EP8 to disable shuffled KV cache layout and use block size 32
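
The two bullets above can be sketched as a small bash fragment. This is a hypothetical illustration, not the actual script: the variable names (`TP`, `EP`, `BLOCK_SIZE`) and the env var `VLLM_ROCM_USE_SHUFFLED_KV_CACHE` are assumptions standing in for whatever `minimaxm2.5_fp8_mi355x.sh` really uses.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the PR's special case. TP/EP, BLOCK_SIZE, and the
# env var name are illustrative assumptions, not the script's real identifiers.
TP="${TP:-8}"
EP="${EP:-8}"

if [[ "$TP" == "8" && "$EP" == "8" ]]; then
  # TP8/EP8: disable the shuffled KV cache layout, use block size 32
  export VLLM_ROCM_USE_SHUFFLED_KV_CACHE=0
  BLOCK_SIZE=32
else
  # default path: shuffled KV cache layout enabled, block size 16
  export VLLM_ROCM_USE_SHUFFLED_KV_CACHE=1
  BLOCK_SIZE=16
fi

echo "block_size=${BLOCK_SIZE} shuffled_kv=${VLLM_ROCM_USE_SHUFFLED_KV_CACHE}"
```

With the default `TP=8 EP=8`, this prints `block_size=32 shuffled_kv=0`; any other TP/EP combination falls through to the default block size 16 path.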

Testing

  • bash -n benchmarks/single_node/minimaxm2.5_fp8_mi355x.sh

Contributor

@claude Bot left a comment


Claude Code Review

This pull request is from a fork, so automated review is disabled. A repository maintainer can comment `@claude review` to run a one-time review.

@jiacao-amd jiacao-amd force-pushed the minimax-block16-tp8ep8-block32 branch from c01e0b6 to d66409b Compare April 29, 2026 16:41
@jiacao-amd
Collaborator Author

/sweep test-config --config-files .github/configs/amd-master.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm

@github-actions
Contributor

@jiacao-amd Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25121853688
Command: test-config --config-files .github/configs/amd-master.yaml --config-keys minimaxm2.5-fp8-mi355x-vllm
Pinned ref: d66409b
Approval: not required (trusted collaborator).


1 participant