[AMD][MI30X] Update Qwen3.5 perf #986
Conversation
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook. If they are not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you
bump @zhentaocc
…ormance
- Added new config keys for Qwen3.5 BF16 and FP8 benchmarks on MI300X and MI325X.
- Updated Docker image to lmsysorg/sglang:v0.5.10rc0-rocm720-mi30x for better compatibility.
- Enhanced benchmark scripts with additional parameters for context length and prefill tokens.
- Adjusted memory fraction settings and added new flags for server launch to optimize performance.
…d FP8 configurations on MI300X and MI325X to streamline server launch commands.
…end instead of 'triton' for BF16 and FP8 configurations on MI300X and MI325X.
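The knobs touched in the commits above (context length, prefill tokens, memory fraction, attention backend) map onto `sglang.launch_server` flags. Below is a minimal sketch of such a launch command, assuming a single 8-GPU node; the model path and every flag value are illustrative placeholders, not the values actually used in this PR's benchmark configs.

```shell
# Hypothetical SGLang server launch on MI300X/MI325X (values are illustrative):
#   --context-length       caps the maximum sequence length per request
#   --chunked-prefill-size bounds prefill tokens processed per scheduler step
#   --mem-fraction-static  fraction of GPU memory reserved for weights + KV cache
#   --attention-backend    selects the attention kernel backend instead of 'triton'
python3 -m sglang.launch_server \
  --model-path <path-to-Qwen3.5-checkpoint> \
  --tp 8 \
  --context-length 32768 \
  --chunked-prefill-size 8192 \
  --mem-fraction-static 0.85 \
  --attention-backend aiter
```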
Force-pushed from e307c2d to 2c17191
…rsion for MI300X and MI325X setups. Changed image tag from 'v0.5.10rc0-rocm720-mi30x' to 'v0.5.10-rocm720-mi30x' for consistency and reliability.
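With the image pinned to the stable `v0.5.10-rocm720-mi30x` tag, a container can be brought up with the usual ROCm device flags. This is a hedged sketch of a typical ROCm `docker run` invocation, not the exact options used by this repo's CI.

```shell
# Hypothetical pull-and-run of the pinned image on a ROCm host.
# --device/--group-add expose the AMD GPU devices to the container;
# a large shared-memory segment is commonly needed for multi-GPU serving.
docker pull lmsysorg/sglang:v0.5.10-rocm720-mi30x
docker run -it --rm \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video \
  --ipc=host --shm-size 16g \
  lmsysorg/sglang:v0.5.10-rocm720-mi30x
```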
@chunfangamd @zhentaocc This looks good to me; what is the holdup?

e2e Test: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24170176668
Co-authored-by: @chunfangamd @1am9trash