[AMD] Use upstream SGLang images on mi300, mi325 and mi355 for dsr1fp8 (#307) by rkarhila-amd · Pull Request #332 · SemiAnalysisAI/InferenceX

rkarhila-amd · 2025-12-15T15:39:27Z

Switching dsr1 fp8 to the lmsysorg/sglang v0.5.5.post3 images for MI355, MI325 and MI300

Note

Switch to upstream SGLang images for dsr1 FP8 on MI300/325/355

- Update image tags in amd-master.yaml to lmsysorg/sglang:v0.5.5.post3-rocm700-* for dsr1-fp8-mi300x/mi325x/mi355x-sglang
- Add benchmarks/dsr1_fp8_mi355x_{docker,slurm}.sh with SGLANG_USE_AITER=1, RCCL_MSCCL_ENABLE=0, ROCM_QUICK_REDUCE_QUANTIZATION=INT4, and server flags --attention-backend aiter, --enable-torch-compile, --cuda-graph-max-bs 128
- Append corresponding entry to perf-changelog.yaml

Written by Cursor Bugbot for commit 1324e9e. This will update automatically on new commits.
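
For orientation, the settings named in the note above could be combined roughly as follows. This is only a sketch: the `python3 -m sglang.launch_server` entry point is the usual SGLang launcher, but the `build_launch_cmd` helper and the `placeholder/model` path are assumptions made for illustration, and the actual benchmarks/dsr1_fp8_mi355x_{docker,slurm}.sh scripts may be organized differently. The command is printed rather than executed.

```shell
#!/bin/sh
# Sketch only: build_launch_cmd and the model placeholder are hypothetical,
# not copied from the repository's benchmark scripts.

# Environment variables listed in the PR summary.
export SGLANG_USE_AITER=1
export RCCL_MSCCL_ENABLE=0
export ROCM_QUICK_REDUCE_QUANTIZATION=INT4

build_launch_cmd() {
    # $1: model path (placeholder supplied by the caller)
    # echo joins its arguments with single spaces, giving one command line.
    echo "python3 -m sglang.launch_server" \
        "--model-path $1" \
        "--attention-backend aiter" \
        "--enable-torch-compile" \
        "--cuda-graph-max-bs 128"
}

# Print the command instead of running it, so the sketch is safe on any machine.
build_launch_cmd "${MODEL:-placeholder/model}"
```

Printing the command first makes it easy to compare the MI355x settings against what the MI300x and MI325x scripts pass.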

* Switching dsr1 fp8 to lmsys images for MI355, MI325 and MI300

---------

Co-authored-by: Chun Fang <chun.fang@amd.com>

chatgpt-codex-connector · 2025-12-15T15:39:31Z

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

cquil11 · 2025-12-15T16:09:53Z

Reminder:

PR 267 has been merged. With this, sweeps will no longer run nightly; instead, they will run only when necessary, as indicated by the perf-changelog.yaml file at the root of the repo. Going forward, when developers make changes to configs that have a performance impact, they must note that change in perf-changelog.yaml and give a brief description of the changes. Once their PR is ready for review, they can add the sweep-enabled label to trigger a test sweep on their local branch. Once everything looks good, they can merge to main and an official sweep will be run for the specified configs.

So for this PR, you will add something like the following entry to the bottom of perf-changelog.yaml:

- config-keys:
    - dsr1-fp8-mi300x-sglang
    - dsr1-fp8-mi325x-sglang
    - dsr1-fp8-mi355x-sglang
  description: |
    - Use upstream SGLang images on mi300, mi325 and mi355 for dsr1fp8
    PR: https://github.com/InferenceMAX/InferenceMAX/pull/332

Then add the sweep-enabled tag to the PR after marking it ready for review to run a test sweep. After the test sweep is done, please link the run in your PR description.

cquil11 · 2025-12-22T23:53:36Z

/gemini review

gemini-code-assist

Code Review

This pull request updates the SGLang Docker images for dsr1-fp8 benchmarks on MI300, MI325, and MI355 to a newer upstream version. It also adjusts the benchmark scripts for MI355x to include new environment variables and server arguments required by the updated image.

My review has identified a couple of issues:

- The benchmark scripts for MI300x and MI325x have not been updated with the new required settings, which will likely cause those benchmarks to fail. This is a high-priority issue.
- The changelog entry contains an incorrect pull request number.

Please see the detailed comments for suggestions on how to address these points.

gemini-code-assist · 2025-12-22T23:55:02Z

+    - dsr1-fp8-mi355x-sglang
+  description: |
+    - Use upstream SGLang images on mi300, mi325 and mi355 for dsr1fp8
+    PR: https://github.com/InferenceMAX/InferenceMAX/pull/332


The PR number in the link appears to be incorrect. This pull request is #307, but the link points to #332. Please update it to ensure the changelog is accurate.

PR: https://github.com/InferenceMAX/InferenceMAX/pull/307
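
A check like this could be scripted. The sketch below assumes the changelog keeps the `PR: <url>/pull/<number>` line format from the example entry earlier in this thread and that new entries are appended at the bottom; the helper names `pr_number_from_changelog` and `check_pr_link` are hypothetical. The expected number is supplied by the caller, so the script only reports a mismatch and does not decide which number is right.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and the "PR:" line format is
# assumed from the example changelog entry quoted in this thread.

# Print the PR number from the last "PR: .../pull/<n>" line read on stdin.
pr_number_from_changelog() {
    sed -n 's|^[[:space:]]*PR:.*/pull/\([0-9][0-9]*\)[[:space:]]*$|\1|p' | tail -n 1
}

# $1: changelog file, $2: expected PR number
check_pr_link() {
    actual="$(pr_number_from_changelog < "$1")"
    if [ "$actual" = "$2" ]; then
        echo "ok: last entry links to #$2"
    else
        echo "mismatch: last entry links to #${actual:-none}, expected #$2"
        return 1
    fi
}

# Example invocation (not run here):
#   check_pr_link perf-changelog.yaml 332
```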

cquil11 · 2025-12-30T23:11:47Z

@rkarhila-amd is this ready to ship? tests pass. please see code review comments
cc @chunfangamd @ppanchal-1

cquil11

lgtm

Updated the pull request link for dsr1fp8 in the changelog.

functionstackx · 2026-01-05T15:40:38Z

@rkarhila-amd @cquil11 ready to merge? successful validation here https://github.com/InferenceMAX/InferenceMAX/actions/runs/20609406656

cursor · 2026-01-05T15:43:23Z

    --mem-fraction-static 0.8 --disable-radix-cache \
    --num-continuous-decode-steps 4 \
    --max-prefill-tokens 196608 \
+    --enable-torch-compile \


MI300x and MI325x scripts missing flags for new image

The PR updates the SGLang image from v0.5.2 to v0.5.5.post3 for all three platforms (mi300x, mi325x, mi355x), but only the mi355x benchmark scripts were updated with the new flags (--attention-backend aiter, --enable-torch-compile, RCCL_MSCCL_ENABLE=0, ROCM_QUICK_REDUCE_QUANTIZATION=INT4). The existing dsr1_fp8_mi300x_*.sh and dsr1_fp8_mi325x_*.sh scripts lack these flags despite also receiving the new image version. This inconsistency may cause mi300x and mi325x benchmarks to fail or produce suboptimal results with the new upstream image.

Additional Locations (1)

benchmarks/dsr1_fp8_mi355x_slurm.sh#L14-L34
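
The inconsistency described above can be surfaced with a simple text scan. The following is a minimal sketch, assuming the four settings named in this comment are the ones to look for; the `missing_settings` helper is hypothetical. It only reports whether each string appears in a script and says nothing about whether MI300x or MI325x actually require it.

```shell
#!/bin/sh
# Sketch only: missing_settings is a hypothetical helper. The settings list is
# taken from the review comment, not from SGLang documentation.

# Print every setting from the list that does not appear in the file $1.
missing_settings() {
    for setting in \
        "--attention-backend aiter" \
        "--enable-torch-compile" \
        "RCCL_MSCCL_ENABLE=0" \
        "ROCM_QUICK_REDUCE_QUANTIZATION=INT4"
    do
        # -F: match a fixed string; -e: treat the next argument as the pattern
        # even though it may start with "--".
        grep -q -F -e "$setting" "$1" || echo "$setting"
    done
}

# Scan the MI300x and MI325x scripts if they exist in the working tree.
for script in benchmarks/dsr1_fp8_mi300x_*.sh benchmarks/dsr1_fp8_mi325x_*.sh; do
    [ -f "$script" ] || continue   # skip globs that matched nothing
    missing="$(missing_settings "$script")"
    [ -z "$missing" ] || printf '%s is missing:\n%s\n' "$script" "$missing"
done
```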

Use upstream SGLang images on mi300, mi325 and mi355 for dsr1fp8 (#307)

20ca1bc

* Switching dsr1 fp8 to lmsys images for MI355, MI325 and MI300

---------

Co-authored-by: Chun Fang <chun.fang@amd.com>

rkarhila-amd requested a review from a team as a code owner December 15, 2025 15:39

github-project-automation Bot added this to InferenceMAX Board Dec 15, 2025

Merge branch 'main' into rkarhila/update_images_for_dsr1fp8_MI300_MI325_MI355

464b397

cquil11 added 2 commits December 15, 2025 10:16

Merge branch 'main' into rkarhila/update_images_for_dsr1fp8_MI300_MI325_MI355

41deedd

Merge branch 'main' into rkarhila/update_images_for_dsr1fp8_MI300_MI325_MI355

106270c

cquil11 moved this to In Progress in InferenceMAX Board Dec 15, 2025

add changes to perf changelog

e5a824f

cquil11 added AMD sweep-enabled labels Dec 15, 2025

cquil11 temporarily deployed to fork-pr-validation December 15, 2025 22:20 — with GitHub Actions Inactive

Merge branch 'main' into rkarhila/update_images_for_dsr1fp8_MI300_MI325_MI355

1729ddd

gemini-code-assist Bot reviewed Dec 22, 2025

cquil11 added 4 commits December 30, 2025 09:03

Merge branch 'main' into rkarhila/update_images_for_dsr1fp8_MI300_MI325_MI355

1912115

update perf changelog

aae7cde

update perf changelog pt 2

2d0e88f

update perf changelog pt 3

70f4b90

cquil11 approved these changes Dec 30, 2025

Merge branch 'main' into rkarhila/update_images_for_dsr1fp8_MI300_MI325_MI355

226b40c

cursor Bot reviewed Dec 30, 2025

Comment thread .github/configs/amd-master.yaml

cquil11 and others added 4 commits December 30, 2025 19:00

Merge branch 'main' into rkarhila/update_images_for_dsr1fp8_MI300_MI325_MI355

55e53b2

Merge branch 'main' into rkarhila/update_images_for_dsr1fp8_MI300_MI325_MI355

6030525

remove deleted links in .yaml

3b1299c

fix AMD's perf-changelog.yaml

1324e9e

Updated the pull request link for dsr1fp8 in the changelog.

functionstackx approved these changes Jan 5, 2026

cquil11 merged commit 70c1ece into main Jan 5, 2026
13 of 50 checks passed

cquil11 deleted the rkarhila/update_images_for_dsr1fp8_MI300_MI325_MI355 branch January 5, 2026 15:42

github-project-automation Bot moved this from In Progress to Done in InferenceMAX Board Jan 5, 2026

cursor Bot reviewed Jan 5, 2026

This was referenced Jan 5, 2026

AMD needs to use upstream SGLang image for MI300X #291

Closed

AMD needs to use upstream SGLang image for MI325X #292

Closed

cquil11 changed the title ~~Use upstream SGLang images on mi300, mi325 and mi355 for dsr1fp8 (#307)~~ [AMD] Use upstream SGLang images on mi300, mi325 and mi355 for dsr1fp8 (#307) Apr 8, 2026

cquil11 merged 15 commits into main from rkarhila/update_images_for_dsr1fp8_MI300_MI325_MI355

3 participants
