[AMD] Use upstream SGLang images on mi300, mi325 and mi355 for dsr1fp8 (#307)#332

Merged
cquil11 merged 15 commits into main from rkarhila/update_images_for_dsr1fp8_MI300_MI325_MI355 on Jan 5, 2026

@rkarhila-amd rkarhila-amd commented Dec 15, 2025

  • Switching dsr1 fp8 to lmsys 0.5.5post3 images for MI355, MI325 and MI300

Note

Switch to upstream SGLang images for dsr1 FP8 on MI300/325/355

  • Update image tags in amd-master.yaml to lmsysorg/sglang:v0.5.5.post3-rocm700-* for dsr1-fp8-mi300x/mi325x/mi355x-sglang
  • Add benchmarks/dsr1_fp8_mi355x_{docker,slurm}.sh with SGLANG_USE_AITER=1, RCCL_MSCCL_ENABLE=0, ROCM_QUICK_REDUCE_QUANTIZATION=INT4, and server flags --attention-backend aiter, --enable-torch-compile, --cuda-graph-max-bs 128
  • Append corresponding entry to perf-changelog.yaml

Written by Cursor Bugbot for commit 1324e9e. This will update automatically on new commits.
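
As a rough illustration of the settings listed in the note above, the following sketch sets the three environment variables and assembles the three server flags it names. The `python3 -m sglang.launch_server` entry point, the `--model-path` flag usage, and the `MODEL` placeholder are assumptions made for illustration; the actual contents of benchmarks/dsr1_fp8_mi355x_{docker,slurm}.sh are not shown in this conversation.

```shell
#!/bin/sh
# Sketch only. The three exports and the three server flags are the ones named
# in the PR summary; MODEL and its default value are hypothetical placeholders.
export SGLANG_USE_AITER=1
export RCCL_MSCCL_ENABLE=0
export ROCM_QUICK_REDUCE_QUANTIZATION=INT4

MODEL="${MODEL:-/path/to/model}"   # hypothetical; point this at the real model

SERVER_FLAGS="--attention-backend aiter --enable-torch-compile --cuda-graph-max-bs 128"

# Print the command instead of running it, so the sketch is safe to execute
# on a machine without SGLang or a GPU.
echo "python3 -m sglang.launch_server --model-path $MODEL $SERVER_FLAGS"
```

Printing the command rather than launching it keeps the sketch side-effect free; a real benchmark script would run the server and pass further arguments.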

* Switching dsr1 fp8 to lmsys images for MI355, MI325 and MI300

---------

Co-authored-by: Chun Fang <chun.fang@amd.com>

cquil11 commented Dec 15, 2025

Reminder:

PR 267 has been merged. With this, sweeps will no longer run nightly; instead, they will run only when necessary, as indicated by the perf-changelog.yaml file at the root of the repo. Going forward, when developers make changes to configs that have a performance impact, they must note the change in perf-changelog.yaml and give a brief description of it. Once their PR is ready for review, they can add the sweep-enabled label to trigger a test sweep on their local branch. Once everything looks good, they can merge to main and an official sweep will be run for the specified configs.

So for this PR, you will add something like the following entry to the bottom of perf-changelog.yaml:

- config-keys:
    - dsr1-fp8-mi300x-sglang
    - dsr1-fp8-mi325x-sglang
    - dsr1-fp8-mi355x-sglang
  description: |
    - Use upstream SGLang images on mi300, mi325 and mi355 for dsr1fp8
    PR: https://github.com/InferenceMAX/InferenceMAX/pull/332

Then, after marking the PR ready for review, add the sweep-enabled label to it to run a test sweep. After the test sweep is done, please link the run in your PR description.

@cquil11 cquil11 moved this to In Progress in InferenceMAX Board Dec 15, 2025

cquil11 commented Dec 22, 2025

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request updates the SGLang Docker images for dsr1-fp8 benchmarks on MI300, MI325, and MI355 to a newer upstream version. It also adjusts the benchmark scripts for MI355x to include new environment variables and server arguments required by the updated image.

My review has identified a couple of issues:

  • The benchmark scripts for MI300x and MI325x have not been updated with the new required settings, which will likely cause those benchmarks to fail. This is a high-priority issue.
  • The changelog entry contains an incorrect pull request number.

Please see the detailed comments for suggestions on how to address these points.

Comment thread .github/configs/amd-master.yaml
Comment thread .github/configs/amd-master.yaml
Comment thread perf-changelog.yaml Outdated
    - dsr1-fp8-mi355x-sglang
  description: |
    - Use upstream SGLang images on mi300, mi325 and mi355 for dsr1fp8
    PR: https://github.com/InferenceMAX/InferenceMAX/pull/332

medium

The PR number in the link appears to be incorrect. This pull request is #307, but the link points to #332. Please update it to ensure the changelog is accurate.

    PR: https://github.com/InferenceMAX/InferenceMAX/pull/307


cquil11 commented Dec 30, 2025

@rkarhila-amd is this ready to ship? tests pass. please see code review comments
cc @chunfangamd @ppanchal-1

@cquil11 cquil11 left a comment

lgtm

Comment thread .github/configs/amd-master.yaml

@cquil11 cquil11 merged commit 70c1ece into main Jan 5, 2026
13 of 50 checks passed
@cquil11 cquil11 deleted the rkarhila/update_images_for_dsr1fp8_MI300_MI325_MI355 branch January 5, 2026 15:42
@github-project-automation github-project-automation Bot moved this from In Progress to Done in InferenceMAX Board Jan 5, 2026
--mem-fraction-static 0.8 --disable-radix-cache \
--num-continuous-decode-steps 4 \
--max-prefill-tokens 196608 \
--enable-torch-compile \

MI300x and MI325x scripts missing flags for new image

The PR updates the SGLang image from v0.5.2 to v0.5.5.post3 for all three platforms (mi300x, mi325x, mi355x), but only the mi355x benchmark scripts were updated with the new flags (--attention-backend aiter, --enable-torch-compile, RCCL_MSCCL_ENABLE=0, ROCM_QUICK_REDUCE_QUANTIZATION=INT4). The existing dsr1_fp8_mi300x_*.sh and dsr1_fp8_mi325x_*.sh scripts lack these flags despite also receiving the new image version. This inconsistency may cause mi300x and mi325x benchmarks to fail or produce suboptimal results with the new upstream image.
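
One way to check the inconsistency described above is to scan each benchmark script for the four settings the comment lists. This is a minimal sketch: the `missing_settings` helper is hypothetical, and the `benchmarks/` location of the mi300x and mi325x scripts is assumed from the mi355x paths in the PR summary.

```shell
#!/bin/sh
# Print every expected setting that the given script does not mention.
# missing_settings is a hypothetical helper, not part of the repository.
missing_settings() {
  script="$1"
  for setting in "--attention-backend aiter" "--enable-torch-compile" \
                 "RCCL_MSCCL_ENABLE=0" "ROCM_QUICK_REDUCE_QUANTIZATION=INT4"; do
    # -F matches the text literally; -- stops grep from parsing "--..." as an option.
    grep -qF -- "$setting" "$script" || printf '%s: missing %s\n' "$script" "$setting"
  done
}

# Assumed layout: all dsr1 fp8 scripts live under benchmarks/.
for f in benchmarks/dsr1_fp8_mi3*x_*.sh; do
  [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
  missing_settings "$f"
done
```

A script that prints nothing mentions all four settings. The check is textual only, so it cannot tell whether a setting is actually required on a given platform.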

@cquil11 cquil11 changed the title from "Use upstream SGLang images on mi300, mi325 and mi355 for dsr1fp8 (#307)" to "[AMD] Use upstream SGLang images on mi300, mi325 and mi355 for dsr1fp8 (#307)" on Apr 8, 2026