sglang-update by yhyang201 · Pull Request #1225 · SemiAnalysisAI/InferenceX

yhyang201 · 2026-04-29T14:33:07Z

sglang-update

github-actions · 2026-04-29T14:33:17Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

github-actions · 2026-04-29T14:33:18Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

claude · 2026-04-29T14:40:37Z

    - { tp: 4, ep: 1, conc-start: 32, conc-end: 32 }
    - { tp: 4, ep: 4, dp-attn: true, conc-start: 512, conc-end: 512 }
    - { tp: 8, ep: 8, dp-attn: true, conc-start: 8192, conc-end: 8192 }
+    - { tp: 8, ep: 8, dp-attn: true, conc-start: 12288, conc-end: 12288 }


🔴 This PR adds a new conc=12288 recipe to dsv4-fp4-b300-sglang (new search-space row in nvidia-master.yaml plus matching CONC=12288 branch in dsv4_fp4_b300_sglang.sh) but does not add a corresponding entry to perf-changelog.yaml. Per AGENTS.md, perf-changelog.yaml is the documented trigger for benchmark runs on push-to-main, so without an entry the new recipe will not actually run. Please append a perf-changelog.yaml entry for dsv4-fp4-b300-sglang describing the new conc=12288 recipe and linking to this PR (mirroring the predecessor PR #1209 which added the conc=8192 entry).

Extended reasoning...

What the bug is and how it manifests

AGENTS.md is explicit that adding a new benchmark configuration requires updating perf-changelog.yaml. The section "Adding a New Benchmark Configuration" (AGENTS.md:192-196) lists "Add corresponding entry to perf-changelog.yaml to trigger benchmark" as step 2, the "Updating Docker Images" section (AGENTS.md:296) marks it as a MUST, and AGENTS.md:174 confirms "Changes to perf-changelog.yaml trigger benchmark runs." This PR adds a brand-new search-space row { tp: 8, ep: 8, dp-attn: true, conc-start: 12288, conc-end: 12288 } at .github/configs/nvidia-master.yaml:1886 plus the matching CONC=12288 branch in benchmarks/single_node/dsv4_fp4_b300_sglang.sh (with new tunings: CUDA_GRAPH_MAX_BS=1600, MAX_RUNNING_REQUESTS=12288, MEM_FRACTION_STATIC=0.72, SGLANG_OPT_USE_ONLINE_COMPRESS=1, etc.), but the changeset only touches those two files — perf-changelog.yaml is untouched.

The specific code path that triggers it

Per AGENTS.md, the sweep workflow on push-to-main keys off entries appended to perf-changelog.yaml to determine which configurations to (re)run. With no new changelog entry referencing dsv4-fp4-b300-sglang and the conc=12288 recipe, the workflow has no signal that there is anything new to benchmark, so the freshly added search-space row and script branch will sit dormant.

Why existing code doesn't prevent it

The repo does not have an automated check that cross-references new search-space rows in nvidia-master.yaml against perf-changelog.yaml entries. Compliance is enforced by convention/documentation (AGENTS.md) and prior practice. The directly analogous predecessor PR #1209 (commit a3330bb), which added the conc=8192 recipe to this exact same dsv4-fp4-b300-sglang config, did append a perf-changelog.yaml block (currently the last entry in the file) describing the new conc=8192 recipe and linking the PR — establishing the expected pattern that this PR omits.

Impact

The whole point of this PR is to start collecting performance numbers for the conc=12288 recipe. Without the perf-changelog.yaml entry, the recipe will not run on push-to-main and no benchmark numbers will be collected, defeating the purpose of the change. Users discovering the gap later will need a follow-up PR.

How to fix it

Append a new entry to perf-changelog.yaml (after the last entry, which is the conc=8192 entry from PR #1209) following the same shape: config-keys: [dsv4-fp4-b300-sglang], a short description such as "Add conc=12288 recipe for 1k1k on dsv4-fp4-b300-sglang", and a pr-link to PR #1225.

Step-by-step proof

Inspect this PR's changeset: git show 38d849d --stat shows only two files touched — .github/configs/nvidia-master.yaml (+1 line) and benchmarks/single_node/dsv4_fp4_b300_sglang.sh (+17/-3). perf-changelog.yaml is not in the changeset.

Inspect AGENTS.md:192-196 ("Adding a New Benchmark Configuration") — step 2 explicitly says "Add corresponding entry to perf-changelog.yaml to trigger benchmark."

Inspect AGENTS.md:296 — "MUST: Add an entry to perf-changelog.yaml."

Inspect AGENTS.md:174 — "Changes to perf-changelog.yaml trigger benchmark runs."

Inspect predecessor PR sglang-update #1209 (commit a3330bb), which added the conc=8192 row to the same config: it appended a corresponding perf-changelog.yaml entry (now the last block in the file) referencing dsv4-fp4-b300-sglang and the conc=8192 recipe.

Conclusion: the conc=12288 recipe added here will not be picked up on push-to-main because no new perf-changelog.yaml entry was added, and the analogous predecessor PR followed the documented protocol that this PR skips.

yhyang201 · 2026-04-29T14:41:47Z

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang

github-actions · 2026-04-29T14:41:58Z

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25115567759
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang
Pinned ref: 3002f78
Approval: not required (trusted collaborator).

yhyang201 · 2026-04-29T16:40:15Z

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang

github-actions · 2026-04-29T16:40:33Z

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25121580534
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang
Pinned ref: 3002f78
Approval: not required (trusted collaborator).

yhyang201 · 2026-04-29T18:14:56Z

hang, maybe crush

yhyang201 · 2026-04-29T18:15:04Z

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang

github-actions · 2026-04-29T18:15:21Z

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25126029879
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang
Pinned ref: 1489aae
Approval: not required (trusted collaborator).

yhyang201 · 2026-04-29T18:20:16Z

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang

github-actions · 2026-04-29T18:20:32Z

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25126267486
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang
Pinned ref: cec3e7e
Approval: not required (trusted collaborator).

sglang-update

38d849d

yhyang201 requested a review from a team April 29, 2026 14:33

yhyang201 requested review from jgangani and kedarpotdar-nv as code owners April 29, 2026 14:33

github-project-automation Bot added this to InferenceMAX Board Apr 29, 2026

sglang-update

3002f78

claude Bot reviewed Apr 29, 2026

View reviewed changes

yhyang201 added 3 commits April 30, 2026 01:58

sglang-update

5554de5

sglang-update

ef12f38

sglang-update

1489aae

sglang-update

cec3e7e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sglang-update#1225

sglang-update#1225
yhyang201 wants to merge 6 commits intomainfrom
sglang-update-conc12288

yhyang201 commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

claude Bot Apr 29, 2026

Uh oh!

yhyang201 commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

yhyang201 commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

yhyang201 commented Apr 29, 2026

Uh oh!

yhyang201 commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

yhyang201 commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

yhyang201 commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

claude Bot Apr 29, 2026

Choose a reason for hiding this comment

Uh oh!

yhyang201 commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

yhyang201 commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

yhyang201 commented Apr 29, 2026

Uh oh!

yhyang201 commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

yhyang201 commented Apr 29, 2026

Uh oh!

github-actions Bot commented Apr 29, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant