Add B300 config: glm5-fp4-sglang #1058

Merged
functionstackx merged 2 commits into main from claude/add-glm5-fp4-b300-sglang
Apr 17, 2026

@functionstackx
Contributor

Summary

  • Add glm5-fp4-b300-sglang benchmark config and the corresponding benchmarks/single_node/glm5_fp4_b300.sh launch script
  • At the time of submission, the SGLang GLM-5 cookbook does not have a B300-specific recipe, so this reuses the existing GLM-5 FP4 (NVFP4) B200 SGLang recipe as-is until B300-specific tuning is available
  • Image: lmsysorg/sglang:v0.5.10.post1-cu130 (same as B200), runner: b300, same TP=4/8 and concurrency search-space as B200

Test plan

  • CI config validation passes
  • Run glm5-fp4-b300-sglang single-node benchmark on a B300 node and confirm server starts, benchmark completes, and result file is produced

🤖 Generated with Claude Code

@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that all GitHub Action jobs fully pass after merging. Failures are often just flakes, and simply re-running the failed jobs will fix them. If failed jobs are re-run, PR authors are responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Comment thread perf-changelog.yaml
- "Add GLM-5 FP4 (NVFP4) B300 SGLang benchmark"
- "Image: lmsysorg/sglang:v0.5.10.post1-cu130"
- "At the time of submission, https://cookbook.sglang.io/autoregressive/GLM/GLM-5 does not have a B300-specific recipe, so this reuses the existing GLM-5 FP4 B200 SGLang recipe as-is"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1058

🟡 The perf-changelog.yaml entry for glm5-fp4-b300-sglang has pr-link set to .../pull/XXXX instead of the actual PR number .../pull/1058. This should be corrected so the changelog permanently records the correct link.

Extended reasoning...

What the bug is and how it manifests

The perf-changelog.yaml entry for glm5-fp4-b300-sglang (the entry added by this PR) uses a placeholder value XXXX in the pr-link field instead of the actual PR number 1058. The field currently reads:

pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX

The specific code path that triggers it

The affected entry is the last entry in perf-changelog.yaml (line 1414), introduced by this PR. The PR diff itself shows the intended value as /pull/1058, but the code committed to main at commit 82d44b0 still contains the placeholder /XXXX.

Why existing code does not prevent it

There is no automated validation that checks perf-changelog.yaml pr-link fields for unreplaced placeholders like XXXX. The CI config validation would pass since the YAML is syntactically valid — the placeholder is just a wrong URL string.
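A check along these lines could close that gap. This is only a minimal sketch: the function name `find_bad_pr_links` is hypothetical, and it assumes every entry stores its link on a single unquoted line of the form `pr-link: <url>`, as in the excerpt above. It scans the raw text with regular expressions rather than parsing the YAML, so it needs no third-party packages.

```python
import re

# Assumption: each changelog entry has one unquoted line "pr-link: <url>".
PR_LINK_RE = re.compile(r"^\s*pr-link:\s*(\S+)\s*$")
# A well-formed link is a GitHub pull-request URL ending in a numeric id.
VALID_URL_RE = re.compile(r"^https://github\.com/[^/\s]+/[^/\s]+/pull/\d+$")


def find_bad_pr_links(text):
    """Return (line_number, url) pairs whose pr-link is not a numeric PR URL."""
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = PR_LINK_RE.match(line)
        if match and not VALID_URL_RE.match(match.group(1)):
            bad.append((lineno, match.group(1)))
    return bad


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python check_pr_links.py perf-changelog.yaml
    with open(sys.argv[1], encoding="utf-8") as handle:
        problems = find_bad_pr_links(handle.read())
    for lineno, url in problems:
        print(f"line {lineno}: suspicious pr-link {url}")
    sys.exit(1 if problems else 0)
```

Run as a CI step, a non-zero exit code would fail the job whenever a placeholder such as `XXXX` is left in place. Quoted URLs or multi-line values would need a real YAML parser instead.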

Impact

The changelog will permanently record a broken link for this entry. Anyone referencing the changelog to find the PR that introduced the glm5-fp4-b300-sglang config will be directed to a non-existent GitHub URL instead of PR #1058.

How to fix it

Replace XXXX with 1058 in the pr-link field:

pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1058
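If the placeholder is meant to be filled in once the PR number is known, the substitution could also be scripted. The sketch below is illustrative only: `fill_pr_number` is a hypothetical helper, not something in the repository, and it assumes the placeholder is the literal string `XXXX` at the end of an unquoted `pr-link:` line.

```python
import re

# Assumption: the placeholder is the literal "XXXX" ending a pr-link line.
# [ \t]* is used instead of \s* so the pattern never swallows newlines.
PLACEHOLDER_RE = re.compile(
    r"^([ \t]*pr-link:[ \t]*\S+/pull/)XXXX[ \t]*$", re.MULTILINE
)


def fill_pr_number(text, pr_number):
    """Replace the XXXX placeholder in pr-link lines with the real PR number."""
    # A function replacement avoids ambiguity between the group reference
    # and the digits of the PR number in the substitution string.
    return PLACEHOLDER_RE.sub(lambda m: f"{m.group(1)}{pr_number}", text)


before = "  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX\n"
print(fill_pr_number(before, 1058), end="")
```

Lines that already carry a numeric PR id do not match the pattern, so running the helper twice leaves the file unchanged.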

Step-by-step proof

  1. This PR is #1058 (Add B300 config: glm5-fp4-sglang), as shown in the PR metadata.
  2. The PR diff (hunk for perf-changelog.yaml) shows the added line as + pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1058.
  3. However, the current file at HEAD (commit 82d44b0) shows the last entry's pr-link as https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX.
  4. The comparable dsr1-fp4-b300-sglang entry directly above (from PR #1049, Add B300 config: dsr1-fp4-sglang (non-MTP)) correctly reads /pull/1049.
  5. Conclusion: the placeholder XXXX was not replaced before the commit landed on main.

functionstackx and others added 2 commits April 17, 2026 07:24
At the time of submission, the SGLang GLM-5 cookbook
(https://cookbook.sglang.io/autoregressive/GLM/GLM-5) does not have a
B300-specific recipe, so this config reuses the existing GLM-5 FP4
(NVFP4) B200 SGLang recipe as-is until B300-specific tuning is available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
functionstackx force-pushed the claude/add-glm5-fp4-b300-sglang branch from 489db28 to be34819 on April 17, 2026 11:24
functionstackx merged commit 37120e6 into main on Apr 17, 2026
16 checks passed
functionstackx deleted the claude/add-glm5-fp4-b300-sglang branch on April 17, 2026 11:24