Add B300 config: minimaxm2.5-fp4-vllm#1055
Conversation
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR there first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you. PR authors are responsible for ensuring that all GitHub Action jobs fully pass after merging. Much of the time, failures are just flakes, and simply re-running the failed jobs will fix them. If re-running failed jobs is attempted, PR authors are responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow If additional help is needed, PR authors can reach out to core maintainers over Slack.
LGTM — straightforward B300 config addition following the same pattern as recent PRs #1048 and #1049.
Overview
This PR adds a new minimaxm2.5-fp4-b300-vllm benchmark configuration: an entry in .github/configs/nvidia-master.yaml, a new single-node launch script benchmarks/single_node/minimaxm2.5_fp4_b300.sh, and a perf-changelog.yaml entry. The changes are mechanical and self-contained, directly mirroring the B200 config with only the runner changed to b300.
Security risks
None. These are benchmark configuration and shell script additions with no auth, crypto, or permission-sensitive code paths.
Level of scrutiny
Low. This is a config/script addition following an established, well-worn pattern. The immediately preceding PRs (#1048 for qwen3.5-fp8-b300-sglang, #1049 for dsr1-fp4-b300-sglang) have identical structure and were merged without issues.
Other factors
The only finding is a nit already flagged as an inline comment: the pr-link field in perf-changelog.yaml contains the placeholder XXXX instead of the actual PR number 1055. This is a cosmetic issue — the same pattern already exists in 7 other entries in the file — and does not affect benchmark functionality. CI config validation is still pending per the test plan checklist, but that is expected for a new submission.
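The entry's exact schema is not shown in the excerpt above, but the flagged fix amounts to a one-line change along these lines (the `config` field name is assumed for illustration; only `pr-link` is confirmed by the review):

```yaml
- config: minimaxm2.5-fp4-b300-vllm
  # before: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1055
```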
- "At the time of submission, https://docs.vllm.ai/projects/recipes/en/latest/MiniMax/MiniMax-M2.html does not have a B300-specific recipe, so this reuses the existing MiniMax-M2.5 FP4 B200 vLLM recipe as-is"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1055
🟡 The new minimaxm2.5-fp4-b300-vllm entry in perf-changelog.yaml has a placeholder pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX that was never replaced with the actual PR number. Since this PR is #1055, the link should be updated to pull/1055.
Bug description: The perf-changelog.yaml entry added by this PR for the minimaxm2.5-fp4-b300-vllm config (lines 1413-1414) contains a template placeholder XXXX in the pr-link field rather than the actual PR number, resulting in a broken link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX.
Code path: The diff clearly shows the new entry at the bottom of perf-changelog.yaml includes pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX. The PR number (1055) is known at submission time — it is literally the number of this pull request.
Why existing code doesn't prevent it: There is no automated validation step that checks perf-changelog.yaml entries for placeholder patterns like XXXX. The surrounding entries demonstrate the correct pattern: the immediately preceding entry for dsr1-fp4-b300-sglang (PR #1049) correctly uses pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1049, and similarly for qwen3.5-fp8-b300-sglang (PR #1048).
Impact: The broken link makes it impossible to trace the changelog entry back to the originating PR. Anyone reading the changelog to understand when or why the minimaxm2.5-fp4-b300-vllm config was added will be unable to follow the reference. There are already 7 pre-existing XXX placeholder entries in the file (lines 12, 19, 315, 790, 818, 855, 872), and this PR introduces another instance, continuing a pattern that reduces changelog utility.
Fix: Replace pull/XXXX with pull/1055 on the pr-link line of the new minimaxm2.5-fp4-b300-vllm entry.
Step-by-step proof: (1) This PR is numbered #1055 — visible in the PR metadata. (2) The diff adds a new perf-changelog.yaml entry with pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX. (3) After merge, clicking that link leads to a 404, since pull/XXXX is not a valid PR number. (4) The correct value https://github.com/SemiAnalysisAI/InferenceX/pull/1055 would link directly to this PR. The fix is a one-word substitution.
Addressing the duplicate refutation: A separate bug_003 was reported describing the identical issue. Whether or not bug_003 is processed, this report independently identifies the same real defect; the fix remains the same regardless of which report is acted upon.
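The report notes that no automated validation step catches placeholder patterns like XXXX. A minimal sketch of such a CI guard is below; the function name, the sample data, and the decision to match runs of three or more X characters (the file reportedly contains both XXX and XXXX placeholders) are assumptions for illustration, not project code:

```python
import re

# Matches pull/XXX, pull/XXXX, etc. — runs of 3+ literal X characters.
PLACEHOLDER_RE = re.compile(r"pull/X{3,}")

def find_placeholder_links(text: str) -> list[int]:
    """Return 1-based line numbers whose pr-link still holds a placeholder."""
    return [i for i, line in enumerate(text.splitlines(), 1)
            if PLACEHOLDER_RE.search(line)]

# Hypothetical excerpt mimicking the tail of perf-changelog.yaml.
sample = """- config: dsr1-fp4-b300-sglang
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1049
- config: minimaxm2.5-fp4-b300-vllm
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX
"""
print(find_placeholder_links(sample))  # → [4]
```

Wired into CI, a non-empty result would fail the job and force authors to fill in the real PR number before merge.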
At the time of submission, the vLLM MiniMax-M2 recipes page (https://docs.vllm.ai/projects/recipes/en/latest/MiniMax/MiniMax-M2.html) does not have a B300-specific recipe, so this config reuses the existing MiniMax-M2.5 FP4 (NVFP4) B200 vLLM recipe as-is until B300-specific tuning is available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4dfd451 to f1bdccf
Summary
- Adds the minimaxm2.5-fp4-b300-vllm benchmark config and the corresponding benchmarks/single_node/minimaxm2.5_fp4_b300.sh launch script
- Image vllm/vllm-openai:v0.19.0-cu130 (same as B200), runner: b300, same TP/EP/dp-attn/concurrency search space as B200

Test plan
- Run the minimaxm2.5-fp4-b300-vllm single-node benchmark on a B300 node and confirm the server starts, the benchmark completes, and a result file is produced

🤖 Generated with Claude Code