
Add B300 config: minimaxm2.5-fp4-vllm #1055

Merged

functionstackx merged 2 commits into main from claude/add-minimaxm2.5-fp4-b300-vllm on Apr 17, 2026

Conversation

@functionstackx (Contributor)

Summary

  • Add minimaxm2.5-fp4-b300-vllm benchmark config and the corresponding benchmarks/single_node/minimaxm2.5_fp4_b300.sh launch script
  • At the time of submission, the vLLM MiniMax-M2 recipes page does not have a B300-specific recipe, so this reuses the existing MiniMax-M2.5 FP4 (NVFP4) B200 vLLM recipe as-is until B300-specific tuning is available
  • Image: vllm/vllm-openai:v0.19.0-cu130 (same as B200); runner: b300; same TP/EP/dp-attn/concurrency search space as B200. A rough sketch of the resulting launch command follows below.
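For context, here is a minimal sketch of what reusing the B200 vLLM recipe on a b300 runner could look like. Only the image tag comes from this PR; the model id, port, TP degree, and flag selection are illustrative assumptions, not values taken from the diff.

```bash
# Minimal sketch only; the real benchmarks/single_node/minimaxm2.5_fp4_b300.sh
# may differ. Model id, port, and TP degree below are assumptions.
IMAGE="vllm/vllm-openai:v0.19.0-cu130"   # same image as the B200 config
MODEL="MiniMaxAI/MiniMax-M2.5"           # hypothetical model id

# The vllm/vllm-openai image runs the OpenAI-compatible server as its
# entrypoint, so server flags are passed directly as container arguments.
docker run --rm --gpus all --ipc=host -p 8000:8000 "$IMAGE" \
  --model "$MODEL" \
  --tensor-parallel-size 8 \
  --enable-expert-parallel
```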

Test plan

  • CI config validation passes
  • Run the minimaxm2.5-fp4-b300-vllm single-node benchmark on a B300 node and confirm the server starts, the benchmark completes, and a result file is produced (see the sketch after this list)
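As a sketch, that second item could be verified with something like the following; the script path comes from this PR, while the results directory is an assumption about the harness layout.

```bash
# Launch the single-node benchmark on a B300 node (script path from this PR).
bash benchmarks/single_node/minimaxm2.5_fp4_b300.sh

# Confirm a result file was produced; the results/ location is an
# assumption and the harness may write elsewhere.
ls -lt results/ | head
```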

🤖 Generated with Claude Code

@github-actions (Contributor)

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR for that first before we merge your PR into the master branch. Let's ensure the documentation is first class so that the entire ML community can benefit from your hard work! Thank you.

PR authors are responsible for ensuring that all GitHub Actions jobs fully pass after merging. A lot of the time, failures are just flakes, and simply re-running the failed jobs will fix them. If re-running failed jobs is attempted, PR authors are responsible for ensuring the re-run passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow
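For reference, the same re-run can be triggered from the GitHub CLI instead of the web UI; this is a generic gh invocation, not part of this repository's tooling.

```bash
# Re-run only the failed jobs of a workflow run. Requires an authenticated
# gh CLI; find <run-id> with `gh run list`.
gh run rerun <run-id> --failed
```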

If additional help is needed, PR authors can reach out to core maintainers over Slack.

@claude (Bot) left a comment


LGTM — straightforward B300 config addition following the same pattern as recent PRs #1048 and #1049.

Extended reasoning

Overview

This PR adds a new minimaxm2.5-fp4-b300-vllm benchmark configuration by adding an entry to .github/configs/nvidia-master.yaml, a new single-node launch script benchmarks/single_node/minimaxm2.5_fp4_b300.sh, and a perf-changelog.yaml entry. The changes are mechanical and self-contained, directly mirroring the B200 config with only the runner changed to b300.

Security risks

None. These are benchmark configuration and shell script additions with no auth, crypto, or permission-sensitive code paths.

Level of scrutiny

Low. This is a config/script addition following an established, well-worn pattern. The immediately preceding PRs (#1048 for qwen3.5-fp8-b300-sglang, #1049 for dsr1-fp4-b300-sglang) have identical structure and were merged without issues.

Other factors

The only finding is a nit already flagged as an inline comment: the pr-link field in perf-changelog.yaml contains the placeholder XXXX instead of the actual PR number 1055. This is a cosmetic issue — the same pattern already exists in 7 other entries in the file — and does not affect benchmark functionality. CI config validation is still pending per the test plan checklist, but that is expected for a new submission.

Comment thread on perf-changelog.yaml, lines +1413 to +1414
- "At the time of submission, https://docs.vllm.ai/projects/recipes/en/latest/MiniMax/MiniMax-M2.html does not have a B300-specific recipe, so this reuses the existing MiniMax-M2.5 FP4 B200 vLLM recipe as-is"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1055

🟡 The new minimaxm2.5-fp4-b300-vllm entry in perf-changelog.yaml has a placeholder pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX that was never replaced with the actual PR number. Since this PR is #1055, the link should be updated to pull/1055.

Extended reasoning

Bug description: The perf-changelog.yaml entry added by this PR for the minimaxm2.5-fp4-b300-vllm config (lines 1413-1414) contains a template placeholder XXXX in the pr-link field rather than the actual PR number, resulting in a broken link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX.

Code path: The diff clearly shows the new entry at the bottom of perf-changelog.yaml includes pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX. The PR number (1055) is known at submission time — it is literally the number of this pull request.

Why existing code doesn't prevent it: There is no automated validation step that checks perf-changelog.yaml entries for placeholder patterns like XXXX. The surrounding entries demonstrate the correct pattern: the immediately preceding entry for dsr1-fp4-b300-sglang (PR #1049) correctly uses pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1049, and similarly for qwen3.5-fp8-b300-sglang (PR #1048).

Impact: The broken link makes it impossible to trace the changelog entry back to the originating PR. Anyone reading the changelog to understand when or why the minimaxm2.5-fp4-b300-vllm config was added will be unable to follow the reference. There are already 7 pre-existing XXX placeholder entries in the file (lines 12, 19, 315, 790, 818, 855, 872), and this PR introduces another instance, continuing a pattern that reduces changelog utility.

Fix: Replace pull/XXXX with pull/1055 on the pr-link line of the new minimaxm2.5-fp4-b300-vllm entry.
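A minimal sketch of applying that fix, assuming perf-changelog.yaml sits at the repository root; editing only the new line leaves the seven pre-existing placeholder entries noted above untouched.

```bash
# Locate the placeholder introduced by this PR.
grep -n 'pull/XXXX' perf-changelog.yaml

# Then change that one pr-link line by hand to:
#   pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1055
```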

Step-by-step proof:

  1. This PR is numbered #1055, visible in the PR metadata.
  2. The diff adds a new perf-changelog.yaml entry with pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX.
  3. At merge time, clicking that link leads to a 404.
  4. The correct value, https://github.com/SemiAnalysisAI/InferenceX/pull/1055, links directly to this PR; the fix is a one-word substitution.

Duplicate note: a separate report (bug_003) describes the identical issue. Whether or not that report is processed, this one independently identifies the same real defect, and the fix is the same regardless of which report is acted upon.

functionstackx and others added 2 commits April 17, 2026 08:43
At the time of submission, the vLLM MiniMax-M2 recipes page
(https://docs.vllm.ai/projects/recipes/en/latest/MiniMax/MiniMax-M2.html)
does not have a B300-specific recipe, so this config reuses the existing
MiniMax-M2.5 FP4 (NVFP4) B200 vLLM recipe as-is until B300-specific
tuning is available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@functionstackx force-pushed the claude/add-minimaxm2.5-fp4-b300-vllm branch from 4dfd451 to f1bdccf on April 17, 2026 12:43
@functionstackx merged commit 22d0f12 into main on Apr 17, 2026
3 checks passed
@functionstackx deleted the claude/add-minimaxm2.5-fp4-b300-vllm branch on April 17, 2026 12:43
@claude (Bot) mentioned this pull request on Apr 20, 2026