Add B300 config: minimaxm2.5-fp4-vllm#1055
Conversation
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR there first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you. PR authors are responsible for ensuring that all GitHub Action jobs fully pass after merging. Much of the time, failures are just flakes, and simply re-running the failed jobs will fix them. If re-running failed jobs is attempted, PR authors are responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow If additional help is needed, PR authors can reach out to core maintainers over Slack.
LGTM — straightforward B300 config addition following the same pattern as recent PRs #1048 and #1049.
Overview
This PR adds a new minimaxm2.5-fp4-b300-vllm benchmark configuration: an entry in .github/configs/nvidia-master.yaml, a new single-node launch script benchmarks/single_node/minimaxm2.5_fp4_b300.sh, and a perf-changelog.yaml entry. The changes are mechanical and self-contained, directly mirroring the B200 config with only the runner changed to b300.
Security risks
None. These are benchmark configuration and shell script additions with no auth, crypto, or permission-sensitive code paths.
Level of scrutiny
Low. This is a config/script addition following an established, well-worn pattern. The immediately preceding PRs (#1048 for qwen3.5-fp8-b300-sglang, #1049 for dsr1-fp4-b300-sglang) have identical structure and were merged without issues.
Other factors
The only finding is a nit already flagged as an inline comment: the pr-link field in perf-changelog.yaml contains the placeholder XXXX instead of the actual PR number 1055. This is a cosmetic issue — the same pattern already exists in 7 other entries in the file — and does not affect benchmark functionality. CI config validation is still pending per the test plan checklist, but that is expected for a new submission.
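The entry's exact schema is not shown in the excerpt above, but the flagged fix amounts to a one-line change along these lines (the `config` field name is assumed for illustration; only `pr-link` is confirmed by the review):

```yaml
- config: minimaxm2.5-fp4-b300-vllm
  # before: pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1055
```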
- "At the time of submission, https://docs.vllm.ai/projects/recipes/en/latest/MiniMax/MiniMax-M2.html does not have a B300-specific recipe, so this reuses the existing MiniMax-M2.5 FP4 B200 vLLM recipe as-is"
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1055
🟡 The new minimaxm2.5-fp4-b300-vllm entry in perf-changelog.yaml has a placeholder pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX that was never replaced with the actual PR number. Since this PR is #1055, the link should be updated to pull/1055.
Bug description: The perf-changelog.yaml entry added by this PR for the minimaxm2.5-fp4-b300-vllm config (lines 1413-1414) contains a template placeholder XXXX in the pr-link field rather than the actual PR number, resulting in a broken link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX.
Code path: The diff clearly shows the new entry at the bottom of perf-changelog.yaml includes pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX. The PR number (1055) is known at submission time — it is literally the number of this pull request.
Why existing code doesn't prevent it: There is no automated validation step that checks perf-changelog.yaml entries for placeholder patterns like XXXX. The surrounding entries demonstrate the correct pattern: the immediately preceding entry for dsr1-fp4-b300-sglang (PR #1049) correctly uses pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1049, and similarly for qwen3.5-fp8-b300-sglang (PR #1048).
Impact: The broken link makes it impossible to trace the changelog entry back to the originating PR. Anyone reading the changelog to understand when or why the minimaxm2.5-fp4-b300-vllm config was added will be unable to follow the reference. There are already 7 pre-existing XXX placeholder entries in the file (lines 12, 19, 315, 790, 818, 855, 872), and this PR introduces another instance, continuing a pattern that reduces changelog utility.
Fix: Replace pull/XXXX with pull/1055 on the pr-link line of the new minimaxm2.5-fp4-b300-vllm entry.
Step-by-step proof: (1) This PR is numbered #1055 — visible in the PR metadata. (2) The diff adds a new perf-changelog.yaml entry with pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX. (3) After merge, clicking that link leads to a 404, since pull/XXXX is not a valid PR number. (4) The correct value https://github.com/SemiAnalysisAI/InferenceX/pull/1055 would link directly to this PR. The fix is a one-word substitution.
Addressing the duplicate refutation: A separate bug_003 was reported describing the identical issue. Whether or not bug_003 is processed, this report independently identifies the same real defect; the fix remains the same regardless of which report is acted upon.
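The report notes that no automated validation step catches placeholder patterns like XXXX. A minimal sketch of such a CI guard is below; the function name, the sample data, and the decision to match runs of three or more X characters (the file reportedly contains both XXX and XXXX placeholders) are assumptions for illustration, not project code:

```python
import re

# Matches pull/XXX, pull/XXXX, etc. — runs of 3+ literal X characters.
PLACEHOLDER_RE = re.compile(r"pull/X{3,}")

def find_placeholder_links(text: str) -> list[int]:
    """Return 1-based line numbers whose pr-link still holds a placeholder."""
    return [i for i, line in enumerate(text.splitlines(), 1)
            if PLACEHOLDER_RE.search(line)]

# Hypothetical excerpt mimicking the tail of perf-changelog.yaml.
sample = """- config: dsr1-fp4-b300-sglang
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1049
- config: minimaxm2.5-fp4-b300-vllm
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXXX
"""
print(find_placeholder_links(sample))  # → [4]
```

Wired into CI, a non-empty result would fail the job and force authors to fill in the real PR number before merge.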
At the time of submission, the vLLM MiniMax-M2 recipes page (https://docs.vllm.ai/projects/recipes/en/latest/MiniMax/MiniMax-M2.html) does not have a B300-specific recipe, so this config reuses the existing MiniMax-M2.5 FP4 (NVFP4) B200 vLLM recipe as-is until B300-specific tuning is available.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
4dfd451 to f1bdccf
Summary
- Adds the minimaxm2.5-fp4-b300-vllm benchmark config and the corresponding benchmarks/single_node/minimaxm2.5_fp4_b300.sh launch script
- Image vllm/vllm-openai:v0.19.0-cu130 (same as B200), runner: b300, same TP/EP/dp-attn/concurrency search space as B200

Test plan
- Run the minimaxm2.5-fp4-b300-vllm single-node benchmark on a B300 node and confirm the server starts, the benchmark completes, and a result file is produced

🤖 Generated with Claude Code