Revert "Add B300 config: kimi-k2.5-int4-vllm (#1057)" [skip-sweep] #1070
Conversation
This reverts commit 6e89e55.
Thanks for the contribution! For vLLM and SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook. If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work!

PR authors are responsible for ensuring that, after merging, all GitHub Actions jobs fully pass. Failures are often just flakes, and simply re-running the failed jobs will fix them; if re-running failed jobs is attempted, PR authors are responsible for ensuring they pass. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

If additional help is needed, PR authors can reach out to core maintainers over Slack.
LGTM — clean revert of the B300 Kimi-K2.5 INT4 config with a follow-up PR planned.
Overview
This PR reverts #1057, removing three changes: the kimik2.5-int4-b300-vllm entry from .github/configs/nvidia-master.yaml, the corresponding benchmark script benchmarks/single_node/kimik2.5_int4_b300.sh, and the perf-changelog entry. The description confirms a re-open PR with the same contents will follow.
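For orientation, the reverted entry in `.github/configs/nvidia-master.yaml` would have looked roughly like the sketch below. Only the config id and the script path come from this PR; every other key name and value is an assumption, reconstructed from the naming in this PR rather than from the repo's actual schema:

```yaml
# Hypothetical sketch of the reverted config entry.
# Only the id and script path are taken from this PR;
# the remaining keys are assumed for illustration.
kimik2.5-int4-b300-vllm:
  hardware: b300          # assumed key: target GPU platform
  backend: vllm           # assumed key: serving engine
  script: benchmarks/single_node/kimik2.5_int4_b300.sh
```

Since the revert is purely subtractive, re-landing it (as the follow-up PR intends) amounts to restoring this entry, the script, and the perf-changelog line.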
Security risks
None. This is a pure deletion of config/script files with no auth, crypto, or permission-sensitive code paths involved.
Level of scrutiny
Minimal. This is a mechanical revert of a recently merged PR — the diff is purely subtractive, removing exactly what #1057 added. No logic changes, no new code paths introduced.
Other factors
No bugs found by the automated system, no outstanding reviewer comments, and the intent is unambiguous. The recipe-reminder bot comment is a standard automated message and not a blocker.
…1267)

* Add B300 config: kimi-k2.5-int4-vllm (vLLM 0.20.0 + TP=4/EP=1 sweep)

  - New `kimik2.5-int4-b300-vllm` config with the corresponding `benchmarks/single_node/kimik2.5_int4_b300.sh` launch script (mirrors the existing INT4 B200 vLLM recipe; the upstream vLLM Kimi-K2.5 recipes page does not yet ship B300-specific tuning).
  - Image: `vllm/vllm-openai:v0.20.0-cu130` — the original draft (#1057, reverted in #1070, reopened as #1071) carried `v0.19.0` while we waited on a working release; 0.20.0 has now shipped.
  - Search space per (ISL, OSL): the existing TP=8 sweep plus a new TP=4 / EP=1 entry covering the lower-TP / expert-parallel variant on the same B300 nodes.

  Supersedes #1071 — opening fresh from main since the merge base had drifted (the b200 schema migrated from `seq-len-configs` to `scenarios.fixed-seq-len`) and the user preferred a clean reopen over a rebase.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* perf-changelog: move kimik2.5-int4-b300-vllm entry to bottom

  AGENTS.md requires new perf-changelog entries to be appended to the end of the file (oldest at top, newest at bottom). The original commit prepended the new entry above PR #95; move it after the current last entry (PR #1265) to satisfy the convention.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
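The new TP=4 / EP=1 sweep entry maps onto two real vLLM launch flags, `--tensor-parallel-size` and `--enable-expert-parallel`. The sketch below shows that mapping under stated assumptions: `MODEL` is a placeholder (the actual checkpoint path is not named in this PR), and the real `benchmarks/single_node/kimik2.5_int4_b300.sh` script is not reproduced here — the sketch only prints the command it would build:

```shell
# Sketch of how a TP=4 / EP=1 sweep point maps onto vLLM launch flags.
# MODEL is a hypothetical placeholder; the actual benchmark script
# (benchmarks/single_node/kimik2.5_int4_b300.sh) is not reproduced here.
MODEL="${MODEL:-<kimi-k2.5-int4-checkpoint>}"
TP=4   # tensor-parallel degree for this sweep point
EP=1   # 1 = enable expert parallelism for the MoE layers

ARGS="--tensor-parallel-size ${TP}"
if [ "${EP}" = "1" ]; then
  ARGS="${ARGS} --enable-expert-parallel"
fi

# Print the command rather than executing it (no GPUs assumed here).
echo "vllm serve ${MODEL} ${ARGS}"
```

Inside the container image pinned above (`vllm/vllm-openai:v0.20.0-cu130`), the printed command would be executed directly; the TP=8 sweep points differ only in `TP` and in omitting the expert-parallel flag.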
Reverts #1057
A re-open PR with the same contents will follow.