GB300 SGLang disagg #1226

Open
yhyang201 wants to merge 1 commit into SemiAnalysisAI:main from yhyang201:job564-gb300-sglang-disagg

@yhyang201
Collaborator

@yhyang201 yhyang201 commented Apr 29, 2026

GB300 SGLang disagg
conc=4096

…4096)

Adds the max-throughput DSV4-Pro SGLang disaggregated recipe for GB300
(7 prefill + 2 decode nodes, DeepEP, mooncake transfer). Derived from
Job 564 on gb300-cw, which achieved a total_token_throughput of
358,890 tok/s (9,969 tok/s/gpu) on 9 GB300 nodes.

Files:
- Recipe: sglang/deepseek-v4/8k1k/disagg-gb300-7p1d-dep4-dep8.yaml
- CI config: nvidia-master.yaml (dsv4-fp4-gb300-dynamo-sglang, conc=4096)
- Runner: launch_gb300-cw.sh + runners.yaml gb300-cw entry
- Changelog: perf-changelog.yaml entry
- Eval skip: generate_sweep_configs.py (gb300-cw lacks eval path)
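
As a sanity check on the figures quoted above, the per-GPU number can be reproduced from the total throughput and the node count. This is a minimal sketch: the value of 4 GPUs per node is an assumption (it is not stated in this PR, but it is the value that makes 358,890 tok/s over 9 nodes come out to 9,969 tok/s/gpu), and `per_gpu_throughput` is a hypothetical helper, not part of the repository.

```python
# Figures quoted in the PR description (Job 564 on gb300-cw).
TOTAL_TOKEN_THROUGHPUT = 358_890  # tok/s, from the PR description
NODES = 9                         # 7 prefill + 2 decode nodes, from the PR description
GPUS_PER_NODE = 4                 # ASSUMPTION: not stated in the PR


def per_gpu_throughput(total_tok_s: float, nodes: int, gpus_per_node: int) -> float:
    """Divide total token throughput evenly across every GPU in the job."""
    return total_tok_s / (nodes * gpus_per_node)


per_gpu = per_gpu_throughput(TOTAL_TOKEN_THROUGHPUT, NODES, GPUS_PER_NODE)
print(f"{per_gpu:,.0f} tok/s/gpu")
```

Under that assumption the job spans 36 GPUs, and the rounded result matches the 9,969 tok/s/gpu reported for Job 564.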
Contributor

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@yhyang201 yhyang201 changed the title GB300 SGLang disagg 7P+2D max-throughput (Job 564, conc=4096) GB300 SGLang disagg 7P+2D Apr 29, 2026
@yhyang201 yhyang201 changed the title GB300 SGLang disagg 7P+2D GB300 SGLang disagg Apr 29, 2026
@yhyang201
Collaborator Author

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-gb300-dynamo-sglang

@github-actions
Contributor

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25120673375
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-gb300-dynamo-sglang
Pinned ref: 5c43167
Approval: not required (trusted collaborator).

@yhyang201
Collaborator Author

| Metric | Job 740 |
| --- | --- |
| total_token_throughput | 327,693 tok/s |
| tok/s/gpu | 9,103 |
| mean_ttft_ms | 28,562 |
| mean_tpot_ms | 69.46 |
| mean_itl_ms | 2,052 |
| mean_e2el_ms | 92,487 |
| max_concurrency | 4096 |
| num_prompts | 40,960 |
| duration | 1,037s |
| completed | 40,960 |
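
For context, the Job 740 throughput can be set against the Job 564 figure quoted in the PR description. The sketch below is only illustrative: the 36-GPU count is an assumption (9 nodes at 4 GPUs each, which is not stated in this thread), and `relative_change` is a hypothetical helper.

```python
# Throughput figures taken from this PR thread.
JOB_564_TOK_S = 358_890  # tok/s, quoted in the PR description
JOB_740_TOK_S = 327_693  # tok/s, from the sweep result above
GPUS = 36                # ASSUMPTION: 9 nodes x 4 GPUs per node


def relative_change(new: float, baseline: float) -> float:
    """Signed fractional change of `new` relative to `baseline`."""
    return (new - baseline) / baseline


change = relative_change(JOB_740_TOK_S, JOB_564_TOK_S)
per_gpu_740 = JOB_740_TOK_S / GPUS

print(f"Job 740 vs Job 564: {change:+.1%}")
print(f"Job 740 per GPU: {per_gpu_740:,.0f} tok/s/gpu")
```

With these inputs the sweep run comes out roughly 8.7% below the Job 564 figure, and the per-GPU value agrees with the 9,103 tok/s/gpu in the table. The thread itself does not say what accounts for the gap.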
