GB300 SGLang disagg recipe #1227

Open
yhyang201 wants to merge 2 commits into SemiAnalysisAI:main from yhyang201:job588-gb300-sglang-disagg

@yhyang201 (Collaborator) commented Apr 29, 2026

conc=8192

7P+2D topology (TP4/DP4/EP4 prefill + TP8/DP8/EP8 wide-EP decode) on 9 nodes, with Mooncake transfer and DeepEP. Reaches 359,226 tok/s total_token_throughput (9,979 tok/s/gpu) on gb300-cw with an 8k/1k random workload.
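
The per-GPU figure above can be reproduced from the total with a short calculation. The sketch below is only a sanity check, not part of the recipe: it assumes 4 GPUs per node (not stated in the PR, but the only count that matches the reported ratio) and reads "7P+2D" as 7 prefill nodes plus 2 decode nodes. The constant and function names are illustrative.

```python
# Figures quoted in the PR description.
PREFILL_NODES = 7                  # "7P" (assumed to mean prefill nodes)
DECODE_NODES = 2                   # "2D" (assumed to mean decode nodes)
TOTAL_TOKEN_THROUGHPUT = 359_226   # tok/s, as reported
REPORTED_PER_GPU = 9_979           # tok/s/gpu, as reported

# ASSUMPTION: 4 GPUs per node. The PR does not state this.
GPUS_PER_NODE = 4


def per_gpu_throughput(total_tok_s: float, nodes: int, gpus_per_node: int) -> float:
    """Divide aggregate token throughput evenly across all GPUs."""
    return total_tok_s / (nodes * gpus_per_node)


nodes = PREFILL_NODES + DECODE_NODES
total_gpus = nodes * GPUS_PER_NODE
per_gpu = per_gpu_throughput(TOTAL_TOKEN_THROUGHPUT, nodes, GPUS_PER_NODE)

print(f"{nodes} nodes, {total_gpus} GPUs")
print(f"computed: {per_gpu:,.1f} tok/s/gpu, reported: {REPORTED_PER_GPU:,} tok/s/gpu")
```

Under these assumptions the node count comes out to the 9 stated in the description, and the computed per-GPU value lands within 1 tok/s of the reported one.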

@claude (bot, Contributor) left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@yhyang201 (Collaborator, Author)

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-gb300-dynamo-sglang

@github-actions (Contributor)

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25116925485
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-gb300-dynamo-sglang
Pinned ref: 52830a9
Approval: not required (trusted collaborator).

@yhyang201 changed the title from "Add DSv4-Pro FP4 GB300 Dynamo SGLang disagg recipe (conc=8192)" to "GB300 SGLang disagg recipe" on Apr 29, 2026
@yhyang201 (Collaborator, Author)

| Metric | Value |
|---|---|
| total_token_throughput | 320,366 |
| tok/s/gpu | 8,899 |
| completed | 40,960 |
| mean_ttft | 64,509 ms |
| mean_tpot | 67.33 ms |
| duration | 1060 s |
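
A few derived numbers make these metrics easier to compare with the figure in the PR description (359,226 tok/s at conc=8192). The sketch below is a reader-side check only. It assumes 36 GPUs (9 nodes with 4 GPUs each, which the PR implies but does not state), and the variable names are illustrative.

```python
# Metrics as posted in this comment.
posted = {
    "total_token_throughput": 320_366,  # tok/s
    "tok_s_per_gpu": 8_899,
    "completed": 40_960,                # requests
    "duration_s": 1060,
}

# Figures from the PR description, used for comparison.
DESCRIBED_TOTAL = 359_226  # tok/s
CONCURRENCY = 8_192        # conc=8192

# ASSUMPTION: 9 nodes x 4 GPUs per node = 36 GPUs (not stated in the PR).
GPUS = 36

per_gpu = posted["total_token_throughput"] / GPUS
gap_pct = 100 * (DESCRIBED_TOTAL - posted["total_token_throughput"]) / DESCRIBED_TOTAL
req_per_s = posted["completed"] / posted["duration_s"]
requests_per_slot = posted["completed"] / CONCURRENCY

print(f"per-GPU throughput: {per_gpu:,.1f} tok/s/gpu (posted: {posted['tok_s_per_gpu']:,})")
print(f"gap vs. PR description: {gap_pct:.1f}% lower")
print(f"request rate: {req_per_s:.1f} req/s")
print(f"completed requests per concurrency slot: {requests_per_slot:.1f}")
```

With these inputs the posted per-GPU value is consistent with the total, the posted total is roughly 11% below the description's figure, and the completed count is exactly five times the concurrency.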
