sglang dpskv4 hopper #1212
/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang
@yhyang201 Kicking off a sweep. Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25064444822
/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang
@yhyang201 Kicking off a sweep. Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25065340024
/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang
@yhyang201 Kicking off a sweep. Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25065714943
/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang
@yhyang201 Kicking off a sweep. Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25066286688
model: deepseek-ai/DeepSeek-V4-Pro
model-prefix: dsv4
runner: h200
precision: fp4
Shouldn't this be fp8?
8bde8f2 to 1a9d38e
/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang
@yhyang201 Kicking off a sweep. Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25090754026
--tp $TP \
--moe-runner-backend marlin \
--chunked-prefill-size 4096 \
--disable-flashinfer-autotune \
Please disable the radix cache too, since this runs on random datasets.
Since this is a random dataset, could you clarify why the radix cache needs to be disabled? I would expect the cache hit rate to be close to zero.
It's more about guaranteeing consistency.
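For reference, a minimal sketch of what the requested change could look like in the launch flags. It assumes the server is started with `python3 -m sglang.launch_server`, which the diff context does not show. `--disable-radix-cache` is SGLang's server flag for turning off prefix caching. The `MODEL` and `TP` values below are placeholders for illustration and are not taken from this PR's script.

```shell
# Placeholder values for illustration only; the real script sets these elsewhere.
MODEL="deepseek-ai/DeepSeek-V4-Pro"
TP=8

# Flags quoted in the diff above, plus the reviewer's requested --disable-radix-cache.
SERVER_ARGS="--model-path $MODEL"
SERVER_ARGS="$SERVER_ARGS --tp $TP"
SERVER_ARGS="$SERVER_ARGS --moe-runner-backend marlin"
SERVER_ARGS="$SERVER_ARGS --chunked-prefill-size 4096"
SERVER_ARGS="$SERVER_ARGS --disable-flashinfer-autotune"
SERVER_ARGS="$SERVER_ARGS --disable-radix-cache"

# Print the command rather than running it, so the sketch is safe to execute anywhere.
echo "python3 -m sglang.launch_server $SERVER_ARGS"
```

With a random dataset the hit rate should already be near zero, so the extra flag is not expected to change throughput much. It removes prefix caching as a source of variation between runs.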
/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp8-h200-sglang
@yhyang201 Kicking off a sweep. Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25091984554