sglang dpskv4 hopper #1212

Open
yhyang201 wants to merge 6 commits into SemiAnalysisAI:main from yhyang201:dsv4-fp4-h200-sglang

Conversation

@yhyang201
Collaborator

sglang dpskv4 hopper

Contributor

@claude claude Bot left a comment

Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@yhyang201
Collaborator Author

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang

@github-actions
Contributor

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25064444822
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang
Pinned ref: d7960fe
Approval: not required (trusted collaborator).

@yhyang201
Collaborator Author

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang

@github-actions
Contributor

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25065340024
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang
Pinned ref: eea0897
Approval: not required (trusted collaborator).

@yhyang201
Collaborator Author

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang

@github-actions
Contributor

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25065714943
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang
Pinned ref: d6039e3
Approval: not required (trusted collaborator).

@yhyang201
Collaborator Author

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang

@github-actions
Contributor

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25066286688
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang
Pinned ref: 1a9d38e
Approval: not required (trusted collaborator).

Comment thread .github/configs/nvidia-master.yaml Outdated
model: deepseek-ai/DeepSeek-V4-Pro
model-prefix: dsv4
runner: h200
precision: fp4
Contributor

Shouldn't this be fp8?

Collaborator Author

fixed

@yhyang201 force-pushed the dsv4-fp4-h200-sglang branch from 8bde8f2 to 1a9d38e on April 29, 2026 03:22

@yhyang201
Collaborator Author

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang

@github-actions
Contributor

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25090754026
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-h200-sglang
Pinned ref: d6fd273
Approval: not required (trusted collaborator).

--tp $TP \
--moe-runner-backend marlin \
--chunked-prefill-size 4096 \
--disable-flashinfer-autotune \
Contributor

Please disable the radix cache too, since this benchmark uses random datasets.

Collaborator Author

Since this is a random dataset, could you kindly clarify why the radix cache needs to be disabled? I would expect the cache hit rate to be close to zero.

Contributor

It's more about guaranteeing consistency.
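
For context on the exchange above, here is a toy sketch of why prefix reuse on random prompts should be close to zero, and why a shared prefix changes that. It is not SGLang's radix cache implementation: the vocabulary size, prompt length, request count, and the helper names `shared_prefix_len` and `prefix_hit_rate` are all assumptions made up for illustration.

```python
import random


def shared_prefix_len(a, b):
    """Number of leading tokens that two token lists have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def prefix_hit_rate(prompts):
    """Fraction of prompt tokens that match a prefix of an earlier prompt.

    Toy stand-in for a prefix cache: each prompt reuses the longest
    prefix it shares with any previously seen prompt.
    """
    seen = []
    hit_tokens = 0
    total_tokens = 0
    for prompt in prompts:
        best = max((shared_prefix_len(prompt, old) for old in seen), default=0)
        hit_tokens += best
        total_tokens += len(prompt)
        seen.append(prompt)
    return hit_tokens / total_tokens if total_tokens else 0.0


# Hypothetical workload parameters, chosen only for illustration.
VOCAB_SIZE = 50_000
PROMPT_LEN = 256
NUM_PROMPTS = 200

rng = random.Random(0)

# Fully random prompts: almost no two prompts share even the first token.
random_prompts = [
    [rng.randrange(VOCAB_SIZE) for _ in range(PROMPT_LEN)]
    for _ in range(NUM_PROMPTS)
]
random_rate = prefix_hit_rate(random_prompts)

# Prompts that share a fixed first half, e.g. a common system prompt.
common_prefix = [rng.randrange(VOCAB_SIZE) for _ in range(PROMPT_LEN // 2)]
shared_prompts = [
    common_prefix + [rng.randrange(VOCAB_SIZE) for _ in range(PROMPT_LEN // 2)]
    for _ in range(NUM_PROMPTS)
]
shared_rate = prefix_hit_rate(shared_prompts)

print(f"random prompts hit rate: {random_rate:.4f}")
print(f"shared-prefix hit rate:  {shared_rate:.4f}")
```

Under these assumptions the random workload reuses almost nothing, which matches the expectation stated above. Disabling the cache explicitly (SGLang exposes `--disable-radix-cache` for this) removes any remaining run-to-run variation from incidental prefix matches, which is the consistency argument made in this thread.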

@yhyang201
Collaborator Author

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp8-h200-sglang

@github-actions
Contributor

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/25091984554
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp8-h200-sglang
Pinned ref: d6fd273
Approval: not required (trusted collaborator).
