
[NVIDIA] Add DSR1 FP8 H200 Dynamo TRT-LLM configurations #570

Merged
cquil11 merged 35 commits into main from nv/dsr1-fp8-h200-dynamo-trtllm-260126 on Jan 29, 2026

Conversation

nlevin-ui (Collaborator) commented Jan 26, 2026

DSR1 FP8 H200 Dynamo TRT-LLM Disagg

  • 1k/1k MTP on/off
  • 8k/1k MTP on/off

csahithi and others added 18 commits January 26, 2026 09:41
Expand dsr1-fp8-h200-dynamo-trt section with full configuration set:
- 1k1k MTP configs (c4-c512) with CONFIG_FILE references
- 1k1k STP configs (c4-c512) with CONFIG_FILE references
- 8k1k MTP configs (c4-c512) with CONFIG_FILE references
- 8k1k STP configs (c4-c512) with CONFIG_FILE references

All configs reference recipe YAMLs in srt-slurm-trtllm repo under
recipies/trtllm/h200/{1k1k,8k1k}/{mtp,stp}/
dep8 = enable_attention_dp: true (dp-attn: true)
tep8 = enable_attention_dp: false (dp-attn: false)
Update MODEL_PATH from /models/dsr1-fp8 (old DeepSeek-R1) to
/models/DeepSeek-R1-0528 (new version matching nvidia-master.yaml)
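
For orientation, here is a minimal sketch of what one expanded entry might look like. The recipe filename is hypothetical; only the directory layout, the dep8/tep8 convention, and the MODEL_PATH value come from the commit messages above:

```bash
# Hypothetical expanded entry -- the recipe filename is illustrative, not an
# actual file from the repo. "recipies" is the real (misspelled) path prefix.
MODEL_PATH=/models/DeepSeek-R1-0528                    # updated from /models/dsr1-fp8
CONFIG_FILE=recipies/trtllm/h200/1k1k/mtp/dep8.yaml    # dep8 -> enable_attention_dp: true
# A tep8 variant would instead point at a recipe with enable_attention_dp: false:
# CONFIG_FILE=recipies/trtllm/h200/1k1k/mtp/tep8.yaml
```
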
cquil11 (Collaborator) commented Jan 26, 2026

You must update perf-changelog.yaml before the sweep will kick off.

Comment thread on runners/launch_h200-dgxc-slurm.sh
cquil11 (Collaborator) commented Jan 27, 2026

test sweeping again to smoke test @ishandhanani's additions:
https://github.com/InferenceMAX/InferenceMAX/actions/runs/21411593275

ishandhanani (Collaborator) commented Jan 27, 2026

This should not have been merged into 1 PR. Sweeps for SGLang won't pass yet.

In the future, please allow me to indicate when things are ready to merge into 1 mega PR. It's easier on our end if we can keep things separate.

cquil11 (Collaborator) commented Jan 27, 2026

sure np. nevertheless it appears sweeps (without your changes) are still failing?

edit: is it bc of "recipies" LMFAO (my bad, I didn't mean y'all actually had to change this 😭)

cquil11 and others added 6 commits January 27, 2026 13:56
- Update SQUASH_FILE to use /data/containers/ with + separators
- Strip nvcr.io/ prefix from path to match actual .sqsh filenames
- Add CONTAINER_KEY to convert IMAGE to srt-slurm format (nvcr.io#)
- Map container key to .sqsh path dynamically in srtslurm.yaml
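
In shell terms, the mapping those changes describe is roughly the following. This is a sketch, not the actual launch-script code; the example image ref and the intermediate `name` variable are assumptions:

```bash
# Sketch of the IMAGE -> CONTAINER_KEY / SQUASH_FILE mapping described above.
IMAGE="nvcr.io/nvidia/ai-dynamo/tensorrtllm:1.0.0"   # hypothetical image ref

# srt-slurm container key: registry separated from the image path with '#'
CONTAINER_KEY="${IMAGE/nvcr.io\//nvcr.io#}"          # nvcr.io#nvidia/ai-dynamo/tensorrtllm:1.0.0

# .sqsh path: strip the nvcr.io/ prefix, then use '+' in place of '/' and ':'
name="${IMAGE#nvcr.io/}"                             # nvidia/ai-dynamo/tensorrtllm:1.0.0
name="${name//\//+}"                                 # nvidia+ai-dynamo+tensorrtllm:1.0.0
name="${name//:/+}"                                  # nvidia+ai-dynamo+tensorrtllm+1.0.0
SQUASH_FILE="/data/containers/${name}.sqsh"
```
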
Use the release branch for Q1 2026 submission instead of main.
ishandhanani (Collaborator) commented:

> edit: is it bc of "recipies" LMFAO

LOL we had to fix it at some point. Remnant of me just moving way too fast and not having any spell check in the repo 😆

functionstackx (Contributor) commented:

> This should not have been merged into 1 PR.

mb guys!

Comment thread on .github/configs/nvidia-master.yaml
Comment thread on runners/launch_h200-dgxc-slurm.sh
cquil11 (Collaborator) commented Jan 28, 2026

@nlevin-ui left a couple more comments, mainly nits. In general, please just follow the conventions set in #585 if you can. Also see the comments in #510, which may also apply to this PR.

Link each CONFIG_FILE to its source in srt-slurm sa-submission-q1-2026 branch.
Keep all entries:
- H200 dynamo-trt entry (this PR)
- Evals-only entry (PR #558)
- B300 dynamo-trt entry (PR #585)
Removed outdated DSR1 FP8 H200 Dynamo TRT configuration details and re-added them in a new section.
cquil11 (Collaborator) commented Jan 29, 2026

ok. lgtm, thanks for all your hard work on this.

cquil11 merged commit d7a6d4e into main on Jan 29, 2026
7 of 38 checks passed
cquil11 deleted the nv/dsr1-fp8-h200-dynamo-trtllm-260126 branch on January 29, 2026 at 15:57
cquil11 changed the title from "Add DSR1 FP8 H200 Dynamo TRT-LLM configurations" to "[NVIDIA] Add DSR1 FP8 H200 Dynamo TRT-LLM configurations" on Apr 8, 2026