[NVIDIA] update vllm b200 image. TODO: add logic for docker runner. #3

Merged

kimbochen merged 3 commits into main from kepotdar/vllm-b200-update
Sep 3, 2025
Conversation

@kedarpotdar-nv
Collaborator

Updated the vLLM B200 runner image to ToT. This image will work for Hopper as well, but we want to try the B200 updates first.

TODO: apply changes to Hopper and B200 docker config.

@kedarpotdar-nv kedarpotdar-nv added the enhancement New feature or request label Sep 2, 2025
Comment thread on benchmarks/70b_b200_slurm.sh (Outdated)

FUSION_FLAG='{"pass_config":{"enable_fi_allreduce_fusion":true,"enable_attn_fusion":true,"enable_noop":true},"custom_ops":["+quant_fp8","+rms_norm"],"cudagraph_mode":"FULL_DECODE_ONLY","splitting_ops":[]}'

NO_PREFIX_CACHING_FLAG="--no-enable-prefix-caching"
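The benchmark script itself is not shown in this thread, but the FUSION_FLAG above is a JSON compilation config. A minimal sketch of how these variables might be checked and consumed follows; the `vllm serve` invocation (with the `-O` compilation-config option) is an assumption about the surrounding script, not code from this PR:

```shell
# Flag definitions as they appear in benchmarks/70b_b200_slurm.sh.
FUSION_FLAG='{"pass_config":{"enable_fi_allreduce_fusion":true,"enable_attn_fusion":true,"enable_noop":true},"custom_ops":["+quant_fp8","+rms_norm"],"cudagraph_mode":"FULL_DECODE_ONLY","splitting_ops":[]}'
NO_PREFIX_CACHING_FLAG="--no-enable-prefix-caching"

# Sanity-check that the compilation config is valid JSON before launching a run;
# a malformed config would otherwise fail only at server startup.
echo "$FUSION_FLAG" | python3 -m json.tool > /dev/null && echo "FUSION_FLAG is valid JSON"

# Assumed invocation shape (model name hypothetical, not from this PR):
# vllm serve meta-llama/Llama-3.1-70B -O "$FUSION_FLAG" $NO_PREFIX_CACHING_FLAG
```

Validating the JSON up front is cheap insurance in a Slurm script, where a quoting mistake in a long single-line JSON literal is easy to make and expensive to discover mid-job.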

This seems redundant


@kimbochen kimbochen left a comment


Thank you for the PR. Everything looks good except the NO_PREFIX_CACHING_FLAG.

@kedarpotdar-nv
Collaborator Author

Good catch, fixed!

@kimbochen kimbochen merged commit f41800f into main Sep 3, 2025
@kimbochen kimbochen deleted the kepotdar/vllm-b200-update branch September 3, 2025 01:35
@cquil11 cquil11 added the NVIDIA label Apr 8, 2026
@cquil11 cquil11 changed the title from "update vllm b200 image. TODO: add logic for docker runner." to "[NVIDIA] update vllm b200 image. TODO: add logic for docker runner." Apr 8, 2026
Labels

enhancement (New feature or request), NVIDIA

3 participants