Conversation
Collaborator
Looks great, thank you so much!

Author
I think they are set by the runner setup automatically. Checking now in the test run.
Oseltamivir added a commit that referenced this pull request on Apr 29, 2026
Stable ai-dynamo 1.0.2 (the only release on pypi.nvidia.com) imports vllm.inputs.data, which vllm-project/vllm#35182 (2026-03-26, commit ba2f0acc) deleted; the deepseekv4-cu130 image used here is post-deletion, so dynamo.vllm workers crash on import in multimodal_handlers/__init__. The fix shipped in 1.2.0.dev wheels, but srt-slurm PR #84's DynamoConfig schema only accepts version (PyPI) / hash / top_of_tree, with no wheel: field.

Set dynamo.install: false in all 5 gb300 recipes so srtctl emits no worker-side install line, and have launch_gb300-cw.sh:

* stage aarch64 1.2.0.dev20260426 wheels under /mnt/vast/dynamo-wheels/ (mkdir-as-lock; flock is unreliable on this VAST mount per prior runs)
* symlink the cache into srt-slurm/configs/dynamo-wheels/ so the container sees /configs/dynamo-wheels/<version>/
* append a `pip install --no-index --find-links` line to upstream's configs/vllm-container-deps.sh (which srtctl already runs in every worker container before launching dynamo.vllm)

This matches the working gb200-nv path (wheel: "1.2.0.dev20260426" on the aflowers/vllm-gb200-v0.20.0 srt-slurm branch) without grafting that branch's 19k-line schema diff onto PR #84.
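The three launch_gb300-cw.sh steps above could be sketched roughly as follows. This is an illustrative sketch only: the function name `stage_dynamo_wheels`, its argument layout, and the wheel-copy placeholder are assumptions, not the actual script; the paths and version string are taken from the description above.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the wheel-staging steps for launch_gb300-cw.sh.
set -euo pipefail

stage_dynamo_wheels() {
    local cache_root="$1"    # e.g. /mnt/vast/dynamo-wheels
    local configs_dir="$2"   # e.g. srt-slurm/configs
    local version="$3"       # e.g. 1.2.0.dev20260426
    local stage_dir="${cache_root}/${version}"

    # mkdir-as-lock: mkdir is atomic even where flock is unreliable on the
    # VAST mount, so only the first caller does the staging.
    if mkdir -p "${cache_root}" && mkdir "${stage_dir}.lock" 2>/dev/null; then
        mkdir -p "${stage_dir}"
        # ... copy the aarch64 1.2.0.dev wheels into "${stage_dir}" here ...
        rmdir "${stage_dir}.lock"
    fi

    # Symlink the cache so the container sees /configs/dynamo-wheels/<version>/.
    ln -sfn "${cache_root}" "${configs_dir}/dynamo-wheels"

    # Append the offline install line to the deps script that srtctl already
    # runs in every worker container before launching dynamo.vllm.
    echo "pip install --no-index --find-links /configs/dynamo-wheels/${version}/ ai-dynamo==${version}" \
        >> "${configs_dir}/vllm-container-deps.sh"
}
```

The mkdir-as-lock pattern works here because mkdir either creates the directory or fails atomically, so concurrent recipe launches cannot both enter the staging branch.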
The logic in utils/calculate_success_rate.py was broken. Now, instead of hard-coding the total possible calculations for each GPU, it simply counts all of the attempted job runs and all of the successful job runs for each GPU.
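The counting approach described above can be sketched as follows. This is a minimal illustration of the idea, not the actual calculate_success_rate.py: the function name, the `(gpu, succeeded)` record shape, and the return format are assumptions.

```python
from collections import defaultdict

def success_rates(jobs):
    """Compute per-GPU success rates from (gpu, succeeded) job records."""
    attempted = defaultdict(int)
    succeeded = defaultdict(int)
    for gpu, ok in jobs:
        attempted[gpu] += 1      # every run counts as an attempt
        if ok:
            succeeded[gpu] += 1  # no hard-coded per-GPU totals
    return {gpu: succeeded[gpu] / attempted[gpu] for gpu in attempted}
```

Because the denominator is derived from the runs actually seen, a reduced sweep (fewer jobs per GPU) yields correct rates without any per-GPU constants to update.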
Full Sweep Test Example:
https://github.com/InferenceMAX/InferenceMAX/actions/runs/18265141263/job/51998546795#logs
Successfully calculates success rates of a reduced H100 sweep.
Additional Examples: