fix: update the custom vllm instructions #1116
**Walkthrough**

The change introduces an environment-driven custom vLLM workflow: a new/updated build script to fetch and wire a local vLLM, Docker build-time gating via `BUILD_CUSTOM_VLLM`, comprehensive documentation updates to the workflow, and minor `.gitignore` adjustments.
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    autonumber
    actor Dev as Developer
    participant Dock as Docker build
    participant DF as Dockerfile
    participant Script as build-custom-vllm.sh
    participant GH as vLLM Repo
    participant Repo as NeMo RL Repo
    Dev->>Dock: docker build --build-arg BUILD_CUSTOM_VLLM=1
    Dock->>DF: Process Dockerfile
    alt BUILD_CUSTOM_VLLM provided
        DF->>Script: Execute build-custom-vllm.sh
        Script->>GH: Clone vLLM (GIT_URL) and checkout GIT_REF
        Script->>Script: Determine VLLM_WHEEL_COMMIT / wheel location
        Script->>Repo: Install deps (Torch 2.7.1), update pyproject.toml (editable vllm), write nemo-rl.env
        Script-->>DF: Exit on success/failure
    else No arg
        DF-->>Dock: Skip custom vLLM build
    end
    Dock-->>Dev: Image built
    note over Dev,Repo: At runtime, use env vars to verify vLLM import/version and run apps
```
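The gated flow above can be sketched as a single build invocation. This is a dry-run that only prints the command so it can be inspected first; the `--build-arg` names come from this PR's diagram and docs, while the wheels.vllm.ai URL layout and the example commit are taken from the review comments below, so treat the exact values as illustrative.

```shell
#!/usr/bin/env bash
# Dry-run sketch: assemble and print the docker build command for the
# BUILD_CUSTOM_VLLM-gated path. Drop the printf and run "${build_cmd[@]}"
# for a real build.
set -euo pipefail

VLLM_COMMIT="${VLLM_COMMIT:-a3319f4f04fbea7defe883e516df727711e516cd}"
WHEEL_URL="https://wheels.vllm.ai/${VLLM_COMMIT}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl"

build_cmd=(docker build
  --build-arg BUILD_CUSTOM_VLLM=1
  --build-arg VLLM_COMMIT="${VLLM_COMMIT}"
  --build-arg VLLM_PRECOMPILED_WHEEL_LOCATION="${WHEEL_URL}"
  -f docker/Dockerfile
  .)

# Print the command instead of executing it.
printf '%q ' "${build_cmd[@]}"; echo
```

Without `BUILD_CUSTOM_VLLM=1` the Dockerfile skips the custom build entirely, matching the `else` branch of the diagram.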
**Estimated code review effort**

🎯 3 (Moderate) | ⏱️ ~25 minutes

**Suggested reviewers**
**Pre-merge checks and finishing touches**

✅ Passed checks (3 passed)
Actionable comments posted: 6
🧹 Nitpick comments (8)
docs/guides/use-custom-vllm.md (2)
**53-53**: Grammar nit: "Ensure's" → "Ensures".

```diff
-# (copied from build-custom-vllm.sh) Ensure's vllm doesn't try to recompile c++ source
+# (copied from build-custom-vllm.sh) Ensures vllm doesn't try to recompile C++ source
```
**50-57**: Add a short note about wheel compatibility (Python/CUDA/arch).

The example URL targets cp38-abi3/manylinux1/x86_64. A brief note to pick the correct wheel for the user's Python/CUDA/arch will prevent install-time surprises.
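As context for the note above: a wheel filename encodes its compatibility tags (PEP 427: `name-version-python_tag-abi_tag-platform_tag.whl`), so the tags can be eyeballed before installing. The filename below is the example URL's basename from this review; the splitting approach is just a quick sketch.

```shell
# Split a wheel filename into its PEP 427 tags.
wheel="vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl"
IFS='-' read -r name version pytag abitag plattag <<<"${wheel%.whl}"
echo "name=${name} version=${version} python=${pytag} abi=${abitag} platform=${plattag}"
```

Here `cp38-abi3` is the CPython stable ABI, so the wheel installs on any CPython >= 3.8, but the platform tag (`manylinux1_x86_64`) still has to match the host.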
tools/build-custom-vllm.sh (6)
**54-61**: Harden the xformers pin replacement.

Allow leading whitespace and preserve any trailing platform markers more robustly.

```diff
-# Replace xformers==.* (but preserve any platform markers at the end)
-# NOTE: that xformers is bumped from 0.0.30 to 0.0.31 to work with torch==2.7.1. This version may need to change to change when we upgrade torch.
-find requirements/ -name "*.txt" -type f -exec sed -i -E 's/^(xformers)==[^;[:space:]]*/\1==0.0.31/' {} \; 2>/dev/null || true
+# Replace xformers==.* (preserve trailing platform markers)
+# NOTE: xformers bumped to 0.0.31 for torch==2.7.1; revisit on torch upgrades.
+find requirements/ -name "*.txt" -type f -exec sed -i -E 's/^([[:space:]]*xformers)[[:space:]]*==([^;[:space:]]*)/\1==0.0.31/' {} \; 2>/dev/null || true
```
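A throwaway check of the hardened pattern, run against a synthetic requirements file rather than the real vLLM repo: it should rewrite an indented pin while keeping its platform marker. Uses GNU `sed -i`; on BSD/macOS you would need `sed -i ''`.

```shell
# Exercise the suggested sed on a fake requirements file.
tmpdir="$(mktemp -d)"
cat > "${tmpdir}/cuda.txt" <<'EOF'
torch==2.7.1
  xformers==0.0.30; platform_system == "Linux"
EOF
sed -i -E 's/^([[:space:]]*xformers)[[:space:]]*==([^;[:space:]]*)/\1==0.0.31/' "${tmpdir}/cuda.txt"
result="$(grep xformers "${tmpdir}/cuda.txt")"
rm -rf "${tmpdir}"
echo "${result}"
```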
**72-74**: Remove stray commented duplicate.

```diff
-#uv pip install --no-build-isolation -e .
 uv pip install --no-build-isolation -e .
```
**80-84**: Improve repo-root detection message and UX.

Consider deriving REPO_ROOT via git to be resilient to layout changes; keep the current fallback.

```diff
-REPO_ROOT="$(realpath "$SCRIPT_DIR/..")"
+REPO_ROOT="$(git -C "$SCRIPT_DIR" rev-parse --show-toplevel 2>/dev/null || realpath "$SCRIPT_DIR/..")"
```
**148-158**: Echo copy-pastable exports.

Small DX improvement and matches the docs' env-driven flow.

```diff
-[INFO] Verify this new vllm version by running:
-
-VLLM_COMMIT=$VLLM_COMMIT \\
-VLLM_PRECOMPILED_WHEEL_LOCATION=$VLLM_PRECOMPILED_WHEEL_LOCATION \\
-  uv run --extra vllm vllm serve Qwen/Qwen3-0.6B
+[INFO] Verify this new vllm version by running:
+
+export VLLM_COMMIT="$VLLM_COMMIT"
+export VLLM_PRECOMPILED_WHEEL_LOCATION="$VLLM_PRECOMPILED_WHEEL_LOCATION"
+uv run --extra vllm vllm serve Qwen/Qwen3-0.6B
```
**35-38**: Optional: support reuse/force for existing 3rdparty/vllm.

A `--force` flag or `NRL_FORCE_RECLONE` path would help iterative dev instead of hard-failing.
**62-69**: Optional: parameterize Torch backend/version via env.

Let advanced users override without editing the script.

```diff
-uv pip install torch==2.7.1 --torch-backend=cu128
+uv pip install "torch==${TORCH_VERSION:-2.7.1}" --torch-backend="${TORCH_BACKEND:-cu128}"
```
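The `${VAR:-default}` expansion that suggestion relies on works like this: an exported value wins, otherwise the script's pin applies. The defaults below are the ones currently hard-coded in `build-custom-vllm.sh`; the echoed command is only printed, not run.

```shell
# Demonstrate env-overridable pins: export TORCH_VERSION / TORCH_BACKEND
# before running the script to override, otherwise the defaults apply.
TORCH_VERSION="${TORCH_VERSION:-2.7.1}"
TORCH_BACKEND="${TORCH_BACKEND:-cu128}"
echo "would run: uv pip install torch==${TORCH_VERSION} --torch-backend=${TORCH_BACKEND}"
```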
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- `.gitignore` (0 hunks)
- `docs/guides/use-custom-vllm.md` (1 hunks)
- `tools/build-custom-vllm.sh` (2 hunks)
💤 Files with no reviewable changes (1)
- .gitignore
Actionable comments posted: 1
♻️ Duplicate comments (3)
docs/guides/use-custom-vllm.md (3)
**10-11**: Fix usage placeholder typos; use a single, consistent third arg.

Replace the misspelled/long placeholder with `<VLLM_COMMIT>` to match the script.

```diff
-# Usage: bash tools/build-custom-vllm.sh <GIT_URL> <GIT_BRANCH> <VLLM_PRECOMILED_WHEEL_COMMI_FROM_MAINT>
+# Usage: bash tools/build-custom-vllm.sh <GIT_URL> <GIT_BRANCH> <VLLM_COMMIT>
```
**16-18**: Unify commit references or parameterize with env vars.

Avoid hard-coding a different commit than the example above; reference `$VLLM_COMMIT` and `$VLLM_PRECOMPILED_WHEEL_LOCATION`.

```diff
-# VLLM_COMMIT=a3319f4f04fbea7defe883e516df727711e516cd \
-# VLLM_PRECOMPILED_WHEEL_LOCATION=https://wheels.vllm.ai/a3319f4f04fbea7defe883e516df727711e516cd/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl \
+# VLLM_COMMIT=<commit> \
+# VLLM_PRECOMPILED_WHEEL_LOCATION=https://wheels.vllm.ai/${VLLM_COMMIT}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl \
```
**21-21**: Remove VLLM_USE_PRECOMPILED in favor of the new envs.

Guidance conflicts with the new flow. Prefer `VLLM_COMMIT` + `VLLM_PRECOMPILED_WHEEL_LOCATION`.

```diff
-# [IMPORTANT] Remember to set the shell variable 'export VLLM_USE_PRECOMPILED=1' when running NeMo RL apps with this custom vLLM to avoid re-compiling.
+# [IMPORTANT] Export VLLM_COMMIT and VLLM_PRECOMPILED_WHEEL_LOCATION when running NeMo RL apps to avoid re-compiling.
```
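The env-driven setup these comments converge on can be sketched by deriving the wheel URL from the commit, so the two values cannot drift apart. The commit is the example one used earlier in this review; the trailing `uv run` line is shown as a comment because it would hit the network.

```shell
# Derive the precompiled-wheel URL from the pinned commit.
export VLLM_COMMIT="a3319f4f04fbea7defe883e516df727711e516cd"
export VLLM_PRECOMPILED_WHEEL_LOCATION="https://wheels.vllm.ai/${VLLM_COMMIT}/vllm-1.0.0.dev-cp38-abi3-manylinux1_x86_64.whl"
# Then run apps as usual, e.g.:
#   uv run --extra vllm vllm serve Qwen/Qwen3-0.6B
echo "${VLLM_PRECOMPILED_WHEEL_LOCATION}"
```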
🧹 Nitpick comments (6)
docker/Dockerfile.ngc_pytorch (1)
**18-22**: Expose vLLM args: good; add traceability labels in release stage.

Baking these as ENV is fine. For build provenance, label the image with the chosen commit and wheel URL.

Apply in the release stage:

```diff
 LABEL com.nvidia.build.ref="${NVIDIA_BUILD_REF}"
+LABEL com.nvidia.vllm.commit="${VLLM_COMMIT}"
+LABEL com.nvidia.vllm.wheel="${VLLM_PRECOMPILED_WHEEL_LOCATION}"
```

docker/Dockerfile (1)
**14-19**: Same as ngc_pytorch: add labels for provenance.

Surface `VLLM_COMMIT` and `VLLM_PRECOMPILED_WHEEL_LOCATION` via labels to aid debugging and SBOMs.

Add in the release stage:

```diff
 LABEL com.nvidia.build.ref="${NVIDIA_BUILD_REF}"
+LABEL com.nvidia.vllm.commit="${VLLM_COMMIT}"
+LABEL com.nvidia.vllm.wheel="${VLLM_PRECOMPILED_WHEEL_LOCATION}"
```

docs/guides/use-custom-vllm.md (4)
**38-44**: Avoid brittle example output.

Replace the hard-coded version string with a placeholder to prevent doc rot as vLLM changes.

```diff
-# Successfully imported vLLM version: 0.0.1.dev1+g69d5add74.d20250910
+# Successfully imported vLLM version: <version>
```
**72-83**: Mention both Dockerfiles if applicable.

If users also target the PyTorch base image, mirror this snippet with `-f docker/Dockerfile.ngc_pytorch`. Keeps the guidance complete.
**24-34**: Add a quick git hint to enforce reproducibility.

Suggest `git add pyproject.toml uv.lock && git commit -m "chore: update vllm pins"` so users don't forget to commit the resolver state.
**50-60**: Note local wheel paths are supported.

Clarify that `VLLM_PRECOMPILED_WHEEL_LOCATION` can be a local file path (`/wheels/vllm-*.whl`) in airgapped environments.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- `.gitignore` (1 hunks)
- `docker/Dockerfile` (1 hunks)
- `docker/Dockerfile.ngc_pytorch` (1 hunks)
- `docs/guides/use-custom-vllm.md` (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- .gitignore
Actionable comments posted: 0
🧹 Nitpick comments (2)
docker/Dockerfile (1)
**14-19**: Do not persist VLLM_PRECOMPILED_WHEEL_LOCATION in the final image: keep it as ARG and record the commit via a LABEL.

Remove the ENV that bakes VLLM_PRECOMPILED_WHEEL_LOCATION into image layers; change docker/Dockerfile and docker/Dockerfile.ngc_pytorch to keep VLLM_PRECOMPILED_WHEEL_LOCATION as ARG only, set `ARG VLLM_COMMIT=unknown` (for reproducibility), and add a `LABEL org.opencontainers.image.vllm.commit="${VLLM_COMMIT}"` to capture the commit. Keep or expose VLLM_COMMIT as ENV only if you intentionally require it at runtime.
Verified: repository occurrences are limited to docker/*, docs/guides/use-custom-vllm.md and tools/build-custom-vllm.sh; no runtime code was found that requires the ENV names.
docker/Dockerfile.ngc_pytorch (1)
**18-23**: Mirror the security posture: don't bake wheel URLs into the image; wire the commit via LABEL.

Same rationale as the main Dockerfile. Keep the URL as ARG-only; persist the commit via LABEL. Optionally default VLLM_COMMIT for deterministic caching.

Apply:

```diff
-ARG VLLM_COMMIT
-ARG VLLM_PRECOMPILED_WHEEL_LOCATION
-ENV VLLM_COMMIT=${VLLM_COMMIT}
-ENV VLLM_PRECOMPILED_WHEEL_LOCATION=${VLLM_PRECOMPILED_WHEEL_LOCATION}
+ARG VLLM_COMMIT=unknown
+ARG VLLM_PRECOMPILED_WHEEL_LOCATION
+ENV VLLM_COMMIT=${VLLM_COMMIT}
+# Do not persist VLLM_PRECOMPILED_WHEEL_LOCATION
+LABEL org.opencontainers.image.vllm.commit="${VLLM_COMMIT}"
```

If the intent is to let users override the built version or supply a prebuilt wheel, consider (outside these lines) adding a guarded path in the build_vllm stage:

```dockerfile
# After cloning vllm: honor explicit commit if provided
RUN if [ -n "${VLLM_COMMIT:-}" ]; then git checkout "${VLLM_COMMIT}"; else \
      VLLM_VERSION=$(grep -A 1 'name = "vllm"' /tmp/uv.lock | sed -n 's/.*version = "\(.*\)".*/\1/p'); \
      git checkout "v${VLLM_VERSION}"; \
    fi

# Or, if a precompiled wheel location is set, skip source build:
# (Use BuildKit secrets if this URL contains tokens)
RUN if [ -n "${VLLM_PRECOMPILED_WHEEL_LOCATION:-}" ]; then \
      uv pip install --no-cache-dir --no-deps "${VLLM_PRECOMPILED_WHEEL_LOCATION}"; \
    else \
      pip wheel --no-deps --no-build-isolation -v .; \
    fi
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- `.gitignore` (1 hunks)
- `docker/Dockerfile` (1 hunks)
- `docker/Dockerfile.ngc_pytorch` (1 hunks)
- `docs/guides/use-custom-vllm.md` (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- .gitignore
- docs/guides/use-custom-vllm.md
Working state of torch 2.8: https://github.com/NVIDIA-NeMo/RL/pull/new/tk/custom-vllm-fixes-torch2.8
Squashed commit history (force-pushed):

- fix
- add docker instructions
- update to pyproject.toml to create image (revert me)
- space
- only ignore top level .git
- Revert "update to pyproject.toml to create image (revert me)" (reverts dca1742)
- bug in precompiled wheel url
- cleanup
- example upgrading to 2.8 torch and recent vllm and recent ray
- more changes needed for one-off example
- fix the docker image for custom vllm builds
- Revert "more changes needed for one-off example" (reverts 3458000)
- Revert "example upgrading to 2.8 torch and recent vllm and recent ray" (reverts 732eb30)

Signed-off-by: Terry Kong <terryk@nvidia.com>
Actionable comments posted: 1
♻️ Duplicate comments (2)
tools/build-custom-vllm.sh (2)
**16-16**: Fix malformed `set` options.

`set -eou pipefail` trips over `-o u` and the script exits before doing anything. Switch to the standard ordering so `pipefail` is actually enabled.

```diff
-set -eou pipefail
+set -euo pipefail
```
**103-145**: Make the pyproject edits resilient to missing keys.

Running on a pyproject without `dependencies`/`optional-dependencies` blows up with `NoneType` errors. Use `setdefault` so the script can create any missing sections.

```diff
-project = doc.get("project")
-if project is None:
-    raise SystemExit("[ERROR] Missing [project] in pyproject.toml")
-
-deps = project.get("dependencies")
-
-if not any(x.startswith("setuptools_scm") for x in deps):
-    deps.append("setuptools_scm")
+project = doc.setdefault("project", {})
+deps = project.setdefault("dependencies", [])
+if not any(str(x).startswith("setuptools_scm") for x in deps):
+    deps.append("setuptools_scm")

-opt = project.get("optional-dependencies")
-vllm_list = opt["vllm"]
+opt = project.setdefault("optional-dependencies", {})
+vllm_list = opt.setdefault("vllm", [])
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- `.gitignore` (1 hunks)
- `docker/Dockerfile` (2 hunks)
- `docs/guides/use-custom-vllm.md` (1 hunks)
- `tools/build-custom-vllm.sh` (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- docker/Dockerfile
🧰 Additional context used
📓 Path-based instructions (2)
**/*.sh
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
- `**/*.sh`: Follow the Google Shell Style Guide for all shell scripts
- Use `uv run` to execute Python scripts in shell/driver scripts instead of activating virtualenvs and calling `python` directly
- Add the NVIDIA copyright header (with current year) at the top of all shell scripts, excluding tests/ and test-only scripts
Files:
tools/build-custom-vllm.sh
docs/**/*.md
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
When a markdown doc under docs/**/*.md is added or renamed, update docs/index.md to include it in the appropriate section
Files:
docs/guides/use-custom-vllm.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: sphinx-build / Build docs
- GitHub Check: Lint check
- GitHub Check: Lint check
- GitHub Check: Post automodel integration comment / Comment on PR
- GitHub Check: Post submodule check comment / Comment on PR
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>
Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
**Summary by CodeRabbit**

- New Features
- Documentation
Upgrading to latest
The commit included in `build-custom-vllm.sh` in this PR points to a commit near 0.10.0, since that's what is on `main`. Some major upgrades of vllm will probably require manual steps that are impossible to completely enumerate in the instructions, but here is an example of working through a very recent vllm commit: vllm-project/vllm@cc99baf
Finding a vllm commit:
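One way to resolve a branch to a pinnable commit (without cloning) is `git ls-remote`; against the real repo that would be `git ls-remote https://github.com/vllm-project/vllm.git refs/heads/main`. The sketch below demonstrates the same command against a throwaway local repo so it runs offline; the repo and identity values are placeholders.

```shell
# Resolve HEAD of a repo to a full 40-char SHA suitable for pinning.
repo="$(mktemp -d)"
git -C "${repo}" init -q
git -C "${repo}" -c user.email=you@example.com -c user.name=you \
  commit -q --allow-empty -m "init"
sha="$(git ls-remote "${repo}" HEAD | awk '{print $1}')"
rm -rf "${repo}"
echo "pin this: ${sha}"
```

The resulting SHA is what you would pass as the script's third argument and bake into `VLLM_PRECOMPILED_WHEEL_LOCATION`.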