fix(trtllm): install pip into runtime venv for NVRTC JIT include discovery by nv-yna · Pull Request #8296 · ai-dynamo/dynamo

nv-yna · 2026-04-17T04:22:30Z

Summary

Install pip into the TRT-LLM runtime venv so TRT-LLM's NVRTC JIT path can discover its own install location at runtime.

Root cause

On Blackwell (sm_100a), TRT-LLM 1.3.0rc11 must JIT-compile fmhaSm100aKernel_* via NVRTC because FMHA cubins are only pre-compiled for sm_90 (Hopper). TRT-LLM's JIT wrapper (cpp/include/tensorrt_llm/deep_gemm/compiler.cuh:151, getJitIncludeDirs()) discovers its install path by shelling out to pip show tensorrt_llm. From there it adds <install>/tensorrt_llm/include as an NVRTC -I option — the path that ships cuda.h and the kernel sources.

Dynamo's runtime venv is built with uv pip install, which does not install pip inside the venv. $PATH still has $VIRTUAL_ENV/bin first, but there is no pip binary there, so the subprocess falls through to /usr/bin/pip (system Python's pip), which cannot see uv-managed site-packages. pip show tensorrt_llm returns "Package(s) not found", getJitIncludeDirs() returns empty, and TRT-LLM calls nvrtcCompileProgram with zero -I options:

[E] [CudaRunner.cpp:458]: Failed to preprocess kernel fmhaSm100aKernel_Qkv...PersistentSwapsAbForGen:
    Compilation failed: NVRTC_ERROR_COMPILATION
    fmhaSm100aKernel_...(4): catastrophic error: could not open source file "cuda.h"
    (no directories in search list)
    #include <cuda.h>

Hopper is unaffected because the pre-compiled sm_90 cubins skip the JIT path entirely.
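
The lookup fall-through described above can be reproduced without TRT-LLM. The sketch below is illustrative only: it builds two temporary directories standing in for $VIRTUAL_ENV/bin and /usr/bin, and uses Python's shutil.which to mimic a PATH lookup on a POSIX system. The helper names (make_fake_pip, resolve_pip) are hypothetical and are not part of Dynamo or TRT-LLM.

```python
import os
import shutil
import stat
import tempfile
from typing import List, Optional


def make_fake_pip(directory: str) -> str:
    """Create an executable placeholder named 'pip' in `directory`."""
    path = os.path.join(directory, "pip")
    with open(path, "w") as f:
        f.write("#!/bin/sh\nexit 0\n")
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path


def resolve_pip(search_dirs: List[str]) -> Optional[str]:
    """Return the 'pip' a PATH-style lookup would pick, or None."""
    return shutil.which("pip", path=os.pathsep.join(search_dirs))


with tempfile.TemporaryDirectory() as root:
    venv_bin = os.path.join(root, "venv", "bin")   # stands in for $VIRTUAL_ENV/bin
    system_bin = os.path.join(root, "usr", "bin")  # stands in for /usr/bin
    os.makedirs(venv_bin)
    os.makedirs(system_bin)
    system_pip = make_fake_pip(system_bin)

    # Before the fix: the venv directory is first on the search path but has
    # no pip, so the lookup falls through to the system directory.
    before = resolve_pip([venv_bin, system_bin])

    # After the fix: pip exists in the venv directory and wins the lookup.
    venv_pip = make_fake_pip(venv_bin)
    after = resolve_pip([venv_bin, system_bin])

    print("before fix resolves to system pip:", before == system_pip)
    print("after fix resolves to venv pip:", after == venv_pip)
```

The point of the sketch is that having the venv directory first on $PATH is not enough on its own; the binary has to exist there for the lookup to stop.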

Fix

Add pip to the uv pip install line in both the runtime and dev/local-dev branches of container/templates/trtllm_runtime.Dockerfile. pip show tensorrt_llm now resolves to /opt/dynamo/venv/bin/pip (installed in the venv), which can see the dist-info — TRT-LLM gets the correct include path and NVRTC JIT succeeds.

Alternatives considered and rejected

- CPATH=/usr/local/cuda/include: NVRTC does not honor GCC's CPATH.
- ln -s /usr/local/cuda/include/cuda.h /usr/include/cuda.h: NVRTC does not search /usr/include; it only uses -I options.
- apt install cuda-cudart-dev-13-1: redundant; headers are already present. The bug is discovery, not file absence.
- Patching getJitIncludeDirs() upstream: correct long-term fix, but requires a TRT-LLM release; the pip-in-venv workaround unblocks Dynamo 1.1.0 immediately.
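
For context on the upstream alternative, one possible direction is to locate the package directory through the interpreter's import machinery instead of shelling out to pip. The following is a minimal Python sketch of that idea only. It is not what TRT-LLM does today (the real code is C++ and calls pip show), and find_include_dir is a hypothetical helper name introduced here for illustration.

```python
import importlib.util
import os
from typing import Optional


def find_include_dir(package: str, subdir: str = "include") -> Optional[str]:
    """Return <package dir>/<subdir> for an importable package, or None.

    Uses importlib.util.find_spec, which consults the running interpreter's
    sys.path, so it sees the same site-packages the process itself uses.
    It does not check that the subdirectory actually exists.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        # Not installed, or a plain module rather than a package.
        return None
    package_dir = list(spec.submodule_search_locations)[0]
    return os.path.join(package_dir, subdir)


# Demonstrated with a stdlib package so the sketch runs anywhere;
# with TRT-LLM installed the call would be find_include_dir("tensorrt_llm").
print(find_include_dir("json"))
print(find_include_dir("no_such_package_for_this_demo"))
```

Because the lookup runs inside the venv's own interpreter, it would not depend on which pip binary happens to be first on $PATH.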

Test plan

Reproduced and fixed on GB200 (gb200nvl4, NVIDIA GB200 sm_100a, ARM64) using the pre-built CI arm64 image gitlab-master.nvidia.com:5005/dl/ai-dynamo/dynamo-ci:652d692a...-48651348-trtllm-arm64:

- Baseline: trtllm-serve serve Qwen/Qwen3-0.6B --backend pytorch fails with the NVRTC error above.
- Fix validated inline: docker exec ... uv pip install pip → pip show tensorrt_llm resolves → trtllm-serve starts, /v1/models returns Qwen/Qwen3-0.6B, /v1/chat/completions returns a valid completion, nvidia-smi shows 171 GB used by the worker.
- Template change renders without diff drift (verified via python3 container/render.py --framework=trtllm --target=runtime --cuda-version=13.1 --platform=arm64).
- Fresh docker build from this branch on GB200 produces dyn-2715-main-fix:latest (28.5 GB arm64 image). trtllm-serve serve Qwen/Qwen3-0.6B --backend pytorch starts clean, /v1/models returns the model, /v1/chat/completions returns a valid completion, nvidia-smi shows 167 GB of GPU memory used by the worker process. Zero NVRTC errors in /tmp/serve.log.
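
The "zero NVRTC errors" check can be scripted. The sketch below is one way to do it, assuming the failure always surfaces with an NVRTC_ERROR_* code or the "could not open source file" message shown in the root-cause log. The function name and the pattern are illustrative choices, not part of the PR.

```python
import re
from typing import List

# Matches the two signatures seen in the failing log from this PR.
NVRTC_ERROR_PATTERN = re.compile(r"NVRTC_ERROR_\w+|could not open source file")


def find_nvrtc_errors(log_text: str) -> List[str]:
    """Return the log lines that look like NVRTC compilation failures."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if NVRTC_ERROR_PATTERN.search(line)
    ]


# Sample taken from the error quoted in the root-cause section
# (elisions kept exactly as they appear there).
sample_failure = """\
[E] [CudaRunner.cpp:458]: Failed to preprocess kernel fmhaSm100aKernel_Qkv...PersistentSwapsAbForGen:
    Compilation failed: NVRTC_ERROR_COMPILATION
    fmhaSm100aKernel_...(4): catastrophic error: could not open source file "cuda.h"
    (no directories in search list)
    #include <cuda.h>
"""

for hit in find_nvrtc_errors(sample_failure):
    print(hit)
```

Against a real run, the input would be the contents of /tmp/serve.log, and an empty result corresponds to the clean start described above.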

Fixes DYN-2715.

Summary by CodeRabbit

Chores
- Improved runtime environment configuration to ensure proper package discovery and dependency resolution.

…nstall

TRT-LLM's NVRTC JIT path (FMHA kernel compilation on Blackwell sm_100a) discovers its install location at runtime by shelling out to `pip show tensorrt_llm`. The runtime venv is built with `uv pip install`, which does not place `pip` inside the venv, so the subprocess resolves to the system `/usr/bin/pip` and cannot see uv-managed site-packages.

`pip show` then returns "Package(s) not found" and TRT-LLM passes zero `-I` options to NVRTC, failing the FMHA JIT with:

NVRTC_ERROR_COMPILATION ... could not open source file "cuda.h" (no directories in search list)

The failure only surfaces on Blackwell because sm_90 (Hopper) ships pre-compiled cubins and never invokes NVRTC.

Fixes DYN-2715.

Signed-off-by: Yuewei Na <nv-yna@users.noreply.github.com>

github-actions · 2026-04-17T04:22:39Z

👋 Hi nv-yna! Thank you for contributing to ai-dynamo/dynamo.

Just a reminder: The NVIDIA Test Github Validation CI runs an essential subset of the testing framework to quickly catch errors. Your PR reviewers may elect to test the changes comprehensively before approving your changes.

🚀

coderabbitai · 2026-04-17T04:23:35Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7c0c209a-d6d5-4618-90b7-717e31c3cde8

📥 Commits

Reviewing files that changed from the base of the PR and between b0f7d8a and 4aede83.

📒 Files selected for processing (1)

container/templates/trtllm_runtime.Dockerfile

Walkthrough

Updates the Dockerfile to explicitly install the pip package into the virtual environment during uv pip install commands for both dev and non-dev targets. Adds inline comments explaining that pip is required at runtime for tensorrt_llm NVRTC JIT discovery via the pip show command.

Changes

Cohort / File(s)	Summary
Infrastructure/Docker Configuration `container/templates/trtllm_runtime.Dockerfile`	Added explicit `pip` package installation to venv during wheel installation for both dev and non-dev targets. Added clarifying comments that pip is required at runtime for tensorrt_llm NVRTC JIT discovery affecting FMHA kernel JIT on sm_100a.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and specifically summarizes the main change: installing pip into the runtime venv to fix NVRTC JIT discovery for TRT-LLM.
Description check	✅ Passed	The description fully addresses the template requirements with comprehensive sections: clear summary, detailed root cause analysis, concrete fix explanation, alternatives considered, and thorough test plan with verification steps.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.


…overy (#8296) Signed-off-by: Yuewei Na <nv-yna@users.noreply.github.com> Co-authored-by: Yuewei Na <nv-yna@users.noreply.github.com> Signed-off-by: Indrajit Bhosale <iamindrajitb@gmail.com>

nv-yna requested review from a team as code owners April 17, 2026 04:22

pull-request-size Bot added the size/S label Apr 17, 2026

copy-pr-bot Bot temporarily deployed to GITLAB April 17, 2026 04:22 Inactive

github-actions Bot added the fix label Apr 17, 2026

github-actions Bot added the external-contribution and container labels Apr 17, 2026

copy-pr-bot Bot temporarily deployed to GITLAB April 17, 2026 04:23 Inactive

nv-yna mentioned this pull request Apr 17, 2026

fix(trtllm): install pip into runtime venv for NVRTC JIT include discovery #8297

Closed

3 tasks

nv-yna requested review from dillon-cullinan, ranrubin and tanmayv25 and removed request for ranrubin April 17, 2026 06:56

tanmayv25 approved these changes Apr 17, 2026

nv-yna merged commit 94ee2aa into ai-dynamo:main Apr 17, 2026
70 checks passed

richardhuo-nv pushed a commit that referenced this pull request Apr 17, 2026

fix(trtllm): install pip into runtime venv for NVRTC JIT include disc…

820ce73

…overy (#8296) Signed-off-by: Yuewei Na <nv-yna@users.noreply.github.com> Co-authored-by: Yuewei Na <nv-yna@users.noreply.github.com>

nv-yna mentioned this pull request Apr 18, 2026

[test] fix(trtllm): DYN-2715 on release/1.1.0 with #8324 IRSA cherry-pick #8338

Merged

4 tasks

nv-yna merged 1 commit into ai-dynamo:main from nv-yna:yna/dyn-2715
