[NvTensorRTRTX EP] Implement GetHardwareDeviceIncompatibilityDetails with driver and compute capability checks #27577
Open
umangb-09 wants to merge 2 commits into microsoft:main
@chilo-ms @adrianlizarraga review this
Pull request overview
Implements ORT’s GetHardwareDeviceEpIncompatibilityDetails hook for the NvTensorRTRTX EP factory so callers can receive structured diagnostics (compute capability + NVIDIA driver version checks) instead of opaque session-creation failures.
Changes:
- Wire `GetHardwareDeviceIncompatibilityDetails` into `NvTensorRtRtxEpFactory` and implement detailed incompatibility reporting.
- Extend the existing hardware-device support check to optionally return compute capability major/minor for better diagnostics.
- Add NVML-based driver version querying and link the provider module against NVML.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| onnxruntime/core/providers/nv_tensorrt_rtx/nv_provider_factory.cc | Adds the incompatibility-details implementation, including compute capability extraction and NVML driver version validation. |
| cmake/onnxruntime_providers_nv.cmake | Links NvTensorRTRTX provider against CUDA::nvml to enable NVML driver version queries. |
1d981d9 to 9285932
Description
Implements `GetHardwareDeviceIncompatibilityDetailsImpl` for the NvTensorRTRTX EP factory, wiring it into ORT's `GetHardwareDeviceIncompatibilityDetails` EP API.

Motivation and Context
The GetHardwareDeviceEpIncompatibilityDetails API (introduced in #26922) allows EPs to return structured diagnostic information when a device is not compatible, rather than silently failing.
This PR implements that API for the NvTensorRTRTX EP so that users get actionable error messages when their GPU architecture or NVIDIA driver version does not meet the EP's requirements, instead of an opaque failure at session creation time.
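
For illustration only, the two checks described above can be sketched as a pure function that takes an already-queried compute capability and driver version string and returns structured reasons. This is not the code in this PR: the `IncompatReason` flags, `Requirements`, `IncompatDetails`, and `CheckDevice` names are hypothetical, and the minimum versions are placeholder parameters rather than the EP's actual requirements. The driver string is assumed to use the dotted form that NVML reports (for example `570.86.10`).

```cpp
#include <cstdint>
#include <cstdio>
#include <optional>
#include <string>

// Hypothetical reason flags; the real ORT enum names and values may differ.
enum IncompatReason : uint32_t {
  kNone = 0,
  kComputeCapabilityTooLow = 1u << 0,
  kDriverTooOld = 1u << 1,
  kDriverVersionUnknown = 1u << 2,
};

// Placeholder thresholds supplied by the caller (not the EP's real minimums).
struct Requirements {
  int min_cc_major;
  int min_cc_minor;
  int min_driver_major;
};

struct IncompatDetails {
  uint32_t reasons = kNone;  // bitwise OR of IncompatReason flags
  std::string notes;         // human-readable explanation for each flag
};

// Extracts the leading major number from a dotted driver version string.
// Returns std::nullopt when the string does not start with a positive integer.
std::optional<int> ParseDriverMajor(const std::string& version) {
  int major = 0;
  if (std::sscanf(version.c_str(), "%d", &major) != 1 || major <= 0) {
    return std::nullopt;
  }
  return major;
}

// Runs both checks and accumulates every failing reason instead of
// stopping at the first one, so the caller can report all problems at once.
IncompatDetails CheckDevice(int cc_major, int cc_minor,
                            const std::string& driver_version,
                            const Requirements& req) {
  IncompatDetails d;

  const bool cc_ok =
      cc_major > req.min_cc_major ||
      (cc_major == req.min_cc_major && cc_minor >= req.min_cc_minor);
  if (!cc_ok) {
    d.reasons |= kComputeCapabilityTooLow;
    d.notes += "Compute capability " + std::to_string(cc_major) + "." +
               std::to_string(cc_minor) + " is below the required " +
               std::to_string(req.min_cc_major) + "." +
               std::to_string(req.min_cc_minor) + ". ";
  }

  const std::optional<int> major = ParseDriverMajor(driver_version);
  if (!major) {
    d.reasons |= kDriverVersionUnknown;
    d.notes += "Could not determine the NVIDIA driver version. ";
  } else if (*major < req.min_driver_major) {
    d.reasons |= kDriverTooOld;
    d.notes += "Driver " + driver_version + " is older than the required " +
               std::to_string(req.min_driver_major) + ".x. ";
  }
  return d;
}
```

With placeholder requirements such as `Requirements{8, 0, 555}`, a call like `CheckDevice(7, 5, "535.104.05", req)` would set both the compute capability and driver flags and fill `notes` with one sentence per failure. Keeping the comparison logic separate from the NVML and CUDA queries makes it testable on machines without a GPU.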