[NvTensorRTRTX EP] Implement GetHardwareDeviceIncompatibilityDetails with driver and compute capability checks #27577

Open
umangb-09 wants to merge 2 commits into microsoft:main from umangb-09:umangb/device_incomaptibility_MSFT

@umangb-09

Description

Implements GetHardwareDeviceIncompatibilityDetailsImpl for the NvTensorRTRTX EP factory, wiring it into ORT's GetHardwareDeviceIncompatibilityDetails EP API.

Motivation and Context

The GetHardwareDeviceEpIncompatibilityDetails API (introduced in #26922) allows EPs to return structured diagnostic information when a device is not compatible, rather than silently failing.

This PR implements that API for the NvTensorRTRTX EP so that users get actionable error messages when their GPU architecture or NVIDIA driver version does not meet the EP's requirements, instead of an opaque failure at session creation time.

@umangb-09

@chilo-ms @adrianlizarraga review this

Copilot AI left a comment

Pull request overview

Implements ORT’s GetHardwareDeviceEpIncompatibilityDetails hook for the NvTensorRTRTX EP factory so callers can receive structured diagnostics (compute capability + NVIDIA driver version checks) instead of opaque session-creation failures.

Changes:

  • Wire GetHardwareDeviceIncompatibilityDetails into NvTensorRtRtxEpFactory and implement detailed incompatibility reporting.
  • Extend the existing hardware-device support check to optionally return compute capability major/minor for better diagnostics.
  • Add NVML-based driver version querying and link the provider module against NVML.
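As a rough illustration of the kind of check the changes above describe, the following is a minimal sketch, not the PR's actual code. The struct names, the `CheckDevice` helper, and both minimum thresholds are hypothetical; the real requirements of the NvTensorRTRTX EP are not stated in this conversation.

```cpp
#include <string>
#include <vector>

// Hypothetical minimums, chosen only for illustration.
// The PR text does not state the EP's real thresholds.
constexpr int kMinComputeMajor = 8;
constexpr int kMinDriverMajor = 555;

// Hypothetical snapshot of what the factory would learn about a device.
struct DeviceInfo {
  int cc_major;      // compute capability major
  int cc_minor;      // compute capability minor
  int driver_major;  // NVIDIA driver major version
};

// Hypothetical structured result: a flag plus human-readable reasons.
struct IncompatibilityDetails {
  bool compatible = true;
  std::vector<std::string> reasons;
};

// Runs both checks and records every failing one, so the caller sees
// all problems at once rather than only the first.
inline IncompatibilityDetails CheckDevice(const DeviceInfo& d) {
  IncompatibilityDetails out;
  if (d.cc_major < kMinComputeMajor) {
    out.compatible = false;
    out.reasons.push_back(
        "compute capability " + std::to_string(d.cc_major) + "." +
        std::to_string(d.cc_minor) + " is below the assumed minimum " +
        std::to_string(kMinComputeMajor) + ".0");
  }
  if (d.driver_major < kMinDriverMajor) {
    out.compatible = false;
    out.reasons.push_back(
        "driver version " + std::to_string(d.driver_major) +
        " is below the assumed minimum " + std::to_string(kMinDriverMajor));
  }
  return out;
}
```

Collecting reasons in a list instead of returning on the first failure is one way to give users a single actionable message covering both the GPU architecture and the driver.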

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| onnxruntime/core/providers/nv_tensorrt_rtx/nv_provider_factory.cc | Adds the incompatibility-details implementation, including compute capability extraction and NVML driver version validation. |
| cmake/onnxruntime_providers_nv.cmake | Links NvTensorRTRTX provider against CUDA::nvml to enable NVML driver version queries. |
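NVML reports the driver version as a string (via `nvmlSystemGetDriverVersion`), so a validation step needs to turn that string into comparable numbers. The sketch below shows one possible way to do that and is not taken from the PR; `DriverVersion` and `ParseDriverVersion` are hypothetical names, and the example strings are illustrative only.

```cpp
#include <charconv>
#include <optional>
#include <string>
#include <system_error>

// Hypothetical holder for the first two components of a driver version.
struct DriverVersion {
  int major_version = 0;
  int minor_version = 0;
};

// Parses strings shaped like "570.86.10" or "572.83" into major/minor.
// Any components after the second are ignored. Returns std::nullopt when
// the string does not start with a non-negative integer or is malformed.
inline std::optional<DriverVersion> ParseDriverVersion(const std::string& s) {
  DriverVersion v;
  const char* begin = s.data();
  const char* end = begin + s.size();

  auto major_result = std::from_chars(begin, end, v.major_version);
  if (major_result.ec != std::errc{} || v.major_version < 0) {
    return std::nullopt;
  }
  if (major_result.ptr == end) {
    return v;  // Only a major component, e.g. "570".
  }
  if (*major_result.ptr != '.') {
    return std::nullopt;
  }

  auto minor_result = std::from_chars(major_result.ptr + 1, end, v.minor_version);
  if (minor_result.ec != std::errc{}) {
    return std::nullopt;
  }
  return v;
}
```

In a real provider the input would come from a buffer filled by NVML after `nvmlInit_v2` succeeds; here the parser is kept free of NVML so it can be exercised without a GPU.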

@umangb-09 force-pushed the umangb/device_incomaptibility_MSFT branch from 1d981d9 to 9285932 on March 13, 2026 09:41