[TRT RTX EP] EP context changes #25747
@chilo-ms we will need a review of this :)
@jywu-msft We are adding more unit tests that I believe will also help test the compile API etc. in ORT. Can we resurface the topic of running NV EP in the official ORT CI?
/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline
Azure Pipelines successfully started running 5 pipeline(s).
onnxruntime/core/providers/nv_tensorrt_rtx/nv_execution_provider.cc
onnxruntime/core/providers/nv_tensorrt_rtx/nv_execution_provider.cc
### Description
See the title

### Motivation and Context
Make traditional EPs (non plug-in) access OrtValue initializers.

Re: #25747
/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline
Azure Pipelines successfully started running 5 pipeline(s).
onnxruntime/core/providers/nv_tensorrt_rtx/nv_execution_provider.h
expand EP context tests
fix large model unit test
remove unused lines revert changes from main in graph.cc
ef8540a to d0926f8
@chilo-ms Since I resolved the cases where weights are unnecessarily copied based on Dmitri's comments, this should be ready to merge.
/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline
Azure Pipelines successfully started running 5 pipeline(s).
* Implements `GetEPContextNodes()`
* Enables usage of `AddExternalInitializersFromFilesInMemory` for models that have to be communicated as byte stream but are larger than 2GB
* Add EP context unit tests for file, bytestreams and both embed modes

NOTE: For large models > 2GB, `embed_mode=0` must be used. `embed_mode=1` fails due to protobuf limitations

---------

Co-authored-by: Maximilian Müller <maximilianm@nvidia.com>
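The embed-mode note in the description above can be illustrated with a small sketch. It assumes a hypothetical helper, `ChooseEmbedMode`, which is not part of ONNX Runtime or this PR; it only encodes the rule that a context blob too large for a single protobuf message (about 2 GiB, since protobuf sizes are signed 32-bit integers) cannot be embedded in the model and has to be stored externally with `embed_mode=0`. The `model_overhead_bytes` parameter is likewise an assumption, standing in for the rest of the serialized model.

```cpp
#include <cstdint>
#include <limits>

// Protobuf message sizes are signed 32-bit values, so one serialized
// message cannot exceed 2^31 - 1 bytes.
constexpr std::uint64_t kProtobufMaxBytes =
    static_cast<std::uint64_t>(std::numeric_limits<std::int32_t>::max());

// Hypothetical helper (not an ONNX Runtime API).
// Returns 1 (embed the EP context blob in the model) only when the blob plus
// the assumed size of the rest of the model fits in one protobuf message.
// Returns 0 (keep the blob in a separate file) otherwise.
inline int ChooseEmbedMode(std::uint64_t context_bytes,
                           std::uint64_t model_overhead_bytes = 0) {
  if (context_bytes > kProtobufMaxBytes) return 0;
  // Written as a subtraction so the comparison cannot overflow.
  if (model_overhead_bytes > kProtobufMaxBytes - context_bytes) return 0;
  return 1;
}
```

A caller could use the returned value when setting the EP context embed mode before compiling; for any context above the limit, the sketch falls back to `embed_mode=0`, matching the NOTE above.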
### Description
Cherry-pick the following PRs into the `rel-1.23.0` branch:
- #25592
- #25622
- #25688
- #25729
- #25743
- #25769
- #25745
- #25761
- #25751
- #25716
- #25228
- #25768
- #25788
- #25747
- #25800
- #25818
- #25762
- #25749
- #25831

### Motivation and Context

---------

Co-authored-by: quic-tirupath <quic_tirupath@quicinc.com>
Co-authored-by: quic-calvnguy <quic_calvnguy@quicinc.com>
Co-authored-by: qti-kromero <kromero@qti.qualcomm.com>
Co-authored-by: Jeff Kilpatrick <jkilpatrick@qti.qualcomm.com>
Co-authored-by: Scott McKay <skottmckay@gmail.com>
Co-authored-by: David Fan <30608893+jiafatom@users.noreply.github.com>
Co-authored-by: kuanyul-qti <kuanyul@qti.qualcomm.com>
Co-authored-by: Dmitri Smirnov <yuslepukhin@users.noreply.github.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Chunye Wang@AMD <chunywan@amd.com>
Co-authored-by: minfhong-qti <minfhong@qti.qualcomm.com>
Co-authored-by: Vishal Agarwal <vishala@nvidia.com>
Co-authored-by: Maximilian Müller <maximilianm@nvidia.com>
Co-authored-by: Maximilian Müller <44298237+gedoensmax@users.noreply.github.com>
Co-authored-by: Changming Sun <chasun@microsoft.com>
Co-authored-by: adrastogi <aditya.rastogi@microsoft.com>
Co-authored-by: Aditya Rastogi <adityar@ntdev.microsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>