Psa fix tensor caching#452

Closed
saurabhkale17 wants to merge 19 commits into ovep-develop-lnl-1.2 from psa_fix_tensor_caching
Conversation

@saurabhkale17
Description

  1. Changes `emplace` to `operator[]`. The difference matters: `emplace` only inserts a new entry if the key does not already exist in the map, so updates to an existing cache entry were silently dropped.
  2. Changes the caching lookup to key off input/output names instead of raw ORT pointers.
  3. Changes OV tensor creation for CPU-allocated input/output ORT tensors. The CPU-allocated input/output tensor path was re-allocating OV tensors based on the ORT input/output tensors, so we'd end up with two copies: ORT input/output tensor -> OV tensor (OVEP) -> NPU tensor (NPU plugin).

preetha-intel and others added 18 commits September 6, 2024 20:58
* Implements blob compatibility check for NPU

* OVEP catches the NPU driver exception and returns failure status

* NPU to CPU fallback is disabled when inferencing with blob

* Update NPU device exception handling approach

* Changes failure status code to exception (std::runtime_error)

* Capture all NPU related errors

* Throw minimal error message with error type and error code for Release
  builds

* Fix lint issues

* Address review comments

* Address review comments

---------

Co-authored-by: Srirammaswamy <srirammaswamy.s@intel.com>
…PU (#441)

* Prototype shared memory allocator on Windows using OV-EP

* Partially working allocator.

Crashing on tensor destruction. Might have UMD exceptions. Needs further
debug. Unknown if values are correct.

* Hard code onnx perf to use RT NPU allocator for inputs

* Fix allocation lookups coming from different level zero contexts

* Page align OV allocation

* Allocate input as WC

* Only set tensors when they have changed.

* Revert "Allocate input as WC"

This reverts commit d43219f.

* Hard code onnx perf to use RT NPU for outputs

* Revert "Hard code onnx perf to use RT NPU for outputs"

This reverts commit c1f3b3e.

* Hard code onnx perf to use RT NPU for outputs fixed

* Fix onnx_perf_test app crash on tensor destroy

* refactor: remove redundant ort_shape_to_ovshape lambda function

* Allocate buffer in NPU visible region from perf test application

* remove redundant code

* add command line parameter in perf test for using remote tensors

* remove redundant code

* remove redundant statements

* fix crash during inference

* remove redundant code

* enable backward compatibility of remote tensor feature

* Revert "enable backward compatibility of remote tensor feature"

This reverts commit 1791b90.

* enable backward compatibility of remote tensor feature in OVEP

---------

Co-authored-by: Javier E. Martinez <javier.e.martinez@intel.com>
Co-authored-by: Eric Crawford <eric.r.crawford@intel.com>
Disable driver caching for NPU when epctx enabled for OV version greater than 2024.3
* fix debug build issue and lint issues

* change naming for OVEP NPU specific macro

* fix unit tests and lint issues
Modified Create Options to pass config options to the execution provider
@saurabhkale17
Author

Duplicate PR of #455
