Psa fix tensor caching#452

Closed
saurabhkale17 wants to merge 19 commits into ovep-develop-lnl-1.2 from psa_fix_tensor_caching
Conversation

@saurabhkale17
Description

  1. Changes `emplace` to `operator[]`. The difference matters: `emplace` only inserts a new entry if the key does not already exist in the map, so updates to an existing cache entry were silently dropped.
  2. Changes the caching lookup to key off input/output names instead of raw ORT pointers.
  3. Changes OV tensor creation for CPU-allocated input/output ORT tensors. The CPU-allocated input/output tensor path was re-allocating OV tensors based on the ORT input/output tensors, so we'd end up with two copies: ORT input/output tensor -> OV tensor (OVEP) -> NPU tensor (NPU plugin).

preetha-intel and others added 18 commits September 6, 2024 20:58
* Implements blob compatibility check for NPU

* OVEP catches the NPU driver exception and returns failure status

* NPU to CPU fallback is disabled when inferencing with blob

* Update NPU device exception handling approach

* Changes failure status code to exception (std::runtime_error)

* Capture all NPU related errors

* Throw minimal error message with error type and error code for Release
  builds

* Fix lint issues

* Address review comments

* Address review comments

---------

Co-authored-by: Srirammaswamy <srirammaswamy.s@intel.com>
…PU (#441)

* Prototype shared memory allocator on Windows using OV-EP

* Partially working allocator.

Crashing on tensor destruction. Might have UMD exceptions. Needs further
debug. Unknown if values are correct.

* Hard code onnx perf to use RT NPU allocator for inputs

* Fix allocation lookups coming from different level zero contexts

* Page align OV allocation

* Allocate input as WC

* Only set tensors when they have changed.

* Revert "Allocate input as WC"

This reverts commit d43219f.

* Hard code onnx perf to use RT NPU for outputs

* Revert "Hard code onnx perf to use RT NPU for outputs"

This reverts commit c1f3b3e.

* Hard code onnx perf to use RT NPU for outputs fixed

* Fix onnx_perf_test app crash on tensor destroy

* refactor: remove redundant ort_shape_to_ovshape lambda function

* Allocate buffer in NPU visible region from perf test application

* remove redundant code

* add command line parameter in perf test for using remote tensors

* remove redundant code

* remove redundant statements

* fix crash during inference

* remove redundant code

* enable backward compatibility of remote tensor feature

* Revert "enable backward compatibility of remote tensor feature"

This reverts commit 1791b90.

* enable backward compatibility of remote tensor feature in OVEP

---------

Co-authored-by: Javier E. Martinez <javier.e.martinez@intel.com>
Co-authored-by: Eric Crawford <eric.r.crawford@intel.com>
Disable driver caching for NPU when epctx enabled for OV version greater than 2024.3
* fix debug build issue and lint issues

* change naming for OVEP NPU specific macro

* fix unit tests and lint issues
Modified Create Options to pass config options to the execution provider
@saurabhkale17
Author

Duplicate PR of #455
