Set shared memory type based on options during the compilation phase by quic-ashigarg · Pull Request #24196 · microsoft/onnxruntime

quic-ashigarg · 2025-03-26T20:45:29Z

Description

During inference, setting the QNN EP option enable_htp_shared_memory_allocator gives a hint that RPC-allocated buffers will be used, which avoids buffer copies between the CPU and the NPU.

With this PR, we add the same hint in the compilation phase: if RPC memory is going to be used, any additional allocations on the CPU can be avoided.
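The idea can be sketched as a small decision made once while the graph is compiled. This is a minimal illustration, not the PR's actual code: `TensorMemType`, `select_mem_type`, and the way the option is parsed are hypothetical, and only the option key `enable_htp_shared_memory_allocator` comes from the description above.

```python
from enum import Enum


class TensorMemType(Enum):
    """Hypothetical stand-in for how a graph tensor's data is backed."""
    RAW = "raw"                # client buffer in ordinary CPU memory
    MEM_HANDLE = "mem_handle"  # handle to an RPC-allocated shared buffer


def select_mem_type(provider_options: dict) -> TensorMemType:
    """Pick the memory type for graph I/O tensors at compile time.

    Assumption: the option value is a string and "1" means enabled.
    """
    enabled = provider_options.get("enable_htp_shared_memory_allocator", "0") == "1"
    return TensorMemType.MEM_HANDLE if enabled else TensorMemType.RAW


if __name__ == "__main__":
    print(select_mem_type({"enable_htp_shared_memory_allocator": "1"}))
    print(select_mem_type({}))
```

Deciding this up front is what would let the compilation phase skip CPU-side allocations for tensors that will later be backed by shared memory.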

Motivation and Context

This should help reduce peak CPU memory consumption when running AI workloads that use shared memory.

Related PR: #23136

HectorSVC · 2025-03-26T21:06:01Z

/azp run Linux QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows ARM64 QNN CI Pipeline,Linux Android Emulator QNN CI Pipeline

azure-pipelines · 2025-03-26T21:06:19Z

Azure Pipelines successfully started running 4 pipeline(s).

edgchen1

During inference, using the QNN EP option to set enable_htp_shared_memory_allocator ensures that we use RPC allocated buffers to avoid buffer copy between CPU and NPU.

Technically, it does not ensure shared buffer usage. enable_htp_shared_memory_allocator only makes the allocator available. It is up to the user whether or not to use that allocator for graph inputs and outputs.
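A hedged sketch of the distinction drawn here. The option key is taken from the PR description. The `backend_path` key and the string values "0"/"1" reflect my understanding of the QNN EP provider options and should be checked against the QNN EP documentation. `build_qnn_provider_options` is a hypothetical helper, and `QnnHtp.dll` and `model.onnx` are placeholder names.

```python
def build_qnn_provider_options(backend_path: str, enable_shared_memory: bool) -> dict:
    """Build a QNN EP provider-options dict (hypothetical helper).

    Setting the flag only makes the shared-memory allocator available.
    The application still has to allocate its graph inputs and outputs
    with that allocator for the CPU/NPU copy to be avoided.
    """
    return {
        "backend_path": backend_path,
        "enable_htp_shared_memory_allocator": "1" if enable_shared_memory else "0",
    }


options = build_qnn_provider_options("QnnHtp.dll", enable_shared_memory=True)

# With a QNN-enabled onnxruntime build, the dict would be passed roughly like this:
#   import onnxruntime as ort
#   session = ort.InferenceSession("model.onnx",
#                                  providers=[("QNNExecutionProvider", options)])
```

The session creation is left as a comment because it needs a QNN-enabled build and a real model file.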

onnxruntime/core/providers/qnn/builder/qnn_model_wrapper.cc

onnxruntime/core/providers/qnn/qnn_execution_provider.cc

yuslepukhin · 2025-04-15T17:40:17Z

Please be advised of this PR: #24371

HectorSVC · 2025-04-16T16:37:49Z

/azp run Linux QNN CI Pipeline,Windows x64 QNN CI Pipeline,Windows ARM64 QNN CI Pipeline,Linux Android Emulator QNN CI Pipeline

azure-pipelines · 2025-04-16T16:38:08Z

Azure Pipelines successfully started running 4 pipeline(s).

onnxruntime/core/providers/qnn/builder/qnn_def.h

onnxruntime/core/providers/qnn/qnn_execution_provider.cc

HectorSVC · 2025-04-17T04:00:38Z

/azp run Win_TRT_Minimal_CUDA_Test_C, Windows GPU Doc Gen CI Pipeline

azure-pipelines · 2025-04-17T04:01:21Z

Azure Pipelines successfully started running 1 pipeline(s).

HectorSVC · 2025-04-17T04:12:16Z

/azp run Win_TRT_Minimal_CUDA_Test_CI

azure-pipelines · 2025-04-17T04:12:25Z

Azure Pipelines successfully started running 1 pipeline(s).

…icrosoft#24196)

### Description

During inference, using the QNN EP option to set enable_htp_shared_memory_allocator gives a hint that we use RPC allocated buffers to avoid buffer copy between CPU and NPU.

With the current PR, we add hints in the compilation phase that if RPC memory is going to be used, any additional allocations done on the CPU can be avoided.

### Motivation and Context

This should help reduce the peak CPU memory consumption while running AI workloads using shared memory.

Related PR: microsoft#23136

Co-authored-by: Ashish Garg (AISW) <ashigarg@qti.qualcomm.com>

edgchen1 reviewed Mar 27, 2025

onnxruntime/core/providers/qnn/builder/qnn_model_wrapper.cc (resolved)

onnxruntime/core/providers/qnn/qnn_execution_provider.cc (outdated, resolved)

HectorSVC added the ep:QNN (issues related to QNN execution provider) label on Mar 31, 2025

quic-ashigarg force-pushed the dev/ashigarg/qmem branch from 5d5db8a to 1faa0d4 on April 15, 2025 17:48

Enable qmemTensor handle support in CBG

d21cec8

quic-ashigarg force-pushed the dev/ashigarg/qmem branch from 1faa0d4 to d21cec8 on April 15, 2025 17:54

edgchen1 reviewed Apr 16, 2025

onnxruntime/core/providers/qnn/builder/qnn_def.h (resolved)

onnxruntime/core/providers/qnn/qnn_execution_provider.cc (resolved)

HectorSVC approved these changes Apr 26, 2025

HectorSVC merged commit 138a3a3 into microsoft:main Apr 26, 2025
68 of 71 checks passed

HectorSVC merged 1 commit into microsoft:main from CodeLinaro:dev/ashigarg/qmem
