Qualcomm AI Engine Direct - Merge the two pybind libraries into a single library by shewu-quic · Pull Request #15999 · pytorch/executorch

shewu-quic · 2025-11-27T06:16:19Z

Summary:

Prevent dynamic_cast failures that occur with clang because each of the two pybind libraries carries its own separate typeinfo.

cc: @haowhsu-quic
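
As a loose analogy for the failure mode named in the summary (not the C++ mechanism itself), the Python sketch below defines the same class twice in separate namespaces, much as two separately built libraries can each carry their own copy of a type's typeinfo. The class name `QnnWrapper` and the helper `load_copy` are hypothetical and do not come from this PR.

```python
# Analogy only: Python class objects stand in for C++ typeinfo objects.
# "QnnWrapper" and "load_copy" are hypothetical names for illustration.
CLASS_SOURCE = "class QnnWrapper:\n    pass\n"


def load_copy():
    """Run the same class definition in a fresh namespace.

    Each call yields a distinct class object, like a second library
    that ships its own copy of the same type information.
    """
    namespace = {}
    exec(CLASS_SOURCE, namespace)
    return namespace["QnnWrapper"]


lib_a_type = load_copy()
lib_b_type = load_copy()

obj = lib_a_type()

# The two copies share a name but are different type objects.
same_name = lib_a_type.__name__ == lib_b_type.__name__
own_check = isinstance(obj, lib_a_type)
cross_check = isinstance(obj, lib_b_type)

print(same_name, own_check, cross_check)
```

The names match, yet the cross-copy check fails because the type identities differ. Keeping a single copy of the type, which is what merging the libraries aims for, avoids that mismatch.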

pytorch-bot · 2025-11-27T06:16:22Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15999

❌ 1 New Failure, 1 Cancelled Job, 1 Unrelated Failure

As of commit fffb232 with merge base ee236cb:

NEW FAILURE - The following job has failed:

trunk / test-llama-runner-qnn-linux (fp32, qnn_8a8w, qnn) / linux-job (gh)
RuntimeError: Command docker exec -t d5858bf339effc8a64630d482a83cea3ddc8d201a67433dabd35f04b749f32a2 /exec failed with exit code 1

CANCELLED JOB - The following job was cancelled. Please retry:

trunk / test-models-macos-cpu (mobilebert, xnnpack-quantization-delegation) / macos-job (gh)
##[error]The operation was canceled.

UNSTABLE - The following job is marked as unstable, possibly due to flakiness on trunk:

pull / android / run-emulator (gh) (#16137)
Timeout waiting for emulator to boot.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

shewu-quic · 2025-11-27T06:19:43Z

Hi @cccclai,
This PR addresses the dynamic_cast failure encountered with clang in issue #15734.
I will split "Qualcomm AI Engine Direct - Support PARSeq in floating point precision" into a few PRs.

Could you take a look at this when you have a moment?

Thanks

shewu-quic · 2025-11-27T06:25:37Z

@pytorchbot label "release notes: qualcomm"

cccclai · 2025-12-01T18:32:24Z

This seems like a pretty big change, and I only recently disabled the QNN pybind test (#15949). Let's make sure it passes the internal tests and the QNN-related CI.

meta-codesync · 2025-12-11T17:56:31Z

@cccclai has imported this pull request. If you are a Meta employee, you can view this in D88963737.

DamonFool · 2025-12-16T09:17:37Z

Hi @cccclai, did you test executorch/examples/qualcomm/oss_scripts/qwen2_5/qwen2_5.py?

I got the following error:

Traceback (most recent call last):
  File "/home/jiefu/executorch/examples/qualcomm/oss_scripts/qwen2_5/qwen2_5.py", line 16, in <module>
    from executorch.backends.qualcomm.quantizer.quantizer import QuantDtype
  File "/home/jiefu/executorch/backends/qualcomm/quantizer/quantizer.py", line 12, in <module>
    from executorch.backends.qualcomm._passes.qnn_pass_manager import QnnPassManager
  File "/home/jiefu/executorch/backends/qualcomm/_passes/__init__.py", line 7, in <module>
    from .annotate_adaptive_avg_pool1d import AnnotateAdaptiveAvgPool1D
  File "/home/jiefu/executorch/backends/qualcomm/_passes/annotate_adaptive_avg_pool1d.py", line 7, in <module>
    from executorch.backends.qualcomm.builders.node_visitor import q_ops
  File "/home/jiefu/executorch/backends/qualcomm/builders/__init__.py", line 7, in <module>
    from . import (
  File "/home/jiefu/executorch/backends/qualcomm/builders/node_visitor.py", line 51, in <module>
    torch.int8: PyQnnManager.Qnn_DataType_t.QNN_DATATYPE_SFIXED_POINT_8,
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: module 'executorch.backends.qualcomm.python.PyQnnManagerAdaptor' has no attribute 'Qnn_DataType_t'

shewu-quic · 2025-12-16T09:21:19Z

Hi @DamonFool,

Could you rebuild PyQnnManagerAdaptor.so with the build script?

./backends/qualcomm/scripts/build.sh
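
A stale extension build like the one behind the `AttributeError` above can be detected before the import chain fails. The following is a minimal sketch, assuming a hypothetical helper `missing_attributes`; the `SimpleNamespace` objects are stand-ins for an old and a rebuilt module, and only `Qnn_DataType_t` is taken from the traceback.

```python
from types import SimpleNamespace


def missing_attributes(module, required):
    """Return the names in `required` that `module` does not expose."""
    return [name for name in required if not hasattr(module, name)]


# Hypothetical stand-ins: a stale build that lacks the merged symbols,
# and a fresh build that exposes them. "QnnManager" is illustrative only.
stale = SimpleNamespace(QnnManager=object)
fresh = SimpleNamespace(QnnManager=object, Qnn_DataType_t=object)

# Attribute named in the traceback; extend the list as needed.
REQUIRED = ["Qnn_DataType_t"]

print(missing_attributes(stale, REQUIRED))
print(missing_attributes(fresh, REQUIRED))
```

A non-empty result suggests the installed library predates the merge and should be rebuilt.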

DamonFool · 2025-12-16T09:45:50Z

It works for me.
Thanks @cccclai.

But now I get the following error:

[QNN Partitioner Op Support]: aten.unsqueeze_copy.default | True
[QNN Partitioner Op Support]: aten.unsqueeze_copy.default | True
[QNN Partitioner Op Support]: aten.where.self | True
[ERROR] [Qnn ExecuTorch]: Number of input elements 1 does not match number of output elements 128.

[ERROR] [Qnn ExecuTorch]: Op specific validation failed.

[ERROR] [Qnn ExecuTorch]:  <E> validateNativeOps master op validator aten_copy_default:qti.aisw:Reshape failed 3110

[ERROR] [Qnn ExecuTorch]:  <E> QnnBackend_validateOpConfig failed 3110

[ERROR] [Qnn ExecuTorch]:  <E> Failed to validate op aten_copy_default with error 0xc26

[WARNING] [Qnn ExecuTorch]: Qnn Backend op validation failed with error: 3110
[QNN Partitioner Op Support]: aten.copy.default | False
Traceback (most recent call last):
  File "/home/jiefu/executorch/examples/qualcomm/oss_scripts/qwen2_5/qwen2_5.py", line 259, in <module>
    main(args)
  File "/home/jiefu/executorch/examples/qualcomm/oss_scripts/qwen2_5/qwen2_5.py", line 192, in main
    compile(args)
  File "/home/jiefu/executorch/examples/qualcomm/oss_scripts/qwen2_5/qwen2_5.py", line 80, in compile
    manager.to_edge_transform_and_lower_to_qnn(
  File "/home/jiefu/executorch/examples/qualcomm/oss_scripts/llm_utils/qnn_decoder_model_manager.py", line 292, in to_edge_transform_and_lower_to_qnn
    self.edge_prog_mgr = to_edge_transform_and_lower_to_qnn(
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jiefu/executorch/backends/qualcomm/utils/utils.py", line 448, in to_edge_transform_and_lower_to_qnn
    return to_edge_transform_and_lower(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jiefu/executorch/exir/program/_program.py", line 115, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/jiefu/executorch/exir/program/_program.py", line 1378, in to_edge_transform_and_lower
    edge_manager = edge_manager.to_backend(method_to_partitioner)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jiefu/executorch/exir/program/_program.py", line 115, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/jiefu/executorch/exir/program/_program.py", line 1680, in to_backend
    new_edge_programs = to_backend(method_to_programs_and_partitioners)
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/functools.py", line 909, in wrapper
    return dispatch(args[0].__class__)(*args, **kw)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jiefu/executorch/exir/backend/backend_api.py", line 721, in _
    partitioner_result = partitioner_instance(fake_edge_program)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jiefu/executorch/exir/backend/partitioner.py", line 66, in __call__
    return self.partition(exported_program)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jiefu/executorch/backends/qualcomm/partition/qnn_partitioner.py", line 199, in partition
    partitions = self.generate_partitions(edge_program)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jiefu/executorch/backends/qualcomm/partition/qnn_partitioner.py", line 164, in generate_partitions
    return generate_partitions_from_list_of_nodes(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jiefu/executorch/exir/backend/canonical_partitioners/pattern_op_partitioner.py", line 54, in generate_partitions_from_list_of_nodes
    partition_list = capability_partitioner.propose_partitions()
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jiefu/.python/executorch/lib/python3.11/site-packages/torch/fx/passes/infra/partitioner.py", line 226, in propose_partitions
    if self._is_node_supported(node) and node not in assignment:
       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jiefu/.python/executorch/lib/python3.11/site-packages/torch/fx/passes/infra/partitioner.py", line 87, in _is_node_supported
    return self.operator_support.is_node_supported(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/jiefu/executorch/backends/qualcomm/partition/qnn_partitioner.py", line 100, in is_node_supported
    op_wrapper = self.node_visitors[node.target.__name__].define_node(
                 ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
KeyError: 'dim_order_ops._empty_dim_order.default'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/jiefu/executorch/examples/qualcomm/oss_scripts/qwen2_5/qwen2_5.py", line 265, in <module>
    raise Exception(e)
Exception: 'dim_order_ops._empty_dim_order.default'

I tested it with transformers==4.53.1.

May I ask whether the qwen2_5.py script is also broken with older versions of transformers?

It is broken with transformers==4.56.1, right?
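
The `KeyError` in the traceback comes from indexing `self.node_visitors` with an op name that has no registered visitor. The sketch below shows one defensive pattern, reporting such ops as unsupported instead of raising. It is an illustration under assumptions, not the fix adopted in the repository: `FakeVisitor` and this simplified `is_node_supported` signature are hypothetical.

```python
class FakeVisitor:
    """Hypothetical stand-in for a node visitor with a define_node method."""

    def define_node(self, target_name):
        return f"op_wrapper({target_name})"


def is_node_supported(node_visitors, target_name):
    """Return False for ops without a visitor instead of raising KeyError."""
    visitor = node_visitors.get(target_name)
    if visitor is None:
        # No visitor registered: treat the op as unsupported so the
        # partitioner can leave it outside the delegated partition.
        return False
    return visitor.define_node(target_name) is not None


# Op names taken from the log above; the registry contents are illustrative.
visitors = {"aten.where.self": FakeVisitor()}

print(is_node_supported(visitors, "aten.where.self"))
print(is_node_supported(visitors, "dim_order_ops._empty_dim_order.default"))
```

Whether an unregistered op should fall back silently or fail loudly is a design choice for the maintainers; the sketch only shows the lookup pattern.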

DamonFool · 2025-12-16T09:47:22Z

My test command is:

python3 examples/qualcomm/oss_scripts/qwen2_5/qwen2_5.py \
    -m SM8650  \
    -s xxx \
    --prompt "My favourite condiment is "  \
    -b build-android \
    --decoder_model qwen2.5_0.5B \
    --calibration_tasks wikitext \
    --calibration_limit 1 \
    --ptq 16a8w

shewu-quic · 2025-12-16T10:38:04Z

It seems to be misaligned with transformers==4.56.1. I will look into it. Thanks for reporting.

BTW, you can also run Qwen with the following script to get better performance:

https://github.com/pytorch/executorch/tree/main/examples/qualcomm/oss_scripts/llama

python examples/qualcomm/oss_scripts/llama/llama.py -b build-android -s ${SERIAL_NUM} -m ${SOC_MODEL} --temperature 0 --model_mode hybrid --max_seq_len 1024 --prefill_ar_len 128 --decoder_model qwen2_5-0_5b --prompt "I would like to learn python, could you teach me with a simple example?" --tasks wikitext --limit 1

DamonFool · 2025-12-16T10:45:43Z

Thanks @shewu-quic.
I will try it later.

meta-cla bot added the CLA Signed label Nov 27, 2025

shewu-quic marked this pull request as ready for review November 27, 2025 06:16

shewu-quic requested review from cccclai, kirklandsign and larryliu0820 as code owners November 27, 2025 06:16

pytorch-bot bot added the release notes: qualcomm label Nov 27, 2025

shewu-quic force-pushed the dev1/hutton/combine_pybind_libraries branch 2 times, most recently from 9b4866c to 1ab4cd2 Compare December 3, 2025 01:34

shewu-quic added 2 commits December 11, 2025 09:36

Qualcomm AI Engine Direct - Merge the two pybind libraries into a single library
Summary: Prevent dynamic_cast failures caused by separate typeinfo in each library.

6f35133

fixed ci failure without PyQnnWrapperAdaptor

fffb232

shewu-quic force-pushed the dev1/hutton/combine_pybind_libraries branch from 1ab4cd2 to fffb232 Compare December 11, 2025 01:37

cccclai approved these changes Dec 15, 2025

cccclai merged commit 68ddd80 into pytorch:main Dec 15, 2025
276 of 279 checks passed

shewu-quic mentioned this pull request Jan 15, 2026

Failed to create InputDef for Op aten_permute_copy_default (qti.aisw::Transpose) param perm #16504

Closed

nil-is-all mentioned this pull request Jan 20, 2026

[ERROR] [Qnn ExecuTorch]: Filters in[1] dimension 3 at index 2 not equal to channel_in 200 / groups 1. #16619

Closed

cccclai merged 2 commits into pytorch:main from CodeLinaro:dev1/hutton/combine_pybind_libraries
