Qualcomm AI Engine Direct - GLM1.5B#15691
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/15691
Note: Links to docs will display an error until the docs builds have been completed. This comment was automatically generated by Dr. CI and updates every 15 minutes.
cccclai
left a comment
Looks good, can you fix the lint?
examples/models/llama/model_args.py
Outdated
```python
attention_kwargs: Dict[str, Any] = dataclasses.field(default_factory=dict)
# Hybrid models can have layer types different from attention
layer_types: Optional[list] = None
model_architecture: Optional[str] = None
```
Can you add comments to explain this variable?
Added. Thanks
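A minimal sketch of how those fields might end up documented in the dataclass (field names and defaults come from the diff above; the class name and comment wording are assumptions, not the PR's exact code):

```python
import dataclasses
from typing import Any, Dict, List, Optional


@dataclasses.dataclass
class ModelArgs:
    # Extra kwargs forwarded to the attention module (e.g. bias flags).
    attention_kwargs: Dict[str, Any] = dataclasses.field(default_factory=dict)
    # Hybrid models can have layer types different from attention;
    # None means every layer uses the default attention type.
    layer_types: Optional[List[str]] = None
    # HuggingFace-style architecture name (e.g. "GlmForCausalLM"), used to
    # select model-specific submodules where architectures diverge.
    model_architecture: Optional[str] = None
```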
```json
"use_hf_rope": true,
"attention_qkv_bias": false,
"use_qk_norm": false,
"model_architecture": "GlmForCausalLM"
```
Do we have any existing variable that can be used for this?
Thanks for the suggestion.
I was actually considering reusing base_model_name_or_path. However, that variable is used in optimum for a different purpose (referring to the actual model path), so I created a new variable to avoid conflicts in the future.
Another reason for creating this config is that, as we enable more models, we notice minor differences among them. For example, GLM's FeedForward is different from other models' FeedForward, so we need a variable to differentiate GLM from the other LLM models.
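The kind of dispatch described here could look roughly like the sketch below. Only the architecture string "GlmForCausalLM" comes from the PR; the class names and factory function are hypothetical, illustrating why a dedicated model_architecture field is useful:

```python
# Hypothetical sketch: pick a FeedForward variant based on the
# model_architecture config value. Names below are illustrative.

class FeedForward:
    """Default FFN shared by most decoder models."""
    kind = "default"


class GlmFeedForward(FeedForward):
    """GLM's FFN differs from the default (per the review discussion)."""
    kind = "glm"


def make_feed_forward(model_architecture):
    # model_architecture is the HuggingFace-style name from the config,
    # e.g. "GlmForCausalLM"; None falls back to the shared default.
    if model_architecture == "GlmForCausalLM":
        return GlmFeedForward()
    return FeedForward()
```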
Force-pushed from 306d471 to 555714b
Summary
GLM Enablement
python examples/qualcomm/oss_scripts/llama/llama.py -b build-android -s $DEVICE -m SM8750 --temperature 0 --model_mode kv --max_seq_len 128 --decoder_model glm-1_5b --prompt "Could you tell me about Facebook?"
Test plan
python backends/qualcomm/tests/test_qnn_delegate.py -k TestExampleLLMScript.test_static_glm1_5b --model SM8750 --build_folder build-android/ --executorch_root . -s $DEVICE --artifact ./glm1_5b