fix eval_llama_qnn custom annotation#15953
Conversation
Hi @cccclai, the previous PR #15807 moved all LLM quantization-related configs into the quantization recipe. This fix updates eval_llama_qnn to retrieve the custom annotation from the quantization recipe accordingly. Please have a look, thanks!
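The change described above follows a common refactoring pattern: configuration that used to live in a separate, free-floating config now hangs off the quantization recipe object, and the eval script reads it from there. The sketch below illustrates that pattern only; `QuantRecipe`, `custom_annotations`, and `run_eval` are hypothetical names for illustration, not the actual executorch API.

```python
# Minimal sketch of "retrieve custom annotation from the quantization recipe".
# All names here are illustrative stand-ins, not real executorch symbols.
from dataclasses import dataclass, field
from typing import Callable, List


def annotate_linear(node_name: str) -> str:
    # Placeholder for a custom annotation hook (e.g. marking linear
    # layers for 16a4w quantization).
    return f"annotated:{node_name}"


@dataclass
class QuantRecipe:
    # Stand-in for the quantization recipe that now owns the
    # LLM quantization configs, including custom annotations.
    ptq: str = "16a4w"
    custom_annotations: List[Callable[[str], str]] = field(default_factory=list)


def run_eval(recipe: QuantRecipe) -> List[str]:
    # The fix: read custom annotations off the recipe instead of a
    # separate config that no longer exists after the refactor.
    return [annotate(node) for annotate in recipe.custom_annotations
            for node in ("linear",)]


recipe = QuantRecipe(custom_annotations=[annotate_linear])
print(run_eval(recipe))  # -> ['annotated:linear']
```

The point is simply that callers no longer pass the annotation separately; they hand over the recipe and the eval path pulls everything it needs from it.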
@pytorchbot label "release notes: qualcomm"
lint is failing, can you fix it? |
Done, thanks!
Thank you |
### Summary
Fix eval_llama_qnn: retrieve custom annotation from quantization recipe

### Test plan
```bash
python -m executorch.examples.qualcomm.oss_scripts.llama.eval_llama_qnn --decoder_model qwen2_5-0_5b --quant_linear_only --max_seq_length 1024 --ptq 16a4w
```