Add 16A8W quantization configuration utility for ARM backend #13728
CI status: as of commit 8f4c714 (merge base 8278e7b), 3 new failures and 1 active SEV. Artifacts and rendered test results: hud.pytorch.org/pr/pytorch/executorch/13728
Force-pushed: bb0f06d → 773fc0b
Pull Request resolved: #13641

This diff implements a 16A8W (16-bit activations, 8-bit weights) quantization configuration utility for the ExecuTorch ARM backend, following the feedback from D79746479.

## Key Changes

**1. New Quantization Configuration Function**
- Add `get_16a8w_quantization_config()` in `fbcode/executorch/backends/arm/quantizer/arm_quantizer.py`
- Provides 16-bit activations with HistogramObserver (better precision than 8A8W)
- Maintains 8-bit weights with MinMaxObserver/PerChannelMinMaxObserver (memory efficient)
- **Technically supported by TOSA through the [EXT-INT16 extension/profile](https://www.mlplatform.org/tosa/tosa_spec.html#_conv2d)**

## Benefits
- **Better Precision**: 16-bit activations provide higher precision than 8-bit. Useful for carrying precision through recurrent neural nets.

ghstack-source-id: 305991462
@exported-using-ghexport

Differential Revision: [D79763381](https://our.internmc.facebook.com/intern/diff/D79763381/)
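The precision benefit of 16-bit activations over 8-bit can be illustrated with a small, self-contained sketch of affine quantization (plain Python, no ExecuTorch dependency; the helper names here are illustrative, not the actual API — the real config lives in `get_16a8w_quantization_config()`):

```python
# Sketch: asymmetric affine quant/dequant round-trip error for 8-bit
# vs 16-bit activation ranges. Helper names are hypothetical.

def affine_qparams(rmin, rmax, qmin, qmax):
    """Scale and zero-point mapping the float range [rmin, rmax]
    onto the integer range [qmin, qmax]."""
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = round(qmin - rmin / scale)
    return scale, zero_point

def quant_dequant(x, scale, zero_point, qmin, qmax):
    """Quantize x to an integer, clamp, then dequantize back."""
    q = max(qmin, min(qmax, round(x / scale) + zero_point))
    return (q - zero_point) * scale

# Same observed activation range, two activation bit-widths.
rmin, rmax = -4.0, 4.0
s8, z8 = affine_qparams(rmin, rmax, -128, 127)        # 8-bit activations
s16, z16 = affine_qparams(rmin, rmax, -32768, 32767)  # 16-bit activations

x = 1.2345
err8 = abs(quant_dequant(x, s8, z8, -128, 127) - x)
err16 = abs(quant_dequant(x, s16, z16, -32768, 32767) - x)
assert err16 < err8  # 16-bit step size is 256x finer for the same range
```

Weights stay at 8 bits in the 16A8W scheme, so this finer activation resolution comes without the memory cost of widening the (much larger) weight tensors.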
Force-pushed: 773fc0b → 8f4c714
This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #13641 by @Ninja91
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/Ninja91/1/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/Ninja91/1/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/main
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/Ninja91/1/orig
@diff-train-skip-merge