Arm backend: Add 16A8W linear ops support and test #13754
Conversation
Pull Request resolved: #13641

This diff implements a 16A8W (16-bit activations, 8-bit weights) quantization configuration utility for the ExecuTorch ARM backend, following the feedback from D79746479.

## Key Changes

**1. New Quantization Configuration Function**
- Add `get_16a8w_quantization_config()` in `fbcode/executorch/backends/arm/quantizer/arm_quantizer.py` (a rough sketch of such a config is shown after this description)
- Provides 16-bit activations with HistogramObserver (better precision than 8A8W)
- Maintains 8-bit weights with MinMaxObserver/PerChannelMinMaxObserver (memory efficient)
- **Technically supported by TOSA through the [EXT-INT16 extension/profile](https://www.mlplatform.org/tosa/tosa_spec.html#_conv2d)**

## Benefits
- **Better Precision**: 16-bit activations provide higher precision than 8-bit, which is useful for carrying precision through recurrent neural nets.

ghstack-source-id: 305891620
@exported-using-ghexport

Differential Revision: [D79763381](https://our.internmc.facebook.com/intern/diff/D79763381/)
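As a rough illustration, a 16A8W configuration built on the standard PT2E `QuantizationSpec` API might look like the sketch below. The observer choices mirror the description above, but the quant ranges and arguments are assumptions, not the PR's exact code:

```python
import torch
from torch.ao.quantization.observer import HistogramObserver, PerChannelMinMaxObserver
from torch.ao.quantization.quantizer import QuantizationSpec

# 16-bit activations: symmetric per-tensor quantization, observed with a
# HistogramObserver for tighter ranges than a plain MinMax observer.
act_qspec = QuantizationSpec(
    dtype=torch.int16,
    quant_min=-32768,
    quant_max=32767,
    qscheme=torch.per_tensor_symmetric,
    observer_or_fake_quant_ctr=HistogramObserver,
)

# 8-bit weights: symmetric per-channel quantization, keeping the weight
# memory footprint of the usual 8A8W configuration.
weight_qspec = QuantizationSpec(
    dtype=torch.int8,
    quant_min=-127,
    quant_max=127,
    qscheme=torch.per_channel_symmetric,
    ch_axis=0,
    observer_or_fake_quant_ctr=PerChannelMinMaxObserver,
)
```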
Pull Request resolved: #13658

- Adds a linear ops test using the 16A8W config in the INT16 profile (a sketch of the flow follows below).
- Adds support for the INT16 dtype in view ops validation.
- Validated with the TOSA pipeline test.
- Checked that tests previously marked as flaky are no longer flaky and removed the markers.

Note: Not verified with a TOSA reference model run.

ghstack-source-id: 305897251

Differential Revision: [D80308822](https://our.internmc.facebook.com/intern/diff/D80308822/)
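A minimal sketch of how a linear test can exercise such a config through the standard PT2E prepare/convert flow. The module, function, and wiring here are illustrative assumptions rather than the actual Arm TOSA test pipeline; `quantizer` stands in for a backend quantizer whose global config is the 16A8W configuration described above:

```python
import torch
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e
from torch.ao.quantization.quantizer import Quantizer


class SimpleLinear(torch.nn.Module):
    """Single linear layer, mirroring the shape of a linear-op test."""

    def __init__(self) -> None:
        super().__init__()
        self.fc = torch.nn.Linear(32, 16)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(x)


def quantize_linear(quantizer: Quantizer) -> torch.fx.GraphModule:
    """PT2E prepare/calibrate/convert round trip for a linear module."""
    model = SimpleLinear().eval()
    example_inputs = (torch.randn(1, 32),)

    # Export to an ATen-level graph, then run the PT2E quantization flow.
    gm = torch.export.export_for_training(model, example_inputs).module()
    prepared = prepare_pt2e(gm, quantizer)
    prepared(*example_inputs)  # calibration pass
    return convert_pt2e(prepared)
```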
🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13754
Note: Links to docs will display an error until the docs builds have been completed.

❌ 9 New Failures, 3 Unrelated Failures

As of commit ee58c9b with merge base 9053089:

NEW FAILURES - The following jobs have failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@Ninja91 Nice change! The changes in op_transpose.py and op_view.py result in some test failures because we partition a few ops incorrectly. @per and @agrima1304 have patches that fix these failures, but their patches are blocked by the vela pin update in #13282. If you move the changes in op_transpose.py and op_view.py to a separate PR, I believe we should be able to merge this PR.
@Ninja91 Arm tests started failing after this PR; see this dashboard.
Reverting here: #13895
…13895) This reverts commit f8156fb.
@per @mergennachin @oscarandersson8218 The PR was reverted and I am re-landing it here: #13899. Validated that no Arm tests are failing.
Stack from [ghstack](https://github.com/ezyang/ghstack) (oldest at bottom):

* __->__ #13899

- Adds a linear ops test using the 16A8W config in the INT16 profile.
- Adds support for the INT16 dtype in view ops validation (see the sketch below).
- Validated with the TOSA pipeline test.
- Checked that tests previously marked as flaky are no longer flaky and removed the markers.

Note: Not verified with a TOSA reference model run.

Differential Revision: [D81550511](https://our.internmc.facebook.com/intern/diff/D81550511/)

Reattempt to land #13754
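For the view-op validation item above, the following is only an illustrative sketch of the kind of dtype gate involved; the helper name and dtype set are assumptions, not the actual `op_view.py` code:

```python
import torch

# Illustrative only: dtypes a view/reshape lowering might accept once INT16
# support is added (the real validation lives in the backend's op_view.py).
_SUPPORTED_VIEW_DTYPES = {
    torch.int8,
    torch.int16,  # newly allowed by this stack
    torch.int32,
    torch.float32,
}


def validate_view_dtype(dtype: torch.dtype) -> None:
    """Raise if a view node carries a dtype the backend cannot lower."""
    if dtype not in _SUPPORTED_VIEW_DTYPES:
        raise ValueError(f"view: unsupported dtype {dtype}")
```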
This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #13658 by @Ninja91
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/Ninja91/3/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/Ninja91/3/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/Ninja91/1/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/Ninja91/3/orig
@diff-train-skip-merge
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218