Add 16A8W quantization configuration utility for ARM backend #13175
Ninja91 wants to merge 1 commit into pytorch:main
Conversation
🔗 Helpful Links 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/13175
Note: Links to docs will display an error until the docs builds have been completed. ❌ 6 New Failures, 3 Unrelated Failures as of commit 247e13a with merge base b427bd7. NEW FAILURES - The following jobs have failed:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk:
BROKEN TRUNK - The following jobs failed but were present on the merge base: 👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes.
This pull request was exported from Phabricator. Differential Revision: D79763381
Seems like you have PRs which include the same changes as the older PR. Do you want to clean that up, or drop this PR in favor of #13448?
@digantdesai, @mergennachin requested review from @robell, @gggekov. Can any of you please review this PR?
Closing this PR as this commit is covered in #13448 |
Summary:
This diff implements a 16A8W (16-bit activations, 8-bit weights) quantization configuration utility for the ExecuTorch ARM backend, following feedback from D79746479.
## Key Changes

**1. New Quantization Configuration Function**

- Add `get_16a8w_quantization_config()` in `fbcode/executorch/backends/arm/quantizer/arm_quantizer.py` (an illustrative sketch of the observer pairing follows the Benefits list below)
- Provides 16-bit activations with HistogramObserver (better precision than 8A8W)
- Maintains 8-bit weights with MinMaxObserver/PerChannelMinMaxObserver (memory efficient)
- Technically supported by TOSA through the [EXT-INT16 extension/profile](https://www.mlplatform.org/tosa/tosa_spec.html#_conv2d)

**2. Test Implementation**

- Add a `test_linear_16a8w_tosa_INT()` test in `fbcode/executorch/backends/arm/test/ops/test_linear.py`
- Demonstrates usage of the new 16A8W quantization configuration

## Benefits

- **Better Precision**: 16-bit activations provide higher precision than 8-bit, which is useful for carrying precision through recurrent neural nets.
- **Configurable**: Supports the same parameters as the existing quantization configurations.
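For orientation, the activation/weight observer pairing described above can be sketched with stock `torch.ao.quantization` building blocks. This is a minimal illustration, not the exact code added in this PR; the variable names, quantization ranges, and `eps` value are assumptions.

```python
import torch
from torch.ao.quantization.observer import HistogramObserver, PerChannelMinMaxObserver
from torch.ao.quantization.quantizer import QuantizationSpec

# 16-bit symmetric activations observed with HistogramObserver
# (illustrative range; the actual config may restrict it differently).
act_qspec = QuantizationSpec(
    dtype=torch.int16,
    quant_min=-32768,
    quant_max=32767,
    qscheme=torch.per_tensor_symmetric,
    is_dynamic=False,
    observer_or_fake_quant_ctr=HistogramObserver.with_args(eps=2**-12),
)

# 8-bit symmetric per-channel weights observed with PerChannelMinMaxObserver.
weight_qspec = QuantizationSpec(
    dtype=torch.int8,
    quant_min=-127,
    quant_max=127,
    qscheme=torch.per_channel_symmetric,
    ch_axis=0,
    is_dynamic=False,
    observer_or_fake_quant_ctr=PerChannelMinMaxObserver.with_args(eps=2**-12),
)
```

Keeping weights at 8 bits while widening only the activations is what keeps the memory footprint close to 8A8W while recovering most of the accumulation precision.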
## Testing
The implementation provides the utility function and test infrastructure. Note: the test reveals that the TOSA backend has limited INT16 support for some operations (view operations only support INT8/INT32/FP32/BOOL); this is expected and confirms that the configuration correctly produces INT16 tensors.
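To make the intended flow concrete, here is a hypothetical end-to-end sketch against the PT2E quantization flow. The `TOSAQuantizer`/`TosaSpecification` construction, module paths, and the TOSA profile string are assumptions and may not match the test added in this PR exactly.

```python
import torch
from torch.ao.quantization.quantize_pt2e import convert_pt2e, prepare_pt2e

# Assumed import paths; the quantizer entry points live under
# executorch/backends/arm/quantizer in the ExecuTorch tree.
from executorch.backends.arm.quantizer.arm_quantizer import (
    TOSAQuantizer,
    get_16a8w_quantization_config,  # the utility added by this PR
)
from executorch.backends.arm.tosa_specification import TosaSpecification


class SmallLinear(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(32, 16)

    def forward(self, x):
        return self.fc(x)


example_inputs = (torch.randn(1, 32),)
model = torch.export.export_for_training(SmallLinear(), example_inputs).module()

# Assumed: an INT-profile TOSA specification; EXT-INT16 is what makes
# 16-bit activations representable in TOSA.
quantizer = TOSAQuantizer(TosaSpecification.create_from_string("TOSA-1.0+INT"))
quantizer.set_global(get_16a8w_quantization_config())

prepared = prepare_pt2e(model, quantizer)
prepared(*example_inputs)  # calibration pass feeds the HistogramObserver
quantized = convert_pt2e(prepared)
```

The calibration call before `convert_pt2e` matters here: HistogramObserver needs at least one forward pass to compute the INT16 activation qparams.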
Differential Revision: D79763381
cc @digantdesai @freddan80 @per @zingo @oscarandersson8218