Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/14545
Note: Links to docs will display an error until the docs builds have completed.

❌ 3 New Failures, 2 Unrelated Failures
As of commit 96dc88e with merge base 9283b4e:

NEW FAILURES - The following jobs have failed:
BROKEN TRUNK - The following jobs failed but were present on the merge base:
👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.
convert_linear: bool = False
convert_tied_embedding: bool = False

nit: these feel like convert functions; maybe just use use_torchao_kernels_linear and use_torchao_kernels_tied_embedding?
else:
    # Otherwise, only enable the conversions that are specified
    llm_config.backend.torchao.convert_linear = getattr(
        args, "torchao_kernels_linear", False
    )
    llm_config.backend.torchao.convert_tied_embedding = getattr(
        args, "torchao_kernels_tied_embedding", False
    )

nit: can match the name here as well: use_torchao_kernels_linear
parser.add_argument(
    "--use-torchao-kernels",
    action="store_true",
    help="Delegate tied-embedding and quantized linear ops to torchao kernels",
)

why do we need this when it's combining the below two args?
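For reference, the flag resolution being discussed can be sketched as follows. The helper name is hypothetical; the attribute names come from the diff above, and absent attributes default to False:

```python
import argparse


def resolve_torchao_conversions(args: argparse.Namespace) -> tuple[bool, bool]:
    """Hypothetical helper: returns (convert_linear, convert_tied_embedding).

    --use-torchao-kernels acts as shorthand that enables both conversions;
    otherwise only the individually requested conversions are enabled.
    """
    if getattr(args, "use_torchao_kernels", False):
        return True, True
    return (
        getattr(args, "torchao_kernels_linear", False),
        getattr(args, "torchao_kernels_tied_embedding", False),
    )
```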
"""
Configures the torchao-kernels backend.
"""

Can we follow the other backend config examples and use enabled?
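A minimal sketch of what the reviewer's suggestion might look like, mirroring backend configs that expose an `enabled` field. The class name and the `enabled` field are assumptions; the two convert_* fields are taken from the diff under review:

```python
from dataclasses import dataclass


@dataclass
class TorchAoConfig:
    """Sketch only: configures the torchao-kernels backend.

    `enabled` follows the convention the reviewer points to in the
    other backend configs; convert_* fields match the diff above.
    """

    enabled: bool = False
    convert_linear: bool = False
    convert_tied_embedding: bool = False
```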
This adds a new "torchao" backend for pre-quantized checkpoints.
Pre-quantized checkpoints can already be lowered to a backend (e.g., XNNPACK) by passing "-X" in etLLM.
With this PR, pre-quantized checkpoints can instead be lowered to torchao lowbit kernels by passing "--use-torchao-kernels" to the export script in place of "-X". Note that this runs both the linear and tied-embedding kernels with torchao.
To run linear with XNNPACK but tied embedding with torchao, use "--torchao_kernels_tied_embedding" together with "-X".
New CI tests are added for this flow.