
[ET-VK] Prevent decomposition of activation ops with native shaders#17361

Open
abdelaziz-mahdy wants to merge 6 commits into pytorch:main from abdelaziz-mahdy:vulkan-preserve-activation-ops

Conversation

@abdelaziz-mahdy (Contributor) commented Feb 11, 2026

Summary

Add hardswish, hardsigmoid, and hardshrink to the Vulkan partitioner's ops_not_to_decompose list, and register hardswish and hardsigmoid in op_registry.py.

These activation ops have native GLSL shader implementations in the Vulkan backend (activations.h / UnaryOp.cpp) but were being decomposed by PyTorch's default decomposition table into primitive ops (mul/add/clamp/div with constant tensors) before the Vulkan partitioner could claim them.

On PowerVR GPUs (e.g. Pixel 10 Pro), the decomposed paths produce NaN/Inf because the constant scalar tensors (3 and 6 in hardswish(x) = x * clamp(x+3, 0, 6) / 6) are not loaded correctly through the dim_order_ops._to_dim_order_copy buffer-to-texture conversion path.
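For reference, the decomposed arithmetic can be written out in plain Python (a scalar version of the formulas above; in the actual decomposed graph these operate on tensors, with 3 and 6 materialized as constant tensors):

```python
def hardswish(x: float) -> float:
    # hardswish(x) = x * clamp(x + 3, 0, 6) / 6
    # In the decomposed graph the constants 3 and 6 become constant
    # tensors, which is where the PowerVR loading path goes wrong.
    return x * min(max(x + 3.0, 0.0), 6.0) / 6.0

def hardsigmoid(x: float) -> float:
    # hardsigmoid(x) = clamp(x + 3, 0, 6) / 6
    return min(max(x + 3.0, 0.0), 6.0) / 6.0

assert hardswish(4.0) == 4.0     # saturated region: acts as identity
assert hardswish(-4.0) == 0.0    # fully clamped to zero
assert hardsigmoid(0.0) == 0.5   # midpoint of the linear region
```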

Root Cause

  1. aten.hardswish.default and aten.hardsigmoid.default are in PyTorch's default decomposition table
  2. vulkan_partitioner.py's ops_not_to_decompose only contained upsample_nearest2d.vec
  3. When using to_edge_transform_and_lower(), the partitioner's ops_to_not_decompose() method is called — but since these ops weren't listed, they got decomposed before the partitioner could see them
  4. The native GLSL shaders (DEFINE_ACTIVATION_FN(hardswish), VK_REGISTER_OP(aten.hardswish.default, hardswish)) were never used
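The mechanism can be illustrated with a hedged sketch (the class and constant names here are illustrative, not the real ExecuTorch code; only the ops_to_not_decompose() hook name and the upsample entry come from the PR text):

```python
# Illustrative sketch of the partitioner hook, not real ExecuTorch code.
class VulkanPartitionerSketch:
    # Ops listed here are reported to to_edge_transform_and_lower(),
    # which then skips the default decompositions for them.
    OPS_NOT_TO_DECOMPOSE = {
        "aten.upsample_nearest2d.vec",  # the only entry before this PR
    }

    def ops_to_not_decompose(self):
        return sorted(self.OPS_NOT_TO_DECOMPOSE)

# Before the fix, hardswish is absent from the list, so the default
# decomposition table rewrites it into mul/add/clamp/div before the
# partitioner ever sees it.
p = VulkanPartitionerSketch()
assert "aten.hardswish.default" not in p.ops_to_not_decompose()
```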

Changes

  • backends/vulkan/op_registry.py: Register hardsigmoid and hardswish in the unary ops list (they had C++ implementations but were missing from the Python registry)
  • backends/vulkan/partitioner/vulkan_partitioner.py: Add 3 activation ops (hardswish, hardsigmoid, hardshrink) to ops_not_to_decompose so to_edge_transform_and_lower() preserves them

Note: silu was intentionally excluded; it has no native Vulkan shader or C++ registration. Its decomposed path (sigmoid + mul) works correctly, since both component ops have native implementations.
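Put together, the intended post-fix list looks roughly like this (a sketch based on the PR description; the actual entries in vulkan_partitioner.py are torch.ops.aten.* overload objects, shown here as strings so the example is self-contained):

```python
# Sketch of the updated ops_not_to_decompose contents (strings stand in
# for the real torch.ops.aten.* overload objects).
OPS_NOT_TO_DECOMPOSE = [
    "aten.upsample_nearest2d.vec",  # pre-existing entry
    "aten.hardswish.default",       # native GLSL shader exists
    "aten.hardsigmoid.default",     # native GLSL shader exists
    "aten.hardshrink.default",      # native GLSL shader exists
    # aten.silu.default is deliberately omitted: it has no native
    # shader, and its decomposition (sigmoid + mul) lowers correctly.
]

assert "aten.silu.default" not in OPS_NOT_TO_DECOMPOSE
assert len(OPS_NOT_TO_DECOMPOSE) == 4
```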

Test Plan

Tested on Pixel 10 Pro (PowerVR D-Series DXT-48-1536 MC1, Android 16):

  • Isolated hardswish-only model: perfect match with XNNPACK reference (maxDiff=0.000000)
  • Isolated hardsigmoid model: works without NaN
  • Full MobileNet V3 Small (FP32): NaN eliminated (was 1000/1000 NaN → now 0/1000)
  • Full MobileNet V3 Small (FP16): NaN eliminated (0/1000)

Note: MobileNetV3 uses hardswish extensively in feature blocks and hardsigmoid in Squeeze-and-Excite blocks, making both critical for this model family.
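The NaN counts and maxDiff figures in the test plan boil down to checks like the following (a generic sketch in plain Python; the actual harness runs the Vulkan delegate on device and compares against an XNNPACK reference):

```python
import math

def count_non_finite(outputs):
    # Counts NaN/Inf values, as in the "1000/1000 NaN -> 0/1000" metric.
    return sum(1 for v in outputs if not math.isfinite(v))

def max_abs_diff(a, b):
    # Max elementwise deviation, as in "maxDiff=0.000000" above.
    return max(abs(x - y) for x, y in zip(a, b))

reference = [0.001 * i for i in range(1000)]  # stand-in for XNNPACK output
fixed = list(reference)                       # stand-in for Vulkan output
assert count_non_finite(fixed) == 0
assert max_abs_diff(fixed, reference) == 0.0
```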

Fixes #17299

Copilot AI review requested due to automatic review settings February 11, 2026 00:06
@pytorch-bot commented Feb 11, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17361

Note: Links to docs will display an error until the docs builds have been completed.

❌ 6 New Failures

As of commit 2159c3e with merge base 0d9799f:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 11, 2026
@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Copilot AI (Contributor) left a comment

Pull request overview

This PR updates the ExecuTorch Vulkan backend’s lowering path to preserve certain activation ops from PyTorch’s default decompositions so the Vulkan partitioner can claim them and use native unary implementations.

Changes:

  • Extend the Vulkan partitioner’s ops_not_to_decompose list to include several activation ops so they survive to_edge_transform_and_lower().
  • Register aten.hardsigmoid.default and aten.hardswish.default as supported unary ops in the Vulkan Python op registry.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
backends/vulkan/partitioner/vulkan_partitioner.py Adds activation ops to the “do not decompose” list so Vulkan can see/claim them before decomposition happens.
backends/vulkan/op_registry.py Adds hardsigmoid and hardswish to the unary-op registration list for Vulkan partitioning support.


The review comment below refers to this hunk in vulkan_partitioner.py:

    torch.ops.aten.hardsigmoid.default,
    torch.ops.aten.hardswish.default,
    torch.ops.aten.hardshrink.default,
    torch.ops.aten.silu.default,

Copilot AI Feb 11, 2026


torch.ops.aten.silu.default is being added to ops_not_to_decompose, but the Vulkan backend doesn’t appear to have a native implementation/registration for SiLU (no VK_REGISTER_OP(aten.silu.default, ...) in backends/vulkan/runtime/graph/ops/impl, no GLSL helper, and it’s not registered in backends/vulkan/op_registry.py). Preserving it from decomposition may therefore prevent the graph from lowering to Vulkan via the decomposed mul+sigmoid path and could leave an unsupported op in the edge graph.

Suggestion: either (a) add and register a Vulkan SiLU implementation end-to-end (C++ + GLSL + op_registry.py), or (b) remove SiLU from ops_not_to_decompose and keep this list limited to ops that Vulkan can actually consume natively.

Suggested change (remove this line):

    torch.ops.aten.silu.default,

Add hardswish, hardsigmoid, hardshrink, and silu to the Vulkan
partitioner's ops_not_to_decompose list, and register hardswish and
hardsigmoid in the op_registry.

These ops have native GLSL shader implementations in the Vulkan backend
but were being decomposed by PyTorch's default decomposition table into
primitive ops (mul/add/clamp/div with constant tensors) before the
partitioner could claim them. The decomposed paths produce NaN/Inf on
PowerVR GPUs due to constant tensor loading issues in the decomposed
graph.

With this fix, to_edge_transform_and_lower() automatically preserves
these ops via the partitioner's ops_to_not_decompose() method, allowing
the native Vulkan shaders to handle them directly.

Tested on Pixel 10 Pro (PowerVR D-Series DXT-48-1536):
- MobileNet V3 Small: NaN eliminated (was 1000/1000 NaN, now 0/1000)
- Isolated hardswish test: perfect match with XNNPACK reference

Fixes pytorch#17299
@abdelaziz-mahdy abdelaziz-mahdy force-pushed the vulkan-preserve-activation-ops branch from 193765a to 0b85421 Compare February 11, 2026 00:38
Copilot AI review requested due to automatic review settings February 11, 2026 21:06
Copilot AI (Contributor) left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@SS-JIA (Contributor) left a comment

@abdelaziz-mahdy change LGTM, just need to fix some changes introduced by merge conflicts.

Restore register_pow_tensor_scalar which was accidentally replaced
with a duplicate register_unary_op during merge conflict resolution.
Copilot AI review requested due to automatic review settings February 17, 2026 15:32
Copilot AI (Contributor) left a comment

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@nil-is-all (Contributor) commented:

Hi @abdelaziz-mahdy, could you address the merge conflicts?

@abdelaziz-mahdy (Contributor, Author) commented:

> Hi @abdelaziz-mahdy, could you address the merge conflicts?

Pulled from main.



Development

Successfully merging this pull request may close these issues.

Vulkan backend produces all-zero outputs on PowerVR GPU (Pixel 10 Pro)

4 participants