Fix Vulkan texture tensor UBO budget overflow #17294

Merged
meta-codesync[bot] merged 1 commit into pytorch:main from
abdelaziz-mahdy:fix/vulkan-texture-ubo-budget
Feb 9, 2026
@abdelaziz-mahdy abdelaziz-mahdy commented Feb 8, 2026

Summary

Fixes #17293

Texture-backed tensors had a UBO (Uniform Buffer Object) metadata budget of only 2 fields (sizes + logical_limits), while operators like Linear and MatMul unconditionally request strides, numel, and dim_order UBOs on all tensors regardless of storage type. This caused an assertion failure at runtime:

Vulkan uniform data allocation has exceeded tensor uniform buffer size
(Tensor.h:579 metadata_ubo_impl)

This affected all Vulkan-delegated models on Android (tested on Pixel 10 Pro, Android 16), including simple models like MobileNet V3 Small.
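
The failure mode described above can be illustrated with a small, self-contained sketch. It is not the ExecuTorch implementation: `MetadataUboArena` and `fits_in_budget` are hypothetical names, and the uniform slot size of 16 bytes used in the usage note is an assumption. The sketch only shows how a budget sized for 2 fields rejects the third request.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical stand-in for a tensor's metadata UBO region: a fixed byte
// budget is chosen up front and each requested field is carved out in order.
class MetadataUboArena {
 public:
  explicit MetadataUboArena(std::size_t budget_nbytes)
      : budget_nbytes_(budget_nbytes) {}

  // Reserves `field_nbytes` and returns the byte offset of the new field.
  // Throws when the request would run past the pre-sized budget.
  std::size_t allocate(std::size_t field_nbytes) {
    if (used_nbytes_ + field_nbytes > budget_nbytes_) {
      throw std::runtime_error(
          "uniform data allocation has exceeded tensor uniform buffer size");
    }
    const std::size_t offset = used_nbytes_;
    used_nbytes_ += field_nbytes;
    return offset;
  }

 private:
  std::size_t budget_nbytes_;
  std::size_t used_nbytes_ = 0;
};

// Returns true if `requested_fields` allocations of `slot_nbytes` each fit
// in a budget that was sized for `budgeted_fields` slots.
bool fits_in_budget(
    std::size_t budgeted_fields,
    std::size_t requested_fields,
    std::size_t slot_nbytes) {
  MetadataUboArena arena(budgeted_fields * slot_nbytes);
  try {
    for (std::size_t i = 0; i < requested_fields; ++i) {
      arena.allocate(slot_nbytes);
    }
  } catch (const std::runtime_error&) {
    return false;
  }
  return true;
}
```

With these assumptions, `fits_in_budget(2, 5, 16)` is false (the pre-fix texture budget against five requests) and `fits_in_budget(5, 5, 16)` is true.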

Changes

backends/vulkan/runtime/api/containers/Tensor.cpp:

  1. calculate_max_ubo_nbytes() — Increased texture tensor UBO budget from 2 fields (sizes + logical_limits) to 5 fields (sizes + strides + dim_order + numel + logical_limits), matching the buffer budget plus logical_limits.

  2. get_max_ubo_nbytes() — Updated max_metadata_field_count for texture tensors from 2u to 5u.
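
As a rough sketch of the budget arithmetic after this change, the function below sums one aligned slot per metadata field. The names `StorageKind`, `align_up`, and `max_ubo_nbytes_sketch` are illustrative, and the field sizes (16 bytes for sizes, strides, and dim_order, 4 bytes for numel, 12 bytes for logical_limits) are assumptions rather than values taken from Tensor.cpp.

```cpp
#include <cstddef>

// Illustrative storage tag; the real code uses its own storage type enum.
enum class StorageKind { Buffer, Texture };

// Rounds `nbytes` up to the next multiple of `alignment` (alignment > 0).
std::size_t align_up(std::size_t nbytes, std::size_t alignment) {
  return ((nbytes + alignment - 1) / alignment) * alignment;
}

// Upper bound on metadata UBO bytes, one aligned slot per field.
// Assumed raw field sizes: 16 bytes (sizes, strides, dim_order),
// 4 bytes (numel), 12 bytes (logical_limits).
std::size_t max_ubo_nbytes_sketch(
    std::size_t min_nbytes_per_ubo,
    StorageKind storage) {
  const std::size_t vec4_nbytes = align_up(16, min_nbytes_per_ubo);
  const std::size_t int_nbytes = align_up(4, min_nbytes_per_ubo);
  const std::size_t limits_nbytes = align_up(12, min_nbytes_per_ubo);

  // Buffer tensors: sizes, strides, dim_order, numel.
  const std::size_t buffer_budget = 3 * vec4_nbytes + int_nbytes;
  if (storage == StorageKind::Buffer) {
    return buffer_budget;
  }
  // Texture tensors: the buffer budget plus logical_limits (5 fields).
  return buffer_budget + limits_nbytes;
}
```

Under a 16-byte minimum alignment this gives 64 bytes for a buffer tensor and 80 bytes for a texture tensor, so the texture budget is the buffer budget plus one extra slot.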

backends/vulkan/test/vulkan_compute_api_test.cpp:

Added texture_tensor_ubo_metadata_budget_test regression test that:

  • Creates a texture-backed vTensor and requests all 5 metadata UBOs (sizes, logical_limits, strides, numel, dim_order)
  • Creates a buffer-backed vTensor and verifies it still works
  • Fails without the fix, passes with it

Test Plan

  • Regression test texture_tensor_ubo_metadata_budget_test passes
  • Full vulkan_compute_api_test suite: 50/52 pass (same 2 pre-existing matmul precision failures on MoltenVK/Apple M2 Pro, unrelated to this change)
  • Zero regressions introduced

cc @SS-JIA @manuelcandales @digantdesai @cbilgin

Copilot AI review requested due to automatic review settings February 8, 2026 00:04
@pytorch-bot pytorch-bot bot added the module: vulkan Issues related to the Vulkan delegate and code under backends/vulkan/ label Feb 8, 2026
pytorch-bot bot commented Feb 8, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17294

Note: Links to docs will display an error until the docs builds have been completed.

❌ 3 New Failures, 1 Cancelled Job

As of commit 1865306 with merge base 50c170c (image):

NEW FAILURES - The following jobs have failed:

CANCELLED JOB - The following job was cancelled. Please retry:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 8, 2026
github-actions bot commented Feb 8, 2026

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Copilot AI left a comment

Pull request overview

Fixes a Vulkan runtime crash where texture-backed tensors exceeded their pre-allocated metadata UBO budget when ops (e.g., Linear/MatMul) request strides/numel/dim_order UBOs regardless of storage type.

Changes:

  • Increase texture-backed tensor metadata UBO budget to cover sizes/strides/dim_order/numel/logical_limits.
  • Update the texture “max metadata field count” accordingly.
  • Add a regression test that exercises all metadata UBO accessors on a texture-backed tensor.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| backends/vulkan/runtime/api/containers/Tensor.cpp | Expands texture tensor UBO budget and updates field-count sizing logic. |
| backends/vulkan/test/vulkan_compute_api_test.cpp | Adds a regression test ensuring texture-backed tensors can allocate all requested metadata UBOs without throwing. |

Comments suppressed due to low confidence (1)

backends/vulkan/runtime/api/containers/Tensor.cpp:1175

  • get_max_ubo_nbytes() currently assumes each metadata field consumes exactly nbytes_per_ubo, but metadata_ubo_impl actually uses align_up(sizeof(field), min_nbytes_per_ubo_), which can be larger than nbytes_per_ubo when the alignment is smaller than the field size (e.g., ivec4 needs 16 bytes). To avoid future under-allocation bugs if this helper gets used, consider computing the max size the same way as calculate_max_ubo_nbytes() (summing the aligned sizes of each field) or removing this helper if it’s dead code.
size_t vTensor::get_max_ubo_nbytes(const size_t nbytes_per_ubo) const {
  // Ops like Linear and MatMul unconditionally request strides/numel UBOs on
  // all tensors regardless of storage type, so texture tensors need the same
  // metadata budget as buffer tensors plus logical_limits (5 fields total).
  size_t max_metadata_field_count = 5u;
  if (storage_type() == utils::kBuffer) {
    // sizes, strides, dim order, numel
    max_metadata_field_count = 4u;
  }
  return max_metadata_field_count * nbytes_per_ubo;
}
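
The concern raised in this suppressed comment can be made concrete with a standalone sketch that compares the two sizing strategies. Everything here is hypothetical: `round_up`, `count_based_budget`, and `aligned_sum_budget` are illustrative names, and the five field sizes (16, 16, 16, 4, and 12 bytes) are assumed rather than read from the runtime.

```cpp
#include <cstddef>

// Rounds `nbytes` up to the next multiple of `alignment` (alignment > 0).
std::size_t round_up(std::size_t nbytes, std::size_t alignment) {
  return ((nbytes + alignment - 1) / alignment) * alignment;
}

// Count-based estimate: every field is assumed to occupy exactly one
// `nbytes_per_ubo` slot, as in the helper quoted above.
std::size_t count_based_budget(
    std::size_t field_count,
    std::size_t nbytes_per_ubo) {
  return field_count * nbytes_per_ubo;
}

// Per-field estimate: each field occupies its own size rounded up to the
// minimum alignment. Field sizes are assumptions for illustration:
// sizes, strides, dim_order (16 bytes each), numel (4), logical_limits (12).
std::size_t aligned_sum_budget(std::size_t min_nbytes_per_ubo) {
  const std::size_t field_nbytes[] = {16, 16, 16, 4, 12};
  std::size_t total = 0;
  for (const std::size_t nbytes : field_nbytes) {
    total += round_up(nbytes, min_nbytes_per_ubo);
  }
  return total;
}
```

When the alignment is at least as large as the biggest field (16 bytes here), both estimates agree at 80 bytes. With a hypothetical 4-byte alignment the count-based estimate is 20 bytes while the per-field sum is 64 bytes, which is the under-allocation the comment warns about.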

Fixes pytorch#17293

Texture-backed tensors only allocated UBO space for 2 metadata fields
(sizes + logical_limits), but operators like Linear and MatMul
unconditionally request strides_ubo() and numel_ubo() on all tensors
regardless of storage type, causing:

  "Uniform data allocation has exceeded Tensor uniform buffer size"

This increases the texture UBO budget to accommodate all 5 metadata
fields (sizes, strides, dim_order, numel, logical_limits) in both
calculate_max_ubo_nbytes() and get_max_ubo_nbytes().

Adds a regression test that verifies texture tensors can serve all
metadata UBO requests without overflow.
@abdelaziz-mahdy abdelaziz-mahdy force-pushed the fix/vulkan-texture-ubo-budget branch from f475ab4 to 1865306 Compare February 8, 2026 00:29
abdelaziz-mahdy added a commit to abdelaziz-mahdy/executorch_flutter_models that referenced this pull request Feb 8, 2026
Buffer storage produces incorrect (zero) results on Android Adreno GPUs.
Texture storage is the default, faster on mobile GPUs, and better tested.
The UBO overflow that originally motivated the buffer workaround is being
fixed upstream (pytorch/executorch#17294).
@SS-JIA SS-JIA left a comment

@abdelaziz-mahdy - thank you for the detailed investigation and fix! The change looks good to me.

I've imported the changes into the Meta-internal codebase so we can validate the CI signals there. Will approve and merge once those pass.

meta-codesync bot commented Feb 9, 2026

@SS-JIA has imported this pull request. If you are a Meta employee, you can view this in D92712792.

@meta-codesync meta-codesync bot merged commit 2d2e422 into pytorch:main Feb 9, 2026
150 of 154 checks passed
Development

Successfully merging this pull request may close these issues.

Vulkan: Texture tensor UBO overflow on Android

3 participants