Fix Vulkan texture tensor UBO budget overflow #17294

meta-codesync[bot] merged 1 commit into pytorch:main
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/17294
Note: Links to docs will display an error until the docs builds have been completed.
❌ 3 New Failures, 1 Cancelled Job as of commit 1865306 with merge base 50c170c. Please retry the cancelled job.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Pull request overview
Fixes a Vulkan runtime crash where texture-backed tensors exceeded their pre-allocated metadata UBO budget when ops (e.g., Linear/MatMul) request strides/numel/dim_order UBOs regardless of storage type.
Changes:
- Increase texture-backed tensor metadata UBO budget to cover sizes/strides/dim_order/numel/logical_limits.
- Update the texture “max metadata field count” accordingly.
- Add a regression test that exercises all metadata UBO accessors on a texture-backed tensor.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| backends/vulkan/runtime/api/containers/Tensor.cpp | Expands texture tensor UBO budget and updates field-count sizing logic. |
| backends/vulkan/test/vulkan_compute_api_test.cpp | Adds a regression test ensuring texture-backed tensors can allocate all requested metadata UBOs without throwing. |
Comments suppressed due to low confidence (1)
backends/vulkan/runtime/api/containers/Tensor.cpp:1175
`get_max_ubo_nbytes()` currently assumes each metadata field consumes exactly `nbytes_per_ubo`, but `metadata_ubo_impl` actually uses `align_up(sizeof(field), min_nbytes_per_ubo_)`, which can be larger than `nbytes_per_ubo` when the alignment is smaller than the field size (e.g., ivec4 needs 16 bytes). To avoid future under-allocation bugs if this helper gets used, consider computing the max size the same way as `calculate_max_ubo_nbytes()` (summing the aligned sizes of each field), or removing this helper if it's dead code.
```cpp
size_t vTensor::get_max_ubo_nbytes(const size_t nbytes_per_ubo) const {
  // Ops like Linear and MatMul unconditionally request strides/numel UBOs on
  // all tensors regardless of storage type, so texture tensors need the same
  // metadata budget as buffer tensors plus logical_limits (5 fields total).
  size_t max_metadata_field_count = 5u;
  if (storage_type() == utils::kBuffer) {
    // sizes, strides, dim order, numel
    max_metadata_field_count = 4u;
  }
  return max_metadata_field_count * nbytes_per_ubo;
}
```
Fixes pytorch#17293

Texture-backed tensors only allocated UBO space for 2 metadata fields (sizes + logical_limits), but operators like Linear and MatMul unconditionally request `strides_ubo()` and `numel_ubo()` on all tensors regardless of storage type, causing:

"Uniform data allocation has exceeded Tensor uniform buffer size"

This increases the texture UBO budget to accommodate all 5 metadata fields (sizes, strides, dim_order, numel, logical_limits) in both `calculate_max_ubo_nbytes()` and `get_max_ubo_nbytes()`. Adds a regression test that verifies texture tensors can serve all metadata UBO requests without overflow.
Force-pushed from f475ab4 to 1865306.
Buffer storage produces incorrect (zero) results on Android Adreno GPUs. Texture storage is the default, faster on mobile GPUs, and better tested. The UBO overflow that originally motivated the buffer workaround is being fixed upstream (pytorch/executorch#17294).
SS-JIA left a comment
@abdelaziz-mahdy - thank you for the detailed investigation and fix! The change looks good to me.
I've imported the changes to Meta-internal codebase so we can validate the CI signals there. Will approve + merge once those pass.
Summary
Fixes #17293
Texture-backed tensors had a UBO (Uniform Buffer Object) metadata budget of only 2 fields (sizes + logical_limits), while operators like Linear and MatMul unconditionally request strides, numel, and dim_order UBOs on all tensors regardless of storage type. This caused an assertion failure at runtime:
This affected all Vulkan-delegated models on Android (tested on Pixel 10 Pro, Android 16), including simple models like MobileNet V3 Small.
Changes
`backends/vulkan/runtime/api/containers/Tensor.cpp`:
- `calculate_max_ubo_nbytes()` — Increased texture tensor UBO budget from 2 fields (sizes + logical_limits) to 5 fields (sizes + strides + dim_order + numel + logical_limits), matching the buffer budget plus logical_limits.
- `get_max_ubo_nbytes()` — Updated `max_metadata_field_count` for texture tensors from `2u` to `5u`.

`backends/vulkan/test/vulkan_compute_api_test.cpp`:
- Added `texture_tensor_ubo_metadata_budget_test`, a regression test that exercises all metadata UBO accessors on a texture-backed tensor.

Test Plan
- `texture_tensor_ubo_metadata_budget_test` passes
- `vulkan_compute_api_test` suite: 50/52 pass (same 2 pre-existing matmul precision failures on MoltenVK/Apple M2 Pro, unrelated to this change)

cc @SS-JIA @manuelcandales @digantdesai @cbilgin