
vulkan: add get/set tensor 2d functions #22514

Merged
0cc4m merged 3 commits into master from 0cc4m/vulkan-tensor-get-set-2d on Apr 30, 2026

Conversation

Contributor

@0cc4m 0cc4m commented Apr 29, 2026

Overview

Implement, in the Vulkan backend, the 2d tensor copy functions that were added to the backend interface for TP (tensor parallelism) support. This shouldn't make a performance difference, but it was not much work, since the 2d copy paths basically already existed.

I also noticed that the interface comments for the functions were universally wrong, so I corrected them, too. Sorry about the pings this causes.

Requirements

@0cc4m 0cc4m requested review from a team, JohannesGaessler and ggerganov as code owners April 29, 2026 11:07
@0cc4m 0cc4m removed request for a team and ggerganov April 29, 2026 11:07
Comment thread ggml/src/ggml-metal/ggml-metal.cpp Outdated
Co-authored-by: Sigbjørn Skjæret <sigbjorn.skjaeret@scala.com>
Member

@taronaeo taronaeo left a comment


Ack for IBM zDNN :)

} else {
slices.resize(height);
for (size_t i = 0; i < height; i++) {
slices[i].srcOffset = i * width;
Contributor


I want to make sure I understand: here the src data is tightly packed, and the spitch is applied when we do the memcpy?

Contributor Author


Yes, the deferred_memcpy packs the source data into a contiguous shape in the staging buffer, then the vk::BufferCopy slices copy from the packed shape in the staging buffer into the final dpitch shape. If dpitch == width this is just one copy, otherwise it copies each slice. Does that help?
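The two-stage flow described above can be sketched in plain C++. This is a hypothetical stand-in, not the backend code: memcpy models both the deferred_memcpy packing and the vk::BufferCopy slices, and the function names, spitch, and dpitch parameters are illustrative.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Stage 1 (models deferred_memcpy): pack a strided source into a
// contiguous staging buffer, one tightly packed row per slice.
void pack_to_staging(const uint8_t *src, size_t spitch, size_t width,
                     size_t height, std::vector<uint8_t> &staging) {
    staging.resize(width * height);
    for (size_t i = 0; i < height; i++) {
        std::memcpy(staging.data() + i * width, src + i * spitch, width);
    }
}

// Stage 2 (models the vk::BufferCopy slices): spread the packed rows
// out to the destination pitch.
void copy_slices(const std::vector<uint8_t> &staging, uint8_t *dst,
                 size_t dpitch, size_t width, size_t height) {
    if (dpitch == width) {
        // destination is contiguous too: a single copy suffices
        std::memcpy(dst, staging.data(), width * height);
    } else {
        // one slice per row: srcOffset = i * width, dstOffset = i * dpitch
        for (size_t i = 0; i < height; i++) {
            std::memcpy(dst + i * dpitch, staging.data() + i * width, width);
        }
    }
}
```

With dpitch == width the second stage collapses to one copy, matching the single-copy fast path mentioned above.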

ggml_vk_buffer_write(buf, vk_tensor_offset(tensor) + tensor->view_offs + offset, data, size);
}

static void ggml_backend_vk_buffer_set_tensor_2d(ggml_backend_buffer_t buffer, ggml_tensor * tensor, const void * data, size_t offset,
Contributor


Are tensors passed into these commands always contiguous?

Contributor Author


I don't think so, but it is up to the caller to make sure the data ends up where the tensor expects it. At least that's how I understood it. Is that correct @JohannesGaessler?

Contributor


The ggml backend API for getting or setting tensor data in one dimension does not consider the tensor data layout: the user simply specifies an offset and a size in bytes, and the data is copied into that memory region. The 2D functions I added follow the same pattern: they simply do a 2D copy using the tensor's memory, and it is the responsibility of the caller to correctly account for the tensor's memory layout. So no, these functions cannot expect to always receive contiguous tensors.

Contributor


OK, I'm reading this as: the tensors may not be contiguous, but that's ok and we're supposed to ignore the layout for this op. Then the change looks fine to me.

static void ggml_backend_vk_set_tensor_async(ggml_backend_t backend, ggml_tensor * tensor, const void * data, size_t offset, size_t size) {
VK_LOG_DEBUG("ggml_backend_vk_set_tensor_async(" << size << ")");
static void ggml_backend_vk_set_tensor_2d_async(ggml_backend_t backend, ggml_tensor * tensor, const void * data, size_t offset,
size_t size, size_t n_copies, size_t stride_tensor, size_t stride_data) {
Contributor


Should 'size' here be renamed to 'width' to be consistent with the others?

Contributor Author


I just used the same nomenclature as the CUDA backend. Internally we use width and height, while the interface uses size and n_copies.
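As a hedged illustration of that naming split (the struct and helper below are invented for this sketch; only the parameter names come from the interface under discussion):

```cpp
#include <cstddef>

// Internal naming, as used in the 2D copy paths.
struct copy2d {
    size_t width;   // interface: size      (bytes per row)
    size_t height;  // interface: n_copies  (number of rows)
    size_t spitch;  // interface: stride_data   (source row stride)
    size_t dpitch;  // interface: stride_tensor (destination row stride)
};

// Map the CUDA-style interface names onto the internal ones.
copy2d make_copy2d(size_t size, size_t n_copies,
                   size_t stride_tensor, size_t stride_data) {
    return { size, n_copies, stride_data, stride_tensor };
}
```

The mapping is purely a renaming: no values change, only the vocabulary differs between the interface and the internals.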

@github-actions github-actions Bot added labels: Nvidia GPU, Vulkan on Apr 29, 2026
@github-actions github-actions Bot added labels: ggml, SYCL, Apple Metal, Ascend NPU, OpenCL, IBM zDNN, Hexagon, AMD ZenDNN, WebGPU on Apr 29, 2026
@0cc4m 0cc4m merged commit 660b1b4 into master Apr 30, 2026
67 of 68 checks passed
@0cc4m 0cc4m deleted the 0cc4m/vulkan-tensor-get-set-2d branch April 30, 2026 15:37
tekintian added a commit to tekintian/llama.cpp that referenced this pull request May 1, 2026
* 'master' of github.com:tekintian/llama.cpp: (659 commits)
  ggml-webgpu: Improve performance of mat-vec and mat-mat for MUL_MAT_ID (ggml-org#22464)
  Update llama-mmap to use ftello/fseeko (ggml-org#22497)
  common : check for null getpwuid in hf-cache (ggml-org#22550)
  vulkan: add get/set tensor 2d functions (ggml-org#22514)
  spec: fix argument typo (ggml-org#22552)
  ci : bump ty to 0.0.33 (ggml-org#22535)
  vendor : update cpp-httplib to 0.43.2 (ggml-org#22548)
  CUDA: fix tile FA kernel on Pascal (ggml-org#22541)
  scripts : add wc2wt.sh - create worktree from current HEAD (ggml-org#22513)
  add fast matmul iquants (ggml-org#22504)
  spec : fix draft model checkpoints (ggml-org#22521)
  spec : fix vocab compat checks in spec example (ggml-org#22426)
  common : do not pass prompt tokens to reasoning budget sampler (ggml-org#22488)
  hexagon: make vmem and buffer-size configurable (ggml-org#22487)
  CUDA: fuse SSM_CONV + ADD(bias) + SILU (ggml-org#22478)
  spec : disacard last drafted token with low prob (ggml-org#22506)
  sync : ggml
  ggml : bump version to 0.10.1 (ggml/1469)
  webui: fix slow mic stop and WAV encode (ggml-org#22480)
  ggml-cpu : disable tiled matmul on AIX to fix page boundary segfault (ggml-org#22293)
  ...

# Conflicts:
#	.gitignore
