cuda, sycl : fix batched gemm when ne02 == 1 && ne03 > 1 by ggerganov · Pull Request #15038 · ggml-org/llama.cpp

ggerganov · 2025-08-02T13:35:41Z

Fix the strides for batched GEMM to account for the case ne02 == 1
Fix the src1 contiguous condition: src1 is always contiguous once we convert it

ggml-ci
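To illustrate the stride issue, here is a minimal sketch. It is not the patch itself. It assumes the fix amounts to taking the matrix stride from dim 3 whenever dim 2 has size 1, and `batch_stride_elems` and its parameters are hypothetical names that do not exist in ggml.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of ggml: distance in elements between
// consecutive matrices when dims 2 and 3 are flattened into a single batch.
//   ne2       : number of matrices along dim 2
//   nb2, nb3  : byte strides of dims 2 and 3
//   elem_size : size of one element in bytes
inline int64_t batch_stride_elems(int64_t ne2, size_t nb2, size_t nb3, size_t elem_size) {
    assert(elem_size > 0);
    // With ne2 == 1 every step through the flattened batch advances dim 3,
    // so nb2 says nothing about the distance between two matrices.
    const size_t nb = (ne2 == 1) ? nb3 : nb2;
    return static_cast<int64_t>(nb / elem_size);
}
```

As an example, take a contiguous F16 tensor with shape [64, 1, 32, 4] and permute it with [0, 2, 1, 3]. The result has shape [64, 32, 1, 4] and byte strides [2, 128, 128, 4096]. Here nb2 / elem_size is 64, but the four 64x32 matrices are 2048 elements apart, which is what `batch_stride_elems(1, 128, 4096, 2)` returns.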

JohannesGaessler · 2025-08-02T13:45:48Z

Good catch with src1 being potentially contiguous after a type conversion.
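The contiguity point can be sketched in the same way. This assumes the type conversion writes src1 into a densely packed temporary buffer, so the original strides of src1 no longer apply afterwards. `contiguous_strides` and `effective_src1_strides` are hypothetical helpers for illustration, not ggml functions.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using dims4 = std::array<int64_t, 4>;

// Hypothetical helper: element strides of a densely packed 4D tensor of shape ne.
inline dims4 contiguous_strides(const dims4 & ne) {
    dims4 s{};
    s[0] = 1;
    for (size_t i = 1; i < 4; ++i) {
        s[i] = s[i - 1] * ne[i - 1];
    }
    return s;
}

// Hypothetical helper: strides to hand to the batched GEMM for src1.
// If src1 went through a type conversion (assumed to produce a packed copy),
// use contiguous strides; otherwise keep the tensor's original strides.
inline dims4 effective_src1_strides(bool converted, const dims4 & ne, const dims4 & orig_strides) {
    return converted ? contiguous_strides(ne) : orig_strides;
}
```

Under this assumption, a contiguity check that only looks at the original src1 tensor would be too strict whenever a conversion takes place.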

ggerganov · 2025-08-02T13:53:16Z

The SYCL tests still fail, I think because the GGML_SYCL_DNNL path of this function also needs to be updated. @qnixsynapse I will leave this to your team and merge this for now.

Waiting for the CUDA CI to pass and will merge.

qnixsynapse · 2025-08-02T16:10:54Z

@Rbiessy @Alcpz Since you guys were maintaining MUL_MAT kernels, tagging you both for visibility.

The dpct path in the batched kernel also doesn't seem to properly support non-contiguous inputs in my testing, so I am not doing anything at this time.

cuda, sycl : fix batched gemm when ne02 == 1 && ne03 > 1

ce91179

ggml-ci

github-actions bot added the labels Nvidia GPU (issues specific to Nvidia GPUs), ggml (changes relating to the ggml tensor library for machine learning), and SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language) on Aug 2, 2025

ggerganov mentioned this pull request Aug 2, 2025

CUDA: fix strided GEMM for [0,2,1,3] per && ne2==1 #15037

Closed

cont : fix cont types

18388c7

ggml-ci

JohannesGaessler approved these changes Aug 2, 2025

View reviewed changes

cont : adopt variable names and comment from the other branch

275a591

ggerganov merged commit 15e92fd into master Aug 2, 2025
45 of 47 checks passed

ggerganov deleted the gg/cuda-sycl-mm-batched-fix branch August 2, 2025 14:13

Rbiessy mentioned this pull request Aug 5, 2025

sycl: fix mul_mat selection #15092

Merged

Nexesenex added a commit to Nexesenex/croco.cpp that referenced this pull request Aug 7, 2025

Revert "cuda, sycl : fix batched gemm when ne02 == 1 && ne03 > 1 (ggm…

32098df

…l-org#15038)"

ggerganov merged 3 commits into master from gg/cuda-sycl-mm-batched-fix

ggerganov commented Aug 2, 2025

Uh oh!

JohannesGaessler commented Aug 2, 2025

Uh oh!

ggerganov commented Aug 2, 2025

Uh oh!

Uh oh!

qnixsynapse commented Aug 2, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

ggerganov commented Aug 2, 2025

Uh oh!

JohannesGaessler commented Aug 2, 2025

Uh oh!

ggerganov commented Aug 2, 2025

Uh oh!

Uh oh!

qnixsynapse commented Aug 2, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants