Skip to content

opencl: add q4_K gemm and gemv kernels for Adreno#20919

Merged
lhez merged 7 commits intoggml-org:masterfrom
qualcomm:sq/q4_k-trans
Mar 30, 2026
Merged

opencl: add q4_K gemm and gemv kernels for Adreno#20919
lhez merged 7 commits intoggml-org:masterfrom
qualcomm:sq/q4_k-trans

Conversation

@shaofeiqi
Copy link
Copy Markdown
Contributor

Overview

Add Q4_K GEMM and GEMV kernels to the Adreno backend to improve performance for Q4_K quantized models.

Requirements

@shaofeiqi shaofeiqi requested a review from a team as a code owner March 23, 2026 21:20
@github-actions github-actions Bot added ggml changes relating to the ggml tensor library for machine learning OpenCL Issues specific to the OpenCL backend labels Mar 23, 2026
@lhez lhez requested a review from max-krasnyansky March 30, 2026 05:00
Copy link
Copy Markdown
Member

@max-krasnyansky max-krasnyansky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Good to see Q4_K working.

@lhez lhez merged commit 08f2145 into ggml-org:master Mar 30, 2026
45 checks passed
slartibardfast pushed a commit to slartibardfast/llama.cpp that referenced this pull request Apr 12, 2026
* opencl: add q4_K gemm and gemv kernels for Adreno

* opencl: fix whitespace

* opencl: add workarounds for compiler bugs on older devices

* opencl: handle fp16 denorm on X Elite

* opencl: fix kernel build error

* opencl: fix whitespace

* opencl: make q4_K cvt kernels signature consistent

---------

Co-authored-by: Li He <lih@qti.qualcomm.com>
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
* opencl: add q4_K gemm and gemv kernels for Adreno

* opencl: fix whitespace

* opencl: add workarounds for compiler bugs on older devices

* opencl: handle fp16 denorm on X Elite

* opencl: fix kernel build error

* opencl: fix whitespace

* opencl: make q4_K cvt kernels signature consistent

---------

Co-authored-by: Li He <lih@qti.qualcomm.com>
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
* opencl: add q4_K gemm and gemv kernels for Adreno

* opencl: fix whitespace

* opencl: add workarounds for compiler bugs on older devices

* opencl: handle fp16 denorm on X Elite

* opencl: fix kernel build error

* opencl: fix whitespace

* opencl: make q4_K cvt kernels signature consistent

---------

Co-authored-by: Li He <lih@qti.qualcomm.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ggml changes relating to the ggml tensor library for machine learning OpenCL Issues specific to the OpenCL backend

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants