opencl: add flattened Q4_K mv and general Q4_K mm by shaofeiqi · Pull Request #20773 · ggml-org/llama.cpp

shaofeiqi · 2026-03-19T20:14:40Z

This PR adds the flattened Q4_K mv kernel and the general Q4_K mm kernel, which provide some improvements and can help Q4_K_M models.
More specialized Q4_K GEMM and GEMV kernels using transposed layouts will be added in a follow-up PR.
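
For readers unfamiliar with the term, the following is a minimal sketch of what "flattening" a Q4_K tensor could look like on the host side. It is not code from this PR. It assumes that "flattened" means what it means for other formats in the OpenCL backend: the per-block fields are split out of the array-of-structs layout into separate contiguous arrays. The `block_q4_K` struct here mirrors ggml's layout as commonly documented (256 weights per super-block, 12 bytes of packed scales, 128 bytes of 4-bit quants), with the fp16 values held as raw `uint16_t` bits. The `q4_K_flat` struct and `flatten_q4_K` function are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QK_K         256  /* weights per Q4_K super-block */
#define K_SCALE_SIZE 12   /* bytes of packed 6-bit scales/mins */

/* Array-of-structs layout (assumed to mirror ggml's block_q4_K). */
typedef struct {
    uint16_t d;                    /* fp16 bits: super-block scale */
    uint16_t dmin;                 /* fp16 bits: super-block min scale */
    uint8_t  scales[K_SCALE_SIZE]; /* packed sub-block scales and mins */
    uint8_t  qs[QK_K / 2];         /* 4-bit quants, two per byte */
} block_q4_K;

/* Hypothetical flattened layout: one contiguous array per field. */
typedef struct {
    uint8_t  *qs;     /* n * QK_K/2 bytes */
    uint8_t  *scales; /* n * K_SCALE_SIZE bytes */
    uint16_t *d;      /* n entries */
    uint16_t *dmin;   /* n entries */
} q4_K_flat;

/* Copy n blocks from the struct layout into the separate arrays. */
static void flatten_q4_K(const block_q4_K *blocks, size_t n, q4_K_flat *out) {
    assert(blocks != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        memcpy(out->qs + i * (QK_K / 2), blocks[i].qs, QK_K / 2);
        memcpy(out->scales + i * K_SCALE_SIZE, blocks[i].scales, K_SCALE_SIZE);
        out->d[i]    = blocks[i].d;
        out->dmin[i] = blocks[i].dmin;
    }
}
```

Under this assumption, a kernel reading the quants of consecutive blocks touches one contiguous buffer instead of striding over 144-byte structs, which is the usual motivation for such a layout. Whether the PR's kernels use exactly these four buffers is not stated in the description.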

shaofeiqi requested a review from a team as a code owner March 19, 2026 20:14

shaofeiqi force-pushed the sq/q4_k-flat branch from 58d13e9 to caf4a51 March 19, 2026 21:01

opencl: add flattened Q4_K mv and general Q4_K mm

7794488

shaofeiqi force-pushed the sq/q4_k-flat branch from caf4a51 to 7794488 March 19, 2026 21:03

github-actions bot added the labels ggml (changes relating to the ggml tensor library for machine learning) and OpenCL (issues specific to the OpenCL backend) Mar 19, 2026

lhez approved these changes Mar 23, 2026

lhez merged commit 84ffd0c into ggml-org:master Mar 23, 2026
48 checks passed

Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026

opencl: add flattened Q4_K mv and general Q4_K mm (ggml-org#20773)

38eb1df

rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026

opencl: add flattened Q4_K mv and general Q4_K mm (ggml-org#20773)

e197d47

lhez merged 1 commit into ggml-org:master from qualcomm:sq/q4_k-flat

2 participants