
opencl: add flattened Q4_K mv and general Q4_K mm #20773

Merged
lhez merged 1 commit into ggml-org:master from qualcomm:sq/q4_k-flat on Mar 23, 2026


@shaofeiqi
Contributor

This PR adds a flattened Q4_K mv (matrix-vector) kernel and a general Q4_K mm (matrix-matrix) kernel, which provide some performance improvements and can help Q4_K_M models.
More specialized Q4_K GEMM and GEMV kernels using transposed layouts will be added in a follow-up PR.
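
For readers unfamiliar with the term, here is a minimal sketch of what "flattening" a Q4_K tensor could look like on the host side. The `block_q4_K` field layout mirrors ggml's definition (256 weights per super-block, 12 bytes of packed scales and mins, 128 bytes of 4-bit quants), with `uint16_t` standing in for the fp16 type. The `q4_K_flat` struct and `flatten_q4_K` function are hypothetical names used only for illustration, and the split into four separate arrays is an assumption about the approach, not the layout this PR actually implements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define QK_K 256          /* weights per Q4_K super-block */
#define K_SCALE_SIZE 12   /* bytes of packed 6-bit scales and mins */

/* Interleaved (array-of-structs) Q4_K block, following ggml's field order.
 * uint16_t stands in for the fp16 storage type. */
typedef struct {
    uint16_t d;                    /* super-block scale for the scales */
    uint16_t dmin;                 /* super-block scale for the mins */
    uint8_t  scales[K_SCALE_SIZE]; /* packed per-sub-block scales and mins */
    uint8_t  qs[QK_K / 2];         /* 4-bit quants, two per byte */
} block_q4_K;

/* Hypothetical flattened (struct-of-arrays) view: one contiguous array
 * per field, so a kernel can read each component with a regular stride. */
typedef struct {
    uint8_t  *qs;     /* n_blocks * QK_K / 2 bytes */
    uint8_t  *scales; /* n_blocks * K_SCALE_SIZE bytes */
    uint16_t *d;      /* n_blocks entries */
    uint16_t *dmin;   /* n_blocks entries */
} q4_K_flat;

/* Copy each field of every block into its own array (assumed scheme). */
static void flatten_q4_K(const block_q4_K *src, size_t n_blocks, q4_K_flat *dst) {
    assert(src != NULL && dst != NULL);
    for (size_t i = 0; i < n_blocks; ++i) {
        memcpy(dst->qs + i * (QK_K / 2), src[i].qs, QK_K / 2);
        memcpy(dst->scales + i * K_SCALE_SIZE, src[i].scales, K_SCALE_SIZE);
        dst->d[i]    = src[i].d;
        dst->dmin[i] = src[i].dmin;
    }
}
```

The likely motivation for such a layout is that work-items reading quants no longer have to skip over the interleaved scale bytes, though whether that is the main benefit here is not stated in the PR.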

@shaofeiqi requested a review from a team as a code owner on March 19, 2026, 20:14
@github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and OpenCL (issues specific to the OpenCL backend) labels on Mar 19, 2026
@lhez merged commit 84ffd0c into ggml-org:master on Mar 23, 2026
48 checks passed
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026


2 participants