opencl: fix rms_norm_mul by lhez · Pull Request #17250 · ggml-org/llama.cpp

lhez · 2025-11-13T19:28:35Z

The rms_norm_mul kernel produces incorrect results when ne00 = 768. This PR changes how the kernel performs the reduction that computes the sum, which appears to fix the issue.
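
For illustration, the kind of two-level reduction named in the commit titles can be emulated on the host in plain C. This is a minimal sketch, not the kernel from this PR: the function names, the workgroup and subgroup sizes, and the mapping to an OpenCL built-in such as `sub_group_reduce_add` are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUBGROUPS 64 /* hypothetical upper bound for this sketch */

/*
 * Host-side emulation of a two-level sum-of-squares reduction over one row
 * of ne00 floats. Each emulated work-item accumulates a strided slice of the
 * row, each subgroup sums the accumulators of its work-items (the step a
 * subgroup reduce would perform on the device), and the per-subgroup
 * partial sums are then added together.
 */
static float sum_squares_two_level(const float *x, size_t ne00,
                                   size_t wg_size, size_t sg_size) {
    assert(sg_size > 0 && wg_size % sg_size == 0);
    size_t n_subgroups = wg_size / sg_size;
    assert(n_subgroups <= MAX_SUBGROUPS);

    float partial[MAX_SUBGROUPS] = {0};

    for (size_t lid = 0; lid < wg_size; ++lid) {
        /* Per-work-item accumulation: elements lid, lid + wg_size, ... */
        float acc = 0.0f;
        for (size_t i = lid; i < ne00; i += wg_size) {
            acc += x[i] * x[i];
        }
        /* Level 1: reduce within the subgroup this work-item belongs to. */
        partial[lid / sg_size] += acc;
    }

    /* Level 2: combine the per-subgroup partial sums. */
    float sum = 0.0f;
    for (size_t s = 0; s < n_subgroups; ++s) {
        sum += partial[s];
    }
    return sum;
}

/* Mean of squares, the quantity RMS normalization divides by (before eps). */
static float mean_squares(const float *x, size_t ne00,
                          size_t wg_size, size_t sg_size) {
    return sum_squares_two_level(x, ne00, wg_size, sg_size) / (float) ne00;
}
```

Because every element is visited through the strided loop, this scheme does not depend on ne00 being a power of two or a multiple of the workgroup size. For a row of 768 ones with an assumed workgroup of 64 work-items and subgroups of 32, each work-item accumulates 12 elements, each subgroup contributes 384, and the total is 768.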

max-krasnyansky

@lhez would be good to remove the commented-out code in the next round of updates

* opencl: use subgroup reduce for reduction in rms_norm_mul
* opencl: add comment about workgroup size

lhez added 2 commits November 12, 2025 22:05

opencl: use subgroup reduce for reduction in rms_norm_mul

5bca5cc

opencl: add comment about workgroup size

696343e

DajanaV mentioned this pull request Nov 13, 2025

UPSTREAM PR #17250: opencl: fix rms_norm_mul auroralabs-loci/llama.cpp#199

Open

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and OpenCL (issues specific to the OpenCL backend) labels Nov 13, 2025

lhez marked this pull request as ready for review November 14, 2025 19:32

lhez requested a review from max-krasnyansky as a code owner November 14, 2025 19:32

max-krasnyansky approved these changes Nov 16, 2025

max-krasnyansky merged commit 52e5d42 into ggml-org:master Nov 16, 2025
72 checks passed

Anico2 added a commit to Anico2/llama.cpp that referenced this pull request Jan 15, 2026

opencl: fix rms_norm_mul (ggml-org#17250)

31d84cb

blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026

opencl: fix rms_norm_mul (#17250)

73726d5

Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026

opencl: fix rms_norm_mul (ggml-org#17250)

b211658

max-krasnyansky merged 2 commits into ggml-org:master from qualcomm:lh/rms-norm-mul-fix
