cuda : add FILL op support by JayZenith · Pull Request #17851 · ggml-org/llama.cpp

JayZenith · 2025-12-08T00:21:16Z

Add CUDA backend support for GGML_OP_FILL, which was previously missing (the CPU and Vulkan backends already had it). This operation is used by the Qwen3-Next model (discussion in #16623).

Added fill.cu and fill.cuh with a simple CUDA kernel.
Added dispatch case in ggml_cuda_compute_forward()
Declared support in ggml_backend_cuda_device_supports_op()
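
For readers unfamiliar with this kind of kernel, the following is a minimal host-side C++ sketch of the usual "one thread per element" pattern, emulated with plain loops so it runs without a GPU. It is not the code from fill.cu: the names `kBlockSize`, `fill_thread`, and `fill_launch` are hypothetical, and the block size of 256 is an assumption chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block size; the PR's real launch configuration is not shown here.
constexpr std::size_t kBlockSize = 256;

// Work that a single GPU thread would do: derive a global element index from
// its block and thread indices, skip out-of-range indices, write one element.
inline void fill_thread(float * dst, float value, std::size_t n,
                        std::size_t block_idx, std::size_t thread_idx) {
    assert(thread_idx < kBlockSize);
    const std::size_t i = block_idx * kBlockSize + thread_idx;
    if (i >= n) {
        return; // threads in the last, partially used block do nothing
    }
    dst[i] = value;
}

// Host-side stand-in for a kernel launch: enough blocks to cover all n
// elements (rounded up), each block running kBlockSize "threads".
inline void fill_launch(std::vector<float> & dst, float value) {
    const std::size_t n = dst.size();
    const std::size_t num_blocks = (n + kBlockSize - 1) / kBlockSize;
    for (std::size_t b = 0; b < num_blocks; ++b) {
        for (std::size_t t = 0; t < kBlockSize; ++t) {
            fill_thread(dst.data(), value, n, b, t);
        }
    }
}
```

With a contiguous tensor, the four dimensions collapse to a flat element count, so a shape such as ne=[10,10,4,3] becomes a buffer of 1200 floats passed to `fill_launch`. The bounds check matters because 1200 is not a multiple of the assumed block size.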

Tested with test-backend-ops -o FILL on Tesla T4:
FILL(type=f32,ne=[10,10,4,3],c=0.000000): OK
FILL(type=f32,ne=[303,207,11,3],c=2.000000): OK
FILL(type=f32,ne=[800,600,4,4],c=-152.000000): OK
FILL(type=f32,ne=[2048,512,2,2],c=3.500000): OK
4/4 tests passed

am17an

I'm not sure why this shouldn't just be a cudaMemsetAsync

JayZenith · 2025-12-08T03:05:39Z

@am17an cudaMemsetAsync repeats a single byte value and doesn't interpret floats/doubles. It works for 0.0f but fails for values like 1.0f (0x3F800000), since it would write 0x3F to every byte. This kernel writes the full float/double per element, so it works for any value. Essentially, byte-wise vs. element-wise writing.
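
The byte-wise vs. element-wise distinction can be reproduced on the host with `std::memset`, which has the same "repeat one byte" semantics described above. This is a small illustrative sketch, not code from the PR; the helper names `fill_bytewise`, `fill_elementwise`, `float_bits`, and `check_fill_semantics` are hypothetical, and it assumes a 32-bit IEEE 754 `float`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes 32-bit float");

// Byte-wise fill: every byte of the buffer receives the same 8-bit value,
// which is what a memset-style call does.
inline std::vector<float> fill_bytewise(std::size_t n, unsigned char byte) {
    std::vector<float> buf(n);
    if (n > 0) {
        std::memset(buf.data(), byte, n * sizeof(float));
    }
    return buf;
}

// Element-wise fill: every element receives the full 4-byte float value.
inline std::vector<float> fill_elementwise(std::size_t n, float value) {
    return std::vector<float>(n, value);
}

// Raw bit pattern of a float, copied out without aliasing issues.
inline std::uint32_t float_bits(float x) {
    std::uint32_t bits = 0;
    std::memcpy(&bits, &x, sizeof(bits));
    return bits;
}

// Checks the claims from the comment above; callable from main() or a test.
inline void check_fill_semantics() {
    // An all-zero byte pattern happens to be exactly 0.0f.
    assert(float_bits(fill_bytewise(4, 0x00)[0]) == float_bits(0.0f));
    // Repeating 0x3F gives 0x3F3F3F3F, not the 0x3F800000 pattern of 1.0f.
    assert(float_bits(fill_bytewise(4, 0x3F)[0]) == 0x3F3F3F3Fu);
    // Writing whole elements reproduces 1.0f exactly.
    assert(float_bits(fill_elementwise(4, 1.0f)[0]) == 0x3F800000u);
}
```

Zero is the special case where every byte of the float representation is identical, which is why a memset-based fill only matches the element-wise result for 0.0f among the constants tested in this PR.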

am17an · 2025-12-08T03:39:01Z

You also need to enable this kernel via ggml_backend_cuda_device_supports_op; right now I'm not sure how it's passing test-backend-ops for you.

* cuda : add FILL op support * cuda : add missing FILL op files

cuda : add FILL op support

d91f4f9

github-actions bot added the labels Nvidia GPU (issues specific to Nvidia GPUs) and ggml (changes relating to the ggml tensor library for machine learning) on Dec 8, 2025

loci-dev mentioned this pull request Dec 8, 2025

UPSTREAM PR #17851: cuda : add FILL op support auroralabs-loci/llama.cpp#481

Open

am17an reviewed Dec 8, 2025


Comment thread ggml/src/ggml-cuda/fill.cu Outdated

am17an reviewed Dec 8, 2025


Comment thread ggml/src/ggml-cuda/fill.cu Outdated

JayZenith force-pushed the cuda-fill-op branch 3 times, most recently from d22704c to 43f3b5f Compare December 8, 2025 04:09

am17an approved these changes Dec 8, 2025


JayZenith force-pushed the cuda-fill-op branch from 43f3b5f to ae71397 Compare December 8, 2025 04:57

cuda : add missing FILL op files

179ddb5

JayZenith force-pushed the cuda-fill-op branch from ae71397 to 179ddb5 Compare December 8, 2025 08:38

am17an merged commit 51e0c2d into ggml-org:master Dec 8, 2025
78 checks passed

gabe-l-hart mentioned this pull request Dec 10, 2025

feat: llama.cpp bump (17f7f4) for SSM performance improvements ollama/ollama#13408

Merged

0Marble pushed a commit to 0Marble/llama.cpp that referenced this pull request Dec 18, 2025

cuda : add FILL op support (ggml-org#17851)

4db951e

* cuda : add FILL op support * cuda : add missing FILL op files

Anico2 added a commit to Anico2/llama.cpp that referenced this pull request Jan 15, 2026

cuda : add FILL op support (ggml-org#17851)

bcfbb39

* cuda : add FILL op support * cuda : add missing FILL op files

blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026

cuda : add FILL op support (#17851)

5a654f6

* cuda : add FILL op support * cuda : add missing FILL op files

Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026

cuda : add FILL op support (ggml-org#17851)

c505c2f

* cuda : add FILL op support * cuda : add missing FILL op files

am17an merged 2 commits into ggml-org:master from JayZenith:cuda-fill-op
