
CUDA: use fastdiv in set-rows #16834

Merged
am17an merged 2 commits into ggml-org:master from am17an:cuda-fast-div-setrows on Oct 29, 2025

Conversation

@am17an
Contributor

@am17an am17an commented Oct 29, 2025

Helpful in TG (token generation).

Device 0: NVIDIA GeForce RTX 4090, compute capability 8.9, VMM: yes

| Model | Test | t/s master | t/s set-row | Speedup |
| --- | --- | --- | --- | --- |
| gpt-oss 20B MXFP4 MoE | tg128 | 186.10 | 187.25 | 1.01 |
| gpt-oss 20B MXFP4 MoE | tg256 | 184.63 | 187.19 | 1.01 |
| gpt-oss 20B MXFP4 MoE | tg512 | 183.95 | 184.45 | 1.00 |
| qwen3moe 30B.A3B Q4_0 | tg128 | 161.55 | 163.48 | 1.01 |
| qwen3moe 30B.A3B Q4_0 | tg256 | 162.39 | 163.33 | 1.01 |
| qwen3moe 30B.A3B Q4_0 | tg512 | 159.14 | 160.86 | 1.01 |

github-actions bot added labels Nvidia GPU (Issues specific to Nvidia GPUs) and ggml (changes relating to the ggml tensor library for machine learning) on Oct 29, 2025
Comment thread on ggml/src/ggml-cuda/set-rows.cu
@am17an am17an requested a review from slaren as a code owner October 29, 2025 10:56
@am17an am17an merged commit e41bcce into ggml-org:master Oct 29, 2025
68 of 69 checks passed
@am17an am17an deleted the cuda-fast-div-setrows branch October 29, 2025 13:11
Anico2 added a commit to Anico2/llama.cpp that referenced this pull request Jan 15, 2026
* CUDA: use fastdiv in set-rows

* add assert about value fitting in u32
blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026

Labels

ggml: changes relating to the ggml tensor library for machine learning
Nvidia GPU: Issues specific to Nvidia GPUs

Projects

None yet

Development


3 participants