Q1_0: port CUDA kernels #21584

Closed
pwilkin wants to merge 1 commit into ggml-org:master from pwilkin:q1-cuda-kernels

@pwilkin
Member

@pwilkin pwilkin commented Apr 7, 2026

Overview

CUDA kernels for Q1_0 ported from the original fork.

Additional information

The CUDA kernels are taken from the original fork's Q1_0_g128 implementation.

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: Yes, used Kimi 2.5 to compare the branch and port the kernels

@pwilkin pwilkin requested a review from a team as a code owner April 7, 2026 20:37
@github-actions github-actions bot added the Nvidia GPU (issues specific to Nvidia GPUs) and ggml (changes relating to the ggml tensor library for machine learning) labels Apr 8, 2026
@am17an
Contributor

am17an commented Apr 8, 2026

I think the original authors had already planned CUDA kernels. If so, we should let them add them.

@pwilkin
Member Author

pwilkin commented Apr 8, 2026

@khosravipasha can you please take a look and comment?

@khosravipasha
Contributor

Oh thanks, I did not see this and just submitted the CUDA PR: #21629.
I was waiting for the Metal backend to get merged.
I might need some help tuning the kernels since I'm not a GPU expert myself, but so far the speedups have been satisfactory.

@pwilkin
Member Author

pwilkin commented Apr 8, 2026

Ah, no worries :)

Obsoleted by #21629

Labels

  • ggml: changes relating to the ggml tensor library for machine learning
  • Nvidia GPU: issues specific to Nvidia GPUs

3 participants