ggml-cpu: add RVV implementation for q1_0 x q8_0 vec dot #22500

Open
velonica0 wants to merge 1 commit into ggml-org:master from velonica0:rvv_q1_0_dot

velonica0 commented Apr 29, 2026

Overview

This PR adds an RVV-specific implementation of ggml_vec_dot_q1_0_q8_0 for RISC-V.

This PR is a follow-up to #21273 and #21636, and my changes here are based on the work from those PRs. Thanks to the authors for laying the groundwork.
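
For readers unfamiliar with this kind of kernel, here is a minimal scalar sketch of what a 1-bit x 8-bit block dot product computes. It is not the code from this PR. The struct names `blk_q1` and `blk_q8`, the block size `QK`, the `float` scales, and the bit convention (set bit means `+d`, clear bit means `-d`) are all assumptions made for illustration; the real ggml block layouts may differ.

```c
#include <stddef.h>
#include <stdint.h>

#define QK 32 /* hypothetical block size, not taken from ggml */

/* Hypothetical 1-bit block: one scale plus one sign bit per weight. */
typedef struct {
    float   d;          /* block scale (assumed float for simplicity) */
    uint8_t qs[QK / 8]; /* assumed: bit set -> +d, bit clear -> -d */
} blk_q1;

/* Hypothetical 8-bit block: one scale plus QK signed quants. */
typedef struct {
    float  d;
    int8_t qs[QK];
} blk_q8;

/* Scalar reference: sum over blocks of d_x * d_y * sum_i(sign_i * q_i).
 * n is the number of elements and is assumed to be a multiple of QK. */
static float vec_dot_q1_q8_ref(size_t n, const blk_q1 *x, const blk_q8 *y) {
    float sum = 0.0f;
    for (size_t b = 0; b < n / QK; ++b) {
        int32_t acc = 0;
        for (int i = 0; i < QK; ++i) {
            /* Extract the i-th sign bit and add or subtract the quant. */
            int bit = (x[b].qs[i / 8] >> (i % 8)) & 1;
            acc += bit ? y[b].qs[i] : -y[b].qs[i];
        }
        /* Apply both block scales once per block. */
        sum += x[b].d * y[b].d * (float) acc;
    }
    return sum;
}
```

A vectorized kernel would presumably replace the inner loop with vector loads and a masked or sign-based integer reduction, keeping the per-block scale multiply in floating point.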

Benchmark results for Bonsai 1.7B (due to runtime constraints on the K1, the benchmark was limited to 64 tokens):

| threads | test | generic tok/s | RVV tok/s | speedup |
| --- | --- | --- | --- | --- |
| 4 | pp64 | 0.561362 ± 0.000273 | 2.462494 ± 0.007606 | 4.39x |
| 4 | tg64 | 0.442725 ± 0.000522 | 1.872843 ± 0.005062 | 4.23x |
| 8 | pp64 | 0.880189 ± 0.002553 | 3.418164 ± 0.015693 | 3.88x |
| 8 | tg64 | 0.367250 ± 0.005271 | 0.796739 ± 0.005334 | 2.17x |
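
The speedup column appears to be the ratio of the two mean throughputs, ignoring the error bars. A small sketch of that arithmetic follows; the helper name `speedup` is hypothetical.

```c
/* Speedup as the ratio of mean throughputs (tokens per second).
 * The +- uncertainties from the table are not propagated here. */
static double speedup(double generic_tps, double rvv_tps) {
    return rvv_tps / generic_tps;
}
```

For example, `speedup(0.561362, 2.462494)` gives roughly 4.39 for the 4-thread pp64 row. The figures also show that tg64 is slower at 8 threads than at 4 threads for both builds.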

Perplexity (KL divergence) summary table:

| Metric | RVV (K1) |
| --- | --- |
| Same top p | 99.294 ± 0.235 % |
| Mean KLD | 0.000222 ± 0.000009 |
| Maximum KLD | 0.004316 |
| 99.9% KLD | 0.004056 |
| 99.0% KLD | 0.001395 |
| Median KLD | 0.000137 |
| 1.0% KLD | -0.000012 |
| Minimum KLD | -0.000051 |
| Mean Delta p | -0.007 ± 0.010 % |
| Maximum Delta p | 2.477 % |
| 99.9% Delta p | 2.177 % |
| 99.0% Delta p | 1.180 % |
| 95.0% Delta p | 0.460 % |
| Median Delta p | -0.000 % |
| 5.0% Delta p | -0.546 % |
| 1.0% Delta p | -1.088 % |
| 0.1% Delta p | -1.604 % |
| Minimum Delta p | -1.646 % |
| RMS Delta p | 0.346 ± 0.017 % |
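
As a rough guide to reading two of these metrics, the sketch below assumes the table follows the definitions used by llama.cpp's perplexity tool: "Same top p" as the share of positions where both runs rank the same token highest, and "Delta p" as the change in probability assigned to the correct token. The function names and the flat row-major probability layout are hypothetical, and the baseline run is whatever the table was compared against.

```c
#include <stddef.h>

/* Index of the largest probability in p[0..n). */
static size_t argmax(const double *p, size_t n) {
    size_t best = 0;
    for (size_t i = 1; i < n; ++i) {
        if (p[i] > p[best]) best = i;
    }
    return best;
}

/* Fraction of positions where both runs rank the same token highest.
 * base and test each hold n_pos rows of n_vocab probabilities. */
static double same_top_fraction(const double *base, const double *test,
                                size_t n_pos, size_t n_vocab) {
    size_t same = 0;
    for (size_t t = 0; t < n_pos; ++t) {
        if (argmax(base + t * n_vocab, n_vocab) ==
            argmax(test + t * n_vocab, n_vocab)) {
            ++same;
        }
    }
    return (double) same / (double) n_pos;
}

/* Mean change (test minus base) in the correct token's probability. */
static double mean_delta_p(const double *base, const double *test,
                           const size_t *correct,
                           size_t n_pos, size_t n_vocab) {
    double sum = 0.0;
    for (size_t t = 0; t < n_pos; ++t) {
        size_t k = t * n_vocab + correct[t];
        sum += test[k] - base[k];
    }
    return sum / (double) n_pos;
}
```

Under these definitions, a "Same top p" near 100 % and a mean Delta p near zero indicate that the two runs produce almost the same predictions. The slightly negative minimum KLD values are most likely numerical noise, since an exact KL divergence cannot be negative.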

Additional information

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure:
    I used AI to help with testing and data collection.

velonica0 requested a review from ggerganov as a code owner April 29, 2026 05:38
github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) Apr 29, 2026