llama: Add support for RWKV v7 architecture(v2) by MollySophia · Pull Request #12412 · ggml-org/llama.cpp

MollySophia · 2025-03-16T13:00:55Z

@BlinkDL's explanation of RWKV v7:
RWKV-7 as a meta-in-context learner
There are also plenty of tests on trained models posted on his X account.

Currently available RWKV v7 model repos in HF format:

Base models:

https://huggingface.co/fla-hub/rwkv7-191M-world
https://huggingface.co/fla-hub/rwkv7-0.4B-world
https://huggingface.co/fla-hub/rwkv7-1.5B-world
https://huggingface.co/fla-hub/rwkv7-2.9B-world
https://huggingface.co/fla-hub/rwkv7-0.1B-g1 (The option to enable its capability has not been added yet.)

Distilled models:

https://huggingface.co/RWKV-Red-Team/ARWKV-R1-1B5
https://huggingface.co/RWKV-Red-Team/ARWKV-R1-7B
https://huggingface.co/RWKV-Red-Team/ARWKV_7B_R1_16K

This PR contains:

GGML_OP_L2_NORM, which applies PyTorch-style L2 normalization along the rows. Tested with the CPU, CUDA, SYCL, Vulkan, and Metal backends.
GGML_OP_RWKV_WKV7, which is the core of the RWKV v7 architecture. The naive recurrent wkv7 kernel is implemented for CPU, CUDA, SYCL, Vulkan, and Metal.
Inference support for RWKV7 and ARWKV7 models.
A simple Metal kernel for the old WKV6.
Skipping of unused tokens in the last-layer FFN computation for RWKV models.
A fix for inference with RWKV6Qwen2.
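
For readers unfamiliar with the two new ops, here is a minimal pure-Python sketch of what they compute. It is an illustration under assumptions, not the code from this PR: the function names `l2_norm_rows` and `wkv7_step`, the `eps` default, and the `state[i][j]` layout are hypothetical. The recurrence follows the publicly described naive RWKV-7 update for a single head, with the decay `w` assumed to be already computed.

```python
import math


def l2_norm_rows(rows, eps=1e-12):
    """Scale each row to unit L2 norm: x / max(||x||_2, eps).

    Mirrors the behaviour of PyTorch's F.normalize(x, p=2, dim=-1);
    the eps default here is an assumption for illustration.
    """
    out = []
    for row in rows:
        norm = math.sqrt(sum(x * x for x in row))
        scale = 1.0 / max(norm, eps)
        out.append([x * scale for x in row])
    return out


def wkv7_step(state, r, w, k, v, a, b):
    """Process one token for one head with a naive WKV7-style recurrence.

    state is an N x N list of lists indexed state[i][j] and is updated
    in place (layout is an assumption). r, w, k, v, a, b are length-N
    vectors for the current token. Returns the length-N output y.
    """
    n = len(r)
    y = [0.0] * n
    for i in range(n):
        row = state[i]
        # Project the old state row onto a before it is modified.
        sa = sum(a[j] * row[j] for j in range(n))
        acc = 0.0
        for j in range(n):
            # Decay, add the new key/value outer product, apply the a/b correction.
            row[j] = row[j] * w[j] + k[j] * v[i] + sa * b[j]
            acc += row[j] * r[j]
        y[i] = acc
    return y
```

A sequence is processed by calling `wkv7_step` once per token while carrying `state` forward, which is why the kernel is described as recurrent. In this sketch, choosing `a = -kk` and `b = kk` for a unit-norm `kk` removes the component of the state along `kk` before new content is written.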

TODO:

llama-parallel seems broken with all RWKV models. Will check what's wrong and try to fix it tomorrow. (Inference is fixed, but the output still seems to be mixed between the parallel sequences. Haven't figured out what's wrong yet.)
Why is the MUSA build failing? (There seem to be some bugs in their vectorization code. Removing a #pragma unroll in wkv.cu fixes the build.)

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

Rbiessy

No concern with the SYCL changes, thanks

* ggml: Add op l2_norm
* ggml: Add op rwkv_wkv7
* llama: Add support for RWKV7 and ARWKV7 models
* llama: fix inference with RWKV6Qwen2
* llama: add more (a)rwkv7 variants in size
* Apply code-format changes
* fix MUSA build
* llama: fix shape error with rwkv using llama-parallel

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

MollySophia added 5 commits March 16, 2025 16:03

ggml: Add op l2_norm

98eff12

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

ggml: Add op rwkv_wkv7

194ead5

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

llama: Add support for RWKV7 and ARWKV7 models

ba7bdc0

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

llama: fix inference with RWKV6Qwen2

f34ffbc

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

llama: add more (a)rwkv7 variants in size

94b62e7

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

Apply code-format changes

019120d

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

MollySophia mentioned this pull request Mar 16, 2025

llama: Add support for RWKV v7 architecture #11452

Closed

fix MUSA build

35a8f06

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

MollySophia force-pushed the rwkv-v7-new branch from 560f606 to 35a8f06 March 17, 2025 02:38

llama: fix shape error with rwkv using llama-parallel

8b32b8a

Signed-off-by: Molly Sophia <mollysophia379@gmail.com>

MollySophia requested a review from ggerganov March 17, 2025 07:02

ggerganov approved these changes Mar 17, 2025

Rbiessy mentioned this pull request Mar 17, 2025

SYCL: Remove misleading ggml_sycl_op_flatten function #12387

Merged

Rbiessy approved these changes Mar 17, 2025

MollySophia merged commit 7dfad38 into ggml-org:master Mar 17, 2025

heredos mentioned this pull request Mar 26, 2025

RWKV-G1 ollama/ollama#9653

Open

MollySophia merged 8 commits into ggml-org:master from MollySophia:rwkv-v7-new
