Add LLaDA-7b-MoE diffusion model #16003

Merged
am17an merged 3 commits into ggml-org:master from am17an:llada_moe on Sep 16, 2025

am17an (Contributor) commented Sep 15, 2025

Add support for https://huggingface.co/inclusionAI/LLaDA-MoE-7B-A1B-Instruct, an MoE diffusion model similar to OLMoE (except for the QK norm). Added two GGUFs: bf16 and q8_0.

Example command:
./llama-diffusion-cli -m llada-moe-7B-instruct-BF16.gguf -p "Write code to train MNIST in PyTorch" -ngl 99 --diffusion-block-length 32 --diffusion-steps 256 -ub 256 --diffusion-algorithm 4 -fa 0 --temp 0 -sys "You are a helpful AI assistant"
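For readers unfamiliar with block-wise diffusion decoding, here is a toy sketch of how a block length and a step budget can interact in LLaDA-style sampling. It is not the llama-diffusion-cli implementation: `toy_model`, `blockwise_unmask`, the even split of steps across blocks, and the "unmask the most confident positions first" rule are all assumptions made for illustration, and the mapping of these parameters to `--diffusion-block-length` and `--diffusion-steps` is likewise assumed.

```python
import math
import random

MASK = -1  # hypothetical sentinel for a still-masked position


def toy_model(tokens, rng):
    """Hypothetical stand-in for a diffusion LM.

    Returns one (predicted_token, confidence) pair per position.
    A real model would condition on `tokens`; this stub does not.
    """
    return [(rng.randrange(100), rng.random()) for _ in tokens]


def blockwise_unmask(gen_length, block_length, steps, seed=0):
    """Fill a fully masked sequence block by block, left to right."""
    assert gen_length % block_length == 0, "assumed: length divides into blocks"
    num_blocks = gen_length // block_length
    assert steps % num_blocks == 0, "assumed: steps split evenly across blocks"
    steps_per_block = steps // num_blocks

    rng = random.Random(seed)
    tokens = [MASK] * gen_length

    for b in range(num_blocks):
        start, end = b * block_length, (b + 1) * block_length
        for s in range(steps_per_block):
            masked = [i for i in range(start, end) if tokens[i] == MASK]
            if not masked:
                break  # block finished early
            preds = toy_model(tokens, rng)
            # Spread the remaining masked positions over the remaining steps.
            k = math.ceil(len(masked) / (steps_per_block - s))
            # Commit the k most confident predictions in this block.
            best = sorted(masked, key=lambda i: preds[i][1], reverse=True)[:k]
            for i in best:
                tokens[i] = preds[i][0]
    return tokens


if __name__ == "__main__":
    out = blockwise_unmask(gen_length=64, block_length=32, steps=16)
    print(len(out), all(t != MASK for t in out))
```

In this sketch, a larger step budget per block means fewer positions are committed per step, which is the usual quality-versus-speed trade-off in this family of samplers.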

am17an requested a review from CISC on September 15, 2025 08:50
github-actions bot added the examples and python (python script changes) labels on Sep 15, 2025
Comment thread convert_hf_to_gguf.py Outdated
Comment thread convert_hf_to_gguf.py Outdated
Comment thread gguf-py/gguf/constants.py Outdated
Comment thread src/llama-model.cpp Outdated
Comment thread src/llama-model.cpp Outdated
Comment thread src/llama-model.cpp Outdated
Comment thread src/llama-vocab.cpp Outdated
am17an merged commit 6d75883 into ggml-org:master on Sep 16, 2025
51 of 52 checks passed
am17an deleted the llada_moe branch on September 16, 2025 02:39
CISC (Member) commented Oct 18, 2025

@am17an There's a preview of LLaDa2Moe available, and it uses the same expert group selection as in BailingMoeV2, so I made it generally available in 6dd223b in case you want to take a stab at it later. :)

am17an (Contributor, Author) commented Oct 19, 2025

@CISC Thanks! I saw that. From what I can see, it looks like their sampling doesn't change, so it should be straightforward to add this model. I will wait for them to release the full version first, though.

blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026