model: add hunyuan dense by stevenkuang-tencent · Pull Request #14878 · ggml-org/llama.cpp

stevenkuang-tencent · 2025-07-25T13:23:05Z

Update:

Support hunyuan_dense
fix hunyuan_moe chat template

Signed-off-by: stevenkuang <stevenkuang@tencent.com>

This reverts commit aa973ca.

Signed-off-by: stevenkuang <stevenkuang@tencent.com>

CISC · 2025-07-29T09:24:31Z

@stevenkuang-tencent gentle ping

Signed-off-by: stevenkuang <stevenkuang@tencent.com>

stevenkuang-tencent · 2025-08-01T13:12:30Z

Politely asking, can this pull request be merged now? @CISC

CISC · 2025-08-01T13:26:39Z

@stevenkuang-tencent Yes, but the chat template gives me pause, please follow up once model is released if there are any problems.

jacekpoplawski · 2025-08-01T15:37:45Z

Is this for upcoming models or old ones?
Because https://huggingface.co/tencent/Hunyuan-4B-Instruct is not accessible and in the vllm I see https://huggingface.co/tencent/Hunyuan-7B-Instruct-0124 is mentioned

stevenkuang-tencent · 2025-08-01T15:52:06Z

Is this for upcoming models or old ones? Because https://huggingface.co/tencent/Hunyuan-4B-Instruct is not accessible and in the vllm I see https://huggingface.co/tencent/Hunyuan-7B-Instruct-0124 is mentioned

It is for upcoming models. Those models will come soon.

jacekpoplawski · 2025-08-01T16:02:55Z

It is for upcoming models. Those models will come soon.

that's fantastic news, thanks!

* support hunyuan_v1_dense Signed-off-by: stevenkuang <stevenkuang@tencent.com> * update hunyuan_moe to hunyuan_v1_moe Signed-off-by: stevenkuang <stevenkuang@tencent.com> * fix rope alpha assert and bos token Signed-off-by: stevenkuang <stevenkuang@tencent.com> * add blank line Signed-off-by: stevenkuang <stevenkuang@tencent.com> * Revert "update hunyuan_moe to hunyuan_v1_moe" This reverts commit aa973ca. * use hunyuan_dense instead of hunyuan_v1_dense Signed-off-by: stevenkuang <stevenkuang@tencent.com> * fix hunyuan_moe chat template Signed-off-by: stevenkuang <stevenkuang@tencent.com> * remove leftover code Signed-off-by: stevenkuang <stevenkuang@tencent.com> * update hunyuan dense chat template Signed-off-by: stevenkuang <stevenkuang@tencent.com> * fix hunyuan dense vocab and chat template Signed-off-by: stevenkuang <stevenkuang@tencent.com> --------- Signed-off-by: stevenkuang <stevenkuang@tencent.com>

pwilkin · 2025-08-05T10:50:27Z

Just wanted to chime in, tested IQ4NL quants and the output is completely incoherent.

arch-btw · 2025-08-05T13:19:16Z

Same issue here, tried it with the different flags but it still doesn't work:

-cnv --jinja

-cnv --chat-template hunyuan-dense

-cnv --chat-template hunyuan-moe

Example output:

hello

，不对，这样处理的话，比如对于输入序列“1,

@stevenkuang-tencent

pwilkin · 2025-08-05T13:23:17Z

My 3 attempts were:

Lots of Chinese text after "Hello"
Started "<think" then completely froze (but generation wasn't finished)
Kept repeating "answer:" in new session
Something is completely broken.

stevenkuang-tencent · 2025-08-05T14:57:07Z

The chat-template has been updated before the model is open sourced, and we are updating it synchronously

arch-btw · 2025-08-05T19:06:22Z

@stevenkuang-tencent thank you

@pwilkin I put this together and this seems to work for now, although it's not an official solution:

Save as hunyuan4b.jinja
Then run with --jinja --chat-template-file hunyuan4b.jinja
The model defaults to /no_think but putting /think before the prompt works.

hunyuan4b.jinja:

{%- if 'add_generation_prompt' not in context %}
    {%- set add_generation_prompt = false %}
{%- endif %}

{%- set ns = namespace(is_first=false) %}

{%- for message in messages if message['role'] == 'system' %}
    {%- if ns.is_first %}
        {%- set ns.is_first = false %}
        {{- bos_token -}}
        {{- message['content'] }}
    {%- else %}
        {{- '\n\n' + message['content'] }}
    {%- endif %}
{%- endfor %}

{%- for message in messages %}
    {% if message['role'] == 'user' %}
        <｜hy_User｜>{{ message['content'] }}<｜hy_Assistant｜>
    {% endif %}
    
    {% if message['role'] == 'assistant' %}
        {{ message['content'] }}{{ eos_token }}
    {% endif %}
{%- endfor %}

{%- if add_generation_prompt and not ns.is_last_user %}
    <｜hy_Assistant｜>
{%- endif %}

{%- if enable_thinking is defined and not enable_thinking %}
    ...
{%- endif %}

pwilkin · 2025-08-05T19:08:58Z

@stevenkuang-tencent thank you

@pwilkin I put this together and this seems to work for now, although it's not an official solution:

Save as hunyuan4b.jinja Then run with --jinja --chat-template-file hunyuan4b.jinja The model defaults to /no_think but putting /think before the prompt works.

hunyuan4b.jinja:

{%- if 'add_generation_prompt' not in context %}
    {%- set add_generation_prompt = false %}
{%- endif %}

{%- set ns = namespace(is_first=false) %}

{%- for message in messages if message['role'] == 'system' %}
    {%- if ns.is_first %}
        {%- set ns.is_first = false %}
        {{- bos_token -}}
        {{- message['content'] }}
    {%- else %}
        {{- '\n\n' + message['content'] }}
    {%- endif %}
{%- endfor %}

{%- for message in messages %}
    {% if message['role'] == 'user' %}
        <｜hy_User｜>{{ message['content'] }}<｜hy_Assistant｜>
    {% endif %}
    
    {% if message['role'] == 'assistant' %}
        {{ message['content'] }}{{ eos_token }}
    {% endif %}
{%- endfor %}

{%- if add_generation_prompt and not ns.is_last_user %}
    <｜hy_Assistant｜>
{%- endif %}

{%- if enable_thinking is defined and not enable_thinking %}
    ...
{%- endif %}

What's in the "..." part? The current contents?

arch-btw · 2025-08-05T19:13:07Z

I think so, when I remove it (with thinking enabled) it starts talking in Chinese again.

pwilkin · 2025-08-05T19:19:56Z

Nope, on Hunyuan 7B still garbage. Tried the fixed prompt from their tokenizer config, but still doesn't work.

pwilkin · 2025-08-05T19:20:47Z

I guess it might have something to do with this:
load: special_eos_id is not in special_eog_ids - the tokenizer config may be incorrect

arch-btw · 2025-08-05T19:21:11Z

I think 7b uses a different tokenizer.

pwilkin · 2025-08-05T19:21:43Z

Yes, but it's been incorrectly uploaded from what I've seen.

* support hunyuan_v1_dense Signed-off-by: stevenkuang <stevenkuang@tencent.com> * update hunyuan_moe to hunyuan_v1_moe Signed-off-by: stevenkuang <stevenkuang@tencent.com> * fix rope alpha assert and bos token Signed-off-by: stevenkuang <stevenkuang@tencent.com> * add blank line Signed-off-by: stevenkuang <stevenkuang@tencent.com> * Revert "update hunyuan_moe to hunyuan_v1_moe" This reverts commit aa973ca21913aba77f6e81a935270ef7be222e75. * use hunyuan_dense instead of hunyuan_v1_dense Signed-off-by: stevenkuang <stevenkuang@tencent.com> * fix hunyuan_moe chat template Signed-off-by: stevenkuang <stevenkuang@tencent.com> * remove leftover code Signed-off-by: stevenkuang <stevenkuang@tencent.com> * update hunyuan dense chat template Signed-off-by: stevenkuang <stevenkuang@tencent.com> * fix hunyuan dense vocab and chat template Signed-off-by: stevenkuang <stevenkuang@tencent.com> --------- Signed-off-by: stevenkuang <stevenkuang@tencent.com>

* support hunyuan_v1_dense Signed-off-by: stevenkuang <stevenkuang@tencent.com> * update hunyuan_moe to hunyuan_v1_moe Signed-off-by: stevenkuang <stevenkuang@tencent.com> * fix rope alpha assert and bos token Signed-off-by: stevenkuang <stevenkuang@tencent.com> * add blank line Signed-off-by: stevenkuang <stevenkuang@tencent.com> * Revert "update hunyuan_moe to hunyuan_v1_moe" This reverts commit aa973ca. * use hunyuan_dense instead of hunyuan_v1_dense Signed-off-by: stevenkuang <stevenkuang@tencent.com> * fix hunyuan_moe chat template Signed-off-by: stevenkuang <stevenkuang@tencent.com> * remove leftover code Signed-off-by: stevenkuang <stevenkuang@tencent.com> * update hunyuan dense chat template Signed-off-by: stevenkuang <stevenkuang@tencent.com> * fix hunyuan dense vocab and chat template Signed-off-by: stevenkuang <stevenkuang@tencent.com> --------- Signed-off-by: stevenkuang <stevenkuang@tencent.com>

stevenkuang-tencent added 3 commits July 25, 2025 19:55

support hunyuan_v1_dense

5d2c042

Signed-off-by: stevenkuang <stevenkuang@tencent.com>

update hunyuan_moe to hunyuan_v1_moe

aa973ca

Signed-off-by: stevenkuang <stevenkuang@tencent.com>

fix rope alpha assert and bos token

5645497

Signed-off-by: stevenkuang <stevenkuang@tencent.com>

github-actions Bot added the python python script changes label Jul 25, 2025

add blank line

63f32c3

Signed-off-by: stevenkuang <stevenkuang@tencent.com>

CISC reviewed Jul 25, 2025

View reviewed changes

Comment thread convert_hf_to_gguf.py Outdated

Comment thread convert_hf_to_gguf_update.py Outdated

Comment thread gguf-py/gguf/constants.py Outdated

stevenkuang-tencent added 3 commits July 26, 2025 03:26

Revert "update hunyuan_moe to hunyuan_v1_moe"

78de8db

This reverts commit aa973ca.

use hunyuan_dense instead of hunyuan_v1_dense

c7329b4

Signed-off-by: stevenkuang <stevenkuang@tencent.com>

fix hunyuan_moe chat template

0192c12

Signed-off-by: stevenkuang <stevenkuang@tencent.com>

stevenkuang-tencent changed the title ~~model: add hunyuan v1 dense~~ model: add hunyuan dense Jul 25, 2025

CISC requested changes Jul 25, 2025

View reviewed changes

xunjieliu mentioned this pull request Jul 26, 2025

Reddit News Daily 2025-07-26 xunjieliu/reddit-daily-news#132

Open

stevenkuang-tencent added 2 commits July 27, 2025 01:08

remove leftover code

3ecc5d3

Signed-off-by: stevenkuang <stevenkuang@tencent.com>

update hunyuan dense chat template

6c17323

Signed-off-by: stevenkuang <stevenkuang@tencent.com>

stevenkuang-tencent requested a review from CISC July 26, 2025 17:19

CISC approved these changes Jul 26, 2025

View reviewed changes

Comment thread convert_hf_to_gguf.py Outdated

fix hunyuan dense vocab and chat template

675f35d

Signed-off-by: stevenkuang <stevenkuang@tencent.com>

CISC reviewed Jul 31, 2025

View reviewed changes

Comment thread src/llama-chat.cpp

CISC merged commit 0f5ccd6 into ggml-org:master Aug 1, 2025
50 checks passed

stevenkuang-tencent mentioned this pull request Aug 6, 2025

model : fix hunyuan chat template #15114

Merged

weedge mentioned this pull request Sep 2, 2025

feat: add Hunyuan-MT and Seed-X transformers generator translation test; run websocket/webrtc asr+translate+tts bot serve for Hunyuan-MT ai-bot-pro/achatbot#188

Merged

wqerrewetw mentioned this pull request Jan 5, 2026

Feature Request: support HY-MT1.5-1.8B #18608

Closed

4 tasks

Conversation

stevenkuang-tencent commented Jul 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

CISC commented Jul 29, 2025

Uh oh!

Uh oh!

stevenkuang-tencent commented Aug 1, 2025

Uh oh!

CISC commented Aug 1, 2025

Uh oh!

Uh oh!

jacekpoplawski commented Aug 1, 2025

Uh oh!

stevenkuang-tencent commented Aug 1, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jacekpoplawski commented Aug 1, 2025

Uh oh!

pwilkin commented Aug 5, 2025

Uh oh!

arch-btw commented Aug 5, 2025

Uh oh!

pwilkin commented Aug 5, 2025

Uh oh!

stevenkuang-tencent commented Aug 5, 2025

Uh oh!

arch-btw commented Aug 5, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

pwilkin commented Aug 5, 2025

Uh oh!

arch-btw commented Aug 5, 2025

Uh oh!

pwilkin commented Aug 5, 2025

Uh oh!

pwilkin commented Aug 5, 2025

Uh oh!

arch-btw commented Aug 5, 2025

Uh oh!

pwilkin commented Aug 5, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

stevenkuang-tencent commented Jul 25, 2025 •

edited

Loading

stevenkuang-tencent commented Aug 1, 2025 •

edited

Loading

arch-btw commented Aug 5, 2025 •

edited

Loading