[Bug] MiMo TTS style control is not prepended to the assistant synthesis text as the official documentation requires #6815

@RichardLiuda

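What happened

Problem description

The current MiMo TTS style-control implementation does not match the official documentation.

According to the official MiMo speech synthesis documentation, style control should be applied by adding a <style>...</style> tag directly at the start of the text to synthesize, and that text must be placed in an assistant-role message.

The current AstrBot implementation, however, sends the style content as a separate user message instead of prepending it to the assistant's target text, so the style control deviates from the officially recommended usage.

Official documentation

Reference: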

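Reproduce

The key requirements in the documentation include:

  • The text to synthesize must be placed in an assistant-role message
  • Style control must be prepended to the start of the target text as a <style>...</style> tag
  • The user-role message is optional and serves only as an auxiliary hint
  • When the "singing" style is used, the very start of the target text should contain only <style>唱歌</style>

Current behavior

The current implementation produces a structure closer to the following: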

{
  "messages": [
    {
      "role": "user",
      "content": "开心 四川话 seed text"
    },
    {
      "role": "assistant",
      "content": "明天就是周五了"
    }
  ]
}

This does not match the documented requirements.

Expected behavior

The style information should instead be prepended directly to assistant.content:

{
  "messages": [
    {
      "role": "user",
      "content": "seed text"
    },
    {
      "role": "assistant",
      "content": "<style>开心 四川话</style>明天就是周五了"
    }
  ]
}
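The expected request shape above could be produced along these lines. This is a minimal sketch, not AstrBot's actual provider code: the function name `build_mimo_tts_messages` and its parameter names are hypothetical, and joining the style parts with a single space is an assumption taken from the example payload.

```python
def build_mimo_tts_messages(text, style_prompt="", dialect="", seed_text=""):
    """Build a messages list with the style tag prepended to the assistant text.

    Hypothetical helper for illustration; names are not taken from AstrBot.
    """
    # Merge the non-empty style parts, separated by a space as in the example payload.
    parts = [p.strip() for p in (style_prompt, dialect) if p and p.strip()]
    prefix = f"<style>{' '.join(parts)}</style>" if parts else ""

    messages = []
    if seed_text and seed_text.strip():
        # The user message is optional and only acts as an auxiliary hint.
        messages.append({"role": "user", "content": seed_text.strip()})
    # The text to synthesize always goes into the assistant message.
    messages.append({"role": "assistant", "content": prefix + text})
    return messages


messages = build_mimo_tts_messages(
    "明天就是周五了", style_prompt="开心", dialect="四川话", seed_text="seed text"
)
```

With no style and no seed text configured, the sketch emits a single assistant message containing only the target text.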

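In the "singing" scenario, the documented requirement should be followed so that only the following is kept (歌词 stands for the lyrics):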

<style>唱歌</style>歌词
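One possible way to handle the singing case when building the prefix is sketched below. The helper name `style_prefix` is hypothetical, and dropping every other style term once 唱歌 is present is only one reading of the "keep only <style>唱歌</style>" requirement, so it should be checked against the official documentation.

```python
SINGING = "唱歌"


def style_prefix(*styles):
    """Return the <style> prefix for the given style terms (hypothetical helper)."""
    parts = [s.strip() for s in styles if s and s.strip()]
    if not parts:
        return ""
    # Assumption: when singing is requested, all other style terms are dropped
    # so that the tag contains only 唱歌.
    if any(SINGING in p for p in parts):
        return f"<style>{SINGING}</style>"
    return f"<style>{' '.join(parts)}</style>"


singing_text = style_prefix("唱歌", "开心") + "歌词"
```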

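Suggested changes

  • Merge mimo-tts-style-prompt and mimo-tts-dialect into a single <style>...</style> prefix
  • Prepend that prefix directly to the text to synthesize in the assistant-role message
  • Keep mimo-tts-seed-text as an optional user message instead of concatenating it into the text to synthesize
  • Add corresponding tests to ensure the request is constructed as the official documentation describes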
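The tests mentioned in the last item might start from a shape check like the one below. It is a sketch in pytest style (plain `test_` functions with bare asserts): `find_problems` is a hypothetical helper rather than an existing AstrBot function, and it assumes a style is configured, so a request without any style would also be flagged.

```python
import re

# Leading <style>...</style> tag at the very start of the assistant text.
STYLE_PREFIX = re.compile(r"^<style>(.+?)</style>", re.DOTALL)


def find_problems(messages):
    """Return a list of problems found in a request's messages (empty if none)."""
    assistant = [m for m in messages if m.get("role") == "assistant"]
    if not assistant:
        return ["no assistant message carries the text to synthesize"]
    problems = []
    match = STYLE_PREFIX.match(assistant[-1].get("content", ""))
    if match is None:
        problems.append("assistant content does not start with a <style> tag")
    elif "唱歌" in match.group(1) and match.group(1).strip() != "唱歌":
        problems.append("singing requests should carry only <style>唱歌</style>")
    return problems


def test_current_shape_is_flagged():
    # Payload shape reported under "Current behavior".
    messages = [
        {"role": "user", "content": "开心 四川话 seed text"},
        {"role": "assistant", "content": "明天就是周五了"},
    ]
    assert find_problems(messages) == [
        "assistant content does not start with a <style> tag"
    ]


def test_expected_shape_passes():
    # Payload shape requested under "Expected behavior".
    messages = [
        {"role": "user", "content": "seed text"},
        {"role": "assistant", "content": "<style>开心 四川话</style>明天就是周五了"},
    ]
    assert find_problems(messages) == []
```

In a real test suite the literal payloads would be replaced by the output of the provider's request-building code.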

AstrBot version, deployment method (e.g., Windows Docker Desktop deployment), provider used, and messaging platform used

astrbot 4.22.0

OS

Linux

Logs

"200 OK" Headers({'server': 'openresty', 'date': 'Sun, 22 Mar 2026 17:11:30 GMT', 'content-type': 'application/json; charset=UTF-8', 'content-length': '749', 'connection': 'keep-alive', 'alt-svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000', 'vary': 'Origin', 'x-content-type-options': 'nosniff', 'x-frame-options': 'SAMEORIGIN', 'x-new-api-version': 'v0.11.2', 'x-oneapi-request-id': '20260322171125745975180zpVCLoCP', 'x-xss-protection': '0', 'cache-control': 'no-cache', 'strict-transport-security': 'max-age=31536000'})
[01:11:30.329] [Core] [DBUG] [openai._base_client:1571]: request_id: None
[01:11:30.398] [Core] [DBUG] [sources.openai_source:283]: completion: ChatCompletion(id='chatcmpl-20260322171125745975180zpVCLoCP', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content='哈基米……是指那种毛茸茸的小可爱吗?听起来像是某种奇怪的召唤咒语呢。&&meow&&', refusal=None, role='assistant', annotations=None, audio=None, function_call=None, tool_calls=None))], created=1774199490, model='gemini-3-flash-preview', object='chat.completion', service_tier=None, system_fingerprint=None, usage=CompletionUsage(completion_tokens=444, prompt_tokens=10373, total_tokens=10817, completion_tokens_details=CompletionTokensDetails(accepted_prediction_tokens=None, audio_tokens=0, reasoning_tokens=416, rejected_prediction_tokens=None, text_tokens=0), prompt_tokens_details=PromptTokensDetails(audio_tokens=0, cached_tokens=0, text_tokens=10373, image_tokens=0), input_tokens=0, output_tokens=0, input_tokens_details=None, claude_cache_creation_5_m_tokens=0, claude_cache_creation_1_h_tokens=0))
[01:11:30.415] [Core] [DBUG] [runners.base:64]: Agent state transition: AgentState.RUNNING -> AgentState.DONE
[01:11:30.416] [Core] [DBUG] [pipeline.context_utils:95]: hook(OnLLMResponseEvent) -> meme_manager - resp
[01:11:30.427] [Plug] [DBUG] [astrbot_plugin_meme_manager.main:595]: [meme_manager] 重复检测阶段找到的表情: []
[01:11:30.429] [Plug] [DBUG] [astrbot_plugin_meme_manager.main:624]: [meme_manager] 松散匹配阶段找到的表情: []
[01:11:30.430] [Plug] [INFO] [astrbot_plugin_meme_manager.main:637]: [meme_manager] 去重后的最终表情列表: ['meow']
[01:11:30.431] [Plug] [DBUG] [astrbot_plugin_meme_manager.main:642]: [meme_manager] 清理后的最终文本内容长度: 34
[01:11:30.432] [Core] [DBUG] [pipeline.context_utils:95]: hook(OnLLMResponseEvent) -> astrbot_plugin_mnemosyne - on_llm_resp
[01:11:30.434] [Plug] [WARN] [v4.22.0] [core.memory_operations:553]: 当前会话 (ID: astrbot:FriendMessage:2645345468) 未配置人格,将使用占位符 'UNKNOWN_PERSONA' 进行记忆操作(如果启用人格过滤)。
[01:11:30.437] [Plug] [DBUG] [memory_manager.message_counter:307]: 会话 astrbot:FriendMessage:2645345468 的上下文历史长度 (464) 与消息计数器 (9) 一致。
[01:11:30.438] [Plug] [DBUG] [core.memory_operations:468]: 返回的内容:哈基米……是指那种毛茸茸的小可爱吗?听起来像是某种奇怪的召唤咒语呢。
[01:11:30.443] [Plug] [DBUG] [memory_manager.message_counter:225]: 会话 astrbot:FriendMessage:2645345468 的计数器已加 1。
[01:11:30.443] [Core] [DBUG] [pipeline.context_utils:95]: hook(OnLLMResponseEvent) -> astrbot - record_llm_resp_to_ltm
[01:11:30.454] [Core] [DBUG] [result_decorate.stage:165]: hook(on_decorating_result) -> meme_manager - on_decorating_result
[01:11:30.457] [Plug] [DBUG] [astrbot_plugin_meme_manager.main:766]: [meme_manager] on_decorating_result 开始处理
[01:11:30.460] [Plug] [DBUG] [astrbot_plugin_meme_manager.main:914]: [meme_manager] on_decorating_result 处理完成
[01:11:30.467] [Core] [INFO] [result_decorate.stage:286]: TTS 请求: 哈基米……是指那种毛茸茸的小可爱吗?听起来像是某种奇怪的召唤咒语呢。
[01:11:30.480] [Core] [DBUG] [httpcore._trace:87]: close.started
[01:11:30.480] [Core] [DBUG] [httpcore._trace:87]: close.complete
[01:11:30.481] [Core] [DBUG] [httpcore._trace:87]: connect_tcp.started host='api.xiaomimimo.com' port=443 local_address=None timeout=20 socket_options=None
[01:11:30.810] [Core] [DBUG] [httpcore._trace:87]: connect_tcp.complete return_value=<httpcore._backends.anyio.AnyIOStream object at 0x7f8a90106810>
[01:11:30.811] [Core] [DBUG] [httpcore._trace:87]: start_tls.started ssl_context=<ssl.SSLContext object at 0x7f8a93fcbb50> server_hostname='api.xiaomimimo.com' timeout=20
[01:11:30.825] [Core] [DBUG] [httpcore._trace:87]: start_tls.complete return_value=<httpcore._backends.anyio.AnyIOStream object at 0x7f8a7f652b40>
[01:11:30.825] [Core] [DBUG] [httpcore._trace:87]: send_request_headers.started request=<Request [b'POST']>
[01:11:30.826] [Core] [DBUG] [httpcore._trace:87]: send_request_headers.complete
[01:11:30.826] [Core] [DBUG] [httpcore._trace:87]: send_request_body.started request=<Request [b'POST']>
[01:11:30.826] [Core] [DBUG] [httpcore._trace:87]: send_request_body.complete
[01:11:30.826] [Core] [DBUG] [httpcore._trace:87]: receive_response_headers.started request=<Request [b'POST']>
[01:11:34.049] [Core] [DBUG] [httpcore._trace:87]: receive_response_headers.complete return_value=(b'HTTP/1.1', 200, b'OK', [(b'Date', b'Sun, 22 Mar 2026 17:11:34 GMT'), (b'Content-Type', b'application/json'), (b'Transfer-Encoding', b'chunked'), (b'Connection', b'keep-alive'), (b'Vary', b'Origin'), (b'Vary', b'Access-Control-Request-Method'), (b'Vary', b'Access-Control-Request-Headers'), (b'Server', b'MiFE/3.4.29'), (b'X-MiFE-Upstream-Status', b'200'), (b'Content-Encoding', b'gzip')])
[01:11:34.081] [Core] [INFO] [httpx._client:1740]: HTTP Request: POST https://api.xiaomimimo.com/v1/chat/completions "HTTP/1.1 200 OK"
[01:11:34.084] [Core] [DBUG] [httpcore._trace:87]: receive_response_body.started request=<Request [b'POST']>
[01:11:34.101] [Core] [DBUG] [httpcore._trace:87]: receive_response_body.complete
[01:11:34.101] [Core] [DBUG] [httpcore._trace:87]: response_closed.started
[01:11:34.101] [Core] [DBUG] [httpcore._trace:87]: response_closed.complete
[01:11:34.127] [Core] [INFO] [result_decorate.stage:288]: TTS 结果: /AstrBot/data/temp/mimo_tts_api_3458855c-6bc6-4eb7-92f9-717eb6c681d6.wav
[01:11:34.141] [Core] [INFO] [respond.stage:184]: Prepare to send - Richard Liu/2645345468: [ComponentType.Record] 哈基米……是指那种毛茸茸的小可爱吗?听起来像是某种奇怪的召唤咒语呢。
[01:11:39.026] [Core] [DBUG] [pipeline.context_utils:95]: hook(OnAfterMessageSentEvent) -> meme_manager - after_message_sent
[01:11:39.028] [Core] [DBUG] [pipeline.context_utils:95]: hook(OnAfterMessageSentEvent) -> astrbot - after_message_sent
[01:11:39.132] [Core] [DBUG] [pipeline.scheduler:93]: pipeline 执行完毕。

Are you willing to submit a PR?

  • Yes!

Labels: area:provider, bug
