fix(prompts): mirror user's language in reasoning + reply (#588) #590
DeepSeek V4's `reasoning_content` channel inherits the system prompt's English bias even when users write in Chinese, so the visible thinking trace stays in English alongside (sometimes mixed-language) replies.

Adds a `## Language` section near the top of `base.md` directing the model to mirror the user's language in *both* `reasoning_content` and the final reply, with a carve-out so identifiers, file paths, tool names, and log lines stay in their original form (translating `read_file` to `读取文件` would break tool calls). Default remains English when no clear signal is present, so existing behaviour is preserved.

Includes a structural test in `crates/tui/src/prompts.rs` that asserts the section ships in every mode (Agent / Yolo / Plan). Wording is intentionally not asserted on, per the existing test module's "don't fail on prose" comment.

Reported via the project Telegram community (#588).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pull request overview
Adds a language-mirroring instruction to the shared TUI base prompt so DeepSeek is guided to match the user's language in both reasoning and visible replies across all prompt modes.
Changes:
- Added a new `## Language` section to the shared `base.md` prompt with an English-default fallback and carve-outs for code/tooling tokens.
- Added a structural test in `prompts.rs` to ensure the `## Language` section is present for Agent, Yolo, and Plan prompt composition.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| `crates/tui/src/prompts/base.md` | Adds the shared language-mirroring prompt guidance used by composed system prompts. |
| `crates/tui/src/prompts.rs` | Adds a regression test asserting the new language section is included in all composed modes. |

    fn language_mirroring_section_present_in_all_modes() {
        for mode in [AppMode::Agent, AppMode::Yolo, AppMode::Plan] {
            let prompt = compose_prompt(mode, Personality::Calm);
            assert!(
                prompt.contains("## Language"),
                "## Language section missing from mode {mode:?}"
            );
        }
    }

Code Review
This pull request introduces a language-mirroring directive to the base prompt, instructing the model to match the user's language in both its reasoning and final output while maintaining technical elements in their original form. A corresponding test was added to ensure this section is present across all application modes. Feedback suggests expanding the list of excluded technical terms to include environment variables, command-line flags, and URLs to prevent accidental translation.
Detect the language the user writes in and respond in that same language — including your internal reasoning. If the user writes in Simplified Chinese (简体中文), your `reasoning_content` and final reply must both be in Simplified Chinese. If they switch languages mid-conversation, switch with them. The default when no clear signal is present is English.

Code, file paths, identifiers, tool names, and log lines stay in their original form — translating `read_file` to `读取文件` would break tool calls. Only natural-language prose mirrors the user.
The list of technical elements that should remain in their original form is excellent. However, to further prevent the model from accidentally translating technical parameters that might look like natural language, I suggest explicitly adding environment variables, command-line flags, and URLs to this exclusion list. These are common areas where LLMs might attempt to 'helpfully' translate terms (e.g., --verbose or DEBUG=true) when a language-mirroring directive is active.
Suggested change:

Before: Code, file paths, identifiers, tool names, and log lines stay in their original form — translating `read_file` to `读取文件` would break tool calls. Only natural-language prose mirrors the user.

After: Code, file paths, identifiers, tool names, environment variables, command-line flags, URLs, and log lines stay in their original form — translating `read_file` to `读取文件` would break tool calls. Only natural-language prose mirrors the user.
…tent anchor

Two small follow-ups to #588's review:

* Gemini-code-assist suggested explicitly listing environment variables, command-line flags, and URLs alongside identifiers/tool-names in the carve-out clause, since those are exactly the categories an LLM is likeliest to "helpfully" translate (e.g. `--verbose` or `DEBUG=true`). Adopting verbatim: the additions are non-controversial and the failure mode they prevent is real.
* Copilot flagged that the structural test only checked for the `## Language` heading. A future edit could keep the heading but silently weaken the section to a generic "respond in the user's language" directive, dropping the cross-cutting #588 commitment that the model's `reasoning_content` field (not just the visible reply) follows the user's language. Add a second structural anchor: assert the section body mentions `reasoning_content`. This matches the existing rlm test's "anchor tokens, not prose" convention (the API field name is the feature contract, not a wording choice).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
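
The second structural anchor described in this commit could look roughly like the sketch below. It is a standalone illustration, not the code that landed: `language_section`, `has_language_anchors`, and `sample_prompt` are hypothetical names, and `sample_prompt` stands in for the crate's real `compose_prompt`, which is not reproduced here. The idea is to isolate the body of the `## Language` section and check for the `reasoning_content` token inside it, so a mention elsewhere in the prompt does not satisfy the test.

```rust
/// Returns the body of the `## Language` section: every line after the
/// heading up to (not including) the next `## ` heading, or `None` if the
/// heading is absent. Hypothetical helper, not part of the real crate.
fn language_section(prompt: &str) -> Option<String> {
    let mut lines = prompt
        .lines()
        .skip_while(|line| line.trim_end() != "## Language");
    lines.next()?; // consume the heading itself; bail out if it is missing
    let body: Vec<&str> = lines.take_while(|line| !line.starts_with("## ")).collect();
    Some(body.join("\n"))
}

/// Structural check: the heading exists AND its body names the API field.
/// Only the anchor token is asserted on, never the surrounding prose.
fn has_language_anchors(prompt: &str) -> bool {
    language_section(prompt).is_some_and(|body| body.contains("reasoning_content"))
}

/// Hypothetical stand-in for a composed system prompt.
fn sample_prompt() -> String {
    [
        "You are a coding assistant.",
        "",
        "## Language",
        "",
        "Mirror the user's language in `reasoning_content` and the final reply.",
        "",
        "## Preamble Rhythm",
        "",
        "Other guidance.",
    ]
    .join("\n")
}
```

With this shape, a prompt that keeps the heading but drops the `reasoning_content` mention fails the check, which is the weakening scenario Copilot raised. Scoping the search to the section body is a design choice of this sketch; a plain `prompt.contains("reasoning_content")` would be simpler but could pass on an unrelated mention.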
fix(prompts): mirror user's language in reasoning + reply (Hmbown#588)
Summary

- Adds a `## Language` section to `crates/tui/src/prompts/base.md` directing the model to mirror the user's language in both `reasoning_content` and the visible reply.
- Carve-out: identifiers, file paths, tool names, and log lines stay in their original form, since translating `read_file` to `读取文件` would break tool calls.
- A structural test in `prompts.rs` asserts the section ships in every mode (Agent / Yolo / Plan). Wording is intentionally not asserted on per the test module's existing "don't fail on prose" comment.

The new section

Placement

Inserted at the top of `base.md`, immediately after the identity line and before `## Preamble Rhythm`. This puts the directive before any other guidance so the model reads it first: language affects every response on every turn, so prominence matters more than thematic grouping with `## Output formatting` further down.

Cache-prefix impact: a one-time miss on rollout (the static prefix shifts by ~6 lines), then the new prefix is byte-stable turn-over-turn (verified by the existing `compose_prompt_is_byte_stable_across_calls` test, which still passes).

Notes for the reviewer

- `base.txt` is orphaned. The issue body references `base.md` and `base.txt` as though both load, but `prompts.rs` only `include_str!`s `base.md`; a workspace-wide grep for `base.txt` returns no hits. I left it untouched here so this PR stays scoped to the behaviour fix. It's a reasonable follow-up cleanup, either as a `chore:` removal or to wire it into a real load path if you'd prefer.
- `[ui] language` config knob deferred. The issue mentions an opt-in lock-to-language config as a complementary piece. I kept this PR tight to the cross-cutting fix that ships immediately for every Chinese-speaking user. Happy to follow up in a separate PR if you want it in v0.8.11.

Test plan

- `cargo fmt --all -- --check`
- `cargo clippy --workspace --all-targets --all-features --locked -- -D warnings`
- `cargo test --workspace --all-features --locked` (2200+ tests, all pass)
- `cargo test -p deepseek-tui-core --test snapshot --locked` (parity gate)
- `cargo test -p deepseek-protocol --test parity_protocol --locked`
- `cargo test -p deepseek-state --test parity_state --locked`
- `prompts::tests::language_mirroring_section_present_in_all_modes` passes
- `Cargo.lock` clean (`git diff --exit-code -- Cargo.lock`)
- `deepseek-v4-pro` produces Chinese `reasoning_content` and Chinese reply.

Closes #588.
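
The cache-prefix reasoning in the PR description can be illustrated with a small standalone sketch. Everything here is an assumption for illustration: `compose_prompt_sketch` is a hypothetical stand-in for the real composer, and the real `compose_prompt_is_byte_stable_across_calls` test is not reproduced. The sketch shows two things: composing twice yields identical bytes, and inserting a section near the top shortens the prefix shared with the old prompt, which is why a one-time cache miss is expected on rollout.

```rust
/// Hypothetical stand-in for the prompt composer: joins static sections in a
/// fixed order and reads no per-call state (no clock, no randomness).
fn compose_prompt_sketch(mode: &str) -> String {
    let sections = [
        "You are a coding assistant.".to_string(),
        "## Language\n\nMirror the user's language.".to_string(),
        format!("## Mode\n\nCurrent mode: {mode}"),
    ];
    sections.join("\n\n")
}

/// True if two consecutive calls produce byte-identical output.
fn is_byte_stable(compose: impl Fn() -> String) -> bool {
    let first = compose();
    let second = compose();
    first.as_bytes() == second.as_bytes()
}

/// Number of leading bytes two prompts share. A prefix cache can only reuse
/// this much of the old prompt after the text changes.
fn common_prefix_len(old: &str, new: &str) -> usize {
    old.bytes()
        .zip(new.bytes())
        .take_while(|(a, b)| a == b)
        .count()
}
```

In this sketch, an old prompt of `"identity\n\n## Preamble Rhythm\n"` and a new one of `"identity\n\n## Language\n\n## Preamble Rhythm\n"` share only the bytes up to the first differing heading, so everything after the identity line would be re-processed once; after that, the new prompt is stable across calls.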
🤖 Generated with Claude Code