
fix(prompts): mirror user's language in reasoning + reply (#588) #590

Merged
Hmbown merged 2 commits into main from fix/588-mirror-user-language
May 5, 2026

Hmbown (Owner) commented May 4, 2026

Summary

  • Adds a ## Language section to crates/tui/src/prompts/base.md directing the model to mirror the user's language in both reasoning_content and the visible reply.
  • Carve-out so code, file paths, identifiers, tool names, and log lines stay in their original form — translating read_file to 读取文件 would break tool calls.
  • Default remains English when no clear input signal is present, so existing English-language behaviour is preserved.
  • Structural test in prompts.rs asserts the section ships in every mode (Agent / Yolo / Plan). Wording is intentionally not asserted, per the test module's existing "don't fail on prose" comment.

The new section

## Language

Detect the language the user writes in and respond in that same language — including your internal reasoning. If the user writes in Simplified Chinese (简体中文), your `reasoning_content` and final reply must both be in Simplified Chinese. If they switch languages mid-conversation, switch with them. The default when no clear signal is present is English.

Code, file paths, identifiers, tool names, and log lines stay in their original form — translating `read_file` to `读取文件` would break tool calls. Only natural-language prose mirrors the user.

Placement

Inserted at the top of base.md, immediately after the identity line and before ## Preamble Rhythm. This puts the directive before any other guidance so the model reads it first — language affects every response on every turn, so prominence matters more than thematic grouping with ## Output formatting further down.

Cache-prefix impact: a one-time miss on rollout (the static prefix shifts by ~6 lines), then the new prefix is byte-stable turn-over-turn (verified by the existing compose_prompt_is_byte_stable_across_calls test, which still passes).
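
The two placement properties described above (the section sits ahead of ## Preamble Rhythm, and the composed prompt is identical from call to call) can be sketched as a pair of checks. This is a minimal illustration, not the project's code: `AppMode`, `Personality`, `compose_prompt`, and the `BASE` text are hypothetical stand-ins for the real items in crates/tui/src/prompts.rs, and `appears_before` is a helper invented for the example.

```rust
// Hypothetical stand-ins for the real types in crates/tui/src/prompts.rs.
#[derive(Clone, Copy, Debug)]
pub enum AppMode {
    Agent,
    Yolo,
    Plan,
}

#[derive(Clone, Copy, Debug)]
pub enum Personality {
    Calm,
}

// Assumed shape of the static prefix; the real base.md text differs.
const BASE: &str = "You are a coding assistant.\n\n## Language\n\nMirror the user's language.\n\n## Preamble Rhythm\n\n...\n";

/// Stand-in composer: the static base prompt followed by a per-mode suffix.
pub fn compose_prompt(mode: AppMode, _personality: Personality) -> String {
    format!("{}\n## Mode\n\n{:?}\n", BASE, mode)
}

/// Returns true when both markers occur and `first` starts before `second`.
pub fn appears_before(prompt: &str, first: &str, second: &str) -> bool {
    match (prompt.find(first), prompt.find(second)) {
        (Some(a), Some(b)) => a < b,
        _ => false,
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn language_section_precedes_preamble_rhythm() {
        for mode in [AppMode::Agent, AppMode::Yolo, AppMode::Plan] {
            let prompt = compose_prompt(mode, Personality::Calm);
            assert!(
                appears_before(&prompt, "## Language", "## Preamble Rhythm"),
                "## Language is not ahead of ## Preamble Rhythm in mode {mode:?}"
            );
        }
    }

    #[test]
    fn prompt_is_byte_stable_across_calls() {
        // Identical inputs must yield identical bytes, or the cache prefix moves.
        let first = compose_prompt(AppMode::Agent, Personality::Calm);
        let second = compose_prompt(AppMode::Agent, Personality::Calm);
        assert_eq!(first.as_bytes(), second.as_bytes());
    }
}
```

Run with `cargo test`. The ordering check compares heading positions only, so it stays in line with the "don't fail on prose" convention mentioned in the summary.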

Notes for the reviewer

  • base.txt is orphaned. The issue body references base.md and base.txt as though both load, but prompts.rs only include_str!s base.md, and a workspace-wide grep for base.txt returns no hits. I left it untouched so this PR stays scoped to the behaviour fix. A reasonable follow-up is either a chore: commit that removes it or a change that wires it into a real load path, whichever you prefer.
  • Optional [ui] language config knob deferred. The issue mentions an opt-in lock-to-language config as a complementary piece. I kept this PR tight to the cross-cutting fix that ships immediately for every Chinese-speaking user. Happy to follow up in a separate PR if you want it in v0.8.11.

Test plan

  • cargo fmt --all -- --check
  • cargo clippy --workspace --all-targets --all-features --locked -- -D warnings
  • cargo test --workspace --all-features --locked — 2200+ tests, all pass
  • cargo test -p deepseek-tui-core --test snapshot --locked (parity gate)
  • cargo test -p deepseek-protocol --test parity_protocol --locked
  • cargo test -p deepseek-state --test parity_state --locked
  • New test prompts::tests::language_mirroring_section_present_in_all_modes passes
  • Cargo.lock clean (git diff --exit-code -- Cargo.lock)
  • Manual eval before release: a Chinese-input turn on deepseek-v4-pro produces Chinese reasoning_content and Chinese reply.

Closes #588.

🤖 Generated with Claude Code

DeepSeek V4's `reasoning_content` channel inherits the system
prompt's English bias even when users write in Chinese, so the
visible thinking trace stays in English alongside (sometimes
mixed-language) replies.

Adds a `## Language` section near the top of `base.md` directing
the model to mirror the user's language in *both*
`reasoning_content` and the final reply, with a carve-out so
identifiers, file paths, tool names, and log lines stay in their
original form (translating `read_file` to `读取文件` would break
tool calls). Default remains English when no clear signal is
present, so existing behaviour is preserved.

Includes a structural test in `crates/tui/src/prompts.rs` that
asserts the section ships in every mode (Agent / Yolo / Plan).
Wording is intentionally not asserted on, per the existing test
module's "don't fail on prose" comment.

Reported via the project Telegram community (#588).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 4, 2026 17:11
Copilot AI (Contributor) left a comment

Pull request overview

Adds a language-mirroring instruction to the shared TUI base prompt so DeepSeek is guided to match the user's language in both reasoning and visible replies across all prompt modes.

Changes:

  • Added a new ## Language section to the shared base.md prompt with an English-default fallback and carve-outs for code/tooling tokens.
  • Added a structural test in prompts.rs to ensure the ## Language section is present for Agent, Yolo, and Plan prompt composition.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| crates/tui/src/prompts/base.md | Adds the shared language-mirroring prompt guidance used by composed system prompts. |
| crates/tui/src/prompts.rs | Adds a regression test asserting the new language section is included in all composed modes. |


Comment thread crates/tui/src/prompts.rs
Comment on lines +613 to +619
    fn language_mirroring_section_present_in_all_modes() {
        for mode in [AppMode::Agent, AppMode::Yolo, AppMode::Plan] {
            let prompt = compose_prompt(mode, Personality::Calm);
            assert!(
                prompt.contains("## Language"),
                "## Language section missing from mode {mode:?}"
            );

gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request introduces a language-mirroring directive to the base prompt, instructing the model to match the user's language in both its reasoning and final output while maintaining technical elements in their original form. A corresponding test was added to ensure this section is present across all application modes. Feedback suggests expanding the list of excluded technical terms to include environment variables, command-line flags, and URLs to prevent accidental translation.

Comment thread crates/tui/src/prompts/base.md Outdated

Detect the language the user writes in and respond in that same language — including your internal reasoning. If the user writes in Simplified Chinese (简体中文), your `reasoning_content` and final reply must both be in Simplified Chinese. If they switch languages mid-conversation, switch with them. The default when no clear signal is present is English.

Code, file paths, identifiers, tool names, and log lines stay in their original form — translating `read_file` to `读取文件` would break tool calls. Only natural-language prose mirrors the user.

medium

The list of technical elements that should remain in their original form is excellent. However, to further prevent the model from accidentally translating technical parameters that might look like natural language, I suggest explicitly adding environment variables, command-line flags, and URLs to this exclusion list. These are common areas where LLMs might attempt to 'helpfully' translate terms (e.g., --verbose or DEBUG=true) when a language-mirroring directive is active.

Suggested change
- Code, file paths, identifiers, tool names, and log lines stay in their original form — translating `read_file` to `读取文件` would break tool calls. Only natural-language prose mirrors the user.
+ Code, file paths, identifiers, tool names, environment variables, command-line flags, URLs, and log lines stay in their original form — translating `read_file` to `读取文件` would break tool calls. Only natural-language prose mirrors the user.

…tent anchor

Two small follow-ups from the review of the #588 fix:

* Gemini-code-assist suggested explicitly listing environment variables,
  command-line flags, and URLs alongside identifiers/tool-names in the
  carve-out clause, since those are exactly the categories an LLM is
  likeliest to "helpfully" translate (e.g. `--verbose` or `DEBUG=true`).
  Adopting verbatim — the additions are non-controversial and the failure
  mode they prevent is real.

* Copilot flagged that the structural test only checked for the `## Language`
  heading. A future edit could keep the heading but silently weaken the
  section to a generic "respond in the user's language" directive,
  dropping the cross-cutting #588 commitment that the model's
  `reasoning_content` field — not just the visible reply — follows the
  user's language. Add a second structural anchor: assert the section
  body mentions `reasoning_content`. This matches the existing rlm test's
  "anchor tokens, not prose" convention (the API field name is the
  feature contract, not a wording choice).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
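
The second structural anchor described in this commit might look roughly like the sketch below. It is an illustration under stated assumptions, not the merged test: `AppMode`, `Personality`, `compose_prompt`, and the `BASE` text are hypothetical stand-ins, and `section_body` is a helper invented here to scope the check to the `## Language` section. The real test may locate the section differently.

```rust
// Hypothetical stand-ins for the real types in crates/tui/src/prompts.rs.
#[derive(Clone, Copy, Debug)]
pub enum AppMode {
    Agent,
    Yolo,
    Plan,
}

#[derive(Clone, Copy, Debug)]
pub enum Personality {
    Calm,
}

// Assumed prompt shape; the real base.md text differs.
const BASE: &str = "You are a coding assistant.\n\n## Language\n\nYour `reasoning_content` and final reply follow the user's language.\n\n## Preamble Rhythm\n\n...\n";

/// Stand-in composer: the static base prompt followed by a per-mode suffix.
pub fn compose_prompt(mode: AppMode, _personality: Personality) -> String {
    format!("{}\n## Mode\n\n{:?}\n", BASE, mode)
}

/// Returns the text between `heading` and the next level-two heading
/// (or the end of the prompt), or None when `heading` is absent.
pub fn section_body<'a>(prompt: &'a str, heading: &str) -> Option<&'a str> {
    let start = prompt.find(heading)? + heading.len();
    let rest = &prompt[start..];
    let end = rest.find("\n## ").unwrap_or(rest.len());
    Some(&rest[..end])
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn language_section_mentions_reasoning_content() {
        for mode in [AppMode::Agent, AppMode::Yolo, AppMode::Plan] {
            let prompt = compose_prompt(mode, Personality::Calm);
            let body = section_body(&prompt, "## Language")
                .unwrap_or_else(|| panic!("## Language section missing from mode {mode:?}"));
            // Anchor on the API field name, not on the surrounding wording.
            assert!(
                body.contains("reasoning_content"),
                "## Language section no longer mentions reasoning_content in mode {mode:?}"
            );
        }
    }
}
```

Scoping the check to the section body means a mention of `reasoning_content` elsewhere in the prompt would not mask a weakened `## Language` section.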
@Hmbown Hmbown merged commit f176470 into main May 5, 2026
8 checks passed
@Hmbown Hmbown deleted the fix/588-mirror-user-language branch May 5, 2026 01:05
MMMarcinho pushed a commit to MMMarcinho/DeepSeek-TUI that referenced this pull request May 6, 2026
fix(prompts): mirror user's language in reasoning + reply (Hmbown#588)

Development

Successfully merging this pull request may close these issues.

Prompt: Chinese reasoning — assistant thinks in English even when user writes in Chinese
