
feat(prompt): tokenizer-style prompt stitching with thinking-prefix support#255

Merged
CJackHwang merged 1 commit into dev from codex/refactor-prompt-concatenation-using-tokenizer on Apr 12, 2026
Conversation

@CJackHwang
Owner

Motivation

  • Make prompt assembly follow a tokenizer/chat-template style so turn boundaries and generation prefixes are clearer and closer to native model formats.
  • Ensure assistant history and the generation prefix differ for thinking-capable models versus non-thinking models to support reasoner/chat model behavior.

Description

  • Add tokenizer-style framing markers and think markers: introduce <|begin▁of▁sentence|>, <think>, and </think>, and apply them when assembling prompts.
  • Add MessagesPrepareWithThinking(messages, thinkingEnabled) and keep MessagesPrepare as the non-thinking default wrapper for compatibility.
  • When building prompts, prepend a begin-of-sentence marker, emit historical assistant content with </think>{content}<|end▁of▁sentence|>, and append an assistant prefix (<think> for thinking models or </think> for non-thinking models) when the last role is not assistant.
  • Thread thinkingEnabled through OpenAI/Claude/Gemini prompt normalization (via deepseek) so resolved model capabilities select the appropriate suffix/prefix behavior, and update unit tests to assert the new markers and think-prefix semantics.

Testing

  • Ran go test ./internal/prompt ./internal/util ./internal/adapter/openai ./internal/adapter/claude ./internal/adapter/gemini and all affected unit tests passed.
  • Ran ./scripts/lint.sh, ./tests/scripts/check-refactor-line-gate.sh, ./tests/scripts/run-unit-all.sh, and npm run build --prefix webui, and these checks completed successfully.

Codex Task

@vercel

vercel bot commented Apr 12, 2026

The latest updates on your projects.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| ds2api | Ready | Preview, Comment | Apr 12, 2026 6:00am |

@CJackHwang CJackHwang marked this pull request as ready for review April 12, 2026 08:20

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 433a3a877d


Comment on lines +76 to +80
```go
thinkPrefix := closeThinkMarker
if thinkingEnabled {
	thinkPrefix = openThinkMarker
}
parts = append(parts, assistantMarker+thinkPrefix)
```


P2: Strip think delimiters from visible model output

This change starts every generation with raw <think>/</think> delimiters, but output cleanup in sanitizeLeakedOutput still strips only DeepSeek role/end markers and does not remove think tags. When upstream echoes prompt-template tokens (the same class of leakage already handled for <|Assistant|>/<|end...|>), those delimiters will now be returned to clients as literal answer text, polluting responses and exposing internal reasoning boundaries. Please add <think>/</think> (and the new BOS marker if needed) to output sanitization, or avoid emitting them into user-visible channels.


@CJackHwang CJackHwang merged commit d759804 into dev Apr 12, 2026
3 checks passed
@CJackHwang CJackHwang deleted the codex/refactor-prompt-concatenation-using-tokenizer branch April 12, 2026 09:14
