
Conversation

@hankbeasley

@hankbeasley hankbeasley commented Jan 31, 2025

Description

Some hosts for DeepSeek R1 return <Think> tags in the assistant responses. These reasoning tokens do not get classified by RooCode correctly.

I have noticed this behavior with both Azure and Kluster.ai.

This pull request parses these tokens correctly. I added one toggle to the UI to turn the parsing on or off, although I think it could safely be left on all the time unless someone needs to use <Think> tags for another purpose.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

I have tested with Kluster.ai.

Needs review

Note that if a response has both delta.content and delta.reasoning_content, the behavior changes with this pull request. Previously, the chunk would have been returned as "text." Now, it is returned as "reasoning." I think it's fine, as a delta should never have both properties.
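The new precedence can be sketched as follows. This is a minimal illustration, not the code in this PR; the `Delta` shape is a simplified stand-in for the OpenAI streaming delta, with `reasoning_content` as the DeepSeek-style extension:

```typescript
// Simplified stand-in for the streaming delta; reasoning_content is a
// DeepSeek-style extension, not part of the official OpenAI spec.
interface Delta {
	content?: string
	reasoning_content?: string
}

type Chunk = { type: "text" | "reasoning"; text: string }

// After this PR: reasoning_content wins when both fields are present.
function classifyDelta(delta: Delta): Chunk | undefined {
	if (delta.reasoning_content !== undefined) {
		return { type: "reasoning", text: delta.reasoning_content }
	}
	if (delta.content !== undefined) {
		return { type: "text", text: delta.content }
	}
	return undefined
}
```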


Important

Adds parsing for <think> tags in OpenAiHandler to classify reasoning tokens, with a UI toggle to enable/disable this feature.

  • Behavior:
    • Parses <think> tags in responses to classify reasoning tokens in OpenAiHandler in openai.ts.
    • Changes behavior when both delta.content and delta.reasoning_content are present, now classifying as "reasoning".
  • Classes:
    • Adds ThinkingTokenSeparator and PassThroughTokenSeparator to handle token parsing in openai.ts.
  • UI:
    • Adds a toggle in ApiOptions.tsx to enable/disable <think> tag parsing.
  • Models:
    • Adds thinkTokensInResponse to ModelInfo in api.ts.
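The ModelInfo addition can be sketched as below. This is a hypothetical excerpt; the real interface in api.ts has many more fields:

```typescript
// Hypothetical, trimmed-down sketch of ModelInfo with the new flag
// from this PR; the real interface in src/shared/api.ts is larger.
interface ModelInfo {
	maxTokens?: number
	contextWindow?: number
	// Whether the model emits <think> tags in its responses that
	// should be re-classified as reasoning chunks.
	thinkTokensInResponse?: boolean
}
```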

This description was created by Ellipsis for 3921e8e6eb0c9b3f1e8ba356ebcbe7694942925b. It will automatically update as commits are pushed.

@changeset-bot

changeset-bot bot commented Jan 31, 2025

⚠️ No Changeset found

Latest commit: 74308e6

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


Contributor

Consider extracting common styles and repetitive logic for handling input changes into reusable components or functions. This will improve code readability and maintainability. Refer to our Development Standards: https://www.notion.so/Development-Standards-59febcf8ead647fd9c2ec3f60c22f3df?pvs=4#11869ad2d5818094a05ef707e188c0d5

@mrubens
Collaborator

mrubens commented Jan 31, 2025

@Szpadel do you have time to take a look at this? You've been way closer to this than me. Thanks for opening the PR @hankbeasley!

@Mushoz

Mushoz commented Jan 31, 2025

Will this work for the distilled models as well? They suffer from the same issue.

@hankbeasley
Author

Will this work for the distilled models as well? They suffer from the same issue.

It should as they were also trained to respond with think tags.

@Szpadel
Contributor

Szpadel commented Jan 31, 2025

@hankbeasley I believe this will only work when the whole <think> tag is encoded as a single token, but I'm not sure if that's always true.

I believe we should check whether the chunk contains <think> (then split the chunk into a reasoning chunk and a text chunk) or ends with any prefix of it, like <thin, <, etc. In that case, save it to a buffer and do not emit anything, so the next chunk can tell us whether it completes the expected tag.

Second, I wonder if some models output <thinking> instead. https://github.com/RooVetGit/Roo-Code/blob/e8f0b35860506ea6a43418a1da3e5f9596fabafc/src/core/Cline.ts#L1006-L1007
The code currently strips such tags in post-processing.
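The buffering idea above can be sketched like this. The `holdPartialTag` helper is a hypothetical illustration, not code from this PR: it splits a stream buffer into text that is safe to emit now and a suffix that might be the start of the tag and must be held until the next chunk arrives:

```typescript
// Split a stream buffer into text that is safe to emit now and a
// suffix that could be the start of `tag` (e.g. "<think>"), which
// must be held back until the next chunk can disambiguate it.
// Hypothetical sketch of the buffering idea discussed in this thread.
function holdPartialTag(buffer: string, tag: string): { emit: string; hold: string } {
	// Find the longest suffix of `buffer` that is a proper prefix of `tag`.
	const max = Math.min(tag.length - 1, buffer.length)
	for (let len = max; len > 0; len--) {
		if (buffer.endsWith(tag.slice(0, len))) {
			return {
				emit: buffer.slice(0, buffer.length - len),
				hold: buffer.slice(buffer.length - len),
			}
		}
	}
	return { emit: buffer, hold: "" }
}
```

A full tag in the buffer is not held, since it can be processed immediately; only ambiguous prefixes are deferred.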

@hankbeasley hankbeasley reopened this Jan 31, 2025
@hankbeasley
Author

hankbeasley commented Jan 31, 2025

I am pretty sure that <think> is a single token in the model and therefore would never be output in pieces. I thought about parsing partial tags, but I would most likely introduce edge-case bugs on my first try and would need lots of unit tests.

Perhaps there is another model that outputs <thinking> instead of <think>? Instead of a boolean to turn the parsing on and off, we could use a string for the tag name.


@hankbeasley
Author

Second, I wonder if some models output <thinking> instead.

It looks like the <thinking> tags you are referring to are an attempt to get non-thinking models to reason via prompt engineering. See the prompt below.

https://github.com/RooVetGit/Roo-Code/blob/main/src/core/prompts/sections/objective.ts

@Szpadel
Contributor

Szpadel commented Jan 31, 2025

What do you think about doing it this way?
Szpadel@f3192d7

This should not have any corner cases, regardless of whether <think> is encoded as a single token or not.

Why I believe this might not be a single token: distilled models were trained to use reasoning in their responses, but their embeddings were kept as they were. This means there was no dedicated special token to start thinking, and the tag will most likely be constructed from multiple tokens.

@hankbeasley
Author

hankbeasley commented Jan 31, 2025

Link to partial token parser created by o1. I don't trust it though without tests for edge cases. Including here for reference.

https://gist.github.com/hankbeasley/ebc474f66cc489625f496effc607ac80

@hankbeasley
Author

What do you think about doing it this way? Szpadel@f3192d7

This should not have any corner cases, regardless of whether <think> is encoded as a single token or not.

Why I believe this might not be a single token: distilled models were trained to use reasoning in their responses, but their embeddings were kept as they were. This means there was no dedicated special token to start thinking, and the tag will most likely be constructed from multiple tokens.

Looks good to me. Good point about distilled models.

@Szpadel
Contributor

Szpadel commented Jan 31, 2025

Feel free to rebase on top of my commit and add the settings option; I do not want to steal the contribution from you :)

@hankbeasley
Author

Feel free to rebase on top of my commit and add the settings option; I do not want to steal the contribution from you :)

What do you think about where I put the UI option? I only started working with the project today, so I'm not sure if it's aligned with existing patterns.

@Szpadel
Contributor

Szpadel commented Feb 1, 2025

I'm no expert on UI, so I won't comment on that.
While thinking about stream handling, there is one more corner case: when the stream ends with a partial <think> tag, an incomplete response will be returned.

We need to flush the buffer when the stream ends.

I'll prepare code for this when I have access to a computer.
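The end-of-stream corner case can be sketched as a small generator. This is a hypothetical illustration of the flush idea, not the actual commit: any text held back because it looked like the start of a tag must still be emitted when the stream closes, or the response is silently truncated:

```typescript
// Sketch of the end-of-stream flush: text held back as a possible
// partial <think> tag is emitted once the stream closes.
// Hypothetical illustration, not the code from this PR.
function* flushOnEnd(chunks: Iterable<string>, tag: string): Generator<string> {
	let held = ""
	for (const chunk of chunks) {
		const buffer = held + chunk
		// Hold back the longest suffix that is a proper prefix of `tag`.
		let hold = ""
		for (let len = Math.min(tag.length - 1, buffer.length); len > 0; len--) {
			if (buffer.endsWith(tag.slice(0, len))) {
				hold = buffer.slice(buffer.length - len)
				break
			}
		}
		held = hold
		const emit = buffer.slice(0, buffer.length - hold.length)
		if (emit) yield emit
	}
	// The partial match was not a tag after all; flush it.
	if (held) yield held
}
```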

@Szpadel
Contributor

Szpadel commented Feb 1, 2025

One more idea: could we assume either of these:
a) a reasoning model always starts its output with the <think> token
b) there can be only one <think> tag containing reasoning in the output.

With those assumptions we could greatly reduce the chances of incorrect handling in case the output legitimately contains such a tag.

We would only be affected if a closing tag appeared inside the reasoning.
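Under those two assumptions, the whole parse collapses to finding the first closing tag, which can be sketched like this (hypothetical, non-streaming illustration of the simplification, not code from this PR):

```typescript
// If reasoning always starts at the very beginning of the output and
// there is at most one <think>...</think> block, parsing reduces to
// locating the first closing tag. Hypothetical sketch.
function splitReasoning(full: string): { reasoning: string; text: string } {
	if (!full.startsWith("<think>")) {
		return { reasoning: "", text: full }
	}
	const close = full.indexOf("</think>")
	if (close === -1) {
		// Unterminated reasoning: treat the rest of the output as reasoning.
		return { reasoning: full.slice("<think>".length), text: "" }
	}
	return {
		reasoning: full.slice("<think>".length, close),
		text: full.slice(close + "</think>".length),
	}
}
```

As noted above, this only goes wrong if a literal </think> appears inside the reasoning itself.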

@Szpadel
Contributor

Szpadel commented Feb 1, 2025

Flushing: Szpadel@e0e2bbe

@Szpadel
Contributor

Szpadel commented Feb 4, 2025

@hankbeasley could you pull my commit and rebase (or fix conflicts)?
It would be nice to merge this.

@hankbeasley hankbeasley closed this Feb 6, 2025
@hankbeasley
Author

hankbeasley commented Feb 6, 2025

@Szpadel I merged your flush change. I haven't had time to test these changes, though. I am fine with you closing this pull request and opening a new one.

@hankbeasley hankbeasley reopened this Feb 6, 2025
@jcbdev
Contributor

jcbdev commented Feb 10, 2025

Don't know if it's still relevant to the conversation above, but the tag is a specific token for the R1 distill models, as you can see from the tokenizer.json on Hugging Face - https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B/blob/main/tokenizer.json

{
  "id": 128013,
  "content": "<think>",
  "single_word": false,
  "lstrip": false,
  "rstrip": false,
  "normalized": false,
  "special": false
},
{
  "id": 128014,
  "content": "</think>",
  "single_word": false,
  "lstrip": false,
  "rstrip": false,
  "normalized": false,
  "special": false
},

@hannesrudolph hannesrudolph moved this to To triage in Roo Code Roadmap Mar 5, 2025
@hannesrudolph hannesrudolph moved this from To triage to PR - Needs Approval in Roo Code Roadmap Mar 6, 2025
@mrubens mrubens moved this from PR [Unverified] to PR [Deferred] in Roo Code Roadmap Mar 10, 2025
@mrubens
Collaborator

mrubens commented Mar 14, 2025

Pretty sure #1566 covers this?

@mrubens mrubens closed this Mar 14, 2025
@github-project-automation github-project-automation bot moved this from PR [Deferred] to Done in Roo Code Roadmap Mar 14, 2025