-
Notifications
You must be signed in to change notification settings - Fork 2.8k
Implement token parsing for <think> tags in responses (Deep Seek R1) #678
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider extracting common styles and repetitive logic for handling input changes into reusable components or functions. This will improve code readability and maintainability. Refer to our Development Standards: https://www.notion.so/Development-Standards-59febcf8ead647fd9c2ec3f60c22f3df?pvs=4#11869ad2d5818094a05ef707e188c0d5
|
@Szpadel do you have time to take a look at this? You've been way closer to this than me. Thanks for opening the PR @hankbeasley! |
|
Will this work for the distilled models as well? They suffer from the same issue. |
It should as they were also trained to respond with think tags. |
|
@hankbeasley I believe this will only work when whole I believe we should check if chunk contains Second, I wonder if some models output |
|
I am pretty sure that <think> equals one token in the model and therefore would never be output in pieces. I thought about parsing pieces but I most likely would introduce a bug for edge cases on my first try and would need lots of unit tests. Perhaps there is another model that outputs <thinking> instead of <think> ? Instead of a boolean to turn on the parsing on and off we could use a string for the tag name?
|
It looks like the [ ](https://github.com/RooVetGit/Roo-Code/blob/main/src/core/prompts/sections/objective.ts) |
|
What do you think about doing it this way? this should not contain any corner cases, independently if think is used as single token or Why I believe this might not be using single token: |
|
Link to partial token parser created by o1. I don't trust it though without tests for edge cases. Including here for reference. https://gist.github.com/hankbeasley/ebc474f66cc489625f496effc607ac80 |
Looks good to me. Good point about distilled models. |
|
Fell free to rebase on top of my commit and add settings option, I do not want to steal contribution from you :) |
what do you think about where I put the UI option? I only started working with the project today so not sure if it's aligned with patterns. |
506a3fb to
b784715
Compare
|
I'm no expert on UI, so I won't comment on that. We need to flush buffer when steam is ended I'll prepare code for this when I'll have access to computer |
|
One more idea: could we assume any of this: With those we could greatly reduce chances of incorrect handling in case output should contain such tag. We would be only affected in case there would be closing tag inside reasoning |
|
Flushing: Szpadel@e0e2bbe |
|
@hankbeasley could you pull my commit and rebase(or fix conflicts)? |
b784715 to
92561aa
Compare
|
@Szpadel I merged your flush change. I haven't had time to test these changes though. I am fine with you closing this pull request and reopening a new one. |
|
Don't know if it's still relevant to the conversation above but the the tag is a a specific token for the R1 distill models as you can see from the tokenizer.json on hugging face - https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Llama-70B/blob/main/tokenizer.json |
|
Pretty sure #1566 covers this? |
Description
Some hosts for DeepSeek R1 return <Think> tags in the assistant responses. These reasoning tokens do not get classified by RooCode correctly.
I have noticed this behavior with both Azure and Kluster.ai.
This pull request parses these tokens correctly. I added one toggle to the UI to turn the parsing on or off although I think it could safely be used all the time unless someone needs to use <Think> tags for another purpose
Type of change
How Has This Been Tested?
I have tested with Kluster.ai
Needs review
Note that if a response has both delta.content and delta.reasoning_content, the behavior is changed with this pull request. Previously, the chunk would have been returned as "text." Now, it would be returned as "reasoning." I think its fine as I don't think it should ever have both properties.
Important
Adds parsing for
<think>tags inOpenAiHandlerto classify reasoning tokens, with a UI toggle to enable/disable this feature.<think>tags in responses to classify reasoning tokens inOpenAiHandlerinopenai.ts.delta.contentanddelta.reasoning_contentare present, now classifying as "reasoning".ThinkingTokenSeparatorandPassThroughTokenSeparatorto handle token parsing inopenai.ts.ApiOptions.tsxto enable/disable<think>tag parsing.thinkTokensInResponsetoModelInfoinapi.ts.This description was created by
for 3921e8e6eb0c9b3f1e8ba356ebcbe7694942925b. It will automatically update as commits are pushed.