Fix GPT-OSS Harmony format end token handling crash#15195
Closed
trvon wants to merge 4 commits intoggml-org:masterfrom
trvon:fix-gpt-oss-end-token
Closed
Fix GPT-OSS Harmony format end token handling crash#15195trvon wants to merge 4 commits intoggml-org:masterfrom trvon:fix-gpt-oss-end-token
trvon wants to merge 4 commits intoggml-org:masterfrom
trvon:fix-gpt-oss-end-token
Conversation
The GPT-OSS model uses OpenAI's Harmony format which includes an <|end|> token after the final message. The parser wasn't handling this token properly, causing finish() to throw 'Unexpected content at end of input'. Modified the GPT-OSS parser to strip the <|end|> token from the content if present. Added comprehensive tests for GPT-OSS format with and without the end token to ensure backward compatibility.
… by unconsumed <|end|> tokens in the chat parser.
There was a problem hiding this comment.
Pull Request Overview
Fix server crash when using GPT-OSS models with tools that was caused by unconsumed <|end|> tokens in the chat parser.
- Modified GPT-OSS chat parser to properly handle
<|end|>tokens in both tool and non-tool scenarios - Added reasoning format parameter passing from templates to server configuration
- Added comprehensive test coverage for GPT-OSS Harmony format parsing edge cases
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| common/chat.h | Added reasoning_format field to common_chat_params struct for template-specific configuration |
| common/chat.cpp | Fixed GPT-OSS parser to consume remaining content and handle `< |
| tools/server/utils.hpp | Added reasoning_format parameter passing from chat templates to server parameters |
| tools/server/server.cpp | Updated server to use template-provided reasoning format with proper fallback logic |
| tests/test-chat-parser.cpp | Added comprehensive test suite for GPT-OSS Harmony format edge cases and token handling |
| } else { | ||
| // No <|end|> token, consume everything remaining | ||
| if (!builder.syntax().parse_tool_calls) { | ||
| builder.add_content(builder.consume_rest()); |
There was a problem hiding this comment.
This line is unreachable because it's inside an else block that follows a return statement that was removed. The logic flow suggests this should be part of the else clause for the if (!builder.syntax().parse_tool_calls) condition.
Author
There was a problem hiding this comment.
I made changes that address this in subsequent commits.
Author
|
I see and I am merging the changes from #15181 against my own in another branch. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fix server crash when using GPT-OSS models with tools that was caused by unconsumed
<|end|>tokens in the chat parser.Changes
Module:
chat-parsercommon_chat_parse_gpt_oss()only consumed content whenparse_tool_calls=false, leaving<|end|>token unconsumed whenparse_tool_calls=true(tools enabled)parse_tool_callscasesManual Validation
Before Fix:
After Fix: