Fix: Revert showing control tokens by default for server OpenAI Chat completions #6860
I provided an alternative solution that reverts the change to the default behavior of `llama_token_to_piece`, and added an overridden declaration of `llama_token_to_piece` in `common/common.cpp` that takes a `bool special` param to toggle showing control tokens.
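For reference, a minimal sketch of what such an overridden declaration could look like, reconstructed from this PR's commit messages; the exact signature and body in the merged code may differ, and the core C API `llama_token_to_piece(model, token, buf, length, special)` is assumed to be the form introduced in #6807:

```cpp
// common.h - overridden declaration that takes a "bool special" param to
// toggle rendering of control tokens (sketch; may differ from the merged code)
std::string llama_token_to_piece(const struct llama_context * ctx, llama_token token, bool special);

// common.cpp - forwards to the core C API, passing the flag through
std::string llama_token_to_piece(const struct llama_context * ctx, llama_token token, bool special) {
    std::vector<char> result(8, 0);
    const int n_tokens = llama_token_to_piece(llama_get_model(ctx), token, result.data(), result.size(), special);
    if (n_tokens < 0) {
        // a negative return value is the required buffer size: resize and retry
        result.resize(-n_tokens);
        const int check = llama_token_to_piece(llama_get_model(ctx), token, result.data(), result.size(), special);
        GGML_ASSERT(check == -n_tokens);
    } else {
        result.resize(n_tokens);
    }
    return std::string(result.data(), result.size());
}
```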
I am open to comments, concerns, and/or complaints about whether this is the correct way to fix this problem.
Hmm, I am not sure what happened, but on the most up-to-date llama.cpp …
…g#6860)

* fix: revert showing control tokens by default
* feat: revert changes to default behavior of llama_token_to_piece; provide overridden declaration to receive "bool special" param to toggle showing control tokens
* feat: use the overridden declaration of llama_token_to_piece from common/common.cpp to specify "false" so that control tokens are not shown in chat completion responses
* common : simplify

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

In #6807 @ggerganov added the ability to toggle showing control tokens (e.g. EOS tokens). In `common.cpp` this was set to `true` by default in two places, which broke the `/v1/chat/completions` endpoint as described in #6859 - in short, the OpenAI chat completions endpoint response now includes the EOS / stop token, which is different from past / expected behavior. I have confirmed that reverting the booleans to `false` in the two places in `common.cpp` fixes this behavior.

While this PR fixes the breaking change, it may affect behavior that depends on #6807's new default of `true` in other places. This may need to be investigated further, but I propose reverting the change for now to fix the broken `/v1/chat/completions` behavior.

s/o @QueryType for opening #6847 as well, which was caused by the same underlying issue.
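To make the effect of those two `false` booleans concrete, here is a minimal sketch of a server-side detokenization loop using the overridden wrapper; the helper name `detokenize_for_chat` is hypothetical and not taken from the PR, only the `llama_token_to_piece(ctx, token, special)` wrapper signature is:

```cpp
#include <string>
#include <vector>

#include "common.h"

// Hypothetical helper (not the PR's actual code): when building the text for a
// /v1/chat/completions response, pass special=false so that control tokens
// such as the EOS / stop token are not rendered into the output.
static std::string detokenize_for_chat(llama_context * ctx, const std::vector<llama_token> & tokens) {
    std::string out;
    for (const llama_token tok : tokens) {
        out += llama_token_to_piece(ctx, tok, /*special=*/false);
    }
    return out;
}
```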
API Response before the change (ChatML model):
API Response before the change (Mistral model / llama2 template):
Correct API response after this change:
(note the absence of control tokens)
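The actual response bodies are not reproduced here, so the following is an illustrative reconstruction rather than the PR's real output; the message text is invented, and only the leaked control tokens (`<|im_end|>` for ChatML, `</s>` for Mistral / llama2 templates) reflect the reported issue. Before the fix, the EOS token appeared in the `content` field:

```json
{"choices": [{"message": {"role": "assistant", "content": "Hello! How can I help you today?<|im_end|>"}}]}
```

After the fix, control tokens are no longer rendered:

```json
{"choices": [{"message": {"role": "assistant", "content": "Hello! How can I help you today?"}}]}
```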