
server : correct index on finish in OAI completion streams #20226

Merged: ngxson merged 1 commit into ggml-org:master from vitri-ent:complete-index on Mar 8, 2026

Conversation

@decahedron1 (Contributor)

When calling /v1/chat/completions with "stream": true and an n value greater than 1, the final stop chunk always had its index set to 0 rather than the index of the choice that finished, making it impossible to tell which stream had actually completed.
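To illustrate the behavior this PR fixes, here is a minimal sketch of how a client distinguishes parallel streams by the index field in each chunk. The SSE lines below are hypothetical sample data for n=2, shown with correct indices on the finish chunks (before the fix, both finish chunks would have reported index 0):

```python
import json

# Hypothetical excerpt of an SSE stream from /v1/chat/completions with n=2.
# With the fix, each finish chunk carries the index of the stream it ends.
sse_lines = [
    'data: {"choices":[{"index":0,"delta":{"content":"Hi"},"finish_reason":null}]}',
    'data: {"choices":[{"index":1,"delta":{"content":"Hey"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    'data: {"choices":[{"index":1,"delta":{},"finish_reason":"stop"}]}',
    "data: [DONE]",
]

finished = set()
for line in sse_lines:
    payload = line.removeprefix("data: ")
    if payload == "[DONE]":
        break
    for choice in json.loads(payload)["choices"]:
        if choice["finish_reason"] is not None:
            # Record which stream reported completion.
            finished.add(choice["index"])

print(sorted(finished))  # with correct indices, both streams are accounted for: [0, 1]
```

With the pre-fix behavior, `finished` would collapse to `{0}` and the client could never tell that stream 1 had also stopped.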

@ngxson ngxson merged commit ff52ee9 into ggml-org:master Mar 8, 2026
66 of 75 checks passed
@ggerganov (Member)

@decahedron1 Thanks for the fix. If you would like to contribute more, adding a basic server test to https://github.com/ggml-org/llama.cpp/blob/master/tools/server/tests/unit/test_chat_completion.py would be appreciated. It could send a chat completion request with stream = true and n > 1 and assert that all n indices are observed in the streamed responses.
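The assertion logic such a test could use might look like the sketch below. This is decoupled from the llama.cpp server test harness (whose fixtures and helpers are not reproduced here); `chunks` stands in for the parsed streaming responses the server would return, and `assert_all_indices_finish` is a hypothetical helper name:

```python
# Hypothetical helper: verify that every choice index in [0, n) both
# produces output and eventually reports a finish_reason.
def assert_all_indices_finish(chunks, n):
    seen, finished = set(), set()
    for chunk in chunks:
        for choice in chunk["choices"]:
            idx = choice["index"]
            assert 0 <= idx < n, f"index {idx} out of range for n={n}"
            seen.add(idx)
            if choice.get("finish_reason") is not None:
                finished.add(idx)
    assert seen == set(range(n)), "not all streams produced output"
    assert finished == set(range(n)), "some streams never reported a finish"

# Simulated stream for n=2 with the fix applied:
chunks = [
    {"choices": [{"index": 0, "delta": {"content": "a"}, "finish_reason": None}]},
    {"choices": [{"index": 1, "delta": {"content": "b"}, "finish_reason": None}]},
    {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]},
    {"choices": [{"index": 1, "delta": {}, "finish_reason": "stop"}]},
]
assert_all_indices_finish(chunks, n=2)
print("ok")
```

A real test in the suite would obtain `chunks` from an actual streaming request against the test server rather than from literals.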

bartowski1182 pushed a commit to bartowski1182/llama.cpp that referenced this pull request Mar 10, 2026
Ethan-a2 pushed a commit to Ethan-a2/llama.cpp that referenced this pull request Mar 20, 2026
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026

3 participants