[Mirror] server: /v1/responses (partial) #85
📝 Walkthrough

Adds OpenAI Responses API support: a new /v1/responses route, a conversion utility that translates Responses requests into Chat Completions format, a new TASK_RESPONSE_TYPE_OAI_RESP response type with streaming/state handling, SSE formatting, unit tests, and dependency bumps for the OpenAI client.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Server as HTTP Server
    participant Conv as Conversion Layer
    participant Handler as Completions Handler
    participant Task as Task Processor
    Client->>Server: POST /v1/responses
    Server->>Conv: convert_responses_to_chatcmpl(request_body)
    Conv-->>Server: chat-completions-format JSON
    Server->>Handler: handle_completions_impl(..., TASK_RESPONSE_TYPE_OAI_RESP)
    Handler->>Task: create/process task (streaming state)
    Task->>Handler: emit SSE events or final JSON (using format_oai_resp_sse)
    Handler-->>Server: stream or response payload
    Server-->>Client: SSE stream or JSON response
```
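The conversion step in the diagram can be sketched roughly as follows. This is an illustrative stand-in, not the actual `convert_responses_to_chatcmpl` implementation: it only covers the plain-string "input" case, while the real utility also handles structured content, tools, and sampling options.

```cpp
#include <cassert>
#include <string>

// Illustrative sketch only: a Responses-style request with a plain-string
// "input" maps onto a chat-completions request with one user message.
// The real conversion builds nlohmann::json objects and covers far more.
static std::string responses_input_to_chatcmpl(const std::string & model,
                                               const std::string & input) {
    return std::string("{\"model\":\"") + model + "\","
           "\"messages\":[{\"role\":\"user\",\"content\":\"" + input + "\"}]}";
}
```

The resulting JSON can then be fed to the existing chat-completions handler unchanged, which is what the diagram's `handle_completions_impl(...)` arrow represents.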
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~40 minutes
🚥 Pre-merge checks: ✅ 1 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)

✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 4
🤖 Fix all issues with AI agents
In `@tools/server/server-common.cpp`:
- Around lines 1156-1173: The branch that handles "input_file" pushes a content entry with {"type":"file"} into chatcmpl_content, which oaicompat_chat_params_parse does not accept (it only supports text, image_url, and input_audio). Update server-common.cpp to reject "input_file" early instead of creating an unsupported content type: in the else-if for type == "input_file" (the block that checks for file_url/file_data/filename and pushes into chatcmpl_content), throw a clear std::invalid_argument such as "'input_file' is not supported for chat content; use an alternative upload/attachment flow", and remove the code that injects {"type":"file"} so parsing via oaicompat_chat_params_parse does not fail later. Alternatively, if you prefer to support files, implement corresponding handling in oaicompat_chat_params_parse to accept "file" content types.
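A minimal sketch of the suggested early rejection. The helper name and call shape are illustrative; in the real code this check would live directly in the type == "input_file" branch of server-common.cpp:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Illustrative helper: reject "input_file" content before any
// {"type":"file"} entry can reach oaicompat_chat_params_parse.
static void check_responses_content_type(const std::string & type) {
    if (type == "input_file") {
        throw std::invalid_argument(
            "'input_file' is not supported for chat content; "
            "use an alternative upload/attachment flow");
    }
    // "input_text", "input_image", and "input_audio" fall through to the
    // normal conversion path.
}
```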
In `@tools/server/server-task.cpp`:
- Around lines 879-899: The final streaming reasoning output (created when oaicompat_msg.reasoning_content is non-empty) is missing the "status" field. Update the output_item construction (the json assigned to output_item in the block that builds the "response.output_item.done" event) to include "status":"completed" so the streaming final item matches the non-streaming schema and the intermediate "in_progress" items.
- Around lines 806-854: The non-streaming output in server_task_result_cmpl_final::to_json_oaicompat_resp() currently uses the raw tool_call.id for the function_call "call_id", while to_json_oaicompat_resp_stream() prefixes IDs with "fc_", producing inconsistent IDs. Add a small helper (e.g., normalize_fc_id(const std::string&)) that returns tool_call.id if it is already prefixed and prepends "fc_" otherwise; replace direct uses of tool_call.id in to_json_oaicompat_resp() and the corresponding places in to_json_oaicompat_resp_stream() with calls to this helper, and wrap common_chat_tool_call.id via normalize_fc_id when building the {"call_id", ...} JSON field so both streaming and non-streaming paths produce consistent "fc_"-prefixed IDs.
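The suggested helper could look like this minimal sketch (the signature comes from the review comment; its placement in server-task.cpp is up to the author):

```cpp
#include <cassert>
#include <string>

// Sketch of the suggested normalize_fc_id helper: return IDs unchanged when
// they already carry the "fc_" prefix, otherwise prepend it, so streaming
// and non-streaming paths agree on function-call IDs.
static std::string normalize_fc_id(const std::string & id) {
    if (id.rfind("fc_", 0) == 0) {
        return id; // already prefixed
    }
    return "fc_" + id;
}
```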
- Around lines 1502-1527: The code currently stores a single oai_resp_fc_id and overwrites it when multiple function-call deltas interleave. Instead, maintain a map keyed by diff.tool_call_index (e.g., std::unordered_map<int, std::string> tool_call_ids) and update/look up entries when processing diff.tool_call_delta in the functions that build events (similar to to_json_anthropic()). On name/create deltas, set tool_call_ids[diff.tool_call_index] = diff.tool_call_delta.id (or "fc_" + id for item_id); when argument deltas arrive, use tool_call_ids.at(diff.tool_call_index) to produce the correct "item_id" so each interleaved call keeps its own ID. Ensure any cleanup (erase) happens when a call completes.
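A hedged sketch of the per-index tracking described above. The struct and member names are illustrative, not the actual server_task_result fields:

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Illustrative tracker: one ID per tool-call index, so interleaved
// function-call deltas never clobber each other's "item_id".
struct tool_call_id_tracker {
    std::unordered_map<int, std::string> tool_call_ids;

    // Name/create delta: record the "fc_"-prefixed ID for this index.
    void on_id_delta(int index, const std::string & id) {
        tool_call_ids[index] = "fc_" + id;
    }

    // Argument delta: look up the stored ID for this index.
    const std::string & item_id(int index) const {
        return tool_call_ids.at(index);
    }

    // Call completed: drop the entry.
    void on_done(int index) {
        tool_call_ids.erase(index);
    }
};
```

With this shape, two calls streaming their arguments in alternation each resolve to their own ID instead of the last one written.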
🧹 Nitpick comments (1)
tools/server/tests/unit/test_compat_oai_responses.py (1)
1-3: Prefer explicit imports over `from utils import *`.
Ruff already flags this (F403/F405). Explicit imports reduce namespace ambiguity in tests.

♻️ Suggested change

```diff
-import pytest
-from openai import OpenAI
-from utils import *
+import pytest
+from openai import OpenAI
+from utils import ServerProcess, ServerPreset, match_regex
```
@coderabbitai review
✅ Actions performed: Review triggered.
Mirror from upstream PR: ggml-org#18486
Note: @coderabbitai use my 'Mirror PR' preset for reviewing this.
Summary by CodeRabbit

- New Features
- Documentation
- Tests
- Chores
✏️ Tip: You can customize this high-level summary in your review settings.