Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
```python
if "base64" in url:
    image_data = re.sub("^data:image/.+;base64,", "", url)
    image = Image.open(BytesIO(base64.b64decode(image_data)))
    file = tempfile.NamedTemporaryFile(suffix=".png", delete=False)
    image.save(file.name)
    url = file.name
```

```python
audio_data = base64.b64decode(input_audio["data"])
suffix = f".{input_audio.get('format', 'wav')}" if isinstance(input_audio, dict) else ".wav"
file = tempfile.NamedTemporaryFile(suffix=suffix, delete=False)
file.write(audio_data)
file.flush()
parsed["content"].append({"type": "audio", "url": file.name})
```
same here, we delegate this to the processor
run-slow: cli

This comment contains models: ["cli"]
zucchini-nlp left a comment
Okay for me, let's just check if breaking "dict-format-inputs" is fine
3. **Messages list** (multi-turn, with ``role`` keys):
   ``input=[{"role": "user", "content": [...]}, {"role": "assistant", ...}]``
   → passed through as-is.
Should multi-turn be added to the docs? I see that this PR shows input type #2, but maybe I'm just not aware and it's already documented somewhere else.
Yeah, we show the type 2 input as it is simpler. I'll add a small section about multi-turn conversations, though I think most users will likely use an existing client.
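For reference, a multi-turn messages-list input (type 3 above) might look like the following sketch. The model name is a placeholder, and the content ``type`` values follow the OpenAI Responses schema as described in this PR; treat the exact field names as assumptions.

```python
# Hedged sketch: a multi-turn "messages list" payload for the Responses API.
# The model name and texts are placeholders, not values from this PR.
payload = {
    "model": "Qwen/Qwen2.5-0.5B-Instruct",
    "input": [
        {"role": "user", "content": [{"type": "input_text", "text": "Hi, who are you?"}]},
        {"role": "assistant", "content": [{"type": "output_text", "text": "I am an assistant."}]},
        {"role": "user", "content": [{"type": "input_text", "text": "What can you do?"}]},
    ],
}
# Per the docs snippet above, a list of role-keyed dicts is passed through as-is.
```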
```python
elif isinstance(inp, dict):
    messages = [{"role": "system", "content": instructions}] if instructions else []
    messages.append(inp)
```
flagging that this looks breaking, up to you and Lysandre tho 😄
Yeah, it's fine! The Responses API is supposed to only allow a string or a list. It was my mistake to add it in the first place.
OpenAI-compatible tool-calling payloads may encode assistant tool-call messages with explicit `null` content rather than omitting the field. The serve path already handled missing content, but iterated over `None` for the explicit-null case and raised `TypeError` before the tool calls reached `apply_chat_template`. This keeps the fix intentionally small: normalize `None` to an empty content list and pin the behavior with focused LLM/VLM regression tests.

- Constraint: recent serve/tool-call fixes (huggingface#45348, huggingface#45418, huggingface#45463) make compatibility regressions in this path especially review-sensitive.
- Rejected: a broader serve content refactor; unnecessary scope for a one-line crash fix.
- Confidence: high. Scope-risk: narrow. Reversibility: clean.
- Directive: keep follow-up discussion focused on the explicit null-content case; do not mix unrelated serve cleanup into the PR.
- Tested: direct local repro before/after; `pytest -q tests/cli/test_serve.py -k 'tool_use_fields_forwarded'`.
- Not tested: full repo CI; end-to-end HTTP serve session.
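The normalization described above can be sketched as follows. The helper name is hypothetical (the actual fix lives inline in the serve path); only the `None`-to-empty-list behavior is taken from the description.

```python
def normalize_tool_call_message(message: dict) -> dict:
    """Hedged sketch of the one-line fix: OpenAI-compatible clients may send
    assistant tool-call messages with explicit ``content: None`` instead of
    omitting the field. Normalize that to an empty list so later iteration
    over the content does not raise TypeError.

    Note: this function name is illustrative, not the real helper in serve.py.
    """
    if message.get("content") is None:
        # Copy to avoid mutating the caller's message dict.
        message = {**message, "content": []}
    return message

# An assistant message carrying tool calls with explicit null content.
msg = {"role": "assistant", "content": None, "tool_calls": [{"id": "call_1"}]}
normalize_tool_call_message(msg)["content"]  # → []
```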
What does this PR do?
This PR updates the support for the Responses API. I mainly based it on the Chat Completions API, but there are minor differences, e.g. `input_image` vs `image_url` for `type`, or `input_text` vs `text`, and different accepted input structures. I've also simplified the processing of the inputs a bit.
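The naming differences mentioned in the description can be sketched side by side. The URLs and texts are placeholders; the field names follow the respective OpenAI schemas as referenced in this PR.

```python
# Hedged sketch: the same user content expressed in the two schemas.
# Chat Completions uses "text"/"image_url" content types:
chat_completions_content = [
    {"type": "text", "text": "Describe this image."},
    {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
]

# The Responses API uses "input_text"/"input_image" instead:
responses_content = [
    {"type": "input_text", "text": "Describe this image."},
    {"type": "input_image", "image_url": "https://example.com/cat.png"},
]
```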