Drop content=None from messages in apply_chat_template #45422

Open
qgallouedec wants to merge 6 commits into main from drop-content

Conversation

@qgallouedec
Member

@qgallouedec qgallouedec commented Apr 14, 2026

In apply_chat_template, drop the content key from messages when its value is None before passing to the Jinja template.
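
As a rough illustration of the normalization described above, here is a minimal sketch. It is not the PR's actual diff: the helper name `drop_none_content` is hypothetical, and the sketch assumes messages are plain dicts.

```python
from typing import Any


def drop_none_content(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return a new list in which messages with content=None lose the "content" key.

    Hypothetical helper for illustration only; the real change lives inside
    apply_chat_template and may be written differently.
    """
    normalized = []
    for message in messages:
        if "content" in message and message["content"] is None:
            # Build a new dict so the caller's message is left untouched.
            message = {key: value for key, value in message.items() if key != "content"}
        normalized.append(message)
    return normalized


messages = [
    {"role": "user", "content": "What's the weather in Paris?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [
            {
                "type": "function",
                "function": {"name": "get_weather", "arguments": {"city": "Paris"}},
            }
        ],
    },
]

normalized = drop_none_content(messages)
print(normalized[1].keys())
```

Only an explicit None is dropped in this sketch; an empty string is still passed through to the template as-is.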

Why this is a bug fix, not a breaking change

content=None means "there is no content"; it is semantically identical to the key being absent. No caller sets content=None expecting the literal string "None" to appear in the output, or expecting the output to differ from what it would be if the key were absent.

Yet today, several templates crash or misbehave:

After this change, out of the 20 tool-calling models tested:

  • 12 are unaffected
  • 6 are fixed (crashes or literal "None" → correct output)
  • 1 edge case (LFM2.5-VL — separate template bug)
  • 1 would regress (DeepSeek-R1 — accepts content=None but crashes on absent key, which IMO is itself a template bug to fix on the Hub repo).
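
The outcome categories above can be reproduced with toy Jinja templates. The four templates below are hypothetical and are not taken from any of the tested models; the sketch also assumes a plain `jinja2.Environment`, whereas transformers builds its own environment, so details may differ.

```python
from jinja2 import Environment

env = Environment()

# Hypothetical toy templates, one per behavior discussed above.
TEMPLATES = {
    # Prints whatever is there: None becomes the literal string "None".
    "naive": "{{ message['content'] }}",
    # Concatenates unconditionally: fails for None and for a missing key.
    "concat": "{{ message['content'] + '<eos>' }}",
    # Handles None explicitly but assumes the key is always present.
    "none_only": (
        "{% if message['content'] is none %}{% else %}"
        "{{ message['content'] + '<eos>' }}{% endif %}"
    ),
    # A truthiness check covers both None and a missing key.
    "robust": "{% if message['content'] %}{{ message['content'] }}{% endif %}",
}


def render(name: str, message: dict) -> str:
    """Render one toy template, reporting the exception type on failure."""
    try:
        return repr(env.from_string(TEMPLATES[name]).render(message=message))
    except Exception as exc:
        return f"raises {type(exc).__name__}"


with_none = {"role": "assistant", "content": None}
without_key = {"role": "assistant"}

results = {
    name: (render(name, with_none), render(name, without_key)) for name in TEMPLATES
}
for name, (none_result, absent_result) in results.items():
    print(f"{name:10} content=None: {none_result:22} key absent: {absent_result}")
```

In this sketch, "naive" and "concat" behave better or no worse once the key is dropped, "none_only" is the pattern that regresses, and "robust" is unaffected either way.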

Why not just leave it to template authors?

Because content=None is a real-world input (it's what the OpenAI API returns for tool-call-only messages)

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather for a city",
            "parameters": {"type": "object", "properties": {"city": {"type": "string"}}},
        },
    }],
)

msg = response.choices[0].message
msg_dict = msg.model_dump()
print(f"\nmsg_dict = {msg_dict}")
# Output:
# msg_dict = {'content': None, 'refusal': None, 'role': 'assistant', 'annotations': [], 'audio': None, 'function_call': None, 'tool_calls': [{'id': 'call_Im08OArU1YY2mKi7ntPxdm22', 'function': {'arguments': '{"city":"Paris"}', 'name': 'get_weather'}, 'type': 'function'}]}

and expecting every template author to handle it correctly is a losing game — 8 out of 20 already don't. One normalization line in apply_chat_template fixes all of them at once and prevents future templates from hitting the same issue.

Part of the broader discussion in #45419.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member

@Rocketknight1 Rocketknight1 left a comment

Sorry for losing track of this one! I think it's a good fix, and a good path to standardization, but DeepSeek-R1 is a major model, even if it's a little out of date now that V4 exists, and this change would break it. I think we should patch its template on the Hub first, then merge this!

3 participants