
Automatically add generation tags to chat template for assistant_only_loss=True training (TRL Issue #4879)#4900

Draft
Neelectric wants to merge 4 commits into huggingface:main from Neelectric:auto-chat_template-generation_tags

Conversation

@Neelectric

What does this PR do?

This is a first step toward implementing the feature requested in #4879. For now, the approach is to recognise the model type and replace the chat template with a 'fixed' version. This would help for the most common models (Qwen, Llama, OLMo, etc.), but I wonder if there's a smarter way to go about this that automatically recognises where the assistant starts messaging, which is what we want to train on.

Implements #4879 (feature request)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@qgallouedec

# https://huggingface.co/Qwen/Qwen3-8B/discussions/14 and https://github.com/HarryMayne/qwen_3_chat_templates
if tokenizer.chat_template == qwen3_chat_template:
    tokenizer.chat_template = qwen3_chat_template_all_assistant
# probably want to add support here for most other popular model families like Llama 2/3/3.1, OLMo 2/3/3.1 etc... is there a way to do this fully automatically?
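One way to generalise this beyond Qwen3 is a lookup table from known templates to their generation-tagged equivalents. A minimal sketch of that idea follows; the `KNOWN_TEMPLATE_FIXES` mapping, the placeholder template strings, and the `maybe_fix_chat_template` helper name are all hypothetical and not part of this PR:

```python
# Map each known chat template to an equivalent version whose assistant turns
# are wrapped in {% generation %} ... {% endgeneration %} tags.
# The strings below are placeholders standing in for the real Jinja templates.
KNOWN_TEMPLATE_FIXES = {
    "<qwen3 template>": "<qwen3 template with generation tags>",
    "<llama3 template>": "<llama3 template with generation tags>",
}


def maybe_fix_chat_template(template):
    """Return a generation-tagged version of a known template, else unchanged.

    Templates that already contain {% generation %} are left as-is.
    """
    if template is not None and "{% generation %}" not in template:
        return KNOWN_TEMPLATE_FIXES.get(template, template)
    return template
```

Unknown templates pass through untouched, so the worst case is the current behaviour rather than a silently broken template.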
Member

I don't think we can do it 100% automatically, but there aren't thousands of different chat templates out there: if we support the top 5, we should cover a good majority of use cases.

@qgallouedec
Member

This looks solid so far. A couple things still missing:

  • A unit test.
  • In SFTTrainer, we should call this function when assistant_only_loss is enabled (and when {% generation %} isn't in the chat template).

Also, in SFTConfig, we should explicitly mention that toggling assistant_only_loss may modify the tokenizer’s chat template.
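The guard described above could look roughly like the following; this is a sketch, and the `needs_generation_tags` helper name is hypothetical, not an existing TRL function:

```python
def needs_generation_tags(assistant_only_loss, chat_template):
    """Return True when SFTTrainer should swap in a generation-tagged template.

    This is only the case when assistant_only_loss is enabled and the current
    template has no {% generation %} block (a None template counts as missing).
    """
    if not assistant_only_loss:
        return False
    return "{% generation %}" not in (chat_template or "")
```

SFTTrainer would call the fixing function only when this returns True, leaving already-tagged templates untouched.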


Ideally, this feature wouldn’t mutate the tokenizer: we can avoid that by calling apply_chat_template with the chat_template argument set to the enhanced template (this is what we do in the GRPO trainer). That said, I’m okay with modifying the chat template for now, it keeps the implementation simpler.
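The non-mutating variant can be sketched with a stand-in tokenizer; the real call would be `tokenizer.apply_chat_template(..., chat_template=enhanced_template)`, and the `FakeTokenizer` class here is purely illustrative:

```python
class FakeTokenizer:
    """Stand-in for a real tokenizer, just enough to show the pattern."""

    def __init__(self, chat_template):
        self.chat_template = chat_template

    def apply_chat_template(self, messages, chat_template=None):
        # A real tokenizer renders the Jinja template over the messages;
        # here we just echo which template was used and how many messages.
        template = chat_template if chat_template is not None else self.chat_template
        return f"{template}|{len(messages)} messages"


def tokenize_without_mutation(tokenizer, messages, enhanced_template):
    """Use the enhanced template for this call only.

    Passing chat_template per-call means tokenizer.chat_template is never
    reassigned, so the tokenizer object is left untouched.
    """
    return tokenizer.apply_chat_template(messages, chat_template=enhanced_template)
```

After the call, `tokenizer.chat_template` still holds the original template, which is the property the comment is asking for.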

