
Automatically add generation tags to chat template for assistant_only_loss=True training (TRL Issue #4879)#4900

Draft
Neelectric wants to merge 4 commits into huggingface:main from Neelectric:auto-chat_template-generation_tags

Conversation

@Neelectric

What does this PR do?

This is a first step toward implementing the feature requested in #4879. For now, the approach is to recognise the model type and replace the chat template with a 'fixed' version. This would help for the most common models (Qwen, Llama, OLMo, etc.), but I wonder if there's a smarter way to go about this that automatically recognises where the assistant starts messaging, which is what we want to train on.

Implements #4879 (feature request)

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@qgallouedec

# https://huggingface.co/Qwen/Qwen3-8B/discussions/14 and https://github.com/HarryMayne/qwen_3_chat_templates
if tokenizer.chat_template == qwen3_chat_template:
    tokenizer.chat_template = qwen3_chat_template_all_assistant
# probably want to add support here for most other popular model families like Llama 2/3/3.1, OLMo 2/3/3.1 etc... is there a way to do this fully automatically?
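One way to generalise this beyond Qwen3 is a lookup table from known templates to their generation-tagged equivalents. A minimal sketch of that idea follows; the `KNOWN_TEMPLATE_FIXES` mapping, the placeholder template strings, and the `maybe_fix_chat_template` helper name are all hypothetical and not part of this PR:

```python
# Map each known chat template to an equivalent version whose assistant turns
# are wrapped in {% generation %} ... {% endgeneration %} tags.
# The strings below are placeholders standing in for the real Jinja templates.
KNOWN_TEMPLATE_FIXES = {
    "<qwen3 template>": "<qwen3 template with generation tags>",
    "<llama3 template>": "<llama3 template with generation tags>",
}


def maybe_fix_chat_template(template):
    """Return a generation-tagged version of a known template, else unchanged.

    Templates that already contain {% generation %} are left as-is.
    """
    if template is not None and "{% generation %}" not in template:
        return KNOWN_TEMPLATE_FIXES.get(template, template)
    return template
```

Unknown templates pass through untouched, so the worst case is the current behaviour rather than a silently broken template.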
Member

I don't think we can do it 100% automatically, but there aren't thousands of different chat templates out there: if we support the top 5, we should cover a good majority of use cases.

@qgallouedec
Member

This looks solid so far. A couple things still missing:

  • A unit test.
  • In SFTTrainer, we should call this function when assistant_only_loss is enabled (and when {% generation %} isn't in the chat template).

Also, in SFTConfig, we should explicitly mention that toggling assistant_only_loss may modify the tokenizer’s chat template.
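The guard described above could look roughly like the following; this is a sketch, and the `needs_generation_tags` helper name is hypothetical, not an existing TRL function:

```python
def needs_generation_tags(assistant_only_loss, chat_template):
    """Return True when SFTTrainer should swap in a generation-tagged template.

    This is only the case when assistant_only_loss is enabled and the current
    template has no {% generation %} block (a None template counts as missing).
    """
    if not assistant_only_loss:
        return False
    return "{% generation %}" not in (chat_template or "")
```

SFTTrainer would call the fixing function only when this returns True, leaving already-tagged templates untouched.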


Ideally, this feature wouldn’t mutate the tokenizer: we can avoid that by calling apply_chat_template with the chat_template argument set to the enhanced template (this is what we do in the GRPO trainer). That said, I’m okay with modifying the chat template for now, it keeps the implementation simpler.
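The non-mutating variant can be sketched with a stand-in tokenizer; the real call would be `tokenizer.apply_chat_template(..., chat_template=enhanced_template)`, and the `FakeTokenizer` class here is purely illustrative:

```python
class FakeTokenizer:
    """Stand-in for a real tokenizer, just enough to show the pattern."""

    def __init__(self, chat_template):
        self.chat_template = chat_template

    def apply_chat_template(self, messages, chat_template=None):
        # A real tokenizer renders the Jinja template over the messages;
        # here we just echo which template was used and how many messages.
        template = chat_template if chat_template is not None else self.chat_template
        return f"{template}|{len(messages)} messages"


def tokenize_without_mutation(tokenizer, messages, enhanced_template):
    """Use the enhanced template for this call only.

    Passing chat_template per-call means tokenizer.chat_template is never
    reassigned, so the tokenizer object is left untouched.
    """
    return tokenizer.apply_chat_template(messages, chat_template=enhanced_template)
```

After the call, `tokenizer.chat_template` still holds the original template, which is the property the comment is asking for.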

