documentation for modifying chat templates for assistant-only loss by jiosephlee · Pull Request #4937 · huggingface/trl

jiosephlee · 2026-01-30T21:06:47Z

What does this PR do?

While this issue is not super prevalent (most datasets are single-turn interactions and can be addressed with completion-only fine-tuning), some users want to fine-tune on multi-turn interactions or responses with tool-calling.

To my knowledge, this issue can be easily addressed with a proper chat template that adds {% generation %} when appropriate, including executed tool responses, so that the tokenizer returns the masks automatically. If the chat template doesn't do this, there is an ongoing PR to automatically address the issue with chat templates (#4900), but this doesn't scale well imo.

An easy solution would be to provide sufficient documentation so users can edit chat templates themselves!

jiosephlee added 2 commits January 30, 2026 16:03

documentation for modifying chat templates for assistant-only loss

4ff125d

simplify docs for modifying chat template for assistant only loss

58761c3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

documentation for modifying chat templates for assistant-only loss#4937

documentation for modifying chat templates for assistant-only loss#4937
jiosephlee wants to merge 2 commits intohuggingface:mainfrom
jiosephlee:chat_template_docs

jiosephlee commented Jan 30, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Comments

Conversation

jiosephlee commented Jan 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What does this PR do?

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Comments

jiosephlee commented Jan 30, 2026 •

edited

Loading