Pass required token_type_ids by albertvillanova · Pull Request #4148 · huggingface/trl

albertvillanova · 2025-09-26T10:19:14Z

Pass required token_type_ids.

Follow-up to transformers PR:

🚨 [generate] update paligemma mask updates (and other assisted generation-related fixes) transformers#40917

🚨 BC-breaking: paligemma processor now returns token_type_ids by default. This is required to disambiguate forward passes, due to the bidirectional attention mask in the prompt. Advanced generation methods may run forward passes with prompt + generated tokens, so they will fail without token_type_ids.

Fix #4142, fix #4150.

This PR extends support for the token_type_ids input across the GRPO and RLOO trainers, ensuring that models using token type information can correctly handle these inputs during training, evaluation, and loss computation.

The changes are applied consistently to both trainers: GRPO and RLOO.

Changes

Token type IDs support:

Added token_type_ids as an optional argument to the _get_per_token_logps_and_entropies method in both grpo_trainer.py and rloo_trainer.py, allowing the trainers to process token type information.
Updated batching logic to include token_type_ids when present, ensuring correct slicing and passing of token type IDs during batched forward passes.
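
The batching change can be pictured with a small, dependency-free sketch. `iter_forward_batches` is a hypothetical helper, not the trainer's method. It only assumes the inputs are sliceable along the batch dimension (plain lists here, tensors in the trainer) and shows `token_type_ids` being sliced alongside the other inputs only when it is provided.

```python
def iter_forward_batches(input_ids, attention_mask, batch_size, token_type_ids=None):
    """Yield per-chunk model inputs, slicing token_type_ids only if given.

    Hypothetical helper for illustration; works on anything sliceable
    along the first (batch) dimension, e.g. lists or tensors.
    """
    for start in range(0, len(input_ids), batch_size):
        end = start + batch_size
        model_inputs = {
            "input_ids": input_ids[start:end],
            "attention_mask": attention_mask[start:end],
        }
        # Optional input: forwarded only for models that use it.
        if token_type_ids is not None:
            model_inputs["token_type_ids"] = token_type_ids[start:end]
        yield model_inputs


# Toy batch of 3 sequences, each of length 4.
ids = [[5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
mask = [[1, 1, 1, 1]] * 3
types = [[1, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]

with_types = list(iter_forward_batches(ids, mask, batch_size=2, token_type_ids=types))
without_types = list(iter_forward_batches(ids, mask, batch_size=2))
```

Keeping the argument optional means models that do not use token type information go through the same code path unchanged.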

Integration with completions and output:

In the _generate_and_score_completions method, extended token_type_ids with zeros for the completion tokens and ensured they are passed through the forward arguments and included in the output dictionary.
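
As a rough illustration of that extension step on toy tensors: the shapes and values below are made up, and only the use of `torch.cat` and `Tensor.new_zeros` is taken from the diff in this PR.

```python
import torch

# Toy shapes: batch B=2, prompt length P=3, completion length C=2.
prompt_token_type_ids = torch.tensor([[1, 1, 0], [1, 0, 0]])  # (B, P)
completion_ids = torch.tensor([[101, 102], [103, 104]])       # (B, C)

# Completion tokens are text, so they get token type 0.
# new_zeros keeps the dtype and device of the prompt tensor.
completion_token_type_ids = prompt_token_type_ids.new_zeros(completion_ids.shape)

token_type_ids = torch.cat(
    [prompt_token_type_ids, completion_token_type_ids], dim=1
)  # (B, P+C)
```

The extended tensor has the same sequence length as the concatenated prompt and completion IDs, so it can be passed to the forward pass alongside them.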

Loss computation:

Updated the _compute_loss method in both trainers to pass token_type_ids when available, ensuring that loss calculations take token type information into account.

CC: @gante

HuggingFaceDocBuilderDev · 2025-09-26T10:23:10Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

albertvillanova · 2025-09-26T11:27:13Z

The current state of the PR seems to fix the issue:

CI fails: ValueError: token_type_ids is required as a model input when training #4142
no ValueError with dev dependencies https://github.com/huggingface/trl/actions/runs/18034938032/job/51319680742?pr=4148

FAILED tests/test_rloo_trainer.py::RLOOTrainerTester::test_training_vlm_0_trl_internal_testing_tiny_Gemma3ForConditionalGeneration - TypeError: RLOOTrainer._get_per_token_logps_and_entropies() got an unexpected keyword argument 'token_type_ids'
= 1 failed, 920 passed, 49 skipped, 3 xfailed, 219 warnings, 5 rerun in 906.89s (0:15:06) =

The only remaining issue is now the TypeError:

CI fails: TypeError: RLOOTrainer._get_per_token_logps_and_entropies() got an unexpected keyword argument 'token_type_ids' #4150
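
That TypeError is ordinary Python behavior when a caller passes a keyword that the callee's signature does not declare. A stripped-down sketch, using hypothetical stand-in functions rather than the trainer methods:

```python
def get_logps_old(input_ids, attention_mask):
    """Stand-in for a method whose signature lacks token_type_ids."""
    return len(input_ids)


def get_logps_new(input_ids, attention_mask, token_type_ids=None):
    """Stand-in with the optional parameter added; None means 'not used'."""
    return len(input_ids), token_type_ids is not None


call_kwargs = {
    "input_ids": [1, 2, 3],
    "attention_mask": [1, 1, 1],
    "token_type_ids": [1, 0, 0],
}

# The old signature rejects the extra keyword.
try:
    get_logps_old(**call_kwargs)
    old_error = None
except TypeError as exc:
    old_error = str(exc)

# The new signature accepts it.
new_result = get_logps_new(**call_kwargs)
```

Adding the parameter with a default of `None` keeps existing callers working while letting new callers forward the extra input.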

albertvillanova · 2025-09-26T12:09:25Z

Everything is green now! 🚀

albertvillanova · 2025-09-26T12:11:34Z

trl/trainer/grpo_trainer.py

+            token_type_ids = forward_kwargs["token_type_ids"]
+            forward_kwargs["token_type_ids"] = torch.cat(
+                [token_type_ids, token_type_ids.new_zeros(completion_ids.shape)], dim=1
+            )


If you validate this approach, do you think this should be implemented in other trainers as well?

qgallouedec · 2025-09-26T14:35:45Z

trl/trainer/rloo_trainer.py

        # Concatenate prompt_mask with completion_mask for logit computation
        prompt_completion_ids = torch.cat([prompt_ids, completion_ids], dim=1)  # (B, P+C)
        attention_mask = torch.cat([prompt_mask, completion_mask], dim=1)  # (B, P+C)
+        # If token_type_ids are used, extend them with zeros for the completion part


1 is for image and 0 for text, right?

Yes, completion tokens are text.

albertvillanova · 2025-09-29T15:25:21Z

@qgallouedec could you please validate this PR so we can finally have the CI green?

qgallouedec

lgtm, thanks!

gante · 2025-09-30T12:19:57Z

Thank you for the fix 🙏

(I wouldn't be surprised if we see more models of this kind in the future -- token_type_ids is used essentially to tag blocks of inputs that need bidirectional attention)
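
To make the idea of tagging blocks that need bidirectional attention concrete, here is a simplified, dependency-free sketch. It assumes that tokens with `token_type_ids == 1` form a single block whose members may attend to each other in both directions, while everything else stays causal. `build_attention_pattern` is a hypothetical helper; the actual masking logic in transformers is model-specific and more involved.

```python
def build_attention_pattern(token_type_ids):
    """Return allowed[i][j]: True if position i may attend to position j.

    Simplifying assumption: causal attention everywhere, plus full
    (bidirectional) attention among positions tagged with type 1.
    """
    n = len(token_type_ids)
    allowed = [[j <= i for j in range(n)] for i in range(n)]  # causal base
    for i in range(n):
        for j in range(n):
            if token_type_ids[i] == 1 and token_type_ids[j] == 1:
                allowed[i][j] = True  # bidirectional inside the tagged block
    return allowed


# Two tagged tokens followed by two text tokens.
pattern = build_attention_pattern([1, 1, 0, 0])
# With all-zero token types, the same sequence is purely causal.
causal = build_attention_pattern([0, 0, 0, 0])
```

Under this assumption the two patterns differ only in whether position 0 may attend to position 1, which is information that `input_ids` and `attention_mask` alone do not carry.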

Pass token_type_ids in GRPOTrainer

a65947c

albertvillanova added 2 commits September 26, 2025 13:28

Add token_type_ids param to RLOOTrainer._get_per_token_logps_and_entr…

09a95c4

…opies

Align RLOOTrainer with GRPOTrainer

da880e9

albertvillanova commented Sep 26, 2025

qgallouedec reviewed Sep 26, 2025

albertvillanova added 2 commits September 26, 2025 17:58

Align trl.experimental

317c675

Merge remote-tracking branch 'upstream/main' into fix-4142

4308374

albertvillanova changed the title ~~WIP: Pass required token_type_ids~~ Pass required token_type_ids Sep 29, 2025

albertvillanova requested a review from qgallouedec September 29, 2025 06:16

qgallouedec approved these changes Sep 29, 2025

albertvillanova merged commit 910aeeb into huggingface:main Sep 29, 2025
10 checks passed

kashif pushed a commit that referenced this pull request Sep 30, 2025

Pass required token_type_ids (#4148)

3715297

aweers mentioned this pull request Oct 15, 2025

Add support for token_type_ids in DPOTrainer #4285

Merged

5 tasks

albertvillanova merged 5 commits into huggingface:main from albertvillanova:fix-4142

4 participants
