Skip to content

[chat template] return assistant mask in processors#38545

Merged
zucchini-nlp merged 9 commits intohuggingface:mainfrom
zucchini-nlp:chat-template-assistant-mask
Jul 18, 2025
Merged

[chat template] return assistant mask in processors#38545
zucchini-nlp merged 9 commits intohuggingface:mainfrom
zucchini-nlp:chat-template-assistant-mask

Conversation

@zucchini-nlp
Copy link
Copy Markdown
Member

@zucchini-nlp zucchini-nlp commented Jun 3, 2025

What does this PR do?

Fixes #38521. I checked with fast tokenizers' implementation of word_to_char and saw no difference in the time taken, so I think this can be the permanent solution

Otherwise we can add in BatchFeature support for EncodingFast features, though I don't think anyone needs them and I have never seen user requesting it

@zucchini-nlp zucchini-nlp force-pushed the chat-template-assistant-mask branch from 7d3e2a9 to 61112a1 Compare June 3, 2025 07:19
@HuggingFaceDocBuilderDev
Copy link
Copy Markdown

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@zucchini-nlp zucchini-nlp changed the title [WIP chat template] return assistant mask in processors [chat template] return assistant mask in processors Jun 12, 2025
@zucchini-nlp
Copy link
Copy Markdown
Member Author

Ready for review!

Comment on lines +1464 to +1465
is_tokenizers_fast = hasattr(self, "tokenizer") and self.tokenizer.__class__.__name__.endswith("Fast")

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that tokenizer's never checks this, probably because all new LLMs support fast tokenizers. Though users can force set use_fast=False for some reasons and the error message in that case is not informative

Should I add the check on tokenizer's apply_chat_template as well, WDYT?

Copy link
Copy Markdown
Member

@Rocketknight1 Rocketknight1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for taking so long to get to this! The logic makes sense, but the use of bisect_left confused me for a bit. After staring at it for a while, though, I think it's valid.

@zucchini-nlp zucchini-nlp enabled auto-merge (squash) July 18, 2025 08:27
@zucchini-nlp zucchini-nlp disabled auto-merge July 18, 2025 08:38
@zucchini-nlp zucchini-nlp enabled auto-merge (squash) July 18, 2025 10:10
@github-actions
Copy link
Copy Markdown
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: csm, shieldgemma2

@zucchini-nlp zucchini-nlp merged commit bcc0091 into huggingface:main Jul 18, 2025
25 checks passed
zucchini-nlp added a commit to zucchini-nlp/transformers that referenced this pull request Jul 22, 2025
* messed up the git history, squash commits

* raise error if slow and refine tests

* index was off by one

* fix the test
zaristei pushed a commit to zaristei/transformers that referenced this pull request Sep 9, 2025
* messed up the git history, squash commits

* raise error if slow and refine tests

* index was off by one

* fix the test
zaristei pushed a commit to zaristei/transformers that referenced this pull request Sep 9, 2025
* messed up the git history, squash commits

* raise error if slow and refine tests

* index was off by one

* fix the test
zaristei pushed a commit to zaristei/transformers that referenced this pull request Sep 9, 2025
* messed up the git history, squash commits

* raise error if slow and refine tests

* index was off by one

* fix the test
zaristei pushed a commit to zaristei/transformers that referenced this pull request Sep 9, 2025
* messed up the git history, squash commits

* raise error if slow and refine tests

* index was off by one

* fix the test
zaristei pushed a commit to zaristei/transformers that referenced this pull request Sep 9, 2025
* messed up the git history, squash commits

* raise error if slow and refine tests

* index was off by one

* fix the test
zaristei pushed a commit to zaristei/transformers that referenced this pull request Sep 9, 2025
* messed up the git history, squash commits

* raise error if slow and refine tests

* index was off by one

* fix the test
zaristei pushed a commit to zaristei/transformers that referenced this pull request Sep 9, 2025
* messed up the git history, squash commits

* raise error if slow and refine tests

* index was off by one

* fix the test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Error for return_assistant_tokens_mask in MLLM processor

3 participants