Generalize gemma vision mask to videos #45185

Merged
zucchini-nlp merged 2 commits into huggingface:main from zucchini-nlp:gemma-masks
Apr 2, 2026

Conversation

@zucchini-nlp
Member

What does this PR do?

If the input contains videos, their token type ids will be 2, but the current mask function only checks for the image token type. This PR generalizes it to rely only on `vision_group_ids` instead of token types.
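
For illustration, here is a minimal sketch of the difference, written with plain Python lists so it runs without dependencies. It is not the code from this PR: the helper names (`build_group_ids`, `old_inner_mask`, `new_inner_mask`), the use of `-1` as the text group, and the token type value `0` for text are assumptions made for the example. The real function works on `torch` tensors and receives batch and head indices as well.

```python
from typing import Callable, List

# Assumed token type ids for this sketch: 0 = text, 1 = image, 2 = video.
TEXT, IMAGE, VIDEO = 0, 1, 2


def build_group_ids(token_type_ids: List[int]) -> List[int]:
    """Give each contiguous run of vision tokens its own id; text tokens get -1."""
    group_ids = []
    current = -1
    prev = TEXT
    for t in token_type_ids:
        # A new vision block starts whenever the type changes to a non-text type.
        if t != TEXT and t != prev:
            current += 1
        group_ids.append(current if t != TEXT else -1)
        prev = t
    return group_ids


def old_inner_mask(token_type_ids: List[int], group_ids: List[int]) -> Callable[[int, int], bool]:
    """Old behaviour (sketch): bidirectional attention only inside image blocks."""
    def inner(q_idx: int, kv_idx: int) -> bool:
        is_image_block = token_type_ids[q_idx] == IMAGE and token_type_ids[kv_idx] == IMAGE
        return is_image_block and group_ids[q_idx] == group_ids[kv_idx]
    return inner


def new_inner_mask(group_ids: List[int]) -> Callable[[int, int], bool]:
    """Generalized behaviour (sketch): any two tokens in the same vision group."""
    def inner(q_idx: int, kv_idx: int) -> bool:
        # Text tokens share the id -1, so they must be excluded explicitly.
        return group_ids[q_idx] >= 0 and group_ids[q_idx] == group_ids[kv_idx]
    return inner


token_types = [TEXT, IMAGE, IMAGE, TEXT, VIDEO, VIDEO, VIDEO, TEXT]
groups = build_group_ids(token_types)
old_mask = old_inner_mask(token_types, groups)
new_mask = new_inner_mask(groups)

print(groups)                              # one id per vision block, -1 for text
print(old_mask(4, 6), new_mask(4, 6))      # video tokens: missed by the old check
```

In this sketch the token-type check never fires for the video block, while the group-based check treats the image block and the video block the same way. Note that two adjacent blocks of the same type would be merged into one group by this simplified `build_group_ids`.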

@zucchini-nlp
Member Author

run-slow: gemma3, paligemma, git

@github-actions
Contributor

github-actions Bot commented Apr 2, 2026

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/gemma3", "models/git", "models/paligemma"]
quantizations: []

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

This function adds the correct offsets to the `q_idx` and `kv_idx` as the torch API can only accept lengths,
not start and end indices.
Args:
vision_group_ids (`torch.Tensor`):
Collaborator

Suggested change
- vision_group_ids (`torch.Tensor`):
+ group_ids (`torch.Tensor`):

Collaborator

as it includes the text group as well

@github-actions
Contributor

github-actions Bot commented Apr 2, 2026

CI Results

Workflow Run ⚙️

Commit Info

| Context | Commit | Description |
| --- | --- | --- |
| RUN | 8fcab2dd | workflow commit (merge commit) |
| PR | c05182ba | branch commit (from PR) |
| main | abc417a4 | base commit (on main) |

✅ No failing test specific to this PR 🎉 👏 !

@zucchini-nlp zucchini-nlp enabled auto-merge April 2, 2026 12:35
@github-actions
Contributor

github-actions Bot commented Apr 2, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: gemma3, git, paligemma

@zucchini-nlp zucchini-nlp added this pull request to the merge queue Apr 2, 2026
Merged via the queue into huggingface:main with commit ade7a05 Apr 2, 2026
22 checks passed
@zucchini-nlp zucchini-nlp deleted the gemma-masks branch April 2, 2026 13:15
marvinzh pushed a commit to marvinzh/transformers that referenced this pull request Apr 3, 2026
* more general inner mask

* arthur's comment - rename everywhere
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Apr 4, 2026
* more general inner mask

* arthur's comment - rename everywhere
sirzechs66 pushed a commit to sirzechs66/transformers that referenced this pull request Apr 18, 2026
* more general inner mask

* arthur's comment - rename everywhere
3 participants