
fix: handle ragged batch inputs in Qwen2_5_VLProcessor mm_token_type_ids computation#44919

Closed
s-zx wants to merge 1 commit into huggingface:main from s-zx:fix/qwen2-5-vl-ragged-batch-token-ids

Conversation


@s-zx s-zx commented Mar 21, 2026

What does this PR do?

Fixes a crash in Qwen2_5_VLProcessor.__call__ when processing batched inputs without padding (padding=False).

Root cause: When the tokenizer returns sequences of different lengths (a ragged list), np.array(text_inputs["input_ids"]) creates an object array instead of a 2D integer array (on NumPy >= 1.24 it raises a ValueError instead). The subsequent element-wise comparisons (== image_token_id, == video_token_id) then fail or produce wrong results.
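For illustration, a minimal standalone reproduction of the root cause (plain NumPy, independent of the processor):

```python
import numpy as np

# Sequences of different lengths, as the tokenizer returns with padding=False
ragged = [[1, 2, 3], [4, 5]]

try:
    arr = np.array(ragged)
    # Older NumPy: an object array, so element-wise comparison is unreliable
    is_broken = arr.dtype == object
except ValueError:
    # NumPy >= 1.24 refuses to build the inhomogeneous array outright
    is_broken = True
```

Either way, the downstream `arr == token_id` comparison path is unusable.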

Fix: Detect batch inputs (list-of-lists) and process each sequence individually, which correctly handles both padded (uniform length) and unpadded (ragged) batches.

Before:

array_ids = np.array(text_inputs["input_ids"])  # fails for ragged batches
mm_token_type_ids = np.zeros_like(text_inputs["input_ids"])
mm_token_type_ids[array_ids == self.image_token_id] = 1

After:

input_ids = text_inputs["input_ids"]
if isinstance(input_ids, list) and input_ids and isinstance(input_ids[0], list):
    # batch input (padded or ragged): process each sequence individually
    mm_token_type_ids = []
    for ids in input_ids:
        arr = np.array(ids)
        type_ids = np.zeros_like(arr)
        type_ids[arr == self.image_token_id] = 1
        type_ids[arr == self.video_token_id] = 1
        mm_token_type_ids.append(type_ids.tolist())

Fixes #44514
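As a self-contained sketch of the per-sequence approach (the token id below is a hypothetical placeholder, not the actual Qwen2.5-VL vocabulary entry):

```python
import numpy as np

IMAGE_TOKEN_ID = 151655  # hypothetical placeholder id for illustration

# A ragged batch, as produced with padding=False
input_ids = [
    [101, IMAGE_TOKEN_ID, 102],
    [101, IMAGE_TOKEN_ID, IMAGE_TOKEN_ID, 102, 103],
]

mm_token_type_ids = []
for ids in input_ids:
    arr = np.array(ids)               # each sequence is rectangular on its own
    type_ids = np.zeros_like(arr)
    type_ids[arr == IMAGE_TOKEN_ID] = 1
    mm_token_type_ids.append(type_ids.tolist())

# mm_token_type_ids -> [[0, 1, 0], [0, 1, 1, 0, 0]]
```

Because each sequence is converted separately, the per-sequence arrays are always rectangular, so the element-wise comparison works for both padded and unpadded batches.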

…_ids

When processing batched inputs without padding (padding=False), the
tokenizer returns lists of different lengths. Calling np.array() on
such a ragged list produces an object array (or raises a ValueError on
NumPy >= 1.24), making the subsequent element-wise comparisons
(== image_token_id, == video_token_id) fail.

Fix by detecting batch inputs (list of lists) and processing each
sequence individually, which works for both padded and unpadded batches.

Fixes huggingface#44514
@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: qwen2_5_vl

@zucchini-nlp
Member

Will close in favor of #44563 (comment) which solves the issue for all models at once



Development

Successfully merging this pull request may close these issues.

Qwen2_5_VLProcessor.apply_chat_template crashes on batched input when padding=False
