fix bug for janus model image generation#45044

Merged
ydshieh merged 19 commits into huggingface:main from kaixuanliu:janus-image-generation
Apr 1, 2026

Conversation

@kaixuanliu
Contributor

Fix issue in #44792. @zucchini-nlp @ydshieh pls help review, thx!

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
@kaixuanliu kaixuanliu changed the title from "Janus image generation" to "fix bug for janus model image generation" Mar 27, 2026
Member

@zucchini-nlp zucchini-nlp left a comment


left a couple q

Comment thread src/transformers/models/janus/modeling_janus.py Outdated
Comment thread src/transformers/models/janus/modeling_janus.py Outdated
@kaixuanliu kaixuanliu marked this pull request as draft March 27, 2026 08:53
@kaixuanliu kaixuanliu marked this pull request as ready for review March 27, 2026 14:33
@kaixuanliu kaixuanliu marked this pull request as draft March 30, 2026 05:15
@kaixuanliu kaixuanliu marked this pull request as ready for review March 30, 2026 05:42
@kaixuanliu
Contributor Author

@zucchini-nlp, hi, can you help review it again? Thx!!

@ydshieh
Collaborator

ydshieh commented Mar 31, 2026

Investigation notes [Just for the record, no need to read]

9a6df2ce is the last commit where the test passes completely (no fixes needed).

Observations from bisecting the regression, in commit order:

bdaddb6f — Generation works (after removing 2 unrelated asserts at the top of the test, or equivalently with 9daee2e8), but produced output differs from expected values → test assertion mismatch only.

a81e04a9 — After removing the 2 asserts, generation now crashes:

TypeError: '>' not supported between instances of 'int' and 'NoneType'

at max(generation_config.max_length, num_image_tokens + seq_len), because generation_config.max_length became None.

9daee2e8 / 2877e4e2 — Same max_length=None crash, no longer need to remove the 2 asserts to reproduce it.

93d7affd ("Generation config boolean defaults #43000") — New crash:

TypeError: repeat_interleave() received an invalid combination of arguments - got (NoneType, dim=int)

generation_config.num_return_sequences became None, passed as expand_size=None into _expand_inputs_for_generation.

Current main — Same expand_size=None / repeat_interleave(None) crash.
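The two None-valued-config crashes in the bisect above can be reproduced in isolation. Here is a minimal standalone sketch (the values are made up; these are not the actual Transformers call sites):

```python
import torch

# Minimal standalone reproduction (assumed values, not the real call sites)
# of the two crashes hit while bisecting: generation_config fields that
# silently became None break integer arithmetic downstream.

max_length = None            # stands in for generation_config.max_length
num_image_tokens, seq_len = 576, 16
try:
    max(max_length, num_image_tokens + seq_len)
except TypeError as e:
    print(e)  # '>' not supported between instances of 'int' and 'NoneType'

expand_size = None           # stands in for generation_config.num_return_sequences
input_ids = torch.arange(8).reshape(2, 4)
try:
    input_ids.repeat_interleave(expand_size, dim=0)
except TypeError:
    print("repeat_interleave() rejects a NoneType expand size")

# Resolving defaults before the arithmetic avoids both failure modes:
resolved_max_length = max(max_length or 0, num_image_tokens + seq_len)
print(resolved_max_length)  # 592
```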


Experimenting with the PR fix:

  • PR fix applied, but without is_first_iteration=True: the expand_size crash is gone, but generation now fails deep in RoPE:

    CUDA error: device-side assert triggered
    

    at apply_rotary_pos_emb — position_ids are computed incorrectly, causing an out-of-bounds RoPE index error.

  • Full PR fix (with is_first_iteration=True): generation completes successfully. Only remaining issue is the expected output values in the test needing to be updated, which the PR handles.

So is_first_iteration=True is necessary to work around a behavioral change introduced somewhere between bdaddb6f and current main in prepare_inputs_for_generation (likely 3c52b78 — "Always pass full input_ids in prepare_inputs_for_generation"), which changed how input_ids are sliced when use_cache=True. Without is_first_iteration=True, the base class slices input_tokens (the full prompt) to 1 token, producing wrong position_ids.
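The slicing mismatch described above can be illustrated with a toy sketch (the helper below is hypothetical, not the actual prepare_inputs_for_generation logic):

```python
# Illustrative sketch (hypothetical helper, not the Transformers code) of why
# slicing the full prompt down to 1 token yields wrong position ids on the
# first image-generation step.

def sliced_inputs(full_input_ids, past_length, is_first_iteration):
    # Base-class-style slicing when use_cache=True: keep only the tokens
    # not yet covered by the cache.
    if is_first_iteration:
        return full_input_ids               # feed the whole prompt
    return full_input_ids[past_length:]     # decode-style slice: new tokens only

prompt = list(range(10))   # a 10-token prompt; the cache is still empty
past_length = 0

# With is_first_iteration=True: positions 0..9 line up with the 10 embeddings.
first = sliced_inputs(prompt, past_length, is_first_iteration=True)
print(len(first), list(range(past_length, past_length + len(first))))

# Without it, a decode-style slice keeps 1 token while the prepared
# inputs_embeds still cover all 10: the lengths disagree, so position ids
# derived from the cache length no longer match the embeddings fed in,
# which is how the out-of-bounds RoPE index arises.
wrong = sliced_inputs(prompt, past_length=9, is_first_iteration=False)
print(len(wrong))  # 1
```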

ydshieh added a commit that referenced this pull request Mar 31, 2026
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@ydshieh
Collaborator

ydshieh commented Mar 31, 2026

ok, the failure, if we don't include the is_first_iteration change, comes from

421c7f6 [core] 🚨 Completely remove cache positions (#44181)

I pinged him internally.

@ydshieh

This comment was marked as resolved.

@ydshieh

This comment was marked as resolved.

@ydshieh

This comment was marked as resolved.

Comment on lines +547 to +550
4484, 4015, 15750, 15131, 7551, 7326, 3485, 4845, 376, 9925, 1082, 1457, 15550, 7029, 1482, 11522,
14695, 8587, 6807, 8221, 6807, 6140, 15079, 11766, 705, 11799, 405, 4228, 13153, 3910, 8631, 10037,
12758, 6321, 12249, 1787, 15982, 366, 8811, 6910, 1957, 10597, 8889, 8500, 7068, 2037, 897, 4044,
1762, 4080
Collaborator


hi, on which hardware did you get this value for cuda? On A10, I get

([ 2567, 6155, 6155, 250, 15131, 15797, 15453, 12190, 3351, 10803,

Contributor Author


I use A100 with torch version 2.11.0+cu128. You can adjust the expected token to adapt to your CI env.

# computed incorrectly based on cache length, leading to RoPE index out of bounds errors.
model_inputs = self.prepare_inputs_for_generation(
-    inputs_embeds=inputs_embeds, input_ids=input_tokens, **model_kwargs
+    inputs_embeds=inputs_embeds, input_ids=input_tokens, is_first_iteration=True, **model_kwargs
Collaborator


This image test for janus has been broken for a long time, with different errors introduced over several commits (some of them already resolved).

This is_first_iteration=True not only fixes the crash (I haven't found the root commit for it yet) but also brings the actual outputs back in line with the expected outputs (which should have been updated in Default auto (#42805)).

This fix is thus valid.

Collaborator


so short (or long) history

Comment on lines 1328 to 1332
# Set is_first_iteration=True to force using inputs_embeds instead of input_ids.
# Without this, prepare_inputs_for_generation would use input_ids (the full prompt)
# instead of our prepared inputs_embeds (1 new token). This causes position_ids to be
# computed incorrectly based on cache length, leading to RoPE index out of bounds errors.
model_inputs = self.prepare_inputs_for_generation(
Member


If we are forced to use inputs_embeds, then this fix is correct - otherwise it would indeed use input_ids without is_first_iteration. Is this expected to create inputs_embeds like that @zucchini-nlp ? Can't we let the model do it in forward from input_ids?

Collaborator


I will put a comment alongside the code, but I'd like to move on and merge for now.

Using inputs_embeds doesn't seem like a bad thing here (no need to recompute stuff in the for loop). I do agree that it's strange using input_ids won't work (for the wrong-values part; the crash part I have no idea about).

Member


not doable for this model, because embeddings in image-generation mode are obtained via embed + pooling, and later the lm head is also a bit different. In text mode, generation is simple lm-style though, which is why we have the early exit a few lines above

https://github.com/kaixuanliu/transformers/blob/e634aa1bcb43e81bc12e4977bf2a673838ef7836/src/transformers/models/janus/modeling_janus.py#L1367
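To illustrate the two embedding paths mentioned above, here is a toy sketch (module names and sizes are made up for illustration; the real Janus layers differ):

```python
import torch

# Toy illustration (made-up modules/sizes, not the actual Janus layers):
# in image-generation mode, token embeddings come from a separate embedding
# table plus a projection, so they must be built as inputs_embeds up front,
# while text mode is a plain lm-style lookup.

hidden, image_vocab, text_vocab = 8, 32, 64
text_embed = torch.nn.Embedding(text_vocab, hidden)
gen_embed = torch.nn.Embedding(image_vocab, 2 * hidden)  # image-token table
gen_proj = torch.nn.Linear(2 * hidden, hidden)           # "aligner"-style projection (assumed)

image_token_ids = torch.tensor([[3, 5, 7]])
text_token_ids = torch.tensor([[11, 13, 17]])

image_mode_embeds = gen_proj(gen_embed(image_token_ids))  # must be precomputed
text_mode_embeds = text_embed(text_token_ids)             # simple lookup

print(image_mode_embeds.shape, text_mode_embeds.shape)    # both (1, 3, 8)
```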

@github-actions
Contributor

github-actions Bot commented Apr 1, 2026

[For maintainers] Suggested jobs to run (before merge)

run-slow: janus

@ydshieh
Collaborator

ydshieh commented Apr 1, 2026

run-slow: janus

@github-actions
Contributor

github-actions Bot commented Apr 1, 2026

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/janus"]
quantizations: []

@github-actions
Contributor

github-actions Bot commented Apr 1, 2026

CI Results

Workflow Run ⚙️

Commit Info

Context Commit Description
RUN b6c6bbad workflow commit (merge commit)
PR 96779dcf branch commit (from PR)
main 9914a364 base commit (on main)

✅ No failing test specific to this PR 🎉 👏 !

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ydshieh ydshieh merged commit 6abd972 into huggingface:main Apr 1, 2026
18 checks passed
@kaixuanliu kaixuanliu deleted the janus-image-generation branch April 2, 2026 02:46
SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Apr 4, 2026
* fix bug for janus model image generation

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update expected tokens

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update comment

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* use `_preapre_generation_config`

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update expected token

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update code

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update comments

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update

* update

* update

---------

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
sirzechs66 pushed a commit to sirzechs66/transformers that referenced this pull request Apr 18, 2026