fix bug for janus model image generation #45044
Conversation
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
@zucchini-nlp Hi, can you help review it again? Thx!!
Investigation notes [just for the record, no need to read]
Observations from bisecting the regression, in commit order: (details collapsed)
Currently experimenting with the PR fix: (details collapsed)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This comment was marked as resolved.
4484, 4015, 15750, 15131, 7551, 7326, 3485, 4845, 376, 9925, 1082, 1457, 15550, 7029, 1482, 11522,
14695, 8587, 6807, 8221, 6807, 6140, 15079, 11766, 705, 11799, 405, 4228, 13153, 3910, 8631, 10037,
12758, 6321, 12249, 1787, 15982, 366, 8811, 6910, 1957, 10597, 8889, 8500, 7068, 2037, 897, 4044,
1762, 4080
Hi, on which hardware did you get this value for CUDA? On A10, I get
([ 2567, 6155, 6155, 250, 15131, 15797, 15453, 12190, 3351, 10803,
I used an A100 with torch version 2.11.0+cu128. You can adjust the expected tokens to adapt to your CI env.
  # computed incorrectly based on cache length, leading to RoPE index out of bounds errors.
  model_inputs = self.prepare_inputs_for_generation(
-     inputs_embeds=inputs_embeds, input_ids=input_tokens, **model_kwargs
+     inputs_embeds=inputs_embeds, input_ids=input_tokens, is_first_iteration=True, **model_kwargs
This image test for janus has been broken for a long time, with different errors introduced over several commits (some of them since resolved).
This `is_first_iteration=True` not only fixes the crash (I haven't found the root commit for it yet) but also brings the actual outputs back to match the expected outputs (which should have been updated in Default auto (#42805)).
This fix is thus valid.
So, a short (or long) history:
- Default auto (🚨 🚨 #42805) --> we should have updated the expected output but we didn't (dtype change)
- Prefill-related logic in input preparation for generation (#42088) --> this changes the expected output further, which requires `is_first_iteration=True` to bring it back to normal
- Completely remove cache positions ([core] 🚨 #44181) --> this will cause the crash, and `is_first_iteration=True` will avoid it.
  # Set is_first_iteration=True to force using inputs_embeds instead of input_ids.
  # Without this, prepare_inputs_for_generation would use input_ids (the full prompt)
  # instead of our prepared inputs_embeds (1 new token). This causes position_ids to be
  # computed incorrectly based on cache length, leading to RoPE index out of bounds errors.
  model_inputs = self.prepare_inputs_for_generation(
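To make the position_ids issue in that comment concrete, here is a toy sketch (not the actual transformers code; the function and variable names are invented for illustration). During decoding, the KV cache already holds the prompt, so the positions of whatever is fed this step must start at the cache length; feeding the full prompt again instead of the single new embedding pushes positions past the RoPE table.

```python
def position_ids_for_step(num_new_inputs, cache_len, max_positions):
    """Positions for the tokens fed this step start at the cache length."""
    positions = list(range(cache_len, cache_len + num_new_inputs))
    if any(p >= max_positions for p in positions):
        raise IndexError("RoPE index out of bounds")
    return positions

max_positions = 16  # assumed RoPE table size for this sketch
cache_len = 10      # tokens already in the KV cache after prefill

# Correct: a decode step feeds 1 new embedding -> one in-range position.
ok = position_ids_for_step(1, cache_len, max_positions)  # [10]

# Buggy: re-feeding the full 10-token prompt makes positions run from
# 10 to 19, past the 16-entry table -> out-of-bounds error.
try:
    position_ids_for_step(10, cache_len, max_positions)
    err = ""
except IndexError as e:
    err = str(e)
```

This mirrors the failure mode described in the comment: the crash comes from position indexing, not from the token values themselves.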
If we are forced to use inputs_embeds, then this fix is correct; otherwise it would indeed use input_ids without is_first_iteration. Is it expected to create inputs_embeds like that, @zucchini-nlp? Can't we let the model do it in forward from input_ids?
I will put a comment along the code, but would like to move on by merging for now.
Using inputs_embeds doesn't seem like a bad thing here (no need to recompute stuff in the for loop).
I do agree that it's strange that using input_ids won't work (for the wrong-value part; as for the crash part, I have no idea).
Not doable for this model, because embeddings in image-generation mode are obtained via embed+pooling, and later the lm head is also a bit different. In text-generation mode, generation is simple lm-style though, which is why we have the early exit a few lines above.
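As a rough illustration of the point above, here is a toy sketch of a decode loop that must work on embeddings because they come from embed+pooling and the output head differs from the text lm head. The helper names (`embed_and_pool`, `decoder_step`, `gen_head`) are invented for the sketch and are not the real transformers API.

```python
def generate_image_tokens(prompt_ids, steps, embed_and_pool, decoder_step, gen_head):
    """Toy image-generation loop: since embeddings require a custom
    embed+pooling path, each step feeds inputs_embeds, not input_ids."""
    tokens = list(prompt_ids)
    # Prefill: embed the whole prompt once, outside the loop.
    embeds = [embed_and_pool(t) for t in prompt_ids]
    for _ in range(steps):
        hidden = decoder_step(embeds)        # forward pass on embeddings only
        next_tok = gen_head(hidden)          # image-generation head, not the text lm head
        tokens.append(next_tok)
        embeds = [embed_and_pool(next_tok)]  # next step: exactly 1 new embedding
    return tokens

# Toy stand-ins so the sketch runs end to end.
out = generate_image_tokens(
    [1, 2, 3],
    steps=4,
    embed_and_pool=float,
    decoder_step=sum,
    gen_head=lambda h: int(h) % 7,
)
```

The design point is simply that the loop owns the embedding step, which is why `prepare_inputs_for_generation` must be told to keep using the prepared inputs_embeds.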
[For maintainers] Suggested jobs to run (before merge): run-slow: janus
run-slow: janus
This comment contains models: ["models/janus"]
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
* fix bug for janus model image generation
* update expected tokens
* update
* update comment
* use `_preapre_generation_config`
* update
* update expected token
* update code
* update
* update
* update comments
* update
* update
* update
* update
---------
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Fixes the issue in #44792. @zucchini-nlp @ydshieh please help review, thx!