
Fix causal mask in llama for long seq_length #29263

Closed
YLGH wants to merge 1 commit into huggingface:main from YLGH:fix_llama_mask

Conversation

@YLGH commented Feb 23, 2024

What does this PR do?

Fixes a bug in Llama causal mask creation for the case where the input sequence length exceeds causal_mask.shape[-1] (which is derived from max_position_embeddings).
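
For context, a toy illustration of the failure mode (this is not the actual modeling_llama code, just a sketch with made-up sizes): the cached mask is sized by max_position_embeddings, and slicing it with a longer seq_length silently returns a mask that is too small.

    import torch

    # Toy sketch of the bug (hypothetical sizes; not the real modeling code).
    max_position_embeddings = 8
    causal_mask = torch.tril(
        torch.ones(max_position_embeddings, max_position_embeddings, dtype=torch.bool)
    )

    seq_length = 12  # longer than the cached mask
    sliced = causal_mask[:seq_length, :seq_length]
    print(sliced.shape)  # torch.Size([8, 8]) -- silently truncated, not [12, 12]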

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@ArthurZucker @younesbelkada

@Mihaiii commented Feb 24, 2024

Will this PR fix #29252? (Yi has the same architecture as Llama 2.)

@YLGH (Author) commented Feb 24, 2024

Will this PR fix #29252? (Yi has the same architecture as Llama 2.)

Yeah, I think so (it uses LlamaForCausalLM).

@YLGH (Author) commented Feb 24, 2024

I'm not sure why the unit tests are failing; they seem unrelated to my change.

@ArthurZucker (Collaborator) left a comment


Thanks! I don't think a while loop is the best way to achieve this. However, we do need a proper fix!

Comment on lines +1065 to +1066

    while seq_length > causal_mask.shape[-1]:
        causal_mask = torch.full((2 * causal_mask.shape[-1], 2 * causal_mask.shape[-1]), fill_value=1)
@ArthurZucker (Collaborator) commented Feb 27, 2024

Suggested change

    -while seq_length > causal_mask.shape[-1]:
    -    causal_mask = torch.full((2 * causal_mask.shape[-1], 2 * causal_mask.shape[-1]), fill_value=1)
    +if seq_length > causal_mask.shape[-1]:
    +    new_max_positions = round(seq_length / causal_mask.shape[-1]) * causal_mask.shape[-1]
    +    causal_mask = torch.full((new_max_positions, new_max_positions), fill_value=1)


This should offer a tradeoff between loading too big a mask up front and growing it based on the length of the input.
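
A minimal sketch of the growth strategy being described (an assumption on my part: math.ceil is used here so the grown mask is always large enough, whereas the suggestion above uses round):

    import math
    import torch

    # Minimal sketch: grow the cached mask only when the input exceeds it,
    # rounding up to a multiple of the current cache size.
    def grow_causal_mask(causal_mask: torch.Tensor, seq_length: int) -> torch.Tensor:
        if seq_length > causal_mask.shape[-1]:
            new_max_positions = math.ceil(seq_length / causal_mask.shape[-1]) * causal_mask.shape[-1]
            causal_mask = torch.full((new_max_positions, new_max_positions), fill_value=1)
        return causal_mask

    mask = torch.full((4096, 4096), fill_value=1)
    print(grow_causal_mask(mask, 9000).shape)  # torch.Size([12288, 12288])
    print(grow_causal_mask(mask, 3000).shape)  # unchanged: torch.Size([4096, 4096])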

@BlackSamorez (Contributor) commented

It would also allow one to run miqu-1-70b on an RTX 3090 in 2 bits. The model itself is ~19 GB, but those masks are too large at the 32k max context length (around a GB each, and sometimes copied), which prevents one from running the model.
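
As a back-of-envelope check of that size claim (assuming one byte per mask entry, e.g. a bool/uint8 mask; a float16 mask would be twice this, float32 four times):

    # Back-of-envelope mask size at 32k context (assumes 1 byte per entry).
    max_position_embeddings = 32_768
    bytes_per_entry = 1
    size_gib = max_position_embeddings ** 2 * bytes_per_entry / 1024 ** 3
    print(f"{size_gib:.1f} GiB")  # 1.0 GiB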

@ArthurZucker (Collaborator) commented

Fixed by #29753!

@github-actions (Contributor) commented

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions bot closed this Apr 30, 2024