
[_unmask_unattended] Refactor #29356

Closed

ArthurZucker wants to merge 3 commits into main from nit-refactor

Conversation

Collaborator

@ArthurZucker ArthurZucker commented Feb 29, 2024

What does this PR do?

Follow-up to #29318: if we keep the AttentionMaskConverter, then let's put the checks in _unmask_unattended, since this function is bound to disappear once the bug is fixed in torch.

@fxmarty let's make this easier to refactor later on, and more readable!

@ArthurZucker ArthurZucker marked this pull request as ready for review February 29, 2024 00:12
@ArthurZucker ArthurZucker requested a review from fxmarty February 29, 2024 00:12
Comment on lines +240 to +241
is_tracing = torch.jit.is_tracing() or (hasattr(torch, "_dynamo") and torch._dynamo.is_compiling())
is_tracing |= isinstance(input_tensor, torch.fx.Proxy)
Contributor


You can leave this as a one-liner. I believe that in Python, a or b never evaluates b if a is already True (short-circuit evaluation).
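
For illustration, a minimal sketch of the combined short-circuiting check (the helper name _compute_is_tracing is hypothetical, not from this PR):

import torch

def _compute_is_tracing(input_tensor: torch.Tensor) -> bool:
    # `or` short-circuits: each later check only runs if the earlier
    # ones were False, so collapsing this into one expression is safe.
    return (
        torch.jit.is_tracing()
        or isinstance(input_tensor, torch.fx.Proxy)
        or (hasattr(torch, "_dynamo") and torch._dynamo.is_compiling())
    )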

Comment on lines +240 to +241
is_tracing = torch.jit.is_tracing() or (hasattr(torch, "_dynamo") and torch._dynamo.is_compiling())
is_tracing |= isinstance(input_tensor, torch.fx.Proxy)
Contributor


Can you pass is_tracing as an Optional[bool]? This variable is already computed in the _prepare_xxx methods, so compute it here only if it is None.
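
A sketch of the suggested signature, assuming callers forward the flag they already computed (names mirror the diff above; the body is elided):

from typing import Optional

import torch

def _unmask_unattended(
    expanded_mask: torch.FloatTensor,
    attention_mask: torch.FloatTensor,
    input_tensor: torch.FloatTensor,
    is_tracing: Optional[bool] = None,
):
    if is_tracing is None:
        # Recompute only when the caller (e.g. a _prepare_xxx method)
        # did not pass the flag along.
        is_tracing = (
            torch.jit.is_tracing()
            or isinstance(input_tensor, torch.fx.Proxy)
            or (hasattr(torch, "_dynamo") and torch._dynamo.is_compiling())
        )
    ...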

def _unmask_unattended(
expanded_mask: torch.FloatTensor,
attention_mask: torch.FloatTensor,
input_tensor: torch.FloatTensor,
Contributor

@fxmarty fxmarty Feb 29, 2024


I don't think we need input_tensor, do we? The FX check should be doable on attention_mask.

Collaborator Author


But sometimes the attention mask is None even when tracing, no?

Contributor

@fxmarty fxmarty Feb 29, 2024


I meant expanded_mask. See the type hint.

This method is never intended to be called with a None input; if you want to change that, you could just return early and edit the type hint.
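
A hypothetical sketch of the direction discussed in this thread, with input_tensor dropped and the FX check done on expanded_mask; the early return for None is the optional change mentioned above, not the merged code:

from typing import Optional

import torch

def _unmask_unattended(
    expanded_mask: Optional[torch.FloatTensor],
    attention_mask: torch.FloatTensor,
):
    # Optional change: tolerate a None input by returning early and
    # widening the type hint to Optional accordingly.
    if expanded_mask is None:
        return None
    # The proxy check runs on expanded_mask itself, so the separate
    # input_tensor argument is no longer needed.
    is_tracing = (
        torch.jit.is_tracing()
        or isinstance(expanded_mask, torch.fx.Proxy)
        or (hasattr(torch, "_dynamo") and torch._dynamo.is_compiling())
    )
    ...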

Collaborator Author


alright

@huggingface huggingface deleted a comment from github-actions Bot Mar 30, 2024
@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

