Fix(43240): pass kwargs to nn.functional.cross_entropy #43251
jasiecky wants to merge 22 commits into huggingface:main
Conversation
I have a hard time matching the description of this PR to the proposed change. It looks like you want to pass additional kwargs which at the moment are dropped by this wrapper.
Looking at https://docs.pytorch.org/docs/stable/generated/torch.nn.functional.cross_entropy.html, the relevant ones are `weight` and `label_smoothing`; is that what you're trying to pass?
And of course you need a test to support your PR, which would also self-document what you're trying to accomplish.
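For reference, a minimal illustration of those two kwargs on the torch side (the tensors here are placeholders, not from the PR):

```python
import torch
import torch.nn.functional as F

logits = torch.randn(4, 3)                     # (batch, num_classes)
labels = torch.tensor([0, 2, 1, 0])
class_weights = torch.tensor([1.0, 2.0, 0.5])  # hypothetical per-class weights

# Both kwargs go straight to F.cross_entropy; this is what the PR
# wants to make reachable through the wrapper.
loss = F.cross_entropy(logits, labels, weight=class_weights, label_smoothing=0.1)
```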
> ensures consistent loss scaling by controlling the reduction mode when `num_items_in_batch` is provided
This is already done. Look at the first line of the function.
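For context, a simplified sketch of the pre-PR wrapper in `src/transformers/loss/loss_utils.py` (paraphrased, not an exact copy of the file):

```python
import torch.nn as nn

def fixed_cross_entropy(source, target, num_items_in_batch=None, ignore_index=-100, **kwargs):
    # First line: reduction is already controlled by num_items_in_batch
    reduction = "sum" if num_items_in_batch is not None else "mean"
    loss = nn.functional.cross_entropy(source, target, ignore_index=ignore_index, reduction=reduction)
    if reduction == "sum":
        # Dividing by the true token count keeps loss scaling consistent
        # across gradient-accumulation steps
        loss = loss / num_items_in_batch
    return loss
```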
Thank you for adding the tests. I still don't understand what you're trying to solve with this PR. Regarding your PR description: the 2nd part is invalid, the pre-PR code already does that. For the first part, see `src/transformers/trainer.py` lines 708 to 709 at commit 0a5d574.
Thinking more about it and stepping away from this particular PR, one of the remaining issues in the HF Transformers API is that some keys in `**kwargs` get silently dropped instead of being surfaced to the caller.

So what I suggest is that if you don't have a particular problem to solve in this PR, it can be made useful by asserting when unexpected `**kwargs` keys are passed.

The decision of whether to go that route belongs to the maintainers.
The problem to be solved is issue #43240. The thing is that currently we are not able to pass kwargs through to `nn.functional.cross_entropy`, so using `weight` and `label_smoothing` is impossible. If you think it would be a better solution, I can change the kwargs to these two explicit parameters and pass them into the mentioned function.
That's helpful. Then that should be the description of the PR. And you have a competitor here: #43254
I updated the code and the description, ready for review.
```diff
-    **kwargs,
+    weight: torch.Tensor | None = None,
+    label_smoothing: float = 0.0,
+    **_kwargs,
```
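Read in context, the updated signature presumably forwards the new parameters like this (my reading of the diff, not the exact merged code):

```python
import torch
import torch.nn.functional as F

def fixed_cross_entropy(
    source: torch.Tensor,
    target: torch.Tensor,
    num_items_in_batch: int | None = None,
    ignore_index: int = -100,
    weight: torch.Tensor | None = None,
    label_smoothing: float = 0.0,
    **_kwargs,  # still swallowed silently, which is the point of contention below
) -> torch.Tensor:
    reduction = "sum" if num_items_in_batch is not None else "mean"
    loss = F.cross_entropy(
        source, target,
        weight=weight,
        ignore_index=ignore_index,
        reduction=reduction,
        label_smoothing=label_smoothing,
    )
    if reduction == "sum":
        loss = loss / num_items_in_batch
    return loss
```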
huh? `_`?
I'd say remove it altogether, since it's being silently ignored and that's bad for the caller.
Do you mean to remove the kwargs? You accepted the code containing them ;) If we don't keep them, the function isn't compatible with some parts of the repo, so I changed it to `_kwargs` in order to explicitly show that the kwargs are ignored.
I have no idea how renaming to `_kwargs` implies that it is ignored. When something is ignored it shouldn't be there.
As I shared earlier, my opinion is that if `**kwargs` is in the API, the keys should be introspected and any unexpected ones should be asserted on. `**kwargs` is useful when a function is an intermediary and passes them on. In this case the kwargs aren't passed on and thus shouldn't be there.
> You accepted the code containing them ;)
I'm not a current maintainer so my vote isn't binding. You want to engage the current maintainers instead.
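A minimal sketch of the introspect-and-assert pattern suggested above (hypothetical, not code from the repo):

```python
def fixed_cross_entropy(source, target, num_items_in_batch=None, ignore_index=-100, **kwargs):
    if kwargs:
        # Fail loudly instead of silently dropping caller-supplied options
        raise TypeError(f"fixed_cross_entropy() got unexpected keyword arguments: {sorted(kwargs)}")
    ...
```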
It's a naming convention, it doesn't imply anything by itself indeed. Let's wait for the maintainers ;)
View the CircleCI Test Summary for this PR: https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=43251&sha=6e287b |
iamsernine left a comment:
didn't know why I'm being mentioned here, but lgtm
What does this PR do?
The problem to be solved is issue #43240. This PR enables passing the `weight` and `label_smoothing` parameters of `nn.functional.cross_entropy` through the `fixed_cross_entropy` function.
Fixes #43240
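A usage sketch of what the change enables (illustrative; the import path and tensor shapes are assumptions):

```python
import torch
from transformers.loss.loss_utils import fixed_cross_entropy  # path assumed

logits = torch.randn(8, 10)         # (batch, num_classes)
labels = torch.randint(0, 10, (8,))
class_weights = torch.rand(10)      # hypothetical per-class weights

# Pre-PR these two kwargs were silently dropped; now they reach
# nn.functional.cross_entropy.
loss = fixed_cross_entropy(
    logits, labels,
    weight=class_weights,
    label_smoothing=0.1,
)
```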
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@iamsernine @ArthurZucker @stas00 @cyyever