
[Trainer] use output.loss when using liger-kernel #42444

Merged
SunMarc merged 6 commits into main from issue-42414 on Nov 28, 2025

Conversation

@kashif (Contributor) commented Nov 27, 2025

What does this PR do?

Handle loss computation for models using Liger-kernel.

Fixes #42414
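The gist of the change can be sketched as follows. This is a minimal illustration, not the actual trainer.py diff: the `select_loss` helper, its parameters, and the fallback path are simplified assumptions. The underlying idea is that Liger's fused linear + cross-entropy kernel computes the loss inside the model forward (and may not even materialize full logits), so the Trainer should trust `outputs.loss` rather than recomputing the loss from logits.

```python
# Hypothetical sketch of the loss-selection logic, not the real Trainer code.
def select_loss(outputs, labels, use_liger_kernel, compute_loss_func=None):
    if use_liger_kernel and getattr(outputs, "loss", None) is not None:
        # Liger's fused kernel already produced the loss inside the
        # model forward, so use it directly.
        return outputs.loss
    if compute_loss_func is not None:
        # Otherwise fall back to an externally supplied loss function.
        return compute_loss_func(outputs, labels)
    return outputs.loss
```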

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

Handle loss computation for models using Liger-kernel.
fixes #42414
@kashif kashif requested a review from Rocketknight1 November 27, 2025 08:52
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@SunMarc (Member) left a comment

Thanks, I left a few ideas to fix this!

Comment thread on src/transformers/trainer.py (outdated)
@kashif (Contributor, Author) commented Nov 27, 2025

@SunMarc I'll update the accelerate docs based on this change when you approve.

@SunMarc (Member) left a comment

Much cleaner, thanks!

@SunMarc SunMarc merged commit 6db4332 into main Nov 28, 2025
25 checks passed
@SunMarc SunMarc deleted the issue-42414 branch November 28, 2025 11:00
sarathc-cerebras pushed a commit to sarathc-cerebras/transformers that referenced this pull request Dec 7, 2025
* use output.loss when using liger

Handle loss computation for models using Liger-kernel.
fixes huggingface#42414

* Clarify Liger-kernel loss computation in comments

* Both standard transformers and Liger models handle shift_labels correctly via **kwargs

* removed unused shift_labels reference in loss computation

* Remove unused model unwrapping
@zhangwj618

With the latest code and without liger kernel, I ran into an error caused by outputs.loss being None, as models (e.g. Qwen3ForCausalLM) won't calculate loss if labels is None. @kashif @SunMarc

@SunMarc (Member) commented Dec 10, 2025

> With the latest code and without liger kernel, I ran into an error caused by outputs.loss being None, as models (e.g. Qwen3ForCausalLM) won't calculate loss if labels is None. @kashif @SunMarc

How come there are no labels?

@zhangwj618

@SunMarc UlyssesSPDataLoaderAdapter removed labels from the inputs while inserting shift_labels.
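The failure mode described above can be illustrated with a minimal sketch. Both functions below are simplified stand-ins for illustration only, not the actual transformers or adapter code: `model_forward` mimics a causal LM (like Qwen3ForCausalLM) that only computes a loss when `labels` is present, and `ulysses_adapt` mimics the reported adapter behavior of popping `labels` and supplying pre-shifted `shift_labels` instead.

```python
def model_forward(inputs):
    # Stand-in for a causal LM forward: the loss is only computed
    # when `labels` is in the inputs.
    labels = inputs.get("labels")
    loss = 0.0 if labels is not None else None  # 0.0 stands in for a real CE loss
    return {"loss": loss}

def ulysses_adapt(inputs):
    # Sketch of the reported sequence-parallel adapter behavior:
    # it pops `labels` and provides pre-shifted `shift_labels`.
    inputs = dict(inputs)
    labels = inputs.pop("labels")
    inputs["shift_labels"] = labels[1:]
    return inputs

batch = {"input_ids": [1, 2, 3], "labels": [1, 2, 3]}
out = model_forward(ulysses_adapt(batch))
# out["loss"] is None here, so any code that unconditionally reads
# outputs.loss (e.g. to call loss.backward()) fails.
```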

@SunMarc (Member) commented Dec 11, 2025

Indeed @zhangwj618... thanks for spotting this! I will open a PR with a quick fix.

SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Jan 23, 2026

Development

Successfully merging this pull request may close these issues.

use_liger_kernel is not compatible with sequence parallel

4 participants