Raise clear error for problem_type="single_label_classification" with num_labels=1 #45611
Merged
Rocketknight1 merged 4 commits into huggingface:main on Apr 24, 2026
… num_labels=1
This combination is mathematically degenerate: applying cross-entropy loss to a single logit always yields zero loss, so training silently accomplishes nothing. Validate the combination in PreTrainedConfig.__post_init__ so users get a clear error at config construction with a pointer to the correct setup (num_labels=2 for binary classification, or problem_type="regression" for a single-output regression head).
Closes huggingface#45479
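The zero-loss claim can be checked without any model. The following is a minimal pure-Python sketch, not the PR's reproducer: `cross_entropy` is a hypothetical helper that computes -log(softmax(logits)[target]) for one example. With a single logit the softmax is exactly 1, so the loss is 0 for any logit value, and the gradient softmax(z) - onehot is 0 as well.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one example: -log(softmax(logits)[target])."""
    # Subtract the max before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum_exp - logits[target]


# One logit: the loss is zero regardless of the logit's value.
single = [cross_entropy([z], 0) for z in (-5.0, 0.0, 3.7)]

# Two logits: the loss depends on the logits, so there is a signal to learn from.
double = [cross_entropy([z, 0.0], 0) for z in (-5.0, 0.0, 3.7)]

print(single)
print(double)
```

The two-logit case is what num_labels=2 corresponds to: the loss shrinks as the target logit grows relative to the other one.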
Rocketknight1
approved these changes
Apr 24, 2026
Member
Rocketknight1
left a comment
Looks good, but the code agent was a little verbose! I'll trim it down
Rocketknight1
approved these changes
Apr 24, 2026
Contributor
Author
Hi @Rocketknight1
Member
@gaurav0107 probably just an intermittent CI error, I'll throw it into the queue again
What
Adds validation in PreTrainedConfig.__post_init__ that rejects the combination problem_type="single_label_classification" + num_labels=1 with a clear ValueError pointing users to the correct setup.
Closes #45479
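As a rough illustration of this kind of check, here is a minimal sketch on a stand-in dataclass. `ToyConfig` is hypothetical and is not the transformers class; the resolution of num_labels from id2label and the error wording are assumptions made for the example, not the merged code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyConfig:
    """Hypothetical stand-in for a model config (not PreTrainedConfig)."""

    num_labels: int = 2
    problem_type: Optional[str] = None
    id2label: Optional[dict] = None

    def __post_init__(self):
        # Assumption: an explicit id2label mapping determines num_labels.
        if self.id2label is not None:
            self.num_labels = len(self.id2label)
        # Validate only after num_labels has been resolved.
        if self.problem_type == "single_label_classification" and self.num_labels == 1:
            raise ValueError(
                'problem_type="single_label_classification" with num_labels=1 '
                "always yields zero loss. Use num_labels=2 for binary "
                'classification, or problem_type="regression" for a single output.'
            )


# Valid shapes construct normally.
ToyConfig(num_labels=2, problem_type="single_label_classification")
ToyConfig(num_labels=1, problem_type="regression")

# The degenerate combination fails at construction time.
try:
    ToyConfig(num_labels=1, problem_type="single_label_classification")
except ValueError as err:
    print(err)
```

Failing at construction means the user sees the problem before any training step runs, rather than after noticing a flat loss curve.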
Why
Before this change, constructing a sequence-classification model with this combination silently produced a degenerate zero cross-entropy loss (softmax over a single logit is always 1, so CrossEntropyLoss always returns 0). Training appeared to run but accomplished nothing.
Reproducer (confirmed on main, both on shared-loss models like BERT and inlined-loss models like ModernBERT):
Fix
PreTrainedConfig.__post_init__ now raises ValueError on this combination after num_labels is resolved from id2label. The error message redirects the user to num_labels=2 (binary classification) or problem_type="regression" (single-output regression).
The check sits at the config layer so it covers every sequence-classification model uniformly: both models that delegate to ForSequenceClassificationLoss in src/transformers/loss/loss_utils.py and models that inline the loss selection (e.g. ModernBERT, ~40 others).
Coordination / approach
This matches the maintainer direction in this comment on #45479:
Tests
New regression test tests/utils/test_configuration_utils.py::ConfigTestUtils::test_single_label_classification_requires_more_than_one_label asserts the degenerate combination raises, and verifies the three valid shapes (num_labels=2 + single_label, num_labels=1 + regression, num_labels=1 with no explicit problem_type) still construct.
Verified that the test FAILS on main without the fix and PASSES with it.
Ran locally:
All pass.
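For readers without the repository checked out, the shape of such a regression test might look like the sketch below. It is not the test added by the PR: `StubConfig` is a hypothetical plain class that validates only at construction, which also shows why assigning attributes on an existing object does not trigger the check.

```python
import unittest


class StubConfig:
    """Hypothetical config stub that validates only at construction."""

    def __init__(self, num_labels=2, problem_type=None):
        if problem_type == "single_label_classification" and num_labels == 1:
            raise ValueError("single_label_classification needs num_labels > 1")
        self.num_labels = num_labels
        self.problem_type = problem_type


class SingleLabelValidationTest(unittest.TestCase):
    def test_degenerate_combination_raises(self):
        with self.assertRaises(ValueError):
            StubConfig(num_labels=1, problem_type="single_label_classification")

    def test_valid_shapes_construct(self):
        valid = (
            {"num_labels": 2, "problem_type": "single_label_classification"},
            {"num_labels": 1, "problem_type": "regression"},
            {"num_labels": 1},  # no explicit problem_type
        )
        for kwargs in valid:
            with self.subTest(**kwargs):
                StubConfig(**kwargs)

    def test_attribute_assignment_bypasses_check(self):
        # Setting attributes after construction never re-runs the validation.
        config = StubConfig()
        config.problem_type = "single_label_classification"
        config.num_labels = 1
        self.assertEqual(config.num_labels, 1)


# Run the suite without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SingleLabelValidationTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```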
ruff check and ruff format --check clean on the two touched files.
The existing common test_problem_types continues to pass: it assigns problem_type and num_labels as attributes on an already-constructed config, which bypasses __post_init__ by design, and the test only asserts loss.backward() doesn't error.
Compatibility
AI disclosure
AI-assisted: the fix was drafted with Claude, then human-reviewed end-to-end including the reproducer, diff, and test outputs listed above. Disclosed per CONTRIBUTING.md "AI-assisted and agentic contributions".