
Raise clear error for problem_type="single_label_classification" with num_labels=1#45611

Merged
Rocketknight1 merged 4 commits into huggingface:main from gaurav0107:fix-45479-single-label-num-labels-1-error
Apr 24, 2026

Conversation

@gaurav0107
Contributor

What

Adds validation in PreTrainedConfig.__post_init__ that rejects the combination problem_type="single_label_classification" + num_labels=1 with a clear ValueError pointing users to the correct setup.

Closes #45479

Why

Before this change, constructing a sequence-classification model with this combination silently produced a degenerate zero cross-entropy loss (softmax over a single logit is always 1, so CrossEntropyLoss always returns 0). Training appeared to run but accomplished nothing.
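The arithmetic behind the degeneracy can be shown with the standard library alone (the helper name below is illustrative, not from the PR):

```python
import math

# With a single logit z, softmax([z]) = [exp(z) / exp(z)] = [1.0], so the
# per-example cross-entropy -log(softmax(logits)[label]) is -log(1.0) = 0,
# regardless of z or the label value.
def cross_entropy_single_logit(z: float) -> float:
    prob = math.exp(z) / math.exp(z)  # softmax over a one-element vector
    return -math.log(prob)

for z in (-3.0, 0.0, 7.5):
    assert cross_entropy_single_logit(z) == 0.0  # zero loss for every input
```

The gradient of a constant loss is zero everywhere, which is why training appears to run but never updates toward anything.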

Reproducer (confirmed on main, both on shared-loss models like BERT and inlined-loss models like ModernBERT):

```python
import torch
from transformers import BertConfig, BertForSequenceClassification

config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64,
                    num_labels=1, problem_type="single_label_classification")
model = BertForSequenceClassification(config)
model.eval()
out = model(input_ids=torch.tensor([[1, 2, 3, 4]]), labels=torch.tensor([0]))
print(out.loss)  # tensor(0.)
```

Fix

PreTrainedConfig.__post_init__ now raises a ValueError for this combination after num_labels is resolved from id2label. The error message directs users to num_labels=2 (binary classification) or problem_type="regression" (single-output regression).
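A minimal standalone sketch of the check (the exact wording and placement inside PreTrainedConfig.__post_init__ may differ from the merged diff):

```python
from typing import Optional

# Hedged sketch: the real check lives in PreTrainedConfig.__post_init__ and
# runs after num_labels has been resolved from id2label.
def validate_classification_config(problem_type: Optional[str], num_labels: int) -> None:
    if problem_type == "single_label_classification" and num_labels == 1:
        raise ValueError(
            "problem_type='single_label_classification' with num_labels=1 always "
            "produces zero cross-entropy loss. Use num_labels=2 for binary "
            "classification, or problem_type='regression' for a single output."
        )

validate_classification_config("single_label_classification", 2)  # valid, no error
validate_classification_config("regression", 1)                   # valid, no error
```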

The check sits at the config layer so it covers every sequence-classification model uniformly — both models that delegate to ForSequenceClassificationLoss in src/transformers/loss/loss_utils.py and models that inline the loss selection (e.g. ModernBERT, ~40 others).

Coordination / approach

This matches the maintainer direction in this comment on #45479:

"maybe we could raise a clearer error if users set num_labels=1 and problem_type="single_label_classification" to tell them not to do that?"

Tests

New regression test tests/utils/test_configuration_utils.py::ConfigTestUtils::test_single_label_classification_requires_more_than_one_label asserts the degenerate combination raises, and verifies the three valid shapes (num_labels=2 + single_label, num_labels=1 + regression, num_labels=1 with no explicit problem_type) still construct.
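The test's shape can be sketched against a stand-in class (so the sketch runs without the merged fix; the real test constructs actual config objects):

```python
# Stand-in mimicking a config whose constructor validates the combination.
class FakeConfig:
    def __init__(self, num_labels=2, problem_type=None):
        if problem_type == "single_label_classification" and num_labels == 1:
            raise ValueError("single_label_classification requires num_labels > 1")
        self.num_labels = num_labels
        self.problem_type = problem_type

def test_single_label_classification_requires_more_than_one_label():
    # The degenerate combination must raise:
    try:
        FakeConfig(num_labels=1, problem_type="single_label_classification")
    except ValueError:
        pass
    else:
        raise AssertionError("expected ValueError for the degenerate combination")
    # The three valid shapes still construct:
    FakeConfig(num_labels=2, problem_type="single_label_classification")
    FakeConfig(num_labels=1, problem_type="regression")
    FakeConfig(num_labels=1)

test_single_label_classification_requires_more_than_one_label()
```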

Verified that the test FAILS on main without the fix and PASSES with it.

Ran locally:

```shell
pytest tests/utils/test_configuration_utils.py
pytest tests/models/bert/test_modeling_bert.py -k "for_sequence or problem_type"
pytest tests/models/modernbert/test_modeling_modernbert.py -k "problem_type or sequence_classification"
pytest tests/models/roberta/test_modeling_roberta.py::RobertaModelTest::test_problem_types
pytest tests/models/albert/test_modeling_albert.py::AlbertModelTest::test_problem_types
```

All pass. `ruff check` and `ruff format --check` are clean on the two touched files.

The existing common test_problem_types continues to pass: it assigns problem_type and num_labels as attributes on an already-constructed config, which bypasses __post_init__ by design, and the test only asserts loss.backward() doesn't error.
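The bypass is ordinary Python semantics: attribute assignment on an already-constructed object never re-runs constructor-time validation. A stand-in illustration (class name is illustrative):

```python
class ValidatedConfig:
    """Stand-in for a config that validates only at construction time."""
    def __init__(self, num_labels=2, problem_type=None):
        if problem_type == "single_label_classification" and num_labels == 1:
            raise ValueError("degenerate combination")
        self.num_labels = num_labels
        self.problem_type = problem_type

cfg = ValidatedConfig()                           # valid construction, check runs
cfg.num_labels = 1                                # plain attribute writes...
cfg.problem_type = "single_label_classification"  # ...skip the __init__ check
assert cfg.num_labels == 1                        # no error was raised
```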

Compatibility

  • Valid configurations are unaffected.
  • Save/load round-trip of valid configs verified.
  • A saved Hub config containing this exact bad combination will now fail to load with a clear error. Any such checkpoint can never have been trained successfully (loss was always 0), so no working workflow regresses.

AI disclosure

AI-assisted: the fix was drafted with Claude, then human-reviewed end-to-end including the reproducer, diff, and test outputs listed above. Disclosed per CONTRIBUTING.md "AI-assisted and agentic contributions".

… num_labels=1

This combination is mathematically degenerate: applying cross-entropy loss to a
single logit always yields zero loss, so training silently accomplishes nothing.
Validate the combination in PreTrainedConfig.__post_init__ so users get a clear
error at config construction with a pointer to the correct setup (num_labels=2
for binary classification, or problem_type="regression" for a single-output
regression head).

Closes huggingface#45479
@Rocketknight1 (Member) left a comment


Looks good, but the code agent was a little verbose! I'll trim it down

Comment thread src/transformers/configuration_utils.py Outdated
Comment thread tests/utils/test_configuration_utils.py Outdated
Comment thread src/transformers/configuration_utils.py
@Rocketknight1 enabled auto-merge April 24, 2026 11:40
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@Rocketknight1 added this pull request to the merge queue Apr 24, 2026
@github-merge-queue Bot removed this pull request from the merge queue due to failed status checks Apr 24, 2026
@gaurav0107
Contributor Author

gaurav0107 commented Apr 24, 2026

Hi @Rocketknight1,
Would you mind taking a look at the CI results and helping me understand what caused the failure? The tests_processors job is reporting a worker crash (crashed and worker restarting disabled, exit status 1).

@Rocketknight1
Member

@gaurav0107 probably just an intermittent CI error, I'll throw it into the queue again

@Rocketknight1 added this pull request to the merge queue Apr 24, 2026
Merged via the queue into huggingface:main with commit c472755 Apr 24, 2026
28 checks passed


Development

Successfully merging this pull request may close these issues.

problem_type="single_label_classification" with num_labels=1 leads to degenerate zero loss across multiple sequence-classification models

3 participants