[RoPE] run RoPE tests when the model uses RoPE (#40630)
Conversation
zucchini-nlp left a comment:
Thanks! Also noticed this when working on the RoPE refactoring. I guess we can enable these tests without checking for `rotary_embedding_layer` and instead have a heuristic to check if the model has a layer called `self.rotary_embedding`.
That way we'll be sure the test is run on all models, and if a model is special, it will skip the test. WDYT?
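A minimal sketch of that heuristic, assuming illustrative attribute names and a hypothetical helper (not the final implementation), might look like this:

```python
# Sketch only: attribute names and the helper are illustrative, not the merged code.
POSSIBLE_ROPE_ATTRIBUTES = ("rotary_emb", "rotary_embedding")


def model_uses_rope(model) -> bool:
    # Heuristic: the model uses RoPE if the base model exposes a rotary embedding layer.
    return any(getattr(model, attr, None) is not None for attr in POSSIBLE_ROPE_ATTRIBUTES)


# Inside a RoPE test in the mixin, a non-RoPE ("special") model would then skip:
# if not model_uses_rope(base_model):
#     self.skipTest("Model does not use RoPE")
```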
@Cyrilvallez @zucchini-nlp great PR comments, we can (and should!) automate test runs. The latest commit removes the manual parameterization.
zucchini-nlp left a comment:
Thanks for making the test suite better
```python
# Retrieves the RoPE layer class from the base model class. Assumption: the RoPE layer is under a few
# possible attribute names and is found in the base model class. In some (inconsistent) cases, it may be
# found in the self_attention layer instead.
base_model = self.model_tester.base_model_class(config)
```
We might need to call `model.get_decoder()` as well for models where the LM backbone is hidden inside the base model. Though I guess these tests aren't yet used in multimodal models.
Good point.
Given that the tests are only run on decoder-only models for now, I'd rather leave it as is (and upgrade when it's needed) 🤗
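For reference, a hedged sketch of that possible future upgrade (the `get_decoder()` hook and the list-of-roots approach are assumptions about a later change, not part of this PR):

```python
# Hypothetical extension for multimodal models, where the LM backbone is hidden
# inside the base model: also search the nested decoder for the RoPE layer.
search_roots = [base_model]
if hasattr(base_model, "get_decoder"):
    decoder = base_model.get_decoder()
    if decoder is not None and decoder is not base_model:
        search_roots.append(decoder)
# The attribute lookup below would then run over each root in search_roots.
```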
```python
for rope_attr in possible_rope_attributes:
    rope_class = getattr(base_model, rope_attr, None)  # expected pattern
    if (
        rope_class is None
        and hasattr(base_model, "layers")
        and hasattr(base_model.layers[0], "self_attention")
    ):
        rope_class = getattr(base_model.layers[0].self_attention, rope_attr, None)  # fallback
    if rope_class is not None:
        rope_class = type(rope_class)
        break
```
I think we can make it a bit more general/catch more modules if we do something like

```python
for name, module in model.named_modules():
    if any(potential_name in name for potential_name in possible_rope_attributes):
        rope_class = type(module)
        break
```

-- it would avoid edge cases when layers are not named `layers`, or attention is not `self_attention`.
Added 👍
(and confirmed that it doesn't have a significant negative impact on test runtime)
[For maintainers] Suggested jobs to run (before merge): run-slow: arcee, dbrx, deepseek_v2, ernie4_5, ernie4_5_moe, hunyuan_v1_dense, hunyuan_v1_moe, llama, minimax, mistral, nemotron, phi, phi3, phimoe, recurrent_gemma, stablelm
* enable rope tests
* no manual rope test parameterization
* Apply suggestions from code review
* Update tests/models/hunyuan_v1_dense/test_modeling_hunyuan_v1_dense.py
* PR comment: use generalist torch code to find the rope layer
What does this PR do?
The `CausalLMModelTest` mixin has RoPE tests, but it requires setting `rotary_embedding_layer` in each model tester. In most models, it was unset, so they were not running RoPE-related tests -- exposing ourselves to easily preventable issues like #40461 😱

This PR:
* Removes `rotary_embedding_layer` from the `CausalLMModelTest` mixin.
* Adds a heuristic to find the RoPE layer in `CausalLMModelTest`, and uses it to enable RoPE tests on each model. No more manual errors 🤗
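To illustrate the end result, here is a simplified sketch of a per-model test class after this PR (class and attribute names follow the existing Llama tests, but the snippet is an illustration, not the exact diff):

```python
import unittest

from transformers import LlamaForCausalLM, LlamaModel

# CausalLMModelTest is the shared mixin discussed above.
class LlamaModelTest(CausalLMModelTest, unittest.TestCase):
    all_model_classes = (LlamaModel, LlamaForCausalLM)
    # rotary_embedding_layer = LlamaRotaryEmbedding  # previously required, now removed:
    # the mixin discovers the RoPE layer automatically and runs the RoPE tests.
```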