
Add Solar-Open Model#43244

Merged
vasqu merged 48 commits into huggingface:main from
oesni:solar-open-100b
Jan 21, 2026

Conversation

@oesni
Contributor

@oesni oesni commented Jan 13, 2026

What does this PR do?

Implements the Solar-Open model.
Solar-Open is the open-weights mixture-of-experts (MoE) Solar LLM created by Upstage.
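For readers unfamiliar with MoE decoder layers, the core idea is that a router scores all experts per token and only the top-k experts are run, with their outputs combined by renormalized router weights. A minimal, generic sketch of top-k routing (illustrative names only; this is not taken from the Solar-Open implementation):

```python
# Illustrative top-k expert routing, as used in MoE decoder layers.
# Generic names; not the actual Solar-Open code.
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def route_top_k(router_logits, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    total = sum(probs[i] for i in top)
    return [(i, probs[i] / total) for i in top]

# Route one token across 4 experts, keeping the top 2.
experts, weights = zip(*route_top_k([0.1, 2.0, -1.0, 1.5], k=2))
```

The selected experts' MLP outputs would then be summed with these weights; everything else in the decoder layer is a standard attention block.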

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@vasqu
Contributor

vasqu commented Jan 13, 2026

@oesni you can ping me when you think it's ready for review (assuming it's not yet because it's a draft)

@oesni oesni changed the title [WIP] Add Solar-Open Model Add Solar-Open Model Jan 14, 2026
@oesni oesni marked this pull request as ready for review January 14, 2026 08:42
@oesni
Contributor Author

oesni commented Jan 14, 2026

It's ready for review! @vasqu
But after the rebase, it seems some tests fail.
I'll work on it.

@oesni
Contributor Author

oesni commented Jan 14, 2026

I wonder if it's okay to add the SolarOpenConfig class to OBJECTS_TO_IGNORE in utils/check_docstrings.py, since the auto-generated config class fails the docstring check.
There is a comment on OBJECTS_TO_IGNORE saying "Do not add anything here ..."

Contributor

@vasqu vasqu left a comment


Looks already super good, my main points are mostly related to making the config more aligned with the current way we handle rope + tests to add a small dummy model for us - 100B is sadly too heavy for our CI 😢

Comment thread docs/source/en/model_doc/solar_open.md Outdated
*This model was released on 2025-12-31 and added to Hugging Face Transformers on 2026-01-13.*
Contributor


Just as reminder to keep track of this when we merge

Contributor


It's now enforced on our CI; it will need make fix-repo, but that happens automatically then

Comment thread docs/source/en/model_doc/solar_open.md Outdated
Comment thread docs/source/en/model_doc/solar_open.md
Comment thread src/transformers/conversion_mapping.py
Comment thread src/transformers/models/auto/configuration_auto.py
Comment thread src/transformers/models/solar_open/modular_solar_open.py Outdated
Comment thread src/transformers/models/solar_open/modular_solar_open.py Outdated
Comment thread src/transformers/models/solar_open/modular_solar_open.py
Comment thread tests/models/solar_open/test_modeling_solar_open.py Outdated
Comment thread tests/models/solar_open/test_modeling_solar_open.py Outdated
@vasqu
Contributor

vasqu commented Jan 14, 2026

I wonder if it's okay to add the SolarOpenConfig class to OBJECTS_TO_IGNORE in utils/check_docstrings.py, since the auto-generated config class fails the docstring check.
There is a comment on OBJECTS_TO_IGNORE saying "Do not add anything here ..."

Just checked why it failed; we should not add it there. You can run make fix-repo and you will see that it complains because the config has a wrong default for rope_parameters. But since we will change it either way (we should default to None, i.e. non-mutable args), let's wait. We can take a look afterwards, or I can just quickly fix that, no worries.
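For context, the non-mutable-default concern is the standard Python pitfall: a dict default in a signature is evaluated once and shared across all calls, so mutating one instance's dict leaks into the next. A minimal illustration with generic names (this is not the actual SolarOpenConfig signature):

```python
# Pitfall: a mutable default is evaluated once and shared by every call.
def make_rope_config_bad(rope_parameters={"rope_type": "default"}):
    return rope_parameters

a = make_rope_config_bad()
a["rope_type"] = "yarn"
b = make_rope_config_bad()   # b sees the mutation made through a

# Fix: default to None and build the dict inside the function body.
def make_rope_config_good(rope_parameters=None):
    if rope_parameters is None:
        rope_parameters = {"rope_type": "default"}
    return rope_parameters

c = make_rope_config_good()
c["rope_type"] = "yarn"
d = make_rope_config_good()  # d is a fresh, unmutated dict
```

Defaulting to None and constructing the dict per call is the pattern being suggested for the config's rope_parameters argument.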

Contributor

@vasqu vasqu left a comment


Super small nits: let's move the test(s) under the causal LM tester, mostly the one test that checks whether partial_rotary_factor has the correct default
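A default-value check of the kind suggested here can be sketched as follows, using a stand-in config class with made-up values (the real test would use the actual SolarOpenConfig and its documented defaults):

```python
# Sketch of asserting a config default; DummyConfig and its values are
# stand-ins, not the real Solar-Open configuration.
class DummyConfig:
    def __init__(self, hidden_size=64, num_attention_heads=4,
                 partial_rotary_factor=0.5):
        self.hidden_size = hidden_size
        self.num_attention_heads = num_attention_heads
        self.partial_rotary_factor = partial_rotary_factor

    @property
    def rotary_dim(self):
        # Only this fraction of each head's dimensions gets rotary embeddings.
        head_dim = self.hidden_size // self.num_attention_heads
        return int(head_dim * self.partial_rotary_factor)

config = DummyConfig()
assert config.partial_rotary_factor == 0.5  # the default under test
assert config.rotary_dim == 8               # head_dim 16 * factor 0.5
```

Pinning the default in a test like this catches silent regressions when the config class is later refactored or re-inherited from another model's config.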

Comment on lines +81 to +84
attention_bias (`bool`, *optional*, defaults to `False`):
Whether to use a bias in the projection layers.
attention_dropout (`float`, *optional*, defaults to 0.0):
The dropout ratio for the attention probabilities.
Contributor


We usually only support extra branches / features when they are actually used within a model

Comment thread tests/models/solar_open/test_modeling_solar_open.py Outdated
Comment thread tests/models/solar_open/test_modeling_solar_open.py Outdated
@vasqu
Contributor

vasqu commented Jan 19, 2026

Yup, don't worry about the CI - it's been a bit flaky these past few days/weeks

@vasqu
Contributor

vasqu commented Jan 21, 2026

run-slow: solar_open

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ["models/solar_open"]
quantizations: []

@github-actions
Contributor

CI Results

Workflow Run ⚙️

✅ No failing test specific to this PR 🎉 !

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, solar_open

@vasqu
Contributor

vasqu commented Jan 21, 2026

run-slow: solar_open

@github-actions
Contributor

This comment contains run-slow, running the specified jobs:

models: ["models/solar_open"]
quantizations: []

@github-actions
Contributor

CI Results

Workflow Run ⚙️

✅ No failing test specific to this PR 🎉 !

@vasqu vasqu enabled auto-merge (squash) January 21, 2026 16:36
@vasqu vasqu disabled auto-merge January 21, 2026 16:38
@vasqu vasqu enabled auto-merge (squash) January 21, 2026 16:39
@vasqu
Contributor

vasqu commented Jan 21, 2026

Merging now 🤗 thanks for the contribution

@vasqu vasqu merged commit 93dd4fb into huggingface:main Jan 21, 2026
26 checks passed
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@oesni
Contributor Author

oesni commented Jan 22, 2026

Thanks for the review! 🤗 @vasqu

SangbumChoi pushed a commit to SangbumChoi/transformers that referenced this pull request Jan 23, 2026
* feat: implement solar-open-100b

* feat: update modeling_solar_open.py

* feat: update solar-open config

* chore: apply style

* feat: remove _tied_weights_keys

* feat: update modeling code

* chore: remove speech_to_text_2 in modeling

* docs: solar_open model

* test: solar open model

* chore: re-convert modular

* fix: remove require_read_token

* Apply suggestion from @vasqu

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* chore: update lincse year -> 2026

* feat: add solar_open to tokenizer mapping

* chore: update license year

* test: remove _torch_compile_train_cls

* docs: update solar_open doc

* refactor: simplify SolarOpenDecoderLayer

* refactor: inherit Glm4MoeConfig class

* fix: handle head_dim properly

* chore: apply style

* fix: default parameters

* test: use tiny dummy model

* update expectations and switch to eager moe (no fluctuations per grouped_mm / batched_mm)

* chore: remove trust_remote_code (suggestion from @vasqu)

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* Update src/transformers/models/solar_open/modular_solar_open.py

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>

* chore: update config docstring

* chore: add partial_rotary_factor workaround comment

* test: check default config values in test_modeling_solar_open.py

* fix: config class interface

* docs: add SolarOpen to doctree

* docs: update dates

* Revert "feat: add solar_open to tokenizer mapping"

This reverts commit 038b1c1.

* feat: remove unnecessary configs

* test: update SolarOpenConfig tests

* fix: attention_dropout issue on training

* Revert "feat: remove unnecessary configs"

This reverts commit 9023688.

* Revert "fix: attention_dropout issue on training"

This reverts commit 3c275dc.

* Revert "Revert "feat: remove unnecessary configs""

This reverts commit e6adcd9.

* Revert "Revert "fix: attention_dropout issue on training""

This reverts commit 573fa9a.

* feat: inherit attention from Llama

* fix: remove del for attention_bias and attention_dropout

* chore: convert solar_open

* fix date

---------

Co-authored-by: Anton Vlasjuk <73884904+vasqu@users.noreply.github.com>
Co-authored-by: vasqu <antonprogamer@gmail.com>
