
Adding support for Nandi Models #45101

Open

HemanthSai7 wants to merge 9 commits into huggingface:main from HemanthSai7:nandi_v1

Conversation

@HemanthSai7 commented Mar 29, 2026

This PR adds support for the upcoming Nandi series of models. We also appreciate the valuable feedback and thorough review provided by @vasqu and @ArthurZucker 🤗🙏

Co-authored-by: Vishesht27 <vishesht27@gmail.com>
@HemanthSai7 (Author) commented Apr 1, 2026

We are the team at RTA AI Labs, a tiny but passionate startup dedicated to making high-performance language models more accessible through efficient architecture. Today, we are excited (and a little nervous!) to submit this PR to add support for Nandi, our custom "smol" model series.

As a very small team, we have poured our hearts, late nights, and limited resources into building Nandi. We believe that the future of AI belongs to efficient, edge-compatible models, and we’ve designed Nandi to punch significantly above its weight class in terms of reasoning and throughput.

Bringing Nandi to the Hugging Face ecosystem is a massive milestone for us. It is the "make or break" step for our upcoming release, as it will allow the community to easily fine-tune, deploy, and experiment with what we've built.

Why this PR matters

  • Architecture: combines factorized embeddings, grouped-query attention (GQA), RoPE, and cross-layer parameter sharing for high efficiency (see the sketch after this list).
  • Community Impact: Enables developers to build and deploy capable models without large-scale compute.
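
To make the architecture bullet concrete, here is a minimal, self-contained sketch of the factorized-embedding and layer-sharing ideas. Every name and shape below is an illustrative assumption for this description, not the actual Nandi implementation:

import torch
import torch.nn as nn

class FactorizedEmbedding(nn.Module):
    # Factorizes the vocab_size x hidden_size embedding matrix into
    # vocab_size x embed_dim and embed_dim x hidden_size pieces
    # (ALBERT-style), cutting parameters whenever embed_dim << hidden_size.
    def __init__(self, vocab_size: int, embed_dim: int, hidden_size: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.up_proj = nn.Linear(embed_dim, hidden_size, bias=False)

    def forward(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.up_proj(self.embed(input_ids))

class SharedLayerStack(nn.Module):
    # Cross-layer parameter sharing: one decoder layer's weights are reused
    # for several depth steps, trading parameter count for repeated compute.
    def __init__(self, layer: nn.Module, num_repeats: int):
        super().__init__()
        self.layer = layer
        self.num_repeats = num_repeats

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        for _ in range(self.num_repeats):
            hidden_states = self.layer(hidden_states)
        return hidden_states

GQA and RoPE would live inside the attention module itself, typically configured the same way as in the Llama family (e.g. a num_key_value_heads field smaller than num_attention_heads).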

A small note to the maintainers
We know how incredibly busy the transformers maintainers are, and we have the utmost respect for the work you do in keeping this ecosystem thriving. For a small lab like ours, getting this PR reviewed and merged is more than just a technical update; it is the foundation of our startup’s mission. We have done our best to follow the contribution guidelines strictly to make the review process as smooth as possible for you, and we are standing by to make any requested changes immediately.

@Rocketknight1 @vasqu @ArthurZucker @xenova @zucchini-nlp 🤗

@vasqu (Contributor) commented Apr 2, 2026

Heya, super excited to see this 👋

Just as a heads up, we have holidays + the torch conference, so reviews will be delayed for at least a week-ish. Not sure if I will be the one reviewing or someone else, but as a first step it would be best to fully utilize modular: https://huggingface.co/docs/transformers/v5.5.0/en/modular_transformers#implementing-a-modular-file

Appreciate the work, and don't hesitate to ping us 😄

@HemanthSai7 (Author):

Hi @vasqu, thanks for the heads up! I’ve updated the PR to fully utilise the modular transformer format as suggested in the documentation. Looking forward to your feedback whenever you're back.

@xenova (Contributor) commented Apr 14, 2026

Love seeing smaller models! Just an FYI before the main reviewers get to this... your usage of modular is not correct, as I see many classes which are mostly duplicates of existing implementations. Take a look at how a modular file like https://github.com/huggingface/transformers/blob/def5e6864fe4f2bbd7f056f37366f4dd0d693097/src/transformers/models/apertus/modular_apertus.py is laid out.

e.g.,

class ApertusRMSNorm(LlamaRMSNorm):
    pass


class ApertusRotaryEmbedding(LlamaRotaryEmbedding):
    pass

are valid usages of modular.
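
Applied to this PR, the analogous declarations would presumably be plain pass-through subclasses; this is a sketch inferred from the review comments below, not code taken from the PR itself:

class NandiRMSNorm(LlamaRMSNorm):
    pass


class NandiRotaryEmbedding(LlamaRotaryEmbedding):
    pass

The modular converter then expands these into full standalone classes in the generated modeling file, so no behavior is lost by deleting the duplicated code.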

config.num_attention_heads * self.head_dim, config.hidden_size, bias=config.attention_bias
)

@deprecate_kwarg("past_key_value", new_name="past_key_values", version="4.58")
@xenova (Contributor) commented Apr 14, 2026:

no need to deprecate here. I don't think v5 should have these decorators anymore :)
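
Concretely, the suggested change is to drop the shim entirely. A hypothetical before/after of the forward signature (argument lists abbreviated):

# before (as in the PR): decorator remaps the old kwarg to the new name
@deprecate_kwarg("past_key_value", new_name="past_key_values", version="4.58")
def forward(self, hidden_states, past_key_values=None, **kwargs): ...

# after (suggested): accept only the new name, with no deprecation shim
def forward(self, hidden_states, past_key_values=None, **kwargs): ...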

return f"{tuple(self.weight.shape)}, eps={self.variance_epsilon}"


class NandiRotaryEmbedding(nn.Module):
@xenova (Contributor):

identical to normal LlamaRotaryEmbedding afaict.

config.num_attention_heads * self.head_dim, config.hidden_size, bias=config.attention_bias
)

@deprecate_kwarg("past_key_value", new_name="past_key_values", version="4.58")
@xenova (Contributor):

same

self.input_layernorm = NandiRMSNorm(config.hidden_size, eps=config.rms_norm_eps)
self.post_attention_layernorm = NandiRMSNorm(config.hidden_size, eps=config.rms_norm_eps)

@deprecate_kwarg("past_key_value", new_name="past_key_values", version="4.58")
@xenova (Contributor):

same

@github-actions:

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, nandi

@HemanthSai7 (Author):

Hi @xenova, I’ve inherited the Llama modules wherever necessary and followed the modular structure closely. I also removed the deprecate_kwarg decorator. Please let me know if there are any other changes you’d like me to make.

@github-actions:

View the CircleCI Test Summary for this PR:

https://huggingface.co/spaces/transformers-community/circle-ci-viz?pr=45101&sha=b384e1

@Vishesht27:

Hi @xenova, we have made all the changes from our side. Could you review and give us some feedback?
