[Proposal] support simplestories models #950
Open
Labels
complexity-moderate: Moderately complicated issues for people who have intermediate experience with the code
model-request: Any issues related to requesting additional model support
Proposal
Support the SimpleStories family of models.
Motivation
The TinyStories models already included in TransformerLens are incredibly useful, both as objects of study in their own right and for debugging research code in low-resource environments before investigating larger models. The SimpleStories models improve on the TinyStories models and are built on a much more diverse dataset.
Links
models on HF: https://huggingface.co/SimpleStories
training repo: https://github.com/danbraunai/simple_stories_train
dataset paper: https://arxiv.org/pdf/2504.09184
Checklist