Reproducing the results for the full-parameter models

Hi,
I am trying to reproduce the results in Table 2, specifically the "Full-Rank" row for the 60M-parameter model.

<img width="1086" height="491" alt="Image" src="https://github.com/user-attachments/assets/273e91ac-e0a7-4754-997d-00c4ca4927ff" />

I am running `scripts/llm_pretrain/sltrain60m.sh` with `--peft_model` changed from "sltrain" to "full". After 11,000 steps, I get this:

"Eval loss and perplexity at step 11001: **3.4039955139160156**, **30.084061520394627**"

The perplexity reported in Table 2, however, is **34.06**.
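
To put the gap in terms of loss, here is a minimal sanity-check sketch. It assumes the script reports perplexity as the exponential of the mean eval loss in nats, which is an assumption, though the logged pair above is consistent with it. The variable names are illustrative and do not come from the repository.

```python
import math

# Values copied from the eval log at step 11001 and from Table 2.
eval_loss = 3.4039955139160156    # logged eval loss
logged_ppl = 30.084061520394627   # logged perplexity
table_ppl = 34.06                 # Table 2, "Full-Rank", 60M

# Assumption: perplexity = exp(mean cross-entropy loss in nats).
recomputed_ppl = math.exp(eval_loss)

# Eval loss that the Table 2 perplexity would correspond to,
# under the same assumption.
table_loss = math.log(table_ppl)
loss_gap = table_loss - eval_loss

print(f"recomputed perplexity: {recomputed_ppl:.4f}")
print(f"loss implied by Table 2: {table_loss:.4f}")
print(f"gap in eval loss: {loss_gap:.4f}")
```

If that assumption holds, the difference between my run and Table 2 corresponds to a gap in eval loss rather than a difference in how perplexity is computed.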

Could you please advise on how to reproduce the Table 2 results? I greatly appreciate your help!
