Merged
2 changes: 1 addition & 1 deletion README.md
@@ -66,7 +66,7 @@ The default SFT experiment is configured to run on a single GPU. To launch the e
uv run python examples/run_sft.py
```

-This trains `Llama3.2-1B` on one GPU using SQUAD dataset.
+This trains `Llama3.2-1B` on one GPU using the SQUAD dataset.

If you have access to more GPUs, you can update the experiment accordingly. To run on 8 GPUs, we update the cluster configuration. We also switch to an 8B Llama base model and increase the batch size:

2 changes: 1 addition & 1 deletion examples/configs/sft.yaml
@@ -37,7 +37,7 @@ data:

logger:
  log_dir: "logs" # Base directory for all logs
-  wandb_enabled: true # Make sure you do ``wandb login [Your API key]'' before run
+  wandb_enabled: true # Make sure you do a ``wandb login [Your API key]'' before running
  tensorboard_enabled: true
  monitor_gpus: false # If true, will monitor GPU usage and log to wandb and/or tensorboard
  wandb: