init commit #11

Merged
akoumpa merged 70 commits into main from akoumparouli/distributed
May 28, 2025

akoumpa (Contributor) commented May 27, 2025

Adds initial support for LLMs, in particular:

  • hellaswag dataset
  • FSDP2/DDP
  • Finetune LLM recipe
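
For readers unfamiliar with how a HellaSwag record can feed a next-token-prediction finetune, here is a minimal sketch. It is not the code in this PR. It assumes the Hugging Face `hellaswag` field names (`ctx`, `endings`, `label`), uses a hypothetical `toy_tokenize` stand-in instead of a real tokenizer, and assumes the model or loss function shifts labels by one position. The function name `build_sft_example` is also illustrative.

```python
from typing import Callable, Dict, List

# Label value that PyTorch's cross-entropy loss ignores by default.
IGNORE_INDEX = -100


def build_sft_example(
    record: Dict,
    tokenize: Callable[[str], List[int]],
) -> Dict[str, List[int]]:
    """Turn one HellaSwag-style record into a single-turn SFT example."""
    context = record["ctx"]
    # HellaSwag stores the index of the correct ending as a string, e.g. "1".
    target = record["endings"][int(record["label"])]

    context_ids = tokenize(context)
    target_ids = tokenize(" " + target)

    input_ids = context_ids + target_ids
    # Only the ending contributes to the loss; context positions are masked.
    labels = [IGNORE_INDEX] * len(context_ids) + target_ids
    return {"input_ids": input_ids, "labels": labels}


def toy_tokenize(text: str) -> List[int]:
    """Hypothetical stand-in tokenizer: one id per whitespace-separated word."""
    return [sum(ord(ch) for ch in word) % 1000 for word in text.split()]


record = {
    "ctx": "A man is sitting on a roof. He",
    "endings": [
        "is using wrap to wrap a pair of skis.",
        "starts pulling up roofing on a roof.",
    ],
    "label": "1",
}
example = build_sft_example(record, toy_tokenize)
print(len(example["input_ids"]), example["labels"][:8])
```

Masking the context with `IGNORE_INDEX` means the loss is computed on the chosen ending only, while the model still conditions on the full context.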

copy-pr-bot (Bot) commented May 27, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

hemildesai and others added 13 commits May 27, 2025 11:36
Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
* move examples to recipes

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move everything under automodel

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* baby steps

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* import from NeMo 1f511fd & bfbd333

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* simplify

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add get method with fallback

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add ranked param

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* update resolve target

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* rename

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add rng.py

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* cleanup

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move files

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add __contains__

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* minor fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* special handle for _fn keys

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move utils to file

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move hellaswag to SFTSingleTurnPreprocessor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix for _fn

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add base recipe

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
* cleanup

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* cleanup

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move DistInfo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move num_epochs to StepScheduler

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* change recipe name to FinetuneRecipeForNextTokenPrediction

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* change recipe name to FinetuneRecipeForNextTokenPrediction

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* change recipe name to FinetuneRecipeForNextTokenPrediction

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
akoumpa force-pushed the akoumparouli/distributed branch from 917a662 to 72afda1 May 27, 2025 18:37
akoumpa added 15 commits May 27, 2025 11:40
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
akoumpa added 5 commits May 27, 2025 12:39
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
akoumpa changed the title from "Akoumparouli/distributed" to "init commit" May 27, 2025
akoumpa added 21 commits May 28, 2025 00:36
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
akoumpa requested a review from BoxiangW May 28, 2025 16:08
akoumpa merged commit e655b1f into main May 28, 2025
3 checks passed
ko3n1g deleted the akoumparouli/distributed branch June 16, 2025 15:24