Add non-mcore fsdp2 strategy #11525

Merged
ko3n1g merged 135 commits into main from boxiangw/non-mcore-fsdp2
Jan 8, 2025

Conversation

BoxiangW (Collaborator) commented Dec 9, 2024

What does this PR do?

Add a one line overview of what this PR aims to accomplish.

Collection: [Note which collection this PR will affect]

Changelog

  • Add specific line by line info of high level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items you can still open "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
Contributor guidelines contains specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>
@BoxiangW BoxiangW self-assigned this Dec 9, 2024
Signed-off-by: BoxiangW <BoxiangW@users.noreply.github.com>
BoxiangW and others added 26 commits December 9, 2024 16:01
Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>
Signed-off-by: BoxiangW <BoxiangW@users.noreply.github.com>
Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>
Signed-off-by: BoxiangW <BoxiangW@users.noreply.github.com>
* Initial commit

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>

---------

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>
Signed-off-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>
Co-authored-by: Piotr Kaminski <pikaminski@nvidia.com>
Co-authored-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>
* Make HfDatasetDataModule a datasets.load_dataset wrapper

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add logging

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Update HFDatasetDataModule

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor fixup

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor fixup #2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* do not expand

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* doc

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* doc

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add synonym

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Add train/val/test attributes

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add test for hf-datamodule

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Import lazily to avoid breaking with older megatron versions

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* bot happy

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* bot happy2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add doc-strings and collate-fn arg

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
Signed-off-by: ashors1 <ashors@nvidia.com>
Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
* ci: Remove token from checkout

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* bump version

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

---------

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
* Fix llm.deploy api

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* PR feedback

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

---------

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>
Signed-off-by: Malay Nagda <malayn@nvidia.com>
Co-authored-by: oliver könig <okoenig@nvidia.com>
* update recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix mllama mock ds

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update to use attention bias

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* remove example

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring mock.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix docstring language.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring language.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring mllama/base.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring mllama/language.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* bump mcore

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Add scripts for mllama

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* update script

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix pylint

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* revert Dockerfile.ci

Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>

* add scripts

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* add vlm training test in ci

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring issues

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update script match recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update recipes

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Update mllama_train.py

Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>

* update mllama 90b recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update to use tmp in ci tests

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update default llava config

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* add nemo run scripts

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix vpp issue

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix cicd

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix cicd

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* remove duplicated script

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* ci: Add HF cache

Signed-off-by: oliver könig <okoenig@nvidia.com>

* update to use SP in recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* upgrade

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Revert "upgrade"

This reverts commit f6ad2cd.

* update neva api

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update neva api

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix neva processing

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix lint

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix data fields

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* few fixes

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

---------

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Co-authored-by: Oliver Koenig <okoenig@nvidia.com>
* Add from_dict method

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add test_load_from_dict

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add test_load_from_dict

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
* prevent llama3.1 from using linear interpolation

* Apply isort and black reformatting

Signed-off-by: suiyoubi <suiyoubi@users.noreply.github.com>

---------

Signed-off-by: suiyoubi <suiyoubi@users.noreply.github.com>
Co-authored-by: suiyoubi <suiyoubi@users.noreply.github.com>
Signed-off-by: Ryan <rlangman@nvidia.com>
* update for nest release

Signed-off-by: stevehuang52 <heh@nvidia.com>

* make pylint happier

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix for lhotse dataloader

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update yaml

Signed-off-by: stevehuang52 <heh@nvidia.com>

* minor refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* clean up

Signed-off-by: stevehuang52 <heh@nvidia.com>

* clean up

Signed-off-by: stevehuang52 <heh@nvidia.com>

---------

Signed-off-by: stevehuang52 <heh@nvidia.com>
* Port changes related to SFT text+speech dataloading

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert changes from Canary(nonLLM) code

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add joint text/audio dataloading capability to speechllm

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* include text-only into fprop of training and eval; TODO: text-only predict

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Actually working forward step

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Support for source-target text file pair training for MT+speech

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Include supervision text tokens in audio example's num tokens

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Disable conformer seq len NCCL sync

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Preliminary sampler fusion strategies support: mux/zip/round_robin/randomized_round_robin

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Working V2 version of multimodal dataloading. Each modality gets its own batch settings that can be merged with zip sampler to enjoy max batch sizes for both modalities in a single training step. Each modality runs fwd+bwd in turn to save GPU memory (instead of running fwd separately and bwd together).

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add missing config

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert multimodal grad accum and fix mask padding issue

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add modality weights support via cfg.model.modality_weights

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix for V2 dataloader shuffling CRITICAL

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Restore multimodal grad accum

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix unit tests for multi-sampler configurations

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* nemo gemma to hf  conversion (#9629)

* adding script for gemma nemo to hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* adding verification for convert_gemma_nemo_to_hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

---------

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

* support FSDP (thanks to Yifan for trying it early) (#10062)

Note: as of now, this is still not fully working on the cluster. See above doc for details.
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Fix unit tests after rebasing on recent main

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* support megatron_amp_O2 and tp (#10599)

* Port changes related to SFT text+speech dataloading

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert changes from Canary(nonLLM) code

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add joint text/audio dataloading capability to speechllm

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* include text-only into fprop of training and eval; TODO: text-only predict

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Actually working forward step

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Support for source-target text file pair training for MT+speech

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Include supervision text tokens in audio example's num tokens

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Disable conformer seq len NCCL sync

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Preliminary sampler fusion strategies support: mux/zip/round_robin/randomized_round_robin

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Working V2 version of multimodal dataloading. Each modality gets its own batch settings that can be merged with zip sampler to enjoy max batch sizes for both modalities in a single training step. Each modality runs fwd+bwd in turn to save GPU memory (instead of running fwd separately and bwd together).

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add missing config

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert multimodal grad accum and fix mask padding issue

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add modality weights support via cfg.model.modality_weights

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix for V2 dataloader shuffling CRITICAL

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Restore multimodal grad accum

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix unit tests for multi-sampler configurations

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* nemo gemma to hf  conversion (#9629)

* adding script for gemma nemo to hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* adding verification for convert_gemma_nemo_to_hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

---------

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

* support FSDP (thanks to Yifan for trying it early)

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* debug TP deadlock

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* some fixes for fsdp and tp

/lustre/fsw/portfolios/llmservice/users/zhehuaic/results/canary-v0_speechllm/prompt_lhmerge5_p2b_oci_FC-GPT_llama_canaryset_b6s4kf-sunolong_noCC_langtemp0.5_dsettemp0.5_lr1e-4wd1e-3_CosineAnnealing_warmup2500_minlr1e-6_gbs2048_mbs16_ep200/error-1417621-0.out

/lustre/fsw/portfolios/llmservice/users/zhehuaic/results/canary-v0_speechllm/prompt_lhmerge5_p2b_tp_oci_FC-GPT_llama_canaryset_b6s4kf-sunolong_noCC_langtemp0.5_dsettemp0.5_lr1e-4wd1e-3_CosineAnnealing_warmup2500_minlr1e-6_gbs128_mbs16_ep200/error-1421103-3.out

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* nit fix
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix for llama3.1
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* for llama3.1
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix for inference
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix inference
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix grad accu
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix inference
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* initial impl to support megatron_amp_O2 in salm, bestow, salm-t5

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Piotr Żelasko <pzelasko@nvidia.com>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>
Co-authored-by: pzelasko <pzelasko@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <93558329+krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

* minor change in dataloader (#10601)

* Speechllm dataset basic unit test (#10631)

* Basic unit test for speechllm lhotse dataset

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* cleanup

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Unit test for existing speechllm dataset with llama2 prompt format (#10634)

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* [speechllm] Replace TextProcessing with PromptFormatter (#10639)

* [speechllm] Replace TextProcessing with PromptFormatter

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Test for tokens_to_generate

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Padding optimization for speechlm dataset

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Multimodal conversation format dataloading (#10683)

* Draft implementation of NeMo Multimodal Conversation format

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working data parsing and iteration

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working dataloading with tokenization + prompting

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Collapse consecutive user turns into single turn

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* a few fixes for the new prompt template based dataloader and lora+distributed fused adam (#10701)

* Draft implementation of NeMo Multimodal Conversation format

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working data parsing and iteration

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working dataloading with tokenization + prompting

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Collapse consecutive user turns into single turn

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* compatible with previous expts

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support gemma

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* handle the case max_seq_length is smaller than input_id length

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix max seq case

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix lora ckpt storing and loading

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* temp fix for distributed fused adam

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* revert changes in nemo_adapters.py
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Fix tokenize_with_prompt

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>

* Mechanism to insert BOS/EOS at the beginning/end of dialog (#10923)

* Mechanism to insert BOS/EOS at the beginning/end of dialog

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix Gemma prompt formatter test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add a test specifically for multiturn insertion of bos/eos

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add options to override default map/iterable dataset style selection in lhotse dataloader

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Feature/conversations tarred (#11086)

* Multimodal conversation tarring script

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix sharding logic

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix dir creation

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* EMMeTT support in SpeechLLM + tutorial for Lhotse Multimodal Dataloading (#10927)

* Preliminary support for oomptimizer

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* OOMptimizer for SpeechLLM

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Initial version of estimate token bins script

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Initial support for multimodal 2d bucketing

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Extend to text-to-text oomptimizer

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Preliminary support for Llama2 prompt format in ast+mt

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Support for 1D estimate token bins

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Support for 1D estimate token bins

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Minor tweaks

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add min/max tokens filter

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Change to bisect_left for bucket idx selection

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add reconfigure_num_microbatches_calculator at the start of train epoch for modular models

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Update lhotse multi-sampler config and make validation datasets finite

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Initial implementation of text+audio training for T5 modular models

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* megatron t5 nmt prompt formatter

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes for MT+AST T5 oomptimizer and training

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* configs, fixes, token-per-token filtering

* Support text modality in predict_step

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Support text data in val/test dl

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix infinite

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* prompt format fixes

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes in audio supervision

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* remove superficial padding

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* test config and prompt context fetching fixes

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* support text-only decoding for salm/bestow

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add unit tests for EMMETT / refactor prompt_format_fn

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* make t5nmt prompt formatter auto discoverable

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* include token count / tpt filtering in estimate_token_bins

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix max token filter

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* some fixes

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* custom mixin for text adapters

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Warmup in oomptimizer-speechlm

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Move oomptimizer-speechllm to separate directory

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Initial cleanup

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Refactoring of prompt format fn and length measurement and filtering for data types; improved unit test coverage

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Refactor sampler constraints / filters into sampling.py

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Tests and support for sampler length measurement of multimodal conversations

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Update estimate_token_bins.py

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Move estimate_token_bins.py to speech_llm scripts

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Minor tweaks

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes for SpeechLLM dataset

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* Add missing emmett tests

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add tutorial about multimodal lhotse dataloading

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Updated documentation for multimodal dataloading

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Prompt Formatter tutorial

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Review comments

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes for sampling filters None values

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Changes requested by Steve: moving some args to main config namespace in multi config sampler

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Update default configs to the modified config schema

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix omegaconf use issue

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Update the docs to the modified multi config format

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
Co-authored-by: pzelasko <pzelasko@users.noreply.github.com>

* Remove old TODO comments

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Remove prompts/fn.py

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Copyright notices

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Make linter happy

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Make linter happy

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix megatron test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix megatron test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Disable plugin for high entropy strings in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix CodeQL errors

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix unit tests

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix another unit test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix multimodal tests

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* fixes after merging canary2 pr to main

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix headers

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix canary integration test + formatting

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Address reviews - add sync_max_audio_length flag for conformer encoder

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Revert change in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Revert change in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Revert change in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Address code review

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Address Steve's review

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Co-authored-by: pzelasko <pzelasko@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <93558329+krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: zhehuaichen <139396994+zhehuaichen@users.noreply.github.com>
* Sync validation metrics for ASRModel

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* support sync for single-dataloader case

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>
* nemo 2 support

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Remove unwanted params in DDP init in Megatron Parallel

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* nemo2 working with query

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* multigpu deployment with nemo2 works

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* add max output length

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Remove prints

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Fix merge conflicts

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* readded this file

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

---------

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
* Add SFT/PEFT HF tests

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move hf examples to examples dir

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* bot

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use mini_squad

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use mini_squad

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* add 2gpu DDP

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use labels as passed by the user

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* update samples/ tests

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* rm unused imports

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add tests with subset split names, e.g. train[:100]

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* add --disable-ckpt

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use self-hosted-azure-gpus-1 for single-gpu test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add TRANSFORMERS_OFFLINE=1 to hf tests

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
@BoxiangW BoxiangW added Run CICD and removed Run CICD labels Jan 7, 2025
BoxiangW and others added 2 commits January 7, 2025 14:40
Signed-off-by: BoxiangW <BoxiangW@users.noreply.github.com>
@BoxiangW BoxiangW added Run CICD and removed Run CICD labels Jan 7, 2025
@@ -0,0 +1,11 @@
from importlib.metadata import version
from packaging.version import Version as PkgVersion

Check notice: Code scanning / CodeQL

Unused import (Note, test): Import of 'PkgVersion' is not used.
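
For readers unfamiliar with this kind of notice, the idea can be approximated with the standard-library `ast` module. This is only a rough sketch and not how CodeQL works internally. The `unused_imports` helper and the third line of `snippet` (the `print` call) are hypothetical and are not part of the PR; only the two import lines come from the diff above.

```python
import ast


def unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced in `source`."""
    tree = ast.parse(source)
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # The bound name is the alias if given, else the first dotted part.
                imported.add(alias.asname or alias.name.split(".")[0])
    # Every bare identifier that appears in an expression or statement.
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)


# The two imports from the diff, plus a hypothetical body that uses only `version`.
snippet = (
    "from importlib.metadata import version\n"
    "from packaging.version import Version as PkgVersion\n"
    "print(version('packaging'))\n"
)
print(unused_imports(snippet))
```

The sketch ignores `__all__` re-exports and star imports, so it would over-report on a package `__init__.py`. In the PR itself the practical fix is either to delete the import or to actually use `PkgVersion` in a comparison.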
@BoxiangW BoxiangW added Run CICD and removed Run CICD labels Jan 7, 2025
@github-actions
Contributor

github-actions bot commented Jan 7, 2025

beep boop 🤖: 🚨 The following files must be fixed before merge!


Your code was analyzed with PyLint. The following annotations have been identified:

************* Module nemo.lightning.pytorch.strategies.fsdp2_strategy
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:85:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:91:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:116:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:142:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:151:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:161:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:169:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:177:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:198:4: C0116: Missing function or method docstring (missing-function-docstring)

-----------------------------------
Your code has been rated at 9.25/10

Mitigation guide:

  • Add sensible and useful docstrings to functions and methods
  • For trivial methods like getter/setters, consider adding # pylint: disable=C0116 inside the function itself
  • To disable multiple functions/methods at once, put a # pylint: disable=C0116 before the first and a # pylint: enable=C0116 after the last.

By applying these rules, we reduce the occurrence of this message in the future.

Thank you for improving NeMo's documentation!
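
The first two items of the mitigation guide above could look roughly like this. It is a sketch only: `ExampleStrategy`, `shard_size`, and `world_size` are hypothetical names chosen for illustration and are not taken from `fsdp2_strategy.py`.

```python
class ExampleStrategy:
    """Hypothetical stand-in for a strategy class (not NeMo's actual code)."""

    def __init__(self, world_size: int) -> None:
        self._world_size = world_size

    def shard_size(self, numel: int) -> int:
        """Return the per-rank shard length for a tensor of `numel` elements.

        The length is rounded up so that every rank holds an equally sized shard.
        """
        # Ceiling division without importing math.
        return -(-numel // self._world_size)

    @property
    def world_size(self) -> int:
        # Trivial getter: silence C0116 locally instead of writing a docstring.
        # pylint: disable=C0116
        return self._world_size
```

A non-trivial method gets a real docstring that states what is returned and why, while the trivial getter carries the inline disable the bot suggests.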

@github-actions
Contributor

github-actions bot commented Jan 8, 2025

[🤖]: Hi @BoxiangW 👋,

We wanted to let you know that a CICD pipeline for this PR just finished successfully.

So it might be time to merge this PR or get some approvals.

I'm just a bot, so I'll leave it up to you what to do next.

//cc @pablo-garay @ko3n1g

akoumpa previously approved these changes Jan 8, 2025
@BoxiangW BoxiangW enabled auto-merge (squash) January 8, 2025 18:05
Collaborator

@ko3n1g ko3n1g left a comment

Please add the tests to the final step:)

pablo-garay previously approved these changes Jan 8, 2025
Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
@akoumpa akoumpa dismissed stale reviews from pablo-garay and themself via d8e7247 January 8, 2025 18:29
@github-actions
Contributor

github-actions bot commented Jan 8, 2025

beep boop 🤖: 🙏 The following files have warnings. In case you are familiar with these, please try helping us to improve the code base.


Your code was analyzed with PyLint. The following annotations have been identified:

************* Module nemo.collections.llm.gpt.model.hf_auto_model_for_causal_lm
nemo/collections/llm/gpt/model/hf_auto_model_for_causal_lm.py:27:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/hf_auto_model_for_causal_lm.py:35:0: C0115: Missing class docstring (missing-class-docstring)
nemo/collections/llm/gpt/model/hf_auto_model_for_causal_lm.py:63:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/hf_auto_model_for_causal_lm.py:74:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/hf_auto_model_for_causal_lm.py:77:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/hf_auto_model_for_causal_lm.py:104:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/hf_auto_model_for_causal_lm.py:107:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/hf_auto_model_for_causal_lm.py:130:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/collections/llm/gpt/model/hf_auto_model_for_causal_lm.py:150:4: C0116: Missing function or method docstring (missing-function-docstring)
************* Module nemo.lightning.pytorch.strategies.megatron_strategy
nemo/lightning/pytorch/strategies/megatron_strategy.py:286:4: C0116: Missing function or method docstring (missing-function-docstring)
************* Module nemo.lightning.pytorch.strategies.utils
nemo/lightning/pytorch/strategies/utils.py:41:0: C0115: Missing class docstring (missing-class-docstring)
nemo/lightning/pytorch/strategies/utils.py:50:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/utils.py:58:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/utils.py:70:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/utils.py:86:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/utils.py:121:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/utils.py:131:0: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/utils.py:186:0: C0116: Missing function or method docstring (missing-function-docstring)

-----------------------------------
Your code has been rated at 9.75/10

Mitigation guide:

  • Add sensible and useful docstrings to functions and methods
  • For trivial methods like getter/setters, consider adding # pylint: disable=C0116 inside the function itself
  • To disable multiple functions/methods at once, put a # pylint: disable=C0116 before the first and a # pylint: enable=C0116 after the last.

By applying these rules, we reduce the occurrence of this message in the future.

Thank you for improving NeMo's documentation!
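
This report also includes C0115 (missing class docstring), and the third item of the guide describes a block-level disable. A minimal sketch of both follows. `ParallelConfig` and its methods are hypothetical and do not come from `utils.py` or `hf_auto_model_for_causal_lm.py`.

```python
class ParallelConfig:
    """Hypothetical settings holder; this docstring is what satisfies C0115."""

    def __init__(self, data_parallel_size: int = 1) -> None:
        self._dp = data_parallel_size
        self._ckpt = True

    # Several trivial accessors in a row: disable C0116 once, re-enable after.
    # pylint: disable=C0116
    def data_parallel_size(self) -> int:
        return self._dp

    def checkpointing_enabled(self) -> bool:
        return self._ckpt
    # pylint: enable=C0116

    def describe(self) -> str:
        """Return a one-line, human-readable summary of the settings."""
        return f"dp={self._dp}, ckpt={self._ckpt}"
```

Re-enabling the check right after the accessors keeps it active for `describe` and anything added later, so new undocumented methods are still reported.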

@github-actions
Contributor

github-actions bot commented Jan 8, 2025

beep boop 🤖: 🚨 The following files must be fixed before merge!


Your code was analyzed with PyLint. The following annotations have been identified:

************* Module nemo.lightning.pytorch.strategies.fsdp2_strategy
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:85:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:91:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:116:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:142:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:151:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:161:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:169:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:177:4: C0116: Missing function or method docstring (missing-function-docstring)
nemo/lightning/pytorch/strategies/fsdp2_strategy.py:198:4: C0116: Missing function or method docstring (missing-function-docstring)

-----------------------------------
Your code has been rated at 9.25/10

Mitigation guide:

  • Add sensible and useful docstrings to functions and methods
  • For trivial methods like getter/setters, consider adding # pylint: disable=C0116 inside the function itself
  • To disable multiple functions/methods at once, put a # pylint: disable=C0116 before the first and a # pylint: enable=C0116 after the last.

By applying these rules, we reduce the occurrence of this message in the future.

Thank you for improving NeMo's documentation!

@ko3n1g ko3n1g disabled auto-merge January 8, 2025 18:32
@ko3n1g ko3n1g merged commit 5d8baa4 into main Jan 8, 2025
26 of 28 checks passed
@ko3n1g ko3n1g deleted the boxiangw/non-mcore-fsdp2 branch January 8, 2025 18:32
abhinavg4 pushed a commit that referenced this pull request Jan 30, 2025
* Add fsdp2 strategy

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: BoxiangW <BoxiangW@users.noreply.github.com>

* Add imports

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: BoxiangW <BoxiangW@users.noreply.github.com>

* Add init import

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: BoxiangW <BoxiangW@users.noreply.github.com>

* Fix mixtral export for NeMo 2.0 (#11532)

* Initial commit

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>

---------

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>
Signed-off-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>
Co-authored-by: Piotr Kaminski <pikaminski@nvidia.com>
Co-authored-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>

* Make HFDatasetDataModule a datasets.load_dataset wrapper (#11500)

* Make HfDatasetDataModule a datasets.load_dataset wrapper

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add logging

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Update HFDatasetDataModule

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor fixup

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor fixup #2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* do not expand

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* doc

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* doc

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add synonym

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Add train/val/test attributes

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add test for hf-datamodule

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Import lazily to avoid breaking with older megatron versions

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* bot happy

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* bot happy2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add doc-strings and collate-fn arg

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* ci: Bump release workflow (#11544)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* ci: Use SHA for cut-off (#11545)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* link to mcore documentation (#11538)

Signed-off-by: ashors1 <ashors@nvidia.com>

* ci: Adjust inputs for code-freeze workflow (#11550)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* ci: Bump release freeze (#11551)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Ko3n1g/ci/commit sha for cutoff (#11553)

* ci: Remove token from checkout

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* bump version

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

---------

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* ci: Bump code-freeze workflow (#11554)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* ci: Bump code freeze workflow (#11557)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Fix deploy conflicts in llm.api (#11367)

* Fix llm.deploy api

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* PR feedback

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

---------

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>

* perf summary docs link (#11262)

Signed-off-by: Malay Nagda <malayn@nvidia.com>
Co-authored-by: oliver könig <okoenig@nvidia.com>

* Add vlm nemo run scripts (#11394)

* update recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix mllama mock ds

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update to use attention bias

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* remove example

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring mock.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix docstring language.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring language.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring mllama/base.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring mllama/language.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* bump mcore

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Add scripts for mllama

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* update script

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix pylint

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* revert Dockerfile.ci

Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>

* add scripts

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* add vlm training test in ci

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring issues

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update script match recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update recipes

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Update mllama_train.py

Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>

* update mllama 90b recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update to use tmp in ci tests

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update default llava config

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* add nemo run scripts

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix vpp issue

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix cicd

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix cicd

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* remove duplicated script

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* ci: Add HF cache

Signed-off-by: oliver könig <okoenig@nvidia.com>

* update to use SP in recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* upgrade

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Revert "upgrade"

This reverts commit f6ad2cd76abcdd9258cb53a25c788fd658189150.

* update neva api

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update neva api

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix neva processing

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix lint

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix data fields

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* few fixes

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

---------

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Co-authored-by: Oliver Koenig <okoenig@nvidia.com>

* Add from_dict to HFDatasetDataModule (#11559)

* Add from_dict method

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add test_load_from_dict

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add test_load_from_dict

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* Prevent llama3.1 from using Linear interpolation (#11548)

* prevent llama3.1 from using linear interpolation

* Apply isort and black reformatting

Signed-off-by: suiyoubi <suiyoubi@users.noreply.github.com>

---------

Signed-off-by: suiyoubi <suiyoubi@users.noreply.github.com>
Co-authored-by: suiyoubi <suiyoubi@users.noreply.github.com>

* [TTS] Add audio and mel codec HF models to docs (#11526)

Signed-off-by: Ryan <rlangman@nvidia.com>

* Update for NEST release (#11537)

* update for nest release

Signed-off-by: stevehuang52 <heh@nvidia.com>

* make pylint happier

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix for lhotse dataloader

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update yaml

Signed-off-by: stevehuang52 <heh@nvidia.com>

* minor refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* clean up

Signed-off-by: stevehuang52 <heh@nvidia.com>

* clean up

Signed-off-by: stevehuang52 <heh@nvidia.com>

---------

Signed-off-by: stevehuang52 <heh@nvidia.com>

* Merging SpeechLLM development branch (#11462)

* Port changes related to SFT text+speech dataloading

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert changes from Canary(nonLLM) code

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add joint text/audio dataloading capability to speechllm

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* include text-only into fprop of training and eval; TODO: text-only predict

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Actually working forward step

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Support for source-target text file pair training for MT+speech

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Include supervision text tokens in audio example's num tokens

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Disable conformer seq len NCCL sync

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Preliminary sampler fusion strategies support: mux/zip/round_robin/randomized_round_robin

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Working V2 version of multimodal dataloading. Each modality gets its own batch settings that can be merged with zip sampler to enjoy max batch sizes for both modalities in a single training step. Each modality runs fwd+bwd in turn to save GPU memory (instead of running fwd separately and bwd together).

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add missing config

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert multimodal grad accum and fix mask padding issue

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add modality weights support via cfg.model.modality_weights

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix for V2 dataloader shuffling CRITICAL

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Restore multimodal grad accum

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix unit tests for multi-sampler configurations

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* nemo gemma to hf  conversion (#9629)

* adding script for gemma nemo to hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* adding verification for convert_gemma_nemo_to_hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

---------

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

* support FSDP (thank Yifan for early trying) (#10062)

Note: as of now, this is still not fully working on the cluster. See above doc for details.
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Fix unit tests after rebasing on recent main

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* support megatron_amp_O2 and tp (#10599)

* Port changes related to SFT text+speech dataloading

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert changes from Canary(nonLLM) code

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add joint text/audio dataloading capability to speechllm

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* include text-only into fprop of training and eval; TODO: text-only predict

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Actually working forward step

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Support for source-target text file pair training for MT+speech

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Include supervision text tokens in audio example's num tokens

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Disable conformer seq len NCCL sync

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Preliminary sampler fusion strategies support: mux/zip/round_robin/randomized_round_robin

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Working V2 version of multimodal dataloading. Each modality gets its own batch settings that can be merged with zip sampler to enjoy max batch sizes for both modalities in a single training step. Each modality runs fwd+bwd in turn to save GPU memory (instead of running fwd separately and bwd together).

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add missing config

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert multimodal grad accum and fix mask padding issue

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add modality weights support via cfg.model.modality_weights

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix for V2 dataloader shuffling CRITICAL

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Restore multimodal grad accum

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix unit tests for multi-sampler configurations

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* nemo gemma to hf  conversion (#9629)

* adding script for gemma nemo to hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* adding verification for convert_gemma_nemo_to_hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

---------

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

* support FSDP (thank Yifan for early trying)

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* debug TP deadlock

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* some fixes for fsdp and tp

/lustre/fsw/portfolios/llmservice/users/zhehuaic/results/canary-v0_speechllm/prompt_lhmerge5_p2b_oci_FC-GPT_llama_canaryset_b6s4kf-sunolong_noCC_langtemp0.5_dsettemp0.5_lr1e-4wd1e-3_CosineAnnealing_warmup2500_minlr1e-6_gbs2048_mbs16_ep200/error-1417621-0.out

/lustre/fsw/portfolios/llmservice/users/zhehuaic/results/canary-v0_speechllm/prompt_lhmerge5_p2b_tp_oci_FC-GPT_llama_canaryset_b6s4kf-sunolong_noCC_langtemp0.5_dsettemp0.5_lr1e-4wd1e-3_CosineAnnealing_warmup2500_minlr1e-6_gbs128_mbs16_ep200/error-1421103-3.out

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* nit fix

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix for llama3.1

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* for llama3.1

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix for inference

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix inference

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix grad accumulation

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix inference

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* initial impl to support megatron_amp_O2 in salm, bestow, salm-t5

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Piotr Żelasko <pzelasko@nvidia.com>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>
Co-authored-by: pzelasko <pzelasko@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <93558329+krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

* minor change in dataloader (#10601)

* Speechllm dataset basic unit test (#10631)

* Basic unit test for speechllm lhotse dataset

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* cleanup

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Unit test for existing speechllm dataset with llama2 prompt format (#10634)

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* [speechllm] Replace TextProcessing with PromptFormatter (#10639)

* [speechllm] Replace TextProcessing with PromptFormatter

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Test for tokens_to_generate

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Padding optimization for speechlm dataset

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Multimodal conversation format dataloading (#10683)

* Draft implementation of NeMo Multimodal Conversation format

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working data parsing and iteration

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working dataloading with tokenization + prompting

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Collapse consecutive user turns into single turn

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* a few fixes for the new prompt template based dataloader and lora+distributed fused adam (#10701)

* Draft implementation of NeMo Multimodal Conversation format

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working data parsing and iteration

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working dataloading with tokenization + prompting

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Collapse consecutive user turns into single turn

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* compatible with previous experiments

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support gemma

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* handle the case where max_seq_length is smaller than input_id length

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix max seq case

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix lora ckpt storing and loading

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* temp fix for distributed fused adam

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* revert changes in nemo_adapters.py

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Fix tokenize_with_prompt

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>

* Mechanism to insert BOS/EOS at the beginning/end of dialog (#10923)

* Mechanism to insert BOS/EOS at the beginning/end of dialog

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix Gemma prompt formatter test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add a test specifically for multiturn insertion of bos/eos

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add options to override default map/iterable dataset style selection in lhotse dataloader

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Feature/conversations tarred (#11086)

* Multimodal conversation tarring script

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix sharding logic

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix dir creation

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* EMMeTT support in SpeechLLM + tutorial for Lhotse Multimodal Dataloading (#10927)

* Preliminary support for oomptimizer

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* OOMptimizer for SpeechLLM

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Initial version of estimate token bins script

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Initial support for multimodal 2d bucketing

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Extend to text-to-text oomptimizer

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Preliminary support for Llama2 prompt format in ast+mt

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Support for 1D estimate token bins

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Support for 1D estimate token bins

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Minor tweaks

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add min/max tokens filter

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Change to bisect_left for bucket idx selection

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add reconfigure_num_microbatches_calculator at the start of train epoch for modular models

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Update lhotse multi-sampler config and make validation datasets finite

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Initial implementation of text+audio training for T5 modular models

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* megatron t5 nmt prompt formatter

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes for MT+AST T5 oomptimizer and training

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* configs, fixes, token-per-token filtering

* Support text modality in predict_step

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Support text data in val/test dl

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix infinite

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* prompt format fixes

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes in audio supervision

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* remove superfluous padding

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* test config and prompt context fetching fixes

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* support text-only decoding for salm/bestow

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add unit tests for EMMETT / refactor prompt_format_fn

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* make t5nmt prompt formatter auto discoverable

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* include token count / tpt filtering in estimate_token_bins

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix max token filter

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* some fixes

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* custom mixin for text adapters

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Warmup in oomptimizer-speechlm

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Move oomptimizer-speechllm to separate directory

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Initial cleanup

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Refactoring of prompt format fn and length measurement and filtering for data types; improved unit test coverage

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Refactor sampler constraints / filters into sampling.py

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Tests and support for sampler length measurement of multimodal conversations

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Update estimate_token_bins.py

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Move estimate_token_bins.py to speech_llm scripts

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Minor tweaks

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes for SpeechLLM dataset

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* Add missing emmett tests

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add tutorial about multimodal lhotse dataloading

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Updated documentation for multimodal dataloading

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Prompt Formatter tutorial

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Review comments

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes for sampling filters None values

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Changes requested by Steve: moving some args to main config namespace in multi config sampler

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Update default configs to the modified config schema

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix omegaconf use issue

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Update the docs to the modified multi config format

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
Co-authored-by: pzelasko <pzelasko@users.noreply.github.com>

* Remove old TODO comments

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Remove prompts/fn.py

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Copyright notices

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Make linter happy

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Make linter happy

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix megatron test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix megatron test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Disable plugin for high entropy strings in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix CodeQL errors

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix unit tests

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix another unit test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix multimodal tests

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* fixes after merging canary2 pr to main

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix headers

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix canary integration test + formatting

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Address reviews - add sync_max_audio_length flag for conformer encoder

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Revert change in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Revert change in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Revert change in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Address code review

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Address Steve's review

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Co-authored-by: pzelasko <pzelasko@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <93558329+krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: zhehuaichen <139396994+zhehuaichen@users.noreply.github.com>

* Sync validation metrics for ASRModel (#11533)

* Sync validation metrics for ASRModel

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* support sync for single-dataloader case

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* NeMo 2.0 In-framework deployment support (#11523)

* nemo 2 support

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Remove unwanted params in DDP init in Megatron Parallel

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* nemo2 working with query

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* multigpu deployment with nemo2 works

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* add max output length

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Remove prints

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Fix merge conflicts

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* re-added this file

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

---------

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* Add SFT/PEFT HF tests (#11519)

* Add SFT/PEFT HF tests

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move hf examples to examples dir

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* bot

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use mini_squad

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use mini_squad

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* add 2gpu DDP

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use labels as passed by the user

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* update samples/ tests

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* rm unused imports

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add tests with subset split names, e.g. train[:100]

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* add --disable-ckpt

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use self-hosted-azure-gpus-1 for single-gpu test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add TRANSFORMERS_OFFLINE=1 to hf tests

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* Fix typo: LocalNonpersitentObject -> LocalNonpersistentObject (#11546)

Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>

* Adding documentation for packed dataset preparation with context para… (#11564)

* adding documentation for packed dataset preparation with context parallel

Signed-off-by: Lifu Zhang <tomzhanglf@gmail.com>

* addressing Anna Shor's comment

Signed-off-by: Lifu Zhang <tomzhanglf@gmail.com>

---------

Signed-off-by: Lifu Zhang <tomzhanglf@gmail.com>

* have micro_batch_size and global_batch_size as class attributes in mock datamodule (#11563)

* Revert "Fix the names of two sets of weight and bias in mcore_to_nemo_mapping" (#11560)

* Revert "Fix the names of two sets of weight and bias in mcore_to_nemo_mapping (#9628)"

This reverts commit 6784db56a03f19f37bc4f37bdf87dabb3fc1acee.

* keep underscores

Signed-off-by: ashors1 <ashors@nvidia.com>

---------

Signed-off-by: ashors1 <ashors@nvidia.com>

* add huggingface-based tokenizer support for mixtral HF -> .nemo (#11572)

* add huggingface-based tokenizer support

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>

---------

Signed-off-by: dimapihtar <dpihtar@gmail.com>
Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>
Co-authored-by: dimapihtar <dimapihtar@users.noreply.github.com>

* Github Actions tests for Llava Next and modify pretrain recipe to have language model path (#11424)

* modified pretrain recipe to have language_model_from_pretrained

* ci test for llava next

* fixed indent/lint issue in cicd yml file

* fix lint issues

* Apply isort and black reformatting

Signed-off-by: yashaswikarnati <yashaswikarnati@users.noreply.github.com>

* Update .github/workflows/cicd-main.yml

Co-authored-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Yashaswi Karnati <144376261+yashaswikarnati@users.noreply.github.com>

* Update .github/workflows/cicd-main.yml

Co-authored-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Yashaswi Karnati <144376261+yashaswikarnati@users.noreply.github.com>

---------

Signed-off-by: yashaswikarnati <yashaswikarnati@users.noreply.github.com>
Signed-off-by: Yashaswi Karnati <144376261+yashaswikarnati@users.noreply.github.com>
Co-authored-by: yashaswikarnati <yashaswikarnati@users.noreply.github.com>
Co-authored-by: oliver könig <okoenig@nvidia.com>

* Fix SingleDeviceStrategy support in Nsys callback (#11574)

* fix for SingleDeviceStrategy

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* mini refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* remove dialogue scripts and docs (#11577)

* remove deprecated scripts

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove deprecated docs

Signed-off-by: dimapihtar <dpihtar@gmail.com>

---------

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* add JitTransform (#11131)

* add JitTransform

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fixes

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add JiT CB test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove stale imports

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* cleanup

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add jit callback test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* fix param passing

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use sgd in test_nemo_jit_cb

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add thunder call

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Use .compile method to avoid changing module structure

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Use JitConfig

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* thunder setting

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* avoid reentry

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove optional

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* rewrite

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor & module_selector

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Signed-off-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* NeMo 2.0 documentation upgrade (#11235)

* update attention

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update docs to NeMo 2.0

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update usage

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update parallelism

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update parallelism docs

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update parallelism docs

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* fix style

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update to NeMo 2.0

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* NeMo 2.0 update

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* NeMo 2.0 update

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove deprecated file

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update in respect to NeMo 2.0

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* fix hyperlinks

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove deprecated

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove deprecated

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update documentation to NeMo 2.0

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* fix typo

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* fix punctuation

Signed-off-by: dimapihtar <dpihtar@gmail.com>

---------

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Remove auto-import of lhotse when importing nemo.collections.common.data (#11578)

* Remove auto-import of lhotse when importing nemo.collections.common.data

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix test import

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix example configs (#11571)

* Fix example configs

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* Fix line length

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

---------

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* fix (#11575)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* NIM supporting changes for nemo.export for NeMo 2.0 (#11488)

* Move torch_dtype_from_precision for independent export module

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Apply isort and black reformatting

Signed-off-by: janekl <janekl@users.noreply.github.com>
Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Remove unused imports

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Fix too long lines

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Apply isort and black reformatting

Signed-off-by: janekl <janekl@users.noreply.github.com>
Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Fix signature and default for megatron_amp_O2

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

---------

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
Signed-off-by: janekl <janekl@users.noreply.github.com>
Co-authored-by: Bobby Chen <bobchen@nvidia.com>
Co-authored-by: janekl <janekl@users.noreply.github.com>

* AED greedy confidence estimation (#11573)

* upload

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: GNroy <GNroy@users.noreply.github.com>

* set prompt confidence dtype at initialization

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

---------

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Signed-off-by: GNroy <GNroy@users.noreply.github.com>
Co-authored-by: GNroy <GNroy@users.noreply.github.com>

* gemma fix (#11587)

* Update T5 DataModule regarding Pretrain/Finetune validate (#11584)

* update datamodule to have mbs/gbs

* update datamodule to have mbs/gbs

* Apply isort and black reformatting

Signed-off-by: huvunvidia <huvunvidia@users.noreply.github.com>

---------

Signed-off-by: huvunvidia <huvunvidia@users.noreply.github.com>
Co-authored-by: Huy Vu2 <huvu@login-eos02.eos.clusters.nvidia.com>
Co-authored-by: huvunvidia <huvunvidia@users.noreply.github.com>

* fix llama3 (#11580)

* Add Hf nemorun tests (#11566)

* minor fixes for recipe

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* add peft nemorun script

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* add sft script and data module

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* Apply isort and black reformatting

Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

* clean up

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* add disable ckpt and data config for tests

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* Apply isort and black reformatting

Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

* add tests to cicd yaml

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* cleanup

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

---------

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>
Co-authored-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

* [🤖]: Howdy folks, let's bump NeMo-Toolkit to `2.2.0rc0` ! (#11555)

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Pass the number of experts to modelopt layer spec (#11607)

* Pass number of experts to modelopt layer spec

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Fix too long lines

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

---------

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Adding changes to asr documentation (#11397)

Signed-off-by: Ssofja <sofiakostandian@gmail.com>

* Support Cosmos tokenizer TensorRT inference (#11472)

* Add cosmos TRT

* Add trt run script

* Apply isort and black reformatting

Signed-off-by: meatybobby <meatybobby@users.noreply.github.com>

* Clean code

* Fix CodeQL

---------

Signed-off-by: meatybobby <meatybobby@users.noreply.github.com>
Co-authored-by: meatybobby <meatybobby@users.noreply.github.com>

* Neva updates to latest mcore and some fixes (#11565)

* api updates and fixes

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix arg

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

---------

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Co-authored-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* add nemo2-sft-peft to readme (#11613)

Signed-off-by: Huiying Li <willwin.lee@gmail.com>

* Set Minitron width pruning batch size 1 (#11603)

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

* Disable CP for running Inference using megatron_gpt_eval (#11547)

* Disable CP for megatron_gpt_eval

* Apply isort and black reformatting

Signed-off-by: suiyoubi <suiyoubi@users.noreply.github.com>

* Update examples/nlp/language_modeling/megatron_gpt_eval.py

Co-authored-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: Ao Tang <mike.tang96@gmail.com>

---------

Signed-off-by: suiyoubi <suiyoubi@users.noreply.github.com>
Signed-off-by: Ao Tang <mike.tang96@gmail.com>
Co-authored-by: suiyoubi <suiyoubi@users.noreply.github.com>
Co-authored-by: Chen Cui <chcui@nvidia.com>

* ci: Add `no-fail-fast` mode (#11608)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Chat dataset support (#11423)

* chat dataset support

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* add ci test

Signed-off-by: Chen Cui <chcui@nvidia.com>

* address comment

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* address comment

Signed-off-by: Chen Cui <chcui@nvidia.com>

---------

Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: cuichenx <cuichenx@users.noreply.github.com>

* Sortformer Diarizer 4spk v1 model PR Part 2: Unit-tests for Sortformer Diarizer. (#11336)

* Adding the first pr files models and dataset

Signed-off-by: taejinp <tango4j@gmail.com>

* Tested all unit-test files

Signed-off-by: taejinp <tango4j@gmail.com>

* Name changes on yaml files and train example

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Reflecting comments and removing unnecessary parts for this PR

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Adding docstrings to reflect the PR comments

Signed-off-by: taejinp <tango4j@gmail.com>

* removed the unused find_first_nonzero

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Fixed all pylint issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolving pylint issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Removing unused variable in audio_to_diar_label.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixed docstrings in training script

Signed-off-by: taejinp <tango4j@gmail.com>

* Line-too-long issue from Pylint fixed

Signed-off-by: taejinp <tango4j@gmail.com>

* Adding get_subsegments_scriptable to prevent jit.script error

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Addressed Code-QL issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolved conflicts on bce_loss.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Adding all the diarization related unit-tests

Signed-off-by: taejinp <tango4j@gmail.com>

* Moving speaker task related unit test files to speaker_tasks folder

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixed uninit variable issue in bce_loss.py spotted by codeQL

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Fixing code-QL issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Reflecting PR comments from weiqingw

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Line too long pylint issue resolved in e2e_diarize_speech.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolved unused variable issue in model test

Signed-off-by: taejinp <tango4j@gmail.com>

* Reflecting the comment on Nov 21st, 2024.

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Unused variable import time

Signed-off-by: taejinp <tango4j@gmail.com>

* Adding docstrings to score_labels() function in der.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Reflecting comments on YAML files and model file variable changes.

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Added get_subsegments_scriptable for legacy get_subsegment functions

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolved line too long pylint issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Added training and inference CI-tests

Signed-off-by: taejinp <tango4j@gmail.com>

* Added the missing parse_func in preprocessing/collections.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Adding the missing parse_func in preprocessing/collections.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixed an indentation error

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolved multi_bin_acc and bce_loss issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolved line-too-long for msdd_models.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Code QL issues and fixed test errors

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* line too long in audio_to_diar_label.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* resolving CICD test issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixing codeQL issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixed pin memory False for inference

Signed-off-by: taejinp <tango4j@gmail.com>

---------

Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Co-authored-by: tango4j <tango4j@users.noreply.github.com>

* 2x more memory efficient Graph-based RNN-T (#11169)

* Optimized Graph-Transducer implementation

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

---------

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: artbataev <artbataev@users.noreply.github.com>
Co-authored-by: artbataev <artbataev@users.noreply.github.com>

* Use explicit subpaths in io for exporting a checkpoint (#11352)

* Fix llm.export_ckpt

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

---------

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Remove triton requirement (#11627)

* Specify pytorch-triton instead of triton

Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove triton

Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>

---------

Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* ci: Remove comment if no changes required anymore (#11624)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Jit with peft (#11586)

* move jitransform at the end

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add docstring & post-init

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add remove_extra_batch_keys and remove align_labels

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Run JitTransform on_train_epoch_start

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add --use-torch-jit option

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add docstrings

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* pep8

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* NeMo-UX: add Hf's AutoModelForImageTextToText (#11321)

* init commit

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* wip

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* peft example

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* move peft example to multimodal_llm

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* surface HFAutoModelForImageTextToText

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add hf vlm dataset

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move processor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* train_log -> train_loss

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* vlm.HFDatasetDataModule pass collate_fn as argument

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Update peft example

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove unused var

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Move example

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* remove unused

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Small change

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Fix loss calculation

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add extract_skipped_token_ids

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Use vlm.HFAutoModelForImageTextToText.extract_skipped_token_ids

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Update logits/labels handling

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add trust_remote_code to configure_processor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* mini refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add LLAMA_TOKENS

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* update hf_dataset

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add lora_dtype for models with non-FP weights

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add load_in_4bit option

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add default_dtype

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add load_in_4bit to llm collection

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* rm import

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix asset path

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move vlm test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move data offline

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use single gpu

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* pylint fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* pylint

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* pylint

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* drop align_labels

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove align_labels from llm too

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use loss * mask instead of loss[mask == 1]

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix path

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Signed-off-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* ci: Bump release workflow (#11635)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Add fix docstring for speech commands (#11638)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fixing Multi_Task_Adapters.ipynb by replacing canary2 with canary_custom (#11641)

Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>

* fixed config name in online augmentation tutorial (#11628)

Signed-off-by: Rauf <rnasretdinov@nvidia.com>

* fix default nodes (#11632)

* add renormalize_blend_weights param (#11647)

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Sortformer Diarizer 4spk v1 model PR Part 3: Speaker Diarization Mixin (#11511)

* Adding diarization mixin for one click inference

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolving CodeQL and Pylint

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolving CodeQL and Pylint - unsaved files resolved

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Unused package manifest_utils

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolved diarization mixin test issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Removed commented lines

Signed-off-by: taejinp <tango4j@gmail.com>

* updating mixins code

Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: ipmedenn <ipmedenn@users.noreply.github.com>

* fixing test_diarizartion.py

Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>

* moving diarization postprocessing-related stuff to vad_utils.py

Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: ipmedenn <ipmedenn@users.noreply.github.com>

* Resolving PyLint

Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: ipmedenn <ipmedenn@users.noreply.github.com>

* fixing batch_idx issue in sortformer_diar_models.py

Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>

* adding sync_dist=True for sortformer validation

Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Reflecting the comments from PR

Signed-off-by: taejinp <tango4j@gmail.com>

* Reflecting the comments from PR 2nd

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolved a codeQL unused variable

Signed-off-by: taejinp <tango4j@gmail.com>

* Now moved existence check after

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

---------

Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>
Signed-off-by: ipmedenn <ipmedenn@users.noreply.github.com>
Co-authored-by: tango4j <tango4j@users.noreply.github.com>
Co-authored-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>
Co-authored-by: ipmedenn <ipmedenn@users.noreply.github.c…
youngeunkwon0405 pushed a commit to youngeunkwon0405/NeMo that referenced this pull request Feb 10, 2025
* Add fsdp2 strategy

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: BoxiangW <BoxiangW@users.noreply.github.com>

* Add imports

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: BoxiangW <BoxiangW@users.noreply.github.com>

* Add init import

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: BoxiangW <BoxiangW@users.noreply.github.com>

* Fix mixtral export for NeMo 2.0 (#11532)

* Initial commit

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>

---------

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>
Signed-off-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>
Co-authored-by: Piotr Kaminski <pikaminski@nvidia.com>
Co-authored-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>

* Make HFDatasetDataModule a datasets.load_dataset wrapper (#11500)

* Make HfDatasetDataModule a datasets.load_dataset wrapper

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add logging

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Update HFDatasetDataModule

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor fixup

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor fixup #2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* do not expand

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* doc

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* doc

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add synonym

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Add train/val/test attributes

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add test for hf-datamodule

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Import lazily to avoid breaking with older megatron versions

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* bot happy

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* bot happy2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add doc-strings and collate-fn arg

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* ci: Bump release workflow (#11544)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* ci: Use SHA for cut-off (#11545)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* link to mcore documentation (#11538)

Signed-off-by: ashors1 <ashors@nvidia.com>

* ci: Adjust inputs for code-freeze workflow (#11550)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* ci: Bump release freeze (#11551)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Ko3n1g/ci/commit sha for cutoff (#11553)

* ci: Remove token from checkout

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* bump version

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

---------

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* ci: Bump code-freeze workflow (#11554)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* ci: Bump code freeze workflow (#11557)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Fix deploy conflicts in llm.api (#11367)

* Fix llm.deploy api

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* PR feedback

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

---------

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>

* perf summary docs link (#11262)

Signed-off-by: Malay Nagda <malayn@nvidia.com>
Co-authored-by: oliver könig <okoenig@nvidia.com>

* Add vlm nemo run scripts (#11394)

* update recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix mllama mock ds

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update to use attention bias

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* remove example

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring mock.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix docstring language.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring language.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring mllama/base.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring mllama/language.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* bump mcore

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Add scripts for mllama

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* update script

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix pylint

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* revert Dockerfile.ci

Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>

* add scripts

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* add vlm training test in ci

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring issues

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update script match recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update recipes

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Update mllama_train.py

Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>

* update mllama 90b recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update to use tmp in ci tests

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update default llava config

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* add nemo run scripts

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix vpp issue

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix cicd

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix cicd

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* remove duplicated script

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* ci: Add HF cache

Signed-off-by: oliver könig <okoenig@nvidia.com>

* update to use SP in recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* upgrade

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Revert "upgrade"

This reverts commit f6ad2cd76abcdd9258cb53a25c788fd658189150.

* update neva api

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update neva api

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix neva processing

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix lint

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix data fields

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* few fixes

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

---------

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Co-authored-by: Oliver Koenig <okoenig@nvidia.com>

* Add from_dict to HFDatasetDataModule (#11559)

* Add from_dict method

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add test_load_from_dict

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add test_load_from_dict

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* Prevent llama3.1 from using Linear interpolation (#11548)

* prevent llama3.1 from using linear interpolation

* Apply isort and black reformatting

Signed-off-by: suiyoubi <suiyoubi@users.noreply.github.com>

---------

Signed-off-by: suiyoubi <suiyoubi@users.noreply.github.com>
Co-authored-by: suiyoubi <suiyoubi@users.noreply.github.com>

* [TTS] Add audio and mel codec HF models to docs (#11526)

Signed-off-by: Ryan <rlangman@nvidia.com>

* Update for NEST release (#11537)

* update for nest release

Signed-off-by: stevehuang52 <heh@nvidia.com>

* make pylint happier

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix for lhotse dataloader

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update yaml

Signed-off-by: stevehuang52 <heh@nvidia.com>

* minor refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* clean up

Signed-off-by: stevehuang52 <heh@nvidia.com>

* clean up

Signed-off-by: stevehuang52 <heh@nvidia.com>

---------

Signed-off-by: stevehuang52 <heh@nvidia.com>

* Merging SpeechLLM development branch (#11462)

* Port changes related to SFT text+speech dataloading

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert changes from Canary(nonLLM) code

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add joint text/audio dataloading capability to speechllm

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* include text-only into fprop of training and eval; TODO: text-only predict

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Actually working forward step

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Support for source-target text file pair training for MT+speech

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Include supervision text tokens in audio example's num tokens

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Disable conformer seq len NCCL sync

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Preliminary sampler fusion strategies support: mux/zip/round_robin/randomized_round_robin

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Working V2 version of multimodal dataloading. Each modality gets its own batch settings that can be merged with zip sampler to enjoy max batch sizes for both modalities in a single training step. Each modality runs fwd+bwd in turn to save GPU memory (instead of running fwd separately and bwd together).

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add missing config

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert multimodal grad accum and fix mask padding issue

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add modality weights support via cfg.model.modality_weights

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix for V2 dataloader shuffling CRITICAL

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Restore multimodal grad accum

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix unit tests for multi-sampler configurations

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* nemo gemma to hf conversion (#9629)

* adding script for gemma nemo to hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* adding verification for convert_gemma_nemo_to_hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

---------

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

* support FSDP (thank Yifan for early trying) (#10062)

Note: as of now, this is still not fully working on the cluster. See above doc for details.
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Fix unit tests after rebasing on recent main

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* support megatron_amp_O2 and tp (#10599)

* Port changes related to SFT text+speech dataloading

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert changes from Canary(nonLLM) code

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add joint text/audio dataloading capability to speechllm

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* include text-only into fprop of training and eval; TODO: text-only predict

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Actually working forward step

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Support for source-target text file pair training for MT+speech

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Include supervision text tokens in audio example's num tokens

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Disable conformer seq len NCCL sync

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Preliminary sampler fusion strategies support: mux/zip/round_robin/randomized_round_robin

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Working V2 version of multimodal dataloading. Each modality gets its own batch settings that can be merged with zip sampler to enjoy max batch sizes for both modalities in a single training step. Each modality runs fwd+bwd in turn to save GPU memory (instead of running fwd separately and bwd together).

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add missing config

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert multimodal grad accum and fix mask padding issue

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add modality weights support via cfg.model.modality_weights

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix for V2 dataloader shuffling CRITICAL

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Restore multimodal grad accum

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix unit tests for multi-sampler configurations

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* nemo gemma to hf conversion (#9629)

* adding script for gemma nemo to hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* adding verification for convert_gemma_nemo_to_hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

---------

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

* support FSDP (thanks to Yifan for trying it early)

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* debug TP deadlock

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* some fixes for fsdp and tp

/lustre/fsw/portfolios/llmservice/users/zhehuaic/results/canary-v0_speechllm/prompt_lhmerge5_p2b_oci_FC-GPT_llama_canaryset_b6s4kf-sunolong_noCC_langtemp0.5_dsettemp0.5_lr1e-4wd1e-3_CosineAnnealing_warmup2500_minlr1e-6_gbs2048_mbs16_ep200/error-1417621-0.out

/lustre/fsw/portfolios/llmservice/users/zhehuaic/results/canary-v0_speechllm/prompt_lhmerge5_p2b_tp_oci_FC-GPT_llama_canaryset_b6s4kf-sunolong_noCC_langtemp0.5_dsettemp0.5_lr1e-4wd1e-3_CosineAnnealing_warmup2500_minlr1e-6_gbs128_mbs16_ep200/error-1421103-3.out

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* nit fix
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix for llama3.1
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* for llama3.1
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix for inference
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix inference
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix grad accu
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix inference
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* initial impl to support megatron_amp_O2 in salm, bestow, salm-t5

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Piotr Żelasko <pzelasko@nvidia.com>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>
Co-authored-by: pzelasko <pzelasko@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <93558329+krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

* minor change in dataloader (#10601)

* Speechllm dataset basic unit test (#10631)

* Basic unit test for speechllm lhotse dataset

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* cleanup

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Unit test for existing speechllm dataset with llama2 prompt format (#10634)

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* [speechllm] Replace TextProcessing with PromptFormatter (#10639)

* [speechllm] Replace TextProcessing with PromptFormatter

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Test for tokens_to_generate

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Padding optimization for speechlm dataset

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Multimodal conversation format dataloading (#10683)

* Draft implementation of NeMo Multimodal Conversation format

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working data parsing and iteration

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working dataloading with tokenization + prompting

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Collapse consecutive user turns into single turn

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* a few fixes for the new prompt template based dataloader and lora+distributed fused adam (#10701)

* Draft implementation of NeMo Multimodal Conversation format

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working data parsing and iteration

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working dataloading with tokenization + prompting

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Collapse consecutive user turns into single turn

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* compatible with previous expts

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support gemma

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* handle the case max_seq_length is smaller than input_id length

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix max seq case

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix lora ckpt storing and loading

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* temp fix for distributed fused adam

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* revert changes in nemo_adapters.py
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Fix tokenize_with_prompt

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>

* Mechanism to insert BOS/EOS at the beginning/end of dialog (#10923)

* Mechanism to insert BOS/EOS at the beginning/end of dialog

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix Gemma prompt formatter test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add a test specifically for multiturn insertion of bos/eos

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add options to override default map/iterable dataset style selection in lhotse dataloader

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Feature/conversations tarred (#11086)

* Multimodal conversation tarring script

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix sharding logic

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix dir creation

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* EMMeTT support in SpeechLLM + tutorial for Lhotse Multimodal Dataloading (#10927)

* Preliminary support for oomptimizer

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* OOMptimizer for SpeechLLM

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Initial version of estimate token bins script

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Initial support for multimodal 2d bucketing

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Extend to text-to-text oomptimizer

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Preliminary support for Llama2 prompt format in ast+mt

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Support for 1D estimate token bins

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Support for 1D estimate token bins

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Minor tweaks

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add min/max tokens filter

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Change to bisect_left for bucket idx selection

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add reconfigure_num_microbatches_calculator at the start of train epoch for modular models

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Update lhotse multi-sampler config and make validation datasets finite

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Initial implementation of text+audio training for T5 modular models

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* megatron t5 nmt prompt formatter

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes for MT+AST T5 oomptimizer and training

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* configs, fixes, token-per-token filtering

* Support text modality in predict_step

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Support text data in val/test dl

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix infinite

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* prompt format fixes

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes in audio supervision

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* remove superficial padding

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* test config and prompt context fetching fixes

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* support text-only decoding for salm/bestow

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add unit tests for EMMETT / refactor prompt_format_fn

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* make t5nmt prompt formatter auto discoverable

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* include token count / tpt filtering in estimate_token_bins

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix max token filter

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* some fixes

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* custom mixin for text adapters

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Warmup in oomptimizer-speechlm

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Move oomptimizer-speechllm to separate directory

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Initial cleanup

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Refactoring of prompt format fn and length measurement and filtering for data types; improved unit test coverage

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Refactor sampler constraints / filters into sampling.py

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Tests and support for sampler length measurement of multimodal conversations

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Update estimate_token_bins.py

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Move estimate_token_bins.py to speech_llm scripts

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Minor tweaks

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes for SpeechLLM dataset

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* Add missing emmett tests

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add tutorial about multimodal lhotse dataloading

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Updated documentation for multimodal dataloading

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Prompt Formatter tutorial

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Review comments

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes for sampling filters None values

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Changes requested by Steve: moving some args to main config namespace in multi config sampler

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Update default configs to the modified config schema

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix omegaconf use issue

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Update the docs to the modified multi config format

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
Co-authored-by: pzelasko <pzelasko@users.noreply.github.com>

* Remove old TODO comments

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Remove prompts/fn.py

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Copyright notices

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Make linter happy

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Make linter happy

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix megatron test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix megatron test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Disable plugin for high entropy strings in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix CodeQL errors

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix unit tests

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix another unit test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix multimodal tests

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* fixes after merging canary2 pr to main

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix headers

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix canary integration test + formatting

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Address reviews - add sync_max_audio_length flag for conformer encoder

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Revert change in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Revert change in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Revert change in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Address code review

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Address Steve's review

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Co-authored-by: pzelasko <pzelasko@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <93558329+krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: zhehuaichen <139396994+zhehuaichen@users.noreply.github.com>

* Sync validation metrics for ASRModel (#11533)

* Sync validation metrics for ASRModel

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* support sync for single-dataloader case

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* NeMo 2.0 In-framework deployment support (#11523)

* nemo 2 support

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Remove unwanted params in DDP init in Megatron Parallel

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* nemo2 working with query

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* multigpu deployment with nemo2 works

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* add max output length

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Remove prints

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Fix merge conflicts

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* readded this file

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

---------

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* Add SFT/PEFT HF tests (#11519)

* Add SFT/PEFT HF tests

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move hf examples to examples dir

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* bot

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use mini_squad

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use mini_squad

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* add 2gpu DDP

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use labels as passed by the user

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* update samples/ tests

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* rm unused imports

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add tests with subset split names, e.g. train[:100]

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* add --disable-ckpt

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use self-hosted-azure-gpus-1 for single-gpu test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add TRANSFORMERS_OFFLINE=1 to hf tests

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* Fix typo: LocalNonpersitentObject -> LocalNonpersistentObject (#11546)

Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>

* Adding documentation for packed dataset preparation with context para… (#11564)

* adding documentation for packed dataset preparation with context parallel

Signed-off-by: Lifu Zhang <tomzhanglf@gmail.com>

* addressing Anna Shor's comment

Signed-off-by: Lifu Zhang <tomzhanglf@gmail.com>

---------

Signed-off-by: Lifu Zhang <tomzhanglf@gmail.com>

* have micro_batch_size and global_batch_size as class attributes in mock datamodule (#11563)

* Revert "Fix the names of two sets of weight and bias in mcore_to_nemo_mapping" (#11560)

* Revert "Fix the names of two sets of weight and bias in mcore_to_nemo_mapping (#9628)"

This reverts commit 6784db56a03f19f37bc4f37bdf87dabb3fc1acee.

* keep underscores

Signed-off-by: ashors1 <ashors@nvidia.com>

---------

Signed-off-by: ashors1 <ashors@nvidia.com>

* add huggingface-based tokenizer support for mixtral HF -> .nemo (#11572)

* add huggingface-based tokenizer support

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>

---------

Signed-off-by: dimapihtar <dpihtar@gmail.com>
Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>
Co-authored-by: dimapihtar <dimapihtar@users.noreply.github.com>

* Github Actions tests for Llava Next and modify pretrain recipe to have language model path (#11424)

* modified pretrain recipe to have language_model_from_pretrained

* ci test for llava next

* fixed indent/lint issue in cicd yml file

* fix lint issues

* Apply isort and black reformatting

Signed-off-by: yashaswikarnati <yashaswikarnati@users.noreply.github.com>

* Update .github/workflows/cicd-main.yml

Co-authored-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Yashaswi Karnati <144376261+yashaswikarnati@users.noreply.github.com>

* Update .github/workflows/cicd-main.yml

Co-authored-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Yashaswi Karnati <144376261+yashaswikarnati@users.noreply.github.com>

---------

Signed-off-by: yashaswikarnati <yashaswikarnati@users.noreply.github.com>
Signed-off-by: Yashaswi Karnati <144376261+yashaswikarnati@users.noreply.github.com>
Co-authored-by: yashaswikarnati <yashaswikarnati@users.noreply.github.com>
Co-authored-by: oliver könig <okoenig@nvidia.com>

* Fix SingleDeviceStrategy support in Nsys callback (#11574)

* fix for SingleDeviceStrategy

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* mini refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* remove dialogue scripts and docs (#11577)

* remove deprecated scripts

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove deprecated docs

Signed-off-by: dimapihtar <dpihtar@gmail.com>

---------

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* add JitTransform (#11131)

* add JitTransform

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fixes

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add JiT CB test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove stale imports

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* cleanup

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add jit callback test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* fix param passing

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use sgd in test_nemo_jit_cb

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add thunder call

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Use .compile method to avoid changing module structure

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Use JitConfig

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* thunder setting

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* avoid reentry

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove optional

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* rewrite

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor & module_selector

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Signed-off-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* NeMo 2.0 documentation upgrade (#11235)

* update attention

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update docs to NeMo 2.0

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update usage

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update parallelism

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update parallelism docs

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update parallelism docs

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* fix style

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update to NeMo 2.0

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* NeMo 2.0 update

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* NeMo 2.0 update

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove deprecated file

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update in respect to NeMo 2.0

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* fix hyperlinks

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove deprecated

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove deprecated

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update documentation to NeMo 2.0

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* fix typo

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* fix punctuation

Signed-off-by: dimapihtar <dpihtar@gmail.com>

---------

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Remove auto-import of lhotse when importing nemo.collections.common.data (#11578)

* Remove auto-import of lhotse when importing nemo.collections.common.data

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix test import

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix example configs (#11571)

* Fix example configs

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* Fix line length

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

---------

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* fix (#11575)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* NIM supporting changes for nemo.export for NeMo 2.0 (#11488)

* Move torch_dtype_from_precision for independent export module

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Apply isort and black reformatting

Signed-off-by: janekl <janekl@users.noreply.github.com>
Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Remove unused imports

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Fix too long lines

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Apply isort and black reformatting

Signed-off-by: janekl <janekl@users.noreply.github.com>
Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Fix signature and default for megatron_amp_O2

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

---------

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
Signed-off-by: janekl <janekl@users.noreply.github.com>
Co-authored-by: Bobby Chen <bobchen@nvidia.com>
Co-authored-by: janekl <janekl@users.noreply.github.com>

* AED greedy confidence estimation (#11573)

* upload

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: GNroy <GNroy@users.noreply.github.com>

* set prompt confidence dtype at initialization

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

---------

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Signed-off-by: GNroy <GNroy@users.noreply.github.com>
Co-authored-by: GNroy <GNroy@users.noreply.github.com>

* gemma fix (#11587)

* Update T5 DataModule regarding Pretrain/Finetune validate (#11584)

* update datamodule to have mbs/gbs

* update datamodule to have mbs/gbs

* Apply isort and black reformatting

Signed-off-by: huvunvidia <huvunvidia@users.noreply.github.com>

---------

Signed-off-by: huvunvidia <huvunvidia@users.noreply.github.com>
Co-authored-by: Huy Vu2 <huvu@login-eos02.eos.clusters.nvidia.com>
Co-authored-by: huvunvidia <huvunvidia@users.noreply.github.com>

* fix llama3 (#11580)

* Add Hf nemorun tests (#11566)

* minor fixes for recipe

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* add peft nemorun script

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* add sft script and data module

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* Apply isort and black reformatting

Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

* clean up

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* add disable ckpt and data config for tests

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* Apply isort and black reformatting

Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

* add tests to cicd yaml

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* cleanup

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

---------

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>
Co-authored-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

* [🤖]: Howdy folks, let's bump NeMo-Toolkit to `2.2.0rc0` ! (#11555)

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Pass the number of experts to modelopt layer spec (#11607)

* Pass number of experts to modelopt layer spec

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Fix too long lines

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

---------

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Adding changes to asr documentation (#11397)

Signed-off-by: Ssofja <sofiakostandian@gmail.com>

* Support Cosmos tokenizer TensorRT inference (#11472)

* Add cosmos TRT

* Add trt run script

* Apply isort and black reformatting

Signed-off-by: meatybobby <meatybobby@users.noreply.github.com>

* Clean code

* Fix CodeQL

---------

Signed-off-by: meatybobby <meatybobby@users.noreply.github.com>
Co-authored-by: meatybobby <meatybobby@users.noreply.github.com>

* Neva updates to latest mcore and some fixes (#11565)

* api updates and fixes

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix arg

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

---------

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Co-authored-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* add nemo2-sft-peft to readme (#11613)

Signed-off-by: Huiying Li <willwin.lee@gmail.com>

* Set Minitron width pruning batch size 1 (#11603)

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

* Disable CP for running Inference using megatron_gpt_eval (#11547)

* Disable CP for megatron_gpt_eval

* Apply isort and black reformatting

Signed-off-by: suiyoubi <suiyoubi@users.noreply.github.com>

* Update examples/nlp/language_modeling/megatron_gpt_eval.py

Co-authored-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: Ao Tang <mike.tang96@gmail.com>

---------

Signed-off-by: suiyoubi <suiyoubi@users.noreply.github.com>
Signed-off-by: Ao Tang <mike.tang96@gmail.com>
Co-authored-by: suiyoubi <suiyoubi@users.noreply.github.com>
Co-authored-by: Chen Cui <chcui@nvidia.com>

* ci: Add `no-fail-fast` mode (#11608)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Chat dataset support (#11423)

* chat dataset support

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* add ci test

Signed-off-by: Chen Cui <chcui@nvidia.com>

* address comment

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* address comment

Signed-off-by: Chen Cui <chcui@nvidia.com>

---------

Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: cuichenx <cuichenx@users.noreply.github.com>

* Sortformer Diarizer 4spk v1 model PR Part 2: Unit-tests for Sortformer Diarizer. (#11336)

* Adding the first pr files models and dataset

Signed-off-by: taejinp <tango4j@gmail.com>

* Tested all unit-test files

Signed-off-by: taejinp <tango4j@gmail.com>

* Name changes on yaml files and train example

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Reflecting comments and removing unnecessary parts for this PR

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Adding docstrings to reflect the PR comments

Signed-off-by: taejinp <tango4j@gmail.com>

* removed the unused find_first_nonzero

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Fixed all pylint issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolving pylint issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Removing unused variable in audio_to_diar_label.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixed docstrings in training script

Signed-off-by: taejinp <tango4j@gmail.com>

* Line-too-long issue from Pylint fixed

Signed-off-by: taejinp <tango4j@gmail.com>

* Adding get_subsegments_scriptable to prevent jit.script error

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Addressed Code-QL issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolved conflicts on bce_loss.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Adding all the diarization related unit-tests

Signed-off-by: taejinp <tango4j@gmail.com>

* Moving speaker task related unit test files to speaker_tasks folder

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixed uninit variable issue in bce_loss.py spotted by codeQL

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Fixing code-QL issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Reflecting PR comments from weiqingw

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Line too long pylint issue resolved in e2e_diarize_speech.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolved unused variable issue in model test

Signed-off-by: taejinp <tango4j@gmail.com>

* Reflecting the comment on Nov 21st, 2024.

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Unused variable import time

Signed-off-by: taejinp <tango4j@gmail.com>

* Adding docstrings to score_labels() function in der.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Reflecting comments on YAML files and model file variable changes.

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Added get_subsegments_scriptable for legacy get_subsegment functions

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolved line too long pylint issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Added training and inference CI-tests

Signed-off-by: taejinp <tango4j@gmail.com>

* Added the missing parse_func in preprocessing/collections.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Adding the missing parse_func in preprocessing/collections.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixed an indentation error

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolved multi_bin_acc and bce_loss issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolved line-too-long for msdd_models.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Code QL issues and fixed test errors

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* line too long in audio_to_diar_label.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* resolving CICD test issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixing codeQL issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixed pin memory False for inference

Signed-off-by: taejinp <tango4j@gmail.com>

---------

Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Co-authored-by: tango4j <tango4j@users.noreply.github.com>

* 2x more memory efficient Graph-based RNN-T (#11169)

* Optimized Graph-Transducer implementation

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

---------

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: artbataev <artbataev@users.noreply.github.com>
Co-authored-by: artbataev <artbataev@users.noreply.github.com>

* Use explicit subpaths in io for exporting a checkpoint (#11352)

* Fix llm.export_ckpt

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

---------

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Remove triton requirement (#11627)

* Specify pytorch-triton instead of triton

Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove triton

Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>

---------

Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* ci: Remove comment if no changes required anymore (#11624)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Jit with peft (#11586)

* move jitransform at the end

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add docstring & post-init

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add remove_extra_batch_keys and remove align_labels

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Run JitTransform on_train_epoch_start

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add --use-torch-jit option

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add docstrings

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* pep8

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* NeMo-UX: add Hf's AutoModelForImageTextToText (#11321)

* init commit

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* wip

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* peft example

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* move peft example to multimodal_llm

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* surface HFAutoModelForImageTextToText

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add hf vlm dataset

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move processor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* train_log -> train_loss

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* vlm.HFDatasetDataModule pass collate_fn as argument

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Update peft example

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove unused var

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Move example

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* remove unused

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Small change

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Fix loss calculation

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add extract_skipped_token_ids

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Use vlm.HFAutoModelForImageTextToText.extract_skipped_token_ids

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Update logits/labels handling

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add trust_remote_code to configure_processor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* mini refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add LLAMA_TOKENS

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* update hf_dataset

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add lora_dtype for models with non-FP weights

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add load_in_4bit option

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add default_dtype

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add load_in_4bit to llm collection

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* rm import

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix asset path

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move vlm test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move data offline

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use single gpu

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* pylint fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* pylint

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* pylint

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* drop align_labels

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove align_labels from llm too

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use loss * mask instead of loss[mask == 1]

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix path

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Signed-off-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* ci: Bump release workflow (#11635)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Add fix docstring for speech commands (#11638)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fixing Multi_Task_Adapters.ipynb by replacing canary2 with canary_custom (#11641)

Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>

* fixed config name in online augmentation tutorial (#11628)

Signed-off-by: Rauf <rnasretdinov@nvidia.com>

* fix default nodes (#11632)

* add renormalize_blend_weights param (#11647)

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Sortformer Diarizer 4spk v1 model PR Part 3: Speaker Diarization Mixin (#11511)

* Adding diarization mixin for one click inference

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolving CodeQL and Pylint

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolving CodeQL and Pylint - unsaved files resolved

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Unused package manifest_utils

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolved diarization mixin test issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Removed commented lines

Signed-off-by: taejinp <tango4j@gmail.com>

* updating mixins code

Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: ipmedenn <ipmedenn@users.noreply.github.com>

* fixing test_diarizartion.py

Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>

* moving diarization postprocessing-related stuff to vad_utils.py

Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: ipmedenn <ipmedenn@users.noreply.github.com>

* Resolving PyLint

Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: ipmedenn <ipmedenn@users.noreply.github.com>

* fixing batch_idx issue in sortformer_diar_models.py

Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>

* adding sync_dist=True for sortformer validation

Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Reflecting the comments from PR

Signed-off-by: taejinp <tango4j@gmail.com>

* Reflecting the comments from PR 2nd

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolved a codeQL unused variable

Signed-off-by: taejinp <tango4j@gmail.com>

* Now moved existence check after

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

---------

Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Signed-off-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>
Signed-off-by: ipmedenn <ipmedenn@users.noreply.github.com>
Co-authored-by: tango4j <tango4j@users.noreply.github.com>
Co-authored-by: ipmedenn <65592416+ipmedenn@users.noreply.github.com>
Co-authored-by: ipmedenn <ipmedenn@users.noreply.github.c…