
clean a initial version of example config; make sure it works by test #2

Merged
zhehuaichen merged 1 commit into speechllm from speechllm_zhc on Aug 3, 2023

Conversation

@zhehuaichen (Owner) commented on Aug 3, 2023

test_init_and_train
TODO: add audio-related part

What does this PR do?

Add a one-line overview of what this PR aims to accomplish.

Collection: [Note which collection this PR will affect]

Changelog

  • Add specific line-by-line info of high-level changes in this PR.

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this 
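
A minimal, hypothetical sketch of how the example config from this PR could be exercised; the config path below is an assumption, and the test invocation follows the test_init_and_train test mentioned in the description and the test_speechllm_models.py file referenced in later commits:

```python
# Hypothetical sketch only: the config path is an assumption, not taken verbatim from this PR.
from omegaconf import OmegaConf

# Load the cleaned-up example config and inspect it.
cfg = OmegaConf.load("examples/multimodal/speechllm/conf/modular_audio_gpt_config.yaml")
print(OmegaConf.to_yaml(cfg))

# The PR validates the config through a unit test; a typical invocation might be:
#   pytest tests/collections/multimodal/test_speechllm_models.py -k test_init_and_train
```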

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (Ex: Numba, Pynini, Apex etc)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs in various areas.

Additional Information

  • Related to # (issue)

@zhehuaichen zhehuaichen merged commit 0c626ce into speechllm Aug 3, 2023
@zhehuaichen zhehuaichen deleted the speechllm_zhc branch August 3, 2023 19:57
zhehuaichen added a commit that referenced this pull request Oct 4, 2023
zhehuaichen added a commit that referenced this pull request Oct 4, 2023
zhehuaichen added a commit that referenced this pull request Oct 4, 2023
zhehuaichen added a commit that referenced this pull request Oct 4, 2023
zhehuaichen added a commit that referenced this pull request Oct 4, 2023
…#2)

approve as no need to review

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
zhehuaichen added a commit that referenced this pull request Oct 4, 2023
…#2)

approve as no need to review

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
zhehuaichen added a commit that referenced this pull request Oct 4, 2023
zhehuaichen added a commit that referenced this pull request Oct 26, 2023
…IDIA-NeMo#7634)

* add initial impl of ModularizedSpeechGPTModel and integration test

* fix typo in the test name (#1)

approve the nit change

* clean a initial version of example config; make sure it works by test (#2)

approve as no need to review

* add the test for training_step and fix the code correspondingly (test passed now) (#3)

* add test for validation_step (#4)

* mv audio and text emb concat to prepare_llm_input so as to write test to guard the llm input

* Merge heh and zhehuai's initial version of frozen am+llm (#5)

* Merge heh and zhehuai's initial version of frozen am+llm

The previous differences are summarized here:
https://docs.google.com/document/d/1zNI4hC6vJtUfcHbrUSPaMuYWRBQdN_36H0P2NiBiuPY/edit

This PR includes:
1. Finishing the merge of the model, dataset, and config code
2. Keeping the previous tests enabled and passing (prepare_llm_input, training_step,
   validation_step)
3. Running the example training script with LS960 to make sure the training pipeline works

The major remaining works are listed here
https://docs.google.com/document/d/1o0AM7v4gcTQkPZjE0Vl9TTX4vYnGTrbXEFGWh0UhGlk/edit#bookmark=id.pzvdadt5oxyw

---------

Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* fix a nit init bug broke test (#6)

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Clean up implementation for SALM paper and sync to NEMO v1.20.0 (#18)

* wip

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix data

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix consumed_samples

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix the training restart problem by storing adapter+perception model and
init them from the ckpt

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* refix state dict

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support wer and inf

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* nan guard

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* reimpl inf and bug fix

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* multi loader

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* unfreeze lm

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* flag for load am

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* tokenizer

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* overwrite vocab size

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support bpe dropout

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* add tarred datasets

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix sample_alpha

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix bpe dropout bugs in the mismatched context in tokenization

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* add bleu metric

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update metrics

Signed-off-by: stevehuang52 <heh@nvidia.com>

* support inference and fix a bug in wer calculation

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix bucketing dataset

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix bleu implementation

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support question set file per dataset/data loader in preparation for
multitask understanding; also fix bleu implementation

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support simple random context for word boosting

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* use sacrebleu.corpus_bleu to be consistent with the rest

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* make audio_file optional in the data loader

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* add a tool to materialize mt and text data

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* compatible with tar dataset

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* temp fix for metric and speed up materialization

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* make num of context configurable

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* val_check_interval fix; make manifest dumping consistent with speech models

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* random_context_positive_ratio configurable to control precision

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* bug fix: freeze_llm flag is not passed to the model cfg

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* overwrite tensor_model_parallel_size

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support both stt and ssl models for loading audio encoder

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix the inference config so as to use sampling; allow inference config update in training

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* refactorize and clean up code for preprocessing collections, dataset interface, model inference and rename some classes to be consistent with salm paper.
also make sure test passed

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Undo changes in megatron_gpt_peft_models.py and move them to speechllm_models.py; make sure the correctness by test_speechllm_models.py::TestModularizedAudioGPTModel::test_predict_step

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* update default inference config and test golden value accordingly

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* integration test and minor fix

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* nit bug fix on manifest_filepath introduced by code cleanup

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* update workspace/ files; consider moving to examples later

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* further remove unnecessary stuff in the inference implementation

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* revert the update in default end_string to be compatible with legacy models

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

---------

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* rename 'ModularizedAudioGPTModel' to 'ModularAudioGPTLoRAModel'; move speechllm stuff under nemo/collections/multimodal/speechllm

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* update copyright; remove workspace/scripts and workspace/tools folders since the main branch has LLaMA support

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

---------

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: Zhehuai Chen <chenzhehuai.sjtu@aispeech.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: stevehuang52 <heh@nvidia.com>
zhehuaichen pushed a commit that referenced this pull request Feb 21, 2025
* upcycle dense to moe

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix(?) path when saving

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* bot happy

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* bot happy #2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add unwrap method

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* move file

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
zhehuaichen pushed a commit that referenced this pull request Feb 21, 2025
…#11500)

* Make HfDatasetDataModule a datasets.load_dataset wrapper

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add logging

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Update HFDatasetDataModule

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor fixup

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor fixup #2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* do not expand

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* doc

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* doc

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add synonym

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Add train/val/test attributes

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add test for hf-datamodule

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Import lazily to avoid breaking with older megatron versions

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* bot happy

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* bot happy2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add doc-strings and collate-fn arg

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
zhehuaichen added a commit that referenced this pull request Jun 16, 2025
* Add fsdp2 strategy

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: BoxiangW <BoxiangW@users.noreply.github.com>

* Add imports

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: BoxiangW <BoxiangW@users.noreply.github.com>

* Add init import

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: BoxiangW <BoxiangW@users.noreply.github.com>

* Fix mixtral export for NeMo 2.0 (#11532)

* Initial commit

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>

---------

Signed-off-by: Piotr Kaminski <pikaminski@nvidia.com>
Signed-off-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>
Co-authored-by: Piotr Kaminski <pikaminski@nvidia.com>
Co-authored-by: Laplasjan107 <Laplasjan107@users.noreply.github.com>

* Make HFDatasetDataModule a datasets.load_dataset wrapper (#11500)

* Make HfDatasetDataModule a datasets.load_dataset wrapper

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add logging

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Update HFDatasetDataModule

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor fixup

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor fixup #2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* do not expand

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* doc

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* doc

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add synonym

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Add train/val/test attributes

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add test for hf-datamodule

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Import lazily to avoid breaking with older megatron versions

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* bot happy

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* bot happy2

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add doc-strings and collate-fn arg

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* ci: Bump release workflow (#11544)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* ci: Use SHA for cut-off (#11545)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* link to mcore documentation (#11538)

Signed-off-by: ashors1 <ashors@nvidia.com>

* ci: Adjust inputs for code-freeze workflow (#11550)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* ci: Bump release freeze (#11551)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Ko3n1g/ci/commit sha for cutoff (#11553)

* ci: Remove token from checkout

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* bump version

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

---------

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* ci: Bump code-freeze workflow (#11554)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* ci: Bump code freeze workflow (#11557)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Fix deploy conflicts in llm.api (#11367)

* Fix llm.deploy api

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>

* PR feedback

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

---------

Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: hemildesai <hemildesai@users.noreply.github.com>
Co-authored-by: hemildesai <hemildesai@users.noreply.github.com>

* perf summary docs link (#11262)

Signed-off-by: Malay Nagda <malayn@nvidia.com>
Co-authored-by: oliver könig <okoenig@nvidia.com>

* Add vlm nemo run scripts (#11394)

* update recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix mllama mock ds

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update to use attention bias

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* remove example

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring mock.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix docstring language.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring language.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring mllama/base.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring mllama/language.py

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* bump mcore

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Add scripts for mllama

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* update script

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix pylint

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* revert Dockerfile.ci

Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>

* add scripts

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* add vlm training test in ci

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix docstring issues

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update script match recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update recipes

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Update mllama_train.py

Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>

* update mllama 90b recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update to use tmp in ci tests

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update default llava config

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* add nemo run scripts

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix vpp issue

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix cicd

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix cicd

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* remove duplicated script

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* ci: Add HF cache

Signed-off-by: oliver könig <okoenig@nvidia.com>

* update to use SP in recipe

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* upgrade

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Revert "upgrade"

This reverts commit f6ad2cd76abcdd9258cb53a25c788fd658189150.

* update neva api

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* update neva api

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix neva processing

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix lint

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix data fields

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* few fixes

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

---------

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Signed-off-by: Oliver Koenig <okoenig@nvidia.com>
Signed-off-by: Yu Yao <54727607+yaoyu-33@users.noreply.github.com>
Signed-off-by: oliver könig <okoenig@nvidia.com>
Co-authored-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Co-authored-by: Oliver Koenig <okoenig@nvidia.com>

* Add from_dict to HFDatasetDataModule (#11559)

* Add from_dict method

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add test_load_from_dict

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add test_load_from_dict

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* Prevent llama3.1 from using Linear interpolation (#11548)

* prevent llama3.1 from using linear interpolation

* Apply isort and black reformatting

Signed-off-by: suiyoubi <suiyoubi@users.noreply.github.com>

---------

Signed-off-by: suiyoubi <suiyoubi@users.noreply.github.com>
Co-authored-by: suiyoubi <suiyoubi@users.noreply.github.com>

* [TTS] Add audio and mel codec HF models to docs (#11526)

Signed-off-by: Ryan <rlangman@nvidia.com>

* Update for NEST release (#11537)

* update for nest release

Signed-off-by: stevehuang52 <heh@nvidia.com>

* make pylint happier

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix for lhotse dataloader

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update yaml

Signed-off-by: stevehuang52 <heh@nvidia.com>

* minor refactor

Signed-off-by: stevehuang52 <heh@nvidia.com>

* clean up

Signed-off-by: stevehuang52 <heh@nvidia.com>

* clean up

Signed-off-by: stevehuang52 <heh@nvidia.com>

---------

Signed-off-by: stevehuang52 <heh@nvidia.com>

* Merging SpeechLLM development branch (#11462)

* Port changes related to SFT text+speech dataloading

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert changes from Canary(nonLLM) code

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add joint text/audio dataloading capability to speechllm

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* include text-only into fprop of training and eval; TODO: text-only
predict

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Actually working forward step

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Support for source-target text file pair training for MT+speech

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Include supervision text tokens in audio example's num tokens

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Disable conformer seq len NCCL sync

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Preliminary sampler fusion strategies support: mux/zip/round_robin/randomized_round_robin

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Working V2 version of multimodal dataloading. Each modality gets its own batch settings that can be merged with zip sampler to enjoy max batch sizes for both modalities in a single training step. Each modality runs fwd+bwd in turn to save GPU memory (instead of running fwd separately and bwd together).

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add missing config

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert multimodal grad accum and fix mask padding issue

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add modality weights support via cfg.model.modality_weights

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix for V2 dataloader shuffling CRITICAL

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Restore multimodal grad accum

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix unit tests for multi-sampler configurations

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* nemo gemma to hf  conversion (#9629)

* adding script for gemma nemo to hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* adding verification for convert_gemma_nemo_to_hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

---------

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

* support FSDP (thank Yifan for early trying) (#10062)

Note: as of now, this is still not fully working on the cluster. See above doc for details.
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Fix unit tests after rebasing on recent main

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* support megatron_amp_O2 and tp (#10599)

* Port changes related to SFT text+speech dataloading

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert changes from Canary(nonLLM) code

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add joint text/audio dataloading capability to speechllm

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* include text-only into fprop of training and eval; TODO: text-only
predict

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Actually working forward step

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Support for source-target text file pair training for MT+speech

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Include supervision text tokens in audio example's num tokens

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Disable conformer seq len NCCL sync

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Preliminary sampler fusion strategies support: mux/zip/round_robin/randomized_round_robin

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Working V2 version of multimodal dataloading. Each modality gets its own batch settings that can be merged with zip sampler to enjoy max batch sizes for both modalities in a single training step. Each modality runs fwd+bwd in turn to save GPU memory (instead of running fwd separately and bwd together).

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add missing config

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Revert multimodal grad accum and fix mask padding issue

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add modality weights support via cfg.model.modality_weights

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix for V2 dataloader shuffling CRITICAL

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Restore multimodal grad accum

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Fix unit tests for multi-sampler configurations

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* nemo gemma to hf  conversion (#9629)

* adding script for gemma nemo to hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* adding verification for convert_gemma_nemo_to_hf

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

---------

Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

* support FSDP (thank Yifan for early trying)

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* debug TP deadlock

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* some fixes for fsdp and tp

/lustre/fsw/portfolios/llmservice/users/zhehuaic/results/canary-v0_speechllm/prompt_lhmerge5_p2b_oci_FC-GPT_llama_canaryset_b6s4kf-sunolong_noCC_langtemp0.5_dsettemp0.5_lr1e-4wd1e-3_CosineAnnealing_warmup2500_minlr1e-6_gbs2048_mbs16_ep200/error-1417621-0.out

/lustre/fsw/portfolios/llmservice/users/zhehuaic/results/canary-v0_speechllm/prompt_lhmerge5_p2b_tp_oci_FC-GPT_llama_canaryset_b6s4kf-sunolong_noCC_langtemp0.5_dsettemp0.5_lr1e-4wd1e-3_CosineAnnealing_warmup2500_minlr1e-6_gbs128_mbs16_ep200/error-1421103-3.out

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* nit fix
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix for llama3.1
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* for llama3.1
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix for inference
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix inference
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix grad accu
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix inference
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* initial impl to support megatron_amp_O2 in salm, bestow, salm-t5

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Piotr Żelasko <pzelasko@nvidia.com>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>
Co-authored-by: pzelasko <pzelasko@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <93558329+krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>

* minor change in dataloader (#10601)

* Speechllm dataset basic unit test (#10631)

* Basic unit test for speechllm lhotse dataset

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* cleanup

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Unit test for existing speechllm dataset with llama2 prompt format (#10634)

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* [speechllm] Replace TextProcessing with PromptFormatter (#10639)

* [speechllm] Replace TextProcessing with PromptFormatter

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Test for tokens_to_generate

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Padding optimization for speechlm dataset

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Multimodal conversation format dataloading (#10683)

* Draft implementation of NeMo Multimodal Conversation format

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working data parsing and iteration

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working dataloading with tokenization + prompting

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Collapse consecutive user turns into single turn

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* a few fixes for the new prompt template based dataloader and lora+distributed fused adam (#10701)

* Draft implementation of NeMo Multimodal Conversation format

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working data parsing and iteration

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fully working dataloading with tokenization + prompting

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Collapse consecutive user turns into single turn

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* compatible with previous expts

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support gemma

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* handle the case max_seq_length is smaller than input_id length

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix max seq case

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix lora ckpt storing and loading

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* temp fix for distributed fused adam

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* revert changes in nemo_adapters.py
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Fix tokenize_with_prompt

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Co-authored-by: Piotr Żelasko <petezor@gmail.com>

* Mechanism to insert BOS/EOS at the beginning/end of dialog (#10923)

* Mechanism to insert BOS/EOS at the beginning/end of dialog

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix Gemma prompt formatter test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add a test specifically for multiturn insertion of bos/eos

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add options to override default map/iterable dataset style selection in lhotse dataloader

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Feature/conversations tarred (#11086)

* Multimodal conversation tarring script

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix sharding logic

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix dir creation

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* EMMeTT support in SpeechLLM + tutorial for Lhotse Multimodal Dataloading (#10927)

* Preliminary support for oomptimizer

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* OOMptimizer for SpeechLLM

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Initial version of estimate token bins script

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Initial support for multimodal 2d bucketing

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Extend to text-to-text oomptimizer

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Preliminary support for Llama2 prompt format in ast+mt

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Support for 1D estimate token bins

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Support for 1D estimate token bins

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Minor tweaks

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add min/max tokens filter

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Change to bisect_left for bucket idx selection

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add reconfigure_num_microbatches_calculator at the start of train epoch for modular models

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Update lhotse multi-sampler config and make validation datasets finite

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Initial implementation of text+audio training for T5 modular models

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* megatron t5 nmt prompt formatter

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes for MT+AST T5 oomptimizer and training

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* configs, fixes, token-per-token filtering

* Support text modality in predict_step

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Support text data in val/test dl

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix infinite

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* prompt format fixes

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes in audio supervision

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* remove superficial padding

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* test config and prompt context fetching fixes

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* support text-only decoding for salm/bestow

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Add unit tests for EMMETT / refactor prompt_format_fn

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* make t5nmt prompt formatter auto discoverable

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* include token count / tpt filtering in estimate_token_bins

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix max token filter

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* some fixes

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* custom mixin for text adapters

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Warmup in oomptimizer-speechlm

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Move oomptimizer-speechllm to separate directory

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Initial cleanup

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Refactoring of prompt format fn and length measurement and filtering for data types; improved unit test coverage

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Refactor sampler constraints / filters into sampling.py

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Tests and support for sampler length measurement of multimodal conversations

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Update estimate_token_bins.py

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Move estimate_token_bins.py to speech_llm scripts

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Minor tweaks

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes for SpeechLLM dataset

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* Add missing emmett tests

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Add tutorial about multimodal lhotse dataloading

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Updated documentation for multimodal dataloading

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Prompt Formatter tutorial

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Review comments

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fixes for sampling filters None values

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Changes requested by Steve: moving some args to main config namespace in multi config sampler

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Update default configs to the modified config schema

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix omegaconf use issue

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Update the docs to the modified multi config format

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
Co-authored-by: pzelasko <pzelasko@users.noreply.github.com>

* Remove old TODO comments

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Remove prompts/fn.py

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Copyright notices

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Make linter happy

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Make linter happy

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix megatron test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix megatron test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Disable plugin for high entropy strings in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix CodeQL errors

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix unit tests

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix another unit test

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix multimodal tests

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>

* fixes after merging canary2 pr to main

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix headers

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* fix canary integration test + formatting

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Address reviews - add sync_max_audio_length flag for conformer encoder

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Revert change in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Revert change in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Revert change in secrets detector

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Address code review

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Address Steve's review

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <pzelasko@nvidia.com>
Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: Piotr Żelasko <petezor@gmail.com>
Signed-off-by: pzelasko <pzelasko@users.noreply.github.com>
Signed-off-by: Krishna Puvvada <kpuvvada@nvidia.com>
Signed-off-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Co-authored-by: pzelasko <pzelasko@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <93558329+krishnacpuvvada@users.noreply.github.com>
Co-authored-by: Krishna Puvvada <kpuvvada@nvidia.com>
Co-authored-by: krishnacpuvvada <krishnacpuvvada@users.noreply.github.com>
Co-authored-by: zhehuaichen <139396994+zhehuaichen@users.noreply.github.com>

* Sync validation metrics for ASRModel (#11533)

* Sync validation metrics for ASRModel

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* support sync for single-dataloader case

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* NeMo 2.0 In-framework deployment support (#11523)

* nemo 2 support

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Remove unwanted params in DDP init in Megatron Parallel

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* nemo2 working with query

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* multigpu deployment with nemo2 works

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* add max output lenght

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Remove prints

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* Fix merge conflicts

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

* readded this file

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>

---------

Signed-off-by: Onur Yilmaz <oyilmaz@nvidia.com>
Signed-off-by: Hemil Desai <hemild@nvidia.com>
Signed-off-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>
Co-authored-by: Hemil Desai <hemild@nvidia.com>
Co-authored-by: oyilmaz-nvidia <oyilmaz-nvidia@users.noreply.github.com>

* Add SFT/PEFT HF tests (#11519)

* Add SFT/PEFT HF tests

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move hf examples to examples dir

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* bot

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use mini_squad

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use mini_squad

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* add 2gpu DDP

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use labels as passed by the user

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* update samples/ tests

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* rm unused imports

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add tests with subset split names, e.g. train[:100]

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* add --disable-ckpt

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use self-hosted-azure-gpus-1 for single-gpu test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add TRANSFORMERS_OFFLINE=1 to hf tests

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* Fix typo: LocalNonpersitentObject -> LocalNonpersistentObject (#11546)

Signed-off-by: Ananth Subramaniam <ansubramania@nvidia.com>

* Adding documentation for packed dataset preparation with context para… (#11564)

* adding documentation for packed dataset preparation with context parallel

Signed-off-by: Lifu Zhang <tomzhanglf@gmail.com>

* addressing Anna Shor's comment

Signed-off-by: Lifu Zhang <tomzhanglf@gmail.com>

---------

Signed-off-by: Lifu Zhang <tomzhanglf@gmail.com>

* have micro_batch_size and global_batch_size as class attributes in mock datamodule (#11563)

* Revert "Fix the names of two sets of weight and bias in mcore_to_nemo_mapping" (#11560)

* Revert "Fix the names of two sets of weight and bias in mcore_to_nemo_mapping (#9628)"

This reverts commit 6784db56a03f19f37bc4f37bdf87dabb3fc1acee.

* keep underscores

Signed-off-by: ashors1 <ashors@nvidia.com>

---------

Signed-off-by: ashors1 <ashors@nvidia.com>

* add huggingface-based tokenizer support for mixtral HF -> .nemo (#11572)

* add huggingface-based tokenizer support

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Apply isort and black reformatting

Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>

---------

Signed-off-by: dimapihtar <dpihtar@gmail.com>
Signed-off-by: dimapihtar <dimapihtar@users.noreply.github.com>
Co-authored-by: dimapihtar <dimapihtar@users.noreply.github.com>

* Github Actions tests for Llava Next and modify pretrain recipe to have language model path (#11424)

* modified pretrain recipe to have language_model_from_pretrained

* ci test for llava next

* fixed indent/lint issue in cicd yml file

* fix lint issues

* Apply isort and black reformatting

Signed-off-by: yashaswikarnati <yashaswikarnati@users.noreply.github.com>

* Update .github/workflows/cicd-main.yml

Co-authored-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Yashaswi Karnati <144376261+yashaswikarnati@users.noreply.github.com>

* Update .github/workflows/cicd-main.yml

Co-authored-by: oliver könig <okoenig@nvidia.com>
Signed-off-by: Yashaswi Karnati <144376261+yashaswikarnati@users.noreply.github.com>

---------

Signed-off-by: yashaswikarnati <yashaswikarnati@users.noreply.github.com>
Signed-off-by: Yashaswi Karnati <144376261+yashaswikarnati@users.noreply.github.com>
Co-authored-by: yashaswikarnati <yashaswikarnati@users.noreply.github.com>
Co-authored-by: oliver könig <okoenig@nvidia.com>

* Fix SingleDeviceStrategy support in Nsys callback (#11574)

* fix for SingleDeviceStrategy

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* mini refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* remove dialogue scripts and docs (#11577)

* remove deprecated scripts

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove deprecated docs

Signed-off-by: dimapihtar <dpihtar@gmail.com>

---------

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* add JitTransform (#11131)

* add JitTransform

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fixes

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add JiT CB test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove stale imports

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* cleanup

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add jit callback test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* fix param passing

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use sgd in test_nemo_jit_cb

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add thunder call

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Use .compile method to avoid changing module structure

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* Use JitConfig

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* thunder setting

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* avoid reentry

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove optional

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* rewrite

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* refactor & module_selector

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Signed-off-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>

* NeMo 2.0 documentation upgrade (#11235)

* update attention

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update docs to NeMo 2.0

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update usage

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update parallelism

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update parallelism docs

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update parallelism docs

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* fix style

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update to NeMo 2.0

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* NeMo 2.0 update

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* NeMo 2.0 update

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove deprecated file

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update in respect to NeMo 2.0

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* fix hyperlinks

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove deprecated

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* remove deprecated

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* update documentation to NeMo 2.0

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* fix typo

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* fix punctuation

Signed-off-by: dimapihtar <dpihtar@gmail.com>

---------

Signed-off-by: dimapihtar <dpihtar@gmail.com>

* Remove auto-import of lhotse when importing nemo.collections.common.data (#11578)

* Remove auto-import of lhotse when importing nemo.collections.common.data

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix test import

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

---------

Signed-off-by: Piotr Żelasko <petezor@gmail.com>

* Fix example configs (#11571)

* Fix example configs

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* Fix line length

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

---------

Signed-off-by: Boxiang Wang <boxiangw@nvidia.com>

* fix (#11575)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* NIM supporting changes for nemo.export for NeMo 2.0 (#11488)

* Move torch_dtype_from_precision for independent export module

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Apply isort and black reformatting

Signed-off-by: janekl <janekl@users.noreply.github.com>
Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Remove unused imports

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Fix too long lines

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Apply isort and black reformatting

Signed-off-by: janekl <janekl@users.noreply.github.com>
Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Fix signature and default for megatron_amp_O2

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

---------

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>
Signed-off-by: janekl <janekl@users.noreply.github.com>
Co-authored-by: Bobby Chen <bobchen@nvidia.com>
Co-authored-by: janekl <janekl@users.noreply.github.com>
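
An illustrative (not verbatim) version of the precision-to-dtype helper that #11488 moves into an independent export module; the real implementation may accept additional precision aliases.

```python
import torch

def torch_dtype_from_precision(precision) -> torch.dtype:
    """Map a trainer precision setting to the torch dtype used for export (sketch)."""
    if precision in ("bf16", "bf16-mixed"):
        return torch.bfloat16
    if precision in (16, "16", "16-mixed"):
        return torch.float16
    if precision in (32, "32", "32-true"):
        return torch.float32
    raise ValueError(f"Unsupported precision: {precision}")
```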

* AED greedy confidence estimation (#11573)

* upload

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: GNroy <GNroy@users.noreply.github.com>

* set prompt confidence dtype at initialization

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>

---------

Signed-off-by: Aleksandr Laptev <alaptev@nvidia.com>
Signed-off-by: GNroy <GNroy@users.noreply.github.com>
Co-authored-by: GNroy <GNroy@users.noreply.github.com>
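
A generic sketch of greedy-decoding confidence for context (not the exact estimator added in #11573): score each emitted token by its softmax probability and aggregate, with the accumulation dtype fixed up front as the last commit above does.

```python
import torch

def greedy_confidence(logits: torch.Tensor, dtype: torch.dtype = torch.float32) -> torch.Tensor:
    # logits: [time, vocab] decoder outputs for one greedy hypothesis
    probs = logits.softmax(dim=-1).to(dtype)
    token_conf = probs.max(dim=-1).values  # probability of each emitted token
    return token_conf.mean()               # single confidence score per utterance
```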

* gemma fix (#11587)

* Update T5 DataModule regarding Pretrain/Finetune validate (#11584)

* update datamodule to have mbs/gbs

* update datamodule to have mbs/gbs

* Apply isort and black reformatting

Signed-off-by: huvunvidia <huvunvidia@users.noreply.github.com>

---------

Signed-off-by: huvunvidia <huvunvidia@users.noreply.github.com>
Co-authored-by: Huy Vu2 <huvu@login-eos02.eos.clusters.nvidia.com>
Co-authored-by: huvunvidia <huvunvidia@users.noreply.github.com>
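
For context on the mbs/gbs fields the T5 datamodule now carries (#11584), the usual relationship between micro and global batch size is sketched below; the variable names are generic, not the exact NeMo attributes.

```python
micro_batch_size = 4     # samples per GPU per forward/backward pass
data_parallel_size = 8   # number of data-parallel replicas
grad_accum_steps = 2     # micro-batches accumulated per optimizer step

global_batch_size = micro_batch_size * data_parallel_size * grad_accum_steps
assert global_batch_size == 64
```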

* fix llama3 (#11580)

* Add Hf nemorun tests (#11566)

* minor fixes for recipe

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* add peft nemorun script

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* add sft script and data module

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* Apply isort and black reformatting

Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

* clean up

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* add disable ckpt and data config for tests

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* Apply isort and black reformatting

Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

* add tests to cicd yaml

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

* cleanup

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>

---------

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
Signed-off-by: HuiyingLi <HuiyingLi@users.noreply.github.com>
Co-authored-by: HuiyingLi <HuiyingLi@users.noreply.github.com>

* [🤖]: Howdy folks, let's bump NeMo-Toolkit to `2.2.0rc0` ! (#11555)

Signed-off-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Pass the number of experts to modelopt layer spec (#11607)

* Pass number of experts to modelopt layer spec

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Fix too long lines

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

---------

Signed-off-by: Jan Lasek <janek.lasek@gmail.com>

* Adding changes to asr documentation (#11397)

Signed-off-by: Ssofja <sofiakostandian@gmail.com>

* Support Cosmos tokenizer TensorRT inference (#11472)

* Add cosmos TRT

* Add trt run script

* Apply isort and black reformatting

Signed-off-by: meatybobby <meatybobby@users.noreply.github.com>

* Clean code

* Fix CodeQL

---------

Signed-off-by: meatybobby <meatybobby@users.noreply.github.com>
Co-authored-by: meatybobby <meatybobby@users.noreply.github.com>

* Neva updates to latest mcore and some fixes (#11565)

* api updates and fixes

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* Apply isort and black reformatting

Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* fix

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

* fix arg

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>

---------

Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
Signed-off-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>
Co-authored-by: yaoyu-33 <yaoyu-33@users.noreply.github.com>

* add nemo2-sft-peft to readme (#11613)

Signed-off-by: Huiying Li <willwin.lee@gmail.com>

* Set Minitron width pruning batch size 1 (#11603)

Signed-off-by: Keval Morabia <28916987+kevalmorabia97@users.noreply.github.com>

* Disable CP for running Inference using megatron_gpt_eval (#11547)

* Disable CP for megatron_gpt_eval

* Apply isort and black reformatting

Signed-off-by: suiyoubi <suiyoubi@users.noreply.github.com>

* Update examples/nlp/language_modeling/megatron_gpt_eval.py

Co-authored-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: Ao Tang <mike.tang96@gmail.com>

---------

Signed-off-by: suiyoubi <suiyoubi@users.noreply.github.com>
Signed-off-by: Ao Tang <mike.tang96@gmail.com>
Co-authored-by: suiyoubi <suiyoubi@users.noreply.github.com>
Co-authored-by: Chen Cui <chcui@nvidia.com>
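
A hedged sketch of the idea behind #11547: context parallelism shards the sequence dimension for training, so the loaded config is forced back to a CP size of 1 before single-request inference. The config field name below is assumed for illustration.

```python
from omegaconf import open_dict

def disable_context_parallel(cfg):
    with open_dict(cfg):
        cfg.context_parallel_size = 1  # run inference without CP sharding
    return cfg
```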

* ci: Add `no-fail-fast` mode (#11608)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Chat dataset support (#11423)

* chat dataset support

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* add ci test

Signed-off-by: Chen Cui <chcui@nvidia.com>

* address comment

Signed-off-by: Chen Cui <chcui@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>

* address comment

Signed-off-by: Chen Cui <chcui@nvidia.com>

---------

Signed-off-by: Chen Cui <chcui@nvidia.com>
Signed-off-by: cuichenx <cuichenx@users.noreply.github.com>
Co-authored-by: cuichenx <cuichenx@users.noreply.github.com>
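
One plausible chat-style training record for the dataset support added in #11423, written as JSONL. The field names ("messages", "role", "content") are assumptions for illustration; the accepted schema is defined by the NeMo chat dataset code.

```python
import json

record = {
    "messages": [
        {"role": "user", "content": "Summarize the meeting notes."},
        {"role": "assistant", "content": "The team agreed to ship the fix on Friday."},
    ]
}
with open("chat_train.jsonl", "w") as f:
    f.write(json.dumps(record) + "\n")
```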

* Sortformer Diarizer 4spk v1 model PR Part 2: Unit-tests for Sortformer Diarizer. (#11336)

* Adding the first PR files: models and dataset

Signed-off-by: taejinp <tango4j@gmail.com>

* Tested all unit-test files

Signed-off-by: taejinp <tango4j@gmail.com>

* Name changes on yaml files and train example

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Reflecting comments and removing unnecessary parts for this PR

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Adding docstrings to reflect the PR comments

Signed-off-by: taejinp <tango4j@gmail.com>

* removed the unused find_first_nonzero

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Fixed all pylint issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolving pylint issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Removing unused variable in audio_to_diar_label.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixed docstrings in training script

Signed-off-by: taejinp <tango4j@gmail.com>

* Line-too-long issue from Pylint fixed

Signed-off-by: taejinp <tango4j@gmail.com>

* Adding get_subsegments_scriptable to prevent jit.script error

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Addressed Code-QL issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolved conflicts on bce_loss.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Adding all the diarization related unit-tests

Signed-off-by: taejinp <tango4j@gmail.com>

* Moving speaker task related unit test files to speaker_tasks folder

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixed uninitialized variable issue in bce_loss.py spotted by CodeQL

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Fixing code-QL issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Reflecting PR comments from weiqingw

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Line too long pylint issue resolved in e2e_diarize_speech.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolved unused variable issue in model test

Signed-off-by: taejinp <tango4j@gmail.com>

* Reflecting the comment on Nov 21st, 2024.

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Removed the unused `time` import

Signed-off-by: taejinp <tango4j@gmail.com>

* Adding docstrings to score_labels() function in der.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Reflecting comments on YAML files and model file variable changes.

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Added get_subsegments_scriptable for legacy get_subsegment functions

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolved line too long pylint issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Added training and inference CI-tests

Signed-off-by: taejinp <tango4j@gmail.com>

* Added the missing parse_func in preprocessing/collections.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Adding the missing parse_func in preprocessing/collections.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixed an indentation error

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolved multi_bin_acc and bce_loss issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolved line-too-long for msdd_models.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Code QL issues and fixed test errors

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* line too long in audio_to_diar_label.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* resolving CICD test issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixing codeQL issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixed pin memory False for inference

Signed-off-by: taejinp <tango4j@gmail.com>

---------

Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Co-authored-by: tango4j <tango4j@users.noreply.github.com>

* 2x more memory efficient Graph-based RNN-T (#11169)

* Optimized Graph-Transducer implementation

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>

---------

Signed-off-by: Vladimir Bataev <vbataev@nvidia.com>
Signed-off-by: artbataev <artbataev@users.noreply.github.com>
Co-authored-by: artbataev <artbataev@users.noreply.github.com>

* Use explicit subpaths in io for exporting a checkpoint (#11352)

* Fix llm.export_ckpt

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* fix

Signed-off-by: Hemil Desai <hemild@nvidia.com>

---------

Signed-off-by: Hemil Desai <hemild@nvidia.com>

* Remove triton requirement (#11627)

* Specify pytorch-triton instead of triton

Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Remove triton

Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>

---------

Signed-off-by: Dong Hyuk Chang <donghyukc@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>

* ci: Remove comment if no changes required anymore (#11624)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Jit with peft (#11586)

* move jitransform at the end

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add docstring & post-init

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add remove_extra_batch_keys and remove align_labels

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Run JitTransform on_train_epoch_start

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add --use-torch-jit option

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add docstrings

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* pep8

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
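
A rough sketch of the ordering that #11586 establishes, compiling only after PEFT adapters are attached by deferring JIT to on_train_epoch_start. This is not the actual JitTransform; it assumes the LightningModule keeps its wrapped network under `.module`.

```python
import torch
import pytorch_lightning as pl

class CompileAfterPeft(pl.Callback):
    def __init__(self):
        self._compiled = False

    def on_train_epoch_start(self, trainer, pl_module):
        # Compile once, after adapters have been injected; the guard avoids re-entry.
        if not self._compiled:
            pl_module.module = torch.compile(pl_module.module)
            self._compiled = True
```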

* NeMo-UX: add Hf's AutoModelForImageTextToText (#11321)

* init commit

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* wip

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* peft example

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* move peft example to multimodal_llm

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* surface HFAutoModelForImageTextToText

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add hf vlm dataset

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move processor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* train_log -> train_loss

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* vlm.HFDatasetDataModule pass collate_fn as argument

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Update peft example

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* typo

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove unused var

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Move example

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* remove unused

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Small change

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Fix loss calculation

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add extract_skipped_token_ids

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Use vlm.HFAutoModelForImageTextToText.extract_skipped_token_ids

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Update logits/labels handling

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add trust_remote_code to configure_processor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

* mini refactor

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add LLAMA_TOKENS

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* update hf_dataset

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add lora_dtype for models with non-FP weights

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Add load_in_4bit option

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add default_dtype

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* add load_in_4bit to llm collection

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* rm import

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix asset path

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move vlm test

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* move data offline

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use single gpu

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* pylint fix

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* pylint

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* pylint

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* drop align_labels

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* remove align_labels from llm too

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* use loss * mask instead of loss[mask == 1]

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* fix path

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>

* Apply isort and black reformatting

Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>

---------

Signed-off-by: Alexandros Koumparoulis <akoumparouli@nvidia.com>
Signed-off-by: akoumpa <akoumpa@users.noreply.github.com>
Signed-off-by: Alexandros Koumparoulis <153118171+akoumpa@users.noreply.github.com>
Co-authored-by: akoumpa <akoumpa@users.noreply.github.com>
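
On the "use loss * mask instead of loss[mask == 1]" change above: both forms average only over real tokens, but the multiplicative version keeps a static tensor shape and avoids data-dependent indexing, which is friendlier to compiled graphs. A small self-contained check:

```python
import torch

per_token_loss = torch.rand(2, 4)                          # [batch, seq]
mask = torch.tensor([[1., 1., 1., 0.], [1., 1., 0., 0.]])  # 1 = real token, 0 = padding

masked_mean = (per_token_loss * mask).sum() / mask.sum()
indexed_mean = per_token_loss[mask == 1].mean()
assert torch.allclose(masked_mean, indexed_mean)
```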

* ci: Bump release workflow (#11635)

Signed-off-by: Oliver Koenig <okoenig@nvidia.com>

* Add fix docstring for speech commands (#11638)

Signed-off-by: smajumdar <titu1994@gmail.com>

* Fixing Multi_Task_Adapters.ipynb by replacing canary2 with canary_custom (#11641)

Signed-off-by: Weiqing Wang <weiqingw@nvidia.com>

* fixed config name in online augmentation tutorial (#11628)

Signed-off-by: Rauf <rnasretdinov@nvidia.com>

* fix default nodes (#11632)

* add renormalize_blend_weights param (#11647)

Signed-off-by: dimapihtar <dpihtar@gmail.com>
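
The renormalize_blend_weights parameter from #11647 boils down to rescaling the remaining per-dataset blend weights so they sum to 1.0 again; a minimal sketch of that arithmetic (not the NeMo implementation):

```python
def renormalize(weights):
    total = sum(weights)
    return [w / total for w in weights]

assert renormalize([1.0, 3.0]) == [0.25, 0.75]
```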

* Sortformer Diarizer 4spk v1 model PR Part 3: Speaker Diarization Mixin (#11511)

* Adding diarization mixin for one click inference

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tan…
zhehuaichen pushed a commit that referenced this pull request Jun 16, 2025
…-NeMo#11766)

* First draft of distill script port to 2.0

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Pipeline-parallel changes

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Basic distillation running

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Add CLI args

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Most fixes

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Fix callbacks in PP loop

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* More fixes

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Rework checkpoint loading

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Resolve seemingly remaining bugs

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Refactor into multiple files

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Integration test

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Clean up strings

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Appease linter

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Remediate failing tests

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Update CICD model definition

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Divert TB logger to same log_dir

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Load CICD model specially

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Fix SP flag

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Move test into own script

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Update cicd dependency

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Update cicd thing #2

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

* Fix new linting errors

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>

---------

Signed-off-by: Asha Anoosheh <aanoosheh@nvidia.com>
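
As a hedged illustration of what a distillation script like the one ported above optimizes, here is a generic logit-distillation loss; the actual NeMo/Model Optimizer implementation may differ in temperature handling, loss mixing, and intermediate-layer terms.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: KL divergence between temperature-scaled distributions.
    kd = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    # Hard targets: standard cross-entropy against the ground-truth labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce
```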