
Beta candidate #3

Merged
okuchaiev merged 6 commits into master from beta-candidate on Sep 2, 2019

Conversation

@okuchaiev (Collaborator)

No description provided.

okuchaiev and others added 6 commits September 2, 2019 13:48
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
Signed-off-by: Oleksii Kuchaiev <okuchaiev@nvidia.com>
@okuchaiev okuchaiev merged commit 447de7d into master Sep 2, 2019
tkornuta-nvidia pushed a commit that referenced this pull request Nov 9, 2019
Signed-off-by: Hoo Chang Shin <hshin@nvidia.com>
Signed-off-by: Tomasz Kornuta <tkornuta@nvidia.com>
@okuchaiev okuchaiev deleted the beta-candidate branch July 27, 2020 20:22
redoctopus pushed a commit that referenced this pull request Oct 13, 2021
* hifigan finetuning setup

* added hifigan training filelists

* default config to 44100 hifigan

* added duration loss parameter fastpitch

* make hifigan scheduler optional

* remove filelists

* some cleanup
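The "make hifigan scheduler optional" item above is a common Lightning-style pattern: return just the optimizer when no scheduler config is supplied. A minimal sketch under assumed names (`sched_cfg` and its `build` key are hypothetical illustrations, not NeMo's actual config schema):

```python
def configure_optimizers(optimizer, sched_cfg=None):
    """Return only the optimizer when no scheduler config is given.

    `sched_cfg` is a hypothetical dict carrying a `build` callable that
    constructs a scheduler from the optimizer; NeMo's real config
    machinery works differently.
    """
    if sched_cfg is None:
        # Scheduler omitted: train at a fixed learning rate.
        return optimizer
    scheduler = sched_cfg["build"](optimizer)
    return {"optimizer": optimizer, "lr_scheduler": scheduler}
```

The caller then handles both return shapes, which is why making the scheduler optional touches only this one function.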
borisfom added a commit to borisfom/NeMo that referenced this pull request Mar 6, 2023
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
VahidooX added a commit that referenced this pull request Mar 14, 2023
* cache-aware streaming export

Test onnx streaming conformer ctc WER

Constant att cache width with len param

Remove some extra functions in cache_aware runner

transpose cache so that batch is first for trt

Signed-off-by: Greg Clark <grclark@nvidia.com>

* fix export for full-context conformer

* WIP trying to improve onnx perf

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Adding test scripts

Signed-off-by: Greg Clark <grclark@nvidia.com>

* More perf testing script

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Updates for jit torch_tensorrt tracing

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Fixed trace warnings

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Rearranging tests

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Fixing non-caching case

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* testing

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Fixed channel cache length issue

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* cache-aware streaming export

Test onnx streaming conformer ctc WER

Constant att cache width with len param

Remove some extra functions in cache_aware runner

transpose cache so that batch is first for trt

Signed-off-by: Greg Clark <grclark@nvidia.com>

* fix export for full-context conformer

* WIP trying to improve onnx perf

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Adding test scripts

Signed-off-by: Greg Clark <grclark@nvidia.com>

* More perf testing script

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Updates for jit torch_tensorrt tracing

Signed-off-by: Greg Clark <grclark@nvidia.com>

* stash

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Reverting non-essential changes

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Offset=None case

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Remove test scripts

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Clean up speech_to_text_cache_aware_streaming_infer

Signed-off-by: Greg Clark <grclark@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Revert pad -> constant_pad_nd

Signed-off-by: Greg Clark <grclark@nvidia.com>

* conformer-encoder set window_size from streaming_cfg

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Fixes for working export(), using more constants

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Optional rand init for cache

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Folding update_cache with constants

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* More folding

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Reducing diff #1

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Reducing diff #2

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Reducing diff #3

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Fixed unit tests, more reverts

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Export fixes

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Reverted slice changes that ruined ONNX perf

Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>

* Adding back keep_all_outputs and drop_extra_preencoded

Signed-off-by: Greg Clark <grclark@nvidia.com>

* Fix export

Signed-off-by: Greg Clark <grclark@nvidia.com>

---------

Signed-off-by: Greg Clark <grclark@nvidia.com>
Signed-off-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: Boris Fomitchev <bfomitchev@nvidia.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Vahid Noroozi <VahidooX@users.noreply.github.com>
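Several commits above circle one export constraint: the attention cache keeps a constant width, with a separate length parameter, so the exported ONNX/TensorRT graph only ever sees static shapes. A NumPy sketch of that update rule (`update_cache` here is a hypothetical helper, not NeMo's actual implementation):

```python
import numpy as np

def update_cache(cache, new_frames, cache_len):
    """Fixed-width streaming-cache update.

    cache:      (batch, width, dim) constant-shape buffer
    new_frames: (batch, t_new, dim) freshly encoded frames
    cache_len:  (batch,) valid lengths, tracked as data rather than
                shape so the tensor shapes stay constant for export.
    """
    width = cache.shape[1]
    # Concatenate, then keep only the trailing `width` frames:
    # the output shape equals the input cache shape every step.
    merged = np.concatenate([cache, new_frames], axis=1)
    new_cache = merged[:, -width:, :]
    new_len = np.minimum(cache_len + new_frames.shape[1], width)
    return new_cache, new_len
```

Because the slice bounds are constants, an ONNX exporter can fold them instead of emitting dynamic-shape ops, which is the point of the "folding update_cache with constants" commits.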
titu1994 pushed a commit to titu1994/NeMo that referenced this pull request Mar 24, 2023
hsiehjackson pushed a commit that referenced this pull request Jun 2, 2023
Signed-off-by: hsiehjackson <c2hsieh@ucsd.edu>
zhehuaichen added a commit that referenced this pull request Oct 9, 2023

* add initial impl of ModularizedSpeechGPTModel and integration test

* fix typo in the test name (#1)

approve the nit change

* clean an initial version of the example config; make sure it works by test (#2)

approve as no need to review

* add the test for training_step and fix the code correspondingly (test passed now) (#3)

* add test for validation_step (#4)

* mv audio and text emb concat to prepare_llm_input so as to write test to guard the llm input

* Merge heh and zhehuai's initial version of frozen am+llm (#5)

* Merge heh and zhehuai's initial version of frozen am+llm

The previous differences are summarized here:
https://docs.google.com/document/d/1zNI4hC6vJtUfcHbrUSPaMuYWRBQdN_36H0P2NiBiuPY/edit

This PR includes
1. Finish merging the model, dataset, and config code
2. Previous tests are still enabled and passed (prepare_llm_input, training_step,
    validation_step)
3. the example training script with LS960 has been run to make sure the training
pipeline works

The major remaining work is listed here
https://docs.google.com/document/d/1o0AM7v4gcTQkPZjE0Vl9TTX4vYnGTrbXEFGWh0UhGlk/edit#bookmark=id.pzvdadt5oxyw

---------

Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
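The "mv audio and text emb concat to prepare_llm_input" step above boils down to joining audio-encoder outputs and text token embeddings along the sequence axis before they reach the frozen LLM. A toy NumPy sketch of that idea (the real NeMo code also handles padding, lengths, and prompt formatting):

```python
import numpy as np

def prepare_llm_input(audio_emb, text_emb):
    """Concatenate (batch, T_audio, dim) audio embeddings with
    (batch, T_text, dim) text embeddings into one
    (batch, T_audio + T_text, dim) sequence for the LLM.
    Hypothetical simplification of the actual method.
    """
    assert audio_emb.shape[0] == text_emb.shape[0], "batch mismatch"
    assert audio_emb.shape[2] == text_emb.shape[2], "embedding-dim mismatch"
    return np.concatenate([audio_emb, text_emb], axis=1)
```

Isolating the concatenation in one function is what makes the "write test to guard the llm input" goal practical: the test can assert on this function's output shape directly.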

* fix a small init bug that broke a test (#6)

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Clean up implementation for SALM paper and sync to NEMO v1.20.0 (#18)

* wip

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix data

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix consumed_samples

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix the training restart problem by storing adapter+perception model and
init them from the ckpt

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* refix state dict

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support wer and inf

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* nan guard

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* reimpl inf and bug fix

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* multi loader

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* unfreeze lm

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* flag for load am

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* tokenizer

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* overwrite vocab size

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support bpe dropout

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
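BPE dropout (Provilkov et al.) regularizes tokenization by randomly skipping merges, so the same word yields different segmentations across epochs. NeMo delegates this to its tokenizer; the following is a self-contained toy illustration of the idea, not the production code path:

```python
import random

def bpe_dropout_encode(word, ranked_merges, p=0.1, rng=None):
    """Toy BPE with merge dropout: each time a merge could apply,
    it is skipped with probability p, producing stochastic
    segmentations. p=0 recovers deterministic BPE; p=1 falls
    back to single characters.
    """
    rng = rng or random.Random(0)
    pieces = list(word)
    changed = True
    while changed:
        changed = False
        for a, b in ranked_merges:  # merges listed highest priority first
            i = 0
            while i < len(pieces) - 1:
                if pieces[i] == a and pieces[i + 1] == b and rng.random() >= p:
                    pieces[i:i + 2] = [a + b]
                    changed = True
                else:
                    i += 1
    return pieces
```

The "mismatched context" bug mentioned two commits later is the classic pitfall with this technique: if prompt and transcript are tokenized separately with sampling on, their boundary tokens no longer line up between passes.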

* add tarred datasets

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix sample_alpha

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix bpe dropout bugs in the mismatched context in tokenization

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* add bleu metric

Signed-off-by: stevehuang52 <heh@nvidia.com>

* update metrics

Signed-off-by: stevehuang52 <heh@nvidia.com>

* support inference and fix a bug in wer calculation

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix bucketing dataset

Signed-off-by: stevehuang52 <heh@nvidia.com>

* fix bleu implementation

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support question set file per dataset/data loader in preparation for
multitask understanding; also fix bleu implementation

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support simple random context for word boosting

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* use sacrebleu.corpus_bleu to be consistent with the rest

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* make audio_file optional in the data loader

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* add a tool to materialize mt and text data

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* compatible with tar dataset

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* temp fix for metric and speed up materialization

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* make num of context configurable

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* val_check_interval fix; make manifest dumping consistent with speech models

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* random_context_positive_ratio configurable to control precision

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* bug fix: freeze_llm flag is not passed to the model cfg

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* overwrite tensor_model_parallel_size

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* support both stt and ssl models for loading audio encoder

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* fix the inference config so as to use sampling; allow inference config update in training

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* refactor and clean up code for preprocessing collections, the dataset interface, and model inference; rename some classes to be consistent with the SALM paper.
also make sure tests pass

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* Undo changes in megatron_gpt_peft_models.py and move them to speechllm_models.py; make sure the correctness by test_speechllm_models.py::TestModularizedAudioGPTModel::test_predict_step

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* update default inference config and test golden value accordingly

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* integration test and minor fix

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* nit bug fix on manifest_filepath introduced by code cleanup

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* update workspace/ files; consider moving to examples later

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* further remove unnecessary stuff in the inference implementation

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* revert the update in default end_string to be compatible with legacy models

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

---------

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>

* rename 'ModularizedAudioGPTModel' to 'ModularAudioGPTLoRAModel'; move speechllm stuff under nemo/collections/multimodal/speechllm

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

* update copyright; remove workspace/scripts and workspace/tools folders since the main branch has LLaMA support

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>

---------

Signed-off-by: zhehuaichen <dian.chenzhehuai@gmail.com>
Signed-off-by: stevehuang52 <heh@nvidia.com>
Co-authored-by: Zhehuai Chen <chenzhehuai.sjtu@aispeech.com>
Co-authored-by: He Huang (Steve) <105218074+stevehuang52@users.noreply.github.com>
Co-authored-by: stevehuang52 <heh@nvidia.com>
zhehuaichen added a commit that referenced this pull request Oct 13, 2023
pzelasko pushed a commit to pzelasko/NeMo that referenced this pull request Nov 29, 2023
tango4j referenced this pull request in chimechallenge/C8DASR-Baseline-NeMo Feb 9, 2024
Major fixes for C8DASR recipe: GSS and Channel clustering
pzelasko pushed a commit to pzelasko/NeMo that referenced this pull request May 8, 2024
monica-sekoyan pushed a commit to monica-sekoyan/NeMo that referenced this pull request Aug 7, 2024
dcurran90 pushed a commit to dcurran90/NeMo that referenced this pull request Oct 15, 2024
Some help to get the root cli to run
ankitapasad pushed a commit to ankitapasad/NeMo that referenced this pull request Nov 7, 2025
…-sysprompt

support training and inference for data with system prompt
ankitapasad pushed a commit to ankitapasad/NeMo that referenced this pull request Feb 12, 2026