Standalone diarization+ASR evaluation script #5439
Signed-off-by: Taejin Park <tango4j@gmail.com>
This pull request introduces 1 alert when merging aea2e6c into 2ecfb7a - view on LGTM.com new alerts:
filter_speech_first: True

speaker_embeddings:
  model_path: titanet_large # .nemo local model path or pretrained model name (titanet_large, ecapa_tdnn or speakerverification_speakernet)

Is it titanet_large or titanet_small?
If titanet_small is published, we should change this right away.
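
The config comment in this thread says `model_path` accepts either a local `.nemo` file or a pretrained model name. A minimal sketch of that distinction follows. The helper `classify_model_path` and the `KNOWN_PRETRAINED` set are hypothetical names used only for illustration; the set holds just the three names listed in the comment, and this is not how NeMo itself resolves the value.

```python
from typing import Tuple

# Assumption: only the names listed in the config comment are treated as known.
KNOWN_PRETRAINED = {"titanet_large", "ecapa_tdnn", "speakerverification_speakernet"}


def classify_model_path(model_path: str) -> Tuple[str, str]:
    """Classify a model_path value as a local .nemo file or a pretrained name."""
    if model_path.endswith(".nemo"):
        # Treated as a path on disk; existence is not checked in this sketch.
        return ("local", model_path)
    if model_path in KNOWN_PRETRAINED:
        return ("pretrained", model_path)
    raise ValueError(f"Unrecognized model_path: {model_path!r}")
```

Under these assumptions, a name that is not in the listed set (for example `titanet_small`) is rejected until the list is updated, which is the kind of change the review comment asks for.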
    return total_result_jsons


def get_audacity_label(word: str, stt_sec: float, end_sec: float, speaker: str) -> str:

Is this moved from speaker_utils? Why?
Synced offline (because of the @staticmethod functions, we cannot put it in the class).
Shall we keep this part of speaker_utils?
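
For context on the signature under discussion: an Audacity label track is a plain-text file with one label per line, holding the start time, end time, and label text separated by tabs. The sketch below builds such a line from the same four arguments. The function name `format_audacity_label`, the two-decimal rounding, and the `[speaker] word` label layout are assumptions for illustration, not the body of the function in this PR.

```python
def format_audacity_label(word: str, stt_sec: float, end_sec: float, speaker: str) -> str:
    """Build one Audacity label line: start<TAB>end<TAB>label text."""
    # Assumption: times are rounded to two decimals and the speaker is
    # shown in square brackets ahead of the word.
    return f"{stt_sec:.2f}\t{end_sec:.2f}\t[{speaker}] {word}"


if __name__ == "__main__":
    print(format_audacity_label("hello", 0.5, 0.93, "speaker_0"))
```

Because the helper depends only on its arguments, it can live at module level rather than inside a class, which matches the reason given in the thread.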
collar: 0.25 # Collar value for scoring
ignore_overlap: True # Consider or ignore overlap segments while scoring

vad:

This part looks good to me.
Thanks for checking.
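
To illustrate what a scoring collar does in general: regions around each reference segment boundary are excluded from scoring, so small timing disagreements at boundaries are not counted as errors. The sketch below computes those excluded regions. It assumes the collar is applied on each side of a boundary; tools differ on whether the value means per-side or total width, so this is not a statement about how this PR's scorer interprets `collar: 0.25`. The name `no_score_zones` is hypothetical.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_sec, end_sec)


def no_score_zones(reference: List[Segment], collar: float) -> List[Segment]:
    """Return merged intervals excluded from scoring around reference boundaries."""
    # One zone of +/- collar seconds around every segment start and end.
    zones = sorted(
        (max(0.0, boundary - collar), boundary + collar)
        for segment in reference
        for boundary in segment
    )
    merged: List[Segment] = []
    for start, end in zones:
        if merged and start <= merged[-1][1]:
            # Overlapping or touching zones are merged into one interval.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    ref = [(1.0, 2.0), (2.25, 4.0)]
    print(no_score_zones(ref, collar=0.25))
```

With `ignore_overlap: True`, regions where more than one reference speaker is active would also be left out of scoring; that part is not shown here.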
This pull request introduces 1 alert when merging e2d519d into bccf6d5 - view on LGTM.com new alerts:
* first commit on eval_diar_with_asr.py Signed-off-by: Taejin Park <tango4j@gmail.com> * Add a standalone diarization-ASR evaluation transcript Signed-off-by: Taejin Park <tango4j@gmail.com> * Fixed examples in docstrings Signed-off-by: Taejin Park <tango4j@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixed staticmethod error Signed-off-by: Taejin Park <tango4j@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Added description on eval modes Signed-off-by: Taejin Park <tango4j@gmail.com> * adding diar_infer_general.yaml Signed-off-by: Taejin Park <tango4j@gmail.com> * fix msdd_model in general yaml file Signed-off-by: Taejin Park <tango4j@gmail.com> * fixed errors in yaml file Signed-off-by: Taejin Park <tango4j@gmail.com> * combine into 1 commit Signed-off-by: Taejin Park <tango4j@gmail.com> * Added description on eval modes Signed-off-by: Taejin Park <tango4j@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Add MoE support for T5 model (w/o expert parallel) (NVIDIA-NeMo#5409) * clean Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * kwarg ref Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * fix Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * fix Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * extra args Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * test Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * rm prints Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * style 
Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * review comments Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * review comments Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * review comments Signed-off-by: Abhinav Khattar <aklife97@gmail.com> * fix Signed-off-by: Abhinav Khattar <aklife97@gmail.com> Signed-off-by: Abhinav Khattar <aklife97@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Fix args (NVIDIA-NeMo#5410) (NVIDIA-NeMo#5416) Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> * Fix for concat map dataset (NVIDIA-NeMo#5133) * change for concat map dataset * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Exhaust longest dataset * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Co-authored-by: 1-800-BAD-CODE <> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> * Add temporary fix for CUDA issue in Dockerfile (NVIDIA-NeMo#5421) (NVIDIA-NeMo#5422) Signed-off-by: Yu Yao <yuya@nvidia.com> Signed-off-by: Yu Yao <yuya@nvidia.com> Signed-off-by: Yu Yao <yuya@nvidia.com> Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com> * Fix GPT generation when using sentencepiece tokenizer (NVIDIA-NeMo#5413) (NVIDIA-NeMo#5428) * Fix Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fix Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Signed-off-by: 
MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Yi Dong <yidong@nvidia.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Yi Dong <yidong@nvidia.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> * Support for finetuning and finetuning inference with .ckpt files & batch size refactoring (NVIDIA-NeMo#5339) * Initial refactor Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Resolve config before passing to load_from_checkpoint Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fixes for model parallel and nemo restore Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fixes for eval Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Revert config changes Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Refactor Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix typo Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Remove comments Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Minor Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fix validation reconfiguration Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Remove old comment Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Fixes for test_ds Signed-off-by: MaximumEntropy 
<sandeep.subramanian.1@umontreal.ca> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> * Revert "Add temporary fix for CUDA issue in Dockerfile (NVIDIA-NeMo#5421)" (NVIDIA-NeMo#5431) (NVIDIA-NeMo#5432) This reverts commit 0718b17. Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com> * [ITN] fix year date graph, cardinals extension for hundreds (NVIDIA-NeMo#5435) * wip Signed-off-by: ekmb <ebakhturina@nvidia.com> * add lociko's hundreds extension for cardinals Signed-off-by: ekmb <ebakhturina@nvidia.com> * add optional end Signed-off-by: ekmb <ebakhturina@nvidia.com> * restart ci Signed-off-by: ekmb <ebakhturina@nvidia.com> Signed-off-by: ekmb <ebakhturina@nvidia.com> * update doc in terms of get_label for lang id model (NVIDIA-NeMo#5366) * reflect PR 5278 ion doc Signed-off-by: fayejf <fayejf07@gmail.com> * reflect comment Signed-off-by: fayejf <fayejf07@gmail.com> Signed-off-by: fayejf <fayejf07@gmail.com> * Revert workaround for T5 that sets number of workers to 0 & sync_batch_comm=False (NVIDIA-NeMo#5420) (NVIDIA-NeMo#5433) * Revert workers workaround Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fix in config Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> * Fix Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> * Fixed bug in notebook (NVIDIA-NeMo#5382) (NVIDIA-NeMo#5394) Signed-off-by: Virginia Adams <vadams@nvidia.com> Signed-off-by: 
Virginia Adams <vadams@nvidia.com> Signed-off-by: Virginia Adams <vadams@nvidia.com> Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com> * Fixing bug in Megatron BERT when loss mask is all zeros (NVIDIA-NeMo#5424) * Fixing bug when loss mask is fully zero Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update megatron_bert_model.py Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> * Update dataset_utils.py Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update dataset_utils.py Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> * Update dataset_utils.py Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> * Use updated API for overlapping grad sync with pipeline parallelism (NVIDIA-NeMo#5236) Signed-off-by: Tim Moon <tmoon@nvidia.com> Signed-off-by: Tim Moon <tmoon@nvidia.com> * support to disable sequence length + 1 input tokens for each sample in MegatronGPT (NVIDIA-NeMo#5363) * support to disable sequence length + 1 input tokens for MegatronGPT * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci Co-authored-by: Anmol Gupta <anmolg@nvidia.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> * [TTS] Create script for processing TTS training audio 
(NVIDIA-NeMo#5262) * Create script for processing TTS training audio * Update VAD trimming logic * Remove unused import Signed-off-by: Ryan <rlangman@nvidia.com> * [TTS] remove useless logic for set_tokenizer. (NVIDIA-NeMo#5430) Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> * Fix setting up of `ReduceLROnPlateau` learning rate scheduler (NVIDIA-NeMo#5444) * Fix tests Signed-off-by: PeganovAnton <peganoff2@mail.ru> * Add accidentally lost changes Signed-off-by: PeganovAnton <peganoff2@mail.ru> Signed-off-by: PeganovAnton <peganoff2@mail.ru> * Create codeql.yml (NVIDIA-NeMo#5445) Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> * Fix for getting tokenizer in character-based ASR models when using tarred dataset (NVIDIA-NeMo#5442) Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com> Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com> * Combine 5 commits adding diar_infer_general.yaml Signed-off-by: Taejin Park <tango4j@gmail.com> Update codeql.yml Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> Update codeql.yml Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> fix msdd_model in general yaml file Signed-off-by: Taejin Park <tango4j@gmail.com> fixed errors in yaml file Signed-off-by: Taejin Park <tango4j@gmail.com> * moved eval_der function and fixed tqdm options Signed-off-by: Taejin Park <tango4j@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Changed minor error in docstrings Signed-off-by: Taejin Park <tango4j@gmail.com> * removed score_labels and changed leave=True Signed-off-by: Taejin Park <tango4j@gmail.com> Signed-off-by: Taejin Park <tango4j@gmail.com> Signed-off-by: Abhinav Khattar <aklife97@gmail.com> Signed-off-by: MaximumEntropy <sandeep.subramanian.1@umontreal.ca> Signed-off-by: Yu Yao <yuya@nvidia.com> Signed-off-by: ekmb <ebakhturina@nvidia.com> Signed-off-by: fayejf <fayejf07@gmail.com> 
Signed-off-by: Virginia Adams <vadams@nvidia.com> Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> Signed-off-by: Tim Moon <tmoon@nvidia.com> Signed-off-by: Ryan <rlangman@nvidia.com> Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Signed-off-by: PeganovAnton <peganoff2@mail.ru> Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Abhinav Khattar <aklife97@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Shane Carroll <50530592+1-800-BAD-CODE@users.noreply.github.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com> Co-authored-by: Yi Dong <yidong@nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com> Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com> Co-authored-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com> Co-authored-by: anmolgupt <14880251+anmolgupt@users.noreply.github.com> Co-authored-by: Anmol Gupta <anmolg@nvidia.com> Co-authored-by: Ryan Langman <rlangman@nvidia.com> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: PeganovAnton <peganoff2@mail.ru> Co-authored-by: Somshubra Majumdar <titu1994@gmail.com> Co-authored-by: Jonghwan Hyeon <jonghwanhyeon93@gmail.com> Signed-off-by: shane carroll <shane.carroll@utsa.edu>
Signed-off-by: Virginia Adams <vadams@nvidia.com> Signed-off-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> Signed-off-by: Tim Moon <tmoon@nvidia.com> Signed-off-by: Ryan <rlangman@nvidia.com> Signed-off-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Signed-off-by: PeganovAnton <peganoff2@mail.ru> Signed-off-by: Somshubra Majumdar <titu1994@gmail.com> Signed-off-by: Jonghwan Hyeon <hyeon0145@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Abhinav Khattar <aklife97@gmail.com> Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com> Co-authored-by: Sandeep Subramanian <sandeep.subramanian.1@umontreal.ca> Co-authored-by: Shane Carroll <50530592+1-800-BAD-CODE@users.noreply.github.com> Co-authored-by: Oleksii Kuchaiev <okuchaiev@users.noreply.github.com> Co-authored-by: yaoyu-33 <54727607+yaoyu-33@users.noreply.github.com> Co-authored-by: Yi Dong <yidong@nvidia.com> Co-authored-by: Evelina <10428420+ekmb@users.noreply.github.com> Co-authored-by: fayejf <36722593+fayejf@users.noreply.github.com> Co-authored-by: Virginia Adams <78445382+vadam5@users.noreply.github.com> Co-authored-by: Shanmugam Ramasamy <111910568+shanmugamr1992@users.noreply.github.com> Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com> Co-authored-by: anmolgupt <14880251+anmolgupt@users.noreply.github.com> Co-authored-by: Anmol Gupta <anmolg@nvidia.com> Co-authored-by: Ryan Langman <rlangman@nvidia.com> Co-authored-by: Xuesong Yang <1646669+XuesongYang@users.noreply.github.com> Co-authored-by: PeganovAnton <peganoff2@mail.ru> Co-authored-by: Somshubra Majumdar <titu1994@gmail.com> Co-authored-by: Jonghwan Hyeon <jonghwanhyeon93@gmail.com> Signed-off-by: Hainan Xu <hainanx@nvidia.com>
Signed-off-by: andrusenkoau <andrusenkoau@gmail.com>
What does this PR do?
Adds a standalone diarization+ASR evaluation script.
The script was built on request, for evaluating ASR+diarization results from already-extracted files.
Adds diar_infer_general.yaml (VAD optimized for a multilingual ASR dataset; diarization optimized on the DIHARD dev set).
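As a rough illustration of the kind of scoring an evaluation over already-extracted transcripts involves, the sketch below computes a plain word error rate from a reference and a hypothesis word list. It is not the code in eval_diar_with_asr.py; the function name `word_error_rate` and the sample sentences are hypothetical and chosen only for this example.

```python
from typing import List


def word_error_rate(reference: List[str], hypothesis: List[str]) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # prev[j]: edit distance between the reference prefix processed so far
    # and hypothesis[:j]; it starts as the cost of inserting j words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]  # cost of deleting the first i reference words
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(reference)


# Hypothetical, already-extracted transcripts (not taken from the PR).
ref_words = "hello how are you today".split()
hyp_words = "hello how you doing today".split()
wer = word_error_rate(ref_words, hyp_words)
print(f"WER: {wer:.2f}")
```

A speaker-aware evaluation would apply the same distance per speaker after mapping hypothesis speakers to reference speakers; that mapping step is left out here to keep the sketch short.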
Collection: ASR
Changelog
Usage
Before your PR is "Ready for review"
Pre checks:
PR Type:
If you haven't finished some of the above items you can still open "Draft" PR.
Who can review?
Anyone who belongs to ASR
Additional Information