Sortformer Diarizer 4spk v1 model PR Part 1: models, modules and dataloaders #11282

Merged on Nov 27, 2024: tango4j merged 87 commits into NVIDIA-NeMo:main from tango4j:sortformer/pr_01

@tango4j (Collaborator) commented on Nov 14, 2024

What does this PR do?

Adds the Sortformer diarizer model (v1), which supports up to four speakers.

Sortformer Paper Link

This PR adds the model files, the module files, and the corresponding dataloaders and evaluation metrics.

Collection: ASR/speaker_tasks

Changelog

  • model files
    nemo/collections/asr/models/sortformer_diar_models.py

  • module files
    nemo/collections/asr/modules/sortformer_modules.py

  • evaluation files
    nemo/collections/asr/metrics/der.py
    nemo/collections/asr/metrics/multi_binary_acc.py

  • dataloader files
    nemo/collections/asr/data/audio_to_diar_label.py
    nemo/collections/asr/data/audio_to_diar_label_lhotse.py

  • training yaml
    examples/speaker_tasks/diarization/conf/neural_diarizer/sortformer_diarizer_hybrid_loss_4spk-v1.yaml

  • post-processing yaml files
    examples/speaker_tasks/diarization/conf/post_processing/sortformer_diar_4spk-v1_callhome-part1.yaml
    examples/speaker_tasks/diarization/conf/post_processing/sortformer_diar_4spk-v1_dihard-dev.yaml

  • util files
    nemo/collections/asr/parts/utils/speaker_utils.py
    nemo/collections/asr/parts/utils/vad_utils.py

  • Other changed files
    examples/speaker_tasks/diarization/neural_diarizer/sortformer_diar_train.py
    nemo/collections/asr/data/audio_to_diar_label.py
    nemo/collections/asr/models/__init__.py
    nemo/collections/asr/modules/sortformer_modules.py
    nemo/collections/asr/parts/utils/asr_multispeaker_utils.py
    nemo/collections/asr/parts/utils/speaker_utils.py
    nemo/collections/asr/parts/utils/vad_utils.py
    nemo/collections/common/parts/preprocessing/collections.py
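
The evaluation files listed above include nemo/collections/asr/metrics/der.py. For readers unfamiliar with the metric, diarization error rate (DER) is the sum of missed speech, false-alarm speech and speaker confusion, divided by the total reference speech. The following is a minimal frame-level sketch of that definition, not the implementation in der.py: it assumes binary per-frame speaker activity, applies no collar, and finds the best speaker mapping by brute force over permutations, which is feasible for four speakers. The function name frame_level_der is hypothetical.

```python
from itertools import permutations


def frame_level_der(ref, hyp):
    """Frame-level diarization error rate with optimal speaker mapping.

    ref, hyp: lists of frames; each frame is a list of 0/1 speaker
    activities, with the same number of speakers in both.
    """
    num_speakers = len(ref[0])
    total_ref = sum(sum(frame) for frame in ref)
    if total_ref == 0:
        raise ValueError("reference contains no speech")

    best_errors = None
    # With at most 4 speakers there are at most 24 permutations to try.
    for perm in permutations(range(num_speakers)):
        errors = 0
        for ref_frame, hyp_frame in zip(ref, hyp):
            n_ref = sum(ref_frame)
            n_hyp = sum(hyp_frame)
            # Speakers active in the reference and in the mapped hypothesis.
            n_correct = sum(
                1
                for s in range(num_speakers)
                if ref_frame[s] and hyp_frame[perm[s]]
            )
            missed = max(0, n_ref - n_hyp)
            false_alarm = max(0, n_hyp - n_ref)
            confusion = min(n_ref, n_hyp) - n_correct
            errors += missed + false_alarm + confusion
        if best_errors is None or errors < best_errors:
            best_errors = errors
    return best_errors / total_ref


# Two speakers, four frames. The hypothesis swaps the speaker order and
# misses the last frame, so only the miss counts after optimal mapping.
ref = [[1, 0], [1, 0], [0, 1], [0, 1]]
hyp = [[0, 1], [0, 1], [1, 0], [0, 0]]
print(frame_level_der(ref, hyp))  # 0.25
```

The permutation search matters because a diarizer's output speaker indices are arbitrary; scoring without it would count the swapped speakers above as confusion.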

Usage

  • Run diarization inference on an evaluation manifest with a trained model:
python ${NEMO_ROOT}/examples/speaker_tasks/diarization/neural_diarizer/e2e_diarize_speech.py \
     model_path=/path/to/diar_sortformer_4spk-v1.nemo \
     dataset_manifest=/path/to/eval_dataset.json
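
The dataset_manifest argument above points to a JSON-lines manifest with one entry per audio file. Below is a minimal sketch of building such a file. The field names (audio_filepath, offset, duration, label, text, num_speakers, rttm_filepath, uem_filepath) are an assumption based on the manifest convention used by NeMo's diarization examples and should be checked against the NeMo documentation; the helper names build_manifest_entries, to_jsonl and write_manifest are hypothetical and not part of NeMo.

```python
import json
from pathlib import Path


def build_manifest_entries(wav_paths, rttm_dir=None):
    """Build one manifest entry (a dict) per audio file.

    Field names are an assumption based on NeMo's diarization manifest
    convention; verify them against the NeMo documentation.
    """
    entries = []
    for wav_path in wav_paths:
        stem = Path(wav_path).stem
        # A reference RTTM is optional; it is only needed for scoring.
        rttm_path = str(Path(rttm_dir) / f"{stem}.rttm") if rttm_dir else None
        entries.append(
            {
                "audio_filepath": str(wav_path),
                "offset": 0,
                "duration": None,  # serialized as JSON null
                "label": "infer",
                "text": "-",
                "num_speakers": None,
                "rttm_filepath": rttm_path,
                "uem_filepath": None,
            }
        )
    return entries


def to_jsonl(entries):
    """Serialize entries as JSON lines: one JSON object per line."""
    return "".join(json.dumps(entry) + "\n" for entry in entries)


def write_manifest(entries, manifest_path):
    """Write the JSON-lines manifest to disk."""
    Path(manifest_path).write_text(to_jsonl(entries), encoding="utf-8")
```

For example, write_manifest(build_manifest_entries(["/path/to/session1.wav"], rttm_dir="/path/to/rttm"), "/path/to/eval_dataset.json") would produce a one-line manifest that could be passed as dataset_manifest.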

GitHub Actions CI

CI tests will be added in the second PR.
The third PR will include documentation and tutorials.

Who can review?

Anyone working on ASR and speaker_tasks.

Signed-off-by: taejinp <tango4j@gmail.com>

@github-advanced-security github-advanced-security bot left a comment

CodeQL found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@tango4j tango4j marked this pull request as ready for review November 14, 2024 09:07
tango4j and others added 3 commits November 14, 2024 16:56
tango4j and others added 7 commits November 14, 2024 17:53
@tango4j tango4j added Run CICD and removed Run CICD labels Nov 26, 2024
@tango4j tango4j added Run CICD and removed Run CICD labels Nov 27, 2024
@tango4j tango4j removed the Run CICD label Nov 27, 2024
tango4j and others added 4 commits November 26, 2024 17:33
@github-actions (Contributor) commented:

beep boop 🤖: 🙏 The following files have warnings. In case you are familiar with these, please try helping us to improve the code base.


Your code was analyzed with PyLint. The following annotations have been identified:


------------------------------------
Your code has been rated at 10.00/10

Thank you for improving NeMo's documentation!

@github-actions (Contributor) commented:

beep boop 🤖: 🚨 The following files must be fixed before merge!


Your code was analyzed with PyLint. The following annotations have been identified:


------------------------------------
Your code has been rated at 10.00/10

Thank you for improving NeMo's documentation!

@nithinraok (Member) left a comment

Merge it, once it passes CI

@tango4j tango4j merged commit 505acac into NVIDIA-NeMo:main Nov 27, 2024
XuesongYang pushed a commit to paarthneekhara/NeMo that referenced this pull request Jan 18, 2025
…loaders (NVIDIA-NeMo#11282)

* Adding the first pr files models and dataset

Signed-off-by: taejinp <tango4j@gmail.com>

* Tested all unit-test files

Signed-off-by: taejinp <tango4j@gmail.com>

* Name changes on yaml files and train example

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Reflecting comments and removing unnecessary parts for this PR

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Adding docstrings to reflect the PR comments

Signed-off-by: taejinp <tango4j@gmail.com>

* removed the unused find_first_nonzero

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Fixed all pylint issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolving pylint issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Removing unused variable in audio_to_diar_label.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Fixed docstrings in training script

Signed-off-by: taejinp <tango4j@gmail.com>

* Line-too-long issue from Pylint fixed

Signed-off-by: taejinp <tango4j@gmail.com>

* Adding get_subsegments_scriptable to prevent jit.script error

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Addressed Code-QL issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Resolved conflicts on bce_loss.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Fixed uninit variable issue in bce_loss.py spotted by codeQL

Signed-off-by: taejinp <tango4j@gmail.com>

* Reflecting PR comments from weiqingw

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Line too long pylint issue resolved in e2e_diarize_speech.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Reflecting the comment on Nov 21st, 2024.

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Unused variable import time

Signed-off-by: taejinp <tango4j@gmail.com>

* Adding docstrings to score_labels() function in der.py

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Reflecting comments on YAML files and model file variable changes.

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Added get_subsegments_scriptable for legacy get_subsegment functions

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

* Resolved line too long pylint issues

Signed-off-by: taejinp <tango4j@gmail.com>

* Apply isort and black reformatting

Signed-off-by: tango4j <tango4j@users.noreply.github.com>

---------

Signed-off-by: taejinp <tango4j@gmail.com>
Signed-off-by: tango4j <tango4j@users.noreply.github.com>
Co-authored-by: tango4j <tango4j@users.noreply.github.com>
youngeunkwon0405 pushed a commit to youngeunkwon0405/NeMo that referenced this pull request Feb 10, 2025
…loaders (NVIDIA-NeMo#11282)

Signed-off-by: Youngeun Kwon <youngeunk@nvidia.com>