[benchmarking] Adds audio curation benchmark to nightly #1360

Merged
praateekmahajan merged 45 commits into NVIDIA-NeMo:main from rlratzel:26.02-add_audio_bench
Jan 14, 2026

Conversation

@rlratzel
Contributor

Adds an audio benchmark to the nightly benchmark suite.

This benchmark is based on the current audio example and adds code to save metadata and results used by the benchmarking framework.

Note to reviewers: this PR depends on features in #1341, so it has been merged with this branch. When #1341 is merged, the diff should only include changes needed for adding the audio benchmark.

rlratzel and others added 26 commits December 12, 2025 21:46
…images with :latest by default, adds session name to slack report.

Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
…a_updates

Signed-off-by: rlratzel <rratzel@nvidia.com>
…atzel/curator into 2602_benchmark_infra_updates

Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
…g script to allow for more flexibility.

Signed-off-by: rlratzel <rratzel@nvidia.com>
…n-readable output is needed, updates paths to benchmark output dir.

Signed-off-by: rlratzel <rratzel@nvidia.com>
…sults

Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
…laceholders were silently ignored, comment cleanup.

Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
…k to nightly YAML

Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
@copy-pr-bot

copy-pr-bot Bot commented Jan 10, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@rlratzel rlratzel marked this pull request as ready for review January 10, 2026 02:12
@greptile-apps
Contributor

greptile-apps Bot commented Jan 10, 2026

Greptile Summary

This PR adds an audio curation benchmark based on the FLEURS dataset to the nightly benchmark suite. The implementation follows established patterns from existing benchmarks and adds a reusable write_benchmark_results utility function.

Key Changes:

  • Added audio_fleurs_benchmark.py script that runs ASR inference on FLEURS dataset and filters by WER threshold
  • Added write_benchmark_results utility function in runner/utils.py to standardize result file writing across benchmarks
  • Added audio_fleurs configuration entry to nightly-benchmark.yaml with appropriate parameters
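The WER-threshold filtering mentioned in the first bullet can be illustrated with a minimal, self-contained sketch. This is not the code from GetPairwiseWerStage or PreserveByValueStage; it assumes WER is expressed as a percentage and that entries at or below the threshold are kept, and the field names `text` and `pred_text` are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage, using word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def filter_by_wer(entries, threshold):
    """Keep entries whose WER is at or below `threshold` (assumed direction)."""
    # "text" and "pred_text" are hypothetical field names for this sketch.
    return [e for e in entries
            if word_error_rate(e["text"], e["pred_text"]) <= threshold]


entries = [
    {"text": "a b c d", "pred_text": "a b c d"},  # no errors
    {"text": "a b c d", "pred_text": "a x c d"},  # one substitution in four words
]
kept = filter_by_wer(entries, threshold=5.5)
```

With the threshold of 5.5 used as the script default, only the first entry would survive under these assumptions.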

Issues Found:

  • Copyright year in new script is 2026 instead of 2025 (already noted in previous reviews)

Confidence Score: 4/5

  • This PR is safe to merge after fixing the copyright year
  • The implementation follows established benchmark patterns and the core logic is sound. The only blocking issue is the incorrect copyright year (2026 vs 2025), which is a trivial fix. The code reuses well-tested components from the tutorial and adds appropriate error handling and result tracking.
  • Pay attention to benchmarking/scripts/audio_fleurs_benchmark.py to fix the copyright year

Important Files Changed

| Filename | Overview |
| --- | --- |
| benchmarking/scripts/audio_fleurs_benchmark.py | New audio benchmark script with copyright year issue (2026 vs 2025) and minor inconsistencies with tutorial |
| benchmarking/runner/utils.py | Adds write_benchmark_results utility function, clean implementation |
| benchmarking/nightly-benchmark.yaml | Configuration entry for audio benchmark with appropriate parameters |

Sequence Diagram

sequenceDiagram
    participant Main as main()
    participant Run as run_audio_fleurs_benchmark()
    participant Pipeline as Audio Pipeline
    participant Executor as XennaExecutor
    participant Utils as write_benchmark_results()
    participant FS as File System

    Main->>Main: Parse arguments
    Main->>Main: Initialize result_dict
    Main->>Main: Convert paths to Path objects
    Main->>Run: run_audio_fleurs_benchmark(args)
    Run->>FS: Check results_dir exists
    Run->>Executor: Create XennaExecutor
    Run->>Pipeline: Create Pipeline
    Run->>Pipeline: Add CreateInitialManifestFleursStage
    Run->>Pipeline: Add InferenceAsrNemoStage
    Run->>Pipeline: Add GetPairwiseWerStage
    Run->>Pipeline: Add GetAudioDurationStage
    Run->>Pipeline: Add PreserveByValueStage
    Run->>Pipeline: Add AudioToDocumentStage
    Run->>Pipeline: Add JsonlWriter
    Pipeline->>Executor: pipeline.run(executor)
    Executor-->>Pipeline: results (tasks)
    Pipeline-->>Run: tasks
    Run-->>Main: result dict with metrics and tasks
    Main->>Main: Update result_dict
    Main->>Main: Set success_code based on is_success
    Main->>Utils: write_benchmark_results(result_dict, path)
    Utils->>FS: Create output directory
    Utils->>FS: Write params.json
    Utils->>FS: Write metrics.json
    Utils->>FS: Write tasks.pkl
    Utils-->>Main: Done
    Main-->>Main: Return success_code
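
The last steps of the diagram (create the output directory, then write params.json, metrics.json, and tasks.pkl) could look roughly like the sketch below. It is not the merged implementation in benchmarking/runner/utils.py: the function name `write_benchmark_results_sketch`, the `params` / `metrics` / `tasks` keys, and the fallback defaults are all assumptions made for illustration.

```python
import json
import pickle
from pathlib import Path
from typing import Any, Dict, Union


def write_benchmark_results_sketch(result_dict: Dict[str, Any],
                                   output_dir: Union[str, Path]) -> None:
    """Write params, metrics, and tasks to the three files named in the diagram."""
    out = Path(output_dir)
    # Create the output directory, including any missing parents.
    out.mkdir(parents=True, exist_ok=True)

    # The "params" and "metrics" keys are assumed; both must be JSON-serializable.
    with (out / "params.json").open("w") as f:
        json.dump(result_dict.get("params", {}), f, indent=2)
    with (out / "metrics.json").open("w") as f:
        json.dump(result_dict.get("metrics", {}), f, indent=2)

    # Tasks are arbitrary Python objects, so they are pickled rather than
    # written as JSON.
    with (out / "tasks.pkl").open("wb") as f:
        pickle.dump(result_dict.get("tasks", []), f)
```

Keeping the file layout in one helper means every benchmark script produces the same three files, which is what lets a framework read results without per-benchmark logic.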

Contributor

@greptile-apps greptile-apps Bot left a comment

4 files reviewed, 4 comments

Comment thread benchmarking/runner/utils.py Outdated
@@ -0,0 +1,190 @@
# Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
Contributor

The copyright year is set to 2026, which is in the future. This should be 2025 to match the other files in this PR (entry.py, session.py, utils.py all use 2025).

Suggested change
- # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
+ # Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.

parser.add_argument("--lang", default="hy_am", help="Language code")
parser.add_argument("--split", default="dev", help="Dataset split to use")
parser.add_argument("--wer-threshold", type=float, default=5.5, help="WER threshold for filtering")
parser.add_argument("--gpus", type=int, default=1, help="Number of GPUs to use")
Contributor

The --gpus argument is defined as type=int, but the original tutorial code in tutorials/audio/fleurs/pipeline.py line 110 uses type=float. This creates an inconsistency.

The Resources class used in line 104 likely expects a float (GPUs can be fractional for resource allocation). Consider using type=float to match the original tutorial:

Suggested change
- parser.add_argument("--gpus", type=int, default=1, help="Number of GPUs to use")
+ parser.add_argument("--gpus", type=float, default=1, help="Number of GPUs to use")
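
The practical difference between the two declarations shows up only when a fractional value is passed on the command line. The standalone sketch below uses plain argparse and does not involve the benchmark script or the Resources class; `build_parser` is a hypothetical helper.

```python
import argparse


def build_parser(gpus_type):
    """Build a parser whose --gpus option is converted with `gpus_type`."""
    # exit_on_error=False (Python 3.9+) makes argparse raise ArgumentError
    # instead of printing usage and exiting on a bad value.
    parser = argparse.ArgumentParser(exit_on_error=False)
    parser.add_argument("--gpus", type=gpus_type, default=1, help="Number of GPUs to use")
    return parser


float_parser = build_parser(float)
int_parser = build_parser(int)

# type=float accepts a fractional GPU request.
fractional = float_parser.parse_args(["--gpus", "0.5"]).gpus

# type=int rejects the same input, because int("0.5") raises ValueError.
try:
    int_parser.parse_args(["--gpus", "0.5"])
    int_rejects_fraction = False
except argparse.ArgumentError:
    int_rejects_fraction = True
```

Note that argparse applies `type` only to string defaults, so with `default=1` the value stays the integer 1 when `--gpus` is omitted, whichever type is declared.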

sys.path.insert(0, str(_repo_dir))
from benchmarking.runner.utils import write_benchmark_results # noqa: E402

_expected_num_results = 50
Contributor

The hardcoded _expected_num_results = 50 makes the benchmark fragile. If dataset parameters change (different split, language, or dataset size), the assertions on lines 137 and 139 will fail.

Consider making this configurable or removing the strict assertions in favor of logging the actual number of results, or at least document why exactly 50 results are expected for this specific configuration (dev split, hy_am language).
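
One way to act on this suggestion is sketched below: the expected count becomes an optional argument, and a mismatch is logged and reported rather than asserted. The `--expected-num-results` flag, the `check_num_results` helper, and the logger name are hypothetical and are not part of the PR.

```python
import argparse
import logging
from typing import Optional

logger = logging.getLogger("audio_fleurs_benchmark_sketch")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Hypothetical flag: leaving it unset disables the count check entirely.
    parser.add_argument("--expected-num-results", type=int, default=None,
                        help="Expected number of results; omit to skip the check")
    return parser


def check_num_results(actual: int, expected: Optional[int]) -> bool:
    """Return True when the result count is acceptable, logging either way."""
    if expected is None:
        logger.info("Produced %d results (no expected count configured)", actual)
        return True
    if actual != expected:
        logger.warning("Expected %d results but got %d", expected, actual)
        return False
    logger.info("Produced the expected %d results", actual)
    return True


args = build_parser().parse_args(["--expected-num-results", "50"])
ok = check_num_results(actual=50, expected=args.expected_num_results)
```

Returning a boolean lets the caller fold the outcome into a success flag instead of aborting on an AssertionError when the split or language changes.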

Contributor

@greptile-apps greptile-apps Bot left a comment

No files reviewed, no comments

Comment thread benchmarking/scripts/audio_fleurs_benchmark.py Outdated
Comment thread benchmarking/scripts/audio_fleurs_benchmark.py Outdated
Comment thread benchmarking/scripts/audio_fleurs_benchmark.py Outdated
Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
Contributor

@greptile-apps greptile-apps Bot left a comment

6 files reviewed, 1 comment

@@ -0,0 +1,141 @@
# Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
Contributor

syntax: Copyright year should be 2025, not 2026

Suggested change
- # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
+ # Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.

Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
Signed-off-by: rlratzel <rratzel@nvidia.com>
Contributor

@greptile-apps greptile-apps Bot left a comment

5 files reviewed, 1 comment

parser.add_argument("--lang", default="hy_am", help="Language code")
parser.add_argument("--split", default="dev", help="Dataset split to use")
parser.add_argument("--wer-threshold", type=float, default=5.5, help="WER threshold for filtering")
parser.add_argument("--gpus", type=int, default=1, help="Number of GPUs to use")
Contributor

style: type=int differs from the tutorial's type=float (tutorials/audio/fleurs/pipeline.py:110), though both work with Resources.gpus

Contributor

@praateekmahajan praateekmahajan left a comment

LGTM!

@praateekmahajan praateekmahajan enabled auto-merge (squash) January 14, 2026 00:25
Contributor

@greptile-apps greptile-apps Bot left a comment

5 files reviewed, 1 comment

@@ -0,0 +1,141 @@
# Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
Contributor

syntax: Copyright year should be 2025 to match other files in this PR (entry.py, utils.py, session.py all use 2025)

Suggested change
- # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
+ # Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.

…urator into 26.02-add_audio_bench

Signed-off-by: rlratzel <rratzel@nvidia.com>
auto-merge was automatically disabled January 14, 2026 00:38

Head branch was pushed to by a user without write access

Contributor

@greptile-apps greptile-apps Bot left a comment

3 files reviewed, 1 comment

Comment thread benchmarking/runner/utils.py Outdated
Signed-off-by: rlratzel <rratzel@nvidia.com>
Contributor

@greptile-apps greptile-apps Bot left a comment

3 files reviewed, 1 comment

@@ -0,0 +1,141 @@
# Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
Contributor

syntax: copyright year should be 2025 to match other files in the repository

Suggested change
- # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
+ # Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.

@praateekmahajan praateekmahajan merged commit fb88643 into NVIDIA-NeMo:main Jan 14, 2026
18 checks passed

3 participants