[benchmarking] Adds audio curation benchmark to nightly by rlratzel · Pull Request #1360 · NVIDIA-NeMo/Curator

rlratzel · 2026-01-10T02:04:10Z

Adds an audio benchmark to the nightly benchmark suite.

This benchmark is based on the current audio example and adds code to save metadata and results used by the benchmarking framework.

Note to reviewers: this PR depends on features in #1341, so it has been merged with this branch. When #1341 is merged, the diff should only include changes needed for adding the audio benchmark.

copy-pr-bot · 2026-01-10T02:04:14Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

greptile-apps · 2026-01-10T02:16:47Z

Greptile Summary

This PR adds an audio curation benchmark based on the FLEURS dataset to the nightly benchmark suite. The implementation follows established patterns from existing benchmarks and adds a reusable write_benchmark_results utility function.

Key Changes:

Added audio_fleurs_benchmark.py script that runs ASR inference on FLEURS dataset and filters by WER threshold
Added write_benchmark_results utility function in runner/utils.py to standardize result file writing across benchmarks
Added audio_fleurs configuration entry to nightly-benchmark.yaml with appropriate parameters
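For readers unfamiliar with the filtering step mentioned above, here is a minimal sketch of word error rate (WER) filtering. It is not the code from audio_fleurs_benchmark.py: the helper names `word_error_rate` and `keep_below_threshold`, the record keys `text` and `pred_text`, the choice to express WER as a percentage, and the "keep entries at or below the threshold" rule are all assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference length * 100."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def keep_below_threshold(records, wer_threshold):
    """Keep records whose WER is at or below the threshold (assumed semantics)."""
    return [
        r for r in records
        if word_error_rate(r["text"], r["pred_text"]) <= wer_threshold
    ]


# Hypothetical records; the key names are illustrative only.
records = [
    {"text": "a b c d", "pred_text": "a b c d"},  # WER 0.0
    {"text": "a b c d", "pred_text": "a x c d"},  # WER 25.0
]
kept = keep_below_threshold(records, wer_threshold=5.5)
```

With the default threshold of 5.5 used in the script's argument parser, only the first hypothetical record would be kept under these assumptions.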

Issues Found:

Copyright year in new script is 2026 instead of 2025 (already noted in previous reviews)

Confidence Score: 4/5

This PR is safe to merge after fixing the copyright year
The implementation follows established benchmark patterns and the core logic is sound. The only blocking issue is the incorrect copyright year (2026 vs 2025), which is a trivial fix. The code reuses well-tested components from the tutorial and adds appropriate error handling and result tracking.
Pay attention to benchmarking/scripts/audio_fleurs_benchmark.py to fix the copyright year

Important Files Changed

| Filename | Overview |
| --- | --- |
| benchmarking/scripts/audio_fleurs_benchmark.py | New audio benchmark script with copyright year issue (2026 vs 2025) and minor inconsistencies with tutorial |
| benchmarking/runner/utils.py | Adds `write_benchmark_results` utility function, clean implementation |
| benchmarking/nightly-benchmark.yaml | Configuration entry for audio benchmark with appropriate parameters |

Sequence Diagram

sequenceDiagram
    participant Main as main()
    participant Run as run_audio_fleurs_benchmark()
    participant Pipeline as Audio Pipeline
    participant Executor as XennaExecutor
    participant Utils as write_benchmark_results()
    participant FS as File System

    Main->>Main: Parse arguments
    Main->>Main: Initialize result_dict
    Main->>Main: Convert paths to Path objects
    Main->>Run: run_audio_fleurs_benchmark(args)
    Run->>FS: Check results_dir exists
    Run->>Executor: Create XennaExecutor
    Run->>Pipeline: Create Pipeline
    Run->>Pipeline: Add CreateInitialManifestFleursStage
    Run->>Pipeline: Add InferenceAsrNemoStage
    Run->>Pipeline: Add GetPairwiseWerStage
    Run->>Pipeline: Add GetAudioDurationStage
    Run->>Pipeline: Add PreserveByValueStage
    Run->>Pipeline: Add AudioToDocumentStage
    Run->>Pipeline: Add JsonlWriter
    Pipeline->>Executor: pipeline.run(executor)
    Executor-->>Pipeline: results (tasks)
    Pipeline-->>Run: tasks
    Run-->>Main: result dict with metrics and tasks
    Main->>Main: Update result_dict
    Main->>Main: Set success_code based on is_success
    Main->>Utils: write_benchmark_results(result_dict, path)
    Utils->>FS: Create output directory
    Utils->>FS: Write params.json
    Utils->>FS: Write metrics.json
    Utils->>FS: Write tasks.pkl
    Utils-->>Main: Done
    Main-->>Main: Return success_code
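
The last steps of the diagram (create the output directory, then write params.json, metrics.json, and tasks.pkl) can be sketched roughly as follows. This is not the implementation in benchmarking/runner/utils.py: the function name `write_benchmark_results_sketch` and the assumption that the result dict carries `params`, `metrics`, and `tasks` keys are hypothetical and chosen only to mirror the file names in the diagram.

```python
import json
import pickle
from pathlib import Path


def write_benchmark_results_sketch(result_dict, output_dir):
    """Write params/metrics as JSON and tasks as a pickle (illustrative only)."""
    output_path = Path(output_dir)
    # Create the output directory, including any missing parents.
    output_path.mkdir(parents=True, exist_ok=True)

    # default=str lets non-JSON values such as Path objects be serialized.
    with open(output_path / "params.json", "w") as f:
        json.dump(result_dict.get("params", {}), f, indent=2, default=str)
    with open(output_path / "metrics.json", "w") as f:
        json.dump(result_dict.get("metrics", {}), f, indent=2, default=str)

    # Task objects may not be JSON-serializable, so they are pickled instead.
    with open(output_path / "tasks.pkl", "wb") as f:
        pickle.dump(result_dict.get("tasks", []), f)
```

Centralizing the writes in one helper keeps every benchmark script producing the same three files, which is presumably what lets the framework locate results in a uniform way.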

greptile-apps

4 files reviewed, 4 comments

greptile-apps · 2026-01-10T02:16:52Z

@@ -0,0 +1,190 @@
+# Copyright (c) 2026, NVIDIA CORPORATION.  All rights reserved.


The copyright year is set to 2026, which is in the future. This should be 2025 to match the other files in this PR (entry.py, session.py, utils.py all use 2025).

Suggested change

- # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
+ # Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.

greptile-apps · 2026-01-10T02:16:53Z

+    parser.add_argument("--lang", default="hy_am", help="Language code")
+    parser.add_argument("--split", default="dev", help="Dataset split to use")
+    parser.add_argument("--wer-threshold", type=float, default=5.5, help="WER threshold for filtering")
+    parser.add_argument("--gpus", type=int, default=1, help="Number of GPUs to use")


The --gpus argument is defined as type=int, but the original tutorial code in tutorials/audio/fleurs/pipeline.py line 110 uses type=float. This creates an inconsistency.

The Resources class used in line 104 likely expects a float (GPUs can be fractional for resource allocation). Consider using type=float to match the original tutorial:

Suggested change

- parser.add_argument("--gpus", type=int, default=1, help="Number of GPUs to use")
+ parser.add_argument("--gpus", type=float, default=1, help="Number of GPUs to use")
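
The practical difference between the two declarations shows up only when a fractional value is passed on the command line. The sketch below uses only the standard library `argparse`; the `build_parser` helper is hypothetical and exists just to compare the two `type` settings side by side.

```python
import argparse


def build_parser(gpus_type):
    """Build a parser whose --gpus option uses the given type (illustrative helper)."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--gpus", type=gpus_type, default=1, help="Number of GPUs to use")
    return parser


# type=float accepts a fractional GPU request.
float_args = build_parser(float).parse_args(["--gpus", "0.5"])

# type=int rejects the same input: argparse reports the error and exits.
int_rejected = False
try:
    build_parser(int).parse_args(["--gpus", "0.5"])
except SystemExit:
    int_rejected = True

# With no --gpus given, the non-string default is used as-is (the int 1),
# because argparse only applies `type` to string defaults.
default_args = build_parser(float).parse_args([])
```

So `type=int` is only a problem if the benchmark is ever expected to request a fraction of a GPU; whole-number values parse under either setting.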

greptile-apps · 2026-01-10T02:16:54Z

+sys.path.insert(0, str(_repo_dir))
+from benchmarking.runner.utils import write_benchmark_results  # noqa: E402
+
+_expected_num_results = 50


The hardcoded _expected_num_results = 50 makes the benchmark fragile. If dataset parameters change (different split, language, or dataset size), the assertions on lines 137 and 139 will fail.

Consider making this configurable or removing the strict assertions in favor of logging the actual number of results, or at least document why exactly 50 results are expected for this specific configuration (dev split, hy_am language).
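
One way to follow this suggestion is sketched below. The `--expected-num-results` flag and the `check_num_results` helper are hypothetical and do not appear in the PR; the idea is simply to make the expected count optional and to log the actual count instead of asserting when no expectation is given.

```python
import argparse
import logging

logger = logging.getLogger(__name__)


def build_parser():
    parser = argparse.ArgumentParser()
    # Hypothetical flag: omit it to skip the strict count check entirely.
    parser.add_argument(
        "--expected-num-results",
        type=int,
        default=None,
        help="If set, the run is marked unsuccessful when the result count differs",
    )
    return parser


def check_num_results(num_results, expected):
    """Return True if the count is acceptable; log instead of raising."""
    if expected is None:
        logger.info("Pipeline produced %d results", num_results)
        return True
    if num_results != expected:
        logger.error("Expected %d results, got %d", expected, num_results)
        return False
    return True


args = build_parser().parse_args(["--expected-num-results", "50"])
ok = check_num_results(num_results=50, expected=args.expected_num_results)
```

Returning a boolean rather than asserting lets the caller fold the check into whatever success flag it already reports.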

greptile-apps

6 files reviewed, 1 comment

greptile-apps · 2026-01-13T20:02:26Z

@@ -0,0 +1,141 @@
+# Copyright (c) 2026, NVIDIA CORPORATION.  All rights reserved.


syntax: Copyright year should be 2025, not 2026

Suggested change

- # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
+ # Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.

greptile-apps

5 files reviewed, 1 comment

greptile-apps · 2026-01-13T20:34:57Z

+    parser.add_argument("--lang", default="hy_am", help="Language code")
+    parser.add_argument("--split", default="dev", help="Dataset split to use")
+    parser.add_argument("--wer-threshold", type=float, default=5.5, help="WER threshold for filtering")
+    parser.add_argument("--gpus", type=int, default=1, help="Number of GPUs to use")


style: type=int differs from the tutorial's type=float (tutorials/audio/fleurs/pipeline.py:110), though both work with Resources.gpus

praateekmahajan

LGTM!

greptile-apps

5 files reviewed, 1 comment

greptile-apps · 2026-01-14T00:27:45Z

@@ -0,0 +1,141 @@
+# Copyright (c) 2026, NVIDIA CORPORATION.  All rights reserved.


syntax: Copyright year should be 2025 to match other files in this PR (entry.py, utils.py, session.py all use 2025)

Suggested change

- # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
+ # Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.

greptile-apps

3 files reviewed, 1 comment

greptile-apps

3 files reviewed, 1 comment

greptile-apps · 2026-01-14T02:43:20Z

@@ -0,0 +1,141 @@
+# Copyright (c) 2026, NVIDIA CORPORATION.  All rights reserved.


syntax: copyright year should be 2025 to match other files in the repository

Suggested change

- # Copyright (c) 2026, NVIDIA CORPORATION. All rights reserved.
+ # Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.

rlratzel and others added 26 commits December 12, 2025 21:46

Updates env var names to match other top-level scripts, does not tag images with :latest by default, adds session name to slack report.

89bc074

Signed-off-by: rlratzel <rratzel@nvidia.com>

Updates env var for consistency

9b64594

Signed-off-by: rlratzel <rratzel@nvidia.com>

Fixes formatting of help message.

34564d4

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge branch 'main' into 2602_benchmark_infra_updates

7ef2bf9

Merge remote-tracking branch 'upstream/main' into 2602_benchmark_infra_updates

0692fbb

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge branch '2602_benchmark_infra_updates' of https://github.com/rlratzel/curator into 2602_benchmark_infra_updates

82383a0

Signed-off-by: rlratzel <rratzel@nvidia.com>

Removes unused support for an artifacts dir.

3ec5b30

Signed-off-by: rlratzel <rratzel@nvidia.com>

Removes unconditional use of --benchmarks-results-dir arg when running script to allow for more flexibility.

ed1b95e

Signed-off-by: rlratzel <rratzel@nvidia.com>

Fixes warning condition about not converting to number when only human-readable output is needed, updates paths to benchmark output dir.

59de0d9

Signed-off-by: rlratzel <rratzel@nvidia.com>

Updates results path to be session_entry_dir so framework can find results

53ef91d

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge branch '2602_benchmark_infra_updates' into 26.02-add_image_bench

af646d8

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge remote-tracking branch 'upstream/main' into 26.02-add_image_bench

eba1928

Signed-off-by: rlratzel <rratzel@nvidia.com>

Adds initial entry for image curation benchmark

cc2bf93

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge remote-tracking branch 'upstream/main' into 26.02-add_image_bench

026d79c

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge remote-tracking branch 'upstream/main' into 26.02-add_image_bench

fc61726

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge remote-tracking branch 'upstream/main' into 26.02-add_image_bench

8a7f0b2

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge remote-tracking branch 'upstream' into 26.02-add_image_bench

6e90da1

Signed-off-by: rlratzel <rratzel@nvidia.com>

Adds curator_repo_dir reserved placeholder, fixes bug where invalid placeholders were silently ignored, comment cleanup.

69b1d06

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge remote-tracking branch 'upstream' into 26.02-add_image_bench

58741c4

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge remote-tracking branch 'upstream' into 26.02-add_image_bench

323c346

Signed-off-by: rlratzel <rratzel@nvidia.com>

Initial audio benchmark script.

58eaf11

Signed-off-by: rlratzel <rratzel@nvidia.com>

Minor updates for linter.

3324e2f

Signed-off-by: rlratzel <rratzel@nvidia.com>

Adds updates for writing out out json/pickle files, adds new benchmark to nightly YAML

6b461e8

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge remote-tracking branch 'upstream' into 26.02-add_image_bench

a991a07

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge remote-tracking branch 'upstream' into 26.02-add_audio_bench

43f1bfc

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge branch '26.02-add_image_bench' into 26.02-add_audio_bench

bc50782

Signed-off-by: rlratzel <rratzel@nvidia.com>

rlratzel marked this pull request as ready for review January 10, 2026 02:12

greptile-apps Bot reviewed Jan 10, 2026

rlratzel added 3 commits January 12, 2026 18:03

Merge remote-tracking branch 'upstream/main' into 26.02-add_audio_bench

33ebc5a

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge remote-tracking branch 'upstream/main' into 26.02-add_image_bench

a1c89be

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge branch '26.02-add_image_bench' into 26.02-add_audio_bench

c6d6589

Signed-off-by: rlratzel <rratzel@nvidia.com>

greptile-apps Bot reviewed Jan 13, 2026

praateekmahajan reviewed Jan 13, 2026

Comment thread benchmarking/scripts/audio_fleurs_benchmark.py Outdated

praateekmahajan reviewed Jan 13, 2026

Comment thread benchmarking/scripts/audio_fleurs_benchmark.py Outdated

praateekmahajan reviewed Jan 13, 2026

Comment thread benchmarking/scripts/audio_fleurs_benchmark.py Outdated

rlratzel added 4 commits January 13, 2026 12:23

Adds JSON util back

02ba779

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge branch '26.02-add_image_bench' into 26.02-add_audio_bench

6a03d08

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge remote-tracking branch 'upstream/main' into 26.02-add_audio_bench

98e3a56

Signed-off-by: rlratzel <rratzel@nvidia.com>

Removes unneeded assertion code

6e5cd40

Signed-off-by: rlratzel <rratzel@nvidia.com>

greptile-apps Bot reviewed Jan 13, 2026

rlratzel added 4 commits January 13, 2026 14:03

Merge remote-tracking branch 'upstream/main' into 26.02-add_image_bench

3e556f7

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge branch '26.02-add_image_bench' into 26.02-add_audio_bench

dfb5834

Signed-off-by: rlratzel <rratzel@nvidia.com>

Adds JSON util back

fa311ed

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge branch '26.02-add_image_bench' into 26.02-add_audio_bench

cda714d

Signed-off-by: rlratzel <rratzel@nvidia.com>

greptile-apps Bot reviewed Jan 13, 2026

praateekmahajan approved these changes Jan 14, 2026

Merge branch 'main' into 26.02-add_audio_bench

3f3d5ac

praateekmahajan enabled auto-merge (squash) January 14, 2026 00:25

greptile-apps Bot reviewed Jan 14, 2026

rlratzel added 2 commits January 13, 2026 18:35

Merge remote-tracking branch 'upstream/main' into 26.02-add_audio_bench

8f84e03

Signed-off-by: rlratzel <rratzel@nvidia.com>

Merge branch '26.02-add_audio_bench' of https://github.com/rlratzel/curator into 26.02-add_audio_bench

957f2e6

Signed-off-by: rlratzel <rratzel@nvidia.com>

auto-merge was automatically disabled January 14, 2026 00:38
Head branch was pushed to by a user without write access

greptile-apps Bot reviewed Jan 14, 2026

Comment thread benchmarking/runner/utils.py Outdated

Removes redundant line from merge

3ed1e14

Signed-off-by: rlratzel <rratzel@nvidia.com>

greptile-apps Bot reviewed Jan 14, 2026

praateekmahajan merged commit fb88643 into NVIDIA-NeMo:main Jan 14, 2026
18 checks passed

copy-pr-bot Bot pushed a commit that referenced this pull request Feb 19, 2026

[benchmarking] Adds audio curation benchmark to nightly (#1360)

c0df452
