
test: Update on-policy distillation release tests#1363

Merged
terrykong merged 9 commits into NVIDIA-NeMo:main from zpqiu:fix-distillation-release-tests
Oct 16, 2025

Conversation

Contributor

@zpqiu zpqiu commented Oct 15, 2025

What does this PR do ?

This pull request updates and simplifies the distillation configurations and test suites. The main changes tighten validation accuracy requirements, reduce runtime, clean up configuration files, and remove the scripts and configs for 8B model distillation.

Issues

List issues that this PR closes:

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Summary by CodeRabbit

  • New Features

    • Added a new distillation configuration with sequence packing for a 2-node, 8-GPU setup.
  • Chores

    • Increased validation batch sizes and reduced max sequence lengths for student/teacher models.
    • Introduced reverse KL loss; removed optimizer/scheduler blocks and several batching fields.
    • Reduced checkpoint save interval; added a tensor-parallel setting for generation.
    • Removed multiple legacy distillation configurations.
  • Tests

    • Tightened loss thresholds, added validation accuracy checks, and reduced time budgets.
    • Shortened long-run tests; removed select 8B/long convergence suites.
    • Updated release test list to reflect new/retired scenarios.

zpqiu added 9 commits October 14, 2025 02:56
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
@zpqiu zpqiu marked this pull request as ready for review October 15, 2025 12:24
@zpqiu zpqiu requested review from a team as code owners October 15, 2025 12:24
@coderabbitai
Contributor

coderabbitai Bot commented Oct 15, 2025

📝 Walkthrough

Updates multiple Qwen distillation YAML configs (parameter adjustments, removals of batch/scheduler blocks), adds a new seqpack config, deletes three configs. Corresponding test scripts tighten time/metric thresholds; two test scripts are removed. Release test list is updated to reflect additions/removals and renames.

Changes

Cohort / File(s) | Summary of changes
Configs: parameter simplifications (batch/schedulers)
examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.yaml
Increase val_batch_size (32→256). Remove policy/teacher train_global_batch_size, generation_batch_size, and all scheduler blocks.
Configs: long run tuning and structure updates
examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml
val_batch_size 32→512; remove max_val_samples; add loss_fn.kl_type: reverse; checkpointing.save_period 50→10; policy: remove train/generation batch sizes, dynamic_batching, optimizer/scheduler/milestones; reduce max_total_sequence_length 32768→20480; add generation.vllm_cfg.tensor_parallel_size: 2; teacher max_total_sequence_length 32768→20480.
Configs: non-colocated simplifications
examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml
val_batch_size 32→256; remove policy/teacher train/generation batch sizes, dynamic_batching, optimizer/schedulers/milestones; retain dtensor_cfg, make_sequence_length_divisible_by, and generation.colocated: false.
Config: new seqpack variant (added)
examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml
Add new YAML: distillation params, loss_fn.kl_type: reverse, checkpointing, policy/teacher with dtensor settings, sequence_packing enabled, dynamic_batching disabled, and cluster/logger blocks.
Configs: removed
examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-instruct-2n8g-fsdp2tp2-seqpack.v1.yaml, examples/configs/recipes/llm/distillation-qwen3-32b-to-8b-base-2n8g-fsdp2tp2.v1.yaml, examples/configs/recipes/llm/distillation-qwen3-32b-to-8b-base-4n8g-fsdp2tp8-long.v1.yaml
Delete entire YAML configurations.
Tests: tightened budgets and metrics
tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh, .../distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh, .../distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh, .../distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh
Reduce NUM_MINUTES; reduce steps (long variant); tighten train/loss thresholds; add validation/accuracy checks at step; remove GPU memory checks where present.
Tests: removed
tests/test_suites/llm/distillation-qwen3-32b-to-8b-base-2n8g-fsdp2tp2.v1.sh, tests/test_suites/llm/distillation-qwen3-32b-to-8b-base-4n8g-fsdp2tp8-long.v1.sh
Delete entire test scripts.
Release test manifest
tests/test_suites/release.txt
Remove 100-step 4b test and 8b long test; update headings; replace 4b-instruct-seqpack with 4b-base-2n8g-seqpack; adjust references accordingly.
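
Based on the field names listed in this summary, the added seqpack recipe plausibly has roughly the following shape. This is a hedged sketch, not the file's actual contents: the key names come from the fields mentioned in this PR and its review comments, and any values not stated there (for example, the checkpoint directory and save period) are illustrative placeholders.

```yaml
# Sketch of the new seqpack recipe's structure, inferred from the change
# summary; values marked "illustrative" are placeholders, not the real config.
loss_fn:
  kl_type: reverse
checkpointing:
  checkpoint_dir: checkpoints/distillation-qwen3-32b-to-4b-base-seqpack  # illustrative
  save_period: 10  # illustrative
policy:
  model_name: Qwen/Qwen3-4B-Base
  dtensor_cfg:
    context_parallel_size: 1
  sequence_packing:
    enabled: true
  dynamic_batching:
    enabled: false
teacher:
  model_name: Qwen/Qwen3-32B  # illustrative; the 32B teacher named in the recipe title
cluster:
  num_nodes: 2
  gpus_per_node: 8
```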

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • RL#1024 — Similar YAML config simplifications and normalization introduced by a tooling refactor.
  • RL#1006 — Touches the same distillation YAMLs and test scripts adjusted here.

Suggested labels

CI:L1, r0.4.0

Suggested reviewers

  • parthchadha
  • terrykong

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Test Results For Major Changes: ⚠️ Warning. The PR introduces significant updates to distillation configurations and test suites that could impact convergence behavior and runtime performance, yet the description does not include any concrete test outcomes, regression analysis, or before-and-after performance metrics to demonstrate there is no degradation. Resolution: add documented test results or performance data showing that convergence metrics and runtime targets remain acceptable, including before-and-after comparisons or numerical evidence to confirm there are no regressions.
✅ Passed checks (3 passed)
  • Description Check: ✅ Passed. Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check: ✅ Passed. The title "test: Update on-policy distillation release tests" accurately reflects that the pull request focuses on updating and tightening the distillation test suites, which are a significant portion of the changes; although the PR also includes configuration cleanups and script removals, the title sufficiently conveys the primary testing work.
  • Docstring Coverage: ✅ Passed. No functions found in the changes; docstring coverage check skipped.


@zpqiu zpqiu requested a review from terrykong October 15, 2025 12:24
@zpqiu zpqiu added the CI:L1 Run doctests, unit tests, and functional tests label Oct 15, 2025
Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml (1)

11-12: save_period exceeds total steps; likely no checkpoints saved.

max_num_steps is 20 but save_period is 50, so nothing gets saved during the run. If you expect an artifact, reduce save_period.

Apply:

 checkpointing:
   checkpoint_dir: checkpoints/distillation-qwen3-32b-to-4b-base-noncolocated
-  save_period: 50
+  save_period: 20
🧹 Nitpick comments (5)
tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh (1)

10-10: Time budget reduction looks good; consider silencing SC2034 locally.

NUM_MINUTES is harness-consumed; add a shellcheck directive/comment to avoid false positives.

- NUM_MINUTES=120
+# shellcheck disable=SC2034  # Consumed by harness via common.env
+NUM_MINUTES=120

Based on learnings

examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml (2)

13-20: Consider aligning generation TP with the long recipe for parity.

Add policy.generation.vllm_cfg.tensor_parallel_size: 2 to mirror the long config behavior.

 policy:
   model_name: Qwen/Qwen3-4B-Base
+  generation:
+    vllm_cfg:
+      tensor_parallel_size: 2
   dtensor_cfg:
     context_parallel_size: 1

Also applies to: 31-37


12-20: Add explicit tensor_parallel_size to policy.dtensor_cfg
The default (distillation_math.yaml) sets tensor_parallel_size: 2, but policy.dtensor_cfg only overrides context_parallel_size. For clarity, include:

 policy:
   dtensor_cfg:
+    tensor_parallel_size: 2
     context_parallel_size: 1
tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh (1)

10-10: Time budget reduction looks good; silence SC2034 locally if needed.

NUM_MINUTES is harness-consumed; add a directive/comment to avoid ShellCheck noise.

- NUM_MINUTES=120
+# shellcheck disable=SC2034  # Consumed by harness via common.env
+NUM_MINUTES=120

Based on learnings

tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh (1)

10-10: Time budget reduction acknowledged; consider SC2034 suppression.

NUM_MINUTES is used by the harness; add a directive/comment to appease ShellCheck.

- NUM_MINUTES=120
+# shellcheck disable=SC2034  # Consumed by harness via common.env
+NUM_MINUTES=120

Based on learnings

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5c67023 and 03e430d.

📒 Files selected for processing (14)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.yaml (1 hunks)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml (1 hunks)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml (1 hunks)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml (1 hunks)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-instruct-2n8g-fsdp2tp2-seqpack.v1.yaml (0 hunks)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-8b-base-2n8g-fsdp2tp2.v1.yaml (0 hunks)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-8b-base-4n8g-fsdp2tp8-long.v1.yaml (0 hunks)
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh (2 hunks)
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh (2 hunks)
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh (2 hunks)
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh (2 hunks)
  • tests/test_suites/llm/distillation-qwen3-32b-to-8b-base-2n8g-fsdp2tp2.v1.sh (0 hunks)
  • tests/test_suites/llm/distillation-qwen3-32b-to-8b-base-4n8g-fsdp2tp8-long.v1.sh (0 hunks)
  • tests/test_suites/release.txt (1 hunks)
💤 Files with no reviewable changes (5)
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-8b-base-2n8g-fsdp2tp2.v1.yaml
  • tests/test_suites/llm/distillation-qwen3-32b-to-8b-base-2n8g-fsdp2tp2.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-8b-base-4n8g-fsdp2tp8-long.v1.sh
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-instruct-2n8g-fsdp2tp2-seqpack.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-8b-base-4n8g-fsdp2tp8-long.v1.yaml
🧰 Additional context used
📓 Path-based instructions (7)
**/*.sh

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

**/*.sh: Follow the Google Shell Style Guide for all shell scripts
Use uv run to execute Python scripts in shell/driver scripts instead of activating virtualenvs and calling python directly
Add the NVIDIA copyright header (with current year) at the top of all shell scripts, excluding tests/ and test-only scripts

Files:

  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh
tests/test_suites/llm/*.sh

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

LLM driver script filenames must mirror the YAML base name and follow the same pattern with .sh extension

Files:

  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh
tests/test_suites/**

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Place driver shell scripts and common.env under tests/test_suites// and list nightly tests in tests/test_suites/nightly.txt

Files:

  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh
  • tests/test_suites/release.txt
  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh
examples/configs/recipes/**/*.yaml

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

examples/configs/recipes/**/*.yaml: Recipe YAMLs under examples/configs/recipes/** are runnable snapshots and may omit documentation
When adding support for a new model, add a recipe YAML under examples/configs/recipes/ in the appropriate domain (llm/ or vlm/) with the correct name

Files:

  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml
examples/configs/recipes/llm/*.yaml

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

LLM recipe YAML filenames must follow: --ng-[-modifiers][-long][.vN].yaml

Files:

  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml
examples/configs/recipes/**/*.{yaml,sh}

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Known exception: Deepscaler recipes may encode context length in place of the cluster tuple (e.g., grpo-deepscaler-1.5b-8K.*); allowed but document intended hardware in the script

Files:

  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml
examples/configs/recipes/**

📄 CodeRabbit inference engine (CODING_GUIDELINES.md)

Place recipe YAMLs under examples/configs/recipes//

Files:

  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.yaml
  • examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml
🧠 Learnings (1)
📚 Learning: 2025-10-12T14:46:57.171Z
Learnt from: zpqiu
PR: NVIDIA-NeMo/RL#1324
File: tests/test_suites/llm/distillation-qwen3-32b-to-1.7b-base-1n8g-megatron-tp2pp2cp2-pack.sh:6-11
Timestamp: 2025-10-12T14:46:57.171Z
Learning: Test scripts in tests/test_suites/llm/ follow a standard configuration pattern that includes NUM_NODES, STEPS_PER_RUN, MAX_STEPS, NUM_RUNS (calculated as `$(( (MAX_STEPS + STEPS_PER_RUN - 1) / STEPS_PER_RUN ))`), and NUM_MINUTES. These variables are part of the test infrastructure's standard interface and should not be flagged as unused even if not directly referenced within the individual script, as they are consumed by external launch tooling or common.env.

Applied to files:

  • tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh
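
The ceiling-division pattern quoted in this learning can be sketched as a stand-alone snippet. The concrete values below are illustrative, not taken from any specific test script:

```shell
# Standard variable block consumed by the test harness (sketch; values are
# illustrative). NUM_RUNS uses integer ceiling division: the number of
# sequential runs needed so that STEPS_PER_RUN * NUM_RUNS >= MAX_STEPS.
STEPS_PER_RUN=50
MAX_STEPS=100
NUM_RUNS=$(( (MAX_STEPS + STEPS_PER_RUN - 1) / STEPS_PER_RUN ))
echo "NUM_RUNS=$NUM_RUNS"
```

With MAX_STEPS=120 and STEPS_PER_RUN=50, the same formula yields 3 runs, since plain integer division would undercount by one whenever MAX_STEPS is not a multiple of STEPS_PER_RUN.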
🪛 LanguageTool
tests/test_suites/release.txt

[grammar] ~43-~43: There might be a mistake here.
Context: ... ################ # Long 4b convergence tests/test_suites/llm/distillation-qwen3...

(QB_NEW_EN)


[grammar] ~46-~46: Use a hyphen to join words.
Context: ...-4b-base-2n8g-fsdp2tp2-long.v1.sh # 20 step functional tests on dynamic batchin...

(QB_NEW_EN_HYPHEN)


[grammar] ~46-~46: Ensure spelling is correct
Context: ... on dynamic batching, non-colocated and seqence packing features tests/test_suites/llm/...

(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)


[grammar] ~47-~47: There might be a mistake here.
Context: ...4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh tests/test_suites/llm/distillation-qwen3...

(QB_NEW_EN)


[grammar] ~48-~48: There might be a mistake here.
Context: ...4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh tests/test_suites/llm/distillation-qwen3...

(QB_NEW_EN)


[grammar] ~49-~49: There might be a mistake here.
Context: ...b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh

(QB_NEW_EN)

🪛 Shellcheck (0.11.0)
tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh

[warning] 10-10: NUM_MINUTES appears unused. Verify use (or export if used externally).

(SC2034)

tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh

[warning] 10-10: NUM_MINUTES appears unused. Verify use (or export if used externally).

(SC2034)

tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh

[warning] 9-9: NUM_RUNS appears unused. Verify use (or export if used externally).

(SC2034)


[warning] 10-10: NUM_MINUTES appears unused. Verify use (or export if used externally).

(SC2034)

tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh

[warning] 10-10: NUM_MINUTES appears unused. Verify use (or export if used externally).

(SC2034)

🔇 Additional comments (12)
examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.yaml (3)

5-5: Validate memory headroom for val_batch_size=256.

256 may OOM with a 32B teacher depending on seq length/precision. If you see OOMs in release runs, drop to 128.


20-25: Topology alignment LGTM.

policy and teacher each reserve 1×8 GPUs with colocated: false; cluster.num_nodes=2. This matches the non-colocated layout and TP8.

Also applies to: 32-36, 43-43


1-1: Inherited optimizer/scheduler confirmed

distillation_math.yaml defines optimizer (AdamW) and scheduler (LinearLR) under policy; teacher merges policy, so no missing blocks remain.

examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.yaml (1)

5-5: Check that val_batch_size=256 fits on the 1-node, 8-GPU cluster
Defaults use bfloat16 precision and max_total_sequence_length=8192 for eval with max_val_samples=256. If GPU OOM risk is high, lower val_batch_size or cap sequence length/precision.

tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp8-noncolocated.v1.sh (1)

38-39: Tighter metrics LGTM.

Adding validation/accuracy and tightening loss at step 20 aligns with updated configs.

tests/test_suites/release.txt (2)

43-43: Section rename is consistent with dropped 8b runs.

"Long 4b convergence" matches the current suite scope.


49-49: Verified seqpack script and recipe exist with matching names per guidelines.

examples/configs/recipes/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.yaml (1)

5-5: Config tightening LGTM.

  • val_batch_size 512, save_period 10, reverse KL, reduced max seq length, and vLLM TP=2 look consistent with the updated tests.

Also applies to: 7-8, 11-11, 14-17, 20-20

tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-1n8g-fsdp2tp2-dynamicbatch.v1.sh (1)

38-39: Metric threshold updates LGTM.

Loss < 0.3 and validation/accuracy > 0.1 at step 20 match the suite changes.

tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-seqpack.v1.sh (1)

38-39: Seqpack thresholds LGTM.

Tighter loss and added validation accuracy check are consistent with config.

tests/test_suites/llm/distillation-qwen3-32b-to-4b-base-2n8g-fsdp2tp2-long.v1.sh (2)

7-8: Runtime contraction is reasonable for release tests.

STEPS_PER_RUN=50, MAX_STEPS=100, NUM_MINUTES=240 look balanced for CI.

Also applies to: 10-10


38-39: Metric tightening LGTM.

Loss < 0.25 and validation/accuracy > 0.2 at step 100 improve signal quality.

@terrykong terrykong added r0.4.0 CI:L0 Run doctests and unit tests and removed CI:L1 Run doctests, unit tests, and functional tests labels Oct 16, 2025
@terrykong terrykong enabled auto-merge (squash) October 16, 2025 07:11
@terrykong terrykong merged commit 638bc52 into NVIDIA-NeMo:main Oct 16, 2025
142 of 154 checks passed
chtruong814 pushed a commit that referenced this pull request Oct 16, 2025
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
lbliii pushed a commit that referenced this pull request Nov 3, 2025
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>
PrinsYin pushed a commit to PrinsYin/RL that referenced this pull request Nov 30, 2025
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
yuanhangsu1986 pushed a commit to yuanhangsu1986/RL-Nemontron-Edge-Omni that referenced this pull request Feb 21, 2026
Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
@zpqiu zpqiu deleted the fix-distillation-release-tests branch April 8, 2026 05:00

Labels

CI:L0 Run doctests and unit tests r0.4.0

2 participants