695fe82 to dedd010
📝 Walkthrough
Nightly test suite updated to reference a new VLM test script version. The new script reduces step counts, updates metric checkpoints to step 130, and changes the reward validation from a single-point threshold to a trailing-window mean check.
Sequence Diagram(s)
sequenceDiagram
    autonumber
    participant Nightly as Nightly Suite
    participant Script as VLM v2 Test Script
    participant Trainer as Training Job
    participant Metrics as Metrics Reader
    Nightly->>Script: Invoke test script
    Script->>Trainer: Launch training (MAX_STEPS=130)
    Trainer-->>Script: Emit logs/metrics
    Script->>Metrics: Parse metrics at step 130
    rect rgba(200,230,255,0.3)
        note right of Metrics: Changed check
        Metrics-->>Script: Compute mean(train/reward[130][-6:-1])
        Script-->>Nightly: Assert mean > 0.6 and other step-130 checks
    end
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches: ✅ Passed checks (4 passed)
0111fbe to 2a16654
fix
Signed-off-by: Terry Kong <terryk@nvidia.com>
rename
Signed-off-by: Terry Kong <terryk@nvidia.com>
fix
Signed-off-by: Terry Kong <terryk@nvidia.com>
fix
Signed-off-by: Terry Kong <terryk@nvidia.com>
2a16654 to d7d41a1
yfw approved these changes Oct 1, 2025
This was referenced Oct 3, 2025
PrinsYin pushed a commit to PrinsYin/RL that referenced this pull request Nov 30, 2025
Signed-off-by: Terry Kong <terryk@nvidia.com>
yuanhangsu1986 pushed a commit to yuanhangsu1986/RL-Nemontron-Edge-Omni that referenced this pull request Feb 21, 2026
Signed-off-by: Terry Kong <terryk@nvidia.com>
Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
The smolvlm test was never able to complete 200 steps in 3 hours. This PR lowers the step count and makes the final accuracy check an average over the trailing steps instead of a single-point threshold.
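To illustrate why an averaged check is less brittle than a single-point one, here is a minimal Python sketch. It is not the actual test script: the function names, the window size of 5, and the reward values are all hypothetical and were made up for illustration. Only the 0.6 threshold and the final step of 130 come from this PR.

```python
from statistics import fmean


def single_point_check(rewards, step, threshold):
    """Old-style check: only the value logged at `step` matters."""
    return rewards[step] > threshold


def trailing_mean_check(rewards, last_step, window, threshold):
    """Average the last `window` logged values ending at `last_step`."""
    steps = range(last_step - window + 1, last_step + 1)
    return fmean([rewards[s] for s in steps]) > threshold


# Hypothetical reward curve (not from a real run): noisy, with a dip
# on the very last step.
rewards = {126: 0.70, 127: 0.64, 128: 0.68, 129: 0.71, 130: 0.55}

print(single_point_check(rewards, step=130, threshold=0.6))
print(trailing_mean_check(rewards, last_step=130, window=5, threshold=0.6))
```

With these made-up numbers the single-point check fails on the final dip, while the trailing mean (0.656) still clears the threshold, so one noisy step does not fail the nightly run.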
https://wandb.ai/nvidia/nemo-rl?nw=x2dvezg4z1l
After #1115 merged the smolvlm test was disabled. I ran the config to check whether the metrics would have caught that issue, and they do:
Summary by CodeRabbit