docs: V0.5 perf results #1771

Closed
guyueh1 wants to merge 4 commits into NVIDIA-NeMo:main from guyueh1:v0.5_perf_results

Conversation

guyueh1 (Contributor) commented Jan 13, 2026

What does this PR do?

Update the performance results for v0.5.

Issues

List issues that this PR closes (syntax):

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Summary by CodeRabbit

  • Documentation

    • Expanded performance summary with new benchmark sections covering H100 BF16, H100 FP8, and GB200 BF16 configurations with detailed results tables.
  • New Features

    • Added GRPO performance testing configuration for Qwen3-30B with 128K context length support.
    • Added automated performance test suite for GRPO experiments including metrics validation.
  • Chores

    • Updated submodule dependency references.
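The metrics validation mentioned under New Features can be pictured as an upper-bound check over logged metrics. The following is a minimal sketch only: the function name, the metric keys, and the threshold values are hypothetical and are not taken from this PR's test script.

```python
from typing import Dict, List


def find_threshold_violations(
    metrics: Dict[str, float], upper_bounds: Dict[str, float]
) -> List[str]:
    """Return one message per metric that is missing or above its upper bound."""
    problems = []
    for name, bound in upper_bounds.items():
        if name not in metrics:
            # A metric that was never logged counts as a failure.
            problems.append(f"{name}: missing")
        elif metrics[name] > bound:
            problems.append(f"{name}: {metrics[name]} > {bound}")
    return problems


# Hypothetical metric names and bounds, for illustration only.
logged = {"train/loss": 0.42, "train/token_prob_error": 1.02}
bounds = {"train/loss": 1.0, "train/token_prob_error": 1.1}
violations = find_threshold_violations(logged, bounds)
```

An empty `violations` list would mean the run passed; a test harness could exit non-zero otherwise.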

Sherif Hosam Fouad Fawzy and others added 4 commits December 10, 2025 16:01
Signed-off-by: Sherif Hosam Fouad Fawzy <sfawzy@login-eos01.eos.clusters.nvidia.com>
Signed-off-by: Guyue Huang <guyueh@nvidia.com>
Signed-off-by: Guyue Huang <guyueh@nvidia.com>
@guyueh1 guyueh1 requested review from a team as code owners January 13, 2026 23:53
github-actions bot added the Documentation label (Improvements or additions to documentation) Jan 13, 2026
@guyueh1 guyueh1 closed this Jan 13, 2026
@github-actions

✅ Submodule Fast-Forward Check Results

Check based on commit: dd8189a (PR #1771 from v0.5_perf_results)

✅ Submodules that are properly updated:

Megatron-Bridge: ✅ PR branch is ahead of main branch (fast-forward)

All submodule changes look good! ✨

coderabbitai bot (Contributor) commented Jan 13, 2026

Caution

Review failed

The pull request is closed.

Walkthrough

This PR updates the Megatron-Bridge submodule, expands the performance benchmark documentation with H100/GB200 results, introduces a new GRPO configuration for Qwen3-30B with 128K sequence length, adds rope-scaling (yarn) embedding support in model imports, and registers a new performance test.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Submodule Update: 3rdparty/Megatron-Bridge-workspace/Megatron-Bridge | Pointer advanced from 1e9a459 to 812dcf75; no functional changes. |
| Documentation: docs/about/performance-summary.md | Restructured and expanded with three new benchmark sections (H100 BF16, H100 FP8, GB200 BF16), introducing explicit Algorithm and Model columns, updated metrics layout, and dataset references. |
| GRPO Qwen3-30B 128K Support: examples/configs/recipes/llm/performance/grpo-qwen3-30ba3b-4n8g-128K.yaml | New configuration enabling 128K sequence length with tensor/pipeline/expert/context parallelism, yarn rope scaling, and logging setup. |
| Model Import Enhancement: nemo_rl/models/megatron/community_import.py | Added conditional logic to apply yarn-based rope scaling embeddings when NRL_MCORE_OVERRIDE_EMBEDDING_TYPE=1 and rope_type="yarn" in config overrides. |
| Performance Test Suite: tests/test_suites/llm/performance/grpo-qwen3-30ba3b-4n8g-128K.sh | New shell script executing GRPO performance experiment with metric validation for training loss and token probability error thresholds. |
| Test Registry: tests/test_suites/performance_h100.txt | Added new test script path to SYNC suite. |
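
The gating condition described for community_import.py can be sketched as a small predicate. This is an illustration under assumptions, not the code in the PR: the function name is hypothetical, and treating `rope_type` as a top-level key of the overrides mapping is a guess. Only the environment variable name and the `"yarn"` value come from the summary above.

```python
import os
from typing import Any, Mapping, Optional

# Environment variable name as given in the walkthrough.
ENV_FLAG = "NRL_MCORE_OVERRIDE_EMBEDDING_TYPE"


def should_apply_yarn(
    overrides: Mapping[str, Any], env: Optional[Mapping[str, str]] = None
) -> bool:
    """Hypothetical helper: True only if the flag is "1" and rope_type is "yarn"."""
    env = os.environ if env is None else env
    # Assumption: rope_type sits at the top level of the config overrides.
    return env.get(ENV_FLAG) == "1" and overrides.get("rope_type") == "yarn"


# Passing env explicitly keeps the check independent of the real environment.
enabled = should_apply_yarn({"rope_type": "yarn"}, {ENV_FLAG: "1"})
disabled = should_apply_yarn({"rope_type": "yarn"}, {})
```

Requiring both conditions means that setting `rope_type="yarn"` alone leaves the existing import path unchanged unless the environment flag is also set.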

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly Related PRs

Suggested Labels

documentation, CI:docs, r0.5.0

Suggested Reviewers

  • terrykong

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e95efb9 and dd8189a.

📒 Files selected for processing (6)
  • 3rdparty/Megatron-Bridge-workspace/Megatron-Bridge
  • docs/about/performance-summary.md
  • examples/configs/recipes/llm/performance/grpo-qwen3-30ba3b-4n8g-128K.yaml
  • nemo_rl/models/megatron/community_import.py
  • tests/test_suites/llm/performance/grpo-qwen3-30ba3b-4n8g-128K.sh
  • tests/test_suites/performance_h100.txt
