docs(optimizer): Add Muon post-training support #1848

Merged
terrykong merged 12 commits into main from ashors/muon-main on Mar 17, 2026

@ashors1 (Contributor) commented Jan 29, 2026

What does this PR do?

NOTE: blocked by #1787

W&B report with latest experiments (all using adam-pretrained base models): https://wandb.ai/nvidia/ashors-muon/reports/Muon-for-Post-Training--VmlldzoxNTAzMzcwMA

Issues

List issues that this PR closes (syntax):

Usage

  • You can potentially add a usage example below
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"

Pre checks:

  • Make sure you read and followed Contributor guidelines
  • Did you write any new necessary tests?
  • Did you run the unit tests and functional tests locally? Visit our Testing Guide for how to run tests
  • Did you add or update any necessary documentation? Visit our Document Development Guide for how to write, build and test the docs.

Additional Information

  • ...

Summary by CodeRabbit

  • Documentation

    • Added comprehensive guide for Muon optimizer usage with NeMo RL, including prerequisites, configuration examples, and experimental results.
    • Updated documentation navigation to include the new Muon optimizer guide.
  • Chores

    • Added emerging-optimizers dependency.

Signed-off-by: ashors1 <ashors@nvidia.com>
github-actions bot added the Documentation label (Improvements or additions to documentation) on Jan 29, 2026
Signed-off-by: Anna Shors <ashors@nvidia.com>
@adityavavreNVDA changed the title from "Add Muon post-training support" to "feat(optimizer): Add Muon post-training support" on Feb 27, 2026
@adityavavreNVDA changed the title from "feat(optimizer): Add Muon post-training support" to "docs(optimizer): Add Muon post-training support" on Feb 27, 2026
@adityavavreNVDA marked this pull request as ready for review on March 3, 2026 23:20
@adityavavreNVDA requested review from a team as code owners on March 3, 2026 23:20
coderabbitai bot (Contributor) commented Mar 3, 2026

📝 Walkthrough

This pull request introduces documentation for the Muon optimizer integration with NeMo RL, including a comprehensive guide with configuration examples, and adds the emerging-optimizers dependency to support the feature. No code changes or public API modifications are included.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Documentation | docs/guides/muon-optimizer.md, docs/index.md | New guide documenting Muon optimizer usage, configuration options, and experimental results. Index updated with guide reference. |
| Dependencies | pyproject.toml | Added emerging-optimizers==0.1.0 dependency to mcore requirements. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Test Results For Major Changes | ⚠️ Warning | PR adds a major change (new dependency emerging-optimizers) but lacks test results documentation in the PR description. A W&B link is provided externally, but no numerical metrics are included. Critical blocker: the dependency version doesn't exist on PyPI. | Add explicit test results and metrics to the PR description; replace the invalid dependency version emerging-optimizers==0.1.0 with a valid PyPI version; address the other documentation issues in the review comments. |
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding documentation for Muon optimizer post-training support. It is concise, specific, and clearly identifies the primary contribution of the changeset. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


coderabbitai bot (Contributor) left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/guides/muon-optimizer.md`:
- Line 91: Update the two fenced command blocks that currently lack language
identifiers so they pass MD040 linting: add "bash" after the opening backticks
for the block containing the command "uv run examples/run_sft.py" and likewise
add "bash" for the block containing "uv run examples/run_grpo_math.py" (these
are the two examples referenced around the comment). Ensure both opening fences
read ```bash so the markdown linter recognizes them as shell/command examples.
- Around line 22-23: The example enables both Megatron and DTensor, which
contradicts the "Megatron-only" statement; update the example so only Megatron
is enabled by either removing the policy.dtensor_cfg.enabled line or explicitly
setting policy.dtensor_cfg.enabled=false, and ensure
policy.megatron_cfg.enabled=true remains; reference the config keys
policy.megatron_cfg.enabled and policy.dtensor_cfg.enabled and adjust the
surrounding text to reflect that DTensor is disabled in the Megatron-only
example.
- Line 92: The SFT example command block is not copy/paste runnable: add a
trailing backslash to the end of the "uv run examples/run_sft.py" line so the
shell sees the next lines as continuations, and split the merged config
arguments so "policy.tokenizer.name=Qwen/Qwen3-235B-A22B" and
"checkpointing.enabled=True" are on separate lines (they currently appear merged
on the same line), ensuring each config flag is its own continued line in the
examples/run_sft.py invocation.
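
To make the first finding above concrete, here is a minimal Python sketch of the kind of check that markdownlint's MD040 rule performs: it reports opening code fences that carry no language identifier. The function name `fences_missing_language` and the sample text are illustrative assumptions, not part of NeMo RL or markdownlint, and the fence handling is simplified relative to the full CommonMark rules.

```python
import re

# A fence line: up to three spaces of indent, then 3+ backticks or tildes,
# followed by an optional info string (the language identifier).
FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")


def fences_missing_language(markdown_text: str) -> list[int]:
    """Return 1-based line numbers of opening fences with no language identifier."""
    missing = []
    open_fence = None  # marker that opened the current block, or None if outside a block
    for lineno, line in enumerate(markdown_text.splitlines(), start=1):
        match = FENCE.match(line)
        if not match:
            continue
        marker, info = match.group(1), match.group(2).strip()
        if open_fence is None:
            # Opening fence: remember it and flag an empty info string.
            open_fence = marker
            if not info:
                missing.append(lineno)
        elif marker[0] == open_fence[0] and len(marker) >= len(open_fence) and not info:
            # Closing fence: same character, at least as long, no info string.
            open_fence = None
    return missing


# Hypothetical sample: the first block has no language, the second uses "bash".
tick = "`" * 3
sample = "\n".join([
    "Intro",
    tick,
    "uv run examples/run_sft.py",
    tick,
    "",
    tick + "bash",
    "uv run examples/run_grpo_math.py",
    tick,
])
print(fences_missing_language(sample))  # [2]
```

Running a check like this over docs/guides/muon-optimizer.md before pushing would surface the same fences the review flags; the actual lint configuration used by the repository may differ.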

In `@pyproject.toml`:
- Line 119: Remove or replace the invalid dependency pin
"emerging-optimizers==0.1.0" in pyproject.toml: either remove the line entirely
or change it to a valid installable spec (e.g., a released PyPI version if
available or a VCS URL like a git+https://...@<commit_or_tag> for the
Emerging-Optimizers repo). Update the dependency entry that currently reads
emerging-optimizers==0.1.0 so the package installer can resolve it during the uv
sync --extra mcore step.
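
One way to verify a claim like "this pinned version does not exist on PyPI" is to query the PyPI JSON API, which lists a project's released versions under its `releases` key. The sketch below is an assumption-laden illustration: the helper names `fetch_pypi_metadata` and `is_released` are hypothetical, and the offline sample metadata is made up rather than taken from PyPI.

```python
import json
import urllib.request


def fetch_pypi_metadata(project: str) -> dict:
    """Download a project's metadata from the PyPI JSON API (needs network access)."""
    url = f"https://pypi.org/pypi/{project}/json"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.load(response)


def is_released(metadata: dict, version: str) -> bool:
    """True if `version` is listed under "releases" with at least one uploaded file."""
    return bool(metadata.get("releases", {}).get(version))


# Offline illustration with made-up metadata (not real PyPI data).
fake_metadata = {
    "releases": {
        "1.0.0": [{"filename": "example_pkg-1.0.0.tar.gz"}],
        "2.0.0": [],
    }
}
print(is_released(fake_metadata, "1.0.0"))  # True
print(is_released(fake_metadata, "0.1.0"))  # False: version not listed
print(is_released(fake_metadata, "2.0.0"))  # False: listed but no files uploaded

# With network access, the pin from the review could be checked like this:
# is_released(fetch_pypi_metadata("emerging-optimizers"), "0.1.0")
```

If the project itself is not published on PyPI, `urllib.request.urlopen` raises an `HTTPError` (404) instead of returning metadata, so a caller would want to treat that case as "not installable from PyPI" as well.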

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a426896 and 39e1439.

⛔ Files ignored due to path filters (4)
  • docs/assets/muon-dapo-reward.png is excluded by !**/*.png
  • docs/assets/muon-dapo-val-acc.png is excluded by !**/*.png
  • docs/assets/muon-sft-comparison.png is excluded by !**/*.png
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
  • docs/guides/muon-optimizer.md
  • docs/index.md
  • pyproject.toml

Comment thread docs/guides/muon-optimizer.md Outdated
Comment thread docs/guides/muon-optimizer.md Outdated
Comment thread docs/guides/muon-optimizer.md Outdated
Comment thread pyproject.toml
Signed-off-by: Aditya Vavre <avavre@nvidia.com>
@terrykong added the CI:L0 label (Run doctests and unit tests) on Mar 16, 2026
copy-pr-bot bot commented Mar 16, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@terrykong (Collaborator) commented:

/ok to test c41ca0a

@terrykong enabled auto-merge (squash) March 16, 2026 17:30
@terrykong merged commit 49d0d6d into main Mar 17, 2026 (47 of 49 checks passed)
@terrykong deleted the ashors/muon-main branch March 17, 2026 02:51
nbasyl pushed a commit that referenced this pull request Mar 17, 2026
Signed-off-by: ashors1 <ashors@nvidia.com>
Signed-off-by: Anna Shors <ashors@nvidia.com>
Signed-off-by: Aditya Vavre <avavre@nvidia.com>
Co-authored-by: adityavavreNVDA <avavre@nvidia.com>
Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
Signed-off-by: Shih-Yang Liu <shihyangl@nvidia.com>
nbasyl pushed a commit that referenced this pull request Mar 18, 2026
Signed-off-by: ashors1 <ashors@nvidia.com>
Signed-off-by: Anna Shors <ashors@nvidia.com>
Signed-off-by: Aditya Vavre <avavre@nvidia.com>
Co-authored-by: adityavavreNVDA <avavre@nvidia.com>
Co-authored-by: Terry Kong <terrycurtiskong@gmail.com>
Signed-off-by: Shih-Yang Liu <shihyangl@nvidia.com>
@anwithk anwithk added this to the v0.6 Release milestone Mar 20, 2026
@terrykong (Collaborator) commented:

@qiaochuz-nv to QA

Labels

CI:L0 (Run doctests and unit tests), Documentation (Improvements or additions to documentation), QA:Verified

5 participants