docs(optimizer): Add Muon post-training support #1848
Conversation
Signed-off-by: ashors1 <ashors@nvidia.com>
Signed-off-by: Anna Shors <ashors@nvidia.com>
📝 Walkthrough
This pull request introduces documentation for the Muon optimizer integration with NeMo RL, including a comprehensive guide with configuration examples, and adds the emerging-optimizers dependency to pyproject.toml.

Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
🚥 Pre-merge checks: ❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/guides/muon-optimizer.md`:
- Line 91: Update the two fenced command blocks that currently lack language
identifiers so they pass MD040 linting: add "bash" after the opening backticks
for the block containing the command "uv run examples/run_sft.py" and likewise
add "bash" for the block containing "uv run examples/run_grpo_math.py" (these
are the two examples referenced around the comment). Ensure both opening fences
read ```bash so the markdown linter recognizes them as shell/command examples.
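As a sketch of the suggested fix (the command's full argument list is elided here), each opening fence would gain a language identifier so MD040 passes:

````markdown
```bash
uv run examples/run_sft.py
```
````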
- Lines 22-23: The example enables both Megatron and DTensor, which
contradicts the "Megatron-only" statement; update the example so only Megatron
is enabled by either removing the policy.dtensor_cfg.enabled line or explicitly
setting policy.dtensor_cfg.enabled=false, and ensure
policy.megatron_cfg.enabled=true remains; reference the config keys
policy.megatron_cfg.enabled and policy.dtensor_cfg.enabled and adjust the
surrounding text to reflect that DTensor is disabled in the Megatron-only
example.
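Applied to the example under discussion, the Megatron-only overrides might look like the following sketch (other flags from the guide are omitted for brevity):

```bash
uv run examples/run_sft.py \
  policy.megatron_cfg.enabled=true \
  policy.dtensor_cfg.enabled=false
```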
- Line 92: The SFT example command block is not copy/paste runnable: add a
trailing backslash to the end of the "uv run examples/run_sft.py" line so the
shell sees the next lines as continuations, and split the merged config
arguments so "policy.tokenizer.name=Qwen/Qwen3-235B-A22B" and
"checkpointing.enabled=True" are on separate lines (they currently appear merged
on the same line), ensuring each config flag is its own continued line in the
examples/run_sft.py invocation.
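Putting the suggestion together, a copy/paste-runnable shape for the invocation might be (remaining flags omitted; treat this as a sketch rather than the exact command from the guide):

```bash
uv run examples/run_sft.py \
  policy.tokenizer.name=Qwen/Qwen3-235B-A22B \
  checkpointing.enabled=True
```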
In `@pyproject.toml`:
- Line 119: Remove or replace the invalid dependency pin
"emerging-optimizers==0.1.0" in pyproject.toml: either remove the line entirely
or change it to a valid installable spec (e.g., a released PyPI version if
available or a VCS URL like a git+https://...@<commit_or_tag> for the
Emerging-Optimizers repo). Update the dependency entry that currently reads
emerging-optimizers==0.1.0 so the package installer can resolve it during the uv
sync --extra mcore step.
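For illustration, a resolvable VCS-style pin could take the following shape in pyproject.toml; the repository URL and the `<commit_or_tag>` placeholder are assumptions for the maintainers to confirm:

```toml
[project]
dependencies = [
    # Assumed repository location; pin to a released tag or known-good commit
    # so that `uv sync --extra mcore` can resolve the package.
    "emerging-optimizers @ git+https://github.com/NVIDIA/Emerging-Optimizers.git@<commit_or_tag>",
]
```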
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (4)
- docs/assets/muon-dapo-reward.png is excluded by !**/*.png
- docs/assets/muon-dapo-val-acc.png is excluded by !**/*.png
- docs/assets/muon-sft-comparison.png is excluded by !**/*.png
- uv.lock is excluded by !**/*.lock
📒 Files selected for processing (3)
- docs/guides/muon-optimizer.md
- docs/index.md
- pyproject.toml
Signed-off-by: Aditya Vavre <avavre@nvidia.com>
/ok to test c41ca0a
Signed-off-by: ashors1 <ashors@nvidia.com> Signed-off-by: Anna Shors <ashors@nvidia.com> Signed-off-by: Aditya Vavre <avavre@nvidia.com> Co-authored-by: adityavavreNVDA <avavre@nvidia.com> Co-authored-by: Terry Kong <terrycurtiskong@gmail.com> Signed-off-by: Shih-Yang Liu <shihyangl@nvidia.com>
@qiaochuz-nv to QA
What does this PR do?
NOTE: blocked by #1787
W&B report with latest experiments (all using adam-pretrained base models): https://wandb.ai/nvidia/ashors-muon/reports/Muon-for-Post-Training--VmlldzoxNTAzMzcwMA
Issues
List issues that this PR closes (syntax):
Usage
# Add a code snippet demonstrating how to use this

Before your PR is "Ready for review"
Pre checks:
Additional Information
Summary by CodeRabbit

Documentation
- Added a guide for using the Muon optimizer with NeMo RL, including configuration examples.

Chores
- Added the emerging-optimizers dependency.