Docs: NeMo RL Documentation Review #1282

Closed
jgerh wants to merge 53 commits into main from jgerhold/docs-refactor-staging


jgerh (Contributor) commented Oct 6, 2025

What does this PR do?

This is a comprehensive end-to-end review of the NeMo RL documentation refactor project. We’re sharing the full branch so you can view the final state of the documentation, with all links intact, and preview the content as needed.

Important: These doc drafts are AI-assisted and use the source code and existing documentation to curate the new content. If you see something odd, chances are it is incorrect. Be sure to flag any issues you find in your review comments.

Review Content

Please assess the documentation’s overall structure and organization for clarity and usability, and review each file thoroughly to ensure accuracy and completeness.

  • about
  • api-docs
  • core-design
  • development
  • get-started
  • guides
  • learning-resources

Preview Docs

  1. Change into the repository root (cd).
  2. Run make docs-env.
  3. Source the environment that step 2 creates (the command varies by OS).
  4. Run make docs-live.
  5. Once the build finishes, open the docs at http://127.0.0.1:8000/.

Summary by CodeRabbit

  • New Features

    • Enhanced docs search with rich UI, filtering, and optional AI-assisted answers.
    • JSON search index generation for faster, smarter queries.
    • Content gating to build GA/EA/internal doc variants.
    • Makefile targets to build, live-serve, and publish docs.
    • New Docker pipelines, including an NVIDIA PyTorch-based image.
  • Documentation

    • Major docs overhaul: getting started, architecture, design principles, development (testing, debugging, profiling, FP8), and backend guidance.
    • README and Docker docs updated with clearer workflows and examples.
  • Chores

    • Updated pre-commit hooks and type checks.
    • CI/coverage policy tweaked (patch coverage thresholds).
    • Contributor workflow clarified (fork-based).
    • Dependency/submodule and ignore list updates.
    • Added code review configuration and coding guidelines.

jgerh added 30 commits July 1, 2025 12:29
Signed-off-by: jgerh <jgerhold@nvidia.com>
jgerh requested review from a team as code owners October 6, 2025 17:59
jgerh removed request for a team October 6, 2025 18:02
jgerh closed this Oct 6, 2025

coderabbitai Bot (Contributor) commented Oct 6, 2025

Caution

Review failed

The pull request is closed.

📝 Walkthrough

Adds extensive docs infrastructure (Sphinx config, Makefile, custom extensions for AI Assistant, enhanced search, JSON output, content gating), major documentation reorganization/additions/removals, Dockerfiles (new multi-stage builds, vLLM build, hermetic/release flows), repo/config updates (.gitignore, .dockerignore, codecov, pre-commit, CodeRabbit), and submodule/workspace adjustments (Megatron Bridge added; NeMo packaging removed).

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Repo config and automation: `/.coderabbit.yaml`, `/.pre-commit-config.yaml`, `/codecov.yml` | Adds CodeRabbit config; updates pre-commit (type-based filters, new hooks for typecheck and config checks); enables patch coverage policy. |
| Ignore and modules: `/.dockerignore`, `/.gitignore`, `/.gitmodules` | Stops ignoring .git in Docker context; expands ignores; removes NeMo submodule, updates Megatron-LM branch, adds Megatron-Bridge and Automodel submodules. |
| Docs build system: `/Makefile`, `/docs/conf.py`, `/docs/BUILD_INSTRUCTIONS.md`, `/docs/README.md` | Introduces Makefile targets for docs; significantly expands Sphinx config (themes, transforms, tags); adds build instructions and docs template README. |
| Docker build pipeline: `/docker/Dockerfile`, `/docker/Dockerfile.ngc_pytorch`, `/docker/README.md` | Adds multi-stage builds sourcing repo by ref; configures UV-based installs, CUDA/flash-attn, vLLM build, hermetic/release stages; updates image type descriptions. |
| Megatron Bridge workspace: `/3rdparty/Megatron-Bridge-workspace/*` | Adds standalone packaging (pyproject, setup), installation probe script (imports megatron.bridge.AutoBridge). |
| Megatron/NeMo workspaces: `/3rdparty/Megatron-LM-workspace/*`, `/3rdparty/NeMo-workspace/*` | Sets setuptools backend for Megatron-LM; removes NeMo workspace packaging (pyproject, setup); updates subproject commit markers. |
| Top-level docs content reorg: `/README.md`, `/CONTRIBUTING.md`, `/CODING_GUIDELINES.md` | Expands README; revises contributor workflows; adds comprehensive coding guidelines. |
| Docs: About & Core Design: `/docs/about/*`, `/docs/core-design/**` | Adds architecture, key features, purpose; core design indices and deep-dive docs (design principles, generation, loss functions, data mgmt, computational systems, checkpointing, env vars). |
| Docs: Development guides: `/docs/development/*`, `/docs/get-started/*` | Adds development index, debugging, FP8, testing, nsys profiling updates; new Docker and cluster guides; getting-started index. |
| Docs removals/moves: `docs/adding-new-models.md`, `docs/debugging.md`, `docs/docker.md`, `docs/documentation.md`, `docs/design-docs/*` | Removes or relocates legacy docs (adding-new-models, debugging, Docker, documentation, older design-docs). |
| Sphinx extension: AI Assistant: `/docs/_extensions/ai_assistant/**`, `/docs/_extensions/ai_assistant/assets/styles/ai-assistant.css` | Adds AI client, response processing/rendering, markdown processor, main orchestrator, and search integration JS; includes CSS assets and extension setup. |
| Sphinx extension: Enhanced Search: `/docs/_extensions/search_assets/**` | Adds enhanced client-side search (loader, engine with Lunr, UI, page manager), bundling, assets, template, and CSS. |
| Sphinx extension: JSON Output: `/docs/_extensions/json_output/**` | Adds JSON generation pipeline (config, content extraction, formatter, writer, hierarchy builder, caching, parallel processing, utils, setup). |
| Sphinx extension: Content Gating: `/docs/_extensions/content_gating/**` | Adds tag-based gating for documents, toctrees, and directives; condition evaluation and build-time exclusion. |
| Sphinx extension: MyST codeblock substitutions: `/docs/_extensions/myst_codeblock_substitutions.py` | Adds substitution processing for MyST inside code blocks with template-aware safeguards. |
| Docs static assets/templates: `/docs/_static/octicons.*`, `/docs/_templates/autodoc2_index.rst` | Adds Octicons CSS/JS and an API index template. |
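
To make the Content Gating entry above more concrete, here is a minimal sketch of how tag-based condition evaluation and build-time exclusion could work. This summary does not show the extension's real API, so the function names `evaluate_condition` and `filter_documents` and the `and`/`or`/`not` expression syntax are assumptions made for illustration. The tag names `ga`, `ea`, and `internal` follow the GA/EA/internal variants mentioned in the PR summary.

```python
from __future__ import annotations

import ast


def evaluate_condition(expression: str, active_tags: set[str]) -> bool:
    """Evaluate a boolean tag expression such as "ga and not internal".

    Hypothetical helper: a tag name is true when it is in ``active_tags``.
    Only names and the and/or/not operators are accepted.
    """
    tree = ast.parse(expression, mode="eval")

    def visit(node: ast.AST) -> bool:
        if isinstance(node, ast.Expression):
            return visit(node.body)
        if isinstance(node, ast.Name):
            return node.id in active_tags
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.Not):
            return not visit(node.operand)
        if isinstance(node, ast.BoolOp):
            results = [visit(value) for value in node.values]
            return all(results) if isinstance(node.op, ast.And) else any(results)
        # Anything else (calls, attributes, literals) is rejected outright.
        raise ValueError(f"unsupported syntax in condition: {expression!r}")

    return visit(tree)


def filter_documents(doc_conditions: dict[str, str], active_tags: set[str]) -> list[str]:
    """Return the docnames whose condition holds for the current build variant."""
    return [
        docname
        for docname, condition in doc_conditions.items()
        if evaluate_condition(condition, active_tags)
    ]
```

Walking the parsed expression instead of calling `eval` keeps arbitrary code out of doc metadata. That is a design choice of this sketch and not a claim about how the extension itself is written.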

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant U as User (Docs site)
  participant ES as EnhancedSearch (JS)
  participant SE as SearchEngine (Lunr)
  participant DL as DocumentLoader
  participant AI as AI Assistant

  Note over ES: On DOMContentLoaded
  ES->>DL: loadDocuments()
  DL-->>ES: documents[]
  ES->>SE: initialize(documents)
  U->>ES: type query
  ES->>SE: search(query, filters)
  SE-->>ES: results
  ES->>AI: analyzeQuery(query, results) [optional, threshold-based]
  AI-->>ES: aiResponse or error
  ES-->>U: render results (+ AI panel if available)
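
The search flow in the diagram above can be approximated in a few lines of Python. This is a sketch under stated assumptions, not the extension's code: the real engine is JavaScript built on Lunr, and a toy inverted index stands in for it here. The diagram only says the AI call is "threshold-based", so the `min_query_length` parameter is a guess at what that threshold could be.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Callable, Optional


class SearchEngine:
    """Toy inverted index standing in for the Lunr-based engine."""

    def __init__(self) -> None:
        self._index: dict[str, set[str]] = defaultdict(set)

    def initialize(self, documents: list[dict]) -> None:
        # Map each lowercase token to the ids of the documents containing it.
        for doc in documents:
            for token in doc["content"].lower().split():
                self._index[token].add(doc["id"])

    def search(self, query: str) -> list[str]:
        tokens = query.lower().split()
        if not tokens:
            return []
        # A document matches only if it contains every query token.
        matches = set.intersection(*(self._index.get(t, set()) for t in tokens))
        return sorted(matches)


def handle_query(
    engine: SearchEngine,
    query: str,
    analyze_query: Optional[Callable[[str, list[str]], str]] = None,
    min_query_length: int = 10,  # hypothetical threshold
) -> dict:
    """Run the search, then optionally ask the assistant for a summary."""
    results = engine.search(query)
    ai_response = None
    if analyze_query is not None and len(query) >= min_query_length:
        try:
            ai_response = analyze_query(query, results)
        except Exception:
            ai_response = None  # fall back to plain results if the assistant fails
    return {"results": results, "ai_response": ai_response}
```

As in the diagram, the search results are always returned, and the AI response is attached only when the assistant is configured and the call succeeds.
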
sequenceDiagram
  autonumber
  participant S as Sphinx
  participant JO as JSONOutputBuilder
  participant DD as DocumentDiscovery
  participant JF as JSONFormatter
  participant JW as JSONWriter
  participant HC as JSONOutputCache

  S->>JO: on_build_finished()
  JO->>DD: get_all_documents_recursive()
  DD-->>JO: docnames[]
  loop per document
    JO->>HC: needs_update(doc)
    alt needs update
      JO->>JF: build_json_data(doc)
      JF-->>JO: json data
      JO->>JW: write_json_file(doc, data)
      JW-->>JO: OK
      JO->>HC: mark_updated(doc)
    else
      JO-->>S: skip doc
    end
  end
  JO-->>S: done
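
The per-document loop in the second diagram is an incremental build: JSON is rewritten only for documents the cache reports as stale. Below is a minimal sketch of that pattern. The method names `needs_update`, `mark_updated`, `build_json_data`, and `write_json_file` come from the diagram. Their signatures, the SHA-256 content-hash cache, and the output layout are assumptions, and the real extension may track staleness differently.

```python
from __future__ import annotations

import hashlib
import json
from pathlib import Path


class JSONOutputCache:
    """Remembers a content hash per document (hypothetical implementation)."""

    def __init__(self) -> None:
        self._hashes: dict[str, str] = {}

    @staticmethod
    def _digest(source: str) -> str:
        return hashlib.sha256(source.encode("utf-8")).hexdigest()

    def needs_update(self, docname: str, source: str) -> bool:
        return self._hashes.get(docname) != self._digest(source)

    def mark_updated(self, docname: str, source: str) -> None:
        self._hashes[docname] = self._digest(source)


def build_json_data(docname: str, source: str) -> dict:
    # Stand-in for the formatter step; real output would carry richer metadata.
    return {"id": docname, "content": source}


def write_json_file(out_dir: Path, docname: str, data: dict) -> Path:
    target = out_dir / f"{docname}.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return target


def on_build_finished(docs: dict[str, str], out_dir: Path, cache: JSONOutputCache) -> list[str]:
    """Write JSON only for changed documents and return their docnames."""
    written = []
    for docname, source in docs.items():
        if not cache.needs_update(docname, source):
            continue  # the "skip doc" branch in the diagram
        write_json_file(out_dir, docname, build_json_data(docname, source))
        cache.mark_updated(docname, source)
        written.append(docname)
    return written
```

Calling `on_build_finished` twice with unchanged sources writes nothing the second time, which is the behavior the `alt needs update / else` branch describes.
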

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120+ minutes

Possibly related PRs

Suggested reviewers

  • terrykong
  • parthchadha
  • snowmanwwg

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7536be3 and 8393fc4.

⛔ Files ignored due to path filters (5)
  • docs/assets/fp8_curves.png is excluded by !**/*.png
  • docs/assets/fp8_e2e_curve.png is excluded by !**/*.png
  • docs/assets/nsys-multi-report-view.png is excluded by !**/*.png
  • docs/assets/train-reward-sliding-puzzle.png is excluded by !**/*.png
  • docs/assets/valid_acc-sliding-puzzle.png is excluded by !**/*.png
📒 Files selected for processing (107)
  • .coderabbit.yaml (1 hunks)
  • .dockerignore (1 hunks)
  • .gitignore (2 hunks)
  • .gitmodules (1 hunks)
  • .pre-commit-config.yaml (2 hunks)
  • 3rdparty/Megatron-Bridge-workspace/is_megatron_bridge_installed.py (1 hunks)
  • 3rdparty/Megatron-Bridge-workspace/pyproject.toml (1 hunks)
  • 3rdparty/Megatron-Bridge-workspace/setup.py (1 hunks)
  • 3rdparty/Megatron-LM-workspace/Megatron-LM (0 hunks)
  • 3rdparty/Megatron-LM-workspace/pyproject.toml (1 hunks)
  • 3rdparty/NeMo-workspace/NeMo (0 hunks)
  • 3rdparty/NeMo-workspace/pyproject.toml (0 hunks)
  • 3rdparty/NeMo-workspace/setup.py (0 hunks)
  • CODING_GUIDELINES.md (1 hunks)
  • CONTRIBUTING.md (1 hunks)
  • Makefile (1 hunks)
  • README.md (10 hunks)
  • codecov.yml (1 hunks)
  • docker/Dockerfile (4 hunks)
  • docker/Dockerfile.ngc_pytorch (1 hunks)
  • docker/README.md (1 hunks)
  • docs/BUILD_INSTRUCTIONS.md (1 hunks)
  • docs/README.md (1 hunks)
  • docs/_extensions/__init__.py (1 hunks)
  • docs/_extensions/ai_assistant/README.md (1 hunks)
  • docs/_extensions/ai_assistant/__init__.py (1 hunks)
  • docs/_extensions/ai_assistant/assets/styles/ai-assistant.css (1 hunks)
  • docs/_extensions/ai_assistant/core/AIClient.js (1 hunks)
  • docs/_extensions/ai_assistant/core/ResponseProcessor.js (1 hunks)
  • docs/_extensions/ai_assistant/core/main.js (1 hunks)
  • docs/_extensions/ai_assistant/integrations/search-integration.js (1 hunks)
  • docs/_extensions/ai_assistant/ui/MarkdownProcessor.js (1 hunks)
  • docs/_extensions/ai_assistant/ui/ResponseRenderer.js (1 hunks)
  • docs/_extensions/content_gating/README.md (1 hunks)
  • docs/_extensions/content_gating/__init__.py (1 hunks)
  • docs/_extensions/content_gating/condition_evaluator.py (1 hunks)
  • docs/_extensions/content_gating/conditional_directives.py (1 hunks)
  • docs/_extensions/content_gating/document_filter.py (1 hunks)
  • docs/_extensions/json_output/README.md (1 hunks)
  • docs/_extensions/json_output/__init__.py (1 hunks)
  • docs/_extensions/json_output/config.py (1 hunks)
  • docs/_extensions/json_output/content/__init__.py (1 hunks)
  • docs/_extensions/json_output/content/extractor.py (1 hunks)
  • docs/_extensions/json_output/content/metadata.py (1 hunks)
  • docs/_extensions/json_output/content/structured.py (1 hunks)
  • docs/_extensions/json_output/content/text.py (1 hunks)
  • docs/_extensions/json_output/core/__init__.py (1 hunks)
  • docs/_extensions/json_output/core/builder.py (1 hunks)
  • docs/_extensions/json_output/core/document_discovery.py (1 hunks)
  • docs/_extensions/json_output/core/hierarchy_builder.py (1 hunks)
  • docs/_extensions/json_output/core/json_formatter.py (1 hunks)
  • docs/_extensions/json_output/core/json_writer.py (1 hunks)
  • docs/_extensions/json_output/processing/__init__.py (1 hunks)
  • docs/_extensions/json_output/processing/cache.py (1 hunks)
  • docs/_extensions/json_output/processing/processor.py (1 hunks)
  • docs/_extensions/json_output/utils.py (1 hunks)
  • docs/_extensions/myst_codeblock_substitutions.py (1 hunks)
  • docs/_extensions/search_assets/__init__.py (1 hunks)
  • docs/_extensions/search_assets/enhanced-search.css (1 hunks)
  • docs/_extensions/search_assets/main.js (1 hunks)
  • docs/_extensions/search_assets/modules/DocumentLoader.js (1 hunks)
  • docs/_extensions/search_assets/modules/EventHandler.js (1 hunks)
  • docs/_extensions/search_assets/modules/ResultRenderer.js (1 hunks)
  • docs/_extensions/search_assets/modules/SearchEngine.js (1 hunks)
  • docs/_extensions/search_assets/modules/SearchInterface.js (1 hunks)
  • docs/_extensions/search_assets/modules/SearchPageManager.js (1 hunks)
  • docs/_extensions/search_assets/modules/Utils.js (1 hunks)
  • docs/_extensions/search_assets/templates/search.html (1 hunks)
  • docs/_static/octicons.css (1 hunks)
  • docs/_static/octicons.js (1 hunks)
  • docs/_templates/autodoc2_index.rst (1 hunks)
  • docs/about/architecture-overview.md (1 hunks)
  • docs/about/index.md (1 hunks)
  • docs/about/key-features.md (1 hunks)
  • docs/about/purpose.md (1 hunks)
  • docs/adding-new-models.md (0 hunks)
  • docs/conf.py (5 hunks)
  • docs/core-design/computational-systems/index.md (1 hunks)
  • docs/core-design/computational-systems/logger.md (5 hunks)
  • docs/core-design/computational-systems/training-backends.md (1 hunks)
  • docs/core-design/data-management/index.md (1 hunks)
  • docs/core-design/data-management/sequence-packing-and-dynamic-batching.md (1 hunks)
  • docs/core-design/design-principles/design-and-philosophy.md (3 hunks)
  • docs/core-design/design-principles/fsdp2-parallel-plan.md (1 hunks)
  • docs/core-design/design-principles/generation.md (1 hunks)
  • docs/core-design/design-principles/index.md (1 hunks)
  • docs/core-design/design-principles/loss-functions.md (1 hunks)
  • docs/core-design/development-infrastructure/checkpointing.md (1 hunks)
  • docs/core-design/development-infrastructure/env-vars.md (1 hunks)
  • docs/core-design/development-infrastructure/index.md (1 hunks)
  • docs/core-design/development-infrastructure/uv.md (2 hunks)
  • docs/core-design/index.md (1 hunks)
  • docs/debugging.md (0 hunks)
  • docs/design-docs/generation.md (0 hunks)
  • docs/design-docs/loss-functions.md (0 hunks)
  • docs/design-docs/training-backends.md (0 hunks)
  • docs/development/debugging.md (1 hunks)
  • docs/development/fp8.md (1 hunks)
  • docs/development/index.md (1 hunks)
  • docs/development/nsys-profiling.md (1 hunks)
  • docs/development/testing.md (5 hunks)
  • docs/development/use-custom-vllm.md (1 hunks)
  • docs/docker.md (0 hunks)
  • docs/documentation.md (0 hunks)
  • docs/get-started/cluster.md (3 hunks)
  • docs/get-started/docker.md (1 hunks)
  • docs/get-started/index.md (1 hunks)
⛔ Files not processed due to max files limit (16)
  • docs/get-started/installation.md
  • docs/get-started/local-workstation.md
  • docs/get-started/model-selection.md
  • docs/get-started/quickstart.md
  • docs/guides/environment-data/environment-development.md
  • docs/guides/environment-data/environments.md
  • docs/guides/environment-data/index.md
  • docs/guides/index.md
  • docs/guides/model-development/add-new-models.md
  • docs/guides/model-development/deepseek.md
  • docs/guides/model-development/index.md
  • docs/guides/model-development/model-quirks.md
  • docs/guides/training-algorithms/async-grpo.md
  • docs/guides/training-algorithms/distillation.md
  • docs/guides/training-algorithms/dpo.md
  • docs/guides/training-algorithms/eval.md


jgerh reopened this Oct 6, 2025
jgerh closed this Oct 6, 2025

jgerh (Contributor, Author) commented Oct 6, 2025

Closing for re-evaluation

Labels

Documentation Improvements or additions to documentation
