fix: Fix process_weights_after_loading for fp8 dense #1432
terrykong merged 10 commits into NVIDIA-NeMo:main
Signed-off-by: Guyue Huang <guyueh@nvidia.com>
terrykong left a comment
Small comment: could you also confirm that the fp8 rollout test runs after this fix?
📝 Walkthrough
Two changes added: logging for the
Changes
Sequence Diagram(s)
sequenceDiagram
participant Code as Model Code
participant Load as load_weights()
participant Extract as _create_param_from_subclass_attributes()
participant Strategy as process_fp8_weight_block_strategy()
participant PostProc as maybe_post_process_fp8_weight_block()
participant Layer as Layer Attributes
Code->>Load: load weights with "_scale" key
Load->>Extract: extract layer attributes
Extract->>Layer: retrieve weight_scale_inv or weight_scale
Extract->>Strategy: process (layer.weight, weight_scale)
Strategy-->>Extract: return (updated_weight, updated_scale)
Extract->>Extract: create ModelWeightParameter with updated_weight.data
Extract->>PostProc: post-process layer
PostProc->>Layer: update layer state
Extract-->>Code: return processed parameter
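The flow in the diagram above can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in nemo_rl/models/generation/fp8.py: the `Layer` class and both `*_stub` functions are hypothetical stand-ins (the signatures of the real `process_fp8_weight_block_strategy` and `maybe_post_process_fp8_weight_block` are not shown in this conversation), and plain lists stand in for tensors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Layer:
    """Hypothetical stand-in for a quantized layer; lists replace tensors."""

    weight: list
    weight_scale_inv: Optional[list] = None
    weight_scale: Optional[list] = None
    post_processed: bool = False


def process_weight_block_stub(weight, scale):
    """Stand-in for process_fp8_weight_block_strategy; returns copies unchanged."""
    return list(weight), list(scale)


def post_process_stub(layer):
    """Stand-in for maybe_post_process_fp8_weight_block; only sets a flag."""
    layer.post_processed = True


def process_layer(layer):
    """Mirrors the diagram: pick a scale, process, rebuild, post-process."""
    # Retrieve weight_scale_inv if the layer has it, otherwise weight_scale.
    if layer.weight_scale_inv is not None:
        scale = layer.weight_scale_inv
    else:
        scale = layer.weight_scale
    if scale is None:
        raise ValueError("layer has neither weight_scale_inv nor weight_scale")

    updated_weight, updated_scale = process_weight_block_stub(layer.weight, scale)

    # The diagram wraps updated_weight.data in a ModelWeightParameter here;
    # this sketch simply stores the processed values back on the layer.
    layer.weight = updated_weight
    layer.weight_scale = updated_scale

    post_process_stub(layer)
    return layer
```

A caller would construct a `Layer`, pass it to `process_layer`, and then read the processed `weight` and `weight_scale` back from the returned object.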
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches
❌ Failed checks (2 warnings)
✅ Passed checks (2 passed)
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
nemo_rl/algorithms/grpo.py (1 hunk)
nemo_rl/models/generation/fp8.py (3 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.py: Follow the Google Python Style Guide for all Python code
Target Python 3.12+ for all Python code in NeMo-RL
Indent Python code with 4 spaces; do not use tabs
Python filenames should be snake_case (e.g., some_file.py)
Class names should be PascalCase
Function and method names should be snake_case
Local variable names should be snake_case; if starting with a number, prefix with k (e.g., k_99th_percentile)
Global variables should be UPPER_SNAKE_CASE and prefixed with G_ (e.g., G_MY_GLOBAL)
Constants should be UPPER_SNAKE_CASE
Avoid shadowing variables declared in an outer scope
Initialize all externally visible members of a class in the constructor
For public interfaces used outside a file, prefer docstrings over comments
Use comments mainly for code within a function or interfaces local to a file
Commented-out code must include a nearby comment explaining usage and why it is commented out; otherwise remove before merging
Use Google-style docstrings for classes and functions (Sphinx-parseable)
Avoid using reflection when functionality can be easily achieved without it
Limit except clauses to the smallest specific set of exceptions possible
For duck-typing via try/except, keep the try body minimal and use else for main logic
Add the NVIDIA copyright header (with current year) at the top of all Python files, excluding tests/ and test-only scripts
Files:
nemo_rl/algorithms/grpo.py
nemo_rl/models/generation/fp8.py
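The try/except guidance in the list above (smallest set of exceptions, minimal try body, main logic in else) together with the Google-style docstring rule might look like the following in practice. The function name `element_count` is hypothetical and does not come from NeMo-RL.

```python
def element_count(obj: object) -> int:
    """Returns the number of elements in obj, or 1 if obj has no length.

    Args:
        obj: Any object. Sized containers report their length.

    Returns:
        len(obj) for sized objects, otherwise 1.
    """
    try:
        # Only the call that may raise lives inside the try body.
        size = len(obj)
    except TypeError:
        # len() raises TypeError for objects without __len__.
        return 1
    else:
        # Main logic runs only when the duck-typing check succeeded.
        return size
```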
nemo_rl/**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
nemo_rl/**/*.py: Do not set non-None configuration defaults in code; YAML is the single source of truth for defaults
Access required config attributes directly (e.g., policy_cfg["precision"]) and assume presence; do not introduce hidden defaults
Express configuration optionality via TypedDict using typing.NotRequired
When adding a new config key to a TypedDict subclass, document the key’s purpose, valid values/types, and recommended default in code
For any class or function decorated with @ray.remote, add '# pragma: no cover' on the class/def line (and on remote functions)
Files:
nemo_rl/algorithms/grpo.py
nemo_rl/models/generation/fp8.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Lint check
- GitHub Check: Post automodel integration comment / Comment on PR
- GitHub Check: Post submodule check comment / Comment on PR
🔇 Additional comments (3)
nemo_rl/algorithms/grpo.py (1)
1279-1279: Clarify relevance to PR objective.
This logging addition appears unrelated to the PR's stated purpose of fixing FP8 weight loading after the vllm 0.11.0 upgrade. Additionally, there's a past review comment suggesting gen_kl_error should be logged instead.
Consider either:
- Moving this change to a separate PR focused on GRPO logging improvements
- Updating the PR description to explain why this logging change is included
- Addressing the past review comment about using gen_kl_error
nemo_rl/models/generation/fp8.py (2)
304-304: LGTM: Key naming updated for vllm 0.11.0 compatibility.
The change from _scale_inv to _scale suffix correctly aligns with vllm 0.11.0's FP8 parameter naming convention.
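To make the suffix change concrete, here is a minimal sketch of how a scale key could be derived from a weight key and how an old-style key could be renamed. The helper names and the example key are hypothetical; the comment above only states that the suffix moved from _scale_inv to _scale, not how fp8.py builds its keys.

```python
# Hypothetical constants and helpers, for illustration only.
OLD_SCALE_SUFFIX = "_scale_inv"
NEW_SCALE_SUFFIX = "_scale"


def scale_key_for(weight_key: str, suffix: str = NEW_SCALE_SUFFIX) -> str:
    """Returns the name of the scale tensor that accompanies a weight tensor."""
    return weight_key + suffix


def migrate_scale_key(key: str) -> str:
    """Renames a key ending in _scale_inv to the _scale form.

    Keys without the old suffix are returned unchanged.
    """
    if key.endswith(OLD_SCALE_SUFFIX):
        return key[: -len(OLD_SCALE_SUFFIX)] + NEW_SCALE_SUFFIX
    return key
```

For a key such as `layers.0.mlp.down_proj.weight` (an invented example), `scale_key_for` appends the new suffix, while `migrate_scale_key` rewrites only keys that still carry the old one.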
394-397: LGTM: FP8 utility imports added.
The imported functions (maybe_post_process_fp8_weight_block, process_fp8_weight_block_strategy) are vllm 0.11.0 utilities that enable proper post-processing of FP8 weight blocks.
…-RL into fix_fp8_rollout_dense
Signed-off-by: Guyue Huang <guyueh@nvidia.com> Signed-off-by: Zhaopeng Qiu <alexq@nvidia.com>
Signed-off-by: Guyue Huang <guyueh@nvidia.com> Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
What does this PR do?
Fix process_weights_after_loading for fp8 dense after bumping vllm to 0.11.0