perf: Add a field in megatron_cfg to enable bias_activation_fusion #1241
terrykong merged 5 commits into NVIDIA-NeMo:main
Conversation
📝 Walkthrough
Added a new bias_activation_fusion option to the SFT config and propagated it into the Megatron model configuration during policy worker initialization, alongside the existing apply_rope_fusion setting. No control-flow changes.
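Based on the walkthrough, the propagation step presumably looks something like the sketch below. The function and dict-style access are illustrative only, not the actual worker code:

```python
def apply_fusion_flags(model_cfg: dict, megatron_cfg: dict) -> dict:
    """Copy fusion-related flags from the policy's megatron_cfg onto the
    Megatron model configuration (hypothetical helper, names are assumed)."""
    for key in ("apply_rope_fusion", "bias_activation_fusion"):
        # Read required keys directly, with no hidden defaults in code:
        # the YAML config is the single source of truth.
        model_cfg[key] = megatron_cfg[key]
    return model_cfg

cfg = apply_fusion_flags(
    {}, {"apply_rope_fusion": True, "bias_activation_fusion": True}
)
print(cfg["bias_activation_fusion"])  # -> True
```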
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
Actionable comments posted: 1
🧹 Nitpick comments (1)
examples/configs/sft.yaml (1)
93-94: Improve the comment to document the performance benefit and dependencies.
The comment should be more descriptive to match the detail provided for apply_rope_fusion above. Per the PR description, this flag provides ~25% speedup when used together with apply_rope_fusion=True and sequence_parallel=True.
Consider updating the comment:

- # gives training perf speedup
- bias_activation_fusion: True
+ # Gives ~25% training perf speedup when combined with apply_rope_fusion=True and sequence_parallel=True
+ bias_activation_fusion: True

As per coding guidelines.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
examples/configs/sft.yaml (1 hunks)
nemo_rl/models/policy/megatron_policy_worker.py (1 hunks)
🧰 Additional context used
📓 Path-based instructions (3)
examples/configs/*.yaml
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
examples/configs/*.yaml: Exemplar configs under examples/configs/*.yaml must include documented defaults
When adding a new config key, reflect its recommended default in exemplar YAMLs under examples/configs/*.yaml
Files:
examples/configs/sft.yaml
**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
**/*.py: Follow the Google Python Style Guide for all Python code
Target Python 3.12+ for all Python code in NeMo-RL
Indent Python code with 4 spaces; do not use tabs
Python filenames should be snake_case (e.g., some_file.py)
Class names should be PascalCase
Function and method names should be snake_case
Local variable names should be snake_case; if starting with a number, prefix with k (e.g., k_99th_percentile)
Global variables should be UPPER_SNAKE_CASE and prefixed with G_ (e.g., G_MY_GLOBAL)
Constants should be UPPER_SNAKE_CASE
Avoid shadowing variables declared in an outer scope
Initialize all externally visible members of a class in the constructor
For public interfaces used outside a file, prefer docstrings over comments
Use comments mainly for code within a function or interfaces local to a file
Commented-out code must include a nearby comment explaining usage and why it is commented out; otherwise remove before merging
Use Google-style docstrings for classes and functions (Sphinx-parseable)
Avoid using reflection when functionality can be easily achieved without it
Limit except clauses to the smallest specific set of exceptions possible
For duck-typing via try/except, keep the try body minimal and use else for main logic
Add the NVIDIA copyright header (with current year) at the top of all Python files, excluding tests/ and test-only scripts
Files:
nemo_rl/models/policy/megatron_policy_worker.py
nemo_rl/**/*.py
📄 CodeRabbit inference engine (CODING_GUIDELINES.md)
nemo_rl/**/*.py: Do not set non-None configuration defaults in code; YAML is the single source of truth for defaults
Access required config attributes directly (e.g., policy_cfg["precision"]) and assume presence; do not introduce hidden defaults
Express configuration optionality via TypedDict using typing.NotRequired
When adding a new config key to a TypedDict subclass, document the key’s purpose, valid values/types, and recommended default in code
For any class or function decorated with @ray.remote, add '# pragma: no cover' on the class/def line (and on remote functions)
Files:
nemo_rl/models/policy/megatron_policy_worker.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Lint check
- GitHub Check: Post submodule check comment / Comment on PR
- GitHub Check: Post automodel integration comment / Comment on PR
@katec846 please also add this field to
Force-pushed 9817e88 to 55954aa
Force-pushed 55954aa to f8eb594
@terrykong @guyueh1 can you help me review this PR? @terrykong do you know how I can fix this pipeline error? I didn't see anything failing; the results are either success or skipped.
@katec846 it just means the tests weren't run (I just triggered them for you by adding the label). @chtruong814 maybe we need to give a better error message on the status check job name in this case.
Set to automerge pending @guyueh1's review.
joyang-nv
left a comment
Good stuff. I am involved as one of the automodel reviewers.
I support this change, but I don't think I am the right person to approve.
Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
Force-pushed a8a2a65 to f547c0d
…1241) Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com>
…VIDIA-NeMo#1241) Signed-off-by: Kate Cheng <yunhsuanc@nvidia.com> Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
What does this PR do?
Add a field in megatron_cfg to enable bias_activation_fusion.
Issues
This PR is related to #917: it will reduce the train step time.
This flag, along with policy.megatron_cfg.apply_rope_fusion=True and policy.megatron_cfg.sequence_parallel=True, can give a ~25% speedup.
Usage
or
override the params with
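The exact override command is elided above. As a pure-Python sketch of what a dotted-path CLI override like policy.megatron_cfg.bias_activation_fusion=True implies (the helper below is illustrative, not the project's actual override mechanism):

```python
def apply_override(cfg: dict, dotted_key: str, value) -> None:
    """Set a nested config value from a dotted CLI-style key
    (hypothetical helper for illustration)."""
    *parents, leaf = dotted_key.split(".")
    node = cfg
    for parent in parents:
        # Walk (or create) intermediate dicts along the dotted path.
        node = node.setdefault(parent, {})
    node[leaf] = value


cfg = {"policy": {"megatron_cfg": {"apply_rope_fusion": False}}}
apply_override(cfg, "policy.megatron_cfg.bias_activation_fusion", True)
print(cfg["policy"]["megatron_cfg"]["bias_activation_fusion"])  # -> True
```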
Before your PR is "Ready for review"
Pre checks:
Additional Information
Summary by CodeRabbit