fix: Fix Fp8 sequence padding for PP>1 case #1579
Signed-off-by: root <root@pool0-00514.cm.cluster>
Walkthrough
The changes enforce stricter padding constraints for packed sequences in Megatron models by asserting divisibility requirements, normalizing […]
Estimated code review effort: 3 (Moderate), ~20 minutes
Pre-merge checks: 1 failed (warning), 3 passed
@terrykong this PR is passing L2, but I think the FP8 code path is not tested because CI runs on A100.
Signed-off-by: root <root@pool0-00514.cm.cluster> Co-authored-by: root <root@pool0-00514.cm.cluster> Signed-off-by: NeMo Bot <nemo-bot@nvidia.com>
Signed-off-by: root <root@pool0-00514.cm.cluster> Co-authored-by: root <root@pool0-00514.cm.cluster>
Signed-off-by: root <root@pool0-00514.cm.cluster> Co-authored-by: root <root@pool0-00514.cm.cluster> Signed-off-by: Parth Mannan <pmannan@nvidia.com>
Signed-off-by: root <root@pool0-00514.cm.cluster> Co-authored-by: root <root@pool0-00514.cm.cluster> Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
What does this PR do?
This is a follow-up to #1569 that fixes the padded sequence length for the PP>1 (pipeline-parallel) case.
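For reviewers who want the arithmetic spelled out, here is a minimal sketch of the kind of divisibility-checked padding described above. It is not the code in this PR: the function names, the `per_seq_multiple` and `pack_multiple` parameters, and the idea that the PP>1 path pads every packed microbatch to one fixed length are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def round_up_to_multiple(length: int, multiple: int) -> int:
    """Smallest multiple of `multiple` that is >= `length`."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    return ((length + multiple - 1) // multiple) * multiple


def padded_pack_length(
    seq_lengths: List[int],
    per_seq_multiple: int,
    pack_multiple: int,
    fixed_length: Optional[int] = None,
) -> Tuple[List[int], int]:
    """Hypothetical helper: pad each sequence, then the whole pack.

    Each sequence is padded to `per_seq_multiple`. The pack total is then
    padded to `pack_multiple`. If `fixed_length` is given (assumed here to
    model PP>1, where every microbatch shares one shape), it must itself
    satisfy the divisibility constraint and be large enough for the pack.
    """
    padded = [round_up_to_multiple(n, per_seq_multiple) for n in seq_lengths]
    total = sum(padded)
    if fixed_length is not None:
        # Fail loudly instead of silently producing a misaligned length.
        assert fixed_length % pack_multiple == 0, (
            f"fixed_length={fixed_length} is not divisible by {pack_multiple}"
        )
        assert total <= fixed_length, (
            f"packed length {total} exceeds fixed_length={fixed_length}"
        )
        return padded, fixed_length
    return padded, round_up_to_multiple(total, pack_multiple)


# Variable-length case: the total is rounded up to the pack multiple.
print(padded_pack_length([5, 11, 3], per_seq_multiple=4, pack_multiple=16))
# Fixed-length case: the configured length is validated and used as-is.
print(padded_pack_length([5, 11, 3], 4, 16, fixed_length=64))
```

The point of the assertion is that a fixed length which is not a multiple of the required alignment is rejected up front instead of surfacing later as a shape error in the FP8 path.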
Issues
List issues that this PR closes (syntax):
Usage
# Add a code snippet demonstrating how to use this
Before your PR is "Ready for review"
Pre checks:
Additional Information