feat: Update mbridge with cache support #1187
@terrykong @guyueh1 could you take a review?
📝 Walkthrough
Updated the Git submodule reference for 3rdparty/Megatron-Bridge-workspace/Megatron-Bridge to a new commit SHA. No code or API changes in this repository.
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~2 minutes
Pre-merge checks and finishing touches
❌ Failed checks (1 inconclusive)
✅ Passed checks (3 passed)
Actionable comments posted: 0
🧹 Nitpick comments (1)
3rdparty/Megatron-Bridge-workspace/Megatron-Bridge (1)
1-1: Document/cache toggle and rollback plan. Cache support in the submodule can change behavior/perf. Please add a brief CHANGELOG/release note and mention any flags/env vars to disable it for rollback if needed; link the W&B runs for reproducibility.
I can draft the release-note snippet if helpful.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
3rdparty/Megatron-Bridge-workspace/Megatron-Bridge (1 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Lint check
- GitHub Check: Lint check
- GitHub Check: Post automodel integration comment / Comment on PR
🔇 Additional comments (2)
3rdparty/Megatron-Bridge-workspace/Megatron-Bridge (2)
1-1: Submodule pin verified — no action required. 3rdparty/Megatron-Bridge-workspace/Megatron-Bridge is pinned to 85a37ffdf02edc07c0a7ac97cb9abcafcd0ac0ed (merge commit of PR #621) and that commit is the head of branch yifu/nemo-rl-use-chunkpatch-ds; .gitmodules has no branch configured, so the SHA pin is explicit and reproducible.
1-1: Ensure CI fetches submodules recursively (actions/checkout: submodules: true/recursive, fetch-depth: 0)
No .github/workflows/* files were found during verification — confirm your CI (GitHub Actions or other) is configured to fetch submodules (actions/checkout with submodules: true/recursive and fetch-depth: 0) or update CI accordingly.
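The submodule pin check described above can be reproduced locally. The following is a minimal sketch, not part of this PR: it assumes it is run from the repository root with `git` on the PATH, and the helper names `parse_gitlink_sha` and `pinned_sha` are hypothetical. The expected SHA is the one reported in the review comment above; later pushes to this PR may have moved the pin.

```python
import subprocess

# Values taken from the review comment above; the SHA may be outdated
# if the PR was updated after that review.
SUBMODULE_PATH = "3rdparty/Megatron-Bridge-workspace/Megatron-Bridge"
EXPECTED_SHA = "85a37ffdf02edc07c0a7ac97cb9abcafcd0ac0ed"


def parse_gitlink_sha(ls_tree_line: str) -> str:
    """Extract the commit SHA from one line of `git ls-tree` output.

    A submodule (gitlink) entry has the form:
    160000 commit <sha><TAB><path>
    """
    meta, _, _path = ls_tree_line.strip().partition("\t")
    fields = meta.split()
    if len(fields) != 3 or fields[0] != "160000" or fields[1] != "commit":
        raise ValueError(f"not a submodule entry: {ls_tree_line!r}")
    return fields[2]


def pinned_sha(path: str = SUBMODULE_PATH, rev: str = "HEAD") -> str:
    """Return the SHA that `rev` pins the submodule at `path` to."""
    out = subprocess.run(
        ["git", "ls-tree", rev, path],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return parse_gitlink_sha(out)
```

Calling `pinned_sha() == EXPECTED_SHA` from the repository root then tells you whether the checked-out revision still points at the commit named in the review. Reading the gitlink from the tree avoids depending on whether the submodule has been initialized locally.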
0f35de4 to 583a283
@yaoyu-33 could you take a review as well?
Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>
583a283 to a822dcd
Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>
Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>
Signed-off-by: Zhiyu Li <zhiyul@NVIDIA.com>
Signed-off-by: yuanhangs <yuanhangs@nvidia.com>
What does this PR do?
This PR adds cache support to the mbridge submodule, which mitigates the dsv3 refitting regression:
No need to change the .gitmodules since we updated to the same branch yifu/nemo-rl-use-chunkpatch-ds.
Add cache to optimize refit performance Megatron-Bridge#621
Issues
List issues that this PR closes (syntax):
Usage
# Add a code snippet demonstrating how to use this
Before your PR is "Ready for review"
Pre checks:
Additional Information
Summary by CodeRabbit