
fix(tests): keep shadows cold before failover quiesce #8325

Merged: pvijayakrish merged 1 commit into release/1.1.0 from schwinns/cherrypick-gms-shadow-failover-1.1.0 on Apr 19, 2026

@galletas1712 galletas1712 commented Apr 17, 2026

Overview:

Cherry-pick of main PR #8258 into release/1.1.0 to stabilize the GMS fault-tolerance tests.

Details:

  • Clean cherry-pick of main commit c69e19e8dd029d39d33a24dd42ffc3a111b124ae
  • No release-only code changes
  • No known cherry-pick dependencies
  • Carries the same test changes as #8258:
    • keep shadow-a and shadow-b cold before their first quiesce in tests/gpu_memory_service/test_shadow_failover.py
    • remove repo-level NVML memory-accounting assertions from the GMS test flows
    • lower the vLLM / SGLang GMS harness memory fractions from 0.9 to 0.8
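The effect of the last change above can be sketched with simple arithmetic. This is only an illustration: the 80 GiB device size and the helper name `reserved_and_headroom` are hypothetical and do not come from the PR, and the snippet does not reflect how the vLLM or SGLang harnesses actually apply their memory fraction.

```python
from typing import Tuple


def reserved_and_headroom(total_gib: float, fraction: float) -> Tuple[float, float]:
    """Split device memory into the share a harness may claim and what is left over.

    Hypothetical helper for illustration only; not part of the GMS test code.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    reserved = total_gib * fraction
    return reserved, total_gib - reserved


# Assumed device size, chosen only to make the numbers concrete.
TOTAL_GIB = 80.0

# Fraction before the change (0.9) and after it (0.8), as stated in the PR.
old_reserved, old_headroom = reserved_and_headroom(TOTAL_GIB, 0.9)
new_reserved, new_headroom = reserved_and_headroom(TOTAL_GIB, 0.8)

# Extra memory left unclaimed on the device after lowering the fraction.
extra_headroom = new_headroom - old_headroom
```

Under these assumptions, lowering the fraction leaves roughly twice as much unclaimed memory on the device, which is presumably the margin the test flows rely on.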

Where should the reviewer start?

  • tests/gpu_memory_service/test_shadow_failover.py
  • tests/gpu_memory_service/flow_assertions.py
  • tests/gpu_memory_service/common/runtime.py

Validation:

  • git cherry-pick --signoff c69e19e8dd029d39d33a24dd42ffc3a111b124ae
  • Cherry-pick applied cleanly on top of origin/release/1.1.0
  • Main PR #8258 passed isolated L4 pre-merge GMS validation before merge
  • Release-branch CI pending on this PR

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)


Signed-off-by: Schwinn Saereesitthipitak <schwinns@nvidia.com>
@galletas1712 galletas1712 requested review from a team as code owners April 17, 2026 20:52
@github-actions github-actions Bot added the fix label Apr 17, 2026

@devin-ai-integration devin-ai-integration Bot left a comment

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 4 additional findings.

@pvijayakrish pvijayakrish merged commit f028dd3 into release/1.1.0 Apr 19, 2026
83 of 84 checks passed
@pvijayakrish pvijayakrish deleted the schwinns/cherrypick-gms-shadow-failover-1.1.0 branch April 19, 2026 01:06
2 participants