Restore calc-success-rate job to full-sweep schedulers #180
📊 Line Count ReportFile: Total Lines: 956 Base Lines: 956 Change: No change ➡️ |
Pull Request Overview
This PR restores the `calc-success-rate` job to three scheduler workflow files; the job was accidentally deleted in PR #145. It is needed to compute GPU success rates and to supply data to the frontend reliability graph.
Key changes:
- Added `calc-success-rate` job to three full-sweep scheduler workflows (1k1k, 8k1k, 1k8k)
- Each job downloads benchmark results, calculates success rates, and uploads stats for frontend consumption
- Jobs are configured to run regardless of benchmark outcomes using `if: ${{ always() }}`
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| `.github/workflows/full-sweep-8k1k-scheduler.yml` | Restored `calc-success-rate` job with dependencies on dsr1, gptoss, and gb200 benchmarks |
| `.github/workflows/full-sweep-1k8k-scheduler.yml` | Restored `calc-success-rate` job with dependencies on dsr1 and gptoss benchmarks (no gb200) |
| `.github/workflows/full-sweep-1k1k-scheduler.yml` | Restored `calc-success-rate` job with dependencies on dsr1, gptoss, and gb200 benchmarks |
Co-authored-by: functionstackx <47992694+functionstackx@users.noreply.github.com>
kimbochen
left a comment
lgtm thanks for catching that
cquil11
left a comment
can't believe I missed this
PR #145 accidentally deleted the `calc-success-rate` job from three scheduler workflows, breaking the frontend reliability graph.

Changes

Restored `calc-success-rate` job to:
- `full-sweep-1k1k-scheduler.yml` - depends on `benchmark-dsr1`, `benchmark-gptoss`, `benchmark-gb200`
- `full-sweep-8k1k-scheduler.yml` - depends on `benchmark-dsr1`, `benchmark-gptoss`, `benchmark-gb200`
- `full-sweep-1k8k-scheduler.yml` - depends on `benchmark-dsr1`, `benchmark-gptoss` (no gb200 benchmark exists)

Each job:
- Uses `if: ${{ always() }}` to collect stats regardless of benchmark outcomes
- Downloads `results_*` artifacts
- Runs `calc_success_rate.py` to compute GPU success rates
- Uploads `run-stats` artifact for frontend consumption

Matches existing pattern in `full-sweep-test.yml` and `e2e-tests.yml`.
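
For readers who have not seen the workflows, the job shape described above might look roughly like the sketch below, shown for the 1k8k scheduler. This is not copied from the repository. Only the job name, the `needs` list, the `if` condition, the `results_*` pattern, the script name, and the `run-stats` artifact name come from this PR; the runner label, action versions, download directory, script location and invocation, and the `run_stats.json` output file are assumptions for illustration.

```yaml
# Fragment of a scheduler workflow; benchmark-dsr1 and benchmark-gptoss
# are assumed to be defined elsewhere under the same `jobs:` key.
jobs:
  calc-success-rate:
    # 1k8k variant: no benchmark-gb200 dependency.
    needs: [benchmark-dsr1, benchmark-gptoss]
    # Run even if a benchmark job failed, so failures are counted too.
    if: ${{ always() }}
    runs-on: ubuntu-latest            # assumption: runner label
    steps:
      - uses: actions/checkout@v4     # assumption: action version

      # Fetch every artifact whose name matches results_*.
      - uses: actions/download-artifact@v4
        with:
          pattern: results_*
          path: results/              # assumption: download directory

      # Assumption: the script's path and arguments are hypothetical here.
      - name: Compute GPU success rates
        run: python3 calc_success_rate.py

      # Publish the stats under the artifact name the frontend expects.
      - uses: actions/upload-artifact@v4
        with:
          name: run-stats
          path: run_stats.json        # assumption: hypothetical output file
```

The `if: ${{ always() }}` line matters because a job with `needs` is skipped by default when any job it depends on fails, which is exactly the case a success-rate calculation has to observe.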