Restore calc-success-rate job to full-sweep schedulers #180

Merged
cquil11 merged 1 commit into main from copilot/fix-frontend-reliability-graph
Nov 6, 2025

Copilot AI commented Nov 6, 2025

PR #145 accidentally deleted the calc-success-rate job from three scheduler workflows, breaking the frontend reliability graph.

Changes

Restored calc-success-rate job to:

  • full-sweep-1k1k-scheduler.yml - depends on benchmark-dsr1, benchmark-gptoss, benchmark-gb200
  • full-sweep-8k1k-scheduler.yml - depends on benchmark-dsr1, benchmark-gptoss, benchmark-gb200
  • full-sweep-1k8k-scheduler.yml - depends on benchmark-dsr1, benchmark-gptoss (no gb200 benchmark exists)

Each job:

  • Runs with if: ${{ always() }} to collect stats regardless of benchmark outcomes
  • Downloads results_* artifacts
  • Executes calc_success_rate.py to compute GPU success rates
  • Uploads run-stats artifact for frontend consumption

Matches existing pattern in full-sweep-test.yml and e2e-tests.yml.
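
The steps listed above could be sketched as a workflow job roughly like the following. This is a minimal illustration, not the restored job itself: the runner label, action versions, the `utils/calc_success_rate.py` path, its invocation, and the `run_stats.json` output file name are all assumptions and are not taken from the PR.

```yaml
# Sketch only. Runner, action versions, script path, and output file are assumptions.
jobs:
  calc-success-rate:
    # Wait for every benchmark job. The 1k8k scheduler would omit benchmark-gb200.
    needs: [benchmark-dsr1, benchmark-gptoss, benchmark-gb200]
    # Run even if a benchmark job failed, so failures are counted in the stats.
    if: ${{ always() }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Download benchmark results
        uses: actions/download-artifact@v4
        with:
          pattern: results_*   # match every results_* artifact from this run
          path: results/

      - name: Compute GPU success rates
        run: python3 utils/calc_success_rate.py   # hypothetical path and invocation

      - name: Upload stats for the frontend
        uses: actions/upload-artifact@v4
        with:
          name: run-stats
          path: run_stats.json   # hypothetical output file
```

Without `if: ${{ always() }}`, a job with `needs` is skipped as soon as any dependency fails, which would drop exactly the runs the reliability graph most needs to record.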

Copilot AI changed the title from "[WIP] Fix frontend reliability graph after refactor" to "Restore calc-success-rate job to full-sweep schedulers" on Nov 6, 2025
Copilot AI requested a review from functionstackx November 6, 2025 04:41
functionstackx force-pushed the copilot/fix-frontend-reliability-graph branch from dca970d to 089358f on November 6, 2025 04:52
functionstackx marked this pull request as ready for review on November 6, 2025 04:52
functionstackx requested a review from a team as a code owner on November 6, 2025 04:52
Copilot AI review requested due to automatic review settings November 6, 2025 04:52

github-actions Bot commented Nov 6, 2025

📊 Line Count Report

File: utils/matrix-logic/generate_sweep_configs.py

Total Lines: 956

Base Lines: 956

Change: No change ➡️

Copilot AI left a comment

Pull Request Overview

This PR restores the calc-success-rate job, which PR #145 accidentally deleted, to three scheduler workflow files. The job computes GPU success rates and supplies the data for the frontend reliability graph.

Key changes:

  • Added calc-success-rate job to three full-sweep scheduler workflows (1k1k, 8k1k, 1k8k)
  • Each job downloads benchmark results, calculates success rates, and uploads stats for frontend consumption
  • Jobs are configured to run regardless of benchmark outcomes using if: ${{ always() }}

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| .github/workflows/full-sweep-8k1k-scheduler.yml | Restored calc-success-rate job with dependencies on dsr1, gptoss, and gb200 benchmarks |
| .github/workflows/full-sweep-1k8k-scheduler.yml | Restored calc-success-rate job with dependencies on dsr1 and gptoss benchmarks (no gb200) |
| .github/workflows/full-sweep-1k1k-scheduler.yml | Restored calc-success-rate job with dependencies on dsr1, gptoss, and gb200 benchmarks |

Co-authored-by: functionstackx <47992694+functionstackx@users.noreply.github.com>
functionstackx force-pushed the copilot/fix-frontend-reliability-graph branch from 089358f to 861eedd on November 6, 2025 05:02

kimbochen left a comment

LGTM, thanks for catching that

cquil11 left a comment

can't believe I missed this

cquil11 merged commit 9e97f65 into main on Nov 6, 2025
18 of 19 checks passed
cquil11 deleted the copilot/fix-frontend-reliability-graph branch on November 6, 2025 14:43