[workflow] adjust the GPU memory threshold for scheduled unit test #2558

Merged
FrankLeeeee merged 2 commits into hpcaitech:main from FrankLeeeee:hotfix/build-workflow on Feb 6, 2023

@FrankLeeeee FrankLeeeee commented Feb 3, 2023

📌 Checklist before creating the PR

  • I have created an issue for this PR for traceability
  • The title follows the standard format: [doc/gemini/tensor/...]: A concise description
  • I have added relevant tags if possible for us to better distinguish different PRs

🚨 Issue number

Link this PR to your issue with words like fixed to automatically close the linked issue upon merge

e.g. fixed #1234, closed #1234, resolved #1234

Fixed #2557

📝 What does this PR do?

Summarize your work here.
If you have any plots, diagrams, screenshots, or tables, please attach them here.

This PR does two things:

  1. Renames the workflow files for consistency with [workflow] fixed example check workflow #2554.
  2. Raises the used-GPU-memory threshold from 100 MB to 10 GB, so that the workflow can run on our machine as long as no one is using more than 10 GB of memory out of 80 GB. The threshold is needed because the machine sometimes has other DL jobs running, and the workflow should not interrupt them.
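As a rough illustration of the threshold check described above (not the actual workflow step, which is not shown in this PR description), the following Python sketch decides whether tests should run based on used GPU memory. The function names, the per-GPU comparison, and the conversion of 10 GB to 10240 MiB are all assumptions made for the example; only the `nvidia-smi` query flags are real.

```python
import subprocess
from typing import List

# Assumption: 10 GB expressed in MiB, the unit nvidia-smi prints with "nounits".
THRESHOLD_MIB = 10 * 1024


def parse_used_memory(nvidia_smi_output: str) -> List[int]:
    """Parse one used-memory value (MiB) per non-empty line, one line per GPU."""
    return [int(line.strip()) for line in nvidia_smi_output.splitlines() if line.strip()]


def gpus_are_available(used_mib: List[int], threshold_mib: int = THRESHOLD_MIB) -> bool:
    """Return True only if no GPU uses more memory than the threshold (assumed per-GPU)."""
    return all(used <= threshold_mib for used in used_mib)


def query_used_memory() -> str:
    """Ask nvidia-smi for per-GPU used memory in MiB (needs an NVIDIA driver installed)."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def main() -> None:
    """Hypothetical entry point: print whether the scheduled tests should run."""
    used = parse_used_memory(query_used_memory())
    print("run tests" if gpus_are_available(used) else "skip: GPUs are busy")
```

With the old 100 MB threshold, even a small background job on one GPU would make `gpus_are_available` return `False`; with a 10 GB threshold the tests are skipped only when another job is using a substantial share of the memory.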

💥 Checklist before requesting a review

  • I have linked my PR to an issue (instruction)
  • My issue clearly describes the problem/feature/proposal, with diagrams/charts/table/code if possible
  • I have performed a self-review of my code
  • I have added thorough tests.
  • I have added docstrings for all the functions/methods I implemented

⭐️ Do you enjoy contributing to Colossal-AI?

  • 🌝 Yes, I do.
  • 🌚 No, I don't.

Tell us more if you don't enjoy contributing to Colossal-AI.

Development

Successfully merging this pull request may close these issues.

[devops] build workflow is stopped due to low GPU memory threshold

1 participant