Only run integration tests on AWS GPU and CPU runners #1538
Conversation
… + integration tests
| - ".github/workflows/cpu-long-tests.yaml" | ||
| - ".github/workflows/gpu-integration-tests.yaml" |
Adding skips here so that when we make some small change to our action that runs on AWS we don't fire off the whole testing matrix.
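For context, a trigger filter of the shape implied by the diff above could look like the fragment below. Only the two ignored paths come from this PR; the event name and surrounding keys are illustrative.

```yaml
# Hypothetical workflow trigger: skip the full test matrix when only the
# AWS-runner workflow files themselves change. The two paths are from this
# PR's diff; everything else is a sketch.
on:
  pull_request:
    paths-ignore:
      - ".github/workflows/cpu-long-tests.yaml"
      - ".github/workflows/gpu-integration-tests.yaml"
```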
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
## main #1538 +/- ##
==========================================
- Coverage 95.33% 92.88% -2.45%
==========================================
Files 183 183
Lines 15763 15763
==========================================
- Hits 15028 14642 -386
- Misses 735 1121 +386
Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
GPU test here: https://github.com/OpenFreeEnergy/openfe/actions/runs/17991765097/job/51183190807
Looks like one failed -- should we add some retry logic to the test?
Took 40 minutes:
Tagging @IAlibay to ask how expected this is.
Hmm, I think the CPU runner can just be slow tests - I know @IAlibay specifically wanted this for when he was doing development away from his workstation. I think unit tests locally and then slow tests on the runner should still meet his need?
Okay, CPU runner is just doing the slow tests now, testing that here:
Shouldn't be happening very often, we can throw a retry in there though, wouldn't hurt.
Here's what my view is on what we want:

AWS CPU runners
Normal tests: yes

AWS GPU runners
Normal tests: if we have to

What's wrong with our integration tests?
The most important thing is that we don't use GPU runners in places we don't need them, i.e. we don't want to use GPU dollars running a bunch of CPU-only tests. What we would want is an additional flag that ONLY runs GPU tests, and then we can run any integration tests that need CPU on a specific CPU integration runner.
IAlibay left a comment
The main issue I have with this is that we're going to spend lots of expensive GPU time number crunching on the CPU. It may be on the order of a few hundred dollars, but it's going to add up real quick as we add more slow tests.
Ideally what we need to do is decouple integration from slow, or add a special "integration only" flag here:
openfe/openfe/tests/conftest.py
Lines 73 to 82 in 6581c95
def pytest_collection_modifyitems(self, items, config):
    if (config.getoption('--integration') or
            os.getenv("OFE_INTEGRATION_TESTS", default="false").lower() == 'true'):
        return
    elif (config.getoption('--runslow') or
            os.getenv("OFE_SLOW_TESTS", default="false").lower() == 'true'):
        self._modify_integration(items, config)
    else:
        self._modify_integration(items, config)
        self._modify_slow(items, config)
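To make the precedence in that hook concrete, here is a small model of the same selection logic as a plain function over (test name, marks) pairs, plus the extra "integration only" mode suggested above. This is a sketch, not openfe's actual hook; the function name and the pair-based representation are invented for illustration.

```python
def select_items(items, *, integration_only=False, integration=False, runslow=False):
    """Model of openfe's collection precedence over (name, marks) pairs.

    - integration_only: proposed new mode -- keep ONLY integration tests.
    - integration: keep everything (normal, slow, and integration tests).
    - runslow: keep normal + slow tests, drop integration tests.
    - default: keep only normal tests (drop slow and integration).
    """
    if integration_only:
        return [(n, m) for n, m in items if "integration" in m]
    if integration:
        return list(items)
    if runslow:
        return [(n, m) for n, m in items if "integration" not in m]
    return [(n, m) for n, m in items if not ({"integration", "slow"} & m)]


# Illustrative test inventory (names invented):
tests = [
    ("test_fast", set()),
    ("test_slow", {"slow"}),
    ("test_gpu_end_to_end", {"integration", "slow"}),
]
```

With this model, `select_items(tests, integration_only=True)` keeps only `test_gpu_end_to_end`, which is the GPU-runner behavior being asked for, while `runslow=True` gives the CPU runner its normal + slow set.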
  OFE_INTEGRATION_TESTS: FALSE
run: |
  pytest -n logical -vv --durations=10 openfecli/tests/ openfe/tests/
  pytest -n logical -vv --durations=10 -m slow openfecli/tests/ openfe/tests/
Why do we need -m slow? I would have thought the env variable would be enough.
I had thought we only wanted to run the slow tests (and not the normal ones) on the CPU runner, will fix this!
That could be ok too.
Okay to expand on what I wrote here: #1527 (comment) The
Which means This will satisfy what you want for the GPU runners. For the CPU runners, since we want slow + normal, that is easy to do with flags/args.
The
Ah ok, thanks for the explanation @mikemhenry, I hadn't caught that.
No worries! I could have sworn I explained the
I guess it depends if all integration tests are GPU tests, or if all GPU tests will be integration tests. If we don't think there is anything interesting there, then we can rename
Thoughts @IAlibay?
All integration tests are GPU, but not all integration tests are ONLY GPU. Very roughly, if we can control the OpenMM platform with pytest marks, then that's the answer. We stick on a gpu and cpu mark and then use the name of the active mark to pick the platform (or skip the test). If you gimme some pseudo code I'm happy to show a demo of what I mean.
This! https://stackoverflow.com/a/74804492 We just need to create a gpu and cpu mark and then check for the mark's presence and pick from a dictionary, with a preference for GPU over CPU if both are available.
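Following that idea, the mark-to-platform choice could be modeled as a small lookup that also respects what the runner actually has. The function name, the gpu/cpu mark names, and the CUDA/CPU platform strings are assumptions for illustration; in a real conftest the marks would come from `request.node.get_closest_marker(...)` inside a fixture.

```python
def pick_platform(marks, available=("CUDA", "CPU")):
    """Pick an OpenMM platform name from a test's pytest marks.

    marks: set of mark names on the test, e.g. {"gpu"}, {"cpu"}, or both.
    available: platform names present on the current runner.
    Returns the platform to use, or None if the test should be skipped.
    GPU is preferred over CPU when a test carries both marks.
    """
    preference = {"gpu": "CUDA", "cpu": "CPU"}
    for mark in ("gpu", "cpu"):  # iteration order encodes GPU-over-CPU preference
        platform = preference.get(mark)
        if mark in marks and platform in available:
            return platform
    return None
```

A test marked both gpu and cpu would get "CUDA" on a GPU runner and "CPU" on a CPU-only runner, while a gpu-only test on a CPU-only runner returns None and can be skipped.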
I think the three of us are either kinda talking past each other or I don't fully understand the scope of what we are trying to do. As this PR currently stands:
If this isn't quite what we are trying to do, let me know! It sounds like there are some
I agree, we're going out of scope, the untangling of CPU & GPU tests for the integration tests can be done in a separate PR. @atravitz are you happy with that?
I think @atravitz's main point was to re-name the
I agree!
No API break detected ✅
Checklist
- news entry

-m MARKEXPR
Only run tests matching given mark expression. For example: -m 'mark1 and not mark2'.
Developer's certificate of origin
Fixes #1527