Reuse the cache created for latest main on PRs/branches if setup.py is not modified #25445
I will remove this before merge
sgugger left a comment
Thanks for working on this! Let's try it for a bit and see if it impacts any workflow negatively.
- run: python utils/tests_fetcher.py | tee test_preparation/tests_fetched_summary.txt
- store_artifacts:
    path: ~/transformers/tests_fetched_summary.txt
    path: test_preparation/tests_fetched_summary.txt
We need this in create_circleci_config.py to check whether tests is in the list.
summary_file = os.path.join(folder, "tests_fetched_summary.txt")
if os.path.exists(summary_file):
    with open(summary_file) as f:
        tests_fetched_summary = f.read()
        setup_file_modifiled = "### TEST TO RUN ###\n- tests\n" in tests_fetched_summary
We need to check the summary file instead of filtered_test_list, because the latter has already been expanded into a list of files in the filter_tests method:
if test_files == ["tests"]:
    test_files = [os.path.join("tests", f) for f in os.listdir("tests") if f not in ["__init__.py"] + filters]
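The reason for reading the summary file can be sketched as follows. This is a minimal illustration rather than the code in the repository: the helper names `all_tests_requested` and `expand_tests` are hypothetical, and the directory listing is passed in as a plain list so the example runs without a checkout.

```python
import os

# Marker taken from the snippet above: the fetcher lists the bare "tests"
# folder when the whole suite should run.
ALL_TESTS_MARKER = "### TEST TO RUN ###\n- tests\n"


def all_tests_requested(summary_text):
    """Return True if the summary asks for the whole test suite (hypothetical helper)."""
    return ALL_TESTS_MARKER in summary_text


def expand_tests(test_files, entries, filters):
    """Expand ["tests"] into one path per entry, as filter_tests does (sketch).

    `entries` stands in for os.listdir("tests") in this illustration.
    """
    if test_files == ["tests"]:
        return [os.path.join("tests", f) for f in entries if f not in ["__init__.py"] + filters]
    return test_files


summary = "### TEST TO RUN ###\n- tests\n"
expanded = expand_tests(["tests"], ["__init__.py", "models", "utils"], filters=["utils"])

# After expansion the list no longer equals ["tests"], so only the raw
# summary text still tells us that everything was requested.
print(all_tests_requested(summary), expanded == ["tests"])
```

Once the list has been expanded, comparing it against ["tests"] always fails, which is why the check has to look at the summary text.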
@sgugger FYI: I need to change two more places to make it work correctly for all cases. See my last two review comments.
Hi @sgugger, I am afraid I have to revert this PR until we do something to enable cache sharing (see the end of this comment). From this section
The cache is never shared between PRs/branches from different forks. This might explain the question I posted about why the same cache key could sometimes be found but not at other times. The PR description of #24886 is partially valid as lhoestq created a branch in
The way to share cache is
But we need to be careful about whether we have sensitive env. variables.
What does this PR do?
For a PR or a branch, if setup.py is not modified (compared to the common ancestor with main), let's use the cache that is created for the setup.py from the latest commit on the main branch.
latest means the latest one on main at the moment when a run is triggered.
Motivation:
main and pull (for most cases), which was introduced in Separate CircleCI cache between main and pull or other branches #24886 to avoid unexpected/undesired edge cases.
setup.py as they don't rebase on more recent main
With this PR, we expect the storage usage for cache to be reduced dramatically 🤞.
The artifact in this job run shows that generated_config.txt uses the explicit checksum v0.7-pipelines_torch-main-pip-9RXs1YQ8L2beP4cdAfRDkWX0VRTtWaQodDVKzvyJwPI= rather than {{ checksum "setup.py" }}
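The key selection described above could be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `file_checksum` and `pick_cache_key` are hypothetical names, the checksum is assumed to be a standard base64-encoded SHA-256 of the file content (which matches the length of the key shown above), and the `pull` segment in the fallback key is a guess based on the main/pull separation from #24886.

```python
import base64
import hashlib


def file_checksum(content):
    """Base64-encoded SHA-256 of the file content (assumed to mirror the CI checksum)."""
    return base64.b64encode(hashlib.sha256(content).digest()).decode("ascii")


def pick_cache_key(pr_setup, ancestor_setup, main_setup, prefix="v0.7-pipelines_torch"):
    """Choose a cache key for a PR/branch run (hypothetical helper).

    pr_setup:       setup.py content on the PR/branch
    ancestor_setup: setup.py content at the common ancestor with main
    main_setup:     setup.py content at the latest commit on main
    """
    if pr_setup == ancestor_setup:
        # setup.py untouched: reuse the cache built for the latest main by
        # writing its checksum explicitly into the generated config.
        return prefix + "-main-pip-" + file_checksum(main_setup)
    # setup.py modified: keep the templated checksum so the run builds and
    # uses its own cache. The "pull" segment is an assumption.
    return prefix + "-pull-pip-" + '{{ checksum "setup.py" }}'


unchanged_key = pick_cache_key(b"old deps", b"old deps", b"new deps")
modified_key = pick_cache_key(b"my deps", b"old deps", b"new deps")
print(unchanged_key)
print(modified_key)
```

With this shape, a PR that never touches setup.py resolves to the same key as main even when it has not been rebased, which is where the expected storage saving comes from.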