refactoring: speedup static checks with disk cache #44992
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
|
As discussed on Slack, let's separate the PR into multiple ones :-) For

    def test_ignores_other_decorators(self):
        """ what this test really wants to check (and maybe also why it should be the expected output you specified) """
        path = self._write_temp("@dataclass\nclass Baz:\n pass\n")
        names = _get_auto_docstring_names(path)
        self.assertEqual(names, set())

(I don't mind AI-generated docstrings, as long as you have an 👀 reading them first 🙏) For some tests, we could guess what they check from the test body, but it's not always clear. (I know, we rarely have docstrings in test methods ... in general 😢) |
|
Split out the docstring refactoring into a separate PR, #45009 |
fee57c5 to 6ae3f7c
|
Split out the changes to check_repo into #45012 |
ydshieh
left a comment
Very nice and thank you for the iteration! Going to 🚀
103fbe0 to 0af5214
|
Thanks for the work, this is a nice speedup! One concern: It's also not clear from the code whether these globs are a faithful 1:1 replication of each checker's internal file discovery, or just a best-effort approximation. If approximations, this fragility should at least be prominently documented. |
Thanks, that’s a very good point, and I agree the failure mode here is particularly risky since it can silently skip checks. I like the idea of reducing the chance of drift by moving the glob definition next to each checker, so the file discovery logic and the cache inputs live in the same place. That should make changes much harder to miss. I’ll also double-check whether these globs are currently a strict reflection of each checker’s behavior or just approximations. If there’s any approximation involved, I’ll make that explicit in the code and add documentation calling out the risk. Longer term, it might be worth exploring whether we can derive the inputs directly from the checker logic instead of duplicating it, to avoid this class of issue entirely. |
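To make the "glob next to each checker" idea concrete, here is a minimal sketch. It is not the code from this PR: `Checker`, `CHECKERS`, `register_checker`, `inputs_for`, and the `example_check` checker with its patterns are all hypothetical names chosen for illustration. The only point is that the patterns a cache would hash are declared in the same place as the checker that reads those files.

```python
import fnmatch
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Checker:
    """Hypothetical record tying a checker to the files it reads."""
    name: str
    globs: Tuple[str, ...]
    run: Callable[[List[str]], bool]


# Hypothetical registry; the real PR may organise this differently.
CHECKERS: Dict[str, Checker] = {}


def register_checker(name: str, globs: Tuple[str, ...]):
    """Decorator that stores the globs right next to the checker function."""
    def decorator(fn: Callable[[List[str]], bool]):
        CHECKERS[name] = Checker(name=name, globs=tuple(globs), run=fn)
        return fn
    return decorator


def inputs_for(name: str, all_files: List[str]) -> List[str]:
    """Return the sorted subset of files matching the checker's own globs."""
    checker = CHECKERS[name]
    return sorted(
        f for f in all_files
        if any(fnmatch.fnmatch(f, pattern) for pattern in checker.globs)
    )


# Illustrative checker: the name and patterns are assumptions, not real ones.
@register_checker("example_check", globs=("src/*.py", "docs/*.md"))
def example_check(files: List[str]) -> bool:
    return all(not f.endswith(".tmp") for f in files)
```

With this layout, whoever changes a checker's file discovery sees its cache inputs in the same diff, which is the drift risk raised above. Note that `fnmatch` lets `*` match `/` as well, so `src/*.py` also covers nested paths; whether that matches a given checker's real discovery logic would still need to be verified per checker.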
6c2aacc to ce2de86
|
I should also add that there is no cache on the CI side, so we will not miss a check failure prior to merging |
ydshieh
left a comment
Just a few nits, then we can go!
Super nice 🔥
03e8290 to b88ec56
* added a cache
* use a shared cache
* added cache in repo checker
* more caching
* more caching
* display elapsed times
* added test coverage
* Remove check_repo cache changes from speedup branch
* fix bad merge
* per-checker glob
* fixed a couple of tests
* tweaks
What does this PR do?
make check-repo can be quite slow. This patch adds a file-level cache to speed up the checks. We get up to a 27x speedup.
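For readers unfamiliar with the technique, a file-level disk cache can be sketched roughly as below. This is an illustration under stated assumptions, not the implementation in this PR: `fingerprint`, `run_cached`, and the JSON cache file layout are hypothetical. The idea is to hash the contents of a checker's input files and skip the checker when the hash matches the one recorded after its last successful run.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, List, Tuple


def fingerprint(paths: List[str]) -> str:
    """Hash file names and contents so any edit, add, or rename changes the key."""
    digest = hashlib.sha256()
    for path in sorted(paths):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        digest.update(Path(path).read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def run_cached(
    name: str,
    paths: List[str],
    check: Callable[[List[str]], bool],
    cache_file: str,
) -> Tuple[bool, bool]:
    """Run `check` unless its inputs are unchanged since the last success.

    Returns (passed, was_cached). The cache layout is an assumption:
    one JSON object mapping checker name to input fingerprint.
    """
    cache_path = Path(cache_file)
    cache = json.loads(cache_path.read_text()) if cache_path.exists() else {}
    key = fingerprint(paths)
    if cache.get(name) == key:
        return True, True  # inputs unchanged since the last passing run
    passed = check(paths)
    if passed:
        # Only successes are recorded, so a failing check always reruns.
        cache[name] = key
        cache_path.write_text(json.dumps(cache))
    return passed, False
```

Recording only passing runs keeps the cache conservative: a hit can only ever skip a check that already passed on byte-identical inputs. The cache is only as reliable as the list of input paths, which is why the per-checker globs need to match what each checker actually reads.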