-
-
Notifications
You must be signed in to change notification settings - Fork 3.1k
Don't modify the on disk cache in fine-grained mode #4664
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
They'll still merge conflict, but the resolution will be trivial now (I'm trying avoid making one depend on the other)
They'll still merge conflict, but the resolution will be trivial now (I'm trying avoid making one depend on the other)
JukkaL
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just a few notes below. Please test this change manually with and without a remote cache before merging.
It would be useful to have the motivation for this change spelled out in the commit message. Is it more about performance of incremental updates or about correctness?
Longer-term, we may want to write cache files in fine-grained incremental mode, at least when not using a remote cache. This could improve performance if users frequently restart the daemon and don't use a remote cache (or if we decide to shut down the daemon after some time of inactivity). Can you create an issue about this?
| if not self.options.use_fine_grained_cache: | ||
| # Stores the initial state of sources as a side effect. | ||
| self.fswatcher.find_changed() | ||
| # Stores the initial state of sources as a side effect. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this changed to be executed unconditionally?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that we have accurate fswatcher cache information for files that we didn't read from the on-disk cache, now that we don't generate CacheMetas
mypy/dmypy_server.py
Outdated
| for state in self.fine_grained_manager.graph.values(): | ||
| meta = state.meta | ||
| # If there isn't a meta, that means the current | ||
| # version got checked in the initial build. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Relying on this logic here seems a bit fragile. Maybe move the meta is None check to a helper method in State and use the helper method here instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing it to use is_cache_skeleton instead.
|
The main rationale is closer to correctness than performance. Currently we write and delete cache files during the initial load of fine-grained mode, but never during fine-grained updates. This means that fine-grained can mess the cache up but won't fix it without a restart, which seems not great. |
This is a little subtle, because interface_hash still needs to be computed, as it is a major driver of the coarse-grained build process. Since metas are no longer computed for files that get rechecked during build, to avoid spuriously reprocessing them we need to find initial file state in cache mdoe as well.
f9076ed to
8b3702d
Compare
|
Withdrawing this. Some changes I am making as part of my deletion optimization path is going to make this fall out much more simply. |
This is a little subtle, because interface_hash still needs to be
computed, as it is a major driver of the coarse-grained build process.
Since metas are no longer computed for files that get rechecked during
build, to avoid spuriously reprocessing them we need to find initial
file state in cache mode as well.