ci: only *write ccache in "push to master" jobs #11661
ochafik wants to merge 2 commits into ggml-org:master
I don't think exceeding the cache size is necessarily a problem. That's expected, since caches are immutable and every commit adds a new set of caches. As long as the total size of the caches created by a single commit is a few times smaller than the maximum total cache size, so that the cache for the latest master commit and the caches of open PRs are kept, it should be fine. Creating caches for PRs is desirable since it improves the build times of subsequent commits to the PR.
I'm wary of the following problems:
If we only cached the main branch, we could cache said SDK downloads (reducing the long tail), and PRs would get a cache hit rate that depends on how many files they modified, in a predictable pattern. PRs with lots of header changes would pay a higher compilation price but would benefit from the long-tail SDK-installing jobs being much faster, and a possible majority of PRs (TBC) would still have a high, predictable cache hit rate. (I'm wondering how to interpret the https://github.com/ggerganov/llama.cpp/actions/metrics/performance metrics, but job queue time is on the rise, and avg time hasn't budged.)
@slaren Anyway, if you're willing to experiment, we could push something like this (+ maybe cache some SDK downloads), see in which direction the performance metrics move after a week, and revert if it's worse.
On a related note,
This is tempting to resurrect: we get absolutely swamped with caches on busy days, and the Windows CUDA caches are especially problematic as they've now grown to 0.5G each.
I think this was superseded by #18207
According to https://github.com/ggerganov/llama.cpp/actions/caches, we're "Approaching total cache storage limit (88.08 GB of 10 GB Used)".
With this PR, instead of letting every branch write its branch-specific outputs to ccache (and probably overwrite each other with weird race conditions), we restrict writes to pushes to master (hopefully with less concurrency). I'm also proposing to expire the cache after 12h, but I'm not sure that's needed (the risk is that if there's no push for over 12h, nobody will get any ccache to read from).
cc/ @slaren (follow up to #11516)
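For illustration, the "read everywhere, write only on pushes to master" policy described above could look roughly like the workflow fragment below. This is a sketch under assumptions, not the actual diff of this PR: the job name, cache key, action version pin, and build commands are hypothetical, and it assumes the `hendrikmuhs/ccache-action` step with its `save` and `evict-old-files` inputs. The 12h expiry is written in seconds on the assumption that the value is passed through to `ccache --evict-older-than`.

```yaml
# Sketch only: job name, key, and build commands are hypothetical.
name: CI

on:
  push:
    branches: [master]
  pull_request:

jobs:
  ubuntu-cpu-cmake:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: ccache
        uses: hendrikmuhs/ccache-action@v1.2
        with:
          key: ubuntu-cpu-cmake
          # Every job restores the cache, but only pushes to master save it.
          save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}
          # Optional expiry: drop entries unused for 12h (43200 seconds).
          # Assumption: the value is forwarded to `ccache --evict-older-than`.
          evict-old-files: 43200s

      - name: Build
        run: |
          cmake -B build \
            -DCMAKE_C_COMPILER_LAUNCHER=ccache \
            -DCMAKE_CXX_COMPILER_LAUNCHER=ccache
          cmake --build build --config Release -j $(nproc)
```

With this shape, PR jobs still get whatever cache the latest master push produced, and the number of caches written per day is bounded by the number of pushes to master rather than by the number of PR commits.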