common : add standard Hugging Face cache support #20775
angt
commented
Mar 19, 2026
- Use HF API to find all files
- Migrate all manifests to hugging face cache at startup
WARNING: Do not test this without taking care of your cache first, or you'll regret it. There is no going back. 😬
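For readers unfamiliar with the standard Hugging Face cache mentioned above, here is a minimal Python sketch of how a repo ID and commit map to a path in it. The layout (`models--{org}--{name}/snapshots/{commit}/{file}`, with the cache root taken from `HF_HUB_CACHE`, then `HF_HOME`, then `XDG_CACHE_HOME`) follows the documented `huggingface_hub` convention. The function names are hypothetical and this is not the code from this PR.

```python
import os
from pathlib import Path


def hf_hub_cache_dir(env=None) -> Path:
    """Resolve the hub cache directory the way huggingface_hub documents it."""
    env = os.environ if env is None else env
    if env.get("HF_HUB_CACHE"):
        return Path(env["HF_HUB_CACHE"]).expanduser()
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"]).expanduser() / "hub"
    xdg = env.get("XDG_CACHE_HOME")
    base = Path(xdg).expanduser() if xdg else Path.home() / ".cache"
    return base / "huggingface" / "hub"


def repo_folder_name(repo_id: str, repo_type: str = "model") -> str:
    """'org/name' becomes 'models--org--name'."""
    return "--".join([f"{repo_type}s", *repo_id.split("/")])


def snapshot_file_path(cache_dir, repo_id: str, commit: str, filename: str) -> Path:
    """Path of one file inside a given snapshot (commit) of a model repo."""
    return Path(cache_dir) / repo_folder_name(repo_id) / "snapshots" / commit / filename


def read_cached_ref(cache_dir, repo_id: str, ref: str = "main"):
    """Return the commit hash stored in refs/<ref>, or None if it is not cached."""
    ref_file = Path(cache_dir) / repo_folder_name(repo_id) / "refs" / ref
    return ref_file.read_text().strip() if ref_file.is_file() else None
```

Files under `snapshots/` are normally symlinks into a sibling `blobs/` directory, so a migration has to create both, not just copy the model file.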
Force-pushed from 3638b70 to 77ff285
Will it correctly handle repos that require an HF token (e.g. gated, private)? I think it will back out of migrating that specific manifest, correct?
I need to test this. I'm afraid that without the token there is no way to migrate correctly.
So is the current logic that the migration will only happen if an HF token is provided? I think that makes sense.
- Use HF API to find all files
- Migrate all manifests to hugging face cache at startup

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
Force-pushed from 77ff285 to 6fd16ba
Tried it locally, worked well!
ggerganov
left a comment
One failure case I can think of is when the user keeps the current llama.cpp cache on a larger, separate disk from the one that holds the HF cache. The migration would then move files from the larger disk to the smaller one, which might fill up in the process. But I don't think we have a way to prevent that if we want the migration to be automatic.
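To make the failure case above concrete, here is a rough Python sketch of a pre-flight check that a migration could run. It is only an illustration under stated assumptions, not something this PR implements: the helper names and the 5% safety margin are hypothetical, and both directories are assumed to exist already.

```python
import os
import shutil


def dir_size(path) -> int:
    """Total size in bytes of regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def same_filesystem(a, b) -> bool:
    """True when both existing paths live on the same device."""
    return os.stat(a).st_dev == os.stat(b).st_dev


def migration_fits(src, dst, margin: float = 0.05) -> bool:
    """Check whether moving everything under src into dst could run out of space."""
    if same_filesystem(src, dst):
        # A rename within one filesystem needs no additional space.
        return True
    needed = dir_size(src)
    free = shutil.disk_usage(dst).free
    return needed * (1 + margin) <= free
```

A check like this can only warn or skip. It cannot make the move safe, since free space may change while files are being copied.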
Force-pushed from 62bcccb to 5d0c722
Here are my tests:
Let's go, @ngxson @ggerganov?
I think we should add a prominent notification at the top of the README.md about this change. I expect some level of confusion from the migration.
Do you want a bigger WARNING with an explanation when the files are migrated for the first time?
I guess a warning in the logs might be useful too.
see #20935
Was this not tested in the PR?
Sanitizer jobs (and server-metal, which is also generally failing now) have been manual outside of master since #20546
Thanks, I missed that.
see #20946
Yikes! Any way to disable this or hide models from HF_HOME? I'm seeing EVERYTHING in my cache, even stuff llama.cpp can't run (Safetensors, Gitattributes, etc...). I purposely keep my gguf files in another location so I don't mix HF stuff with llama.cpp. Thoughts on how to handle this?
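One possible way to handle the listing problem described above is to show only `.gguf` files when scanning the cache. The Python sketch below illustrates that filter over the standard hub layout (`models--{org}--{name}/snapshots/{commit}/...`). The function name is hypothetical, and this is not how llama.cpp currently lists cached models.

```python
from pathlib import Path


def list_cached_gguf(cache_dir):
    """Return (repo_id, file path inside the snapshot) pairs for cached .gguf files."""
    results = []
    for repo_dir in sorted(Path(cache_dir).glob("models--*")):
        # 'models--org--name' -> 'org/name'. Ambiguous if a name itself
        # contains '--', which this sketch does not try to resolve.
        repo_id = "/".join(repo_dir.name.split("--")[1:])
        snapshots = repo_dir / "snapshots"
        if not snapshots.is_dir():
            continue
        for path in sorted(snapshots.rglob("*.gguf")):
            rel = path.relative_to(snapshots)
            # Drop the leading commit-hash directory from the relative path.
            results.append((repo_id, "/".join(rel.parts[1:])))
    return results
```

Filtering by extension hides Safetensors weights and files such as `.gitattributes`, but the same file can still appear once per cached snapshot.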
Regresses: #21280
* common : add standard Hugging Face cache support

- Use HF API to find all files
- Migrate all manifests to hugging face cache at startup

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Check with the quant tag

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Cleanup

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Improve error handling and report API errors

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Restore common_cached_model_info and align mmproj filtering

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Prefer main when getting cached ref

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Use cached files when HF API fails

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Use final_path..

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Check all inputs

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

---------

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
…maBarn (#2453) Move llama.cpp and HuggingFaceModelDownloader under a new Applications table, add LlamaBarn, and replace the "Work in progress" note for llama.cpp with a link to ggml-org/llama.cpp#20775 which added standard Hugging Face cache support. Co-authored-by: julien-agent <Agents+cyolo@huggingface.co>