
common : add standard Hugging Face cache support #20775

Merged

angt merged 9 commits into ggml-org:master from angt:common-add-standard-hugging-face-cache-support
Mar 24, 2026

Conversation


@angt angt commented Mar 19, 2026

  • Use HF API to find all files (see the sketch below)
  • Migrate all manifests to the Hugging Face cache at startup
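
A minimal sketch of what the first bullet could look like, assuming libcurl (which llama.cpp's download code already uses): the HF model API at https://huggingface.co/api/models/<repo> returns JSON whose `siblings` array lists every file in the repo. The helper name and error handling here are illustrative, not the PR's actual code.

```cpp
// Hedged sketch, not the PR's code: query the HF API for a repo's
// metadata, passing a token when available. The response JSON contains
// a "siblings" array listing every file in the repo.
#include <curl/curl.h>
#include <string>

static size_t write_cb(char * ptr, size_t size, size_t nmemb, void * userdata) {
    static_cast<std::string *>(userdata)->append(ptr, size * nmemb);
    return size * nmemb;
}

static long hf_api_model_info(const std::string & repo, const std::string & token, std::string & json_out) {
    CURL * curl = curl_easy_init();
    if (!curl) {
        return -1;
    }
    const std::string url = "https://huggingface.co/api/models/" + repo;
    struct curl_slist * headers = nullptr;
    if (!token.empty()) {
        headers = curl_slist_append(headers, ("Authorization: Bearer " + token).c_str());
    }
    curl_easy_setopt(curl, CURLOPT_URL, url.c_str());
    curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
    curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, write_cb);
    curl_easy_setopt(curl, CURLOPT_WRITEDATA, &json_out);

    long status = -1;
    if (curl_easy_perform(curl) == CURLE_OK) {
        curl_easy_getinfo(curl, CURLINFO_RESPONSE_CODE, &status);
    }
    curl_slist_free_all(headers);
    curl_easy_cleanup(curl);
    return status; // 200: json_out holds the metadata; 401/403: gated/private without a token
}
```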

@angt angt requested a review from a team as a code owner March 19, 2026 21:10

angt commented Mar 19, 2026

WARNING: Do not test without backing up your cache, or you'll regret it. There is no going back. 😬

@angt angt force-pushed the common-add-standard-hugging-face-cache-support branch 7 times, most recently from 3638b70 to 77ff285 Compare March 20, 2026 07:36
@ggerganov
Member

Is it going to correctly handle repos that require an HF token (e.g. gated, private)? I think it will back out of the migration for that specific manifest, correct?


angt commented Mar 20, 2026

> Is it going to correctly handle repos that require an HF token (e.g. gated, private)? I think it will back out of the migration for that specific manifest, correct?

I need to test this. I'm afraid that without the token there is no way to migrate correctly.

@ggerganov
Member

> > Is it going to correctly handle repos that require an HF token (e.g. gated, private)? I think it will back out of the migration for that specific manifest, correct?
>
> I need to test this. I'm afraid that without the token there is no way to migrate correctly.

So is the current logic that the migration will only happen if an HF token is provided? I think that makes sense.
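
A hedged sketch of that back-out behaviour, reusing the hypothetical `hf_api_model_info()` helper from the sketch above; the PR's real logic lives in common/download.cpp and common/hf-cache.cpp and may differ.

```cpp
#include <string>

// Illustrative only: skip migration of a manifest when the HF API
// refuses the repo, leaving the old llama.cpp cache entry untouched.
static bool try_migrate_manifest(const std::string & repo, const std::string & token) {
    std::string json;
    const long status = hf_api_model_info(repo, token, json);
    if (status == 401 || status == 403) {
        return false; // gated/private repo and no (valid) token: migrate later
    }
    if (status != 200) {
        return false; // offline, repo removed, etc.
    }
    // ... resolve the revision and per-file etags from `json`,
    //     then move each cached file into the HF cache layout ...
    return true;
}
```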

- Use HF API to find all files
- Migrate all manifests to hugging face cache at startup

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
@angt angt force-pushed the common-add-standard-hugging-face-cache-support branch from 77ff285 to 6fd16ba Compare March 20, 2026 14:11
@julien-c
Contributor

tried it locally, worked well!

number of models in cache: 9
   1. bartowski/Qwen2.5-7B-Instruct-GGUF:Q4_K_M
   2. ggml-org/Nemotron-Nano-3-30B-A3B-GGUF:Q4_K_M
   3. ggml-org/gemma-3-1b-it-GGUF:Q4_K_M
   4. ggml-org/gemma-3-4b-it-GGUF:Q4_K_M
   5. ggml-org/gpt-oss-20b-GGUF:MXFP4
   6. lmstudio-ai/gemma-2b-it-GGUF:Q4_K_M
   7. lmstudio-ai/gemma-2b-it-GGUF:Q8_0
   8. unsloth/Qwen3.5-4B-GGUF:Q4_K_XL
   9. unsloth/Qwen3.5-9B-GGUF:Q4_K_XL

angt added 2 commits March 20, 2026 17:25
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
@angt angt requested a review from a team as a code owner March 20, 2026 18:06
@github-actions github-actions bot added the python and server labels Mar 20, 2026

@ggerganov ggerganov left a comment


One fail case that I can think of is if the user has the current llama.cpp cache on a larger, separate disk from the one where the HF cache is. This would move files from the larger disk to the smaller one, which might fill up in the process. But I don't think we have a way to prevent that if we want the migration to be automatic.

@ggerganov ggerganov requested a review from ngxson March 20, 2026 18:45
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
@angt angt force-pushed the common-add-standard-hugging-face-cache-support branch 2 times, most recently from 62bcccb to 5d0c722 Compare March 22, 2026 09:02
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
angt added 3 commits March 22, 2026 09:18
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
Signed-off-by: Adrien Gallouët <angt@huggingface.co>
Signed-off-by: Adrien Gallouët <angt@huggingface.co>

angt commented Mar 22, 2026

Here are my tests:


$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp


# Download model with llama.cpp

$ build/bin/llama-server -v -hf unsloth/Qwen3.5-0.8B-GGUF
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/bd258782e35f7f458f8aced1adc053e6e92e89bc735ba3be89d38a06121dc517
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6
common_download_file_single_online: downloading from https://huggingface.co/unsloth/Qwen3.5-0.8B-GGUF/resolve/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf to /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6.downloadInProgress (etag:"3454e627e6acd0d0970c2e8844f617fabda1f3e03f02d3221911f8728b1e01ca")...
common_download_file_single_online: downloading from https://huggingface.co/unsloth/Qwen3.5-0.8B-GGUF/resolve/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf to /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/bd258782e35f7f458f8aced1adc053e6e92e89bc735ba3be89d38a06121dc517.downloadInProgress (etag:"de9bcb3f1b16e6e33ab42b3851e80945d4153631418c8494a77bfa46b3ac70ec")...

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/bd258782e35f7f458f8aced1adc053e6e92e89bc735ba3be89d38a06121dc517
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs/main
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf

$ cat /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs/main
6ab461498e2023f6e3c1baea90a8f0fe38ab64d0
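
The layout above is the standard Hugging Face hub cache: content-addressed files under blobs/ (named by their ETag, a SHA-256 for LFS files), refs/main holding the resolved revision, and snapshots/<rev>/ exposing the original filenames. A rough sketch of the path mapping, assuming `<filesystem>`; names are illustrative, not the PR's code.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// "unsloth/Qwen3.5-0.8B-GGUF" -> "models--unsloth--Qwen3.5-0.8B-GGUF"
static std::string hf_repo_dir_name(std::string repo) {
    for (size_t pos = 0; (pos = repo.find('/', pos)) != std::string::npos; pos += 2) {
        repo.replace(pos, 1, "--");
    }
    return "models--" + repo;
}

// e.g. ~/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/
//          snapshots/6ab46149.../Qwen3.5-0.8B-Q4_K_M.gguf
static fs::path hf_snapshot_path(const fs::path & hf_home, const std::string & repo,
                                 const std::string & revision, const std::string & file) {
    return hf_home / "hub" / hf_repo_dir_name(repo) / "snapshots" / revision / file;
}

// Blobs are named by the file's ETag (a SHA-256 for LFS files).
static fs::path hf_blob_path(const fs::path & hf_home, const std::string & repo,
                             const std::string & etag) {
    return hf_home / "hub" / hf_repo_dir_name(repo) / "blobs" / etag;
}
```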

# Load already downloaded

$ build/bin/llama-server -v -hf unsloth/Qwen3.5-0.8B-GGUF
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: using cached file: /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf
common_download_file_single_online: using cached file: /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf


# Load already downloaded in offline mode

$ build/bin/llama-server -v -hf unsloth/Qwen3.5-0.8B-GGUF --offline
migrate_old_cache_to_hf_cache: skipping migration in offline mode (will run when online)
common_download_file_single: required file is not available in cache (offline mode): /home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_preset.ini
no remote preset found, skipping
common_download_file_single: using cached file (offline mode): /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf
common_download_file_single: using cached file (offline mode): /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf


# Show from hf

$ hf cache list
ID                              SIZE     LAST_ACCESSED     LAST_MODIFIED     REFS
------------------------------- -------- ----------------- ----------------- ----
model/unsloth/Qwen3.5-0.8B-GGUF   739.9M a few seconds ago a few seconds ago main
Found 1 repo(s) for a total of 1 revision(s) and 739.9M on disk.


# Load already downloaded by hf

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ hf download unsloth/Qwen3.5-0.8B-GGUF --include *Q4_K_M*
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
find: ‘/home/angt/.cache/llama.cpp’: No such file or directory
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/.check_for_update_done
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/bd258782e35f7f458f8aced1adc053e6e92e89bc735ba3be89d38a06121dc517
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs/main
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf
/home/angt/.cache/huggingface/hub/.locks
/home/angt/.cache/huggingface/hub/.locks/models--unsloth--Qwen3.5-0.8B-GGUF
/home/angt/.cache/huggingface/xet
/home/angt/.cache/huggingface/xet/logs
/home/angt/.cache/huggingface/xet/logs/xet_20260322T104858688+0000_16986.log
/home/angt/.cache/huggingface/xet/https___cas_serv-tGqkUaZf_CBPHQ6h
/home/angt/.cache/huggingface/xet/https___cas_serv-tGqkUaZf_CBPHQ6h/staging

$ build/bin/llama-server -v -hf unsloth/Qwen3.5-0.8B-GGUF
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: using cached file: /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6
common_download_file_single_online: downloading from https://huggingface.co/unsloth/Qwen3.5-0.8B-GGUF/resolve/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf to /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6.downloadInProgress (etag:"3454e627e6acd0d0970c2e8844f617fabda1f3e03f02d3221911f8728b1e01ca")...


# Load with old cache

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ cp -rf /home/angt/.cache/llama.cpp.old.2 /home/angt/.cache/llama.cpp

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_mmproj-F16.gguf.etag
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_Qwen3.5-0.8B-Q4_K_M.gguf.etag
/home/angt/.cache/llama.cpp/manifest=unsloth=Qwen3.5-0.8B-GGUF=latest.json
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_Qwen3.5-0.8B-Q4_K_M.gguf
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_mmproj-F16.gguf
find: ‘/home/angt/.cache/huggingface’: No such file or directory

$ build/bin/llama-server -v -hf unsloth/Qwen3.5-0.8B-GGUF
migrate_single_file: migrated unsloth_Qwen3.5-0.8B-GGUF_Qwen3.5-0.8B-Q4_K_M.gguf -> /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf
migrate_single_file: migrated unsloth_Qwen3.5-0.8B-GGUF_mmproj-F16.gguf -> /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-F16.gguf
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: using cached file: /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6
common_download_file_single_online: downloading from https://huggingface.co/unsloth/Qwen3.5-0.8B-GGUF/resolve/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf to /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6.downloadInProgress (etag:"3454e627e6acd0d0970c2e8844f617fabda1f3e03f02d3221911f8728b1e01ca")...


# Check caches after migration

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/bd258782e35f7f458f8aced1adc053e6e92e89bc735ba3be89d38a06121dc517
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/56e4c6cfe73b0c82e3e82bc518d7591997e61d81f723fc41a586f4fa69ea2453
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs/main
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-F16.gguf

$ cat /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs/main
6ab461498e2023f6e3c1baea90a8f0fe38ab64d0
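
A hypothetical reading of what the `migrate_single_file` step logged above does: move the old flat-cache file into the blob store, then expose it under the snapshot directory. Whether the PR hard-links, symlinks, or copies, and how it resolves the revision and ETag from the old manifest, is not shown here; names are illustrative.

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Sketch only: relocate one file from the old ~/.cache/llama.cpp layout
// into the HF cache layout observed in the `find` output above.
static bool migrate_single_file(const fs::path & old_file,
                                const fs::path & blob,          // blobs/<etag>
                                const fs::path & snapshot_file) // snapshots/<rev>/<name>
{
    std::error_code ec;
    fs::create_directories(blob.parent_path(), ec);
    fs::create_directories(snapshot_file.parent_path(), ec);
    fs::rename(old_file, blob, ec); // same filesystem in the common case
    if (ec) {
        return false;
    }
    fs::create_hard_link(blob, snapshot_file, ec); // a symlink or copy would also fit
    return !ec;
}
```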

# Check --cache-list

$ build/bin/llama-server --cache-list
number of models in cache: 1
   1. unsloth/Qwen3.5-0.8B-GGUF:Q4_K_M


# Load Q4_0

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ build/bin/llama-server -v -hf unsloth/Qwen3.5-0.8B-GGUF:Q4_0
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/444406ddd926550c724ec18d5120a9d40ded44908a063b0e66e9a7e5464c652c
common_download_file_single_online: downloading from https://huggingface.co/unsloth/Qwen3.5-0.8B-GGUF/resolve/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_0.gguf to /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/444406ddd926550c724ec18d5120a9d40ded44908a063b0e66e9a7e5464c652c.downloadInProgress (etag:"ae418b071a9fb1f47f0ca8a8e73a09ef3c24db13e8cd8a1573da8946d0d6b0ae")...
common_download_file_single_online: downloading from https://huggingface.co/unsloth/Qwen3.5-0.8B-GGUF/resolve/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf to /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6.downloadInProgress (etag:"3454e627e6acd0d0970c2e8844f617fabda1f3e03f02d3221911f8728b1e01ca")...

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/444406ddd926550c724ec18d5120a9d40ded44908a063b0e66e9a7e5464c652c
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs/main
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_0.gguf
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf


# Check --cache-list

$ build/bin/llama-server --cache-list
number of models in cache: 1
   1. unsloth/Qwen3.5-0.8B-GGUF:Q4_0


# Check split gguf

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ build/bin/llama-server -v -hf angt/test-split-model-stories260K
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694
common_download_file_single_online: downloading from https://huggingface.co/angt/test-split-model-stories260K/resolve/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf to /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694.downloadInProgress (etag:"a62dae7ae4cb9c28eb85f5b544d5848c664c592ccde91676d65c18ad23c231db")...
common_download_file_single_online: downloading from https://huggingface.co/angt/test-split-model-stories260K/resolve/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf to /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c.downloadInProgress (etag:"63c5624a34c15c7d80ac1e9793dd041b1b372c432adb989f66a81b99c1bbe22f")...

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs/main
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf


# Check --cache-list

$ build/bin/llama-server --cache-list
number of models in cache: 1
   1. angt/test-split-model-stories260K:F32
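
Note how --cache-list derives the quant tag from the cached filenames (Q4_K_M, Q4_0, and F32 for the split model above). A hypothetical sketch of that inference, scanning a snapshot filename for a known quant suffix; the regex and helper name are assumptions, not the PR's code.

```cpp
#include <cctype>
#include <regex>
#include <string>

// "stories260K-f32-00001-of-00002.gguf" -> "F32"
// "Qwen3.5-0.8B-Q4_K_M.gguf"            -> "Q4_K_M"
static std::string guess_quant_tag(const std::string & fname) {
    static const std::regex re(
        "(?:^|[-_.])((?:[qQ][0-9][0-9A-Za-z_]*|[fF](?:16|32)|[bB][fF]16|MXFP4))(?:[-_.]|$)");
    std::smatch m;
    if (std::regex_search(fname, m, re)) {
        std::string tag = m[1].str();
        for (auto & c : tag) {
            c = (char) toupper((unsigned char) c);
        }
        return tag;
    }
    return "";
}
```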


# Check with hf download

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ hf download angt/test-split-model-stories260K
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
find: ‘/home/angt/.cache/llama.cpp’: No such file or directory
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/.check_for_update_done
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/a6c45e2117353b8b9df7b5f0638a3ef76d35a57b
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs/main
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/.gitattributes
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf
/home/angt/.cache/huggingface/hub/.locks
/home/angt/.cache/huggingface/hub/.locks/models--angt--test-split-model-stories260K
/home/angt/.cache/huggingface/xet
/home/angt/.cache/huggingface/xet/logs
/home/angt/.cache/huggingface/xet/logs/xet_20260322T104904162+0000_17074.log
/home/angt/.cache/huggingface/xet/https___cas_serv-tGqkUaZf_CBPHQ6h
/home/angt/.cache/huggingface/xet/https___cas_serv-tGqkUaZf_CBPHQ6h/staging


# Load with old cache, with a tag

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ cp -rf /home/angt/.cache/llama.cpp.old.3 /home/angt/.cache/llama.cpp

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-2B-GGUF_Qwen3.5-2B-Q3_K_S.gguf
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-2B-GGUF_mmproj-F16.gguf.etag
/home/angt/.cache/llama.cpp/manifest=unsloth=Qwen3.5-2B-GGUF=Q3_K_S.json
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-2B-GGUF_Qwen3.5-2B-Q3_K_S.gguf.etag
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-2B-GGUF_mmproj-F16.gguf
find: ‘/home/angt/.cache/huggingface’: No such file or directory

$ build/bin/llama-server -v -hf angt/test-split-model-stories260K
migrate_single_file: migrated unsloth_Qwen3.5-2B-GGUF_Qwen3.5-2B-Q3_K_S.gguf -> /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/snapshots/f6d5376be1edb4d416d56da11e5397a961aca8ae/Qwen3.5-2B-Q3_K_S.gguf
migrate_single_file: migrated unsloth_Qwen3.5-2B-GGUF_mmproj-F16.gguf -> /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/snapshots/f6d5376be1edb4d416d56da11e5397a961aca8ae/mmproj-F16.gguf
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694
common_download_file_single_online: downloading from https://huggingface.co/angt/test-split-model-stories260K/resolve/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf to /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c.downloadInProgress (etag:"63c5624a34c15c7d80ac1e9793dd041b1b372c432adb989f66a81b99c1bbe22f")...
common_download_file_single_online: downloading from https://huggingface.co/angt/test-split-model-stories260K/resolve/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf to /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694.downloadInProgress (etag:"a62dae7ae4cb9c28eb85f5b544d5848c664c592ccde91676d65c18ad23c231db")...


# Check caches after migration

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/blobs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/blobs/21026bce70a757887bce861047c26966109206ebe2adeb7b662de9a179952d28
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/blobs/7035e9cb8d7c6a9681d07eef9a364783e86ea4cd73faab2eabb4f43a101830c7
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/refs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/refs/main
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/snapshots
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/snapshots/f6d5376be1edb4d416d56da11e5397a961aca8ae
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/snapshots/f6d5376be1edb4d416d56da11e5397a961aca8ae/Qwen3.5-2B-Q3_K_S.gguf
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/snapshots/f6d5376be1edb4d416d56da11e5397a961aca8ae/mmproj-F16.gguf
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs/main
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf


# Check --cache-list

$ build/bin/llama-server --cache-list
number of models in cache: 2
   1. unsloth/Qwen3.5-2B-GGUF:Q3_K_S
   2. angt/test-split-model-stories260K:F32


# Check --hf-file

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ build/bin/llama-server -v --hf-repo ggml-org/models --hf-file bert-bge-small/ggml-model-f16.gguf
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/ggml-org_models_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs/f0b2fef971e8366438bfd2d9aefea1b0115919389448806d290237f638bae999
common_download_file_single_online: downloading from https://huggingface.co/ggml-org/models/resolve/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf to /home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs/f0b2fef971e8366438bfd2d9aefea1b0115919389448806d290237f638bae999.downloadInProgress (etag:"6d8e67bea957b6d92bf0339c873a4abe061e888bcda60e3b71a48212f840e91c")...

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--ggml-org--models
/home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs
/home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs/f0b2fef971e8366438bfd2d9aefea1b0115919389448806d290237f638bae999
/home/angt/.cache/huggingface/hub/models--ggml-org--models/refs
/home/angt/.cache/huggingface/hub/models--ggml-org--models/refs/main
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf


# Check --cache-list

$ build/bin/llama-server --cache-list
number of models in cache: 1
   1. ggml-org/models:F16


# Check degraded mode (no redownload)

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ build/bin/llama-server -v --hf-repo ggml-org/models --hf-file bert-bge-small/ggml-model-f16.gguf
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/ggml-org_models_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs/f0b2fef971e8366438bfd2d9aefea1b0115919389448806d290237f638bae999
common_download_file_single_online: downloading from https://huggingface.co/ggml-org/models/resolve/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf to /home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs/f0b2fef971e8366438bfd2d9aefea1b0115919389448806d290237f638bae999.downloadInProgress (etag:"6d8e67bea957b6d92bf0339c873a4abe061e888bcda60e3b71a48212f840e91c")...

$ rm -- /home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf

$ mv /home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs/f0b2fef971e8366438bfd2d9aefea1b0115919389448806d290237f638bae999 /home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf

$ find /home/angt/.cache/huggingface/hub/models--ggml-org--models
/home/angt/.cache/huggingface/hub/models--ggml-org--models
/home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs
/home/angt/.cache/huggingface/hub/models--ggml-org--models/refs
/home/angt/.cache/huggingface/hub/models--ggml-org--models/refs/main
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf

$ build/bin/llama-server -v --hf-repo ggml-org/models --hf-file bert-bge-small/ggml-model-f16.gguf
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/ggml-org_models_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: using cached file: /home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf

$ find /home/angt/.cache/huggingface/hub/models--ggml-org--models
/home/angt/.cache/huggingface/hub/models--ggml-org--models
/home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs
/home/angt/.cache/huggingface/hub/models--ggml-org--models/refs
/home/angt/.cache/huggingface/hub/models--ggml-org--models/refs/main
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf
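
What this degraded-mode test suggests is that the loader accepts any regular file found at the snapshot path, even when the backing blob is gone, and does not re-download. A minimal sketch of that assumed lookup order (not the PR's exact code):

```cpp
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// A file at the snapshot path wins, whether it is a link into blobs/
// or a plain file moved there by hand; only a missing snapshot entry
// should trigger a fresh download.
static bool have_cached_file(const fs::path & snapshot_file) {
    std::error_code ec;
    return fs::is_regular_file(snapshot_file, ec); // follows symlinks
}
```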

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

angt commented Mar 23, 2026

Let's go, @ngxson @ggerganov?

@angt angt merged commit 8c7957c into ggml-org:master Mar 24, 2026
51 checks passed
@ggerganov
Member

I think we should add a prominent notification at the top of the README.md about this change - I expect some level of confusion from the migration.


angt commented Mar 24, 2026

Do you want a bigger WARNING with an explanation when the files are migrated the first time?

@ggerganov
Member

I guess a warning in the logs might be useful too.


angt commented Mar 24, 2026

see #20935


CISC commented Mar 24, 2026

https://github.com/ggml-org/llama.cpp/actions/runs/23476321133/job/68309759940

angt commented Mar 24, 2026

This was not tested in the PR?

CISC commented Mar 24, 2026

> https://github.com/ggml-org/llama.cpp/actions/runs/23476321133/job/68309759940
>
> This was not tested in the PR?

Sanitizer jobs (and server-metal, which is also generally failing now) are manual outside of master since #20546


angt commented Mar 24, 2026

Thanks, I missed that.


angt commented Mar 24, 2026

see #20946


wbste commented Mar 25, 2026

Yikes! Any way to disable this or hide models from HF_HOME? I'm seeing EVERYTHING in my cache, even stuff llama.cpp can't run (safetensors, .gitattributes, etc.). I purposely keep my GGUF files in another location so I don't mix HF stuff with llama.cpp. Thoughts on how to handle this?

@WhyNotHugo
Contributor

Regresses: #21280

Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
* common : add standard Hugging Face cache support

- Use HF API to find all files
- Migrate all manifests to hugging face cache at startup

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Check with the quant tag

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Cleanup

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Improve error handling and report API errors

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Restore common_cached_model_info and align mmproj filtering

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Prefer main when getting cached ref

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Use cached files when HF API fails

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Use final_path..

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

* Check all inputs

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

---------

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
julien-c added a commit to huggingface/hub-docs that referenced this pull request May 4, 2026
…maBarn (#2453)

Move llama.cpp and HuggingFaceModelDownloader under a new Applications
table, add LlamaBarn, and replace the "Work in progress" note for
llama.cpp with a link to ggml-org/llama.cpp#20775 which added standard
Hugging Face cache support.

Co-authored-by: julien-agent <Agents+cyolo@huggingface.co>

Labels: examples, python, server
