common : add standard Hugging Face cache support by angt · Pull Request #20775 · ggml-org/llama.cpp

angt · 2026-03-19T21:10:57Z

Use the HF API to find all files
Migrate all manifests to the Hugging Face cache at startup

angt · 2026-03-19T21:13:32Z

WARNING: Do not test without taking care of your cache, or you'll regret it. There is no going back. 😬

ggerganov · 2026-03-20T08:42:07Z

Is it going to handle correctly repos that require HF token (e.g. gated, private)? I think it will back out from the migration of that specific manifest, correct?

angt · 2026-03-20T08:53:23Z

> Is it going to handle correctly repos that require HF token (e.g. gated, private)? I think it will back out from the migration of that specific manifest, correct?

I need to test this. I'm afraid that without the token there is no way to migrate correctly.

ggerganov · 2026-03-20T08:56:49Z

> > Is it going to handle correctly repos that require HF token (e.g. gated, private)? I think it will back out from the migration of that specific manifest, correct?
>
> I need to test this. I'm afraid that without the token there is no way to migrate correctly.

So is the current logic that the migration will only happen if an HF token is provided? I think that makes sense.
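
For illustration only: the thread does not confirm the final behaviour, but the "back out of that specific manifest" idea could look like the sketch below. `plan_migration` and `probe` are hypothetical names, and treating any non-200 status as "skip this repo" is an assumption, not something the PR is known to do.

```python
from typing import Callable, Iterable, List, Tuple


def plan_migration(
    repos: Iterable[str],
    probe: Callable[[str], int],
) -> Tuple[List[str], List[str]]:
    """Split repos into (to_migrate, skipped).

    `probe` is a hypothetical callable that returns the HTTP status
    obtained when asking the HF API about a repo. Any status other
    than 200 (for example when a token is required but missing) makes
    us back out of that repo only, leaving its old files untouched.
    """
    to_migrate: List[str] = []
    skipped: List[str] = []
    for repo in repos:
        if probe(repo) == 200:
            to_migrate.append(repo)
        else:
            skipped.append(repo)
    return to_migrate, skipped
```

The point of the per-repo decision is that one inaccessible repo does not block the migration of the others.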

- Use HF API to find all files
- Migrate all manifests to hugging face cache at startup

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

julien-c · 2026-03-20T15:26:35Z

tried it locally, worked well!

number of models in cache: 9
   1. bartowski/Qwen2.5-7B-Instruct-GGUF:Q4_K_M
   2. ggml-org/Nemotron-Nano-3-30B-A3B-GGUF:Q4_K_M
   3. ggml-org/gemma-3-1b-it-GGUF:Q4_K_M
   4. ggml-org/gemma-3-4b-it-GGUF:Q4_K_M
   5. ggml-org/gpt-oss-20b-GGUF:MXFP4
   6. lmstudio-ai/gemma-2b-it-GGUF:Q4_K_M
   7. lmstudio-ai/gemma-2b-it-GGUF:Q8_0
   8. unsloth/Qwen3.5-4B-GGUF:Q4_K_XL
   9. unsloth/Qwen3.5-9B-GGUF:Q4_K_XL

ggerganov

One failure case I can think of is when the user keeps the current llama.cpp cache on a larger disk, separate from the one that holds the HF cache. The migration would then move files from the larger disk to the smaller one, which might fill up in the process. But I don't think we have a way to prevent that if we want the migration to be automatic.
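
To make this failure case concrete, here is a small sketch of how the situation could at least be detected ahead of time. It is not part of this PR, and all helper names are made up for the example; it only shows that "different disk" and "not enough room" are both observable before anything is moved.

```python
import os
import shutil


def dir_size(path: str) -> int:
    """Total size in bytes of the regular files under `path`."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if not os.path.islink(full):
                total += os.path.getsize(full)
    return total


def same_filesystem(a: str, b: str) -> bool:
    """True when both existing paths live on the same device."""
    return os.stat(a).st_dev == os.stat(b).st_dev


def migration_would_fit(old_cache: str, new_cache_parent: str) -> bool:
    """Rough pre-flight check before moving `old_cache` elsewhere."""
    if same_filesystem(old_cache, new_cache_parent):
        # A rename within one filesystem needs no extra space.
        return True
    free = shutil.disk_usage(new_cache_parent).free
    return dir_size(old_cache) <= free
```

Both directories must already exist for the checks to work, so a real caller would pass the nearest existing parent of the destination.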

angt · 2026-03-22T10:51:24Z

Here are my tests:


$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp


# Download model with llama.cpp

$ build/bin/llama-server -v -hf unsloth/Qwen3.5-0.8B-GGUF
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/bd258782e35f7f458f8aced1adc053e6e92e89bc735ba3be89d38a06121dc517
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6
common_download_file_single_online: downloading from https://huggingface.co/unsloth/Qwen3.5-0.8B-GGUF/resolve/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf to /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6.downloadInProgress (etag:"3454e627e6acd0d0970c2e8844f617fabda1f3e03f02d3221911f8728b1e01ca")...
common_download_file_single_online: downloading from https://huggingface.co/unsloth/Qwen3.5-0.8B-GGUF/resolve/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf to /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/bd258782e35f7f458f8aced1adc053e6e92e89bc735ba3be89d38a06121dc517.downloadInProgress (etag:"de9bcb3f1b16e6e33ab42b3851e80945d4153631418c8494a77bfa46b3ac70ec")...

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/bd258782e35f7f458f8aced1adc053e6e92e89bc735ba3be89d38a06121dc517
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs/main
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf

$ cat /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs/main
6ab461498e2023f6e3c1baea90a8f0fe38ab64d0
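
The listing above follows the standard Hugging Face hub cache layout: one `models--<owner>--<repo>` folder per repo, containing `blobs/`, `refs/` and `snapshots/<commit>/`. A minimal sketch of the path arithmetic follows; the function names are chosen for this example and are not llama.cpp functions.

```python
from pathlib import PurePosixPath


def repo_folder(repo_id: str) -> str:
    """'unsloth/Qwen3.5-0.8B-GGUF' -> 'models--unsloth--Qwen3.5-0.8B-GGUF'."""
    return "models--" + repo_id.replace("/", "--")


def ref_path(hub: PurePosixPath, repo_id: str, ref: str = "main") -> PurePosixPath:
    """File whose content is the commit hash the ref points to."""
    return hub / repo_folder(repo_id) / "refs" / ref


def snapshot_path(
    hub: PurePosixPath, repo_id: str, commit: str, filename: str
) -> PurePosixPath:
    """Where a repo file appears for a given commit."""
    return hub / repo_folder(repo_id) / "snapshots" / commit / filename


hub = PurePosixPath("/home/angt/.cache/huggingface/hub")
print(ref_path(hub, "unsloth/Qwen3.5-0.8B-GGUF"))
print(
    snapshot_path(
        hub,
        "unsloth/Qwen3.5-0.8B-GGUF",
        "6ab461498e2023f6e3c1baea90a8f0fe38ab64d0",
        "Qwen3.5-0.8B-Q4_K_M.gguf",
    )
)
```

Resolving a file therefore takes two steps: read the commit from `refs/main`, then look under `snapshots/<commit>/`.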

# Load already downloaded

$ build/bin/llama-server -v -hf unsloth/Qwen3.5-0.8B-GGUF
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: using cached file: /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf
common_download_file_single_online: using cached file: /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf


# Load already downloaded in offline mode

$ build/bin/llama-server -v -hf unsloth/Qwen3.5-0.8B-GGUF --offline
migrate_old_cache_to_hf_cache: skipping migration in offline mode (will run when online)
common_download_file_single: required file is not available in cache (offline mode): /home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_preset.ini
no remote preset found, skipping
common_download_file_single: using cached file (offline mode): /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf
common_download_file_single: using cached file (offline mode): /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf


# Show from hf

$ hf cache list
ID                              SIZE     LAST_ACCESSED     LAST_MODIFIED     REFS
------------------------------- -------- ----------------- ----------------- ----
model/unsloth/Qwen3.5-0.8B-GGUF   739.9M a few seconds ago a few seconds ago main

Found 1 repo(s) for a total of 1 revision(s) and 739.9M on disk.


# Load already downloaded by hf

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ hf download unsloth/Qwen3.5-0.8B-GGUF --include *Q4_K_M*
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
find: ‘/home/angt/.cache/llama.cpp’: No such file or directory
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/.check_for_update_done
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/bd258782e35f7f458f8aced1adc053e6e92e89bc735ba3be89d38a06121dc517
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs/main
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf
/home/angt/.cache/huggingface/hub/.locks
/home/angt/.cache/huggingface/hub/.locks/models--unsloth--Qwen3.5-0.8B-GGUF
/home/angt/.cache/huggingface/xet
/home/angt/.cache/huggingface/xet/logs
/home/angt/.cache/huggingface/xet/logs/xet_20260322T104858688+0000_16986.log
/home/angt/.cache/huggingface/xet/https___cas_serv-tGqkUaZf_CBPHQ6h
/home/angt/.cache/huggingface/xet/https___cas_serv-tGqkUaZf_CBPHQ6h/staging

$ build/bin/llama-server -v -hf unsloth/Qwen3.5-0.8B-GGUF
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: using cached file: /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6
common_download_file_single_online: downloading from https://huggingface.co/unsloth/Qwen3.5-0.8B-GGUF/resolve/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf to /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6.downloadInProgress (etag:"3454e627e6acd0d0970c2e8844f617fabda1f3e03f02d3221911f8728b1e01ca")...


# Load with old cache

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ cp -rf /home/angt/.cache/llama.cpp.old.2 /home/angt/.cache/llama.cpp

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_mmproj-F16.gguf.etag
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_Qwen3.5-0.8B-Q4_K_M.gguf.etag
/home/angt/.cache/llama.cpp/manifest=unsloth=Qwen3.5-0.8B-GGUF=latest.json
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_Qwen3.5-0.8B-Q4_K_M.gguf
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_mmproj-F16.gguf
find: ‘/home/angt/.cache/huggingface’: No such file or directory

$ build/bin/llama-server -v -hf unsloth/Qwen3.5-0.8B-GGUF
migrate_single_file: migrated unsloth_Qwen3.5-0.8B-GGUF_Qwen3.5-0.8B-Q4_K_M.gguf -> /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf
migrate_single_file: migrated unsloth_Qwen3.5-0.8B-GGUF_mmproj-F16.gguf -> /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-F16.gguf
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: using cached file: /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6
common_download_file_single_online: downloading from https://huggingface.co/unsloth/Qwen3.5-0.8B-GGUF/resolve/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf to /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6.downloadInProgress (etag:"3454e627e6acd0d0970c2e8844f617fabda1f3e03f02d3221911f8728b1e01ca")...


# Check caches after migration

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/bd258782e35f7f458f8aced1adc053e6e92e89bc735ba3be89d38a06121dc517
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/56e4c6cfe73b0c82e3e82bc518d7591997e61d81f723fc41a586f4fa69ea2453
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs/main
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_K_M.gguf
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-F16.gguf

$ cat /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs/main
6ab461498e2023f6e3c1baea90a8f0fe38ab64d0
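
Judging from the old-cache listing, the legacy layout encodes the repo in the file names themselves (`manifest=<owner>=<repo>=<tag>.json` and `<owner>_<repo>_<file>`). A rough sketch of how such names map back to a repo and a file name is shown below. The helpers are hypothetical, and the commit hash in the destination path cannot be derived from the old names; per the PR description it comes from the HF API.

```python
from typing import Optional, Tuple


def parse_manifest_name(name: str) -> Tuple[str, str, str]:
    """'manifest=<owner>=<repo>=<tag>.json' -> (owner, repo, tag)."""
    prefix, suffix = "manifest=", ".json"
    if not (name.startswith(prefix) and name.endswith(suffix)):
        raise ValueError(f"not a legacy manifest name: {name}")
    parts = name[len(prefix):-len(suffix)].split("=")
    if len(parts) != 3:
        raise ValueError(f"unexpected manifest name: {name}")
    owner, repo, tag = parts
    return owner, repo, tag


def legacy_to_repo_file(legacy_name: str, owner: str, repo: str) -> Optional[str]:
    """Strip the '<owner>_<repo>_' prefix, or return None if it is absent."""
    prefix = f"{owner}_{repo}_"
    if not legacy_name.startswith(prefix):
        return None
    return legacy_name[len(prefix):]


owner, repo, tag = parse_manifest_name(
    "manifest=unsloth=Qwen3.5-0.8B-GGUF=latest.json"
)
print(owner, repo, tag)
print(legacy_to_repo_file("unsloth_Qwen3.5-0.8B-GGUF_mmproj-F16.gguf", owner, repo))
```

Using the manifest to recover `<owner>` and `<repo>` first avoids guessing where the underscores split, since repo and file names may contain underscores themselves.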

# Check --cache-list

$ build/bin/llama-server --cache-list
number of models in cache: 1
   1. unsloth/Qwen3.5-0.8B-GGUF:Q4_K_M


# Load Q4_0

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ build/bin/llama-server -v -hf unsloth/Qwen3.5-0.8B-GGUF:Q4_0
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/unsloth_Qwen3.5-0.8B-GGUF_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/444406ddd926550c724ec18d5120a9d40ded44908a063b0e66e9a7e5464c652c
common_download_file_single_online: downloading from https://huggingface.co/unsloth/Qwen3.5-0.8B-GGUF/resolve/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_0.gguf to /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/444406ddd926550c724ec18d5120a9d40ded44908a063b0e66e9a7e5464c652c.downloadInProgress (etag:"ae418b071a9fb1f47f0ca8a8e73a09ef3c24db13e8cd8a1573da8946d0d6b0ae")...
common_download_file_single_online: downloading from https://huggingface.co/unsloth/Qwen3.5-0.8B-GGUF/resolve/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf to /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6.downloadInProgress (etag:"3454e627e6acd0d0970c2e8844f617fabda1f3e03f02d3221911f8728b1e01ca")...

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/444406ddd926550c724ec18d5120a9d40ded44908a063b0e66e9a7e5464c652c
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/blobs/d312c4d02fd46eea7a16e4f3bbb58840e6222209322ca1e33ca03247ad8935d6
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/refs/main
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/Qwen3.5-0.8B-Q4_0.gguf
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-0.8B-GGUF/snapshots/6ab461498e2023f6e3c1baea90a8f0fe38ab64d0/mmproj-BF16.gguf
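
In the log above each file is first written to `<blob>.downloadInProgress`, while the listing afterwards only contains the final blob names. One common way to get that effect is write-then-rename. The sketch below shows the pattern; it is an assumption about the approach, not a copy of the llama.cpp code, and `store_blob` is a made-up name.

```python
import os
from typing import Iterable


def store_blob(blob_path: str, chunks: Iterable[bytes]) -> str:
    """Write `chunks` to a temporary file, then rename it into place.

    Readers never see a half-written blob: the final name only
    appears once the whole content is on disk.
    """
    tmp_path = blob_path + ".downloadInProgress"
    os.makedirs(os.path.dirname(blob_path), exist_ok=True)
    with open(tmp_path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
    # os.replace overwrites an existing destination and is atomic
    # when source and destination are on the same filesystem.
    os.replace(tmp_path, blob_path)
    return blob_path
```

If the process is interrupted mid-download, only the `.downloadInProgress` file is left behind, so the cache never mistakes a partial file for a complete one.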


# Check --cache-list

$ build/bin/llama-server --cache-list
number of models in cache: 1
   1. unsloth/Qwen3.5-0.8B-GGUF:Q4_0


# Check split gguf

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ build/bin/llama-server -v -hf angt/test-split-model-stories260K
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694
common_download_file_single_online: downloading from https://huggingface.co/angt/test-split-model-stories260K/resolve/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf to /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694.downloadInProgress (etag:"a62dae7ae4cb9c28eb85f5b544d5848c664c592ccde91676d65c18ad23c231db")...
common_download_file_single_online: downloading from https://huggingface.co/angt/test-split-model-stories260K/resolve/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf to /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c.downloadInProgress (etag:"63c5624a34c15c7d80ac1e9793dd041b1b372c432adb989f66a81b99c1bbe22f")...

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs/main
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf


# Check --cache-list

$ build/bin/llama-server --cache-list
number of models in cache: 1
   1. angt/test-split-model-stories260K:F32
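
`--cache-list` reports the two shards as a single `F32` entry. A naive heuristic that reproduces the tags seen so far (`Q4_K_M`, `Q4_0`, `F32`) is sketched below. It is only a guess at the kind of logic involved, not llama.cpp's actual implementation, and the function names are invented.

```python
import re
from typing import Tuple

# Matches the shard suffix seen above, e.g. "-00001-of-00002".
SPLIT_RE = re.compile(r"-(\d{5})-of-(\d{5})$")


def split_info(filename: str) -> Tuple[str, int, int]:
    """Return (base name, shard index, shard count) for a GGUF file name."""
    stem = filename[:-len(".gguf")] if filename.endswith(".gguf") else filename
    m = SPLIT_RE.search(stem)
    if m:
        return stem[:m.start()], int(m.group(1)), int(m.group(2))
    return stem, 1, 1


def guess_tag(filename: str) -> str:
    """Take the last '-'-separated token of the base name, upper-cased."""
    base, _index, _count = split_info(filename)
    return base.rsplit("-", 1)[-1].upper()


for name in (
    "Qwen3.5-0.8B-Q4_K_M.gguf",
    "Qwen3.5-0.8B-Q4_0.gguf",
    "stories260K-f32-00001-of-00002.gguf",
):
    print(name, "->", guess_tag(name))
```

Stripping the shard suffix before looking for the tag is what lets both shards collapse into one listed model.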


# Check with hf download

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ hf download angt/test-split-model-stories260K
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
find: ‘/home/angt/.cache/llama.cpp’: No such file or directory
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/.check_for_update_done
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/a6c45e2117353b8b9df7b5f0638a3ef76d35a57b
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs/main
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/.gitattributes
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf
/home/angt/.cache/huggingface/hub/.locks
/home/angt/.cache/huggingface/hub/.locks/models--angt--test-split-model-stories260K
/home/angt/.cache/huggingface/xet
/home/angt/.cache/huggingface/xet/logs
/home/angt/.cache/huggingface/xet/logs/xet_20260322T104904162+0000_17074.log
/home/angt/.cache/huggingface/xet/https___cas_serv-tGqkUaZf_CBPHQ6h
/home/angt/.cache/huggingface/xet/https___cas_serv-tGqkUaZf_CBPHQ6h/staging


# Load with old cache, with a tag

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ cp -rf /home/angt/.cache/llama.cpp.old.3 /home/angt/.cache/llama.cpp

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-2B-GGUF_Qwen3.5-2B-Q3_K_S.gguf
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-2B-GGUF_mmproj-F16.gguf.etag
/home/angt/.cache/llama.cpp/manifest=unsloth=Qwen3.5-2B-GGUF=Q3_K_S.json
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-2B-GGUF_Qwen3.5-2B-Q3_K_S.gguf.etag
/home/angt/.cache/llama.cpp/unsloth_Qwen3.5-2B-GGUF_mmproj-F16.gguf
find: ‘/home/angt/.cache/huggingface’: No such file or directory

$ build/bin/llama-server -v -hf angt/test-split-model-stories260K
migrate_single_file: migrated unsloth_Qwen3.5-2B-GGUF_Qwen3.5-2B-Q3_K_S.gguf -> /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/snapshots/f6d5376be1edb4d416d56da11e5397a961aca8ae/Qwen3.5-2B-Q3_K_S.gguf
migrate_single_file: migrated unsloth_Qwen3.5-2B-GGUF_mmproj-F16.gguf -> /home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/snapshots/f6d5376be1edb4d416d56da11e5397a961aca8ae/mmproj-F16.gguf
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694
common_download_file_single_online: downloading from https://huggingface.co/angt/test-split-model-stories260K/resolve/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf to /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c.downloadInProgress (etag:"63c5624a34c15c7d80ac1e9793dd041b1b372c432adb989f66a81b99c1bbe22f")...
common_download_file_single_online: downloading from https://huggingface.co/angt/test-split-model-stories260K/resolve/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf to /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694.downloadInProgress (etag:"a62dae7ae4cb9c28eb85f5b544d5848c664c592ccde91676d65c18ad23c231db")...


# Check caches after migration

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/blobs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/blobs/21026bce70a757887bce861047c26966109206ebe2adeb7b662de9a179952d28
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/blobs/7035e9cb8d7c6a9681d07eef9a364783e86ea4cd73faab2eabb4f43a101830c7
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/refs
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/refs/main
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/snapshots
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/snapshots/f6d5376be1edb4d416d56da11e5397a961aca8ae
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/snapshots/f6d5376be1edb4d416d56da11e5397a961aca8ae/Qwen3.5-2B-Q3_K_S.gguf
/home/angt/.cache/huggingface/hub/models--unsloth--Qwen3.5-2B-GGUF/snapshots/f6d5376be1edb4d416d56da11e5397a961aca8ae/mmproj-F16.gguf
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs/main
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf
/home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf


# Check --cache-list

$ build/bin/llama-server --cache-list
number of models in cache: 2
   1. unsloth/Qwen3.5-2B-GGUF:Q3_K_S
   2. angt/test-split-model-stories260K:F32


# Check --hf-file

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ build/bin/llama-server -v --hf-repo ggml-org/models --hf-file bert-bge-small/ggml-model-f16.gguf
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/ggml-org_models_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs/f0b2fef971e8366438bfd2d9aefea1b0115919389448806d290237f638bae999
common_download_file_single_online: downloading from https://huggingface.co/ggml-org/models/resolve/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf to /home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs/f0b2fef971e8366438bfd2d9aefea1b0115919389448806d290237f638bae999.downloadInProgress (etag:"6d8e67bea957b6d92bf0339c873a4abe061e888bcda60e3b71a48212f840e91c")...

$ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
/home/angt/.cache/llama.cpp
/home/angt/.cache/huggingface
/home/angt/.cache/huggingface/hub
/home/angt/.cache/huggingface/hub/models--ggml-org--models
/home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs
/home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs/f0b2fef971e8366438bfd2d9aefea1b0115919389448806d290237f638bae999
/home/angt/.cache/huggingface/hub/models--ggml-org--models/refs
/home/angt/.cache/huggingface/hub/models--ggml-org--models/refs/main
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf


# Check --cache-list

$ build/bin/llama-server --cache-list
number of models in cache: 1
   1. ggml-org/models:F16


# Check degraded mode (no redownload)

$ rm -rf /home/angt/.cache/huggingface /home/angt/.cache/llama.cpp

$ build/bin/llama-server -v --hf-repo ggml-org/models --hf-file bert-bge-small/ggml-model-f16.gguf
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/ggml-org_models_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: no previous model file found /home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs/f0b2fef971e8366438bfd2d9aefea1b0115919389448806d290237f638bae999
common_download_file_single_online: downloading from https://huggingface.co/ggml-org/models/resolve/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf to /home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs/f0b2fef971e8366438bfd2d9aefea1b0115919389448806d290237f638bae999.downloadInProgress (etag:"6d8e67bea957b6d92bf0339c873a4abe061e888bcda60e3b71a48212f840e91c")...

$ rm -- /home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf

$ mv /home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs/f0b2fef971e8366438bfd2d9aefea1b0115919389448806d290237f638bae999 /home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf

$ find /home/angt/.cache/huggingface/hub/models--ggml-org--models
/home/angt/.cache/huggingface/hub/models--ggml-org--models
/home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs
/home/angt/.cache/huggingface/hub/models--ggml-org--models/refs
/home/angt/.cache/huggingface/hub/models--ggml-org--models/refs/main
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf

$ build/bin/llama-server -v --hf-repo ggml-org/models --hf-file bert-bge-small/ggml-model-f16.gguf
common_download_file_single_online: no previous model file found /home/angt/.cache/llama.cpp/ggml-org_models_preset.ini
common_download_file_single_online: HEAD failed, status: 404
no remote preset found, skipping
common_download_file_single_online: using cached file: /home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf

$ find /home/angt/.cache/huggingface/hub/models--ggml-org--models
/home/angt/.cache/huggingface/hub/models--ggml-org--models
/home/angt/.cache/huggingface/hub/models--ggml-org--models/blobs
/home/angt/.cache/huggingface/hub/models--ggml-org--models/refs
/home/angt/.cache/huggingface/hub/models--ggml-org--models/refs/main
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small
/home/angt/.cache/huggingface/hub/models--ggml-org--models/snapshots/499bc8821c6b12b4e53c5bffcb21ec206f212d81/bert-bge-small/ggml-model-f16.gguf

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
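
The transcript above relies on the standard Hugging Face hub cache layout: a `models--<org>--<name>` directory containing `blobs/`, `refs/` and `snapshots/<commit>/`. The lookup it exercises can be sketched in Python as below. This is an illustration only, not the C++ code from this PR; the function names `repo_dir` and `resolve_cached_file` are hypothetical. It assumes `refs/main` holds the commit hash, and it treats a regular file under `snapshots/` the same as a symlink into `blobs/`, which is what the second `llama-server` run in the log appears to accept.

```python
from pathlib import Path
from typing import Optional


def repo_dir(hub: Path, repo_id: str) -> Path:
    # "ggml-org/models" -> "models--ggml-org--models", matching the paths in the log
    return hub / ("models--" + repo_id.replace("/", "--"))


def resolve_cached_file(hub: Path, repo_id: str, filename: str,
                        ref: str = "main") -> Optional[Path]:
    """Return the snapshot path for `filename`, or None if it is not cached."""
    root = repo_dir(hub, repo_id)
    ref_file = root / "refs" / ref
    if not ref_file.is_file():
        return None
    # refs/<ref> stores the commit hash that names the snapshot directory
    commit = ref_file.read_text().strip()
    candidate = root / "snapshots" / commit / filename
    # exists() follows symlinks, so a symlink into blobs/ and a plain file both count
    return candidate if candidate.exists() else None
```

With the hub rooted at `~/.cache/huggingface/hub`, calling `resolve_cached_file(hub, "ggml-org/models", "bert-bge-small/ggml-model-f16.gguf")` would point at the same snapshot path that the log reports as "using cached file".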

angt · 2026-03-23T21:29:36Z

Let's go @ngxson @ggerganov ?

ggerganov · 2026-03-24T06:37:53Z

I think we should add a prominent notification at the top of the README.md about this change. I expect some level of confusion from the migration.

angt · 2026-03-24T06:56:07Z

Do you want a bigger WARNING with an explanation when the files are migrated for the first time?

ggerganov · 2026-03-24T07:09:32Z

I guess a warning in the logs might be useful too.

angt · 2026-03-24T08:09:23Z

see #20935

CISC · 2026-03-24T08:44:52Z

https://github.com/ggml-org/llama.cpp/actions/runs/23476321133/job/68309759940

angt · 2026-03-24T09:16:43Z

https://github.com/ggml-org/llama.cpp/actions/runs/23476321133/job/68309759940

This was not tested in the PR?

CISC · 2026-03-24T09:23:16Z

https://github.com/ggml-org/llama.cpp/actions/runs/23476321133/job/68309759940

This was not tested in the PR?

Sanitizer jobs (and server-metal, which is also generally failing now) are manual outside of master since #20546

angt · 2026-03-24T09:26:14Z

Thanks, I missed that.

angt · 2026-03-24T09:59:19Z

see #20946

wbste · 2026-03-25T14:40:01Z

Yikes! Any way to disable this or hide models from HF_HOME? I'm seeing EVERYTHING in my cache, even stuff llama.cpp can't run (safetensors, .gitattributes, etc.). I purposely keep my GGUF files in another location so I don't mix HF stuff with llama.cpp. Thoughts on how to handle this?
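
One way a listing tool could narrow a shared HF cache to files llama.cpp can load is to keep only `*.gguf` entries under each repo's `snapshots/` directory. The Python sketch below illustrates that idea; it is not llama.cpp's behavior and not a fix from this PR, and the helper name `list_gguf_files` is hypothetical.

```python
from pathlib import Path
from typing import List


def list_gguf_files(hub: Path) -> List[Path]:
    """Collect every *.gguf entry found under models--*/snapshots/ in `hub`."""
    found: List[Path] = []
    for repo in sorted(hub.glob("models--*")):
        snapshots = repo / "snapshots"
        if not snapshots.is_dir():
            continue
        # rglob matches on the entry name, so safetensors, .gitattributes and
        # other non-GGUF files in the same snapshot are skipped
        found.extend(sorted(snapshots.rglob("*.gguf")))
    return found
```

A filter like this only hides non-GGUF files; it does not separate GGUF models that were downloaded by other tools into the same cache.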

WhyNotHugo · 2026-04-01T21:52:47Z

Regresses: #21280

* common : add standard Hugging Face cache support
  - Use HF API to find all files
  - Migrate all manifests to hugging face cache at startup
  Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Check with the quant tag
  Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Cleanup
  Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Improve error handling and report API errors
  Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Restore common_cached_model_info and align mmproj filtering
  Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Prefer main when getting cached ref
  Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Use cached files when HF API fails
  Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Use final_path..
  Signed-off-by: Adrien Gallouët <angt@huggingface.co>
* Check all inputs
  Signed-off-by: Adrien Gallouët <angt@huggingface.co>
---------
Signed-off-by: Adrien Gallouët <angt@huggingface.co>

…maBarn (#2453) Move llama.cpp and HuggingFaceModelDownloader under a new Applications table, add LlamaBarn, and replace the "Work in progress" note for llama.cpp with a link to ggml-org/llama.cpp#20775 which added standard Hugging Face cache support. Co-authored-by: julien-agent <Agents+cyolo@huggingface.co>

angt requested a review from a team as a code owner March 19, 2026 21:10

github-actions Bot added the examples label Mar 19, 2026

angt force-pushed the common-add-standard-hugging-face-cache-support branch 7 times, most recently from 3638b70 to 77ff285 Compare March 20, 2026 07:36

common : add standard Hugging Face cache support

6fd16ba

- Use HF API to find all files - Migrate all manifests to hugging face cache at startup Signed-off-by: Adrien Gallouët <angt@huggingface.co>

angt force-pushed the common-add-standard-hugging-face-cache-support branch from 77ff285 to 6fd16ba Compare March 20, 2026 14:11

ggerganov reviewed Mar 20, 2026

Comment thread common/hf-cache.h Outdated

Comment thread common/download.h Outdated

Comment thread common/download.cpp Outdated

angt added 2 commits March 20, 2026 17:25

Check with the quant tag

e915644

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

Cleanup

b6c7bcf

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

angt requested a review from a team as a code owner March 20, 2026 18:06

github-actions Bot added python python script changes server labels Mar 20, 2026

ggerganov approved these changes Mar 20, 2026

ggerganov requested a review from ngxson March 20, 2026 18:45

Improve error handling and report API errors

e404f6a

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

ngxson reviewed Mar 20, 2026

Comment thread common/download.h Outdated

Comment thread common/download.cpp

loci-dev mentioned this pull request Mar 21, 2026

UPSTREAM PR #20775: common : add standard Hugging Face cache support auroralabs-loci/llama.cpp#1278

Open

angt force-pushed the common-add-standard-hugging-face-cache-support branch 2 times, most recently from 62bcccb to 5d0c722 Compare March 22, 2026 09:02

Restore common_cached_model_info and align mmproj filtering

77fa9a9

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

angt added 3 commits March 22, 2026 09:18

Prefer main when getting cached ref

5572986

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

Use cached files when HF API fails

74c1874

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

Use final_path..

6ab630f

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

ngxson reviewed Mar 22, 2026

Comment thread common/hf-cache.cpp

Comment thread common/download.h

Comment thread common/hf-cache.cpp

Check all inputs

3645fee

Signed-off-by: Adrien Gallouët <angt@huggingface.co>

ngxson approved these changes Mar 23, 2026

angt merged commit 8c7957c into ggml-org:master Mar 24, 2026
51 checks passed

wbste mentioned this pull request Mar 25, 2026

Misc. bug: HF_HOME Huggingface cache lists all content and LLAMA_CACHE non-functional #20994

Closed

bachp mentioned this pull request Mar 25, 2026

Misc. bug: Unable to start llama-server as systemd service without specifying huggingface cache options #20952

Closed

Beinsezii mentioned this pull request Mar 26, 2026

Misc. bug: New HF Cache Picks imatrix GGUFs #21014

Closed

hmblair mentioned this pull request Mar 26, 2026

Misc. bug: llama-server fails to load multi-shard GGUF from HF cache (selects wrong shard) #21016

Closed

fanshi1028 mentioned this pull request Apr 5, 2026

Misc. bug: Or feature? HF_HUB_CACHE doesn't work. #21456

Closed

danchev mentioned this pull request Apr 7, 2026

Feature Request: tool to list and delete cached models #16393

Open

4 tasks

julien-c mentioned this pull request May 4, 2026

docs(local-cache): split table into Libraries / Applications, add LlamaBarn huggingface/hub-docs#2453

Merged

1 task

7 participants
