Skip to content

common : fix split model migration#21019

Merged
angt merged 1 commit intoggml-org:masterfrom
angt:common-fix-split-model-migration
Mar 26, 2026
Merged

common : fix split model migration#21019
angt merged 1 commit intoggml-org:masterfrom
angt:common-fix-split-model-migration

Conversation

@angt
Copy link
Copy Markdown
Member

@angt angt commented Mar 26, 2026

Overview

Sadly the manifest does not list all required files, i honestly thought it was the case

Without the files listed we don't have the sha256, so if the first file is valid, and all others have the correct size, then we can assume we are good and do the migration...

Additional information

Here my test:

    $ find /home/angt/.cache/llama.cpp
    /home/angt/.cache/llama.cpp
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf.etag
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf.etag
    /home/angt/.cache/llama.cpp/manifest=angt=test-split-model-stories260K=latest.json

    $ build/bin/llama-server
    ================================================================================
    WARNING: Migrating cache to HuggingFace cache directory
      Old cache: /home/angt/.cache/llama.cpp/
      New cache: /home/angt/.cache/huggingface/hub
    This one-time migration moves models previously downloaded with -hf
    from the legacy llama.cpp cache to the standard HuggingFace cache.
    Models downloaded with --model-url are not affected.
    ================================================================================
    migrate_file: migrated angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf -> /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf
    migrate_file: migrated angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf -> /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf
    migrate_old_cache_to_hf_cache: migration complete, deleting manifest: /home/angt/.cache/llama.cpp/manifest=angt=test-split-model-stories260K=latest.json

    $ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
    /home/angt/.cache/llama.cpp
    /home/angt/.cache/huggingface
    /home/angt/.cache/huggingface/hub
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs/main
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf

Requirements

Sadly the manifest does not list all required files, i honestly thought
it was the case

Without the files listed we don't have the sha256, so if the first file
is valid, and all others have the correct size, then we can assume we
are good and do the migration...

Here my test:

    $ find /home/angt/.cache/llama.cpp
    /home/angt/.cache/llama.cpp
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf.etag
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf.etag
    /home/angt/.cache/llama.cpp/manifest=angt=test-split-model-stories260K=latest.json

    $ build/bin/llama-server
    ================================================================================
    WARNING: Migrating cache to HuggingFace cache directory
      Old cache: /home/angt/.cache/llama.cpp/
      New cache: /home/angt/.cache/huggingface/hub
    This one-time migration moves models previously downloaded with -hf
    from the legacy llama.cpp cache to the standard HuggingFace cache.
    Models downloaded with --model-url are not affected.
    ================================================================================
    migrate_file: migrated angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf -> /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf
    migrate_file: migrated angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf -> /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf
    migrate_old_cache_to_hf_cache: migration complete, deleting manifest: /home/angt/.cache/llama.cpp/manifest=angt=test-split-model-stories260K=latest.json

    $ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
    /home/angt/.cache/llama.cpp
    /home/angt/.cache/huggingface
    /home/angt/.cache/huggingface/hub
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs/main
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
@angt angt requested a review from a team as a code owner March 26, 2026 09:20
@angt
Copy link
Copy Markdown
Member Author

angt commented Mar 26, 2026

should fix #21015 and #21016

@angt
Copy link
Copy Markdown
Member Author

angt commented Mar 26, 2026

Like before, this is not perfect but that's better than nothing (I guess)

@ngxson ngxson requested a review from ggerganov March 26, 2026 10:02
@angt angt merged commit 3cba8bb into ggml-org:master Mar 26, 2026
44 of 45 checks passed
brettp added a commit to brettp/llama.cpp that referenced this pull request Apr 6, 2026
File order is not guaranteed when listing dirs. In offline mode, files are
not sorted when read from cache, which can result in the wrong part loading
first if the files have changed metadata (e.g., by moving, symlinking, etc).

This is a minimal approach to ensure model files are correctly sorted
when downloaded or loaded from cache, in online or offline mode. Tests are
included.

Disclaimer: An AI agent was used to refine the approach and write the test.

refs: ggml-org#21019 ggml-org#21016
brettp added a commit to brettp/llama.cpp that referenced this pull request Apr 10, 2026
File order is not guaranteed when listing dirs. In offline mode, files are
not sorted when read from cache, which can result in the wrong part loading
first if the files have changed metadata (e.g., by moving, symlinking, etc).

This is a minimal approach to ensure model files are correctly sorted
when downloaded or loaded from cache, in online or offline mode. Tests are
included.

Disclaimer: An AI agent was used to refine the approach and write the test.

refs: ggml-org#21019 ggml-org#21016
slartibardfast pushed a commit to slartibardfast/llama.cpp that referenced this pull request Apr 12, 2026
Sadly the manifest does not list all required files, i honestly thought
it was the case

Without the files listed we don't have the sha256, so if the first file
is valid, and all others have the correct size, then we can assume we
are good and do the migration...

Here my test:

    $ find /home/angt/.cache/llama.cpp
    /home/angt/.cache/llama.cpp
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf.etag
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf.etag
    /home/angt/.cache/llama.cpp/manifest=angt=test-split-model-stories260K=latest.json

    $ build/bin/llama-server
    ================================================================================
    WARNING: Migrating cache to HuggingFace cache directory
      Old cache: /home/angt/.cache/llama.cpp/
      New cache: /home/angt/.cache/huggingface/hub
    This one-time migration moves models previously downloaded with -hf
    from the legacy llama.cpp cache to the standard HuggingFace cache.
    Models downloaded with --model-url are not affected.
    ================================================================================
    migrate_file: migrated angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf -> /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf
    migrate_file: migrated angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf -> /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf
    migrate_old_cache_to_hf_cache: migration complete, deleting manifest: /home/angt/.cache/llama.cpp/manifest=angt=test-split-model-stories260K=latest.json

    $ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
    /home/angt/.cache/llama.cpp
    /home/angt/.cache/huggingface
    /home/angt/.cache/huggingface/hub
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs/main
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
Sadly the manifest does not list all required files, i honestly thought
it was the case

Without the files listed we don't have the sha256, so if the first file
is valid, and all others have the correct size, then we can assume we
are good and do the migration...

Here my test:

    $ find /home/angt/.cache/llama.cpp
    /home/angt/.cache/llama.cpp
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf.etag
    /home/angt/.cache/llama.cpp/angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf.etag
    /home/angt/.cache/llama.cpp/manifest=angt=test-split-model-stories260K=latest.json

    $ build/bin/llama-server
    ================================================================================
    WARNING: Migrating cache to HuggingFace cache directory
      Old cache: /home/angt/.cache/llama.cpp/
      New cache: /home/angt/.cache/huggingface/hub
    This one-time migration moves models previously downloaded with -hf
    from the legacy llama.cpp cache to the standard HuggingFace cache.
    Models downloaded with --model-url are not affected.
    ================================================================================
    migrate_file: migrated angt_test-split-model-stories260K_stories260K-f32-00001-of-00002.gguf -> /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf
    migrate_file: migrated angt_test-split-model-stories260K_stories260K-f32-00002-of-00002.gguf -> /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf
    migrate_old_cache_to_hf_cache: migration complete, deleting manifest: /home/angt/.cache/llama.cpp/manifest=angt=test-split-model-stories260K=latest.json

    $ find /home/angt/.cache/llama.cpp /home/angt/.cache/huggingface
    /home/angt/.cache/llama.cpp
    /home/angt/.cache/huggingface
    /home/angt/.cache/huggingface/hub
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/50d019817c2626eb9e8a41f361ff5bfa538757e6f708a3076cd3356354a75694
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/blobs/7b273e1dbfab11dc67dce479deb5923fef27c39cbf56a20b3a928a47b77dab3c
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/refs/main
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00002-of-00002.gguf
    /home/angt/.cache/huggingface/hub/models--angt--test-split-model-stories260K/snapshots/68c3ea2061e8c7688455fab07597dde0f4d7f0db/stories260K-f32-00001-of-00002.gguf

Signed-off-by: Adrien Gallouët <angt@huggingface.co>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Misc. bug: llama-server fails to load multi-shard GGUF from HF cache (selects wrong shard) Misc. bug: HF Cache Migration Only Moves Part 00001

3 participants