convert-hf : match model part name prefix and suffix #7687
Conversation
It's an improvement, but the problem still persists. This is a good interim patch.
I think this might be okay for the most part. I've been looking at a bunch of model repos and they all seem to be converging toward a consistent format. I only thought this might be an issue because a model creator could name the file anything they want. The only outlier I've been able to find is `consolidated`, which is (kind of) a new one. My rationale for this potentially being an issue is that the raw models distributed by Facebook are also named `consolidated` or prefixed with `consolidated`:

02:21:19 | ~
λ tree -I papers /mnt/scsm/models/facebook
/mnt/scsm/models/facebook
├── llama-1
│ ├── 13B
│ │ ├── checklist.chk
│ │ ├── consolidated.00.pth
│ │ ├── consolidated.01.pth
│ │ ├── params.json
│ │ ├── tokenizer_checklist.chk
│ │ └── tokenizer.model
│ ├── 30B
│ │ ├── checklist.chk
│ │ ├── consolidated.00.pth
│ │ ├── consolidated.01.pth
│ │ ├── consolidated.02.pth
│ │ ├── consolidated.03.pth
│ │ ├── params.json
│ │ ├── tokenizer_checklist.chk
│ │ └── tokenizer.model
│ ├── 7B
│ │ ├── checklist.chk
│ │ ├── consolidated.00.pth
│ │ ├── params.json
│ │ ├── tokenizer_checklist.chk
│ │ └── tokenizer.model
│   └── llama.sh

That's all I have to say about this for now. I'm still researching and experimenting.
Do we have any contact with the safetensors devs, etc., to encourage formalization, or at least to encourage best practices?
The issue is that PyTorch users can do whatever they want. Safetensors is managed by Hugging Face, so if I make some toy models, I can technically name them whatever I like. I think @compilade is right that the naming isn't an issue. The conversion script is geared solely toward Hugging Face, which isolates our focus; that isn't necessarily a bad thing, considering how much is going on in the repo. You can find the Safetensors spec in their repo, and it is actually nicely outlined: https://github.com/huggingface/safetensors/blob/main/docs/source/metadata_parsing.mdx#usage
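For reference, the on-disk layout described in that spec is simple enough to parse by hand: a `.safetensors` file begins with an 8-byte little-endian unsigned integer giving the length of a JSON header, followed by the UTF-8 JSON header itself and then the raw tensor data. A minimal sketch (not the conversion script's actual code):

```python
import json
import struct
from pathlib import Path

def read_safetensors_header(path: Path) -> dict:
    """Read only the JSON header of a .safetensors file.

    Per the spec: the first 8 bytes are a little-endian u64 header
    length, followed by that many bytes of UTF-8 JSON.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))
```

This is handy for inspecting what tensors a part file actually contains without loading any tensor data.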
In ggml-org#7075, to fix the conversion of (some) models using `model-00001-of-00001.safetensors` instead of `model.safetensors` for a single model part, we simply used the same logic as the part count to get the part names. But this doesn't always work correctly, e.g. when unusual additional model files like `consolidated.safetensors` in https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3 are present. This commit matches both the prefix and the suffix of the model part names, which should fix this problem without breaking any previously-supported upstream models. According to a report by @teleprint-me, some problems still persist, but this shall do in the meantime.
TL;DR: This should fix conversion of https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3
(thanks to @teleprint-me for noticing this problem in #7379 (comment))
Before #7075, `convert-hf-to-gguf.py` checked for specific model part names according to the number of `*.bin` files or `*.safetensors` files.

https://github.com/ggerganov/llama.cpp/blob/bc4bba364fb96d908f2698e908648df5e6f55e02/convert-hf-to-gguf.py#L224-L232
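The gist of that earlier approach can be sketched roughly as follows (a simplified reconstruction, not the exact code at that permalink): the part names were *generated* from the part count rather than discovered on disk, so a single-part safetensors model was assumed to be named `model.safetensors`.

```python
from pathlib import Path

def count_model_parts(dir_model: Path, suffix: str) -> int:
    # Every file ending in `suffix` counts as a model part.
    return len([f for f in dir_model.iterdir() if f.name.endswith(suffix)])

def guess_part_names(dir_model: Path, suffix: str) -> list[str]:
    # Pre-#7075 style (simplified): generate the expected names
    # from the count instead of listing the directory.
    n = count_model_parts(dir_model, suffix)
    if n == 1:
        return [f"model{suffix}"]
    return [f"model-{i:05}-of-{n:05}{suffix}" for i in range(1, n + 1)]
```

This works as long as the files actually follow the generated naming scheme, which is exactly the assumption that broke for single-part models named `model-00001-of-00001.safetensors`.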
But then, to fix the conversion of (some) models using `model-00001-of-00001.safetensors` instead of `model.safetensors` for a single model part, #7075 simply used the same logic as the part count to get the part names; that is, matching only the suffix of the files.

https://github.com/ggerganov/llama.cpp/blob/f98eb31c517c95960df1d0abc48002787f145f3b/convert-hf-to-gguf.py#L285-L293
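In other words (again a simplified sketch, not necessarily the exact code at that permalink), every file ending in the suffix was treated as a model part:

```python
import os
from pathlib import Path

def get_model_part_names(dir_model: Path, suffix: str) -> list[str]:
    # #7075-style (simplified): any file ending in `suffix` is a part,
    # so an extra `consolidated.safetensors` would wrongly be picked up too.
    part_names = [f for f in os.listdir(dir_model) if f.endswith(suffix)]
    part_names.sort()
    return part_names
```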
But this doesn't always work correctly, such as when unusual additional model files like `consolidated.safetensors` in https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3 are present.

Note that the previous way it was done would still have failed here, since the part count also relied only on the suffixes.
I think matching both the prefix and the suffix of the model part names should fix this problem without breaking any previously-supported upstream models.