convert-hf : match model part name prefix and suffix #7687
Conversation
It's an improvement, but the problem still persists. This is a good interim patch.
I think this might be okay for the most part. I've been looking at a bunch of model repos and they all seem to be converging toward a consistent format. I only thought this might be an issue because a model creator could name the file anything they want. The only outlier I've been able to find is `consolidated`, which is (kind of) a new one. My rationale for this potentially being an issue is that the raw models distributed by Facebook are also named `consolidated` or prefixed with `consolidated`:

02:21:19 | ~
λ tree -I papers /mnt/scsm/models/facebook
/mnt/scsm/models/facebook
├── llama-1
│ ├── 13B
│ │ ├── checklist.chk
│ │ ├── consolidated.00.pth
│ │ ├── consolidated.01.pth
│ │ ├── params.json
│ │ ├── tokenizer_checklist.chk
│ │ └── tokenizer.model
│ ├── 30B
│ │ ├── checklist.chk
│ │ ├── consolidated.00.pth
│ │ ├── consolidated.01.pth
│ │ ├── consolidated.02.pth
│ │ ├── consolidated.03.pth
│ │ ├── params.json
│ │ ├── tokenizer_checklist.chk
│ │ └── tokenizer.model
│ ├── 7B
│ │ ├── checklist.chk
│ │ ├── consolidated.00.pth
│ │ ├── params.json
│ │ ├── tokenizer_checklist.chk
│ │ └── tokenizer.model
│   └── llama.sh

That's all I have to say about this for now. I'm still researching and experimenting.
Do we have any contact with the safetensors devs, etc., to encourage formalization, or at least to encourage best practices?
The issue is that PyTorch users can do whatever they want. Safetensors is managed by Hugging Face, so if I make some toy models, I can technically name them whatever I like. I think @compilade is right that the naming isn't an issue. The conversion script is geared solely toward Hugging Face, which isolates our focus; that isn't necessarily a bad thing, considering how much is going on in the repo. You can find the Safetensors spec in their repo, and it is actually nicely outlined: https://github.com/huggingface/safetensors/blob/main/docs/source/metadata_parsing.mdx#usage
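For reference, the on-disk layout described in that spec is simple enough to parse by hand: a `.safetensors` file begins with an 8-byte little-endian unsigned integer giving the length of a JSON header, followed by the UTF-8 JSON header itself and then the raw tensor data. A minimal sketch (not the conversion script's actual code):

```python
import json
import struct
from pathlib import Path

def read_safetensors_header(path: Path) -> dict:
    """Read only the JSON header of a .safetensors file.

    Per the spec: the first 8 bytes are a little-endian u64 header
    length, followed by that many bytes of UTF-8 JSON.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))
```

This is handy for inspecting what tensors a part file actually contains without loading any tensor data.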
In ggml-org#7075, to fix the conversion of (some) models using `model-00001-of-00001.safetensors` instead of `model.safetensors` for a single model part, we simply used the same logic as the part count to get the part names. But this doesn't always work correctly, e.g. when unusual additional model files like `consolidated.safetensors` in https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3 are present. This commit matches both the prefix and the suffix of the model part names, which should fix this problem without breaking any previously-supported upstream models. According to a report by @teleprint-me, some problems still persist, but this shall do in the meantime.
TL;DR: This should fix conversion of https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3
(thanks to @teleprint-me for noticing this problem in #7379 (comment))
Before #7075, `convert-hf-to-gguf.py` checked for specific model part names according to the number of `*.bin` files or `*.safetensors` files.

https://github.com/ggerganov/llama.cpp/blob/bc4bba364fb96d908f2698e908648df5e6f55e02/convert-hf-to-gguf.py#L224-L232
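The gist of that earlier approach can be sketched roughly as follows (a simplified reconstruction, not the exact code at that permalink): the part names were *generated* from the part count rather than discovered on disk, so a single-part safetensors model was assumed to be named `model.safetensors`.

```python
from pathlib import Path

def count_model_parts(dir_model: Path, suffix: str) -> int:
    # Every file ending in `suffix` counts as a model part.
    return len([f for f in dir_model.iterdir() if f.name.endswith(suffix)])

def guess_part_names(dir_model: Path, suffix: str) -> list[str]:
    # Pre-#7075 style (simplified): generate the expected names
    # from the count instead of listing the directory.
    n = count_model_parts(dir_model, suffix)
    if n == 1:
        return [f"model{suffix}"]
    return [f"model-{i:05}-of-{n:05}{suffix}" for i in range(1, n + 1)]
```

This works as long as the files actually follow the generated naming scheme, which is exactly the assumption that broke for single-part models named `model-00001-of-00001.safetensors`.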
But then, to fix the conversion of (some) models using `model-00001-of-00001.safetensors` instead of `model.safetensors` for a single model part, #7075 simply used the same logic as the part count to get the part names; that is, matching only the suffix of the files.

https://github.com/ggerganov/llama.cpp/blob/f98eb31c517c95960df1d0abc48002787f145f3b/convert-hf-to-gguf.py#L285-L293
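In other words (again a simplified sketch, not necessarily the exact code at that permalink), every file ending in the suffix was treated as a model part:

```python
import os
from pathlib import Path

def get_model_part_names(dir_model: Path, suffix: str) -> list[str]:
    # #7075-style (simplified): any file ending in `suffix` is a part,
    # so an extra `consolidated.safetensors` would wrongly be picked up too.
    part_names = [f for f in os.listdir(dir_model) if f.endswith(suffix)]
    part_names.sort()
    return part_names
```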
But this doesn't always work correctly, such as when unusual additional model files like `consolidated.safetensors` in https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3 are present.

Note that the previous way it was done would still have failed here, since the part count also relied only on the suffixes.
I think matching both the prefix and the suffix of the model part names should fix this problem without breaking any previously-supported upstream models.