Allow conversion of Llama / Mistral HF models #6144

ggerganov merged 8 commits into ggml-org:master
Conversation
Why not just use:
```python
@Model.register("LlamaForCausalLM", "MistralForCausalLM", "MixtralForCausalLM")
```

on the existing MixtralModel (and call it LlamaModel)? I don't see a point in supporting Mistral in this script without supporting Llama, and these classes are identical, so they can be merged.
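For reference, a minimal sketch of what the merged entry could look like, following the script's existing registration pattern (the class body is elided; the details are an assumption, not the final code):

```python
# Sketch only: one registered class covering all three identical entries.
@Model.register("LlamaForCausalLM", "MistralForCausalLM", "MixtralForCausalLM")
class LlamaModel(Model):
    model_arch = gguf.MODEL_ARCH.LLAMA  # all three map to the "llama" architecture
    ...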
Sure, we can do that. I tried to mimic what I saw in the existing entries.
In my opinion, it could be interesting to use different architectures for Mistral (and Mixtral) for informative purposes: if you click on the GGUF information for any Mistral file on the Hugging Face Hub, nothing refers to Mistral except the filename. But that would also require additional changes, so I'm happy to use the same entry for all three!
I tried to convert this model: https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca, and got an exception. The same model converts fine with convert.py.
```diff
 def set_vocab(self):
-    self._set_vocab_sentencepiece()
+    self._set_vocab_hf()
```
Testing conversion with fine-tuned models, I ran into this problem: #6320. If that PR is merged, then we can also use `_set_vocab_sentencepiece()` here.
Sorry for the delay, @cebtenzzre, I could only return to this today. Testing the model you mentioned (https://huggingface.co/Open-Orca/Mistral-7B-OpenOrca), I experienced an inconsistency in the sentencepiece vocab method, as reported in #6320. In addition, I had not taken care of tensor permutation in my initial PR. I re-tested conversion of that model and verified that generation matches conversion with convert.py.
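For context, the tensor permutation in question is the reordering of the Q/K projection weights that llama.cpp's converters apply to HF Llama-style checkpoints, so that the rotary-embedding pairs land where the runtime expects them. A minimal numpy sketch along the lines of convert.py's permute helper (paraphrased, not the exact code):

```python
import numpy as np

def permute(weights: np.ndarray, n_head: int, n_head_kv: int | None = None) -> np.ndarray:
    # For grouped-query attention, wk has n_head_kv heads instead of n_head.
    if n_head_kv is not None and n_head != n_head_kv:
        n_head = n_head_kv
    # Split each head's rows into (2, head_dim / 2), swap those two axes, and
    # flatten back: this re-interleaves the rotary (real, imag) halves.
    return (weights.reshape(n_head, 2, weights.shape[0] // n_head // 2, *weights.shape[1:])
                   .swapaxes(1, 2)
                   .reshape(weights.shape))
```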
It looks like the CI timed out after 6 hours.
This PR conflicts with #6355, which renames HfVocab to LlamaHfVocab and makes it specific to models with tokenizer.json - Mistral-7B-OpenOrca only has a tokenizer.model. To be consistent with convert.py's default --vocab-type, after that PR is merged you would want to do something like:

```python
try:
    self._set_vocab_sentencepiece()
except FileNotFoundError:
    self._set_vocab_llama_hf()
```

The benefits are a conditional dependency on transformers (sentencepiece is a required dependency atm) and accurate token scores when tokenizer.model is available. Does that seem reasonable?
@cebtenzzre yes, makes total sense! I merged and applied those changes, then tested with Mistral-7B-OpenOrca and Mistral-7B-v0.1.
With the changes I added, the set of metadata keys is exactly the same as those written by convert.py (checked with gguf-dump), with only one minor difference.
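That comparison can also be scripted with the gguf-py reader; a small sketch (the file names are illustrative, not from this PR):

```python
# Compare the metadata keys written by the two converters.
from gguf import GGUFReader

def metadata_keys(path: str) -> set[str]:
    # GGUFReader.fields maps each metadata key name to its parsed field.
    return set(GGUFReader(path).fields)

hf_keys = metadata_keys("openorca-hf.gguf")          # from convert-hf-to-gguf.py
legacy_keys = metadata_keys("openorca-legacy.gguf")  # from convert.py
print(sorted(hf_keys ^ legacy_keys))  # keys present in only one of the files
```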
Thank you! Much appreciated! 🙌
* Allow conversion of Mistral HF models
* Homogenize Llama, Mistral, Mixtral under the same entry.
* Fix tokenizer, permute tensors
* Use sentencepiece tokenizer, or fall back to hfft.
* convert-hf : small fix for mypy
* convert-hf : fix duplicated block_count
* convert-hf : add vocab size to metadata

---------

Co-authored-by: Jared Van Bortel <jared@nomic.ai>
This allows converting fine-tuned models with convert-hf-to-gguf.py. The base architecture is set to llama, as in the models converted by @TheBloke. If necessary, we can add a new entry to constants.py.
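For illustration, this is roughly how that architecture string ends up in the output file via the gguf-py writer (a sketch; the output path is made up):

```python
import gguf

# The converter hands the architecture name to the GGUF writer; "llama" here
# is what downstream tooling sees, regardless of whether the weights are Mistral.
writer = gguf.GGUFWriter("mistral-7b-openorca.gguf", arch="llama")
writer.add_architecture()  # records the general.architecture metadata key
```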