add phi3 support #6852
Conversation
Might have to add:

```diff
diff --git a/llama.cpp b/llama.cpp
index 63483b9a..698ad236 100644
--- a/llama.cpp
+++ b/llama.cpp
@@ -4381,6 +4381,7 @@ static void llm_load_vocab(
                 //vocab.id_to_token[t.second].type == LLAMA_TOKEN_TYPE_CONTROL &&
                 (t.first == "<|eot_id|>" ||
                  t.first == "<|im_end|>" ||
+                 t.first == "<|end|>" ||
                  t.first == "<end_of_turn>"
                 )
             ) {
```

This seems to be producing better results than #6851.
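For context, here is a minimal Python sketch (a hypothetical helper, not llama.cpp's actual C++ code) of the matching logic the diff extends: scan the vocab for known end-of-turn strings and take the first match as the EOT token.

```python
# Hypothetical sketch of the EOT-token matching the diff above extends:
# scan vocab entries for known end-of-turn strings; this PR adds "<|end|>",
# which is what phi3 emits at the end of a turn.
EOT_STRINGS = {"<|eot_id|>", "<|im_end|>", "<|end|>", "<end_of_turn>"}

def find_eot_id(token_to_id: dict[str, int]) -> int:
    """Return the id of the first EOT-style token found, or -1 if none."""
    for text, tid in token_to_id.items():
        if text in EOT_STRINGS:
            return tid
    return -1

# toy phi3-like vocab; the ids are made up for illustration
print(find_eot_id({"<s>": 1, "</s>": 2, "<|end|>": 32007}))  # -> 32007
```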
So the difference was in the tokenization in the other PR. I wonder if it affects the conversion of some other models too? Anyway, now the results match, except that the other PR adds a BOS token at the start while this PR does not. Just double-checking: is this the intent?

There is also a minor issue because of this. Running gguf-dump on the converted model fails:

```
$ python3 gguf-py/scripts/gguf-dump.py models/phi-3-4k-instruct/ggml-model-f16.gguf
* Loading: models/phi-3-4k-instruct/ggml-model-f16-new.gguf
Traceback (most recent call last):
KeyError: 'Duplicate tokenizer.ggml.add_bos_token already in list at offset 725511'
```

This is because we already write this field automatically here:
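To illustrate the failure mode separately from the linked code, here is a hedged, self-contained sketch (not gguf-py's actual implementation) of a reader that keeps one entry per key, which is why a field written twice, once automatically and once explicitly, raises on load:

```python
# Hypothetical sketch: a KV reader that keeps one entry per key raises when
# tokenizer.ggml.add_bos_token appears twice in the file.
def load_kv(pairs: list[tuple[str, object]]) -> dict[str, object]:
    fields: dict[str, object] = {}
    for offset, (key, value) in enumerate(pairs):
        if key in fields:
            raise KeyError(f"Duplicate {key} already in list at offset {offset}")
        fields[key] = value
    return fields

# the field ends up in the file twice: once automatic, once explicit
load_kv([
    ("tokenizer.ggml.add_bos_token", True),
    ("tokenizer.ggml.add_bos_token", True),  # raises KeyError here
])
```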
So is the implementation in #6851 preferred, or are both needed for official support?
ggerganov left a comment:
Thank you for the nice implementation.
I decided to set the "add BOS" KV to True based on this configuration:
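Presumably this refers to the model's tokenizer config. A hedged sketch of how the conversion side can derive the flag (the file path is a placeholder; "add_bos_token" is the usual Hugging Face field name):

```python
# Hedged sketch: read the add-BOS flag from the HF tokenizer_config.json.
# The path is a placeholder; adjust to wherever the model is stored.
import json

with open("models/phi-3-4k-instruct/tokenizer_config.json") as f:
    cfg = json.load(f)

add_bos = bool(cfg.get("add_bos_token", False))
print("tokenizer.ggml.add_bos_token =", add_bos)  # expected True for phi3
```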
Hi, using the server, "<|eot_id|>" is still printed at the end of the conversation, and I can't find the stop token anymore in /examples/server/utils.hpp. How do I avoid this "<|eot_id|>" in the server? Thanks.
Most likely you are using the base model instead of the instruct model. See #6916 for a clear explanation and a way to add stop tokens from the client side.
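As a hedged example of the client-side approach (host, port, and prompt are placeholders), stop strings can be passed in the request to the server's /completion endpoint:

```python
# Hedged sketch: supply stop strings from the client when calling the
# llama.cpp server /completion endpoint; host/port and prompt are placeholders.
import json
import urllib.request

payload = {
    "prompt": "<|user|>\nHello<|end|>\n<|assistant|>\n",
    "n_predict": 128,
    "stop": ["<|end|>", "<|eot_id|>"],  # generation halts at these strings
}
req = urllib.request.Request(
    "http://localhost:8080/completion",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["content"])
```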
@ggerganov Hi, no, I was using Phi-3-mini-128k-instruct.Q4_K_M.gguf. Forget it, I think that was for the server; for non-server use it already works fine.
* add explicit phi3 support
* add explicit phi3 support
* remove unused code
* convert : add BOS token
* llama : match EOT token <|end|>
* llama : minor / style
* llama : tabs -> spaces
* convert : fix lint checks

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
Make phi3 an explicitly supported model in llama.cpp.