Running into this warning with Llama-3-8B models:

llama_tokenize_internal: Added a BOS token to the prompt as specified by the model but the prompt also starts with a BOS token.

Is there a way to disable this automatic behavior? And if it's not possible yet, can we get a --flag for it?
Related PR:
ggml-org#7332
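For context, a minimal sketch of why the warning fires and a client-side workaround: if the prompt text already begins with the literal BOS token (for Llama-3, `<|begin_of_text|>`) and the tokenizer also prepends BOS automatically, the sequence ends up with two BOS tokens. Until there is a flag to disable the automatic insertion, one option is to strip the literal BOS from the prompt before sending it. The token string and helper name below are illustrative, not part of the llama.cpp API:

```python
# Llama-3's BOS token as literal text (assumption for illustration).
BOS = "<|begin_of_text|>"

def strip_leading_bos(prompt: str) -> str:
    """Drop a literal BOS at the start of the prompt so the tokenizer's
    automatic BOS insertion does not produce a duplicate."""
    return prompt[len(BOS):] if prompt.startswith(BOS) else prompt

# A templated prompt that already carries BOS, e.g. from a chat template:
prompt = BOS + "Hello, world"
print(strip_leading_bos(prompt))  # -> "Hello, world"
```

This only avoids the duplicate on the client side; it doesn't stop the tokenizer from adding its own BOS, which is why a proper flag would still be useful.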