### Description

Loading a [Microsoft Phi-4-mini-instruct](https://huggingface.co/microsoft/Phi-4-mini-instruct) ([4-bit quantization](https://huggingface.co/unsloth/Phi-4-mini-instruct)) model fails with:

`unknown pre-tokenizer type: 'gpt-4o'`

This issue was already addressed in [llama.cpp b4792](https://github.com/ggml-org/llama.cpp/releases/tag/b4792).

### Reproduction Steps

```csharp
LLamaWeights.LoadFromFile(new ModelParams("Phi-4-mini-instruct-Q4_K_M.gguf"));
```

### Environment & Configuration

- Operating system: Windows 11
- .NET runtime version: 9
- LLamaSharp version: 0.21.0
- CUDA version (if you are using the CUDA backend): 12

### Known Workarounds

* Download the newer [llama.cpp release b4792](https://github.com/ggml-org/llama.cpp/releases/tag/b4792)
* Use `NativeLibraryConfig.LLama.WithLibrary` to load the downloaded `llama.dll`
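The workaround above can be sketched as follows. This is a minimal example assuming the `llama.dll` from the b4792 release has been downloaded locally; the path is a placeholder, and the configuration call must run before any other LLamaSharp API is used, since the native library is loaded lazily on first use.

```csharp
using LLama;
using LLama.Common;
using LLama.Native;

// Point LLamaSharp at the newer llama.dll downloaded from the
// llama.cpp b4792 release (path is a placeholder for illustration).
// This must be called before any other LLamaSharp call, because the
// native library is resolved once, on first use.
NativeLibraryConfig.LLama.WithLibrary("path/to/llama.dll");

// Loading the model now succeeds, since b4792 recognizes the
// 'gpt-4o' pre-tokenizer used by Phi-4-mini-instruct GGUF files.
var parameters = new ModelParams("Phi-4-mini-instruct-Q4_K_M.gguf");
using var weights = LLamaWeights.LoadFromFile(parameters);
```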