Describe the Issue
Upstream now supports ARM-optimized quantization formats (Q4_0_4_4, Q4_0_4_8 and Q4_0_8_8). I tried running a model in each of these formats on my Snapdragon 8G1, but none of them would run with koboldcpp.
Additional Information:
Checking upstream, I found the new documentation (ggml-org#9321), which shows that certain flags must be set at compile time. Could you please explain how to compile koboldcpp with those flags so I can try again?
The relevant note from the upstream documentation:
> To support `Q4_0_4_4`, you must build with `GGML_NO_LLAMAFILE=1` (`make`) or `-DGGML_LLAMAFILE=OFF` (`cmake`).
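
For reference, this is roughly how I understand that note would be applied when building upstream llama.cpp. It is only a sketch of the upstream invocation: I am assuming the commands are run from the root of a llama.cpp checkout, and I have not verified that koboldcpp's Makefile honors the same `GGML_NO_LLAMAFILE` variable, which is what I am asking about.

```shell
# Upstream llama.cpp, make-based build with the llamafile kernels disabled
# (flag taken from the upstream note quoted above).
make GGML_NO_LLAMAFILE=1

# Upstream llama.cpp, CMake-based build with the equivalent option.
# The build directory name "build" is an arbitrary choice.
cmake -B build -DGGML_LLAMAFILE=OFF
cmake --build build --config Release
```

If koboldcpp accepts an equivalent variable on its own `make` command line, that would be the piece I am missing.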