Unable to run models with the Q4_0_4_4, Q4_0_4_8 and Q4_0_8_8 formats on an ARM device. #1117

**Describe the Issue**
Upstream now supports ARM-optimized model formats (Q4_0_4_4, Q4_0_4_8 and Q4_0_8_8). I tried to run each of them on my Snapdragon 8G1, but none of them would run with koboldcpp.

**Additional Information:**
Checking upstream, I found the new documentation (https://github.com/ggerganov/llama.cpp/pull/9321), which shows that some flags must be set at compile time. Could you please explain how to compile koboldcpp with those flags so I can try again?

> To support `Q4_0_4_4`, you must build with `GGML_NO_LLAMAFILE=1` (`make`) or `-DGGML_LLAMAFILE=OFF` (`cmake`).


