Unable to run models in the Q4_0_4_4, Q4_0_4_8 and Q4_0_8_8 formats on an ARM device. #1117

@gustrd

Describe the Issue
Upstream now has a new feature: ARM-optimized model formats (Q4_0_4_4, Q4_0_4_8 and Q4_0_8_8). I tried each of them on my Snapdragon 8G1, but I was unable to run any of them with koboldcpp.

Additional Information:
Checking upstream, I found the new documentation (ggml-org#9321), which shows that certain flags must be set at compile time. Could you please explain how to compile koboldcpp with those flags so I can try again?

To support `Q4_0_4_4`, you must build with `GGML_NO_LLAMAFILE=1` (`make`) or `-DGGML_LLAMAFILE=OFF` (`cmake`).
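
As a rough sketch, the two options quoted above would be passed to an upstream llama.cpp checkout roughly as follows. The `build` directory name is only an illustrative choice, and it is an assumption, not something confirmed here, that koboldcpp's own Makefile honors the same variable.

```shell
# Option 1: make-based build with llamafile support disabled
make GGML_NO_LLAMAFILE=1

# Option 2: CMake-based build with llamafile support disabled
# ("build" is an arbitrary output directory chosen for this example)
cmake -B build -DGGML_LLAMAFILE=OFF
cmake --build build
```

Only one of the two options is needed, depending on which build system is in use.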
