fix crash on non-AVX systems dynamically loading GGML CPU backends #11780
Merged

slaren merged 1 commit into ggml-org:master from jmorganca:jmorganca/sgemm-initialization on Feb 13, 2025
Conversation
slaren (Member) approved these changes on Feb 10, 2025 and left a comment:
Thanks, I missed this global. The fix looks ok, but if the code is not inlined it may add some overhead to the other types. I will leave this open for a while in case someone knowledgeable about llamafile/tinyblas wants to propose a better solution.
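For illustration, a hypothetical sketch of that trade-off (none of these names come from sgemm.cpp): moving the value from a file-scope constant into a member means each use reads it through `this`, which is free when inlined, but an out-of-line accessor would cost every template instantiation a call, including element types that never touch the AVX path.

```cpp
// Hypothetical sketch of the inlining concern; not the actual sgemm.cpp code.
static float make_table() { return 0.5f; }  // stand-in for an AVX initializer

template <typename T>
struct kernel_sketch {
    kernel_sketch() : lut(make_table()) {}  // built per object, not at load
    // If this accessor is inlined, the member read is essentially free;
    // if not, every instantiation (every element type) pays a call per use,
    // versus a file-scope constant the compiler could fold outright.
    T scale(T x) const { return x * lut; }
    const float lut;
};
```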
jmorganca (Contributor, Author):

Thanks for merging @slaren. I'm running some performance tests after noticing ollama/ollama#9087. I'm not sure if this PR is the root cause, but I haven't ruled it out yet. In any case, I'll keep you posted and wanted to give you a heads up just in case.
slaren (Member):

Llamafile tinyblas should only be used for prompt processing, so if you are also observing a decrease in performance during generation, it is not very likely that it was caused by this change.
orca-zhang pushed a commit to orca-zhang/llama.cpp that referenced this pull request on Feb 26, 2025
arthw pushed a commit to arthw/llama.cpp that referenced this pull request on Feb 26, 2025
V6ser pushed a commit to V6ser/llama.cpp that referenced this pull request on Mar 15, 2026
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request on Apr 26, 2026
ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request on May 6, 2026
Thanks for the awesome work by @slaren in #10469 (and a few follow-up PRs) to enable dynamic GGML backend loading. This made supporting different CPU instructions in GGML much, much easier.
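As a rough usage sketch (assuming the `ggml_backend_load_all` and device-enumeration functions declared in ggml-backend.h; check the header for the exact signatures), loading backends at runtime looks roughly like this:

```cpp
// Sketch of runtime backend loading, assuming the API from ggml-org#10469
// as declared in ggml-backend.h; not a verbatim excerpt from llama.cpp.
#include <cstdio>
#include "ggml-backend.h"

int main() {
    // Scans the search path for ggml backend shared libraries and dlopens
    // them. A CPU variant built for instructions the host lacks (e.g. AVX
    // on a non-AVX machine) must survive this dlopen -- which is exactly
    // what this PR fixes for GGML_LLAMAFILE=ON builds.
    ggml_backend_load_all();

    for (size_t i = 0; i < ggml_backend_dev_count(); i++) {
        ggml_backend_dev_t dev = ggml_backend_dev_get(i);
        std::printf("device %zu: %s\n", i, ggml_backend_dev_name(dev));
    }
    return 0;
}
```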
I noticed a small hitch with the llamafile code where a machine with a non-AVX CPU would crash when trying to dlopen CPU libraries built with GGML_LLAMAFILE=ON. This moves the AVX-dependent code into a member variable, fixing the crash on dlopen. I'm not sure how sgemm.cpp is vendored, so let me know the best way/place to suggest a change.
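For readers hitting the same class of bug, a minimal sketch of the failure mode (hypothetical names; the real change is in sgemm.cpp):

```cpp
#include <immintrin.h>

// Broken pattern (hypothetical, mirroring the bug): a namespace-scope object
// whose initializer uses AVX intrinsics. Its dynamic initializer runs from
// the shared object's init array the moment the library is dlopen'ed --
// before any CPU-feature check -- so a non-AVX host gets SIGILL.
//
//   static const __m256 kHalf = _mm256_set1_ps(0.5f);  // executes at dlopen
//
// Fixed pattern: keep the value as a class member built in the constructor,
// so the AVX instructions only run once this kernel is actually selected
// on an AVX-capable CPU.
class avx_kernel_sketch {     // hypothetical; not the real tinyBLAS class
  public:
    avx_kernel_sketch() : half(_mm256_set1_ps(0.5f)) {}
  private:
    const __m256 half;        // initialized at construction, not at load
};
```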