ggml: ggml-cpu: force-no-lto-for-cpu-feats#19609

Merged
ggerganov merged 1 commit into ggml-org:master from talhaHavadar:master on Feb 17, 2026

Conversation

@talhaHavadar
Contributor

@talhaHavadar talhaHavadar commented Feb 13, 2026

When LTO is enabled in the build environment, it is applied to every object, including the CPU feature-detection code. That code is fragile under cross-module optimization: LTO can inline architecture-specific instructions into the score function, which then crashes with an illegal-instruction error (SIGILL) before the feature check ever runs. This change disables LTO for the feature-detection code, so loading a backend built for a newer CPU on an older one (e.g., loading the power10 backend on power9) fails the score check gracefully instead of crashing.

Please also see https://salsa.debian.org/deeplearning-team/ggml/-/merge_requests/6 for more information about the issue we saw on ppc64el builds with LTO enabled in Ubuntu.

@github-actions github-actions bot added the ggml label (changes relating to the ggml tensor library for machine learning) on Feb 13, 2026
@taronaeo
Member

@shalinib-ibm Maybe you could take a look at this? They have a discussion going on at https://salsa.debian.org/deeplearning-team/ggml/-/merge_requests/6 already.

@ckastner
Collaborator

ckastner commented Feb 14, 2026

While I'm not entirely sure what the root cause for the SIGILL is, the fix at least seems consistent with the strategy for the CPU feature detection stuff: build that part with as few features as possible, so that it can run everywhere.
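For reference, the general shape of such a fix in CMake might look as follows (a sketch only; the file, target, and option names here are hypothetical and not the actual patch):

```cmake
# Sketch: compile the CPU feature-detection sources without LTO so the
# score function stays free of newer-ISA instructions inlined from
# other translation units.
if (CMAKE_INTERPROCEDURAL_OPTIMIZATION)
    set_source_files_properties(cpu-feats.cpp PROPERTIES
        COMPILE_OPTIONS "-fno-lto")
endif()

# Alternatively, switch off IPO/LTO for a whole target:
set_property(TARGET cpu-feats PROPERTY INTERPROCEDURAL_OPTIMIZATION OFF)
```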

Member

@ggerganov ggerganov left a comment


Waiting for CI before merge.

@shalinib-ibm
Contributor

@shalinib-ibm Maybe you could take a look at this? They have a discussion going on at https://salsa.debian.org/deeplearning-team/ggml/-/merge_requests/6 already.

Hi all, thanks for flagging this. Since LTO is disabled only for the CPU feature-detection code, this change has no impact on performance-critical paths. We verified this by running llama-bench with and without the change and observed no performance regressions.

@ggerganov ggerganov merged commit ae2d3f2 into ggml-org:master Feb 17, 2026
76 of 78 checks passed
liparetejas pushed a commit to liparetejas/llama.cpp that referenced this pull request Feb 23, 2026
bartowski1182 pushed a commit to bartowski1182/llama.cpp that referenced this pull request Mar 2, 2026
ArberSephirotheca pushed a commit to ArberSephirotheca/llama.cpp that referenced this pull request Mar 3, 2026
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026

Labels

ggml changes relating to the ggml tensor library for machine learning

5 participants