ggml-cpu : disable tiled matmul on AIX to fix page boundary segfault#22293

Merged
ggerganov merged 4 commits into ggml-org:master from shalinib-ibm:aix_matmul_fix
Apr 29, 2026
Conversation

@shalinib-ibm
Contributor

@shalinib-ibm shalinib-ibm commented Apr 23, 2026

vec_xst operations in the tiled path crash on AIX when writing near 4KB page boundaries due to strict memory protection. Fall back to mnpack implementation on AIX for stable execution.

Overview

This patch fixes segmentation faults in q4_0 model inference on AIX PowerPC systems by disabling the tiled matrix multiplication path in llamafile's sgemm implementation.
vec_xst operations crash on AIX when writing near 4KB page boundaries due to strict memory protection. The vec_xst instruction cannot write across page boundaries on AIX, and when the buffer offset lands at addresses like 0x1100ed000 (exactly at a page boundary), the write operation attempts to access unmapped memory, triggering a segfault.

This issue does not occur on Linux because Linux has more lenient memory management and may have adjacent pages already mapped. On AIX, page boundaries are strictly enforced.

Additional information

Requirements

vec_xst operations in the tiled path crash on AIX when writing
near 4KB page boundaries due to strict memory protection. Fall
back to mnpack implementation on AIX for stable execution.

Signed-off-by: Shalini Salomi Bodapati <Shalini.Salomi.Bodapati@ibm.com>
@shalinib-ibm shalinib-ibm requested a review from ggerganov as a code owner April 23, 2026 13:49
@shalinib-ibm
Contributor Author

@taronaeo Requesting a review of this patch. Thanks in advance.

@shalinib-ibm
Contributor Author

@ggerganov Requesting a review of this patch. Thanks in advance.

Member

@taronaeo taronaeo left a comment


Minor change. Triggered CI.

Comment thread ggml/src/ggml-cpu/llamafile/sgemm.cpp Outdated
Co-authored-by: Aaron Teo <taronaeo@gmail.com>
@github-actions github-actions Bot added the ggml changes relating to the ggml tensor library for machine learning label Apr 23, 2026
@shalinib-ibm
Contributor Author

@taronaeo
ubuntu-cpu-riscv64-native: failed due to lack of root access.
ggml-ci-win-intel-vulkan: failed to load backend (-m model path not found).
The other two test failures do not seem to be related to this PR. Can you please take a look and re-trigger these tests?

@taronaeo
Member

Noted on the CI failures. All seem unrelated to this PR.

@taronaeo taronaeo requested a review from a team April 24, 2026 14:14
Comment thread ggml/src/ggml-cpu/llamafile/sgemm.cpp Outdated
@shalinib-ibm
Contributor Author

Hi @taronaeo @CISC, if there are no more review comments, can you please help merge this PR?

@shalinib-ibm
Contributor Author

@ggerganov Can you please help review this patch?

@ggerganov ggerganov merged commit 1cbc846 into ggml-org:master Apr 29, 2026
1 check passed
tekintian added a commit to tekintian/llama.cpp that referenced this pull request May 1, 2026
* 'master' of github.com:tekintian/llama.cpp: (659 commits)
  ggml-webgpu: Improve performance of mat-vec and mat-mat for MUL_MAT_ID (ggml-org#22464)
  Update llama-mmap to use ftello/fseeko (ggml-org#22497)
  common : check for null getpwuid in hf-cache (ggml-org#22550)
  vulkan: add get/set tensor 2d functions (ggml-org#22514)
  spec: fix argument typo (ggml-org#22552)
  ci : bump ty to 0.0.33 (ggml-org#22535)
  vendor : update cpp-httplib to 0.43.2 (ggml-org#22548)
  CUDA: fix tile FA kernel on Pascal (ggml-org#22541)
  scripts : add wc2wt.sh - create worktree from current HEAD (ggml-org#22513)
  add fast matmul iquants (ggml-org#22504)
  spec : fix draft model checkpoints (ggml-org#22521)
  spec : fix vocab compat checks in spec example (ggml-org#22426)
  common : do not pass prompt tokens to reasoning budget sampler (ggml-org#22488)
  hexagon: make vmem and buffer-size configurable (ggml-org#22487)
  CUDA: fuse SSM_CONV + ADD(bias) + SILU (ggml-org#22478)
  spec : disacard last drafted token with low prob (ggml-org#22506)
  sync : ggml
  ggml : bump version to 0.10.1 (ggml/1469)
  webui: fix slow mic stop and WAV encode (ggml-org#22480)
  ggml-cpu : disable tiled matmul on AIX to fix page boundary segfault (ggml-org#22293)
  ...

# Conflicts:
#	.gitignore
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
…gml-org#22293)

* ggml-cpu : disable tiled matmul on AIX to fix page boundary segfault

vec_xst operations in the tiled path crash on AIX when writing
near 4KB page boundaries due to strict memory protection. Fall
back to mnpack implementation on AIX for stable execution.

Signed-off-by: Shalini Salomi Bodapati <Shalini.Salomi.Bodapati@ibm.com>

* Update ggml/src/ggml-cpu/llamafile/sgemm.cpp

Co-authored-by: Aaron Teo <taronaeo@gmail.com>

* Update sgemm.cpp

* Update sgemm.cpp

---------

Signed-off-by: Shalini Salomi Bodapati <Shalini.Salomi.Bodapati@ibm.com>
Co-authored-by: Aaron Teo <taronaeo@gmail.com>
