[LLVM][Codegen] Avoid segfault when arith::GetVScaleValues returns empty vector #18586

Merged
mshr-h merged 1 commit into apache:main from mshr-h:fix-segfault-aarch64-codegen on Dec 16, 2025

Conversation

@mshr-h (Contributor) commented Dec 14, 2025

As per title.

@gemini-code-assist (Contributor)

Summary of Changes

Hello @mshr-h, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request fixes a potential segmentation fault in the AArch64 LLVM code generation path. The issue arises when arith::GetVScaleValues returns an empty vector, causing the maximum-vscale computation to dereference an invalid iterator. Adding an empty-vector check keeps code generation stable in that scenario.

Highlights

  • Bug Fix: Addresses a potential segmentation fault in the AArch64 LLVM code generation when arith::GetVScaleValues returns an empty vector.
  • Defensive Programming: Introduces a check to ensure the vector returned by arith::GetVScaleValues is not empty before attempting to find its maximum element, thus preventing a runtime crash.

@mshr-h mshr-h changed the title [FIX][LLVM][Codegen] Avoid segfault when arith::GetVScaleValues returns empty vector [LLVM][Codegen] Avoid segfault when arith::GetVScaleValues returns empty vector Dec 14, 2025
@gemini-code-assist (bot) left a comment

Code Review

This pull request fixes a segmentation fault in the AArch64 LLVM code generator that occurs when arith::GetVScaleValues returns an empty vector. The original code would crash when calling std::max_element on an empty vector. The fix adds a check to ensure the vector is not empty before processing it. This change is correct and effectively prevents the crash, improving the robustness of the code generator.
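
As an illustration of the failure mode: std::max_element on an empty range returns the end iterator, and dereferencing it is undefined behavior. Below is a minimal standalone sketch (plain C++, not the TVM sources) of the unguarded and guarded versions:

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
  std::vector<unsigned int> vscale_values;  // empty, as in the reported crash

  // Unguarded: max_element on an empty range returns vscale_values.end();
  // dereferencing it is undefined behavior (here it manifested as a segfault).
  // unsigned int max_val = *std::max_element(vscale_values.begin(), vscale_values.end());

  // Guarded, mirroring the fix: compute the maximum only when the vector is non-empty.
  if (!vscale_values.empty()) {
    unsigned int max_val = *std::max_element(vscale_values.begin(), vscale_values.end());
    std::cout << "vscale range upper bound: " << max_val << "\n";
  } else {
    std::cout << "no vscale values; skip the vscale_range attribute\n";
  }
  return 0;
}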

@mshr-h mshr-h marked this pull request as ready for review December 14, 2025 09:03
@mshr-h (Contributor, Author) commented Dec 14, 2025

cc @cbalint13

    unsigned int max_val = *std::max_element(kVScaleValues.begin(), kVScaleValues.end());
    func->addFnAttr(
        llvm::Attribute::getWithVScaleRangeArgs(*llvm_target_->GetContext(), 1, max_val));
+   if (!kVScaleValues.empty()) {

@cbalint13 (Contributor) commented Dec 14, 2025

Guarding this way is fine, but we would miss the origin of the issue.

My concern: if SVE/SME ISA presence is deduced from a valid target, why is the list empty?
For a proper fix, I would be curious what the string value of target is and what value vector_width has.

Here is the generator part responsible for the population; if vector_width is missing (zero), the list will not be populated:

  unsigned int vector_width = 0;
  std::vector<unsigned int> kVScaleValues;
  if (!target.defined()) {
    target = Target::Current();
  }
  if (target.defined()) {
    static auto llvm_get_vector_width_fn =
        tvm::ffi::Function::GetGlobalRequired("target.llvm_get_vector_width");
    vector_width = llvm_get_vector_width_fn(target).cast<int>();
  }
  // scale list with powers of two
  for (unsigned int i = 0;; ++i) {
    auto power = static_cast<unsigned int>(std::pow(2, i));
    if (power > (vector_width / 8)) break;
    kVScaleValues.push_back(power);
  }
  return kVScaleValues;
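
As an illustration of when this loop leaves the list empty: any vector_width below 8 (including the default 0 that remains when no target is resolved) never pushes a value, while e.g. vector_width = 128 yields {1, 2, 4, 8, 16}. A minimal standalone sketch of the same loop (the BuildVScaleValues helper name is made up for this example):

#include <cmath>
#include <iostream>
#include <vector>

// Rebuilds the power-of-two scale list for a given vector_width (in bits),
// mirroring the population loop quoted above.
std::vector<unsigned int> BuildVScaleValues(unsigned int vector_width) {
  std::vector<unsigned int> values;
  for (unsigned int i = 0;; ++i) {
    auto power = static_cast<unsigned int>(std::pow(2, i));
    if (power > (vector_width / 8)) break;
    values.push_back(power);
  }
  return values;
}

int main() {
  // vector_width = 128 -> 1 2 4 8 16 (non-empty)
  // vector_width = 0   -> (empty): the case the new empty-vector check protects against
  for (unsigned int width : {128u, 0u}) {
    std::cout << "vector_width=" << width << " ->";
    for (unsigned int v : BuildVScaleValues(width)) std::cout << " " << v;
    std::cout << "\n";
  }
  return 0;
}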


How vector_width is populated:

  • I am afraid that TVM fails to pick up the proper vector_width during the initial parsing of arch properties based on the provided target; the initial default value is populated here:

    const int LLVMTargetInfo::GetVectorWidth() {

  • The VecWidth default may later be overridden based on vector ISA presence and properties; the riscv case is more complex:

    if (vector_width_ < zvlbits) vector_width_ = zvlbits;


If you could help me with these two pieces of info, I can come up with a proper fix.
I could also go the long way and test it on an arm machine, but then I need a clear reproducer case.

Thanks @mshr-h !

LATER EDIT:
I edited my comments here to get a clean view.

@mshr-h (Contributor, Author) commented Dec 15, 2025

Thank you for your comment @cbalint13 !
I'm targeting an Apple M4 Pro. It seems like vector_width is 128.
Here's a repro.

Code:

import torch
import torchvision
from torch.export import export
import tvm
from tvm.relax.frontend.torch import from_exported_program

tvm.support.describe()

torch_model = torchvision.models.resnet18(weights=None).eval()
example_args = (torch.randn(1, 3, 224, 224),)
exported_program = export(
    torch_model,
    args=example_args,
)

# Relax
target = tvm.target.Target(
    "llvm -keys=arm_cpu,cpu -mcpu=apple-m4 -mtriple=arm64-apple-darwin25.2.0"
)  # Apple M4 Pro target
print(f"vector_width: {tvm.get_global_func('target.llvm_get_vector_width')(target)}")

mod = from_exported_program(exported_program)
exe = tvm.compile(mod, target=target)

Output:

% uv run relax/repro_segfault.py
Python Environment
  TVM version    = 0.23.dev0
  Python version = 3.13.11 (main, Dec  9 2025, 20:26:22) [Clang 21.1.4 ] (64 bit)
  os.uname()     = Darwin 25.2.0 Darwin Kernel Version 25.2.0: Tue Nov 18 21:09:56 PST 2025; root:xnu-12377.61.12~1/RELEASE_ARM64_T6041 arm64
CMake Options:
  {
    "BACKTRACE_ON_SEGFAULT": "OFF",
    "BUILD_DUMMY_LIBTVM": "OFF",
    "BUILD_STATIC_RUNTIME": "OFF",
    "COMPILER_RT_PATH": "3rdparty/compiler-rt",
    "CUDA_VERSION": "NOT-FOUND",
    "DMLC_PATH": "3rdparty/dmlc-core/include",
    "GIT_COMMIT_HASH": "6248b5db43505fbcfb13cc289d11877d5d2649e8",
    "GIT_COMMIT_TIME": "2025-12-13 02:29:23 -0500",
    "HIDE_PRIVATE_SYMBOLS": "OFF",
    "INDEX_DEFAULT_I64": "ON",
    "INSTALL_DEV": "OFF",
    "LLVM_VERSION": "21.1.7",
    "MLIR_VERSION": "NOT-FOUND",
    "PICOJSON_PATH": "3rdparty/picojson",
    "RANG_PATH": "3rdparty/rang/include",
    "ROCM_PATH": "/opt/rocm",
    "SUMMARIZE": "OFF",
    "TVM_BUILD_PYTHON_MODULE": "OFF",
    "TVM_CLML_VERSION": "",
    "TVM_CXX_COMPILER_PATH": "/usr/bin/c++",
    "TVM_DEBUG_WITH_ABI_CHANGE": "OFF",
    "TVM_LOG_BEFORE_THROW": "OFF",
    "USE_ALTERNATIVE_LINKER": "AUTO",
    "USE_AMX": "OFF",
    "USE_ARM_COMPUTE_LIB": "OFF",
    "USE_ARM_COMPUTE_LIB_GRAPH_EXECUTOR": "OFF",
    "USE_BLAS": "none",
    "USE_BNNS": "OFF",
    "USE_BYODT_POSIT": "OFF",
    "USE_CCACHE": "AUTO",
    "USE_CLML": "OFF",
    "USE_CLML_GRAPH_EXECUTOR": "OFF",
    "USE_COREML": "OFF",
    "USE_CPP_RPC": "ON",
    "USE_CPP_RTVM": "",
    "USE_CUBLAS": "OFF",
    "USE_CUDA": "OFF",
    "USE_CUDNN": "OFF",
    "USE_CURAND": "OFF",
    "USE_CUSTOM_LOGGING": "OFF",
    "USE_CUTLASS": "OFF",
    "USE_DNNL": "OFF",
    "USE_FALLBACK_STL_MAP": "OFF",
    "USE_GTEST": "AUTO",
    "USE_HEXAGON": "OFF",
    "USE_HEXAGON_EXTERNAL_LIBS": "OFF",
    "USE_HEXAGON_GTEST": "/path/to/hexagon/gtest",
    "USE_HEXAGON_RPC": "OFF",
    "USE_HEXAGON_SDK": "/path/to/sdk",
    "USE_HIPBLAS": "OFF",
    "USE_IOS_RPC": "OFF",
    "USE_KHRONOS_SPIRV": "OFF",
    "USE_LIBBACKTRACE": "AUTO",
    "USE_LIBTORCH": "OFF",
    "USE_LLVM": "/opt/homebrew/opt/llvm/bin/llvm-config",
    "USE_METAL": "OFF",
    "USE_MIOPEN": "OFF",
    "USE_MKL": "OFF",
    "USE_MLIR": "ON",
    "USE_MRVL": "OFF",
    "USE_MSC": "OFF",
    "USE_MSCCL": "OFF",
    "USE_MSVC_MT": "OFF",
    "USE_NCCL": "OFF",
    "USE_NNAPI_CODEGEN": "OFF",
    "USE_NNAPI_RUNTIME": "OFF",
    "USE_NNPACK": "OFF",
    "USE_NVSHMEM": "OFF",
    "USE_NVTX": "OFF",
    "USE_OPENCL": "OFF",
    "USE_OPENCL_ENABLE_HOST_PTR": "OFF",
    "USE_OPENCL_EXTN_QCOM": "NOT-FOUND",
    "USE_OPENCL_GTEST": "/path/to/opencl/gtest",
    "USE_OPENMP": "OFF",
    "USE_PAPI": "OFF",
    "USE_RANDOM": "ON",
    "USE_RCCL": "OFF",
    "USE_ROCBLAS": "OFF",
    "USE_ROCM": "OFF",
    "USE_RPC": "ON",
    "USE_RTTI": "ON",
    "USE_RUST_EXT": "OFF",
    "USE_SORT": "ON",
    "USE_SPIRV_KHR_INTEGER_DOT_PRODUCT": "OFF",
    "USE_TENSORFLOW_PATH": "none",
    "USE_TENSORRT_CODEGEN": "OFF",
    "USE_TENSORRT_RUNTIME": "OFF",
    "USE_TFLITE": "OFF",
    "USE_THREADS": "ON",
    "USE_THRUST": "OFF",
    "USE_UMA": "OFF",
    "USE_VULKAN": "OFF"
  }
vector_width: 128
!!!!!!! Segfault encountered !!!!!!!
  File "build/src/ffi/backtrace.cc", line 154, in TVMFFISegFaultHandler
  File "/Users/mshr/data/project/tvm-example/tvm/src/target/llvm/codegen_aarch64.cc", line 61, in tvm::codegen::CodeGenAArch64::SetTargetAttributes(llvm::Function*)
  File "/Users/mshr/data/project/tvm-example/tvm/src/target/llvm/codegen_llvm.cc", line 287, in tvm::codegen::CodeGenLLVM::DeclareFunctionInternal(tvm::GlobalVar const&, tvm::tir::PrimFunc const&)
  File "/Users/mshr/data/project/tvm-example/tvm/src/target/llvm/codegen_llvm.h", line 656, in void tvm::codegen::CodeGenLLVM::AddFunctionsOrdered<tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator, void tvm::codegen::CodeGenLLVM::AddFunctionsOrdered<tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator>(tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator, tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator)::'lambda'(tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator)>(tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator, tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator, void tvm::codegen::CodeGenLLVM::AddFunctionsOrdered<tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator>(tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator, tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator)::'lambda'(tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator))
  File "/Users/mshr/data/project/tvm-example/tvm/src/target/llvm/codegen_llvm.h", line 181, in void tvm::codegen::CodeGenLLVM::AddFunctionsOrdered<tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator>(tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator, tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator)
  File "/Users/mshr/data/project/tvm-example/tvm/src/target/llvm/llvm_module.cc", line 356, in tvm::codegen::LLVMModuleNode::Init(tvm::IRModule const&, tvm::Target const&)
  File "/Users/mshr/data/project/tvm-example/tvm/src/target/llvm/llvm_module.cc", line 664, in tvm::codegen::LLVMReflectionRegister()::$_0::operator()(tvm::IRModule, tvm::Target) const
  File "<unknown>", line 0, in _PyEval_EvalFrameDefault
  File "<unknown>", line 0, in PyEval_EvalCode
  File "<unknown>", line 0, in run_eval_code_obj
  File "<unknown>", line 0, in run_mod.llvm.17421610541250727766
  File "<unknown>", line 0, in pyrun_file
  File "<unknown>", line 0, in _PyRun_SimpleFileObject
  File "<unknown>", line 0, in _PyRun_AnyFileObject
  File "<unknown>", line 0, in pymain_run_file_obj
  File "<unknown>", line 0, in pymain_run_file
  File "<unknown>", line 0, in Py_RunMain
  File "<unknown>", line 0, in pymain_main
  File "<unknown>", line 0, in Py_BytesMain

/Users/mshr/.local/share/uv/python/cpython-3.13.11-macos-aarch64-none/lib/python3.13/multiprocessing/resource_tracker.py:400: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown: {'/mp-dtancvc1'}
  warnings.warn(

@cbalint13 (Contributor) commented Dec 15, 2025

> Thank you for your comment @cbalint13 !
> I'm targeting an Apple M4 Pro. It seems like vector_width is 128.
> Here's a repro.

@mshr-h

Thanks a lot for the help !

  • I will look into this; I definitely would like to make it work properly!
  • I have an OPI6 (SVE+SME support) and would like to see the tensorized kernels (expecting TFLOPS level).
  • And it is time to fix tensorization for the transposed flavours of GEMM too.

I can only do this on the weekend; this week I am very busy.


In the meanwhile, if you wish, you can merge this as-is; I can fix it in a subsequent PR (will let you know / cc).
Once again, thanks for looking into this!

@cbalint13 cbalint13 self-assigned this Dec 15, 2025
@mshr-h mshr-h merged commit f2930d5 into apache:main Dec 16, 2025
19 of 20 checks passed
@mshr-h mshr-h deleted the fix-segfault-aarch64-codegen branch December 16, 2025 10:52
