[LLVM][Codegen] Avoid segfault when arith::GetVScaleValues returns empty vector #18586
Conversation
Code Review
This pull request fixes a segmentation fault in the AArch64 LLVM code generator that occurs when arith::GetVScaleValues returns an empty vector. The original code would crash when calling std::max_element on an empty vector. The fix adds a check to ensure the vector is not empty before processing it. This change is correct and effectively prevents the crash, improving the robustness of the code generator.
cc @cbalint13
The reviewed hunk in src/target/llvm/codegen_aarch64.cc, with the guard added by this PR:

```cpp
if (!kVScaleValues.empty()) {
  unsigned int max_val = *std::max_element(kVScaleValues.begin(), kVScaleValues.end());
  func->addFnAttr(
      llvm::Attribute::getWithVScaleRangeArgs(*llvm_target_->GetContext(), 1, max_val));
}
```
It would be fine to guard it this way, but we would miss the origin of the issue.
My concern is: if SVE/SME ISA presence is deduced from a valid target, why is the list empty!?
For a proper fix, I would be curious what the string value of target is and what value vector_width has.
Here is the generator part responsible for the population; a missing vector_width (zero) will not populate:
tvm/src/arith/scalable_expression.cc
Lines 106 to 123 in 6248b5d
How vector_width is populated:
- I am afraid that TVM fails to pick up the proper vector_width during the initial parsing of arch properties based on the provided target; the initial default value population happens here:
  tvm/src/target/llvm/llvm_instance.cc
  Line 913 in 6248b5d
- The VecWidth default may later be overridden based on vector ISA presence and properties; the riscv case is more complex:
  tvm/src/target/llvm/llvm_instance.cc
  Line 305 in 6248b5d
If you could help me with these two pieces of info, I can come up with a proper fix.
I could also go the long way and test it on an arm machine, but then I need a clear reproducer case.
Thanks @mshr-h !
LATER EDIT:
I updated (edited) my comments here in order to get a clean view.
Thank you for your comment @cbalint13 !
I'm targeting Apple M4 Pro. It seems like vector_width is 128.
Here's a repro.
Code:

```python
import torch
import torchvision
from torch.export import export

import tvm
from tvm.relax.frontend.torch import from_exported_program

tvm.support.describe()

torch_model = torchvision.models.resnet18(weights=None).eval()
example_args = (torch.randn(1, 3, 224, 224),)
exported_program = export(
    torch_model,
    args=example_args,
)

# Relax
target = tvm.target.Target(
    "llvm -keys=arm_cpu,cpu -mcpu=apple-m4 -mtriple=arm64-apple-darwin25.2.0"
)  # Apple M4 Pro target
print(f"vector_width: {tvm.get_global_func('target.llvm_get_vector_width')(target)}")
mod = from_exported_program(exported_program)
exe = tvm.compile(mod, target=target)
```

Output:
% uv run relax/repro_segfault.py
Python Environment
TVM version = 0.23.dev0
Python version = 3.13.11 (main, Dec 9 2025, 20:26:22) [Clang 21.1.4 ] (64 bit)
os.uname() = Darwin 25.2.0 Darwin Kernel Version 25.2.0: Tue Nov 18 21:09:56 PST 2025; root:xnu-12377.61.12~1/RELEASE_ARM64_T6041 arm64
CMake Options:
{
"BACKTRACE_ON_SEGFAULT": "OFF",
"BUILD_DUMMY_LIBTVM": "OFF",
"BUILD_STATIC_RUNTIME": "OFF",
"COMPILER_RT_PATH": "3rdparty/compiler-rt",
"CUDA_VERSION": "NOT-FOUND",
"DMLC_PATH": "3rdparty/dmlc-core/include",
"GIT_COMMIT_HASH": "6248b5db43505fbcfb13cc289d11877d5d2649e8",
"GIT_COMMIT_TIME": "2025-12-13 02:29:23 -0500",
"HIDE_PRIVATE_SYMBOLS": "OFF",
"INDEX_DEFAULT_I64": "ON",
"INSTALL_DEV": "OFF",
"LLVM_VERSION": "21.1.7",
"MLIR_VERSION": "NOT-FOUND",
"PICOJSON_PATH": "3rdparty/picojson",
"RANG_PATH": "3rdparty/rang/include",
"ROCM_PATH": "/opt/rocm",
"SUMMARIZE": "OFF",
"TVM_BUILD_PYTHON_MODULE": "OFF",
"TVM_CLML_VERSION": "",
"TVM_CXX_COMPILER_PATH": "/usr/bin/c++",
"TVM_DEBUG_WITH_ABI_CHANGE": "OFF",
"TVM_LOG_BEFORE_THROW": "OFF",
"USE_ALTERNATIVE_LINKER": "AUTO",
"USE_AMX": "OFF",
"USE_ARM_COMPUTE_LIB": "OFF",
"USE_ARM_COMPUTE_LIB_GRAPH_EXECUTOR": "OFF",
"USE_BLAS": "none",
"USE_BNNS": "OFF",
"USE_BYODT_POSIT": "OFF",
"USE_CCACHE": "AUTO",
"USE_CLML": "OFF",
"USE_CLML_GRAPH_EXECUTOR": "OFF",
"USE_COREML": "OFF",
"USE_CPP_RPC": "ON",
"USE_CPP_RTVM": "",
"USE_CUBLAS": "OFF",
"USE_CUDA": "OFF",
"USE_CUDNN": "OFF",
"USE_CURAND": "OFF",
"USE_CUSTOM_LOGGING": "OFF",
"USE_CUTLASS": "OFF",
"USE_DNNL": "OFF",
"USE_FALLBACK_STL_MAP": "OFF",
"USE_GTEST": "AUTO",
"USE_HEXAGON": "OFF",
"USE_HEXAGON_EXTERNAL_LIBS": "OFF",
"USE_HEXAGON_GTEST": "/path/to/hexagon/gtest",
"USE_HEXAGON_RPC": "OFF",
"USE_HEXAGON_SDK": "/path/to/sdk",
"USE_HIPBLAS": "OFF",
"USE_IOS_RPC": "OFF",
"USE_KHRONOS_SPIRV": "OFF",
"USE_LIBBACKTRACE": "AUTO",
"USE_LIBTORCH": "OFF",
"USE_LLVM": "/opt/homebrew/opt/llvm/bin/llvm-config",
"USE_METAL": "OFF",
"USE_MIOPEN": "OFF",
"USE_MKL": "OFF",
"USE_MLIR": "ON",
"USE_MRVL": "OFF",
"USE_MSC": "OFF",
"USE_MSCCL": "OFF",
"USE_MSVC_MT": "OFF",
"USE_NCCL": "OFF",
"USE_NNAPI_CODEGEN": "OFF",
"USE_NNAPI_RUNTIME": "OFF",
"USE_NNPACK": "OFF",
"USE_NVSHMEM": "OFF",
"USE_NVTX": "OFF",
"USE_OPENCL": "OFF",
"USE_OPENCL_ENABLE_HOST_PTR": "OFF",
"USE_OPENCL_EXTN_QCOM": "NOT-FOUND",
"USE_OPENCL_GTEST": "/path/to/opencl/gtest",
"USE_OPENMP": "OFF",
"USE_PAPI": "OFF",
"USE_RANDOM": "ON",
"USE_RCCL": "OFF",
"USE_ROCBLAS": "OFF",
"USE_ROCM": "OFF",
"USE_RPC": "ON",
"USE_RTTI": "ON",
"USE_RUST_EXT": "OFF",
"USE_SORT": "ON",
"USE_SPIRV_KHR_INTEGER_DOT_PRODUCT": "OFF",
"USE_TENSORFLOW_PATH": "none",
"USE_TENSORRT_CODEGEN": "OFF",
"USE_TENSORRT_RUNTIME": "OFF",
"USE_TFLITE": "OFF",
"USE_THREADS": "ON",
"USE_THRUST": "OFF",
"USE_UMA": "OFF",
"USE_VULKAN": "OFF"
}
vector_width: 128
!!!!!!! Segfault encountered !!!!!!!
File "build/src/ffi/backtrace.cc", line 154, in TVMFFISegFaultHandler
File "/Users/mshr/data/project/tvm-example/tvm/src/target/llvm/codegen_aarch64.cc", line 61, in tvm::codegen::CodeGenAArch64::SetTargetAttributes(llvm::Function*)
File "/Users/mshr/data/project/tvm-example/tvm/src/target/llvm/codegen_llvm.cc", line 287, in tvm::codegen::CodeGenLLVM::DeclareFunctionInternal(tvm::GlobalVar const&, tvm::tir::PrimFunc const&)
File "/Users/mshr/data/project/tvm-example/tvm/src/target/llvm/codegen_llvm.h", line 656, in void tvm::codegen::CodeGenLLVM::AddFunctionsOrdered<tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator, void tvm::codegen::CodeGenLLVM::AddFunctionsOrdered<tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator>(tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator, tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator)::'lambda'(tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator)>(tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator, tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator, void tvm::codegen::CodeGenLLVM::AddFunctionsOrdered<tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator>(tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator, tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator)::'lambda'(tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator))
File "/Users/mshr/data/project/tvm-example/tvm/src/target/llvm/codegen_llvm.h", line 181, in void tvm::codegen::CodeGenLLVM::AddFunctionsOrdered<tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator>(tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator, tvm::ffi::Map<tvm::GlobalVar, tvm::BaseFunc, void>::iterator)
File "/Users/mshr/data/project/tvm-example/tvm/src/target/llvm/llvm_module.cc", line 356, in tvm::codegen::LLVMModuleNode::Init(tvm::IRModule const&, tvm::Target const&)
File "/Users/mshr/data/project/tvm-example/tvm/src/target/llvm/llvm_module.cc", line 664, in tvm::codegen::LLVMReflectionRegister()::$_0::operator()(tvm::IRModule, tvm::Target) const
File "<unknown>", line 0, in _PyEval_EvalFrameDefault
File "<unknown>", line 0, in PyEval_EvalCode
File "<unknown>", line 0, in run_eval_code_obj
File "<unknown>", line 0, in run_mod.llvm.17421610541250727766
File "<unknown>", line 0, in pyrun_file
File "<unknown>", line 0, in _PyRun_SimpleFileObject
File "<unknown>", line 0, in _PyRun_AnyFileObject
File "<unknown>", line 0, in pymain_run_file_obj
File "<unknown>", line 0, in pymain_run_file
File "<unknown>", line 0, in Py_RunMain
File "<unknown>", line 0, in pymain_main
File "<unknown>", line 0, in Py_BytesMain
/Users/mshr/.local/share/uv/python/cpython-3.13.11-macos-aarch64-none/lib/python3.13/multiprocessing/resource_tracker.py:400: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown: {'/mp-dtancvc1'}
warnings.warn(
Thanks a lot for the help !
- I'll look at this; I definitely would like to make it work properly!
- I have an OPI6 (SVE+SME support) and would like to see the tensorized kernels (expecting TFLOPS level).
- And it is time to fix tensorization for the transposed flavours of GEMM too.
I can only do it on the weekend; this week I am very busy.
In the meanwhile, if you wish, you can merge this as-is and I can fix it in a subsequent PR (will let you know Cc ).
Once again thanks for looking into this !
As per title.