56 commits
d274861
Fix Vulkan interleave SPIRV codegen. Fix a bug in Simplify_Shuffle. F…
mcourteaux May 24, 2025
4fde938
Vector Legalization Pass. Useful for vectorizing to GPU backends with…
mcourteaux May 27, 2025
e88e66c
Fix Makefile.
mcourteaux May 27, 2025
2182bd1
Cleanup.
mcourteaux May 27, 2025
b345929
Cleanup vector legalization.
mcourteaux May 27, 2025
488426c
Try to fix the compiler complaint around visibility.
mcourteaux May 28, 2025
c44a130
GCC-9 does not understand a complete switch?
mcourteaux May 28, 2025
17a8c0a
Do not lift Let out to LetStmt if we are not in a loop with lane limi…
mcourteaux Jun 5, 2025
306b616
Improve error message for reinterpret.
mcourteaux Jun 5, 2025
2a50d11
Only run vector legalization mutators on device loops that require it.
mcourteaux Jun 5, 2025
963f510
Move required simplifier logic for the vector legalization to the act…
mcourteaux Jun 14, 2025
f381af0
Remove special handling of strict_float, as those got overhauled.
mcourteaux Jun 14, 2025
9e1329a
Hexagon codegen for vdelta fix regarding dont-care values in shuffle …
mcourteaux Aug 29, 2025
43ed906
Clang-format
mcourteaux Aug 29, 2025
4cb5c2c
Satisfy clang-tidy
mcourteaux Oct 12, 2025
3034e92
Revive.
mcourteaux Dec 13, 2025
70debb1
Restore case-insensitive sorting order.
mcourteaux Jan 28, 2026
f29344d
Feedback from Andrew.
mcourteaux Feb 21, 2026
ef2274a
Unify my own ExtractLanes and the existing Deinterleaver.
mcourteaux Mar 1, 2026
d9184c8
Don't use designated initializers. We're not on C++20 yet... :(
mcourteaux Mar 1, 2026
9601159
clang-format
mcourteaux Mar 1, 2026
cbc0031
unrelated clang-format???
mcourteaux Mar 1, 2026
d747743
Slightly better early-outing of the ExtractLanes mutator.
mcourteaux Mar 3, 2026
1937797
Forgot brackets.
mcourteaux Mar 3, 2026
4c5fc2f
Merge branch 'main' into fix-vulkan-interleave
mcourteaux Mar 3, 2026
a2f084b
Clang-tidy.
mcourteaux Mar 3, 2026
202d5c0
Two bugs identified by Gemini in CodeGen_Hexagon
mcourteaux Mar 3, 2026
0f22d87
Fix the shuffle bug that's causing everything to fail.
mcourteaux Mar 6, 2026
6f71253
Two bugs found by Gemini Pro.
mcourteaux Mar 6, 2026
7d9370d
Another bug found by Gemini Pro.
mcourteaux Mar 6, 2026
3c56378
Fix infinite recursion on shuffles of vectors with exclusively don't-…
mcourteaux Mar 6, 2026
98a49dd
Merge branch 'hvx-bugs' into fix-vulkan-interleave
mcourteaux Mar 6, 2026
4779737
I somehow f*cked up the git merge yesterday.
mcourteaux Mar 7, 2026
5fdf126
Merge branch 'main' into fix-vulkan-interleave
mcourteaux Mar 10, 2026
6565880
fix clang-tidy.
mcourteaux Mar 10, 2026
61d5c55
Use CSE across stores during legalization.
mcourteaux Mar 10, 2026
5f8e226
Address review comments.
mcourteaux Mar 10, 2026
459eed2
Add a fuzzer for extract_lanes and fix issues found
abadams Mar 10, 2026
a71db49
Merge remote-tracking branch 'origin/abadams/vector_legalization' int…
mcourteaux Mar 11, 2026
0c3b824
Clang format.
mcourteaux Mar 11, 2026
a386c58
Resolve ambiguous C++ call to Buffer constructor.
mcourteaux Mar 11, 2026
7343c37
Merge branch 'main' into fix-vulkan-interleave
mcourteaux Mar 11, 2026
06dcb86
Fix lossless casts of vector reduces down to bools
abadams Mar 12, 2026
3a30306
Merge remote-tracking branch 'origin/abadams/fix_9011' into fix-vulka…
mcourteaux Mar 13, 2026
358257b
Merge branch 'main' into fix-vulkan-interleave
mcourteaux Mar 14, 2026
4dabe54
Fix an ARM codegen issue.
mcourteaux Mar 14, 2026
c83dc51
Simplify result of extract_lanes
mcourteaux Mar 14, 2026
a3077a2
Add skip for ARM in the extract_lanes fuzz tester.
mcourteaux Mar 14, 2026
ba579b9
Merge branch 'main' into fix-vulkan-interleave
mcourteaux Mar 15, 2026
2c09330
Fixes #9030.
mcourteaux Mar 15, 2026
33fa9b6
Merge remote-tracking branch 'origin/mcourteaux/fix-simplifier-vector…
mcourteaux Mar 15, 2026
af3bc76
Move simplification calls down to where they are needed.
mcourteaux Mar 15, 2026
a165925
Fix int type warning.
mcourteaux Mar 15, 2026
01b6b1b
Extra work on fuzz test.
mcourteaux Mar 15, 2026
cd60a77
Disable fuzz tester on non-x86_64 for now.
mcourteaux Mar 15, 2026
4e3750b
Apply pre-commit auto-fixes
halide-ci[bot] Mar 15, 2026
3 changes: 3 additions & 0 deletions .gitignore
@@ -240,6 +240,9 @@ xcuserdata
# NeoVim + clangd
.cache

+ # CCLS
+ .ccls-cache
+
# Emacs
tags
TAGS
2 changes: 2 additions & 0 deletions Makefile
@@ -535,6 +535,7 @@ SOURCE_FILES = \
IRVisitor.cpp \
JITModule.cpp \
Lambda.cpp \
+ LegalizeVectors.cpp \
Lerp.cpp \
LICM.cpp \
LLVM_Output.cpp \
@@ -737,6 +738,7 @@ HEADER_FILES = \
IRVisitor.h \
JITModule.h \
Lambda.h \
+ LegalizeVectors.h \
Lerp.h \
LICM.h \
LLVM_Output.h \
5 changes: 4 additions & 1 deletion src/CMakeLists.txt
@@ -37,7 +37,8 @@ endif ()
set_target_properties(Halide PROPERTIES POSITION_INDEPENDENT_CODE ON)

##
- # Lists of source files. Keep ALL lists sorted in alphabetical order.
+ # Lists of source files. Keep ALL lists sorted in case-insensitive alphabetical order.
+ # (neo)vim users can use ":sort i" in visual line mode.
Review comment (Member) on lines +40 to +41:
We should consider using https://github.com/google/keep-sorted for this.

##

# The externally-visible header files that go into making Halide.h.
@@ -145,6 +146,7 @@ target_sources(
IRVisitor.h
JITModule.h
Lambda.h
+ LegalizeVectors.h
Lerp.h
LICM.h
LLVM_Output.h
@@ -323,6 +325,7 @@ target_sources(
IRVisitor.cpp
JITModule.cpp
Lambda.cpp
+ LegalizeVectors.cpp
Lerp.cpp
LICM.cpp
LLVM_Output.cpp
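The case-insensitive ordering rule stated in the CMakeLists.txt comment can be checked mechanically. The following is a minimal sketch, not part of the PR: `to_lower` and `is_sorted_case_insensitive` are hypothetical helper names, and ASCII file names are assumed.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: lower-case an ASCII string.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return char(std::tolower(c)); });
    return s;
}

// Hypothetical helper: true if `names` is in case-insensitive alphabetical order.
bool is_sorted_case_insensitive(const std::vector<std::string> &names) {
    return std::is_sorted(names.begin(), names.end(),
                          [](const std::string &a, const std::string &b) {
                              return to_lower(a) < to_lower(b);
                          });
}
```

With the header names from the hunk above (`Lambda.h`, `LegalizeVectors.h`, `Lerp.h`, `LICM.h`, `LLVM_Output.h`), the case-insensitive check passes, while a plain case-sensitive `std::is_sorted` fails because `LICM.h` compares less than `Lerp.h` byte-wise.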
5 changes: 5 additions & 0 deletions src/CSE.cpp
@@ -33,6 +33,11 @@ bool should_extract(const Expr &e, bool lift_all) {
return false;
}

+ if (const Call *c = e.as<Call>()) {
+ // Calls with side effects should not be moved.
+ return c->is_pure() || c->call_type == Call::Halide;
+ }
+
if (lift_all) {
return true;
}
2 changes: 1 addition & 1 deletion src/CodeGen_ARM.cpp
@@ -1524,7 +1524,7 @@ void CodeGen_ARM::visit(const Store *op) {
// Declare the function
std::ostringstream instr;
vector<llvm::Type *> arg_types;
- llvm::Type *intrin_llvm_type = llvm_type_with_constraint(intrin_type, false, is_sve ? VectorTypeConstraint::VScale : VectorTypeConstraint::Fixed);
+ llvm::Type *intrin_llvm_type = llvm_type_with_constraint(intrin_type, true, is_sve ? VectorTypeConstraint::VScale : VectorTypeConstraint::Fixed);
if (target.bits == 32) {
instr << "llvm.arm.neon.vst"
<< num_vecs
5 changes: 4 additions & 1 deletion src/CodeGen_Hexagon.cpp
@@ -1157,7 +1157,7 @@ Value *CodeGen_Hexagon::shuffle_vectors(Value *a, Value *b,
internal_assert(result_elements > 0);
llvm::Type *result_ty = get_vector_type(element_ty, result_elements);

- // Try to rewrite shuffles that only access the elements of b.
+ // Find the range of non-dont-care indices.
int min = INT_MAX;
int max = -1;
for (int idx : indices) {
@@ -1169,6 +1169,8 @@ Value *CodeGen_Hexagon::shuffle_vectors(Value *a, Value *b,
if (min == INT_MAX) {
return llvm::PoisonValue::get(result_ty);
}
+
+ // Try to rewrite shuffles that only access the elements of b.
if (min >= a_elements) {
vector<int> shifted_indices(indices);
for (int &i : shifted_indices) {
@@ -1565,6 +1567,7 @@ Value *CodeGen_Hexagon::vdelta(Value *lut, const vector<int> &indices) {
Value *ret = nullptr;
for (int i = 0; i < lut_elements; i += native_elements) {
Value *lut_i = slice_vector(lut, i, native_elements);
+ internal_assert(get_vector_num_elements(lut_i->getType()) == native_elements);
vector<int> indices_i(native_elements);
vector<Constant *> mask(native_elements);
bool all_used = true;
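The `shuffle_vectors` hunks above first compute the range of indices that are not don't-care, bail out when every index is don't-care, and only then try the "elements of b only" rewrite. Below is a standalone sketch of that control flow on plain integers. It is an illustration under assumptions, not the Hexagon code: `used_index_range` and `rebase_onto_b` are hypothetical names, negative indices are assumed to mean don't-care, and the body of the shifting loop (elided in the diff) is assumed to subtract `a_elements` from each used index.

```cpp
#include <cassert>
#include <climits>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical helper: {min, max} over all indices that are not don't-care
// (assumed here to be negative), or std::nullopt if every index is don't-care.
std::optional<std::pair<int, int>> used_index_range(const std::vector<int> &indices) {
    int min = INT_MAX;
    int max = -1;
    for (int idx : indices) {
        if (idx < 0) {
            continue;  // don't-care lane, ignore
        }
        if (idx < min) min = idx;
        if (idx > max) max = idx;
    }
    if (min == INT_MAX) {
        return std::nullopt;  // nothing is actually read
    }
    return std::make_pair(min, max);
}

// Hypothetical helper: when every used index points into b (min >= a_elements),
// rebase the indices so they address b alone. Don't-care lanes stay untouched.
std::vector<int> rebase_onto_b(const std::vector<int> &indices, int a_elements) {
    std::vector<int> shifted(indices);
    for (int &i : shifted) {
        if (i >= 0) {
            i -= a_elements;
        }
    }
    return shifted;
}
```

Checking for the all-don't-care case before the `min >= a_elements` test matters: with `min` still at `INT_MAX`, that test would otherwise succeed and the rewrite would run on a shuffle that reads nothing.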
3 changes: 2 additions & 1 deletion src/CodeGen_LLVM.cpp
@@ -5093,10 +5093,11 @@ Value *CodeGen_LLVM::shuffle_vectors(Value *a, Value *b,
}
// Check for type identity *after* normalizing to fixed vectors
internal_assert(a->getType() == b->getType());
+ int elements_a = get_vector_num_elements(a->getType());
vector<Constant *> llvm_indices(indices.size());
for (size_t i = 0; i < llvm_indices.size(); i++) {
if (indices[i] >= 0) {
- internal_assert(indices[i] < get_vector_num_elements(a->getType()) * 2);
+ internal_assert(indices[i] < elements_a * 2) << indices[i] << " " << elements_a * 2;
llvm_indices[i] = ConstantInt::get(i32_t, indices[i]);
} else {
// Only let -1 be undef.
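The index rule visible in this hunk is that a non-negative index must address one of the `2 * elements_a` lanes of the two concatenated inputs, and only `-1` is accepted as an undef lane. A small sketch of that rule as a predicate follows. `shuffle_indices_valid` is a hypothetical name, and rejecting negative values other than `-1` is an assumption drawn from the "Only let -1 be undef" comment.

```cpp
#include <cassert>
#include <vector>

// Hypothetical predicate: true if every index is either -1 (undef lane) or
// addresses a lane of the concatenation of two vectors of `elements_a` lanes.
bool shuffle_indices_valid(const std::vector<int> &indices, int elements_a) {
    for (int idx : indices) {
        if (idx == -1) {
            continue;  // undef lane
        }
        if (idx < 0 || idx >= elements_a * 2) {
            return false;  // out of range, or a negative value other than -1
        }
    }
    return true;
}
```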
24 changes: 7 additions & 17 deletions src/CodeGen_Vulkan_Dev.cpp
@@ -2086,31 +2086,21 @@ void CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(const Shuffle *op) {
debug(3) << "\n";

if (arg_ids.size() == 1) {
-
// 1 argument, just do a simple assignment via a cast
SpvId result_id = cast_type(op->type, op->vectors[0].type(), arg_ids[0]);
builder.update_id(result_id);

} else if (arg_ids.size() == 2) {
-
- // 2 arguments, use a composite insert to update even and odd indices
- uint32_t even_idx = 0;
- uint32_t odd_idx = 1;
- SpvFactory::Indices even_indices;
- SpvFactory::Indices odd_indices;
- for (int i = 0; i < op_lanes; ++i) {
- even_indices.push_back(even_idx);
- odd_indices.push_back(odd_idx);
- even_idx += 2;
- odd_idx += 2;
+ // 2 arguments, use vector-shuffle with logical indices indexing into (vec1[0], vec1[1], ..., vec2[0], vec2[1], ...)
+ SpvFactory::Indices logical_indices;
+ for (int i = 0; i < arg_lanes; ++i) {
+ logical_indices.push_back(uint32_t(i));
+ logical_indices.push_back(uint32_t(i + arg_lanes));
}

SpvId type_id = builder.declare_type(op->type);
- SpvId value_id = builder.declare_null_constant(op->type);
- SpvId partial_id = builder.reserve_id(SpvResultId);
SpvId result_id = builder.reserve_id(SpvResultId);
- builder.append(SpvFactory::composite_insert(type_id, partial_id, arg_ids[0], value_id, even_indices));
- builder.append(SpvFactory::composite_insert(type_id, result_id, arg_ids[1], partial_id, odd_indices));
+ builder.append(SpvFactory::vector_shuffle(type_id, result_id, arg_ids[0], arg_ids[1], logical_indices));
builder.update_id(result_id);

} else {
@@ -2140,7 +2130,7 @@ void CodeGen_Vulkan_Dev::SPIRV_Emitter::visit(const Shuffle *op) {
} else if (op->is_extract_element()) {
int idx = op->indices[0];
internal_assert(idx >= 0);
- internal_assert(idx <= op->vectors[0].type().lanes());
+ internal_assert(idx < op->vectors[0].type().lanes());
if (op->vectors[0].type().is_vector()) {
SpvFactory::Indices indices = {(uint32_t)idx};
SpvId type_id = builder.declare_type(op->type);
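The two-argument interleave in the first Vulkan hunk now builds one index list that addresses the concatenation (vec1[0], vec1[1], ..., vec2[0], vec2[1], ...) and feeds it to a single vector shuffle. The sketch below reproduces only that index construction on plain `std::vector`s to show the lane order it produces. `interleave_indices` and `apply_shuffle` are hypothetical names for illustration and do not use the SPIR-V builder API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper mirroring the loop in the diff: for two inputs of
// `arg_lanes` lanes each, emit indices i and i + arg_lanes in alternation.
std::vector<uint32_t> interleave_indices(int arg_lanes) {
    assert(arg_lanes > 0);
    std::vector<uint32_t> indices;
    indices.reserve(2 * arg_lanes);
    for (int i = 0; i < arg_lanes; ++i) {
        indices.push_back(uint32_t(i));              // lane i of the first vector
        indices.push_back(uint32_t(i + arg_lanes));  // lane i of the second vector
    }
    return indices;
}

// Hypothetical reference shuffle: each index selects from the concatenation
// of `a` followed by `b`.
std::vector<int> apply_shuffle(const std::vector<int> &a,
                               const std::vector<int> &b,
                               const std::vector<uint32_t> &indices) {
    std::vector<int> out;
    out.reserve(indices.size());
    for (uint32_t idx : indices) {
        assert(idx < a.size() + b.size());
        out.push_back(idx < a.size() ? a[idx] : b[idx - a.size()]);
    }
    return out;
}
```

For example, `apply_shuffle({10, 11, 12}, {20, 21, 22}, interleave_indices(3))` yields `10, 20, 11, 21, 12, 22`, with the index list being `0, 3, 1, 4, 2, 5`.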