Skip to content

Fix vector_reduce simplification bug.#9031

Merged
alexreinking merged 1 commit intomainfrom
mcourteaux/fix-simplifier-vector-reduce
Mar 15, 2026
Merged

Fix vector_reduce simplification bug.#9031
alexreinking merged 1 commit intomainfrom
mcourteaux/fix-simplifier-vector-reduce

Conversation

@mcourteaux
Copy link
Contributor

@mcourteaux mcourteaux commented Mar 15, 2026

Gemini said:

An analysis of the debug prints and Halide's source code reveals that vector_reduce_add(x9((uint8)3)) incorrectly constant-folds into x3((uint8)3) instead of x3((uint8)9). This evaluation happens because of two closely related bugs in the simplifier for VectorReduce:

  1. Incorrect Integer Check for Bounds Tracking:
    In Simplify::visit(const VectorReduce *op, ExprInfo *info), bounds tracking attempts to properly scale bounds of an addition by multiplying the current bounds by the reduction factor. However, the condition checking if bounds should be tracked checks if op->type.is_int(). In Halide, .is_int() checks if the type is a signed integer; it evaluates to false for unsigned integer types like uint8. As a result, the reduction scaling is completely skipped, leaving the initial bounds unchanged ([3, 3]). The mutator spots these single-point bounds and prematurely folds the expression down to exactly 3. We need to use .is_int_or_uint().

  2. Missing IRMatcher Structural Rewrite Rule:
    Currently, the IRMatcher structurally matches vector_reduce_add of an explicit multiplication with a broadcast (h_add(x * broadcast(y, ...), ...)). However, unlike VectorReduce::Min and VectorReduce::Max which properly flatten a bare broadcast, VectorReduce::Add lacks the rule to reduce h_add(broadcast(x, arg_lanes), lanes). Adding broadcast(x * factor, lanes) structurally captures reductions of broadcasts into equivalent scaled broadcasts even when dealing with variable expressions.

Here are the fixes for both issues:

--- Simplify_Exprs.cpp
+++ Simplify_Exprs.cpp
@@ -73,7 +73,7 @@
         return value;
     }
 
-    if (info && op->type.is_int()) {
+    if (info && op->type.is_int_or_uint()) {
         switch (op->op) {
         case VectorReduce::Add:
             // Alignment of result is the alignment of the arg. Bounds
@@ -124,7 +124,8 @@
     case VectorReduce::Add: {
         auto rewrite = IRMatcher::rewriter(IRMatcher::h_add(value, lanes), op->type);
         if (rewrite(h_add(x * broadcast(y, arg_lanes), lanes), h_add(x, lanes) * broadcast(y, lanes)) ||
-            rewrite(h_add(broadcast(x, arg_lanes) * y, lanes), h_add(y, lanes) * broadcast(x, lanes))) {
+            rewrite(h_add(broadcast(x, arg_lanes) * y, lanes), h_add(y, lanes) * broadcast(x, lanes)) ||
+            rewrite(h_add(broadcast(x, arg_lanes), lanes), broadcast(x * factor, lanes))) {
             return mutate(rewrite.result, info);
         }
         break;

Fixes #9030.

Co-authored-by: Gemini 3.1 Pro <gemini@aistudio.google.com>
@mcourteaux mcourteaux requested a review from abadams March 15, 2026 12:12
@mcourteaux mcourteaux changed the title Fixes #9030. Fix vector_reduce simplification bug. Mar 15, 2026
Copy link
Member

@alexreinking alexreinking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The rewrite rules appear sound and the tests look correct.

@alexreinking
Copy link
Member

#8928 strikes again!

@alexreinking alexreinking merged commit d9c29a5 into main Mar 15, 2026
22 of 29 checks passed
}

if (info && op->type.is_int()) {
if (info && op->type.is_int_or_uint()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can't this also be broken for float? Why is there a type check at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hah. Potentially yes. Good point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Vector reduce simplification bug.

3 participants