Asyncing a middle stage in a sliding window chain produces incorrect output #7867

@abadams

This test fails:

#include "Halide.h"
#include <cstdio>

using namespace Halide;

int main(int argc, char **argv) {

    Func f{"f"}, g{"g"}, h{"h"};
    Var x;

    f(x) = cast<uint8_t>(x + 7);
    g(x) = f(x);
    h(x) = g(x);

    f.store_root().compute_at(h, x);
    g.store_root().compute_at(h, x).async();

    Buffer<uint8_t> buf = h.realize({32});
    for (int i = 0; i < buf.dim(0).extent(); i++) {
        uint8_t correct = i + 7;
        if (buf(i) != correct) {
            printf("buf(%d) = %d instead of %d\n", i, buf(i), correct);
            return 1;
        }
    }

    return 0;
}

It produces IR of the form:

 fork {
  for (h.s0.v0, 0, h.extent.0) {
   acquire (g.folding_semaphore._0, 1) {
    produce g {
     consume f {
      g[0] = f[0]
     }
     halide_semaphore_release(g.semaphore_0, 1)
    }
   }
  }
 } {
  produce h {
   for (h.s0.v0.rebased, 0, h.extent.0) {
    produce f {
     f[0] = uint8(((h.min.0 + h.s0.v0.rebased) + 7))
    }
    acquire (g.semaphore_0, 1) {
     consume g {
      h[h.s0.v0.rebased] = g[0]
     }
    }
    halide_semaphore_release(g.folding_semaphore._0, 1)
   }
  }
 } 
 free f
 free g
}

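This is incorrect because f is potentially read before it is written: the first clause of the fork reads f[0] inside the production of g, while f is only written in the second clause, and no semaphore orders the two. I think the production of f needs to be outside the fork node instead of in the second clause of the fork node. You can't start running g until f is done, and f is not async.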
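To make the missing ordering concrete, here is a hand-written model of the three stages in plain C++ threads. It is only a sketch, not Halide's generated code and not a proposed fix for the lowering pass. `Semaphore`, `run_model`, and `f_ready` are hypothetical names; `g_ready` and `g_free` stand in for `g.semaphore_0` and `g.folding_semaphore._0`, and the initial count of 1 for the folding semaphore is an assumption. Rather than moving the production of f out of the fork, the model adds an extra `f_ready` semaphore so that the g thread cannot read f until the current value has been written, which is the dependency the IR above fails to enforce.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>
#include <vector>

// Minimal counting semaphore (hypothetical helper, not Halide's runtime type).
class Semaphore {
public:
    explicit Semaphore(int initial) : count_(initial) {}

    void release() {
        {
            std::lock_guard<std::mutex> lock(m_);
            ++count_;
        }
        cv_.notify_one();
    }

    void acquire() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

private:
    std::mutex m_;
    std::condition_variable cv_;
    int count_;
};

// Models f -> g -> h with one-element storage for f and g, as in the IR.
// The g stage runs on its own thread, mirroring the first fork clause.
std::vector<uint8_t> run_model(int extent) {
    assert(extent >= 0);
    uint8_t f = 0, g = 0;
    std::vector<uint8_t> h(extent);

    Semaphore f_ready(0);  // hypothetical: f has been written for this iteration
    Semaphore g_ready(0);  // stands in for g.semaphore_0
    Semaphore g_free(1);   // stands in for g.folding_semaphore._0 (assumed initial count)

    std::thread g_thread([&] {
        for (int i = 0; i < extent; i++) {
            g_free.acquire();   // wait until h has consumed the previous g
            f_ready.acquire();  // the ordering that the IR above is missing
            g = f;
            g_ready.release();
        }
    });

    for (int i = 0; i < extent; i++) {
        f = static_cast<uint8_t>(i + 7);  // produce f
        f_ready.release();                // only now may g read f
        g_ready.acquire();
        h[i] = g;                         // consume g
        g_free.release();
    }

    g_thread.join();
    return h;
}
```

Calling `run_model(32)` and comparing each element against `i + 7` mirrors the check in the test above. Deleting the two `f_ready` calls gives the same shape as the IR in this issue, where the read of f in the g thread is unordered with respect to the write.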
