This question refers to PR #2268.
I was working on this PR, and after the unit tests passed I did a sanity check by dumping the IR:
// attr [kernel_a.v1] storage_scope = "global"
allocate kernel_a.v1[int32 * 4 * 4]
// attr [kernel_b] storage_scope = "global"
allocate kernel_b[int32 * 4 * 4]
produce kernel_a {
  // attr [0] extern_scope = 0
  for (i, 0, 16) {
    kernel_a.v0[i] = (placeholder[i] + 2)
    kernel_a.v1[i] = (placeholder[i] + 1)
  }
}
produce kernel_b {
  // attr [0] extern_scope = 0
  for (i, 0, 4) {
    for (j, 0, 4) {
      kernel_b[((i*4) + j)] = (kernel_a.v0[((i*4) + j)]*kernel_a.v1[((i*4) + j)])
    }
  }
}
I can find no allocation for the kernel_a.v0 buffer, yet this code runs fine. Why is that?
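
For anyone who wants to repeat this sanity check on other dumps, here is a minimal sketch that scans the printed IR text for buffers that are written to but never appear in an `allocate` line. It works purely on the text of the dump above and does not call any TVM API. The function name `unallocated_buffers` and the `inputs` parameter are my own assumptions for illustration, and the regular expressions only cover the statement shapes seen in this dump.

```python
import re

# The dump from this post, pasted as plain text.
IR_DUMP = """
// attr [kernel_a.v1] storage_scope = "global"
allocate kernel_a.v1[int32 * 4 * 4]
// attr [kernel_b] storage_scope = "global"
allocate kernel_b[int32 * 4 * 4]
produce kernel_a {
  // attr [0] extern_scope = 0
  for (i, 0, 16) {
    kernel_a.v0[i] = (placeholder[i] + 2)
    kernel_a.v1[i] = (placeholder[i] + 1)
  }
}
produce kernel_b {
  // attr [0] extern_scope = 0
  for (i, 0, 4) {
    for (j, 0, 4) {
      kernel_b[((i*4) + j)] = (kernel_a.v0[((i*4) + j)]*kernel_a.v1[((i*4) + j)])
    }
  }
}
"""


def unallocated_buffers(ir_text, inputs=("placeholder",)):
    """Return buffers that are stored to but have no `allocate` line.

    Hypothetical helper: it is a text-level check only, and `inputs` lists
    names that are expected to come from outside the dumped statement.
    """
    # Lines of the form: allocate NAME[...]
    allocated = set(re.findall(r"^\s*allocate\s+([\w.]+)\[", ir_text, flags=re.M))
    # Lines of the form: NAME[index] = value
    stored = set(re.findall(r"^\s*([\w.]+)\[.*?\]\s*=", ir_text, flags=re.M))
    return sorted(stored - allocated - set(inputs))


print(unallocated_buffers(IR_DUMP))
```

The check only reports that a name is missing from the dump's `allocate` lines. It cannot tell whether the buffer is bound somewhere outside the printed statement or whether the allocation is really missing.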