[Bug] BuiltinLower has hard-coded heuristic for alloca which is not appropriate for all kDLCPU target devices #9022

@mbs-octoml

Description

if (device_type_.defined()) {

This particular code has a complex history, but its purpose is to avoid pool allocations for small tensors on CPU targets (a big perf win). However, not all devices of type 'CPU' support alloca, since the stack may not be hardware addressable. In particular, the EthosU target requires all memory ops to be rewritten to the pool allocation primitives for later handling.

So the logic that triggers this early-exit heuristic (leaving the allocation in place so that a later pass rewrites it to alloca) is hard-coded for every kDLCPU device, which is too broad.

An easy fix would be to make the kMaxStackAlloca threshold zero on targets that do not support alloca. That can be done by moving the value to a TargetKind attribute.
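As a rough illustration of that fix, here is a minimal, self-contained C++ sketch. It does not use TVM's real API: `TargetKindSketch`, the `"max_stack_alloca"` attribute key, `kDefaultMaxStackAlloca` and its value, and both helper functions are hypothetical names invented for this example. The idea is only that the pass would read the threshold from the target kind rather than from a global constant, so a target that sets it to zero never takes the stack path.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a target kind carrying integer attributes.
// This is NOT TVM's TargetKind class; it only models an attribute lookup.
struct TargetKindSketch {
  std::string name;
  std::map<std::string, int64_t> attrs;
};

// Assumed fallback used when a target kind does not set the attribute.
constexpr int64_t kDefaultMaxStackAlloca = 1024;

// Read the per-target threshold, falling back to the default.
int64_t MaxStackAllocaBytes(const TargetKindSketch& kind) {
  auto it = kind.attrs.find("max_stack_alloca");  // hypothetical attribute key
  return it == kind.attrs.end() ? kDefaultMaxStackAlloca : it->second;
}

// True when a constant-size allocation of `nbytes` may stay on the stack.
// A threshold of zero disables the stack path for that target entirely.
bool ShouldUseStackAlloca(const TargetKindSketch& kind, int64_t nbytes) {
  const int64_t limit = MaxStackAllocaBytes(kind);
  return nbytes > 0 && nbytes < limit;
}
```

With this shape, a target kind that cannot address the stack would register the attribute as 0, and every allocation would fall through to the pool allocation primitives, while ordinary CPU targets would keep the current behaviour by default.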

Labels: tir:transform (TIR transforms: src/tir/transforms and python/tvm/tir/transforms), type: bug
