[Bug] [BYOC] AoT Codegen produces invalid packed function call for relay models using multi-output subgraphs #9036

@PhilippvK

I recently ran into segmentation faults while working with rather large BYOC subgraphs and the AoT executor for MicroTVM. The issue seems to be related to BYOC subgraphs with more than one output: I was able to reproduce it with a relatively simple test case that uses only additions and the ccompiler codegen.

Expected behavior

Simplified Test Case: (see below for full example)

    ...
    # Inputs and Weights
    x = relay.var("x", shape=(10, 10))
    w0 = relay.var("w0", shape=(10, 10))
    w1 = relay.var("w1", shape=(10, 10))
    w2 = relay.var("w2", shape=(10, 10))

    # C compiler

    # z0 = x + w0
    x_ = compiler_begin(x, "ccompiler")
    w0_ = compiler_begin(w0, "ccompiler")
    z0_ = relay.add(x_, w0_)
    z0 = compiler_end(z0_, "ccompiler")

    # z1 = z0 + w1
    z0__ = compiler_begin(z0, "ccompiler")
    w1_ = compiler_begin(w1, "ccompiler")
    z1_ = relay.add(z0__, w1_)
    z1 = compiler_end(z1_, "ccompiler")

    # TVM Compiler

    # z2 = z0 + z1
    z2 = relay.add(z0, z1)

    f = relay.Function([x, w0, w1], z2)
    mod = tvm.IRModule()
    mod["main"] = f

    if merge_compiler_regions:
        mod = transform.MergeCompilerRegions()(mod)

    mod = transform.PartitionGraph("mod_name")(mod)
    mod = transform.InferType()(mod)
    ...

Running the test should not result in any failures.
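
For reference, the value the test is expected to produce can be worked out without TVM. The following is a minimal plain-Python sketch of the same graph; the helper names (`add`, `reference_output`, `filled`) and the constant input values are made up for illustration and are not part of the test script.

```python
# Plain-Python reference for the test graph (no TVM required).
# The helper names below are illustrative, not part of TVM or the test script.


def add(a, b):
    """Element-wise addition of two equally shaped 2-D lists."""
    return [[p + q for p, q in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def reference_output(x, w0, w1):
    z0 = add(x, w0)     # first ccompiler region
    z1 = add(z0, w1)    # second ccompiler region
    return add(z0, z1)  # remaining add, handled by TVM


def filled(value, shape=(10, 10)):
    """Build a rows x cols nested list filled with a single value."""
    rows, cols = shape
    return [[value] * cols for _ in range(rows)]


x, w0, w1 = filled(1.0), filled(2.0), filled(3.0)
expected = reference_output(x, w0, w1)
# Every element equals 2 * (x + w0) + w1 = 2 * (1 + 2) + 3 = 9
```

Any output that differs from `2 * (x + w0) + w1` element-wise indicates that one of the adds received the wrong buffer.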

Actual behavior

The test with merge_compiler_regions=True failed, while the one with only one relay.add per subgraph finished successfully.

Investigation

The problem seems to be that, in the tvmgen_my_mod_run_model function generated by the AoT codegen, the TVM function tvmgen_my_mod_fused_add is assumed to have 2 arguments while it actually has 3 (i.e. 2 inputs and 1 output). Therefore the last argument is not properly packed, and the corresponding stack slot still points to the 3rd argument of the previous packed function call instead of the model output.
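
The effect of the missing argument can be illustrated with a toy model of a shared argument stack. This is only a sketch of my reading of the generated code, not TVM's implementation: `stack_value` mirrors the name in the generated C, while `external_region`, `fused_add`, and `call_packed` are hypothetical stand-ins.

```python
# Toy model of an argument stack that is reused across packed calls.
# All functions here are illustrative stand-ins, not TVM code.

stack_value = [None] * 3  # argument slots shared by all calls


def external_region(args, num_args):
    """Stand-in for the previous packed call; it only takes three inputs here."""
    assert num_args == 3


def fused_add(args, num_args):
    """Stand-in for the fused add: always unpacks (lhs, rhs, out)."""
    lhs, rhs, out = args[0], args[1], args[2]  # num_args is never checked
    out[:] = [a + b for a, b in zip(lhs, rhs)]


def call_packed(func, *values):
    """Caller that only writes as many slots as it believes the callee takes."""
    for i, value in enumerate(values):
        stack_value[i] = value
    func(stack_value, len(values))


x, w0, w1 = [1.0, 1.0], [2.0, 2.0], [5.0, 5.0]
z0 = [3.0, 3.0]
output = [0.0, 0.0]

call_packed(external_region, x, w0, w1)  # slot 2 now references w1
call_packed(fused_add, z0, output)       # only two slots packed, as in the report

# The callee wrote through the stale slot 2: w1 is overwritten, output is untouched.
```

In this model `w1` silently changes and `output` keeps its initial value. In the generated C code, the equivalent write would go into an input buffer, which matters as soon as that buffer is declared const.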

Using the default aot_test_utils.py, the test will "just" fail with an output value mismatch, because all model inputs are declared as non-constant. In PhilippvK@2bb77f8 I modified the create_header_file function to store model inputs as constants, which leads to the mentioned segmentation faults caused by writing to a const variable.

As the error is only present for merge_compiler_regions=True, it should not be directly related to Tuple inputs.

Relay model after partitioning:

def @main(%x: Tensor[(10, 10), float32], %w0: Tensor[(10, 10), float32], %w1: Tensor[(10, 10), float32]) -> Tensor[(10, 10), float32] {
  %0 = @tvmgen_mod_name_ccompiler_main_0(%x, %w0, %w1) /* ty=(Tensor[(10, 10), float32], Tensor[(10, 10), float32]) */;
  %1 = %0.0;
  %2 = %0.1;
  add(%1, %2) /* ty=Tensor[(10, 10), float32] */
}

def @tvmgen_mod_name_ccompiler_main_0(%ccompiler_0_i0: Tensor[(10, 10), float32], %ccompiler_0_i1: Tensor[(10, 10), float32], %ccompiler_0_i2: Tensor[(10, 10), float32], Inline=1, Compiler="ccompiler", global_symbol="tvmgen_mod_name_ccompiler_main_0", Primitive=1) -> (Tensor[(10, 10), float32], Tensor[(10, 10), float32]) {
  %3 = add(%ccompiler_0_i0, %ccompiler_0_i1) /* ty=Tensor[(10, 10), float32] */;
  %4 = add(%3, %ccompiler_0_i2) /* ty=Tensor[(10, 10), float32] */;
  (%3, %4)
}

This is the incorrect code snippet generated by the AoT codegen:

  TVMValue stack4[6];
  void* tvm_value_2 = stack4;
  (((DLTensor*)tvm_value_2)[0].data) = sid_3;
  (((TVMValue*)stack_value)[0].v_handle) = tvm_value_2;
  ((int32_t*)stack_tcode)[(0)] = 3;
  (((TVMValue*)stack_value)[1].v_handle) = output;
  ((int32_t*)stack_tcode)[(1)] = 3;
  TVMValue ret_val1;
  int ret_type_code1;
  if (tvmgen_my_mod_fused_add( (TVMValue*) stack_value , (int*) stack_tcode, 2, &ret_val1, &ret_type_code1, NULL) != 0){
    return -1;
  }
  return 0;

Environment

Operating System: Ubuntu 18.04 & 20.04

TVM Version: 1fd8f61 (latest)

Python Version: Python v3.6

Steps to reproduce

The full pytest script for reproducing the issue can be found here, along with the mentioned modifications to the AoT test helper script.

The tests can be run using the following command:

export PYTHONPATH=$(pwd)/python
python3 -m pytest tests/python/relay/aot/test_crt_aot_bug.py -s

(The -s is useful to inspect the relay model before and after partitioning which is printed during the test.)

Make sure to compile TVM using MicroTVM and LLVM support!
