task_rust.sh and task_cpp_unittest.sh fail with updated Docker images when USE_VITIS_AI ON #10696

@leandron

Description

./tests/scripts/task_rust.sh fails when run in the updated images, which include Python 3.7, TensorFlow 2.6, and h5py 3.1.0. The error only occurs when USE_VITIS_AI is set to ON in ./tests/scripts/task_config_build_cpu.sh.

This is blocking my testing of the TensorFlow 2.6 images. I tried downgrading h5py and repeating the steps, without success. In any case, TensorFlow 2.6 requires h5py>3, so downgrading is not a viable solution.

Steps to reproduce:

cd <your tvm repo>
rm -rf build
docker pull tlcpackstaging/ci_cpu:20220321-061034-bd684deb5
./docker/bash.sh tlcpackstaging/ci_cpu:20220321-061034-bd684deb5

# this happens inside the container
./tests/scripts/task_config_build_cpu.sh build
./tests/scripts/task_build.py
./tests/scripts/task_ci_setup.sh
./tests/scripts/task_rust.sh         # this is the one that crashes

Then you should see an error similar to this:

test device::tests::device ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

free(): invalid pointer
error: test failed, to rerun pass '--lib'

Caused by:
  process didn't exit successfully: `/workspace/rust/target/debug/deps/tvm_sys-0b86e7b6a5ad689c` (signal: 6, SIGABRT: process abort signal)
script returned exit code 101

With USE_VITIS_AI set to OFF in ./tests/scripts/task_config_build_cpu.sh, repeating the same steps does not reproduce the issue.
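
For anyone comparing the two configurations, here is a minimal sketch of a way to flip the flag without editing the script itself. It assumes that task_config_build_cpu.sh writes a line of the form `set(USE_VITIS_AI ON)` into build/config.cmake; I have not checked that against the script, so adjust the path and pattern if they differ. The helper name `set_vitis_ai` is hypothetical and not part of the TVM repo.

```shell
# Hypothetical helper (not part of the TVM repo): rewrite the USE_VITIS_AI
# setting in a generated config.cmake.
# Assumption: the file contains a line of the form `set(USE_VITIS_AI ON)`
# or `set(USE_VITIS_AI OFF)`.
set_vitis_ai() {
    config_file="$1"   # e.g. build/config.cmake (assumed location)
    value="$2"         # ON or OFF

    # -i.bak edits in place and keeps a backup copy next to the file.
    sed -i.bak -E \
        "s/^set\(USE_VITIS_AI[[:space:]]+(ON|OFF)\)/set(USE_VITIS_AI ${value})/" \
        "$config_file"
}

# Example usage inside the container, after task_config_build_cpu.sh has run:
#   set_vitis_ai build/config.cmake OFF
#   grep USE_VITIS_AI build/config.cmake   # check which value is now set
```

After switching the value, rebuild and re-run ./tests/scripts/task_rust.sh to compare the ON and OFF behaviour.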

Full output in Jenkins: https://ci.tlcpack.ai/blue/organizations/jenkins/docker-images-ci%2Fdocker-image-run-tests/detail/docker-image-run-tests/69/pipeline/

cc @jtuyls @anilmartha can you help me troubleshoot this?
