Remote Cross Compilation CUDA Runtime Independent Target #5420

@JonathanMace

I'm training a resnet18_v2 model for CUDA using the RPC runner. I run an RPC server on a remote machine that has a GPU, and I run the training program locally on a machine that has no GPUs.

extract_from_program fails because it internally requires a local GPU. Specifically, in /python/tvm/relay/op/strategy/cuda.py, the code checks local GPU properties in three places, each of the form:

if nvcc.have_tensorcore(tvm.gpu(0).compute_version):

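This check fails when there is no local GPU. Even when a local GPU exists, the check is logically inconsistent for a remote target: it queries the compute version of the local device, not of the remote GPU the code will actually run on.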
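
To illustrate the kind of decoupling I have in mind, here is a minimal sketch in plain Python (no TVM imports) where the tensor core decision depends only on a compute version string that the caller supplies, rather than on a local device query. The names `tensorcore_supported` and `pick_compute_version` are hypothetical and are not part of TVM; the sketch assumes tensor cores are available from compute capability 7.0 onward, and that the remote GPU's compute version can be obtained somehow (for example from the RPC session or from user configuration).

```python
from typing import Optional, Tuple


def parse_compute_version(version: str) -> Tuple[int, int]:
    """Split a compute version string such as "7.5" into (major, minor)."""
    major, minor = version.split(".")
    return int(major), int(minor)


def tensorcore_supported(compute_version: Optional[str]) -> bool:
    """Hypothetical helper: decide from a version string alone.

    Returns False when no version is known, instead of failing
    because a local GPU is missing.
    """
    if compute_version is None:
        return False
    major, _ = parse_compute_version(compute_version)
    # Assumption: tensor cores exist from compute capability 7.0 onward.
    return major >= 7


def pick_compute_version(local_version: Optional[str],
                         remote_version: Optional[str]) -> Optional[str]:
    """Hypothetical helper: prefer the remote device's version when known."""
    return remote_version if remote_version is not None else local_version


# No local GPU (None), remote GPU reports 7.5: use the remote value.
version = pick_compute_version(local_version=None, remote_version="7.5")
use_tensorcore = tensorcore_supported(version)
```

With something along these lines, the strategy code would only need a way to pass in the target's compute version, and a machine without GPUs could still extract tasks for a remote CUDA device.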
