I'm training a resnet18_v2 model for CUDA using the RPC runner. I run an RPC server on a remote machine that has a GPU, and I run the training program locally on a machine that has no GPUs.
extract_from_program fails because it internally requires a local GPU. Specifically, /python/tvm/relay/op/strategy/cuda.py queries the properties of the local GPU in three places, each with a check like this:
if nvcc.have_tensorcore(tvm.gpu(0).compute_version):
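This check fails when there is no local GPU. Even when a local GPU does exist, the check is logically inconsistent for remote targets: it inspects the local GPU's compute version, although the generated code will run on the remote GPU, which may have different capabilities.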
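
To make the inconsistency concrete, here is a minimal pure-Python sketch of what I would expect instead: the Tensor Core decision takes the compute version of the device that will actually run the code as an explicit input, rather than reading it from tvm.gpu(0). The names supports_tensorcore and pick_strategy are hypothetical and are not TVM APIs. The rule "compute capability 7.0 or later implies Tensor Cores" is a simplifying assumption for illustration, not necessarily what nvcc.have_tensorcore does.

```python
def parse_compute_version(compute_version):
    """Split a CUDA compute capability string such as "7.5" into (major, minor)."""
    major, minor = compute_version.split(".")
    return int(major), int(minor)


def supports_tensorcore(compute_version):
    """Hypothetical stand-in for the Tensor Core check.

    Assumption: compute capability 7.0 (Volta) or later has Tensor Cores.
    """
    major, _minor = parse_compute_version(compute_version)
    return major >= 7


def pick_strategy(target_compute_version):
    """Choose a strategy name from the *target* device's compute version.

    The caller supplies the version of the GPU that will run the code
    (for example, the one reported by the remote machine), so no local
    GPU is ever queried.
    """
    if supports_tensorcore(target_compute_version):
        return "tensorcore"
    return "default"


# The local machine has no GPU; the decision depends only on the remote device.
remote_version = "7.5"  # assumed value reported by the remote GPU
strategy = pick_strategy(remote_version)
```

With this shape, a local machine that has no GPU (or a different GPU) cannot influence the result, because the only input is the version string of the target device.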