The bug introduced in #1336 slipped past testing because the tests only build host-side (CPU) code, while the error only shows up when building device-side (GPU) code. We should build device-side code in the tests as well so this doesn't happen again. So far, I have found that the PR broke PyTorch, Blender, FFmpeg, OpenPose, and the Intel Compiler.