Emit diagnostic when attempting to use extended device lambda in contexts that require querying the return type #1688
Description
A very common pitfall for Thrust users is passing a __device__ lambda to a Thrust algorithm or iterator, which then fails in silent or obscure ways.
This is frequently due to the limitation that you cannot reliably query the return type of an extended lambda in host code. Specifically:
- As described above, the CUDA compiler replaces a device extended lambda defined in a host function with a placeholder type defined in namespace scope. This placeholder type does not define a operator() function equivalent to the original lambda declaration. An attempt to determine the return type or parameter types of the operator() function may therefore work incorrectly in host code, as the code processed by the host compiler will be semantically different than the input code processed by the CUDA compiler. However, it is OK to introspect the return type or parameter types of the operator() function within device code. Note that this restriction does not apply to host device extended lambdas.
This means that any code in Thrust that relies on std::invoke_result_t or std::result_of or similar in host code will fail with a __device__ lambda, e.g.,
using result_type = std::invoke_result_t<TransformOp, input_type>;
This will frequently fail silently or fail to compile with incredibly obscure compile errors. This is such a nuisance that I avoid __device__ lambdas altogether with Thrust.
However, I recently learned that nvcc provides intrinsic type traits to query if a type is an extended lambda.
The compiler provides type traits to detect closure types for extended lambdas at compile time:
__nv_is_extended_device_lambda_closure_type(type): If 'type' is the closure class created for an extended device lambda, then the trait is true, otherwise it is false.
__nv_is_extended_host_device_lambda_closure_type(type): If 'type' is the closure class created for an extended host device lambda, then the trait is true, otherwise it is false.
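As a rough illustration of how these intrinsics could be wrapped for use in library code, here is a minimal sketch. The namespace `example`, the two variable templates, and `negate_functor` are hypothetical names invented for this example. It assumes that nvcc defines the `__CUDACC_EXTENDED_LAMBDA__` macro when extended lambdas are enabled, and it falls back to `false` otherwise so the same header also builds with a plain host compiler.

```cpp
#include <type_traits>

namespace example {  // hypothetical namespace, not part of Thrust

#if defined(__CUDACC_EXTENDED_LAMBDA__)
// nvcc with extended lambdas enabled: forward to the compiler intrinsics.
template <typename T>
inline constexpr bool is_extended_device_lambda_v =
    __nv_is_extended_device_lambda_closure_type(T);

template <typename T>
inline constexpr bool is_extended_host_device_lambda_v =
    __nv_is_extended_host_device_lambda_closure_type(T);
#else
// No extended lambda support: such closure types cannot exist.
template <typename T>
inline constexpr bool is_extended_device_lambda_v = false;

template <typename T>
inline constexpr bool is_extended_host_device_lambda_v = false;
#endif

// An ordinary named function object is never an extended lambda closure.
struct negate_functor {
  int operator()(int x) const { return -x; }
};

static_assert(!is_extended_device_lambda_v<negate_functor>,
              "named function objects are not extended lambdas");

}  // namespace example
```

Wrapping the intrinsics this way keeps the nvcc-specific spelling in one place, so the rest of the code can test a callable type without repeating the preprocessor guard.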
This means Thrust can emit a useful diagnostic when a user attempts to use a __device__ lambda in a situation where that would be problematic (e.g., std::invoke_result_t).
One easy way to do this in Thrust would be to introduce a consistent wrapper for invoke_result_t/result_of that simply static_asserts that the callable is not an extended __device__ lambda.
#include <type_traits>
namespace thrust::detail {
template <typename F, typename... Args>
struct invoke_result : std::invoke_result<F, Args...> {
static_assert(!__nv_is_extended_device_lambda_closure_type(F), "Attempt to use an extended __device__ lambda in a context that requires querying its return type. Use a named function object or a __host__ __device__ lambda instead.");
};
template <typename F, typename... Args>
using invoke_result_t = typename invoke_result<F, Args...>::type;
} // namespace thrust::detail
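To show how call sites might use a checked wrapper like this, here is a self-contained sketch. Everything in namespace `sketch` is hypothetical, as are the `SKETCH_HOST_DEVICE` macro and the `to_float` functor. None of it is existing Thrust API. The static_assert is guarded by `__CUDACC_EXTENDED_LAMBDA__` on the assumption that the intrinsic is only available when nvcc runs with extended lambdas enabled, so the code also builds as plain C++17.

```cpp
#include <type_traits>

// Hypothetical annotation macro so the functor works under nvcc and host compilers.
#if defined(__CUDACC__)
#define SKETCH_HOST_DEVICE __host__ __device__
#else
#define SKETCH_HOST_DEVICE
#endif

namespace sketch {  // hypothetical namespace, not part of Thrust

// Checked replacement for std::invoke_result: rejects extended __device__
// lambdas up front, because their return type cannot be queried in host code.
template <typename F, typename... Args>
struct invoke_result : std::invoke_result<F, Args...> {
#if defined(__CUDACC_EXTENDED_LAMBDA__)
  static_assert(!__nv_is_extended_device_lambda_closure_type(F),
                "Attempt to use an extended __device__ lambda in a context "
                "that requires querying its return type. Use a named function "
                "object or a __host__ __device__ lambda instead.");
#endif
};

template <typename F, typename... Args>
using invoke_result_t = typename invoke_result<F, Args...>::type;

// A named function object: its return type can be queried in host code.
struct to_float {
  SKETCH_HOST_DEVICE float operator()(int x) const {
    return static_cast<float>(x);
  }
};

// Typical call-site usage: deduce the element type a transform would produce.
using result_type = invoke_result_t<to_float, int>;
static_assert(std::is_same_v<result_type, float>,
              "to_float maps int to float");

}  // namespace sketch
```

With a named function object the query succeeds as usual. Under nvcc with extended lambdas enabled, substituting the closure type of a __device__ lambda for `to_float` would be expected to trigger the static_assert message instead of an obscure error further down.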
There are likely many other useful ways Thrust could use these traits. Deducing (or at least verifying) execution space from the callable comes to mind.