Is this a duplicate?
Type of Bug
Compile-time Error
Component
CUB
Describe the bug
PR #2773 changed the definitions of the constexpr values RegBoundScaling::BLOCK_THREADS and MemBoundScaling::BLOCK_THREADS in cub/util_arch.cuh to use the constexpr function ::cuda::ceil_div. This breaks NVC++ stdpar, because ::cuda::ceil_div cannot be evaluated as a constant expression in NVC++. ceil_div contains two _CCCL_ASSERT statements. With NVC++ in stdpar mode, _CCCL_ASSERT is defined as
# define _CCCL_ASSERT(expression, message) \
NV_IF_ELSE_TARGET( \
NV_IS_DEVICE, (_CCCL_ASSERT_DEVICE(expression, message);), (_CCCL_ASSERT_HOST(expression, message);))
That works fine at runtime. But NV_IF_ELSE_TARGET is not constexpr. It can't be evaluated by the front end during constexpr evaluation because the front end can't know what target the code will eventually run on. This prevents ceil_div from being usable as a constexpr function and leads to compilation errors such as:
"/home/dolsen/work/pgi/dev/nv/Linux_x86_64/mine/compilers/include-stdpar/cub/util_arch.cuh", line 136: error: expression must have a constant value
::cuda::std::min(Nominal4ByteBlockThreads,
^
"/home/dolsen/work/pgi/dev/nv/Linux_x86_64/mine/compilers/include-stdpar/cuda/__cmath/ceil_div.h", line 63: note: cannot call non-constexpr function "__builtin_is_device_code" (declared implicitly)
_CCCL_ASSERT(__a >= _Tp(0), "cuda::ceil_div: a must be non negative");
^
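To illustrate one possible direction (not necessarily how CCCL should fix this), here is a minimal sketch in plain C++ that skips a runtime-only check during constant evaluation. The names runtime_check and ceil_div_sketch are hypothetical stand-ins for the target-dispatched assert and for ::cuda::ceil_div, the arithmetic is simplified, and the sketch assumes the compiler provides __builtin_is_constant_evaluated() (GCC and Clang do; whether NVC++ accepts it in stdpar mode would need to be checked).

```cpp
#include <cassert>

// Hypothetical stand-in for a runtime-only check such as a target-dispatched
// assert. It is deliberately NOT constexpr, like the call the front end rejects.
inline void runtime_check(bool ok)
{
  assert(ok);
}

// Sketch of a ceil_div-style function (simplified; not the CCCL implementation).
// The runtime-only checks are skipped while the front end is evaluating a
// constant expression, so the function stays usable in constexpr contexts.
constexpr int ceil_div_sketch(int a, int b)
{
  if (!__builtin_is_constant_evaluated()) // assumption: builtin is available
  {
    runtime_check(a >= 0);
    runtime_check(b > 0);
  }
  return (a + b - 1) / b;
}

// Constant evaluation works because the non-constexpr call is never reached.
static_assert(ceil_div_sketch(10, 4) == 3, "ceil_div_sketch must be constexpr");
```

With this shape, a constexpr value like BLOCK_THREADS could still be initialized from the function, while runtime calls keep their checks.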
How to Reproduce
The NVHPC stdpar automated tests have many hundreds of failures due to this. Any test that uses a parallel algorithm implemented with CUB will run into this when compiled with nvc++ -stdpar.
Expected behavior
nvc++ -stdpar should work.
Reproduction link
No response
Operating System
No response
nvidia-smi output
No response
NVCC version
No response