Consider extending Clang-CUDA support/testing #2430

@StephanTLavavej

Related to #2075. Possibly related to #1598, if Clang-CUDA requires a recent NVIDIA CUDA Toolkit to be installed.

Test coverage

Currently, we have no test coverage for Clang-CUDA. If this scenario is important to some users (as it appears to be) and is likely to be damaged as we modify preprocessor logic for Clang and CUDA separately, we should have test coverage to prevent major/obvious regressions.
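As a rough illustration of what minimal coverage could look like, the sketch below is a small smoke-test translation unit that touches a few of the headers discussed in this issue. The function name `example_call_through_function` is hypothetical, and how such a file would be wired into the test harness with Clang-CUDA flags is not shown here and would need to be worked out separately.

```cpp
#include <functional>
#include <string>
#include <type_traits>

// Hypothetical smoke test: if the __CUDACC__ / __clang__ preprocessor logic
// regresses, simply compiling this file should surface major/obvious breakage.

// Exercises std::function construction and invocation from <functional>.
inline int example_call_through_function(int x) {
    std::function<int(int)> f = [](int v) { return v + 1; };
    return f(x);
}

// Exercise the assignability traits from <type_traits>.
static_assert(std::is_copy_assignable<std::string>::value, "std::string should be copy assignable");
static_assert(std::is_move_assignable<std::string>::value, "std::string should be move assignable");
```

A test like this only catches compile-time breakage and the most basic runtime behavior; it is a starting point, not a substitute for running the real test suite under Clang-CUDA.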

__CUDACC__ preprocessor logic

Also, if this is important, we should audit the codebase for places where we're testing __CUDACC__ but Clang-CUDA could handle the normal codepath instead of needing the workaround codepath. (I suspect that Clang-CUDA can handle the normal codepath when "front-end stuff" is involved, but that we need the workaround codepath when "codegen intrinsic stuff" is involved.)

Current examples:

Already patched

  • STL/stl/inc/yvals_core.h

    Lines 428 to 432 in 303df3d

    #if defined(__CUDACC__) && !defined(__clang__) // TRANSITION, VSO-568006
    #define _NODISCARD_FRIEND friend
    #else // ^^^ workaround ^^^ / vvv no workaround vvv
    #define _NODISCARD_FRIEND _NODISCARD friend
    #endif // TRANSITION, VSO-568006

    Already patched by @CaseyCarter in "<random>: Implement LWG-3519" (#2208)! 🎉 (This is a good example of "front-end stuff".)
  • STL/stl/inc/yvals_core.h

    Lines 586 to 592 in 303df3d

    #ifndef _ALLOW_COMPILER_AND_STL_VERSION_MISMATCH
    #ifdef __CUDACC__
    #if __CUDACC_VER_MAJOR__ < 10 \
    || (__CUDACC_VER_MAJOR__ == 10 \
    && (__CUDACC_VER_MINOR__ < 1 || (__CUDACC_VER_MINOR__ == 1 && __CUDACC_VER_BUILD__ < 243)))
    #error STL1002: Unexpected compiler version, expected CUDA 10.1 Update 2 or newer.
    #endif // ^^^ old CUDA ^^^

    This is what "handle Clang-CUDA" (#2075) is patching.

No action necessary

  • STL/stl/inc/cmath

    Lines 15 to 16 in 303df3d

    #if !defined(_M_CEE) && !defined(__clang__) && !defined(__CUDACC__) && !defined(__INTEL_COMPILER)
    #define _HAS_CMATH_INTRINSICS 1

    Codegen intrinsics. This check already excludes both Clang and CUDA, so no action is necessary.
  • STL/stl/inc/limits

    Lines 19 to 21 in 303df3d

    #if (defined(_M_ARM64) || defined(_M_ARM64EC)) && !defined(_M_CEE_PURE) && !defined(__CUDACC__) \
    && !defined(__INTEL_COMPILER) && !defined(__clang__) // TRANSITION, LLVM-51488
    #define _HAS_NEON_INTRINSICS 1

    STL/stl/inc/limits

    Lines 1054 to 1056 in 303df3d

    #if (defined(_M_IX86) || (defined(_M_X64) && !defined(_M_ARM64EC))) && !defined(_M_CEE_PURE) && !defined(__CUDACC__) \
    && !defined(__INTEL_COMPILER)
    #define _HAS_TZCNT_BSF_INTRINSICS 1

    STL/stl/inc/limits

    Lines 1157 to 1159 in 303df3d

    #if (defined(_M_IX86) || (defined(_M_X64) && !defined(_M_ARM64EC))) && !defined(_M_CEE_PURE) && !defined(__CUDACC__) \
    && !defined(__INTEL_COMPILER)
    #define _HAS_POPCNT_INTRINSICS 1

    These are all codegen intrinsics.
  • STL/stl/inc/type_traits

    Lines 636 to 639 in 303df3d

    #if defined(_IS_ASSIGNABLE_NOCHECK_SUPPORTED) && !defined(__CUDACC__)
    template <class _Ty>
    struct _Is_copy_assignable_no_precondition_check
    : bool_constant<__is_assignable_no_precondition_check(

    STL/stl/inc/type_traits

    Lines 661 to 664 in 303df3d

    #if defined(_IS_ASSIGNABLE_NOCHECK_SUPPORTED) && !defined(__CUDACC__)
    template <class _Ty>
    struct _Is_move_assignable_no_precondition_check
    : bool_constant<__is_assignable_no_precondition_check(add_lvalue_reference_t<_Ty>, _Ty)> {};

    This front-end __is_assignable_no_precondition_check makes MSVC behave like Clang, thus I believe there's no need to investigate making Clang-CUDA take this path.
  • STL/stl/inc/yvals_core.h

    Lines 438 to 441 in 303df3d

    #elif defined(__CUDACC__) // TRANSITION, CUDA - warning: attribute namespace "msvc" is unrecognized
    #define _MSVC_KNOWN_SEMANTICS
    #elif __has_cpp_attribute(msvc::known_semantics)
    #define _MSVC_KNOWN_SEMANTICS [[msvc::known_semantics]]

    This is for MSVC-specific type trait optimizations. No reason to make Clang-CUDA use this.
  • STL/stl/inc/yvals_core.h

    Lines 556 to 563 in 303df3d

    #ifdef __clang__
    #define _STL_DISABLE_DEPRECATED_WARNING \
    _Pragma("clang diagnostic push") \
    _Pragma("clang diagnostic ignored \"-Wdeprecated-declarations\"")
    #elif defined(__CUDACC__) || defined(__INTEL_COMPILER)
    #define _STL_DISABLE_DEPRECATED_WARNING \
    __pragma(warning(push)) \
    __pragma(warning(disable : 4996)) // was declared deprecated

    We already test for Clang before CUDA here, no action necessary. (Ditto for the restore macro below.)

Possible enhancements

  • STL/stl/inc/functional

    Lines 858 to 859 in 303df3d

    #ifdef __CUDACC__ // TRANSITION, CUDA
    #define _USE_FUNCTION_INT_0_SFINAE 0

    Front-end SFINAE. I suspect that Clang-CUDA doesn't need this workaround. (This also applies to "Use int = 0 SFINAE in <memory> to improve compiler throughput" (#2124).)
  • STL/stl/inc/xutility

    Lines 36 to 40 in 303df3d

    #ifdef __CUDACC__
    #define _CONSTEXPR_BIT_CAST inline
    #else // ^^^ workaround ^^^ / vvv no workaround vvv
    #define _CONSTEXPR_BIT_CAST constexpr
    #endif // ^^^ no workaround ^^^

    STL/stl/inc/xutility

    Lines 66 to 73 in 303df3d

    _NODISCARD _CONSTEXPR_BIT_CAST _To _Bit_cast(const _From& _Val) noexcept {
    #ifdef __CUDACC__
    _To _To_obj; // assumes default-init
    _CSTD memcpy(_STD addressof(_To_obj), _STD addressof(_Val), sizeof(_To));
    return _To_obj;
    #else // ^^^ workaround ^^^ / vvv no workaround vvv
    return __builtin_bit_cast(_To, _Val);
    #endif // ^^^ no workaround ^^^

    Clang-CUDA might be capable of using __builtin_bit_cast.
  • STL/stl/inc/yvals_core.h

    Lines 450 to 451 in 303df3d

    #elif defined(__CUDACC__) || defined(__INTEL_COMPILER)
    #define _HAS_CONDITIONAL_EXPLICIT 0 // TRANSITION, CUDA/ICC

    Front-end stuff: Clang-CUDA likely supports "conditional explicit" in all Standard modes, so making it use the modern path would be good (as we already do for vanilla Clang).
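One way to investigate the `__builtin_bit_cast` item without hard-coding a compiler check is feature detection. The sketch below assumes the compiler provides `__has_builtin` (Clang does) and falls back to `memcpy` otherwise. The names `EXAMPLE_HAS_BUILTIN_BIT_CAST`, `example_bit_cast`, and `example_float_bits` are hypothetical. Whether Clang-CUDA actually accepts the builtin is the open question here, and this sketch does not answer it.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

// Detect the builtin in two nested steps, so that compilers without
// __has_builtin never see the function-like __has_builtin(...) expression.
#if defined(__has_builtin)
#if __has_builtin(__builtin_bit_cast)
#define EXAMPLE_HAS_BUILTIN_BIT_CAST 1
#endif
#endif
#ifndef EXAMPLE_HAS_BUILTIN_BIT_CAST
#define EXAMPLE_HAS_BUILTIN_BIT_CAST 0 // conservative fallback, even if the builtin exists
#endif

template <class To, class From>
#if EXAMPLE_HAS_BUILTIN_BIT_CAST
constexpr
#else
inline
#endif
To example_bit_cast(const From& val) noexcept {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value && std::is_trivially_copyable<From>::value,
        "both types must be trivially copyable");
#if EXAMPLE_HAS_BUILTIN_BIT_CAST
    return __builtin_bit_cast(To, val);
#else
    To to_obj; // assumes To is default-constructible
    std::memcpy(&to_obj, &val, sizeof(To));
    return to_obj;
#endif
}

// Usage: view the bits of a float, assuming a 32-bit IEEE 754 float.
static_assert(sizeof(float) == sizeof(std::uint32_t), "example assumes a 32-bit float");
inline std::uint32_t example_float_bits(float f) {
    return example_bit_cast<std::uint32_t>(f);
}
```

The STL itself keys off `__CUDACC__` rather than `__has_builtin`, so an actual change would more likely add `&& !defined(__clang__)` to the existing check once Clang-CUDA is confirmed to handle the builtin.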

Labels: enhancement (something can be improved), test (related to test code)
