Add `cuda::ptx::cp_reduce_async_bulk` by ahendriksen · Pull Request #1445 · NVIDIA/cccl

ahendriksen · 2024-02-27T17:14:40Z

Description

Checklist

New or existing tests cover these changes.
The documentation is up to date with these changes.

1. Add the ifdef
2. Add min, max support for f16 and bf16 (I overlooked this initially)

miscco · 2024-02-28T16:21:05Z

+}
+#endif // __cccl_ptx_isa >= 800
+
+#ifdef _LIBCUDACXX_HAS_NVF16


Note: the PR that introduces this macro has not been merged yet, so the guarded code will always be disabled until we merge #1140.

Okay. This issue probably caused the tests to fail, so I have guarded the tests on this macro as well.

I am okay with the __half and bfloat16 variants not being available immediately. I have tested the generated PTX offline, so I know it works.
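The guarding pattern discussed in this thread can be sketched in plain host C++. This is only an illustration, not the code from the PR: `DEMO_HAS_NVF16` is a hypothetical stand-in for `_LIBCUDACXX_HAS_NVF16`, and `demo_half`, `reduce_min`, and `run_guarded_checks` are made-up names. The point is that an overload and the test that exercises it sit behind the same macro, so both disappear together while the macro is undefined.

```cpp
// Hypothetical stand-in for a feature macro such as _LIBCUDACXX_HAS_NVF16.
// It is deliberately left undefined, mirroring the state before the
// enabling PR is merged.
// #define DEMO_HAS_NVF16 1

// Overload that is always available.
inline int reduce_min(int a, int b) { return b < a ? b : a; }

#ifdef DEMO_HAS_NVF16
// Made-up half-precision placeholder type; only exists when the macro is set.
struct demo_half { float v; };
inline demo_half reduce_min(demo_half a, demo_half b) { return b.v < a.v ? b : a; }
#endif // DEMO_HAS_NVF16

// Runs the checks and returns how many passed. The half-precision check is
// guarded on the same macro as the overload it calls, so it cannot fail to
// compile while the feature is unavailable.
inline int run_guarded_checks() {
  int passed = 0;
  if (reduce_min(3, 2) == 2) { ++passed; }
#ifdef DEMO_HAS_NVF16
  if (reduce_min(demo_half{1.0f}, demo_half{0.5f}).v == 0.5f) { ++passed; }
#endif // DEMO_HAS_NVF16
  return passed;
}
```

With the macro undefined, only the integer check is compiled and run. Defining `DEMO_HAS_NVF16` before the guarded sections would enable the overload and its check at the same time.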

ahendriksen requested review from a team as code owners February 27, 2024 17:14

ahendriksen requested review from griwes and miscco February 27, 2024 17:14

miscco approved these changes Feb 28, 2024


miscco reviewed Feb 28, 2024


Comment thread libcudacxx/include/cuda/std/detail/libcxx/include/__cuda/ptx.h

ahendriksen added 2 commits February 28, 2024 17:01

Add cuda::ptx::cp_reduce_async_bulk

34b639c

Fix f16 and bf16 support

312c5b6

1. Add the ifdef
2. Add min, max support for f16 and bf16 (I overlooked this initially)

ahendriksen force-pushed the add-ptx-cp-reduce-async-bulk branch from 44aa1af to 312c5b6 February 28, 2024 16:05

miscco reviewed Feb 28, 2024


cp.reduce.async.bulk: guard {b}f16 tests

d17a6c8

miscco enabled auto-merge (squash) February 29, 2024 07:15

auto-merge was automatically disabled February 29, 2024 07:25
Pull Request is not mergeable

miscco merged commit 4495154 into NVIDIA:main Mar 1, 2024

miscco merged 3 commits into NVIDIA:main from ahendriksen:add-ptx-cp-reduce-async-bulk
