-
Notifications
You must be signed in to change notification settings - Fork 95
Description
Problem Description
In oneMKL Interfaces library, a test using the rocFFT backend fails. This test is a forward transposing DFT. I believe this a bug in rocFFT, introduced between the releases for ROCm 5.4.3 and ROCm 5.7.1. The issue is described in a comment at uxlfoundation/oneMath#559.
The DFT that fails can be described as:
lengths = {4, 4, 4}
inStrides = {16 1 4}
outStrides = {1 16 4}
batches = 2 // may not be important
This can be reproduced by building oneMKL interfaces library with the rocFFT backend for ROCm 5.7.1 or later, and running the the test. Intel's oneAPI base toolkit 2024.2 and the Codeplay's plugin for AMD GPUs are required (see the oneMKL interfaces documentation).
cmake $ONEMKL_SOURCE_DIR -DCMAKE_CXX_COMPILER=icpx -DCMAKE_C_COMPILER=icx -DENABLE_MKLGPU_BACKEND=OFF -DENABLE_MKLCPU_BACKEND=OFF -DENABLE_ROCFFT_BACKEND=ON -DTARGET_DOMAINS=dft -DBUILD_FUNCTIONAL_TESTS=ON -DHIP_CXX_COMPILER=/path/to/rocm/version/llvm/bin/clang++
make
./bin/test_main_dft_ct --gtest_filter=*omputeTests_out_of_place_COMPLEX.COMPLEX_SINGLE_out_of_place_USM/sizes_4x4x4_fwd_strides_2_4_1_16_bwd_strides_1_4_16_1_*
I've identified a bug in the rocFFT implementation that fixes the above issue in my tests: a lambda captures a variable by value where by reference is intended. I don't entirely understand why this works if reordering the DFT dimensions is otherwise valid, so it might be dumb luck.
diff --git a/library/src/plan.cpp b/library/src/plan.cpp
index 969bdd04..1ef07ded 100644
--- a/library/src/plan.cpp
+++ b/library/src/plan.cpp
@@ -237,7 +237,7 @@ void rocfft_plan_t::sort()
return;
bool sort_on_istride = true;
- auto sorter = [sort_on_istride](const rocfft_iodim& a, const rocfft_iodim& b) {
+ auto sorter = [&sort_on_istride](const rocfft_iodim& a, const rocfft_iodim& b) {
// move any lengths of 1 to the end
if(a.length == 1 && b.length != 1)
return false;
This fix was tested on the rocFFT version corresponding to ROCm 5.7.1 only.
Operating System
Ubuntu 22.04
CPU
AMD EPYC 7402 w/MI210, i9-12900K w/w6800
GPU
AMD Instinct MI210, AMD Radeon Pro W6800
ROCm Version
ROCm 6.1.0, ROCm 6.0.0, ROCm 5.7.1
ROCm Component
rocFFT
Steps to Reproduce
No response
(Optional for Linux users) Output of /opt/rocm/bin/rocminfo --support
No response
Additional Information
No response