Describe the bug
I'm debugging the SYCL backend of llama.cpp.
I found that some kernels output -nan when built in Debug mode. The root cause appears to be this call:
sycl::select_from_group(g, x,
target_offset < logical_sub_group_size
? start_index + target_offset
: id);
It returns nan when local_id is larger than 16.
I can also reproduce this by using sg.shuffle:
tmp += sg.shuffle(x, i);
When i > 16, the result of the shuffle is nan. Both calls work fine in a Release build.
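For comparison, here is a minimal host-side sketch of what the shuffle loop is expected to produce. It is not SYCL code and not taken from llama.cpp: it models a sub-group as a plain array, and the names `kSubGroupSize`, `emulated_shuffle`, `reference_shuffle_sum` and `any_nan` are hypothetical. It also assumes a sub-group width of 32 and a loop over every lane index, neither of which is stated above.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

// Assumed sub-group width. The report only implies that lanes above 16 exist.
constexpr std::size_t kSubGroupSize = 32;

using LaneValues = std::array<float, kSubGroupSize>;

// Host-side model of sg.shuffle(x, i): every lane reads the value held by lane i.
inline float emulated_shuffle(const LaneValues& lane_values, std::size_t src_lane) {
    return lane_values[src_lane];
}

// Expected per-lane result of `tmp += sg.shuffle(x, i)` for i in [0, kSubGroupSize).
// Each lane accumulates the values of all lanes, so every lane ends with the same sum.
inline LaneValues reference_shuffle_sum(const LaneValues& lane_values) {
    LaneValues out{};
    for (std::size_t lane = 0; lane < kSubGroupSize; ++lane) {
        float tmp = 0.0f;
        for (std::size_t i = 0; i < kSubGroupSize; ++i) {
            tmp += emulated_shuffle(lane_values, i);
        }
        out[lane] = tmp;
    }
    return out;
}

// True if any lane holds NaN, which is the symptom seen in the Debug build.
inline bool any_nan(const LaneValues& lane_values) {
    for (float v : lane_values) {
        if (std::isnan(v)) {
            return true;
        }
    }
    return false;
}
```

Filling the input with finite values and comparing the device output against `reference_shuffle_sum` lane by lane should show whether only the lanes that read from source indices above 16 diverge.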
To reproduce
- Check out llama.cpp: https://github.com/ggerganov/llama.cpp
- Build with the SYCL Debug configuration.
- Run inference.
Environment
Windows 11
Intel Arc770
Intel(R) oneAPI DPC++/C++ Compiler 2024.1.0 (2024.1.0.20240308)
Target: x86_64-pc-windows-msvc
Thread model: posix
Additional context
No response