[SYCL] Update SYCL-Rope op and Refactor #8157
Conversation
This is my first PR to SYCL :). @airMeng, @luoyu-intel, can you please take a look? Do I need other tests to verify it?

With this PR, the DeepSeek-Coder-V2-Lite-Instruct model is working perfectly.
Force-pushed from 3284c9c to 0ea9ccb
I apologize for the false feedback. I just discovered that I was building the branch with the wrong flag, so the build didn't use GPU offload. I didn't notice this because the DeepSeek v2 Lite model runs very quickly. With the correct flag and GPU acceleration enabled, llama-server crashes with 'GGML_ASSERT: S:/LLM/SYCL/llama.cpp/ggml/src/ggml-sycl.cpp:3226: dim == 2', just like the main branch.
@characharm do you mean https://github.com/zhentaoyu/llama.cpp/blob/0ea9ccbdda9ce342ef7e800cce3606fca1ff1225/ggml/src/ggml-sycl.cpp#L3014? |
Signed-off-by: Yu Zhentao <zhentao.yu@intel.com>
Force-pushed from 0ea9ccb to 43aa0d3
* align with rope.cu and move sycl-op to a single file
modifications:
* align with rope.cu and move the SYCL rope op into a single file under the ggml-sycl folder.

UT:
ONEAPI_DEVICE_SELECTOR=level_zero:gpu7 ./build/bin/test-backend-ops -b SYCL7 -o ROPE

before:
after: all contiguous src0 UT cases pass.