Update ggml_sycl_op_mul_mat_vec_q by AidanBeltonS · Pull Request #5502 · ggml-org/llama.cpp

AidanBeltonS · 2024-02-15T10:50:52Z

This PR updates the unsupported quantized data types and refactors the code for ggml_sycl_op_mul_mat_vec_q.
SYCL does not currently have the intrinsics to support some quantized data types. This PR adds one missing quantized data type to the unsupported check, so tests that use it won't be run.
This also refactors the code so there is a single templated mul_mat_vec_q_sycl_submitter rather than multiple near-duplicate functions that each submit a differently instantiated kernel. This makes the code less verbose and much smaller.
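
To illustrate the refactor described above, here is a minimal host-only sketch of the pattern, not the actual SYCL code from this PR. The names `vec_dot_fn`, `vec_dot_plain`, `vec_dot_scaled`, and `mul_mat_vec_submitter` are hypothetical, and plain float loops stand in for the quantized device kernels. The idea is that the per-type pieces become template parameters of one function instead of being copied into one submit function per type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical signature for a per-type dot-product routine. In the real code
// these would be device functions for each quantization type.
using vec_dot_fn = float (*)(const float * x, const float * y, std::size_t n);

// Illustrative stand-in for one type-specific dot product.
inline float vec_dot_plain(const float * x, const float * y, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        sum += x[i] * y[i];
    }
    return sum;
}

// Illustrative stand-in for a second type-specific dot product.
inline float vec_dot_scaled(const float * x, const float * y, std::size_t n) {
    return 0.5f * vec_dot_plain(x, y, n);
}

// One templated "submitter": block size and dot routine are compile-time
// parameters, so each instantiation replaces a hand-written duplicate.
template <int block_size, vec_dot_fn vec_dot>
void mul_mat_vec_submitter(const std::vector<float> & mat,
                           const std::vector<float> & vec,
                           std::vector<float> & dst,
                           std::size_t ncols, std::size_t nrows) {
    assert(ncols % block_size == 0);
    assert(mat.size() == ncols * nrows && vec.size() == ncols);
    dst.assign(nrows, 0.0f);
    for (std::size_t row = 0; row < nrows; ++row) {
        // Accumulate the row result one block at a time.
        for (std::size_t b = 0; b < ncols; b += block_size) {
            dst[row] += vec_dot(mat.data() + row * ncols + b, vec.data() + b, block_size);
        }
    }
}
```

A caller would then pick an instantiation per type, for example `mul_mat_vec_submitter<2, vec_dot_plain>(mat, vec, dst, ncols, nrows)`, rather than calling a separately written function for each type.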

AidanBeltonS · 2024-02-15T10:52:10Z

@NeoZhangJianyu, @abhilash1910, @Alcpz, feedback would be appreciated

Alcpz

Minor comment on the refactor. Looks great.

abhilash1910 · 2024-02-15T11:31:43Z

Thanks @AidanBeltonS , could you please rebase to latest master for CI build?
Also tagging @ggerganov & @airMeng for a look when available.

abhilash1910 · 2024-02-16T06:29:14Z

@ggerganov @0cc4m I think the Vulkan CI build is exiting abruptly; the issue may be common to other pull requests. Could you help take a look? Thanks

ggerganov · 2024-02-16T09:15:39Z

It's because we enabled the gguf example which links only the ggml library and does not link ggml-vulkan or ggml-rocm:

#5216 (comment)

We can easily disable the gguf example from the build, but I'm wondering if we need separate libs in the first place. Can we not link everything in ggml, similar to what CUDA, Metal, SYCL, etc. do?

airMeng

LGTM

abhilash1910 · 2024-02-19T02:32:30Z

@AidanBeltonS could you please rebase to latest master? That should resolve some build issues with the Vulkan CI.

NeoZhangJianyu · 2024-02-19T09:24:06Z

"unsupported quantized data types"
Which data types are supported by this PR?
How can we test with the supported data types?


AidanBeltonS · 2024-02-19T10:53:31Z

"unsupported quantized data types"
Which data types are supported by this PR?
How can we test with the supported data types?

This PR does not change the supported and unsupported data types. If you look at the switch case, all the types listed there are supported. The PR simply adds an explicit assert for the quantization types we do not support, and marks tests that use those data types as unsupported.
All the regular tests for this functionality still run, so any existing testing for the other quantization types will run as normal to check functionality.
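
The switch-plus-assert arrangement described in this reply can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the code from the PR: the `quant_type` enum is a hypothetical stand-in for the real ggml type enum, the function names are made up, and only `IQ1_S` is taken from this PR's commit message as an unsupported type. The other entries are arbitrary examples.

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced stand-in for the ggml type enum.
enum class quant_type { Q4_0, Q8_0, IQ1_S };

// Support check that a test harness could query to mark a test as unsupported.
inline bool mul_mat_vec_q_supported(quant_type t) {
    switch (t) {
        case quant_type::Q4_0:
        case quant_type::Q8_0:
            return true;
        case quant_type::IQ1_S:
            return false;
    }
    return false;
}

// Dispatch: every type listed with a kernel is supported; anything else
// falls through to an explicit assert instead of failing silently.
inline std::string dispatch_mul_mat_vec_q(quant_type t) {
    switch (t) {
        case quant_type::Q4_0: return "q4_0 kernel";
        case quant_type::Q8_0: return "q8_0 kernel";
        case quant_type::IQ1_S: break; // no kernel available in this sketch
    }
    assert(false && "unsupported quantized type");
    return "";
}
```

Note that `assert` is compiled out when `NDEBUG` is defined, so in this sketch the support check is what keeps unsupported types from reaching the dispatch in release builds.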

* Update ggml_sycl_op_mul_mat_vec_q
* Apply suggestions from code review
  Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
* revert suggestion on macro
* fix bug
* Add quant type GGML_TYPE_IQ1_S to unsupported
* fix format

---------

Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>

Alcpz reviewed Feb 15, 2024
