Skip to content

Update ggml_sycl_op_mul_mat_vec_q#5502

Merged
abhilash1910 merged 6 commits intoggml-org:masterfrom
AidanBeltonS:Update_mul_mat_vec_q
Feb 20, 2024
Merged

Update ggml_sycl_op_mul_mat_vec_q#5502
abhilash1910 merged 6 commits intoggml-org:masterfrom
AidanBeltonS:Update_mul_mat_vec_q

Conversation

@AidanBeltonS
Copy link
Copy Markdown
Contributor

This PR updates the unsupported quantized data types and refactors the code for ggml_sycl_op_mul_mat_vec_q.
SYCL does not currently have the intrinsics to support some quantized data types, this adds one missing quantized data type to the unsupported check, so tests won't be run.
This also refactors the code so there is a single templated mul_mat_vec_q_sycl_submitter rather than multiple duplicate functions which submit a different instantiated kernel. This makes the code less verbose and much smaller.

@AidanBeltonS
Copy link
Copy Markdown
Contributor Author

@NeoZhangJianyu, @abhilash1910, @Alcpz, feedback would be appreciated

Copy link
Copy Markdown
Collaborator

@Alcpz Alcpz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment on the refactor. Looks great.

Comment thread ggml-sycl.cpp Outdated
Comment thread ggml-sycl.cpp Outdated
Comment thread ggml-sycl.cpp Outdated
Comment thread ggml-sycl.cpp Outdated
Comment thread ggml-sycl.cpp Outdated
Comment thread ggml-sycl.cpp Outdated
@abhilash1910
Copy link
Copy Markdown
Contributor

Thanks @AidanBeltonS , could you please rebase to latest master for CI build?
Also tagging @ggerganov & @airMeng for a look when available.

@abhilash1910
Copy link
Copy Markdown
Contributor

@ggerganov @0cc4m I think the vulkan build CI is exiting abruptly - maybe issue is common for other requests. Could you help take a look ? Thanks

@ggerganov
Copy link
Copy Markdown
Member

ggerganov commented Feb 16, 2024

It's because we enabled the gguf example which links only the ggml library and does not link ggml-vulkan or ggml-rocm:

#5216 (comment)

We can easily disable the gguf example from the build, but I'm wondering if we need separate libs in the first place. Can we not link everything in ggml, similar to what CUDA, Metal, SYCL, etc. do

Copy link
Copy Markdown
Contributor

@airMeng airMeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@abhilash1910
Copy link
Copy Markdown
Contributor

@AidanBeltonS could you please rebase to latest master - should solve some build issues with vulkan ci.

@NeoZhangJianyu
Copy link
Copy Markdown
Contributor

  1. "unsupported quantized data types"
    What's the data types to be supported by this PR?

  2. How to test with the supported data type?

Aidan and others added 4 commits February 19, 2024 10:47
Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
@AidanBeltonS
Copy link
Copy Markdown
Contributor Author

AidanBeltonS commented Feb 19, 2024

  1. "unsupported quantized data types"
    What's the data types to be supported by this PR?

    1. How to test with the supported data type?
  1. This PR does not change the supported and unsupported data types. If you look at the switch case, all the types that are there are supported. It simply explicitly sets an assert for the quantization types we do not support, and sets tests which use that data type to unsupported.

  2. All the regular tests for this functionality are run. So any existing testing for the other quantization types will still be run as normal to check functionality.

@abhilash1910 abhilash1910 merged commit b9111bd into ggml-org:master Feb 20, 2024
jordankanter pushed a commit to jordankanter/llama.cpp that referenced this pull request Mar 13, 2024
* Update ggml_sycl_op_mul_mat_vec_q

* Apply suggestions from code review

Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>

* revert suggestion on macro

* fix bug

* Add quant type GGML_TYPE_IQ1_S to unsupported

* fix format

---------

Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
* Update ggml_sycl_op_mul_mat_vec_q

* Apply suggestions from code review

Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>

* revert suggestion on macro

* fix bug

* Add quant type GGML_TYPE_IQ1_S to unsupported

* fix format

---------

Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
phuongncn pushed a commit to phuongncn/llama.cpp-gx10-dgx-sparks-deepseekv4 that referenced this pull request Apr 28, 2026
* Update ggml_sycl_op_mul_mat_vec_q

* Apply suggestions from code review

Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>

* revert suggestion on macro

* fix bug

* Add quant type GGML_TYPE_IQ1_S to unsupported

* fix format

---------

Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
ljubomirj pushed a commit to ljubomirj/llama.cpp that referenced this pull request May 6, 2026
* Update ggml_sycl_op_mul_mat_vec_q

* Apply suggestions from code review

Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>

* revert suggestion on macro

* fix bug

* Add quant type GGML_TYPE_IQ1_S to unsupported

* fix format

---------

Co-authored-by: Abhilash Majumder <30946547+abhilash1910@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants