
added note for old Intel hardware pre sycl #18017

Merged
NeoZhangJianyu merged 3 commits into ggml-org:master from alosslessdev:patch-6 on Dec 16, 2025

added note for old Intel hardware pre sycl#18017
NeoZhangJianyu merged 3 commits intoggml-org:masterfrom
alosslessdev:patch-6

Conversation

@alosslessdev
Contributor

Older hardware used OpenCL.

@github-actions github-actions bot added the documentation (Improvements or additions to documentation) and SYCL (https://en.wikipedia.org/wiki/SYCL - GPU programming language) labels Dec 14, 2025
Contributor

@NeoZhangJianyu left a comment


Intel GPUs can be supported by several backends, including OpenCL.
But I don't like this sentence in the SYCL backend doc:

I don't know why, but the current OpenCL backend has narrowed its scope from supporting multiple GPUs to only Qualcomm Adreno GPUs. That's bad news.
According to OpenCL.md, only three Qualcomm Adreno GPUs are listed as verified.
If you promise to support Intel GPUs, please list them in your supported/verified hardware list.
But the latest OpenCL.md makes no promise to support Intel GPUs.
Based on the above, I don't think asking Intel GPU users to try the OpenCL backend is a good suggestion.

Every backend can promote itself in its own guide/doc.
Do not cross the line with your advertising! :)

Thank you!

@savvadesogle

@NeoZhangJianyu
Hi, I tested some models with the OpenCL backend:

C:\dev\llama_opencl\llama.cpp\build\bin\Release>llama-bench -m T:\models\TheBloke\Llama-2-7B-GGUF\llama-2-7b.Q4_0.gguf -ngl 100 -fa 0,1
ggml_opencl: selected platform: 'Intel(R) OpenCL Graphics'

ggml_opencl: device: 'Intel(R) Arc(TM) A770 Graphics (OpenCL 3.0 NEO )'
ggml_opencl: OpenCL driver: 32.0.101.8136
ggml_opencl: vector subgroup broadcast support: false
ggml_opencl: device FP16 support: true
ggml_opencl: mem base addr align: 128
ggml_opencl: max mem alloc size: 4095 MB
ggml_opencl: device max workgroup size: 1024
ggml_opencl: SVM coarse grain buffer support: true
ggml_opencl: SVM fine grain buffer support: false
ggml_opencl: SVM fine grain system support: false
ggml_opencl: SVM atomics support: false
ggml_opencl: flattening quantized weights representation as struct of arrays (GGML_OPENCL_SOA_Q)
ggml_opencl: loading OpenCL kernels....................................................................
ggml_opencl: default device: 'Intel(R) Arc(TM) A770 Graphics (OpenCL 3.0 NEO )'

| model | size | params | backend | ngl | fa | test | t/s |
| --- | --- | --- | --- | --- | --- | --- | --- |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | OpenCL | 100 | 0 | pp512 | 260.32 ± 0.91 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | OpenCL | 100 | 0 | tg128 | 33.67 ± 0.10 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | OpenCL | 100 | 1 | pp512 | 179.66 ± 0.43 |
| llama 7B Q4_0 | 3.56 GiB | 6.74 B | OpenCL | 100 | 1 | tg128 | 31.89 ± 0.48 |

build: 4a4f7e6 (7409)

C:\dev\llama_opencl\llama.cpp\build\bin\Release>llama-bench -m T:\models\lmstudio-community\gpt-oss-20b-GGUF\gpt-oss-20b-MXFP4.gguf -ngl 100 -fa 0,1
ggml_opencl: selected platform: 'Intel(R) OpenCL Graphics'

ggml_opencl: device: 'Intel(R) Arc(TM) A770 Graphics (OpenCL 3.0 NEO )'
ggml_opencl: OpenCL driver: 32.0.101.8136
ggml_opencl: vector subgroup broadcast support: false
ggml_opencl: device FP16 support: true
ggml_opencl: mem base addr align: 128
ggml_opencl: max mem alloc size: 4095 MB
ggml_opencl: device max workgroup size: 1024
ggml_opencl: SVM coarse grain buffer support: true
ggml_opencl: SVM fine grain buffer support: false
ggml_opencl: SVM fine grain system support: false
ggml_opencl: SVM atomics support: false
ggml_opencl: flattening quantized weights representation as struct of arrays (GGML_OPENCL_SOA_Q)
ggml_opencl: loading OpenCL kernels....................................................................
ggml_opencl: default device: 'Intel(R) Arc(TM) A770 Graphics (OpenCL 3.0 NEO )'

| model | size | params | backend | ngl | fa | test | t/s |
| --- | --- | --- | --- | --- | --- | --- | --- |
| gpt-oss 20B MXFP4 MoE | 11.27 GiB | 20.91 B | OpenCL | 100 | 0 | pp512 | 293.56 ± 0.56 |
| gpt-oss 20B MXFP4 MoE | 11.27 GiB | 20.91 B | OpenCL | 100 | 0 | tg128 | 41.14 ± 0.34 |
| gpt-oss 20B MXFP4 MoE | 11.27 GiB | 20.91 B | OpenCL | 100 | 1 | pp512 | 289.83 ± 0.45 |
| gpt-oss 20B MXFP4 MoE | 11.27 GiB | 20.91 B | OpenCL | 100 | 1 | tg128 | 40.53 ± 0.18 |

build: 4a4f7e6 (7409)
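
For anyone wanting to reproduce benchmarks like the ones above, a minimal sketch of building llama.cpp with the OpenCL backend and running llama-bench follows. The model path and layer count are illustrative; the `GGML_OPENCL` CMake option matches the one documented in the repository's OpenCL backend guide, and an OpenCL SDK/ICD loader must be discoverable by CMake:

```shell
# Build llama.cpp with the OpenCL backend enabled
cmake -B build -DGGML_OPENCL=ON
cmake --build build --config Release

# Benchmark a GGUF model, offloading all layers (-ngl 100)
# and testing with flash attention off and on (-fa 0,1)
build/bin/llama-bench -m path/to/model.gguf -ngl 100 -fa 0,1
```

Note that on Intel hardware the selected platform should report as 'Intel(R) OpenCL Graphics' in the `ggml_opencl` startup log, as seen in the runs above.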

@NeoZhangJianyu
Contributor

Yes, I see.

I wonder why OpenCL.md mentions focusing on Qualcomm Adreno GPUs.
Will the OpenCL backend include Intel GPUs in its scope?

Thank you!

Contributor

@NeoZhangJianyu left a comment


I think it provides another choice for Intel GPU users.

Thank you!

@NeoZhangJianyu NeoZhangJianyu merged commit 279cef2 into ggml-org:master Dec 16, 2025
2 checks passed
Anico2 added a commit to Anico2/llama.cpp that referenced this pull request Jan 15, 2026
* added note for old Intel hardware pre sycl

Older hardware used opencl

* typo

* use consistent terms
blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026

3 participants