vulkan: allow graphics queue only through env var #20599

Merged: 0cc4m merged 4 commits into master from 0cc4m/vulkan-amd-queue2 on Mar 17, 2026

0cc4m (Contributor) commented Mar 15, 2026

Improve #20551 to fix the reported issues. Only use graphics queue on RADV on larger GPUs.

Fixes #20597

@0cc4m 0cc4m requested a review from a team as a code owner March 15, 2026 17:10
@0cc4m 0cc4m requested a review from jeffbolznv March 15, 2026 17:10

lemmi commented Mar 15, 2026

While the graphics queue measurably helps performance on Strix Halo's 8060S with 40 CUs, desktop performance is now unusable, as it is with other smaller GPUs, as commented here: #20551 (comment)
I think this either needs to be an option, or a different threshold is necessary.

@0cc4m 0cc4m changed the title from "vulkan: use graphics queue only on larger AMD GPUs with RADV driver" to "vulkan: allow graphics queue only through env var" Mar 15, 2026
0cc4m (Contributor, Author) commented Mar 15, 2026

I have updated the PR to require manually setting GGML_VK_ALLOW_GRAPHICS_QUEUE=1 to enable graphics queue use (if a compute queue is available). It seems this is only really safe when headless.
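
The opt-in behavior described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `QueueFamilies` struct and both function names are hypothetical, and it assumes the variable counts as set whenever it is present in the environment, and that the graphics queue remains the fallback when no dedicated compute queue exists.

```cpp
#include <cstdint>
#include <cstdlib>
#include <optional>

// Hypothetical summary of the queue families a device exposes.
// These names are illustrative and are not taken from ggml-vulkan.
struct QueueFamilies {
    std::optional<std::uint32_t> graphics;      // family supporting graphics and compute
    std::optional<std::uint32_t> compute_only;  // dedicated compute family, if present
};

// Assumption: the variable counts as "set" whenever it is present in the
// environment, regardless of its value.
bool graphics_queue_allowed() {
    return std::getenv("GGML_VK_ALLOW_GRAPHICS_QUEUE") != nullptr;
}

// Picks the queue family index to submit compute work to.
std::optional<std::uint32_t> pick_compute_family(const QueueFamilies & q, bool allow_graphics) {
    if (q.graphics && allow_graphics) {
        return q.graphics;      // explicit opt-in through the environment variable
    }
    if (q.compute_only) {
        return q.compute_only;  // default: prefer the dedicated compute queue
    }
    return q.graphics;          // assumed fallback; empty if the device has neither
}
```

A caller would combine the two as `pick_compute_family(families, graphics_queue_allowed())`. Passing the flag as a parameter keeps the selection logic testable without touching the process environment.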

@github-actions github-actions bot added labels Mar 15, 2026: Vulkan (Issues specific to the Vulkan backend), ggml (changes relating to the ggml tensor library for machine learning)
@0cc4m 0cc4m merged commit 740a447 into master Mar 17, 2026
46 of 50 checks passed
@0cc4m 0cc4m deleted the 0cc4m/vulkan-amd-queue2 branch March 17, 2026 09:10
0cc4m (Contributor, Author) commented Mar 17, 2026

@zedbytes That means you should now be using GGML_VK_ALLOW_GRAPHICS_QUEUE=1 instead of RADV_DEBUG=nocompute. This is helpful on headless RADV systems or if you don't mind the graphics interruptions.

@zedbytes

@0cc4m Agreed, building from master with the updated flag now!

@zedbytes

@0cc4m After preliminary testing, I still see ~14% better performance with RADV_DEBUG=nocompute than with GGML_VK_ALLOW_GRAPHICS_QUEUE=1.
I wonder whether this gain is restricted to RDNA 4, since other AMD GPUs aren't getting much of a performance benefit.

@tsterbak

Quick question: where is this env variable and other available env configuration documented? I cannot find it anywhere :/

@NickM-27

The findings here seem fairly odd: on Linux, with both my 9060XT and my 7900XTX, the previous change to use the graphics queue resulted in an increase in TG.

Ethan-a2 pushed a commit to Ethan-a2/llama.cpp that referenced this pull request Mar 20, 2026
* vulkan: avoid graphics queue on non-RADV AMD drivers

* avoid graphics queues on small GPUs

* change to only use graphics queue if overridden with env var GGML_VK_ALLOW_GRAPHICS_QUEUE

* reenable transfer queue if graphics queue is not used
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
rsenthilkumar6 pushed a commit to rsenthilkumar6/llama.cpp that referenced this pull request May 1, 2026
Linked issue: Misc. bug: 40% tg\s drop in latest b8354 release on AMD card