metal : fix event synchronization #22260

Merged: ggerganov merged 1 commit into master from gg/metal-fix-events on Apr 23, 2026

@ggerganov ggerganov commented Apr 22, 2026

Overview

cont #20463
cont #18919

Fix the event synchronization logic when using virtual Metal devices.

Additional information

I think this was the reason why we didn't observe the regression from #17795 when running the pipeline parallel workflows with virtual devices.

GGML_METAL_DEVICES=4 CMAKE_OPTS="-DGGML_BLAS=OFF" scripts/compare-commits.sh master gg/metal-fix-events llama-bench -m ~/models/gpt-oss-20b/ggml-model-mxfp4.gguf -m ~/models/qwen3-0.6b-base/ggml-model-q8_0.gguf -m ~/models/qwen3.5-0.8b-base/ggml-model-q8_0.gguf -m ~/models/qwen3-30b-a3b/ggml-model-q8_0.gguf -fa 1 -ub 2048,512 -b 16384 -p 512,2048,4096,8192,16384,32768 -n 32,32,32 -r 1 -t 1

| Model | Microbatch size | Test | t/s master | t/s gg/metal-fix-events | Speedup |
| --- | ---: | --- | ---: | ---: | ---: |
| gpt-oss 20B MXFP4 MoE | 512 | pp512 | 2350.44 | 2367.07 | 1.01 |
| gpt-oss 20B MXFP4 MoE | 512 | pp2048 | 2376.94 | 2339.00 | 0.98 |
| gpt-oss 20B MXFP4 MoE | 512 | pp4096 | 2303.54 | 2319.16 | 1.01 |
| gpt-oss 20B MXFP4 MoE | 512 | pp8192 | 2154.49 | 2182.73 | 1.01 |
| gpt-oss 20B MXFP4 MoE | 512 | pp16384 | 1887.93 | 1911.66 | 1.01 |
| gpt-oss 20B MXFP4 MoE | 512 | pp32768 | 1506.42 | 1518.49 | 1.01 |
| gpt-oss 20B MXFP4 MoE | 512 | tg32 | 102.71 | 116.25 | 1.13 |
| gpt-oss 20B MXFP4 MoE | 2048 | pp512 | 2367.84 | 2382.75 | 1.01 |
| gpt-oss 20B MXFP4 MoE | 2048 | pp2048 | 2703.35 | 2706.89 | 1.00 |
| gpt-oss 20B MXFP4 MoE | 2048 | pp4096 | 2631.97 | 2607.63 | 0.99 |
| gpt-oss 20B MXFP4 MoE | 2048 | pp8192 | 2453.51 | 2429.44 | 0.99 |
| gpt-oss 20B MXFP4 MoE | 2048 | pp16384 | 2131.23 | 2123.23 | 1.00 |
| gpt-oss 20B MXFP4 MoE | 2048 | pp32768 | 1670.28 | 1678.03 | 1.00 |
| gpt-oss 20B MXFP4 MoE | 2048 | tg32 | 104.06 | 116.12 | 1.12 |
| qwen3 0.6B Q8_0 | 512 | pp512 | 13577.98 | 13815.02 | 1.02 |
| qwen3 0.6B Q8_0 | 512 | pp2048 | 11833.70 | 11821.11 | 1.00 |
| qwen3 0.6B Q8_0 | 512 | pp4096 | 9619.92 | 9847.19 | 1.02 |
| qwen3 0.6B Q8_0 | 512 | pp8192 | 7085.81 | 7305.02 | 1.03 |
| qwen3 0.6B Q8_0 | 512 | pp16384 | 4612.35 | 4687.70 | 1.02 |
| qwen3 0.6B Q8_0 | 512 | pp32768 | 2682.56 | 2703.95 | 1.01 |
| qwen3 0.6B Q8_0 | 512 | tg32 | 208.69 | 233.10 | 1.12 |
| qwen3 0.6B Q8_0 | 2048 | pp512 | 13482.82 | 13785.38 | 1.02 |
| qwen3 0.6B Q8_0 | 2048 | pp2048 | 13222.19 | 13268.26 | 1.00 |
| qwen3 0.6B Q8_0 | 2048 | pp4096 | 10966.75 | 10991.81 | 1.00 |
| qwen3 0.6B Q8_0 | 2048 | pp8192 | 8071.85 | 8064.77 | 1.00 |
| qwen3 0.6B Q8_0 | 2048 | pp16384 | 5171.93 | 5174.08 | 1.00 |
| qwen3 0.6B Q8_0 | 2048 | pp32768 | 2968.30 | 2961.29 | 1.00 |
| qwen3 0.6B Q8_0 | 2048 | tg32 | 212.48 | 234.86 | 1.11 |
| qwen35 0.8B Q8_0 | 512 | pp512 | 10575.46 | 10783.21 | 1.02 |
| qwen35 0.8B Q8_0 | 512 | pp2048 | 10868.20 | 10865.48 | 1.00 |
| qwen35 0.8B Q8_0 | 512 | pp4096 | 10485.53 | 10809.71 | 1.03 |
| qwen35 0.8B Q8_0 | 512 | pp8192 | 9635.95 | 10012.95 | 1.04 |
| qwen35 0.8B Q8_0 | 512 | pp16384 | 8207.64 | 8418.07 | 1.03 |
| qwen35 0.8B Q8_0 | 512 | pp32768 | 6277.44 | 6390.16 | 1.02 |
| qwen35 0.8B Q8_0 | 512 | tg32 | 154.54 | 177.99 | 1.15 |
| qwen35 0.8B Q8_0 | 2048 | pp512 | 10560.57 | 10777.40 | 1.02 |
| qwen35 0.8B Q8_0 | 2048 | pp2048 | 11856.34 | 11827.24 | 1.00 |
| qwen35 0.8B Q8_0 | 2048 | pp4096 | 11696.23 | 11783.99 | 1.01 |
| qwen35 0.8B Q8_0 | 2048 | pp8192 | 10955.16 | 10891.34 | 0.99 |
| qwen35 0.8B Q8_0 | 2048 | pp16384 | 9391.60 | 9455.61 | 1.01 |
| qwen35 0.8B Q8_0 | 2048 | pp32768 | 7202.31 | 7244.19 | 1.01 |
| qwen35 0.8B Q8_0 | 2048 | tg32 | 156.69 | 178.88 | 1.14 |
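
For readers checking the numbers: the Speedup column is consistent with the branch throughput divided by the master throughput, rounded to two decimals. The following is a minimal sketch that recomputes it for the six tg32 rows, which is where the change is most visible. The values are copied from the results above; the variable and function names are illustrative and are not part of `compare-commits.sh` or `llama-bench`.

```python
# tg32 rows copied from the benchmark results:
# (model, microbatch size, t/s master, t/s gg/metal-fix-events)
tg32_rows = [
    ("gpt-oss 20B MXFP4 MoE", 512, 102.71, 116.25),
    ("gpt-oss 20B MXFP4 MoE", 2048, 104.06, 116.12),
    ("qwen3 0.6B Q8_0", 512, 208.69, 233.10),
    ("qwen3 0.6B Q8_0", 2048, 212.48, 234.86),
    ("qwen35 0.8B Q8_0", 512, 154.54, 177.99),
    ("qwen35 0.8B Q8_0", 2048, 156.69, 178.88),
]


def speedup(master_tps: float, branch_tps: float) -> float:
    """Ratio of branch throughput to master throughput (assumed definition)."""
    return branch_tps / master_tps


# Recompute the Speedup column, rounded to two decimals like the table.
speedups = [round(speedup(master, branch), 2) for _, _, master, branch in tg32_rows]

for (model, ubatch, _, _), s in zip(tg32_rows, speedups):
    print(f"{model:24s} ub={ubatch:<5d} tg32 speedup = {s:.2f}")
```

The recomputed ratios match the reported column (1.11 to 1.15 for tg32), while the prompt-processing rows stay within roughly 0.98 to 1.04.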

Extra CI: https://github.com/ggml-org/llama.cpp/actions/runs/24797481063

@ggerganov ggerganov requested a review from a team as a code owner April 22, 2026 19:04
@github-actions github-actions bot added the ggml and Apple Metal labels Apr 22, 2026
@ggerganov ggerganov merged commit 8635e22 into master Apr 23, 2026
57 of 60 checks passed
@ggerganov ggerganov deleted the gg/metal-fix-events branch April 23, 2026 05:22
arthw pushed a commit to arthw/llama.cpp that referenced this pull request Apr 23, 2026
IntelNav pushed a commit to IntelNav/llama.cpp that referenced this pull request Apr 29, 2026