Closed

Changes from all commits (106 commits)
abf9a62
server: wrap headers for mcp proxy (#21072)
ngxson Mar 30, 2026
e2eb39e
ci : bump ty to 0.0.26 (#21156)
CISC Mar 30, 2026
278521c
llama-model-loader: print warning when using overrides with mmap (#20…
am17an Mar 30, 2026
389c7d4
webui: Fix branching logic on edit message (#21175)
allozaur Mar 30, 2026
cad2d38
rpc : fix misleading error log (#21184)
rgerganov Mar 30, 2026
64ac9ab
CUDA : Fix CUB's argsort when nrows % block_size == 0 CCCL < 3.1 (#21…
ORippler Mar 30, 2026
ead417f
jinja : handle empty expressions correctly (#20913)
zeph1912 Mar 30, 2026
84ae843
CI : Enable CUDA and Vulkan ARM64 runners and fix CI/CD (#21122)
ehfd Mar 30, 2026
08f2145
opencl: add q4_K gemm and gemv kernels for Adreno (#20919)
shaofeiqi Mar 30, 2026
5ce013c
common : Disable backend sampling if reasoning budget is enabled (#21…
Galunid Mar 31, 2026
26dac84
vendor : update BoringSSL to 0.20260327.0 (#21211)
angt Mar 31, 2026
4453e77
server/webui: cleanup dual representation approach, simplify to opena…
pwilkin Mar 31, 2026
fcc2d59
fix: include API key in CORS proxy requests for MCP connections (#21193)
satishkc7 Mar 31, 2026
90aa83c
common: add bounds check in common_init_result::sampler to prevent se…
mtmcp Mar 31, 2026
62278ce
sycl : enhance fattn perf (#21185)
arthw Mar 31, 2026
41361c8
common : move up common_init() and fix Windows UTF-8 logs (#21176)
angt Mar 31, 2026
0be6c7c
ggml : bump version to 0.9.9 (ggml/1449)
ggerganov Mar 30, 2026
9281dd1
sync : ggml
ggerganov Mar 31, 2026
eec6f85
CI: Enable CPU and Vulkan ARM64 Release (#21207)
ehfd Mar 31, 2026
0b6ff47
fix: correct misspellings in code comments (#21217)
lainon1 Mar 31, 2026
624733d
common : gpt-oss handle builtin and unsolicited tool calls (#21213)
aldehir Mar 31, 2026
4a00bbf
server: (webui) no more gzip compression (#21073)
ngxson Mar 31, 2026
632219a
CANN: fix multi-thread set_tensor race conditions (#20151)
hipudding Mar 31, 2026
6307ec0
common : cleanup logs and modernize the progress bar (#21215)
angt Mar 31, 2026
0fcb376
fix: Use lower-case proxy headers naming (#21235)
allozaur Mar 31, 2026
825eb91
ggml-webgpu: port all AOT operators to JIT (#20728)
abhijitramesh Mar 31, 2026
82764c3
ggml webgpu: quantized buffers to u32 + wider browser/device support …
reeselevine Apr 1, 2026
4951250
llama : refactor llama_model_quantize_params to expose a pure C inter…
EAddario Apr 1, 2026
8845816
CUDA: Add Flash Attention Support for Head Dimension 512 (#20998)
anavp-nvidia Apr 1, 2026
2b86e5c
ggml-cpu: fix fallback for RVV kernels without zvfh (#21157)
taimur-10x Apr 1, 2026
d43375f
ggml : fix RWKV ops thread assignment (#21226)
ggerganov Apr 1, 2026
88d5f8f
CUDA/HIP: Fix kernel slection for mmvq mmid kernel to align host sele…
IMbackK Apr 1, 2026
e1cb817
memory: respect unified KV cache in hybrid memory for eval tasks (#21…
mudler Apr 1, 2026
84f82e8
ggml-cuda: Add generic NVFP4 MMQ kernel (#21074)
michaelw9999 Apr 1, 2026
6b949d1
sycl : support nvfp4 type in mul_mat (#21227)
arthw Apr 1, 2026
296bc05
ggml : bump version to 0.9.10 (ggml/1454)
ggerganov Apr 1, 2026
6422036
sync : ggml
ggerganov Apr 1, 2026
0356e33
scripts: add function call test script (#21234)
ngxson Apr 1, 2026
744c0c7
llama : rotate activations for better quantization (#21038)
ggerganov Apr 1, 2026
1d6d4cf
fix: tool call parsing for LFM2 and LFM2.5 models (#21242)
jbuchananr Apr 1, 2026
8710e5f
hexagon: improve RMS_NORM and DIV accuracy (#21251)
aparmp-quic Apr 1, 2026
5a0ed51
Update Dawn version in WebGPU CI (#20784)
nikhilJain17 Apr 1, 2026
6de97b9
kleidiai: add CPU feature detection to CI run script (#20394)
martin-klacer-arm Apr 1, 2026
86221cf
CUDA: fix FA kernel selection logic (#21271)
JohannesGaessler Apr 1, 2026
12dbf1d
server: Bypass API Key validation for WebUI static bundle assets (#21…
allozaur Apr 1, 2026
95a6eba
opencl: fix leak in Adreno q8_0 path (#21212)
lhez Apr 1, 2026
c30e012
contrib : rewrite AGENTS.md, make it more clear about project values …
ngxson Apr 1, 2026
fbd441c
hexagon : add cumsum op support (#21246)
tboinovski1 Apr 2, 2026
4888137
sycl : fix llama_kv_cache hang when kv_cache is huge: 5GB (#21283)
arthw Apr 2, 2026
bc07d55
ggml : bump version to 0.9.11 (ggml/1456)
ggerganov Apr 2, 2026
dae2bf4
sync : ggml
ggerganov Apr 2, 2026
d6dac92
Ignore Transfer-Encoding header. (#20269)
crmky Apr 2, 2026
17193cc
kv-cache : do not quantize SWA KV cache (#21277)
ggerganov Apr 2, 2026
6137c32
chat : add Granite 4.0 chat template with correct tool_call role mapp…
jesus-talavera-ibm Apr 2, 2026
e15efe0
Relax prefill parser to allow space. (#21240)
pwilkin Apr 2, 2026
2233737
common : add commentary rules for gpt-oss-20b (#21286)
aldehir Apr 2, 2026
63f8fe0
model, mtmd: fix gguf conversion for audio/vision mmproj (#21309)
ngxson Apr 2, 2026
5803c8d
tests: allow exporting graph ops from HF file without downloading wei…
0cc4m Apr 2, 2026
a1cfb64
ggml-webgpu: add vectorized flash attention (#20709)
ArberSephirotheca Apr 2, 2026
7992aa7
tests : add unit test coverage for llama_tensor_get_type (#20112)
bartowski1182 Apr 2, 2026
5208e2d
fix: gemma 4 template (#21326)
pwilkin Apr 2, 2026
7c7d6ce
[HIP] Bump ROCm version to 7.2.1 (#21066)
slojosic-amd Apr 2, 2026
f49e917
ci : add AMD ZenDNN label to PR labeler (#21345)
z-vishal Apr 3, 2026
39b27f0
(revert) kv-cache : do not quantize SWA KV cache (#21332)
ggerganov Apr 3, 2026
57ace0d
chat : avoid including json in chat.h (#21306)
ggerganov Apr 3, 2026
0c58ba3
rpc : reuse compute graph buffers (#21299)
rgerganov Apr 3, 2026
b069b10
vocab: fix Gemma4 tokenizer (#21343)
pwilkin Apr 3, 2026
f1ac841
ggml-zendnn : add MUL_MAT_ID op support for MoE models (#21315)
z-vishal Apr 3, 2026
f851fa5
fix: add openssl to nix dependencies (#21353) (#21355)
Tillerino Apr 3, 2026
43a4ee4
HIP: build eatch ci build test for a different architecture (#21337)
IMbackK Apr 3, 2026
d3416a4
fix: remove stale assert (#21369)
pwilkin Apr 3, 2026
887535c
ci: add more binary checks (#21349)
taronaeo Apr 3, 2026
1f34806
jinja: coerce input for string-specific filters (#21370)
CISC Apr 3, 2026
384c007
docs: Update build.md: HSA_OVERRIDE_GFX_VERSION clarification (#21331)
jeromew Apr 3, 2026
277ff5f
docker : bump cuda12 to 12.9.1 (#20920)
M1DNYT3 Apr 3, 2026
af5c138
common : fix tool call type detection for nullable and enum schemas (…
sacredvoid Apr 3, 2026
f1f793a
common/parser: fix call ID detection (Mistral parser mostly) + atomic…
pwilkin Apr 3, 2026
50e0ad0
server: save and clear idle slots on new task (`--clear-idle`) (#20993)
yychyo Apr 3, 2026
e439700
ci: Add Windows Vulkan backend testing on Intel (#21292)
rillomas Apr 3, 2026
d006858
ggml-webgpu: move from parameter buffer pool to single buffer with of…
reeselevine Apr 3, 2026
b7ad48e
llama: add custom newline split for Gemma 4 (#21406)
am17an Apr 4, 2026
650bf14
llama-model: read final_logit_softcapping for Gemma 4 (#21390)
ssam18 Apr 4, 2026
d01f627
common : respect specified tag, only fallback when tag is empty (#21413)
angt Apr 4, 2026
9c69907
server: Fix undefined timing measurement errors in server context (#2…
thedanhoffman Apr 4, 2026
b863507
common : add gemma 4 specialized parser (#21418)
aldehir Apr 4, 2026
661e9ac
ci: fix vulkan workflow referencing non-existent action (#21442)
nisparks Apr 5, 2026
c08d28d
ci: lower cuda12 floor to 12.8.1 for broader host compatibility (#21438)
M1DNYT3 Apr 5, 2026
5d3a4a7
server : fix logging of build + system info (#21460)
ddh0 Apr 5, 2026
761797f
ci : use default RISE RISC-V Runners (#21263)
luhenry Apr 5, 2026
af76639
model : add HunyuanOCR support (#21395)
richarddd Apr 5, 2026
58190cc
llama : correct platform-independent loading of BOOL metadata (#21428)
anchortense Apr 5, 2026
25eec6f
hexagon: slight optimization for argosrt output init (#21463)
YardenTal44 Apr 6, 2026
f51fd36
sycl : handle other FA case (#21377)
arthw Apr 6, 2026
400ac8e
convert : set "add bos" == True for Gemma 4 (#21500)
ggerganov Apr 6, 2026
3979f2b
docs: add hunyuan-ocr gguf, also add test [no ci] (#21490)
ngxson Apr 6, 2026
482d862
server : handle unsuccessful sink.write in chunked stream provider (#…
lainon1 Apr 6, 2026
941146b
convert : fix block_ff_dim retrieval for lfm2 (#21508)
CISC Apr 6, 2026
4aa962e
vocab : add byte token handling to BPE detokenizer for Gemma4 (#21488)
aldehir Apr 6, 2026
94ca829
llama-bench: add `-fitc` and `-fitt` to arguments (#21304)
am17an Apr 6, 2026
15f786e
[CUDA ] Write an optimized flash_attn_stream_k_fixup kernel (#21159)
gaugarg-nv Apr 6, 2026
506200c
cli: fix stripping of \n in multiline input (#21485)
bipinyadav3175 Apr 6, 2026
2e1f0a8
ggml: add Q1_0 1-bit quantization support (CPU) (#21273)
khosravipasha Apr 6, 2026
d0a6dfe
ggml-webgpu: Add the support of `MUL_MAT_ID` (#21147)
yomaytk Apr 6, 2026
0033f53
docs: fix typo in build.md (emdawbwebgpu -> emdawnwebgpu) (#21518)
CastelDazur Apr 7, 2026
0988acc
[SYCL] Add Q8_0 reorder optimization (~3x tg speedup on Intel Arc) (#…
PMZFX Apr 7, 2026
1c569b9
ggml-cpu: add Q1_0 AVX2 path
elusznik Apr 7, 2026
2 changes: 1 addition & 1 deletion .devops/cpu.Dockerfile
@@ -36,7 +36,7 @@ RUN mkdir -p /app/full \
FROM ubuntu:$UBUNTU_VERSION AS base

RUN apt-get update \
- && apt-get install -y libgomp1 curl\
+ && apt-get install -y libgomp1 curl \
&& apt autoremove -y \
&& apt clean -y \
&& rm -rf /tmp/* /var/tmp/* \
95 changes: 0 additions & 95 deletions .devops/cuda-new.Dockerfile

This file was deleted.

13 changes: 8 additions & 5 deletions .devops/cuda.Dockerfile
@@ -1,6 +1,6 @@
- ARG UBUNTU_VERSION=22.04
+ ARG UBUNTU_VERSION=24.04
# This needs to generally match the container host's environment.
- ARG CUDA_VERSION=12.4.0
+ ARG CUDA_VERSION=12.8.1
# Target the CUDA build image
ARG BASE_CUDA_DEV_CONTAINER=nvidia/cuda:${CUDA_VERSION}-devel-ubuntu${UBUNTU_VERSION}

@@ -12,7 +12,9 @@ FROM ${BASE_CUDA_DEV_CONTAINER} AS build
ARG CUDA_DOCKER_ARCH=default

RUN apt-get update && \
- apt-get install -y build-essential cmake python3 python3-pip git libssl-dev libgomp1
+ apt-get install -y gcc-14 g++-14 build-essential cmake python3 python3-pip git libssl-dev libgomp1

+ ENV CC=gcc-14 CXX=g++-14 CUDAHOSTCXX=g++-14

WORKDIR /app

@@ -39,7 +41,7 @@ RUN mkdir -p /app/full \
FROM ${BASE_CUDA_RUN_CONTAINER} AS base

RUN apt-get update \
- && apt-get install -y libgomp1 curl\
+ && apt-get install -y libgomp1 curl \
&& apt autoremove -y \
&& apt clean -y \
&& rm -rf /tmp/* /var/tmp/* \
@@ -60,7 +62,8 @@ RUN apt-get update \
git \
python3 \
python3-pip \
- && pip install --upgrade pip setuptools wheel \
+ python3-wheel \
+ && pip install --break-system-packages --upgrade setuptools \
&& pip install --break-system-packages -r requirements.txt \
&& apt autoremove -y \
&& apt clean -y \
2 changes: 1 addition & 1 deletion .devops/intel.Dockerfile
@@ -51,7 +51,7 @@ RUN mkdir /tmp/neo/ && cd /tmp/neo/ \
&& dpkg --install *.deb

RUN apt-get update \
- && apt-get install -y libgomp1 curl\
+ && apt-get install -y libgomp1 curl \
&& apt autoremove -y \
&& apt clean -y \
&& rm -rf /tmp/* /var/tmp/* \
2 changes: 1 addition & 1 deletion .devops/musa.Dockerfile
@@ -46,7 +46,7 @@ RUN mkdir -p /app/full \
FROM ${BASE_MUSA_RUN_CONTAINER} AS base

RUN apt-get update \
- && apt-get install -y libgomp1 curl\
+ && apt-get install -y libgomp1 curl \
&& apt autoremove -y \
&& apt clean -y \
&& rm -rf /tmp/* /var/tmp/* \
5 changes: 3 additions & 2 deletions .devops/nix/package.nix
@@ -16,7 +16,7 @@
rocmPackages,
vulkan-headers,
vulkan-loader,
- curl,
+ openssl,
shaderc,
useBlas ?
builtins.all (x: !x) [
@@ -160,7 +160,8 @@ effectiveStdenv.mkDerivation (finalAttrs: {
++ optionals useMpi [ mpi ]
++ optionals useRocm rocmBuildInputs
++ optionals useBlas [ blas ]
- ++ optionals useVulkan vulkanBuildInputs;
+ ++ optionals useVulkan vulkanBuildInputs
+ ++ [ openssl ];

cmakeFlags =
[
2 changes: 1 addition & 1 deletion .devops/openvino.Dockerfile
@@ -78,7 +78,7 @@ ARG http_proxy
ARG https_proxy

RUN apt-get update \
- && apt-get install -y libgomp1 libtbb12 curl\
+ && apt-get install -y libgomp1 libtbb12 curl \
&& apt autoremove -y \
&& apt clean -y \
&& rm -rf /tmp/* /var/tmp/* \
12 changes: 6 additions & 6 deletions .devops/rocm.Dockerfile
@@ -1,8 +1,8 @@
ARG UBUNTU_VERSION=24.04

# This needs to generally match the container host's environment.
- ARG ROCM_VERSION=7.2
- ARG AMDGPU_VERSION=7.2
+ ARG ROCM_VERSION=7.2.1
+ ARG AMDGPU_VERSION=7.2.1

# Target the ROCm build image
ARG BASE_ROCM_DEV_CONTAINER=rocm/dev-ubuntu-${UBUNTU_VERSION}:${ROCM_VERSION}-complete
@@ -12,11 +12,11 @@ FROM ${BASE_ROCM_DEV_CONTAINER} AS build

# Unless otherwise specified, we make a fat build.
# This is mostly tied to rocBLAS supported archs.
- # check https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.2.0/reference/system-requirements.html
+ # check https://rocm.docs.amd.com/projects/install-on-linux/en/docs-7.2.1/reference/system-requirements.html
# check https://rocm.docs.amd.com/projects/radeon-ryzen/en/latest/docs/compatibility/compatibilityrad/native_linux/native_linux_compatibility.html
# check https://rocm.docs.amd.com/projects/radeon-ryzen/en/latest/docs/compatibility/compatibilityryz/native_linux/native_linux_compatibility.html

- ARG ROCM_DOCKER_ARCH='gfx908;gfx90a;gfx942;gfx1030;gfx1100;gfx1101;gfx1151;gfx1150;gfx1200;gfx1201'
+ ARG ROCM_DOCKER_ARCH='gfx908;gfx90a;gfx942;gfx1030;gfx1100;gfx1101;gfx1102;gfx1151;gfx1150;gfx1200;gfx1201'

# Set ROCm architectures
ENV AMDGPU_TARGETS=${ROCM_DOCKER_ARCH}
@@ -58,7 +58,7 @@ RUN mkdir -p /app/full \
FROM ${BASE_ROCM_DEV_CONTAINER} AS base

RUN apt-get update \
- && apt-get install -y libgomp1 curl\
+ && apt-get install -y libgomp1 curl \
&& apt autoremove -y \
&& apt clean -y \
&& rm -rf /tmp/* /var/tmp/* \
@@ -79,7 +79,7 @@ RUN apt-get update \
git \
python3-pip \
python3 \
- python3-wheel\
+ python3-wheel \
&& pip install --break-system-packages --upgrade setuptools \
&& pip install --break-system-packages -r requirements.txt \
&& apt autoremove -y \
17 changes: 10 additions & 7 deletions .devops/vulkan.Dockerfile
@@ -49,17 +49,20 @@ COPY --from=build /app/full /app

WORKDIR /app

+ ENV PATH="/root/.venv/bin:/root/.local/bin:${PATH}"
+
+ # Flag for compatibility with pip
+ ARG UV_INDEX_STRATEGY="unsafe-best-match"
RUN apt-get update \
&& apt-get install -y \
build-essential \
curl \
git \
- python3.13 \
- python3.13-dev \
- python3-pip \
- python3-wheel \
- && update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.13 100 \
- && pip install --break-system-packages --upgrade setuptools \
- && pip install --break-system-packages -r requirements.txt \
+ ca-certificates \
+ && curl -LsSf https://astral.sh/uv/install.sh | sh \
+ && uv python install 3.13 \
+ && uv venv --python 3.13 /root/.venv \
+ && uv pip install --python /root/.venv/bin/python -r requirements.txt \
&& apt autoremove -y \
&& apt clean -y \
&& rm -rf /tmp/* /var/tmp/* \
16 changes: 8 additions & 8 deletions .editorconfig
@@ -21,14 +21,6 @@ indent_style = tab
[prompts/*.txt]
insert_final_newline = unset

- [tools/server/public/*]
- indent_size = 2
-
- [tools/server/public/deps_*]
- trim_trailing_whitespace = unset
- indent_style = unset
- indent_size = unset

[tools/server/deps_*]
trim_trailing_whitespace = unset
indent_style = unset
@@ -61,6 +53,14 @@ charset = unset
trim_trailing_whitespace = unset
insert_final_newline = unset

+ [tools/server/public/**]
+ indent_style = unset
+ indent_size = unset
+ end_of_line = unset
+ charset = unset
+ trim_trailing_whitespace = unset
+ insert_final_newline = unset

[benches/**]
indent_style = unset
indent_size = unset
4 changes: 4 additions & 0 deletions .gitattributes
@@ -0,0 +1,4 @@
+ # Treat the generated single-file WebUI build as binary for diff purposes.
+ # Git's pack-file delta compression still works (byte-level), but this prevents
+ # git diff from printing the entire minified file on every change.
+ tools/server/public/index.html -diff
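The effect of the `-diff` attribute can be sanity-checked locally; this is a sketch using a throwaway repo whose layout mirrors the entry above (the repo itself is hypothetical):

```shell
# Create a scratch repo with the same .gitattributes entry, then ask git
# which diff attribute applies to the path. "-attr" reports as "unset".
tmp=$(mktemp -d)
cd "$tmp"
git init -q
mkdir -p tools/server/public
printf 'tools/server/public/index.html -diff\n' > .gitattributes
out=$(git check-attr diff tools/server/public/index.html)
echo "$out"   # tools/server/public/index.html: diff: unset
```

With the attribute in place, `git diff` reports the bundle as binary ("Binary files ... differ") instead of printing the minified content, while packfile delta compression is unaffected.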
5 changes: 5 additions & 0 deletions .github/labeler.yml
@@ -27,6 +27,11 @@ IBM zDNN:
- any-glob-to-any-file:
- ggml/include/ggml-zdnn.h
- ggml/src/ggml-zdnn/**
+ AMD ZenDNN:
+ - changed-files:
+ - any-glob-to-any-file:
+ - ggml/include/ggml-zendnn.h
+ - ggml/src/ggml-zendnn/**
documentation:
- changed-files:
- any-glob-to-any-file:
38 changes: 14 additions & 24 deletions .github/workflows/build-riscv.yml
@@ -35,7 +35,7 @@ env:

jobs:
ubuntu-riscv64-native-sanitizer:
- runs-on: RISCV64
+ runs-on: ubuntu-24.04-riscv

continue-on-error: true

@@ -50,17 +50,18 @@
sudo apt-get update

# Install necessary packages
- sudo apt-get install -y libatomic1 libtsan2 gcc-14 g++-14 rustup cmake build-essential wget ccache git-lfs
+ sudo apt-get install -y libatomic1 libtsan2 gcc-14 g++-14 cmake build-essential wget git-lfs

# Set gcc-14 and g++-14 as the default compilers
- sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-14 100
- sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-14 100
+ sudo ln -sf /usr/bin/gcc-14 /usr/bin/gcc
+ sudo ln -sf /usr/bin/g++-14 /usr/bin/g++

- # Install Rust stable version
- rustup install stable
- rustup default stable
+ if ! which rustc; then
+ # Install Rust stable version
+ sudo apt-get install -y rustup
+ rustup install stable
+ rustup default stable
+ fi

git lfs install

@@ -73,23 +74,12 @@
id: checkout
uses: actions/checkout@v6

- - name: Setup ccache
- run: |
- # Unique cache directory per matrix combination
- export CCACHE_DIR="$HOME/.ccache/sanitizer-${{ matrix.sanitizer }}-${{ matrix.build_type }}"
- mkdir -p "$CCACHE_DIR"
-
- # Configure ccache
- ccache --set-config=max_size=5G
- ccache --set-config=compression=true
- ccache --set-config=compression_level=6
- ccache --set-config=cache_dir="$CCACHE_DIR"
- ccache --set-config=sloppiness=file_macro,time_macros,include_file_mtime,include_file_ctime
- ccache --set-config=hash_dir=false
-
- # Export for subsequent steps
- echo "CCACHE_DIR=$CCACHE_DIR" >> $GITHUB_ENV
- echo "PATH=/usr/lib/ccache:$PATH" >> $GITHUB_ENV
+ # FIXME: Enable when ggml-org/ccache-action works on riscv64
+ # - name: ccache
+ #   uses: ggml-org/ccache-action@v1.2.21
+ #   with:
+ #     key: ubuntu-riscv64-native-sanitizer-${{ matrix.sanytizer }}-${{ matrix.build_type }}
+ #     save: ${{ github.event_name == 'push' && github.ref == 'refs/heads/master' }}

- name: Build
id: cmake_build
21 changes: 21 additions & 0 deletions .github/workflows/build-self-hosted.yml
@@ -213,6 +213,27 @@ jobs:
vulkaninfo --summary
GG_BUILD_VULKAN=1 bash ./ci/run.sh ~/results/llama.cpp ~/mnt/llama.cpp

+ ggml-ci-win-intel-vulkan:
+ runs-on: [self-hosted, Windows, X64, Intel]
+
+ steps:
+ - name: Clone
+ id: checkout
+ uses: actions/checkout@v6
+
+ - name: Test
+ id: ggml-ci
+ shell: C:\msys64\usr\bin\bash.exe --noprofile --norc -eo pipefail "{0}"
+ env:
+ MSYSTEM: UCRT64
+ CHERE_INVOKING: 1
+ PATH: C:\msys64\ucrt64\bin;C:\msys64\usr\bin;C:\Windows\System32;${{ env.PATH }}
+ run: |
+ vulkaninfo --summary
+ # Skip python related tests with GG_BUILD_LOW_PERF=1 since Windows MSYS2 UCRT64 currently fails to create
+ # a valid python environment for testing
+ LLAMA_FATAL_WARNINGS=OFF GG_BUILD_NINJA=1 GG_BUILD_VULKAN=1 GG_BUILD_LOW_PERF=1 ./ci/run.sh ./results/llama.cpp ./mnt/llama.cpp
+
ggml-ci-intel-openvino-gpu-low-perf:
runs-on: [self-hosted, Linux, Intel, OpenVINO]
