ggml: WebGPU disable SET_ROWS for now #15078

Merged
reeselevine merged 27 commits into ggml-org:master from reeselevine:master
Aug 5, 2025

@reeselevine
Contributor

test-thread-safety was recently updated to use SET_ROWS by default, but the WebGPU backend doesn't support it yet. I'm aiming to add support and open a PR in the next couple of days; in the meantime, this PR disables SET_ROWS in the CI so that the WebGPU tests pass.

@github-actions github-actions Bot added the devops improvements to build systems and github actions label Aug 5, 2025
@slaren
Member

slaren commented Aug 5, 2025

Looks like it is still failing.

@github-actions github-actions Bot added the ggml changes relating to the ggml tensor library for machine learning label Aug 5, 2025
@reeselevine
Contributor Author

Sorry, this CI failure for WebGPU on the Linux machine is turning out to be trickier than I expected. I haven't been able to reproduce it locally yet; it only occurs on the GitHub Actions runners, which use the simulated Vulkan LLVMpipe backend.

I'll keep working on it this week to see if I can get a definitive answer on what's going on. In the meantime, would it make sense to just disable the WebGPU CI so it doesn't clutter up other PRs?

Comment thread .github/workflows/build.yml Outdated
@reeselevine
Contributor Author

Looks like the CI is passing now. I was able to debug on the GitHub Actions runner using this: https://github.com/mxschmitt/action-tmate.

I believe the issue was due to not blocking on set_tensor calls. Not sure why it only causes issues on the LLVMpipe backend, but I suppose it's good that the CI caught the issue!
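To illustrate why a non-blocking set_tensor can expose stale data, here is a minimal, thread-free sketch. It does not use the real ggml or WebGPU APIs: `fake_queue`, `fake_buffer`, `set_tensor_async`, and `set_tensor_blocking` are hypothetical names, and the queue simply stores work until `wait_idle()` is called, standing in for a device queue whose writes complete some time after submission.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a device queue: submitted work only runs
// when wait_idle() is called, mimicking writes that complete later.
struct fake_queue {
    std::vector<std::function<void()>> pending;

    void submit(std::function<void()> work) { pending.push_back(std::move(work)); }

    // Runs all queued work in submission order, then clears the queue.
    void wait_idle() {
        for (auto & work : pending) {
            work();
        }
        pending.clear();
    }
};

// Hypothetical stand-in for device memory backing a tensor.
struct fake_buffer {
    std::vector<unsigned char> bytes;
};

// Non-blocking variant: stages a copy of the source bytes, queues the
// write, and returns immediately. The destination is not yet updated.
inline void set_tensor_async(fake_queue & q, fake_buffer & dst, const void * src,
                             std::size_t offset, std::size_t size) {
    const unsigned char * p = static_cast<const unsigned char *>(src);
    std::vector<unsigned char> staged(p, p + size);
    q.submit([&dst, staged = std::move(staged), offset]() {
        std::memcpy(dst.bytes.data() + offset, staged.data(), staged.size());
    });
}

// Blocking variant: queues the same write, then waits for the queue to
// drain, so the data is visible by the time the call returns.
inline void set_tensor_blocking(fake_queue & q, fake_buffer & dst, const void * src,
                                std::size_t offset, std::size_t size) {
    set_tensor_async(q, dst, src, offset, size);
    q.wait_idle();
}
```

In this model, a caller that reads the buffer right after `set_tensor_async` sees the old contents, while `set_tensor_blocking` guarantees the new contents. Whether a real backend shows the problem depends on timing, which may be why it surfaced on only one driver.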

I also made another minor change in this PR: explicitly waiting on Futures returned by WebGPU API callbacks before returning from graph_compute.
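The pattern of draining every outstanding future before returning can be sketched as follows. This is a simplified model, not the backend's actual code: `fake_future`, `fake_instance`, and this `graph_compute` are hypothetical stand-ins, with `fake_instance` playing the role of an event loop whose callbacks fire only when it is polled.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical completion handle standing in for a WebGPU Future.
struct fake_future {
    std::shared_ptr<bool> done = std::make_shared<bool>(false);
    bool ready() const { return *done; }
};

// Hypothetical event loop: callbacks only fire when process_events() runs.
struct fake_instance {
    std::deque<std::function<void()>> callbacks;

    // Registers a callback and returns a handle that becomes ready
    // once that callback has run.
    fake_future on_work_done(std::function<void()> cb) {
        fake_future f;
        std::shared_ptr<bool> flag = f.done;
        callbacks.push_back([cb = std::move(cb), flag]() {
            cb();
            *flag = true;
        });
        return f;
    }

    // Runs at most one pending callback per call.
    void process_events() {
        if (callbacks.empty()) {
            return;
        }
        std::function<void()> cb = std::move(callbacks.front());
        callbacks.pop_front();
        cb();
    }

    // Polls the event loop until the given future is ready.
    void wait(const fake_future & f) {
        while (!f.ready() && !callbacks.empty()) {
            process_events();
        }
    }
};

// Sketch of a graph_compute that collects one future per node and waits
// on all of them before returning, so no callback is left pending.
inline int graph_compute(fake_instance & inst, int n_nodes) {
    int completed = 0;
    std::vector<fake_future> futures;
    for (int i = 0; i < n_nodes; ++i) {
        futures.push_back(inst.on_work_done([&completed]() { ++completed; }));
    }
    for (const fake_future & f : futures) {
        inst.wait(f);
    }
    return completed;
}
```

Because the callbacks capture a local variable by reference, returning before they have all run would leave them pointing at dead state; waiting on every future first avoids that class of bug in this model.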

A few of the macOS CI tests are still queued; I'll wait for them to complete successfully before merging. Hopefully the WebGPU CI is more stable from here on!

@reeselevine reeselevine merged commit 9515c61 into ggml-org:master Aug 5, 2025
47 checks passed
blime4 referenced this pull request in blime4/llama.cpp Feb 5, 2026
* Add parameter buffer pool, batching of submissions, refactor command building/submission

* Add header for linux builds

* Free staged parameter buffers at once

* Format with clang-format

* Fix thread-safe implementation

* Use device implicit synchronization

* Update workflow to use custom release

* Remove testing branch workflow

* Disable set_rows until it's implemented

* Fix potential issue around empty queue submission

* Try synchronous submission

* Try waiting on all futures explicitly

* Add debug

* Add more debug messages

* Work on getting ssh access for debugging

* Debug on failure

* Disable other tests

* Remove extra if

* Try more locking

* maybe passes?

* test

* Some cleanups

* Restore build file

* Remove extra testing branch ci
Seunghhon pushed a commit to Seunghhon/llama.cpp that referenced this pull request Apr 26, 2026
3 participants