SYCL: Fix test-backend-ops crashes with SYCL-Graph #13357
Closed
EwanC wants to merge 1 commit into ggml-org:master
Currently, when running `GGML_SYCL_DISABLE_GRAPH=0 ./bin/test-backend-ops -b SYCL0` with SYCL targeting a CUDA backend, I see crashes from 3 operations:

1) `-o MUL_MAT`: An issue arising from the recording of the oneMath `ext_codeplay_enqueue_native_command`.
2) `-o CONCAT`: Use of blocking waits on a queue that is being recorded: https://github.com/ggml-org/llama.cpp/blob/master/ggml/src/ggml-sycl/concat.cpp#L185-L187. Can these wait calls just be removed?
3) `-o MUL_MAT_ID`: A blocking wait on a recording queue for a copy to host memory: https://github.com/ggml-org/llama.cpp/blob/master/ggml/src/ggml-sycl/ggml-sycl.cpp#L3072-L3074. Could the host work be wrapped in a host-task?

For 1), I have come up with a oneMath fix in uxlfoundation/oneMath#669. I've put a provisional git tag in this PR to pull that fix in for testing, and will update to the upstream commit once it is merged.

For 2) and 3), we've noticed that `ggml-cuda.cu` has the [check_node_graph_compatibility_and_refresh_copy_ops](https://github.com/ggml-org/llama.cpp/blob/39e73ae0d69f882d7e29cecc6dd8f5052fca6731/ggml/src/ggml-cuda/ggml-cuda.cu#L2458-L2458) method for checking whether a graph can be used, even if graphs are enabled. I've taken a similar approach in this PR by adding a method to `ggml-sycl.cpp` that checks whether a graph can be used for the operations, even if the user has asked for graphs to be enabled.
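The compatibility check described for 2) and 3) can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the code in this PR: `Op`, `Node`, `graph_compatible`, and `use_graph` are hypothetical stand-ins for the real ggml types and for the method added to `ggml-sycl.cpp`, and the only ops treated as incompatible are the two named above as performing blocking waits.

```cpp
#include <vector>

// Hypothetical stand-ins for ggml's op enum and compute-graph node.
// These are NOT the real ggml types; they exist only to make the sketch runnable.
enum class Op { ADD, MUL_MAT, CONCAT, MUL_MAT_ID };

struct Node {
    Op op;
};

// Returns false if any node uses an op that (per the PR description) performs
// a blocking wait and so cannot be recorded into a graph.
bool graph_compatible(const std::vector<Node>& nodes) {
    for (const Node& n : nodes) {
        switch (n.op) {
            case Op::CONCAT:      // blocking waits on the recording queue
            case Op::MUL_MAT_ID:  // blocking wait for a copy to host memory
                return false;
            default:
                break;
        }
    }
    return true;
}

// A graph is used only when the user has not disabled graphs AND every node
// in the compute graph is compatible with recording.
bool use_graph(bool user_disabled, const std::vector<Node>& nodes) {
    return !user_disabled && graph_compatible(nodes);
}
```

With this shape, a compute graph containing `CONCAT` or `MUL_MAT_ID` falls back to eager submission even when `GGML_SYCL_DISABLE_GRAPH=0` is set, which is the behavior the description asks for.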
Alcpz (Collaborator) reviewed on May 9, 2025 and left a comment:
LGTM. Ping me again for the approval once the oneMath fix gets merged in.
EwanC (Contributor, Author) commented:
This patch is doing two things: bumping the oneMath commit, which could have other unintended consequences, and adding the skips. I am closing this PR and have created #13587 for just the skip part of the change. I will create another PR for the oneMath change when it's ready.