Merged
Your PR requires formatting changes to meet the project's style guidelines. Suggested changes:
diff --git a/src/array.jl b/src/array.jl
index 4d621cf..8ab4ba8 100644
--- a/src/array.jl
+++ b/src/array.jl
@@ -505,17 +505,17 @@ fill(v, dims...) = fill!(oneArray{typeof(v)}(undef, dims...), v)
fill(v, dims::Dims) = fill!(oneArray{typeof(v)}(undef, dims...), v)
function Base.fill!(A::oneDenseArray{T}, val) where T
- length(A) == 0 && return A
- val = convert(T, val)
- sizeof(T) == 0 && return A
-
- # execute! is async, so we need to allocate the pattern in USM memory
- # and keep it alive until the operation completes.
- buf = oneL0.host_alloc(context(A), sizeof(T), Base.datatype_alignment(T))
- unsafe_store!(convert(Ptr{T}, buf), val)
- unsafe_fill!(context(A), device(), pointer(A), convert(ZePtr{T}, buf), length(A))
- synchronize(global_queue(context(A), device()))
- oneL0.free(buf)
+ length(A) == 0 && return A
+ val = convert(T, val)
+ sizeof(T) == 0 && return A
+
+ # execute! is async, so we need to allocate the pattern in USM memory
+ # and keep it alive until the operation completes.
+ buf = oneL0.host_alloc(context(A), sizeof(T), Base.datatype_alignment(T))
+ unsafe_store!(convert(Ptr{T}, buf), val)
+ unsafe_fill!(context(A), device(), pointer(A), convert(ZePtr{T}, buf), length(A))
+ synchronize(global_queue(context(A), device()))
+ oneL0.free(buf)
A
end
diff --git a/test/level-zero.jl b/test/level-zero.jl
index ed7b283..3b13f34 100644
--- a/test/level-zero.jl
+++ b/test/level-zero.jl
@@ -271,22 +271,22 @@ let src = rand(Int, 1024)
synchronize(queue)
@test chk == src
- # FIX: Allocate pattern in USM Host Memory
- # Standard Host memory (stack/heap) is not accessible by discrete GPUs for fill patterns.
- # We must use USM Host Memory.
- pattern_val = 42
- pattern_buf = oneL0.host_alloc(ctx, sizeof(Int), Base.datatype_alignment(Int))
- unsafe_store!(convert(Ptr{Int}, pattern_buf), pattern_val)
+ # FIX: Allocate pattern in USM Host Memory
+ # Standard Host memory (stack/heap) is not accessible by discrete GPUs for fill patterns.
+ # We must use USM Host Memory.
+ pattern_val = 42
+ pattern_buf = oneL0.host_alloc(ctx, sizeof(Int), Base.datatype_alignment(Int))
+ unsafe_store!(convert(Ptr{Int}, pattern_buf), pattern_val)
execute!(queue) do list
- # Use the USM pointer (converted to ZePtr)
- append_fill!(list, pointer(dst), convert(ZePtr{Int}, pattern_buf), sizeof(Int), sizeof(src))
+ # Use the USM pointer (converted to ZePtr)
+ append_fill!(list, pointer(dst), convert(ZePtr{Int}, pattern_buf), sizeof(Int), sizeof(src))
append_barrier!(list)
append_copy!(list, pointer(chk), pointer(dst), sizeof(src))
end
synchronize(queue)
- oneL0.free(pattern_buf)
+ oneL0.free(pattern_buf)
@test all(isequal(42), chk)
12e6f0a to b85723b
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@            Coverage Diff             @@
##           master     #555      +/-   ##
==========================================
+ Coverage   79.24%   79.28%   +0.04%
==========================================
  Files          46       46
  Lines        3064     3070       +6
==========================================
+ Hits         2428     2434       +6
  Misses        636      636
maleadt approved these changes on Dec 3, 2025
maleadt (Member) left a comment:
That's curious; I developed this on an A770 discrete GPU where it worked fine.
b85723b to 7f022f5
b5cfe7c to 112001b
There was a test failure on a Max 1100 GPU in oneAPI.jl/test/level-zero.jl (lines 274 to 281 in 010bd13): the test passed a standard host array (pattern = [42]) to zeCommandListAppendMemoryFill. AFAIK, on discrete Intel GPUs (unlike integrated ones), standard host memory is often not directly accessible by the device command processor. I also fixed fill! to address the same issue.

In that vein, I will also add a GitHub Actions runner for that GPU.
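
For illustration only, the allocate/store/use/free sequence from this PR could be wrapped so the USM buffer is released even if the fill throws. This is a minimal sketch, not code from the PR: with_usm_pattern is a hypothetical helper name, and it only uses calls that already appear in the diff (oneL0.host_alloc, unsafe_store!, convert to ZePtr, oneL0.free). It assumes a machine with a working oneAPI.jl installation.

```julia
using oneAPI, oneAPI.oneL0

# Hypothetical helper (not part of oneAPI.jl): stage `val` in USM host memory,
# call `f` with a device-visible pointer to it, and always free the buffer.
function with_usm_pattern(f, ctx, val::T) where {T}
    buf = oneL0.host_alloc(ctx, sizeof(T), Base.datatype_alignment(T))
    try
        unsafe_store!(convert(Ptr{T}, buf), val)
        return f(convert(ZePtr{T}, buf))
    finally
        oneL0.free(buf)
    end
end

# Usage sketch, assuming `ctx`, `queue`, `dst` and `src` as in test/level-zero.jl:
#
#   with_usm_pattern(ctx, 42) do pattern
#       execute!(queue) do list
#           append_fill!(list, pointer(dst), pattern, sizeof(Int), sizeof(src))
#       end
#       synchronize(queue)  # the fill is async, so finish before the buffer is freed
#   end
```

The callback has to synchronize before it returns, because the buffer is freed as soon as the callback exits; that is the same ordering the PR uses in fill! and in the test.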