Conclusion: this package seems slightly slower than existing ones overall, but faster for some sizes; bmul is the fastest one. I'm looking forward to seeing how much @batch can improve on this.
Not sure why, but this is currently 8x slower than SparseArrays; the following is the comparison at different sizes.
I only have a naive benchmark so far; it currently shows a 5x speedup at 2^20 x 2^20.
julia> speedup = map(zip(reports, base_reports)) do (r, br)
minimum(br).time / minimum(r).time
end
9-element Vector{Float64}:
0.00560784995998469
0.013175431564728122
0.03735109945863908
0.20631260631260628
0.4875896304467733
2.7184508079539054
4.600574856107668
5.6593177540493
 5.013580931023638
Version info (8 threads on M1 Pro):
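For readability, the nine ratios above correspond to n = 4:2:20; pairing them explicitly (a small helper, not part of the original benchmark; values copied from the output above):

```julia
# Speedup of ParallelMergeCSR.mul! over SparseArrays.mul!
# for matrices of size 2^n x 2^n, n = 4, 6, ..., 20
# (values copied from the benchmark output above).
speedup = [0.00560784995998469, 0.013175431564728122, 0.03735109945863908,
           0.20631260631260628, 0.4875896304467733, 2.7184508079539054,
           4.600574856107668, 5.6593177540493, 5.013580931023638]
for (n, s) in zip(4:2:20, speedup)
    println("2^$n x 2^$n: speedup = $s")
end
```

A value below 1 means SparseArrays wins; the crossover happens between n = 12 and n = 14.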
julia> versioninfo()
Julia Version 1.8.5
Commit 17cfb8e65ea (2023-01-08 06:45 UTC)
Platform Info:
OS: macOS (arm64-apple-darwin21.5.0)
CPU: 10 × Apple M1 Pro
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-13.0.1 (ORCJIT, apple-m1)
Threads: 8 on 8 virtual cores
The script to reproduce:
using BenchmarkTools
using ParallelMergeCSR
using SparseArrays
function benchmark(n::Int)
A = sprand(2^n, 2^n, 1e-4)
x = rand(2^n)
y = similar(x)
return @benchmark ParallelMergeCSR.mul!($y, transpose($A), $x, 1.0, 0.0)
end
function benchmark_base(n::Int)
A = sprand(2^n, 2^n, 1e-4)
x = rand(2^n)
y = similar(x)
return @benchmark SparseArrays.mul!($y, transpose($A), $x, 1.0, 0.0)
end
reports = []
for n in 4:2:20
@info "benchmarking" n
push!(reports, benchmark(n))
end
base_reports = []
for n in 4:2:20
@info "benchmarking" n
push!(base_reports, benchmark_base(n))
end
speedup = map(zip(reports, base_reports)) do (r, br)
minimum(br).time / minimum(r).time
end
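One check worth adding before trusting the timings (a sketch, assuming the same five-argument mul! signature used in the script above): verify both implementations agree on a small case.

```julia
using ParallelMergeCSR
using SparseArrays

# Sanity check: both implementations should compute the same y = A' * x.
A = sprand(2^10, 2^10, 1e-4)
x = rand(2^10)
y1 = zeros(2^10)
y2 = zeros(2^10)
ParallelMergeCSR.mul!(y1, transpose(A), x, 1.0, 0.0)
SparseArrays.mul!(y2, transpose(A), x, 1.0, 0.0)
@assert y1 ≈ y2
```

Also note that the measured speedup depends on the thread count: Julia must be started with threads enabled (e.g. `julia -t 8` or `JULIA_NUM_THREADS=8`), since `Threads.nthreads()` defaults to 1.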