Skip to content

avoid generic matmul fallback in kernel#378

Merged
lkdvos merged 1 commit intomainfrom
ld-mul
Mar 5, 2026
Merged

avoid generic matmul fallback in kernel#378
lkdvos merged 1 commit intomainfrom
ld-mul

Conversation

@lkdvos
Copy link
Member

@lkdvos lkdvos commented Mar 5, 2026

This is a very minor change that avoids falling back to generic_matmatmul for the signature mul!(::StridedView, ::StridedView, ::Matrix, ::Number, ::Number), simply by promoting the matrix to a StridedView as well.

Seems to lead to a slight speed-up in my cases, but is typically subleading. Nevertheless, probably better in the long run.

@lkdvos lkdvos requested review from Jutho and kshyatt March 5, 2026 14:32
@lkdvos lkdvos enabled auto-merge (squash) March 5, 2026 14:32
@kshyatt
Copy link
Member

kshyatt commented Mar 5, 2026

This is already being worked on in Strided

@lkdvos
Copy link
Member Author

lkdvos commented Mar 5, 2026

Is it possible to capture all cases like this?
I was under the impression that it would be really hard to capture all combinations of StridedView vs AbstractArray, without dealing with the ambiguity issues that then follow every time LinearAlgebra changes a signature somewhere, but I haven't thought that through in great detail.
(Note that this is not about GPU vs CPU mixtures, it really is just that the buffers are StridedView wrapped and the basis transform isn't, if that makes a difference)

@kshyatt
Copy link
Member

kshyatt commented Mar 5, 2026

This is exactly the case I already hit, in which case we know the basis transform is a raw Matrix (or CuMatrix), not wrapped by anything, so I think we can provide a generic option in Strided for that case.

@kshyatt
Copy link
Member

kshyatt commented Mar 5, 2026

FWIW I really don't think we need to worry about all cases, it's probably enough to cover Matrix, Adjoint{Matrix}, Transpose{Matrix} and the equivalents for the GPU matrices.

@kshyatt
Copy link
Member

kshyatt commented Mar 5, 2026

Doing this in Strided also allows people who run into similar situations not based on this specific basis transformation use case to benefit from the speedup 🚀

@lkdvos
Copy link
Member Author

lkdvos commented Mar 5, 2026

Hmmm, I see your point, but honestly I would still just add the wrapper here, rather than implicitly assuming this is handled upstream.

If we ever decide to change the matrix type, and this now changes into a type that you did not specifically handle in Strided, this would break again, and the addition here really doesn't hurt 😉.

@codecov
Copy link

codecov bot commented Mar 5, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Files with missing lines Coverage Δ
src/tensors/indexmanipulations.jl 77.00% <100.00%> (ø)
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@lkdvos lkdvos disabled auto-merge March 5, 2026 22:23
@lkdvos lkdvos merged commit 05772bf into main Mar 5, 2026
60 of 82 checks passed
@lkdvos lkdvos deleted the ld-mul branch March 5, 2026 22:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants