Skip to content

Mooncake testsuite refactor#175

Merged
lkdvos merged 23 commits intomainfrom
ld-mooncaketests
Feb 26, 2026
Merged

Mooncake testsuite refactor#175
lkdvos merged 23 commits intomainfrom
ld-mooncaketests

Conversation

@lkdvos
Copy link
Member

@lkdvos lkdvos commented Feb 19, 2026

This is a somewhat large refactor of the Mooncake test suite that hinges on the idea to wrap the in-place functions in such a way that they become admissible to the internal Mooncake testing framework. (Based on ideas suggested by @Jutho, thanks!)

The basic idea is that for a function f!(A, input, alg) we need to:

  1. scratch the A space to not count finite-differences results
  2. avoid generating random tangents for the input variables

I also somewhat reorganized the tests to get a slightly better overview of the mooncake tests.
Finally, this also allowed me to find some small mistakes in the mutating rules, which I have here fixed.
This is also relevant for #174 and #173.

@codecov
Copy link

codecov bot commented Feb 19, 2026

Codecov Report

❌ Patch coverage is 88.88889% with 3 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/pullbacks/qr.jl 81.81% 2 Missing ⚠️
src/pullbacks/lq.jl 90.00% 1 Missing ⚠️
Files with missing lines Coverage Δ
...gebraKitMooncakeExt/MatrixAlgebraKitMooncakeExt.jl 62.57% <100.00%> (+1.81%) ⬆️
src/pullbacks/lq.jl 96.00% <90.00%> (+0.10%) ⬆️
src/pullbacks/qr.jl 94.73% <81.81%> (-1.16%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Co-authored-by: Jutho <Jutho@users.noreply.github.com>
@kshyatt
Copy link
Member

kshyatt commented Feb 23, 2026

I guess I really don't see the point of removing all the ad_*_setup functions given we'll need something similar to add support for Enzyme? Why not just keep them but allow alg to be passed?

@lkdvos lkdvos enabled auto-merge (squash) February 24, 2026 16:58
@kshyatt
Copy link
Member

kshyatt commented Feb 24, 2026

Once this merges, I'll update the other PR

Q2 = view(Q, (p + 1):size(Q, 1), :)
ΔQ2 = view(ΔQ, (p + 1):size(Q, 1), :)
ΔQ2Q1ᴴ = ΔQ2 * Q1'
check_lq_full_cotangents(Q1, ΔQ2, ΔQ2Q1ᴴ; gauge_atol)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For a rank-deficient matrix p < minmn, check_lq_full_cotangents can also be called by lq_compact (i.e. size(Q,1) == minmn for lq_compact and size(Q,1) == n for lq_full). Since this is not a change from the current PR, I will try to look into when this was introduced, and whether everything is still fully correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that mean we can go ahead with this one? That seems somewhat unrelated to the testsuite refactor anyways, so might be clearer in a separate PR anyways

remove_eig_gauge_dependence!(ΔV, D, V)

Remove the gauge-dependent part from the cotangent `ΔV` of the eigenvector matrix `V`. The
eigenvectors are only determined up to complex phase (and unitary mixing for degenerate
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the eig case there can also be non-unitary mixing. I think the only convention respected by LAPACK is that all the eigenvectors have norm 1, but that condition cannot easily be transformed into a restriction onto a restriction on the matrices. In fact, even for non-degenerate eigenvalues we actually require v' * Δv (the diagonal element of V' * ΔV) to be completely zero, not just the imaginary part of it (the imaginary part corresponds to phase rotations, the real part to norm changes).

Comment on lines +60 to +61
ΔU[:, (minmn + 1):end] .= 0
ΔVᴴ[(minmn + 1):end, :] .= 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also want to support rank deficient matrices, where we should use some p instead of minmn?

Comment on lines +75 to +76
Q₁ = @view Q[:, 1:r]
ΔQ₂ = @view ΔQ[:, (r + 1):end]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For consistency, since you do use regular view in the last line:

Suggested change
Q₁ = @view Q[:, 1:r]
ΔQ₂ = @view ΔQ[:, (r + 1):end]
Q₁ = view(Q, :, 1:r)
ΔQ₂ = view(ΔQ, :, (r + 1):end)

ambiguity. Additionally, rows of `ΔR` beyond the rank are zeroed out.
"""
function remove_qr_gauge_dependence!(ΔQ, ΔR, A, Q, R)
r = MatrixAlgebraKit.qr_rank(R)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to support a rank_atol keyword to be passed along from remove_qr_gauge_dependence! to qr_rank?

Comment on lines +105 to +106
Q₁ = @view Q[1:r, :]
ΔQ₂ = @view ΔQ[(r + 1):end, :]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Q₁ = @view Q[1:r, :]
ΔQ₂ = @view ΔQ[(r + 1):end, :]
Q₁ = view(Q, 1:r, :)
ΔQ₂ = view(ΔQ, (r + 1):end, :)

Additionally, columns of `ΔL` beyond the rank are zeroed out.
"""
function remove_lq_gauge_dependence!(ΔL, ΔQ, A, L, Q)
r = MatrixAlgebraKit.lq_rank(L)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question about rank_atol

Comment on lines +134 to +136
Q, _ = qr_compact(A)
mul!(ΔN, Q, Q' * ΔN)
return ΔN
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to not simply pass this on to remove_qr_null_gauge_dependence in order to avoid code duplication?

null space basis is only determined up to a unitary rotation, so `ΔNᴴ` is projected onto the
row span of the compact LQ factor `Q₁` of `A`.
"""
function remove_right_null_gauge_dependence!(ΔNᴴ, A, Nᴴ)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question with remove_lq_null_gauge_dependence!?

ΔD2 = Diagonal(randn!(similar(A, complex(T), m)))
return DV, (ΔD, ΔV), (ΔD2, ΔV)
ΔV = remove_eig_gauge_dependence!(ΔV, D, V)
ΔD = Diagonal(randn!(similar(A, complex(T), m)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ΔD = Diagonal(randn!(similar(A, complex(T), m)))
ΔD = Diagonal(randn!(similar(diagview(D))))

ΔV = randn!(similar(A.diag, T, m, m))
ΔV = remove_eiggauge_dependence!(ΔV, D, V)
ΔV = remove_eig_gauge_dependence!(ΔV, D, V)
ΔD = Diagonal(randn!(similar(A.diag, T, m)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ΔD = Diagonal(randn!(similar(A.diag, T, m)))
ΔD = Diagonal(randn!(similar(D.diag)))

or also using diagview(D) if we don't want to access .diag.

ΔD2 = Diagonal(randn!(similar(A, real(T), m)))
return DV, (ΔD, ΔV), (ΔD2, ΔV)
ΔV = remove_eigh_gauge_dependence!(ΔV, D, V)
ΔD = Diagonal(randn!(similar(A, real(T), m)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ΔD = Diagonal(randn!(similar(A, real(T), m)))
ΔD = Diagonal(randn!(similar(diagview(D))))

m, n = size(A)
T = complex(eltype(A))
D = eig_vals(A)
ΔD = randn!(similar(A, complex(T), m))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ΔD = randn!(similar(D))

m, n = size(A)
T = complex(eltype(A))
D = eig_vals(A)
ΔD = randn!(similar(A.diag, T, m))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ΔD = randn!(similar(D))

m, n = size(A)
T = eltype(A)
D = eigh_vals(A)
ΔD = randn!(similar(A, real(T), m))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ΔD = randn!(similar(D))

Comment on lines 405 to 415
function ad_svd_compact_setup(A)
m, n = size(A)
T = eltype(A)
minmn = min(m, n)
ΔU = randn!(similar(A, T, m, minmn))
ΔS = randn!(similar(A, real(T), minmn, minmn))
ΔS2 = Diagonal(randn!(similar(A, real(T), minmn)))
ΔS = Diagonal(randn!(similar(A, real(T), minmn)))
ΔVᴴ = randn!(similar(A, T, minmn, n))
U, S, Vᴴ = svd_compact(A)
ΔU, ΔVᴴ = remove_svdgauge_dependence!(ΔU, ΔVᴴ, U, S, Vᴴ)
return (U, S, Vᴴ), (ΔU, ΔS, ΔVᴴ), (ΔU, ΔS2, ΔVᴴ)
ΔU, ΔVᴴ = remove_svd_gauge_dependence!(ΔU, ΔVᴴ, U, S, Vᴴ)
return (U, S, Vᴴ), (ΔU, ΔS, ΔVᴴ)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why I cannot make a suggestion here, but anyway

function ad_svd_compact_setup(A)
    U, S, Vᴴ = svd_compact(A)
    ΔU = randn!(similar(U))
    ΔVᴴ = randn!(similar(Vᴴ))
    ΔS = Diagonal(randn!(similar(diagview(S))))
    ΔU, ΔVᴴ = remove_svd_gauge_dependence!(ΔU, ΔVᴴ, U, S, Vᴴ)
    return (U, S, Vᴴ), (ΔU, ΔS, ΔVᴴ)
end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would probably even remove the need to special case A::Diagonal below.

@@ -324,7 +440,7 @@
U, S, Vᴴ = svd_full(A)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a bit wasteful to do svd_compact in ad_svd_compact_setup, and the again svd_full here.

diagview(ΔSfull)[1:minmn] .= diagview(ΔS2)
diagview(ΔSfull)[1:minmn] .= diagview(ΔS)
return (U, S, Vᴴ), (ΔUfull, ΔSfull, ΔVᴴfull)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

function ad_svd_full_setup(A)
    U, S, Vᴴ = svd_full(A)
    ΔU = randn!(similar(U))
    ΔVᴴ = randn!(similar(Vᴴ))
    ΔS = zero(S)
    rand!(diagview(ΔS)) # although I think nonzero random contributions in all of ΔS would also just work fine
    ΔU, ΔVᴴ = remove_svd_gauge_dependence!(ΔU, ΔVᴴ, U, S, Vᴴ)
    return (U, S, Vᴴ), (ΔU, ΔS, ΔVᴴ)
end

m, n = size(A)
T = eltype(A)
WP = left_polar(A)
ΔWP = (randn!(similar(A, T, m, n)), randn!(similar(A, T, n, n)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ΔWP = randn!.(similar.(WP))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is not about this PR, I should probably make some changes to ad_utils.jl in a separate PR.

Copy link
Member

@Jutho Jutho left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some suggestions, but I already approve!

@lkdvos lkdvos merged commit c86c7d7 into main Feb 26, 2026
10 checks passed
@lkdvos lkdvos deleted the ld-mooncaketests branch February 26, 2026 13:24
@kshyatt
Copy link
Member

kshyatt commented Feb 26, 2026

Yay congrats!

@Jutho
Copy link
Member

Jutho commented Feb 26, 2026

I didn't realize automerge was in effect.

@Jutho
Copy link
Member

Jutho commented Feb 26, 2026

I will try to make a PR with the ad_utils suggestions right away.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants