Conversation
Codecov Report✅ All modified and coverable lines are covered by tests.
... and 4 files with indirect coverage changes 🚀 New features to boost your workflow:
|
| r = MatrixAlgebraKit.qr_rank(R) | ||
| function remove_qr_gauge_dependence!(ΔQ, ΔR, A, Q, R; rank_atol = MatrixAlgebraKit.default_pullback_rank_atol(R)) | ||
| r = MatrixAlgebraKit.qr_rank(R; rank_atol) | ||
| Q₁ = @view Q[:, 1:r] |
There was a problem hiding this comment.
I guess this is now no longer used? Can you comment on the todo that is left? Is the idea that the zeroing out should use the rank, but the projecting out should only use the minmn?
There was a problem hiding this comment.
What is no longer used?
On the TODO: yes I think we can distinguish between the cases of rank-deficiency, and just the extra columns in Q if Q is square but resulting from qr_full applied to a full-rank tall matrix. In that case, Q should be completely fixed and differentiable. However, to treat the most general case (a rank deficient rectangular matrix), I have to redo the analysis because this was not covered in my notes ( https://jutho.github.io/LinAlgAD/main.html#general-result ): this is about the
There was a problem hiding this comment.
The variable Q1 is not used anymore I think
| ΔQR = randn!.(copy.(QR)) | ||
| remove_qr_gauge_dependence!(ΔQR..., A, QR...) | ||
| ΔQR = structured_randn!.(copy.(QR)) | ||
| A isa Diagonal || remove_qr_gauge_dependence!(ΔQR..., A, QR...) |
There was a problem hiding this comment.
Do we also need to project out gauge dependence for rank-deficient Diagonals?
There was a problem hiding this comment.
For now, I tried for my changes to be more stylistic, i.e. to be mostly functionally equivalent to the existing code. The existing Diagonal specialisation didn't have any gauge dependence removal, but yes, I agree it would make sense to also have this, for a A::Diagonal with some small or zero diagonal elements.
| ΔQR = randn!.(copy.(QR)) | ||
| remove_qr_gauge_dependence!(ΔQR..., A, QR...) | ||
| ΔQR = structured_randn!.(copy.(QR)) | ||
| A isa Diagonal || remove_qr_gauge_dependence!(ΔQR..., A, QR...) |
| mul!(ΔN, Q, Q' * ΔN) | ||
| return ΔN | ||
| end | ||
| remove_left_null_gauge_dependence!(ΔN, A, N) = remove_qr_null_gauge_dependence!(ΔN, A, N) |
There was a problem hiding this comment.
Does it make sense to remove one of the functions, if they are both the same?
There was a problem hiding this comment.
Probably yes, unless it is useful to know that there is a remove_method_gauge_dependence! for every method (or at least decomposition) we have.
| r = MatrixAlgebraKit.lq_rank(L) | ||
| function remove_lq_gauge_dependence!(ΔL, ΔQ, A, L, Q; rank_atol = MatrixAlgebraKit.default_pullback_rank_atol(L)) | ||
| r = MatrixAlgebraKit.lq_rank(L; rank_atol) | ||
| Q₁ = @view Q[1:r, :] |
| T = complex(eltype(A)) | ||
| D = eig_vals(A) | ||
| ΔD = randn!(similar(A, complex(T), m)) | ||
| ΔD = randn!(similar(D)) |
There was a problem hiding this comment.
Here we are using similar, in the other functions copy. I don't have any preference, but let's be consistent
There was a problem hiding this comment.
Both were used, but I agree on consistency. I think similar is slightly more efficient, but maybe I was worried in some cases if similar would produce the correct type, i.e. the same output type, but I think it does for all the cases where we use it.
| ΔS = zero(S) | ||
| randn!(diagview(ΔS)) |
There was a problem hiding this comment.
Is this obvious that this has to be a diagonal variation? Are we checking for this anywhere?
There was a problem hiding this comment.
I thought I had a comment stating that this is not strictly necessary, but apparently it got lost. I think only the diagonal of ΔS is used int he pullback (equivalent to projecting out all non-diagonal entries), so this is not strictly necessary, and we can just as well have structured_rand!(similar(S)).
There was a problem hiding this comment.
Again, it is just that I tried to stay functionally equivalent to the old version, at least for this first iteration. Once the tests pass on this (which they seem to do, except for the LQ stuff), we can try to remove indeed these restrictions.
Some changes as discussed. Somehow the AD correctness tests fail for Mooncake LQ, whereas they don't do for Mooncake QR (which uses the same approach), nor for Enzyme or ChainRules (I think). I haven't figured out what is going on yet, but maybe some reviewer has an idea.