Renormalize tensors before projections#148
Renormalize tensors before projections#148Confusio wants to merge 0 commit intoQuantumKitHub:masterfrom
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
... and 1 file with indirect coverage changes 🚀 New features to boost your workflow:
|
|
Thanks for getting this started! I haven't had time to look at the PR properly but what I just noticed: The Or perhaps I am missing the point: Was the reason you implemented |
I got it. I didn't notice |
lkdvos
left a comment
There was a problem hiding this comment.
Thanks for looking into this!
I left a couple comments throughout the code, but I have some general comments as well:
Could you try and separate out changes that are functional from ones that are cosmetic? It is a bit hard to review PRs where both are going on at the same time, and when things go wrong point at what it is.
Specifically, if I got this correctly, there is
- a restructuring of the callchain for computing the projectors
- a different way of contracting parts of the environment together (with
⊙) - a normalization before and after the SVD
Especially 2. scares me a bit, since the fermionic contractions really are a bit fragile: implementations that seemingly should give the same results can be different because contractions and multiplications are not the same
src/algorithms/ctmrg/projectors.jl
Outdated
| is the left and right tensors. After the projection, the arrow of the bond is now (L⊙P_L)←(P_R⊙R). | ||
|
|
||
| ``` | ||
| L⊙R=(L⊙R)*(L⊙R)^-1*(L⊙R)=(L⊙R)*(U*S*V)^-1*(L⊙R)=(L⊙R)*V'S^{-1}U'*(L⊙R)=L⊙(R*V'*S^{-1/2})*(S^{-1/2}*U'*L)⊙R=L⊙P_L*P_R⊙R |
There was a problem hiding this comment.
I'm not a huge fan of this docstring compared to the previous one, since the previous one had drawings. Here, the long equation is a bit hard to parse, so maybe you could merge the best of both worlds? (it could also be helpful to work a bit with spaces and indentation to improve readability)
There was a problem hiding this comment.
Here, the tensor contraction operator ⊙ is useful for deriving the projector operators on a selected bond. For example, consider the following diagrams:
---- || ----
--->----| | || | |
........| | || | |---->----
........| |----||->----| |.........
---<--- | L |....||......| R |.........
........| |----||-<----| |----<----
---->---| | || | |
---- || ----
No matter how many outer legs or contracted legs appear on the bond, the diagram can always be expressed as
L ⊙ R
where ⊙ represents the tensor contraction rather than map composition. (Note that standard linear algebra operation (e.g. factorization and inverse) are valid only for maps, not for general tensor contractions.)
For the singular value decomposition (SVD) of L ⊙ R, we treat the entire contraction as a map. Thus, we can write
L ⊙ R = U * S * V
where * denotes map composition, obeying the usual rules of linear algebra.
Since L ⊙ R is a map, it possesses an inverse given by
(L ⊙ R)⁻¹ = V' * S⁻¹ * U'
so that
(L ⊙ R) * (L ⊙ R)⁻¹ = (L ⊙ R)⁻¹ * (L ⊙ R) = I,
and we generally don't have
(L ⊙ R) ⊙ (L ⊙ R)⁻¹ = I,
as
From this, we obtain (note that before we implement linear algebra operators (e.g.
L ⊙ R = (L ⊙ R) * (L ⊙ R)⁻¹ * (L ⊙ R)
= (L ⊙ R) * V' * S⁻¹ * U' * (L ⊙ R)
= L ⊙ (R * V' * S^(–1/2)) * (S^(–1/2) * U' * L) ⊙ R
= L ⊙ P_L * P_R ⊙ R,
where
P_L = R * V' * S^(–1/2)
P_R = S^(–1/2) * U' * L.
This explicit distinction between map composition (using *) and tensor contraction (using ⊙) helps beginners understand why we need extra care about the signs of fermions if we do the linear algebra operations (e.g. factorization and inverse). If no linear algebra operations, @tensor in TensorKit.jl automatically does the right thing.
Once this point is clear, addressing the fermion signs in fPEPS becomes much more straightforward. The key rule is: whenever you perform linear algebra operations—such as factorization or inversion—you must use * (map composition), not ⊙ (tensor contraction). If no linear algebra operation, use ⊙ (tensor contraction).
Although one can use the @tensor macro for contractions, it does not easily allow one to formulate the detailed derivation of the projectors, which in turn creates a barrier to understanding the fermion signs in fPEPS (at least for me).
Another point is that a composition of * and twist operations is sufficient for constructing the CTMRG projectors. This is nearly true because the projections are always derived from the expanded corners, which are already permuted into the proper shapes: C1(p1; p2), C2(p2; p3), C3(p3; p4), and C4(p4; p1). Therefore, a composition of * and twist operations suffices for both the half and full infinite environments. In my implementation, many contractions in the environment part can be removed in the file ctmrg_contraction.jl.
With this in mind, there is no need to distinguish between the full_infinite_environment and half_infinite_environment methods when constructing the projectors, since only the left and right tensors are of concern. The proper left and right tensors are clearly specified in the CTMRG steps found in sequential.jl and simultaneous.jl. Specifically, on the bond connecting C1 and C2:
- For the half_infinite_environment, L = C1 and R = C2.
- For the full_infinite_environment, L = C4 ⊙ C1 and R = C2 ⊙ C3,
where C1, C2, C3, and C4 are the expanded corners. With L and R are provided, the compute_projector function gives you projectors.
There was a problem hiding this comment.
Thanks for the more detailed explanation, I think this is definitely helpful. However, I mostly mean that it would be helpful to put that in the docstring, because while I know what is going on, others might not. You removed one of the drawings and replaced it with a single long equation without explanation, where the symbols in the explanation are not necessarily that explanatory. That was mostly the point of my comment.
| """ | ||
| contract_projectors(U, S, V, Q, Q_next) | ||
|
|
||
| Compute projectors based on a SVD of `Q * Q_next`, where the inverse square root | ||
| `isqS` of the singular values is computed. | ||
|
|
||
| Left projector: | ||
| ``` | ||
| -- |~~~~~~| -- |~~| | ||
| |Q_next| |V'| -- isqS -- | ||
| == |~~~~~~| == |~~| | ||
| ``` | ||
|
|
||
| Right projector: | ||
| ``` | ||
| |~~| -- |~~~| -- | ||
| -- isqS -- |U'| | Q | | ||
| |~~| == |~~~| == | ||
| ``` | ||
| """ | ||
| function contract_projectors(U, S, V, Q, Q_next) | ||
| isqS = sdiag_pow(S, -0.5) | ||
| P_left = Q_next * V' * isqS # use * to respect fermionic case | ||
| P_right = isqS * U' * Q | ||
| return P_left, P_right | ||
| end | ||
|
|
There was a problem hiding this comment.
Why do you need to remove this function? It seems like you are still using it in your new implementation, but manually pasted it in?
There was a problem hiding this comment.
As I explain above, given Q (=L) and Q_next (=R) only, we can calculate the projectors in the compute_projector function.
There was a problem hiding this comment.
I know this, but by removing the function and the drawing, this is now slightly less transparent, and I'm wondering what the benefit is.
src/algorithms/ctmrg/projectors.jl
Outdated
|
|
||
| ``` | ||
| L⊙R=(L⊙R)*(L⊙R)^-1*(L⊙R)=(L⊙R)*(U*S*V)^-1*(L⊙R)=(L⊙R)*V'S^{-1}U'*(L⊙R)=L⊙(R*V'*S^{-1/2})*(S^{-1/2}*U'*L)⊙R=L⊙P_L*P_R⊙R | ||
| ``` |
There was a problem hiding this comment.
You need to attach docstrings directly above the function in order to make them be correctly parsed as docstrings, Julia will interpret this as a simple string.
There was a problem hiding this comment.
I am thinking about how to express this clearly. Only one formula can't explain the details.
There was a problem hiding this comment.
My comment might not have been very clear. What I mean is, given that this is the docstring for compute_projector, you need to explicitly have it before that function in the code, without newlines or other functions inbetween:
"""
docstring
"""
function myfunction endThis way, it will show up if you type ?compute_projector in the REPL
src/algorithms/ctmrg/projectors.jl
Outdated
| function ⊙(t1::AbstractTensorMap, t2::AbstractTensorMap) | ||
| return twist(t1, filter(i -> !isdual(space(t1, i)), domainind(t1))) * t2 | ||
| end |
There was a problem hiding this comment.
I'm not a huge fan of this symbol or this implementation.
For the symbol, there are two problems to me:
- if you don't know what it does, there is no way to figure out what this function is except search through the code and see where it is defined. It's not obvious to me why this would be a contraction with twists, and I would actually intuitively think it is some kind of
dotoperation if I would not know what it is. - it seems like we already have way to many functions to deal with fermionic symmetries etc, and adding one just increases the mental space you need to keep track of it.
Implementation-wise this is fully equivalent to, but less efficient than the suggestion I had here in terms of explicit tensorcontract calls.
Do you think we could stick to tensorcontractions and multiplications? In particular, I would much rather just introduce a function combine_environment (or anything along those lines) that we can overload if necessary, because this is now making a lot of implicit assumptions about the structure of these environments.
There was a problem hiding this comment.
See my above explanation. Actually, the expanded corner tensors take the form C1(p1; p2), C2(p2; p3), C3(p3; p4), and C4(p4; p1) quite naturally.
There was a problem hiding this comment.
I don't think I see how this relates to the issues I'm mentioning. Can you maybe elaborate?
There was a problem hiding this comment.
It's not so great an idea to start changing tests in the same PR as you are introducing some new functionality. This makes it hard to gauge if your implementation is actually working or not, since from your comment it seems like it is not and you had to adapt the tests to make them pass (which is of course not really the point of having the tests...)
There was a problem hiding this comment.
Due to a known issue with KrylovKit (Issue #100), the test can't pass. I need find out another initial peps by changing the random seed or switch to iterscheme=:diffgauge.
There was a problem hiding this comment.
I would prefer it if you keep it as non-passing in that case anyways, since it was passing before. What is different here, such that it no longer works? We can then decide if this can be safely ignored, because now it looks like your change makes it such that that issue appears, whereas before it was not appearing.
| corner_functions = Dict( | ||
| NORTHWEST => enlarge_northwest_corner, | ||
| NORTHEAST => enlarge_northeast_corner, | ||
| SOUTHEAST => enlarge_southeast_corner, | ||
| SOUTHWEST => enlarge_southwest_corner, | ||
| ) | ||
|
|
||
| haskey(corner_functions, dir) || error("Wrong direction: $dir") | ||
| corner = corner_functions[dir](Q.E_1, Q.C, Q.E_2, Q.A) |
There was a problem hiding this comment.
Do we really need this change? It is somewhat orthogonal to the other changes, and in particular I don't think this is what we want: using the dictionary of functions most likely results in a forced runtime dispatch, which is not great for type stability or performance in general.
There was a problem hiding this comment.
Sorry, we don't need the changes. I didn't notice Copilot did the changes automatically.
|
|
||
| haskey(corner_functions, dir) || error("Wrong direction: $dir") | ||
| corner = corner_functions[dir](Q.E_1, Q.C, Q.E_2, Q.A) | ||
| return corner / norm(corner) |
There was a problem hiding this comment.
Would it be possible to move the normalization higher up the callchain? TensorMap is a constructor, and as a function really is supposed to contract the tensors and give you a tensor, and implicitly renormalizing seems not entirely fair. (You might not always want to do this, for example if you need access to the value of the norm)
There was a problem hiding this comment.
When the norm of the PEPS tensor is either very small or very large, the contraction of the expanded corners (C1 * C2 * C3 * C4) can become numerically unstable because the PEPS tensor appears four times. For reasons that are not entirely clear, the gradient can sometimes become excessively large. As a result, the trial update during optimization (i.e., peps' = peps + grad * α) may acquire a very large norm, which leads to the breakdown of the LBFGS algorithm.
So I think the normalization here is useful, but we need find out a proper place to do this.
There was a problem hiding this comment.
I do not mean that we should not normalize, I mean that this should not happen in this function :)
src/algorithms/ctmrg/projectors.jl
Outdated
| if dim(codomain(L)) > dim(domain(L)) | ||
| _, L = leftorth!(L) | ||
| R, _ = rightorth!(R) |
There was a problem hiding this comment.
Did you check if this works with fermions? It seems like this would alter the arrows on the legs of what you are actually contracting/decomposing, and it confuses me a bit that there would be no extra parity factors showing up because of that.
Additionally, I think this is type-unstable: depending on the sizes of the tensors, L and R will now have an altered number of legs.
There was a problem hiding this comment.
In my notation, the result bond has the arrow represented as
(L ⊙ P_L) ← (P_R ⊙ R).
The leftorth and rightorth will be very helpful when L and R don't have full ranks, particularly for the case you mentioned they will have different number of outer legs.
There was a problem hiding this comment.
Yes, but you should implement this differently in that case: before you use leftorth, L is an N by M tensormap. After, it will be a 1 by M tensormap. The compiler does not know if you are going to apply leftorth or not. Hence, after this block of code, you lose type stability.
This can be resolved by explicitly having a function barrier before the actual implementation.
The reason I'm checking in about the fermions is that leftorth does not always produce arrows in the same way, but looking at it in more detail now I think it's fine because you are applying the correct twists in the line below.
There was a problem hiding this comment.
same comment about changing tests here
There was a problem hiding this comment.
Some reason due to a known issue with KrylovKit (Issue #100).
Items 1 and 2 might be optional since they mainly help clarify the fermion sign—though I'm not completely certain. Item 3 is essential as it enhances numerical stability. Please see my detailed thoughts. |
|
I'm not saying that we cannot change all of these, I'm saying that you should have separate PRs for them, since they are somewhat orthogonal. That way, we can more easily review the code and obtain the best solutions, without having to go through a large amount of separate issues at the same time |
|
Sorry for repeated commit message. Enhancements to
|
|
fix: Use fixed tolerance in Arnoldi algorithm to resolve KrylovKit issue Previously, the tolerance for rrule_alg was set using the variable ctmrg_tol (with a value of 1.0e-8). Debug output from svdsolve.jl (see this link) revealed that while the vals array contained all expected eigenvalues, the corresponding rvals array was missing an entry near 7e-8. (vals, rvals) = ([0.9972572805593066, 0.06451815859605552, 0.036184598804887586, 0.002434937441416787, 0.000257804461731779, 5.179320746859194e-5, 1.2009304246253585e-5, 9.930554302595036e-6, 7.877564742933071e-6, 4.449498137506608e-6, 9.658547169810283e-7, 6.521371763146172e-7, 2.906002397520167e-7, 7.149473802806458e-8, 6.731283066272181e-8, 3.68284807988075e-8], ComplexF64[0.9972572805593064 + 1.4059950420919265e-32im, 0.0645181585960555 + 5.5002370940311424e-33im, 0.03618459880488763 - 5.903859117468247e-32im, 0.002434937441416788 + 2.5572350764397176e-32im, 0.00025780446173177785 - 3.533902315109301e-33im, 5.1793207468593083e-5 - 3.239920849691298e-31im, 1.2009304246254148e-5 - 4.257380530297327e-25im, 9.930554302595809e-6 + 3.627657579880785e-24im, 7.877564742934238e-6 + 3.579591975126304e-24im, 4.449498137507113e-6 - 2.008137399273548e-24im, 9.65854703934024e-7 + 4.751819423053537e-19im, 6.521369232255494e-7 + 1.8750608545020293e-18im, 2.905964397554036e-7 - 9.161679754055458e-18im, 6.960340417111356e-8 + 1.518834871110928e-17im, 3.699007030825129e-8 + 4.687018494971075e-17im])This discrepancy—occurring close to the original tolerance—suggests a precision issue with the 1.0e-8 threshold. To resolve this, the tolerance has been hardcoded to 1.0e-10 in the Arnoldi algorithm: const rrule_alg = Arnoldi(; tol=1.0e-10, krylovdim=48, verbosity=-1)This adjustment improves numerical stability and consistency, addressing the problem described in KrylovKit.jl issue #110. |
This commit addresses PEPSKit.jl Issue #137 by refactoring CTMRG expanded corners and projector implementations. Key changes include:
Core Improvements
compute_projectorfunction that handles projectors for arbitrary left/right tensors connecting the bond.sequential_projectorsandsimultaneous_projectorsto use the new unified interfaceWorkflow Notes
iterscheme=:diffgaugeas a temporary workaround for a KrylovKit.jl issue (exact root cause remains undiagnosed)New Utility
This tensor contraction helper cleans up
infinite_environmentcontractions in bothfull_infinite_environmentandhalf_infinite_environmentimplementations.