
Add eigsolve-style rrule for CTMRG fixed-point gradient#126

Merged
leburgel merged 8 commits into master from lb/arnoldi_pullback
Feb 11, 2025
Conversation

@leburgel
Member

@leburgel leburgel commented Feb 3, 2025

@leburgel leburgel marked this pull request as draft February 3, 2025 22:34
@codecov

codecov bot commented Feb 3, 2025

Codecov Report

Attention: Patch coverage is 86.66667% with 2 lines in your changes missing coverage. Please review.

Files with missing lines | Patch % | Lines
...rithms/optimization/fixed_point_differentiation.jl | 86.66% | 2 Missing ⚠️

Files with missing lines | Coverage Δ
src/PEPSKit.jl | 87.50% <ø> (ø)
...rithms/optimization/fixed_point_differentiation.jl | 94.73% <86.66%> (-1.52%) ⬇️

... and 1 file with indirect coverage changes

@leburgel
Member Author

leburgel commented Feb 5, 2025

I had a go at adding this after a suggestion from @Jutho, for now mainly just so I didn't forget about it again. From what I can tell solving the fixed-point gradient linear problem like this is actually very fast and stable for all the examples. It's also the method of choice for the VUMPS pullback in VUMPSAutoDiff.jl.

However, there seems to be a problem with the gradient tests for iterscheme==:diffgauge. One possibility is that some residual gauge freedom is actually left, so the modified eigenvalue problem fails because there is no unique eigenvector with eigenvalue 1. I don't really understand what is going on, so I don't know of an easy fix, but I thought I should leave this here as a suggestion.

Do you have an idea why diffgauge would give so much trouble here @pbrehmer?

@pbrehmer
Collaborator

pbrehmer commented Feb 5, 2025

This is super interesting, thanks for adding this. I will try to take a closer look in the next few days.

Regarding :diffgauge: I am kind of surprised by these issues since I thought that :diffgauge is in principle more stable. In that scheme, we differentiate through the gauge fixing itself so I would expect AD to take care of any residual gauge freedom - but perhaps I'm wrong.

@pbrehmer
Collaborator

I noticed a few things but I can't claim I have really understood what's going on. The problem seems to be that :diffgauge, as you said, doesn't converge properly and outputs an eigenvalue different from 1. When looking at the output, I noticed that the :fixed mode converges after expanding the Krylov subspace to the specified dimension, whereas :diffgauge needs multiple Schur solves before it stops.

So when choosing :diffgauge, it seems that the Krylov dimension really matters. KrylovKit defaults to krylovdim=30 which might be too high in general - reducing the dimension to e.g. krylovdim=8 really speeds up the Arnoldi convergence and it will actually converge to the correct gradient with eigenvalue 1. There seems to be a certain Krylov dimension above which the :diffgauge Arnoldi will break; below that, the Arnoldi convergence is comparable to the :fixed mode.

One can also enable eager=true so that the Krylov subspace is not expanded to the full specified dimension and this also seems to repair the :diffgauge mode and even speeds it up. Perhaps this also speeds up the :fixed mode.

The difference between :fixed and :diffgauge might be a stability thing? After all, with :diffgauge each application of f(X) will differentiate through an eigenvalue problem inside gauge_fix whereas :fixed really just boils down to a fixed SVD.
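For concreteness, here is a minimal runnable sketch of how the krylovdim and eager parameters discussed above enter a KrylovKit Arnoldi call. The toy 2×2 map is an assumption standing in for the actual fixed-point Jacobian; this is not PEPSKit code.

```julia
using KrylovKit

# Toy stand-in for the fixed-point Jacobian: a map whose leading eigenvalue
# is exactly 1, mimicking the modified eigenvalue problem discussed above.
A = [1.0 0.1; 0.0 0.5]
f(x) = A * x
x0 = randn(2)

# A smaller Krylov dimension than the KrylovKit default of krylovdim=30, and
# eager=true so convergence is checked at every iteration rather than only
# after the subspace has been expanded to the full specified dimension.
alg = Arnoldi(; krylovdim=8, tol=1e-10, eager=true)
vals, vecs, info = eigsolve(f, x0, 1, :LM, alg)
# For a well-posed problem, vals[1] should converge to 1.
```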

Collaborator

@pbrehmer pbrehmer left a comment


Thanks again for the addition, this really seems like an efficient approach to differentiating CTMRG! Perhaps this should be the new default? I will try to benchmark against LinSolver and see what seems best in a different PR.

@leburgel leburgel marked this pull request as ready for review February 11, 2025 09:05
@leburgel
Member Author

One can also enable eager=true so that the Krylov subspace is not expanded to the full specified dimension and this also seems to repair the :diffgauge mode and even speeds it up. Perhaps this also speeds up the :fixed mode.

Good catch, I had not thought of that! I added a default gradient_eigsolve which sets eager=true, and enabled eager mode in the tests using iterscheme=:fixed as well.

pbrehmer
pbrehmer previously approved these changes Feb 11, 2025
lkdvos
lkdvos previously approved these changes Feb 11, 2025
@leburgel leburgel dismissed stale reviews from lkdvos and pbrehmer via 1b438a4 February 11, 2025 15:22
@leburgel
Member Author

Sorry to dismiss the reviews, but I was maybe too quick in un-drafting this since there were two things that still needed to be addressed before merging:

  • I expanded the warning given when the norm of the auxiliary component vanishes. This basically means that either the eigsolve is unconverged or the Jacobian doesn't have a unique leading eigenvalue 1. I struggled a bit with the tolerance for this check; I picked the simplest option that seemed sensible, but maybe we can do better
  • I switched to use realeigsolve for the same reason we switched to reallinsolve before.

If this doesn't break anything, it should be good to go.
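As an illustration of the realeigsolve switch (again with a hypothetical toy map, not PEPSKit code; as I understand it, KrylovKit's realeigsolve shares eigsolve's positional signature but stays in real arithmetic, analogous to the earlier switch from linsolve to reallinsolve):

```julia
using KrylovKit

# Hypothetical real linear map standing in for the fixed-point Jacobian.
A = [1.0 0.1; 0.0 0.5]
f(x) = A * x
x0 = randn(2)

# realeigsolve refuses to leave real arithmetic, erroring out instead of
# silently producing complex eigenvalues or eigenvectors for a real problem.
vals, vecs, info = realeigsolve(f, x0, 1, :LM, Arnoldi(; eager=true))
```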

@leburgel leburgel enabled auto-merge (squash) February 11, 2025 15:57
@leburgel leburgel merged commit dcf1bb2 into master Feb 11, 2025
27 checks passed
@leburgel leburgel deleted the lb/arnoldi_pullback branch February 11, 2025 16:02
@leburgel leburgel mentioned this pull request Mar 12, 2025