Conversation
…`KrylovKit.svdsolve` pullback
pbrehmer
left a comment
Thanks for getting this in, I like this a lot. I will implement the same thing in #150.
One thing I was wondering: Have you checked to which degree this gets rid of the gauge sensitivity warnings? In any case, I think this is a strict improvement so let's get it merged :-)
This doesn't fix any gauge sensitivity warnings (in fact it might make them a bit worse if the tolerance is lowered...), but combined with Jutho/KrylovKit.jl#123 it at least gets rid of most of the … We can actually get rid of the warnings altogether by setting …
Co-authored-by: Lukas Devos <ldevos98@gmail.com>
Lost coverage now that #150 is merged since the default is the … EDIT: apparently there's still coverage, but I have no clue from where. I'll try to add some more combinations to the CTMRG gradient test to be sure; maybe it's a good time for that anyway now that a lot of stability issues are solved.
Sorry to keep dismissing the reviews, but it turns out using …
Okay, the gradient test at least gives an overview of the combinations that go wrong (see here). In summary, the CTMRG gradient computation is throwing an error for: …
Example stacktrace: …
Hmm, I am confused by this since I feel this definitely worked before. Judging by the stacktrace, this is not a numerical problem but an issue with the algorithm creation? I will take a look now and fix it.
Well, the SVD rrule alg should have an effect since …
It would make sense to use an iterative rrule only when the specified forward SVD algorithm is iterative, since then the SVD cannot return the full U, S and V, such that TensorKit's rrule cannot be used. If, however, TensorKit's …
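The dispatch logic described above can be sketched as follows. All type and function names here (`SVDAlg`, `FullSVD`, `IterSVD`, `select_rrule`, ...) are hypothetical placeholders, not the actual PEPSKit/TensorKit API:

```julia
# Hypothetical sketch: only fall back to the iterative rrule when the forward
# algorithm cannot provide the full factorization.
abstract type SVDAlg end
struct FullSVD <: SVDAlg end  # dense SVD: returns full U, S, V
struct IterSVD <: SVDAlg end  # iterative (Krylov) SVD: truncated factors only

abstract type SVDRrule end
struct FullRrule <: SVDRrule end       # TensorKit-style rrule, needs full U, S, V
struct IterativeRrule <: SVDRrule end  # linear-problem based rrule, works with truncated factors

# Dispatch on the forward algorithm to pick a compatible reverse rule.
select_rrule(::FullSVD) = FullRrule()
select_rrule(::IterSVD) = IterativeRrule()
```

This keeps the cheap dense rrule as the default whenever the full factorization is available.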
I agree it does not make much sense to use the iterative rrule with a …
* Implement norm-preserving retractions
* Drop implicit assumption that state has unit norm
* Update src/algorithms/optimization/peps_optimization.jl
Co-authored-by: Lukas Devos <ldevos98@gmail.com>
* Increase tol for cotangent gauge sensitivity warnings
* Fix typo
* Not too high, not too low?
* Definitely too low
* Very arbitrary
* Restore tolerance, reduce bond dimension instead?
* No real difference, so restore bond dimension
---------
Co-authored-by: Lukas Devos <ldevos98@gmail.com>
The problem was the typing of the SVD pullback. Apparently, some algorithm combinations led to the case of the TensorKit …
I would say this should run since this is a valid algorithm combination in principle. Sure, one wouldn't use this for practical purposes, but perhaps for methodological reasons. Also, given that the iterative rrules tend to have a few hiccups, I find it useful to always test them in different combinations to see whether they work.
…Kit.jl into lb/svdsolve_rrule_tol
Thanks a lot for the fixes @pbrehmer! Everything seems okay now, but the CTMRG tests time out because the gradient test covers a lot more combinations now. Since this is not super urgent I was thinking of taking some more time here and: …
How does that sound?
Likewise thanks for your work!
In my experience most of the time in that test is spent in first compiling Zygote (especially for the fermionic p-wave superconductor!) and then in all the combinations where …
I split off the gradient tests into a separate workflow and tried some more combinations for the p-wave superconductor gradient test. Everything runs, which is nice, but it seems that the only combination that gives a correct gradient for the current setup is … Another issue is the compilation time and the runtimes for …
Just as a side note, if we feel this is warranted we can also increase the allowed time. I manually reduced this a bit to … since this tends to indicate something fishy is going on, but if we know this is just how long it takes we can obviously increase that again.
I bumped the environment bond dimension, which fixes #158, and added in all working combinations for both models. I left out the … This should now give an idea of how long things take. I can try to reduce the bond dimension a bit in the hopes that the p-wave tests then still pass, but still I think the only way of getting this to not time out is to either increase the limit (we'll see how far it gets now), or drop the …
It turns out that a very small environment bond dimension increase is enough to make the p-wave gradient tests pass, so I reduced this again. Still, the gradient tests time out severely. I think completing all of them would take double the time of the current timeout, so just increasing the time limit would not be ideal, as the test suite would then take very long to complete. The only reasonable way to fix this, aside from just increasing the time limit, I can think of is to not test the trivial …
Which part of these tests are the expensive ones? Can we get away with doing a single gradient evaluation for all of them, and would that help? What I mean is, can we converge something, and then simply check if all methods give more or less the same gradient?
The really expensive ones are those where Zygote just automatically backpropagates through all CTMRG iterations to get the gradient (these take >5 minutes per case versus around one minute for all other gradient modes). In general, it's really the gradient computation itself that takes the most time, so even reusing a pre-converged CTMRG environment would not really help speed things up much, I think.
Have we ever tried the trick of first computing the environment, and then doing AD through a limited number of iterations?
I'm pretty sure that's what we're already doing in the test, actually. As far as I can tell, the environment is pre-converged first and then …
Okay, that's not entirely true. We're pre-converging the environment once at the initial point, but then for every retraction in the optimtest the gradient is computed on the full … I also thought of decreasing the number of step sizes tried in the …
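The "pre-converge, then differentiate only a few extra iterations" trick can be illustrated on a toy scalar fixed-point problem. Everything below is a hand-rolled sketch (no Zygote, no CTMRG); the function names are made up for illustration:

```julia
# Toy fixed-point map: x -> (x + θ/x)/2 converges to sqrt(θ).
step(x, θ) = (x + θ / x) / 2

# Converge WITHOUT any derivative tracking (cheap).
function converge(θ; tol=1e-12, maxiter=100)
    x = one(θ)
    for _ in 1:maxiter
        x′ = step(x, θ)
        abs(x′ - x) < tol && return x′
        x = x′
    end
    return x
end

# Then differentiate only through n extra iterations starting from the
# converged point, instead of through the whole convergence history
# (hand-rolled forward mode: propagate dx/dθ alongside x).
function grad_lastn(θ, n)
    x = converge(θ)  # treated as a θ-independent starting point
    dx = 0.0
    for _ in 1:n
        # d/dθ of (x + θ/x)/2 by the chain rule:
        dx = (dx + (1 - θ * dx / x) / x) / 2
        x = step(x, θ)
    end
    return dx
end
```

For θ = 4 the fixed point is 2 and the exact derivative d√θ/dθ = 1/(2√θ) = 0.25; a single differentiated iteration already recovers it, which is the point of the trick.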
I would suggest cutting down on the combinations with …
Note that …
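As a sketch, pruning the combination matrix could look like the following; the algorithm symbols are made-up placeholders, not the real test parameters:

```julia
# Hypothetical test-matrix pruning: enumerate all combinations, then filter
# out the ones we decided not to cover (e.g. iterative rrule with dense SVD).
svd_algs = (:full_svd, :iter_svd)
rrule_algs = (:full_rrule, :iter_rrule)
grad_modes = (:geomsum, :manualiter, :linsolver)

combinations = [
    (svd, rrule, grad)
    for svd in svd_algs, rrule in rrule_algs, grad in grad_modes
    if !(svd == :full_svd && rrule == :iter_rrule)
]
```

Dropping a single forward/reverse pairing like this removes a whole slice of the matrix (here 3 of 12 cases), which is where most of the CI time savings would come from.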
Even after these cuts, there are two timeouts on v1.11, by at most a few minutes. I don't think we should cut any more test combinations, and I don't really see any more obvious ways of speeding things up. How would we feel about increasing the timeout by 15 minutes?
You can do so by adding a …
pbrehmer
left a comment
Thanks @leburgel for vastly improving the gradient tests! I left a comment on having a user option for gradient_alg=nothing but that can be addressed in another PR. We should keep in mind that the SVD interface is still quite messy but that's okay for now, I guess.
```julia
)
# instantiate because hook_pullback doesn't go through the keyword selector...
concrete_gradient_alg = if isnothing(gradient_alg)
    nothing # TODO: add this to the PEPSKit.GradMode selector?
```
I was also considering this and I am not sure about it. From the user perspective it would make a lot of sense to have a :naive option which differentiates through all the CTMRG iterations via gradient_alg=nothing, mostly as a reference method. However, internally it would be weird to do this inside the GradMode constructor, since Nothing is definitely not a subtype of GradMode. Of course we could ignore that for the sake of usability, but I am unsure about that.
In any case, this should probably be handled in a different PR, so let's keep this in mind for now.
I think my main problem here was that I needed to instantiate the algorithm to use with hook_pullback, and my first attempt of calling GradMode(; alg=nothing) didn't work, so I left the comment more as a reminder to myself. I don't think this is really much of a problem, as the keyword selector as a whole handles calling fixedpoint(...; gradient_alg=nothing, ...) perfectly well. So we can open an issue and make it so we can use GradMode directly for this in the future, but I'm not sure if this is very useful.
A working example of how to do naive AD would solve everything as well, and I think we're anyway going to add that in the docs update.
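A minimal sketch of how a keyword selector could pass `nothing` through as "naive AD"; all names here (`select_gradient_alg`, the `GradMode`/`GeomSum` stubs) are hypothetical, not the actual PEPSKit implementation:

```julia
# Hypothetical selector: `nothing` is not a GradMode subtype, so it is passed
# through untouched and the caller falls back to differentiating the full
# CTMRG fixed-point loop with Zygote.
abstract type GradMode end
struct GeomSum <: GradMode
    tol::Float64
end

select_gradient_alg(::Nothing; kwargs...) = nothing  # naive AD, reference method
select_gradient_alg(alg::GradMode; kwargs...) = alg  # already instantiated
function select_gradient_alg(alg::Symbol; tol=1e-6)
    alg === :geomsum ? GeomSum(tol) : error("unknown gradient algorithm: $alg")
end
```

This keeps `Nothing` out of the `GradMode` type hierarchy while still giving users a uniform keyword interface.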
An attempt to address (most?) occurrences of Jutho/KrylovKit.jl#110 we encounter, specifically the ones in #151 and #154. This will kind of be superseded by #150 right away, but I think it's anyway not a bad idea to address this directly as well in parallel.
There's a lot of extra warnings now with the lowered tolerance, but these are more a consequence of the precise check being performed in KrylovKit.jl rather than an actual issue, and I've confirmed that this is solved by Jutho/KrylovKit.jl#123.
One question I had here is what we want to do with the verbosity. It seems that `KrylovKitChainRulesCoreExt.compute_svdsolve_pullback_data` handles a lot of its warnings based on the verbosity of the primal algorithm. Originally, this was just set to a dummy `GKL` algorithm which effectively has `verbosity=1`, giving a lot of warning messages even though by default the `rrule_alg` has `verbosity=-1`. I at least set an explicit `verbosity=1` in the minimal primal algorithm for now so this is a bit more obvious, but I wonder if we should:

* … `KrylovKitChainRulesCoreExt.compute_svdsolve_pullback_data`, so we can at least get a warning when the cotangent linear problem does not converge.
* … `rrule_alg.verbosity`, so users can actually choose whether or not they want to see warnings. In particular, this would allow suppressing some previously unsuppressible logging spam if people want.

Should also fix #158 once merged.
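A sketch of the second option, forwarding `rrule_alg.verbosity` into the dummy primal algorithm used by the pullback; `GKLStub`, `RruleAlg` and `primal_for_pullback` are made-up placeholder types and functions, not the actual KrylovKit API:

```julia
# Hypothetical stand-ins for the primal (GKL) and rrule algorithm configs.
struct GKLStub
    verbosity::Int
end
struct RruleAlg
    verbosity::Int
end

# Instead of constructing a fixed dummy with verbosity = 1, inherit the
# verbosity the user chose for the rrule, so pullback warnings can actually
# be silenced (verbosity = -1) or enabled on demand.
primal_for_pullback(rrule_alg::RruleAlg) = GKLStub(rrule_alg.verbosity)
```

With this, `rrule_alg.verbosity` becomes the single knob controlling both the cotangent linear-problem warnings and the primal-side logging spam.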