Sequential CTMRG is slow compared to Python (with PyTorch)

I use the CTMRG algorithm to measure the Heisenberg model ground state obtained from simple update. The algorithm settings are
```
trscheme = truncerr(1e-8) & truncdim(12)
ctm_alg = CTMRG(; tol=1e-12, miniter=4, maxiter=100, verbosity=3, trscheme=trscheme)
```
The PEPS bond dimension is D = 6, and the environment bond dimension is χ = 12. Starting from a random `CTMRGEnv`, one CTMRG step takes about 1.3 s:
```
[ Info: CTMRG init:     obj = +2.310478936455e-09       err = 1.0000e+00
[ Info: CTMRG   1:      obj = +2.532703514106e-01       err = 3.4933131838e-01  time = 4.43 sec
[ Info: CTMRG   1:      obj = +2.532703514106e-01       err = 3.4933131838e-01  time = 0.18 sec
[ Info: CTMRG   2:      obj = +7.808860223845e-01       err = 1.5017305263e-01  time = 1.46 sec
[ Info: CTMRG   2:      obj = +7.808860223845e-01       err = 1.5017305263e-01  time = 0.00 sec
[ Info: CTMRG   3:      obj = +9.380692547858e-01       err = 4.9406274229e-02  time = 1.21 sec
[ Info: CTMRG   3:      obj = +9.380692547858e-01       err = 4.9406274229e-02  time = 0.00 sec
[ Info: CTMRG   4:      obj = +9.836562322063e-01       err = 2.2671232670e-02  time = 1.19 sec
[ Info: CTMRG   4:      obj = +9.836562322063e-01       err = 2.2671232670e-02  time = 0.00 sec
[ Info: CTMRG   5:      obj = +9.991752105702e-01       err = 9.6623034220e-03  time = 1.35 sec
[ Info: CTMRG   5:      obj = +9.991752105702e-01       err = 9.6623034220e-03  time = 0.00 sec
[ Info: CTMRG   6:      obj = +1.004775679484e+00       err = 3.8899763085e-03  time = 1.34 sec
[ Info: CTMRG   6:      obj = +1.004775679484e+00       err = 3.8899763085e-03  time = 0.00 sec
...
```
However, my own Python implementation (using PyTorch; the projectors are also computed from the half-infinite environment) takes only about 0.7 s per step, roughly twice as fast as PEPSKit:
```
iter      svd_diff    time/s
0       2.2900e+01      0.64
1       2.6927e-01      0.64
2       4.4606e-02      0.65
3       8.5458e-03      0.65
4       1.3304e-03      0.66
5       1.8990e-04      0.65
6       2.6252e-05      0.75
...
```
Here `svd_diff` is the convergence criterion, calculated as follows (slightly different from the `err` of PEPSKit):
- Compute the singular value spectrum of each CTM tensor before and after the RG step
- Compute the 2-norm of the spectrum difference for each CTM tensor
- Sum these norms and divide by 8 * N_row * N_col
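
For reference, the three steps above can be sketched in a few lines. This is only an illustration written with NumPy rather than PyTorch, not the code used for the timings. The helper names `spectrum` and `svd_diff_sketch` are hypothetical. So are two conventions that the description does not fix: each tensor is matricized as first leg versus all remaining legs, and spectra are zero-padded to length `chi` so they can be subtracted.

```python
import numpy as np

def spectrum(tensor, chi):
    """Singular values of one CTM tensor, zero-padded or cut to length chi."""
    # Assumed convention: first leg vs. all remaining legs.
    mat = tensor.reshape(tensor.shape[0], -1)
    s = np.linalg.svd(mat, compute_uv=False)  # sorted in descending order
    out = np.zeros(chi)
    n = min(chi, s.size)
    out[:n] = s[:n]
    return out

def svd_diff_sketch(env_before, env_after, n_row, n_col, chi):
    """Sum of 2-norms of spectrum differences, divided by 8 * n_row * n_col.

    env_before / env_after are flat lists holding the corner and edge
    tensors of the whole unit cell (8 tensors per site), in the same order.
    """
    total = 0.0
    for t_old, t_new in zip(env_before, env_after):
        total += np.linalg.norm(spectrum(t_new, chi) - spectrum(t_old, chi))
    return total / (8 * n_row * n_col)

if __name__ == "__main__":
    # Toy 1x1 unit cell: 4 corners (chi x chi) and 4 edges (chi x D^2 x chi).
    rng = np.random.default_rng(0)
    chi, d2 = 4, 9
    shapes = [(chi, chi)] * 4 + [(chi, d2, chi)] * 4
    before = [rng.standard_normal(s) for s in shapes]
    after = [t + 1e-3 * rng.standard_normal(t.shape) for t in before]
    print(svd_diff_sketch(before, after, 1, 1, chi))
```

Because only singular values enter, this measure is insensitive to gauge changes on the environment legs, which may be one reason it behaves differently from the `err` reported by PEPSKit.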

I also used the functions in PEPSKit to write a simpler version without the autodiff machinery. This brings the time down to about 0.9 s per RG step, which is still slower than PyTorch:
```
1         4.5357e-01    11.248 s
2         2.0233e-01     0.984 s
3         3.5182e-02     0.769 s
4         7.0871e-03     0.952 s
5         1.2966e-03     0.757 s
6         2.1065e-04     0.953 s
```
So my concern is that the autodiff machinery (Zygote, etc.) may add too much overhead for applications that do not differentiate through CTMRG.

Sequential CTMRG is slow compared to Python (with PyTorch) #81