Faster los calc signal sum calls for newcalc=False by Didou09 · Pull Request #308 · ToFuProject/tofu

Didou09 · 2019-12-02T15:17:00Z

Motivations:

In the case where newcalc is False, we use a for loop over the LOS to compute the sum (integral).
I wanted to know whether there might be more efficient function available in numpy.
I asked the question on Stackoverflow.

It exists, and here it is: np.add.reduceat()

Ref:
https://stackoverflow.com/questions/59079141/perform-numpy-sum-or-scipy-integrate-simps-on-large-splitted-array-efficient/59085492#59085492

This PR is interesting because the algorithm used when newcalc=False is very similar to the one used when (newcalc=True, method='sum', minimize='calls').
So acceleration in one case may be useful in the other.

Main changes:

Replaced the for loop in Rays.calc_signal(newcalc=False) and Rays.calc_signal_from_Plasma2D(newcalc=False) by np.add.reduceat()
While testing, a bug appeared in Plasma2D._get_indtu(), when t has only one time step. Fixed.

Benefits:

A small benchmark on ITER (git + python setup.py build_ext) yielded (on a case with 250,000 LOS but only res=0.1):

In [1]: import tofu as tf
m/home/ITER/vezined/ToFu_All/tofu/tofu/__init__.py:95: UserWarning: 
The following subpackages are not available:
    - tofu.mag
  => see tofu.dsub[<subpackage>] for details.
  warnings.warn(msg)

In [2]: multi = tf.imas2tofu.MultiIDSLoader(user='hoeneno', tokamak='convert', shot=134000, run=29, ids=['edge_sources'])
Getting ids   [occ]  tokamak  user     version  shot    run  refshot  refrun
------------  -----  -------  -------  -------  ------  ---  -------  ------
edge_sources  [0]    convert  hoeneno  3        134000  29   -1       -1    

In [3]: _dshort = {'core_sources': {'1drhotn':'source[identifier.name=radiation].profiles_1d[time].grid.rho_tor_norm',
   ...:    ...:                             '1deEnergy':'source[identifier.name=radiation].profiles_1d[time].electrons.energy'
   ...:    ...:                             },
   ...:    ...:            'equilibrium': {'2dpsi': 'time_slice[time].profiles_2d[0].psi',
   ...:    ...:                            '2dmeshR': 'time_slice[time].profiles_2d[0].r',
   ...:    ...:                            '2dmeshZ': 'time_slice[time].profiles_2d[0].z'}}
   ...:                            

In [4]: multi.set_shortcuts(dshort=_dshort)

In [5]: plasma = multi.to_Plasma2D()

In [6]: cam = tf.load('/home/ITER/munechk/public/MyTofu/output/ITER_test_camera_config.npz')
sLoaded from:
    /home/ITER/munechk/public/MyTofu/output/ITER_test_camera_config.npz

In [7]: sig_oldbasic, units = cam.calc_signal_from_Plasma2D(plasma, quant='edge_sources.2dradiation', plot=False, res=0.1, newcalc=False, ufunc=False)

In [8]: sig_oldufunc, units = cam.calc_signal_from_Plasma2D(plasma, quant='edge_sources.2dradiation', plot=False, res=0.1, newcalc=False, ufunc=True, fill_value=0.)

In [9]: sig_newbasic, units = cam.calc_signal_from_Plasma2D(plasma, quant='edge_sources.2dradiation', plot=False, res=0.1, newcalc=True, method='sum', minimize='calls', ufunc=False)

In [10]: np.allclose(sig_oldbasic.data, sig_oldufunc.data)
Out[10]: True

In [11]: np.allclose(sig_newbasic.data, sig_oldufunc.data)
Out[11]: True

In [12]: %timeit sig_oldbasic, units = cam.calc_signal_from_Plasma2D(plasma, quant='edge_sources.2dradiation', plot=False, res=0.1, newcalc=False, ufunc=False)
11.8 s ± 90.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [13]: %timeit sig_oldufunc, units = cam.calc_signal_from_Plasma2D(plasma, quant='edge_sources.2dradiation', plot=False, res=0.1, newcalc=False, ufunc=True, fill_value=0.)
8.77 s ± 19.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [14]: %timeit sig_newbasic, units = cam.calc_signal_from_Plasma2D(plasma, quant='edge_sources.2dradiation', plot=False, res=0.1, newcalc=True, method='sum', minimize='calls', ufunc=False)
9.94 s ± 19.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

Notice the 25% speed gain when newcalc=False.

The comparison with newcalc=True and a look at the code shows that in _GG.pyx, in the case (method='sum', minimize='calls'), the summing operation (l. 2991-2997) is done is a loop which is not parallelized, we could either:

Parallelize this loop ?
Quick hotfix : use the same function np.add.reduceat() to suppress the loop ?

What do you think @lasofivec ?

P.S.: keep in mind that currently, running tofu on the ITER clusters in ipython does not make use of the parallelization (cf. issue #307 ), which explains why in the above benchmark newcalc=True is not faster than newcalc=True, since it runs on only one CPU

…educeat and made Plasma2D.get_intu() more robust vs one-time step

pep8speaks · 2019-12-02T15:17:09Z

Hello @Didou09! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-12-02 15:20:40 UTC

Didou09 · 2019-12-02T15:22:42Z

        if t is not None:
-            indt = np.digitize(t, tbinall)
-            indtu = np.unique(indt)
+            if len(t) == len(tall) and np.allclose(t, tall):


handle trivial case first, particularly suited for cases with t.size = 1

Didou09 · 2019-12-02T15:25:22Z

        coefs=None,
        ind=None,
-        out=object,
+        returnas=object,


much more explicit variable name, especially since we tend to use 'out' to store temporary output inside functions.
It as the case in the other method by the way (calc_signal_from_Plasma2D()), out was redefined at line 6254

Didou09 · 2019-12-02T15:26:38Z

-                    )
-                    * reseff[ii]
-                )
+            sig = np.add.reduceat(val, np.r_[0, indpts],


More concise and numerically faster, only advantages :-)
Usable in _GG.LOS_calc_signal(method='sum', minimize='calls') ?
Usable elsewhere ?

Compatible with parallelization ?

nope, only functions that don't require gil can be parallelized, calling numpy functions requires the gil, so it is not possible. This is why it wasn't parallelized with newcalc=True

Didou09 · 2019-12-02T15:27:28Z

                    indpts[0], np.diff(indpts), pts.shape[1] - indpts[-1]
                ]
                vect = np.repeat(self.u, nbrep, axis=1)
+            if fill_value is None:


Avoid nans, they are not handled by np.add.reduceat() => we set fill_value to 0 instead in this case (if not forced by the user)

lasofivec · 2020-01-06T17:44:43Z

It is not possible to parallelize the code that uses numpy function (GIL-requirement).
So I went for the "hotfix" which is not such a hotfix, more an enhancement.
I tried the new version, which seems to be working, I will create a new PR.

lasofivec · 2020-01-06T17:47:40Z

Oh, one comment: in your benchmark, you put the parameter ufunc = [True/False] which is not a valid parameter for the function. I wonder if you were using the wrong version of tofu ? or a newer that was never pushed ?

Didou09 · 2020-01-07T08:19:14Z

It was a temporary version where all options were possible.
Since the becnhmark showed that using the ufunc (Universal Function, in this case np.add.reduceat) was significantly faster, I went for it and cleaned the function to leave only the ufunc=True implementation for newcalc=False, before pushing. That's why you didn't see it, it was a temporary version for benchmarking before pushing.

Didou09

Fine for me when tests are passing,
But there seems to be an issue with array contiguity in unit tests, do you see why ?

Sorry, this comment was intended for the other PR, my mistake

VEZINET Didier added 2 commits December 2, 2019 15:36

[FasterLOSCalcSumCalls] Replace loop on partial sum by ufinc np.add.r…

4a7a4e6

…educeat and made Plasma2D.get_intu() more robust vs one-time step

[(FasterLOSCalcSignalSumCalls] Corrected tests

e9af491

Didou09 added the geom - performance label Dec 2, 2019

Didou09 requested a review from lasofivec December 2, 2019 15:17

Didou09 self-assigned this Dec 2, 2019

[FasterLOSCalcSignalSumCalls] PEP8 Compliance

5d7d5f1

Didou09 commented Dec 2, 2019

View reviewed changes

lasofivec merged commit 00d1e05 into devel Jan 6, 2020

lasofivec deleted the Faster_LOSCalcSignalSumCalls branch January 6, 2020 17:44

lasofivec mentioned this pull request Jan 6, 2020

Faster LOS calc signal with new_calc = True #322

Merged

Didou09 commented Jan 7, 2020

View reviewed changes

Didou09 mentioned this pull request Jan 30, 2020

tofu 1.4.2 preparation #339

Merged

Didou09 mentioned this pull request Mar 10, 2020

Preparing 1.4.3 #364

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Faster los calc signal sum calls for newcalc=False#308

Faster los calc signal sum calls for newcalc=False#308
lasofivec merged 3 commits intodevelfrom
Faster_LOSCalcSignalSumCalls

Didou09 commented Dec 2, 2019 •

edited

Loading

Uh oh!

pep8speaks commented Dec 2, 2019 •

edited

Loading

Uh oh!

Didou09 Dec 2, 2019

Uh oh!

Didou09 Dec 2, 2019

Uh oh!

Didou09 Dec 2, 2019

Uh oh!

lasofivec Jan 6, 2020

Uh oh!

Didou09 Dec 2, 2019

Uh oh!

lasofivec commented Jan 6, 2020

Uh oh!

lasofivec commented Jan 6, 2020

Uh oh!

Didou09 commented Jan 7, 2020

Uh oh!

Didou09 left a comment •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

Didou09 commented Dec 2, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivations:

Main changes:

Benefits:

Uh oh!

pep8speaks commented Dec 2, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Comment last updated at 2019-12-02 15:20:40 UTC

Uh oh!

Didou09 Dec 2, 2019

Choose a reason for hiding this comment

Uh oh!

Didou09 Dec 2, 2019

Choose a reason for hiding this comment

Uh oh!

Didou09 Dec 2, 2019

Choose a reason for hiding this comment

Uh oh!

lasofivec Jan 6, 2020

Choose a reason for hiding this comment

Uh oh!

Didou09 Dec 2, 2019

Choose a reason for hiding this comment

Uh oh!

lasofivec commented Jan 6, 2020

Uh oh!

lasofivec commented Jan 6, 2020

Uh oh!

Didou09 commented Jan 7, 2020

Uh oh!

Didou09 left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Didou09 commented Dec 2, 2019 •

edited

Loading

pep8speaks commented Dec 2, 2019 •

edited

Loading

Didou09 left a comment •

edited

Loading