On the ITER clusters, using @munechika-koyo's camera (which is big: 250,000 LOS) and an edge emissivity field from JINTRAC, the computation time (for res = 10 cm, just for testing) does not seem to be reduced by parallelization.
Minimal working example (tofu 1.4.2-a5):
In [1]: import tofu as tf
/home/ITER/vezined/ToFu_All/tofu/tofu/__init__.py:95: UserWarning:
The following subpackages are not available:
- tofu.mag
=> see tofu.dsub[<subpackage>] for details.
warnings.warn(msg)
In [2]: cam = tf.load('/home/ITER/munechk/public/MyTofu/output/ITER_test_camera_config.npz')
Loaded from:
/home/ITER/munechk/public/MyTofu/output/ITER_test_camera_config.npz
In [3]: multi = tf.imas2tofu.MultiIDSLoader(user='hoeneno', tokamak='convert', shot=134000, run=29, ids=['core_sources', 'equilibrium', 'edge_sources'])
Getting ids [occ] tokamak user version shot run refshot refrun
------------ ----- ------- ------- ------- ------ --- ------- ------
core_sources [0] convert hoeneno 3 134000 29 -1 -1
edge_sources [0] " " " " " " "
equilibrium [0] " " " " " " "
In [4]: _dshort = {'core_sources': {'1drhotn': 'source[identifier.name=radiation].profiles_1d[time].grid.rho_tor_norm',
   ...:                             '1deEnergy': 'source[identifier.name=radiation].profiles_1d[time].electrons.energy'},
   ...:            'equilibrium': {'2dpsi': 'time_slice[time].profiles_2d[0].psi',
   ...:                            '2dmeshR': 'time_slice[time].profiles_2d[0].r',
   ...:                            '2dmeshZ': 'time_slice[time].profiles_2d[0].z'}}
In [5]: multi.set_shortcuts(dshort=_dshort)
In [6]: plasma = multi.to_Plasma2D(shapeRZ=('R', 'Z'))
/home/ITER/vezined/ToFu_All/tofu/tofu/imas2tofu/_core.py:1972: UserWarning: The following data could not be retrieved:
- equilibrium:
2dB : '2dBT'
2dBR : list index out of range
2dBT : list index out of range
2dBZ : list index out of range
2djT : list index out of range
2dmeshFaces : list index out of range
2dmeshNodes : list index out of range
2dphi : list index out of range
2dpsi : list index out of range
2drhopn : '2dpsi'
2drhotn : '2dphi'
2dtheta : '2dmeshNodes'
strike0 : 'strike0R'
strike0R : list index out of range
strike0Z : list index out of range
strike1 : 'strike1R'
strike1R : list index out of range
strike1Z : list index out of range
x0 : 'x0R'
x0R : list index out of range
x0Z : list index out of range
x1 : 'x1R'
x1R : list index out of range
x1Z : list index out of range
- core_sources:
1dbrem : No / several matching signals for: - source[]['identifier', 'name'] = bremsstrahlung - nb.of matches: 0
1dline : No / several matching signals for: - source[]['identifier', 'name'] = lineradiation - nb.of matches: 0
1dprad : '1dbrem'
1dpsi : No / several matching signals for: - source[]['identifier', 'name'] = lineradiation - nb.of matches: 0
1drhopn : '1dpsi'
1drhotn : No / several matching signals for: - source[]['identifier', 'name'] = lineradiation - nb.of matches: 0
warnings.warn(msg)
In [7]: %timeit sig_sum, units = cam.calc_signal_from_Plasma2D(plasma, quant='edge_sources.2dradiation', plot=False, res=0.1, method='sum', minimize='calls', num_threads=1)
12.8 s ± 61.7 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [8]: %timeit sig_sum, units = cam.calc_signal_from_Plasma2D(plasma, quant='edge_sources.2dradiation', plot=False, res=0.1, method='sum', minimize='calls', num_threads=10)
13 s ± 88.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
The computation time is virtually the same.
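For reference, the two-point comparison above could be extended into a small scaling test (a sketch, reusing the cam and plasma objects from the session above):

import time

for n in (1, 2, 4, 10):
    t0 = time.perf_counter()
    sig, units = cam.calc_signal_from_Plasma2D(
        plasma, quant='edge_sources.2dradiation', plot=False,
        res=0.1, method='sum', minimize='calls', num_threads=n,
    )
    # With working OpenMP parallelization the time should drop (sub-linearly)
    # with n; the two points measured above (1 and 10 threads) are both ~13 s.
    print(f"num_threads={n}: {time.perf_counter() - t0:.2f} s")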
Checking the presence of openmp:
I checked that openmp is available by reproducing the test from tofu's setup.py:
In [11]: omp_test = r"""
...: #include <omp.h>
...: #include <stdio.h>
...: int main() {
...: #pragma omp parallel
...: printf("Hello from thread %d, nthreads %d\n", omp_get_thread_num(),
...: omp_get_num_threads());
...: }
...: """
...:
...:
...: def check_for_openmp(cc_var):
...: import tempfile
...:
...: tmpdir = tempfile.mkdtemp()
...: curdir = os.getcwd()
...: os.chdir(tmpdir)
...:
...: filename = r"test.c"
...: with open(filename, "w") as file:
...: file.write(omp_test)
...: with open(os.devnull, "w") as fnull:
...: result = subprocess.call(
...: [cc_var, "-fopenmp", filename], stdout=fnull, stderr=fnull
...: )
...:
...: os.chdir(curdir)
...: # clean up
...: shutil.rmtree(tmpdir)
...: return result
...:
...:
In [12]: import os
In [13]: import subprocess, shutil
In [14]: openmp_installed = not check_for_openmp("cc")
In [15]: openmp_installed
Out[15]: True
So openmp is apparently available, at least at compile time.
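Note that this test only proves the compiler accepts the -fopenmp flag; it does not prove that multiple threads are spawned at runtime. A possible complement (a sketch, reusing the same omp_test source as above) would be to run the compiled test program and count the threads it actually spawns:

def count_omp_threads(cc="cc", nthreads=10):
    import os, subprocess, tempfile
    with tempfile.TemporaryDirectory() as tmpdir:
        src = os.path.join(tmpdir, "test.c")
        exe = os.path.join(tmpdir, "test")
        with open(src, "w") as f:
            f.write(omp_test)  # same C source as in setup.py
        subprocess.check_call([cc, "-fopenmp", src, "-o", exe])
        # Ask for nthreads explicitly, then count the "Hello" lines printed,
        # one per thread actually spawned:
        env = dict(os.environ, OMP_NUM_THREADS=str(nthreads))
        out = subprocess.check_output([exe], env=env).decode()
        return len(out.splitlines())

If count_omp_threads() returned 10, OpenMP would be shown to run multithreaded on the node, narrowing the problem down to how tofu was built or how the session is scheduled.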
But if I open another terminal in parallel and monitor the CPU usage during the execution of the two %timeit commands above using
top -u vezined
I see that the CPU usage is effectively capped at 100%, meaning that despite the presence of openmp and num_threads=10, we are limited to a single CPU.
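A quick way to check whether the process itself is pinned to one core (rather than tofu failing to spawn threads) is to inspect the CPU affinity from inside the same ipython session (os.sched_getaffinity is Linux-only, which should be fine on the cluster):

import os
# CPUs this process is allowed to run on; affinity/cgroup limits show up
# here, unlike os.cpu_count(), which reports all CPUs on the node:
print(len(os.sched_getaffinity(0)), "of", os.cpu_count(), "CPUs usable")

If this prints "1 of N", the limitation comes from the environment, not from tofu.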
Possible causes, in my opinion:
- I suspect this is due to the fact that we are running from inside the ipython console, and that ipython was allocated only one CPU when it was first started.
=> this page seems to hold some valuable information on that point:
http://ipython.org/ipython-doc/stable/parallel/parallel_intro.html
- Or it could be that the system admins allocate, by default, only one CPU per user on the cluster (a quick check is sketched below).
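Both hypotheses can be tested from the same session by inspecting the environment; the exact scheduler variables depend on the batch system the cluster runs, so the SLURM name below is only an example:

import os
# A value of 1 in any of these would point to an environment-level limit
# rather than a tofu bug (OMP_THREAD_LIMIT caps even an explicit
# num_threads request):
for var in ("OMP_NUM_THREADS", "OMP_THREAD_LIMIT", "SLURM_CPUS_PER_TASK"):
    print(var, "=", os.environ.get(var))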
What do you think, @lasofivec?