calc_background in cryspy takes 75% of the execution time of job.calculate_profile() #152

@elindgren


TL;DR It would be good to be able to disable calc_background in cryspy.

I was playing around with the MCMC notebook and noticed that evaluating job.calculate_profile() was really slow. I did some profiling, and it seems that the function calc_background() in rhochi_pd.py takes most of the time, mainly through numpy's argwhere/nonzero calls.

See the following output from a short cProfile script:

Thu Nov 14 17:18:14 2024    easystats

         13594 function calls (13470 primitive calls) in 0.037 seconds

   Ordered by: cumulative time
   List reduced from 590 to 25 due to restriction <25>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.037    0.037 {built-in method builtins.exec}
        1    0.000    0.000    0.037    0.037 <string>:1(<module>)
        1    0.000    0.000    0.037    0.037 Job.py:520(calculate_profile)
        1    0.000    0.000    0.036    0.036 Analysis.py:35(calculate_profile)
        1    0.000    0.000    0.036    0.036 computation.py:894(apply_ufunc)
        1    0.000    0.000    0.036    0.036 computation.py:271(apply_dataarray_vfunc)
        1    0.000    0.000    0.036    0.036 computation.py:704(apply_variable_ufunc)
        1    0.000    0.000    0.036    0.036 xarray.py:631(func)
        1    0.000    0.000    0.035    0.035 Inferface.py:139(__fit_func)
        1    0.000    0.000    0.035    0.035 cryspyV2.py:721(fit_func)
        1    0.000    0.000    0.035    0.035 cryspyV2.py:566(full_callback)
        1    0.000    0.000    0.035    0.035 cryspy.py:637(full_calculate)
        1    0.000    0.000    0.035    0.035 cryspy.py:403(powder_1d_calculate)
        1    0.000    0.000    0.035    0.035 cryspy.py:501(do_calc_setup)
        1    0.000    0.000    0.034    0.034 cryspy.py:900(_do_run)
        1    0.000    0.000    0.034    0.034 rhochi_by_dictionary.py:236(rhochi_calc_chi_sq_by_dictionary)
        1    0.002    0.002    0.034    0.034 rhochi_pd.py:65(calc_chi_sq_for_pd_by_dictionary)
   ---> 1    0.010    0.010    0.028    0.028 rhochi_pd.py:31(calc_background)
       10    0.000    0.000    0.018    0.002 fromnumeric.py:53(_wrapfunc)
        2    0.000    0.000    0.018    0.009 numeric.py:561(argwhere)
        3    0.000    0.000    0.018    0.006 fromnumeric.py:1881(nonzero)
        3    0.018    0.006    0.018    0.006 {method 'nonzero' of 'numpy.ndarray' objects}
        1    0.000    0.000    0.002    0.002 structure_factor.py:1067(calc_index_hkl_multiplicity_in_range)
        1    0.000    0.000    0.002    0.002 powder_diffraction_const_wavelength.py:195(calc_profile_pseudo_voight)
       10    0.000    0.000    0.001    0.000 property.py:55(__get__)

The important figure here is cumtime, which lists how much time is spent in a function, including the functions it calls. Out of the total of 37 ms, 28 ms is spent in calc_background (18 ms of that in numpy's nonzero), while calc_index_hkl_multiplicity_in_range in structure_factor.py takes only 2 ms.
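For anyone who wants a report in the same format without the attachment, here is a minimal sketch using only the standard library. The `calculate_profile` function below is a stand-in workload, not the easydiffraction method, and `profile_call` is a hypothetical helper name; in practice you would pass `job.calculate_profile` instead.

```python
import cProfile
import io
import pstats


def calculate_profile():
    """Stand-in workload; replace with job.calculate_profile in real use."""
    return sum(i * i for i in range(10_000))


def profile_call(func, limit=25):
    """Profile one call of func and return a report sorted by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)

    # Collect the report in a string instead of printing straight to stdout.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # "cumulative" sorts by cumtime; limit keeps only the top entries.
    stats.sort_stats("cumulative").print_stats(limit)
    return buffer.getvalue()


report = profile_call(calculate_profile)
print(report)
```

Sorting by cumulative time is what makes an expensive function like calc_background stand out, since its own tottime can be small while the functions it calls dominate.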

I discussed this with @AndrewSazonov, and he said that cryspy does not need to compute the background because easydiffraction adds it later. I then commented out the following lines in rhochi_pd.py in order to skip computing the background:

https://github.com/ikibalin/cryspy/blob/4097fa308a3d2a1e0f50adc720874084f40bd24b/cryspy/procedure_rhochi/rhochi_pd.py#L118-L124

I also modified this line to drop the signal_background term:

total_signal_sum = total_signal_plus + total_signal_minus + signal_background -> total_signal_sum = total_signal_plus + total_signal_minus

With this change the execution takes only 9 ms, down from the original 37 ms.

Thu Nov 14 17:11:54 2024    easystats

         13540 function calls (13416 primitive calls) in 0.009 seconds

   Ordered by: cumulative time
   List reduced from 584 to 25 due to restriction <25>

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.009    0.009 {built-in method builtins.exec}
        1    0.000    0.000    0.009    0.009 <string>:1(<module>)
        1    0.000    0.000    0.009    0.009 Job.py:520(calculate_profile)
        1    0.000    0.000    0.009    0.009 Analysis.py:35(calculate_profile)
        1    0.000    0.000    0.008    0.008 computation.py:894(apply_ufunc)
        1    0.000    0.000    0.008    0.008 computation.py:271(apply_dataarray_vfunc)
        1    0.000    0.000    0.008    0.008 computation.py:704(apply_variable_ufunc)
        1    0.000    0.000    0.008    0.008 xarray.py:631(func)
        1    0.000    0.000    0.008    0.008 Inferface.py:139(__fit_func)
        1    0.000    0.000    0.008    0.008 cryspyV2.py:721(fit_func)
        1    0.000    0.000    0.008    0.008 cryspyV2.py:566(full_callback)
        1    0.000    0.000    0.008    0.008 cryspy.py:637(full_calculate)
        1    0.000    0.000    0.008    0.008 cryspy.py:403(powder_1d_calculate)
        1    0.000    0.000    0.007    0.007 cryspy.py:501(do_calc_setup)
        1    0.000    0.000    0.006    0.006 cryspy.py:900(_do_run)
        1    0.000    0.000    0.006    0.006 rhochi_by_dictionary.py:236(rhochi_calc_chi_sq_by_dictionary)
        1    0.002    0.002    0.006    0.006 rhochi_pd.py:65(calc_chi_sq_for_pd_by_dictionary)
        1    0.000    0.000    0.002    0.002 structure_factor.py:1067(calc_index_hkl_multiplicity_in_range)
        1    0.000    0.000    0.002    0.002 powder_diffraction_const_wavelength.py:195(calc_profile_pseudo_voight)
       10    0.000    0.000    0.001    0.000 property.py:55(__get__)
        1    0.000    0.000    0.001    0.001 powder_diffraction_const_wavelength.py:45(calc_asymmetry_factor)
       10    0.000    0.000    0.001    0.000 property.py:94(makeEntry)
        1    0.000    0.000    0.001    0.001 Point.py:90(calculate)
       50    0.000    0.000    0.001    0.000 map.py:104(_nested_get)
        1    0.000    0.000    0.001    0.001 symmetry_elements.py:271(calc_equivalent_reflections)
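Rather than commenting code out, the background step could be made optional. The following is only a sketch of that idea, not cryspy's API: `total_signal` and `background_func` are hypothetical names, and plain lists are used to keep the example dependency-free (cryspy itself works with numpy arrays).

```python
def total_signal(signal_plus, signal_minus, background_func=None):
    """Sum the two signal components point by point.

    background_func is a hypothetical callable returning the background
    values. When it is None, the background is never computed, so its
    cost is avoided entirely.
    """
    total = [p + m for p, m in zip(signal_plus, signal_minus)]
    if background_func is not None:
        background = background_func()  # the expensive step, only on request
        total = [t + b for t, b in zip(total, background)]
    return total


# Caller adds the background later itself, so the computation is skipped here.
without_background = total_signal([1.0, 2.0], [3.0, 4.0])

# Caller wants the background included.
with_background = total_signal([1.0, 2.0], [3.0, 4.0], lambda: [0.5, 0.5])
```

Passing the background as a callable means the default path never touches it, which matches the case where easydiffraction supplies the background afterwards.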

My profiling script is attached, in case anyone wants to play around with this.
test.py.zip
