The flopy3_modpath7_create_simulation.ipynb notebook is slow, taking 10-15 minutes on CI. The bottleneck seems to be in PathlineFile.get_destination_pathline_data(). Calculating pathline data for the backwards case with river cell locations is slowest. Computing endpoint data is generally fast.
Benchmark
Results from my machine (runtimes in milliseconds):
| Name (time in ms) | Min | Max | Mean | StdDev | Median | IQR | Outliers | OPS | Rounds | Iterations |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| test_get_destination_endpoint_data[well-forward] | 1.1320 (1.0) | 1.6217 (1.02) | 1.1668 (1.0) | 0.0368 (1.20) | 1.1588 (1.0) | 0.0335 (1.38) | 92;34 | 857.0405 (1.0) | 790 | 1 |
| test_get_destination_endpoint_data[river-forward] | 1.2307 (1.09) | 1.5880 (1.0) | 1.2638 (1.08) | 0.0308 (1.0) | 1.2575 (1.09) | 0.0242 (1.0) | 82;62 | 791.2854 (0.92) | 739 | 1 |
| test_get_destination_endpoint_data[well-backward] | 2.1071 (1.86) | 37.6406 (23.70) | 2.2844 (1.96) | 1.8545 (60.22) | 2.1728 (1.88) | 0.0754 (3.11) | 1;9 | 437.7543 (0.51) | 366 | 1 |
| test_get_destination_endpoint_data[river-backward] | 2.1563 (1.90) | 3.0739 (1.94) | 2.2155 (1.90) | 0.0620 (2.01) | 2.2074 (1.90) | 0.0526 (2.17) | 33;16 | 451.3653 (0.53) | 433 | 1 |
| test_get_destination_pathline_data[well-backward] | 26,589.8808 (>1000.0) | 26,652.4284 (>1000.0) | 26,623.4600 (>1000.0) | 25.6051 (831.50) | 26,616.2695 (>1000.0) | 40.1569 (>1000.0) | 2;0 | 0.0376 (0.00) | 5 | 1 |
| test_get_destination_pathline_data[well-forward] | 40,028.3296 (>1000.0) | 40,195.3408 (>1000.0) | 40,104.5977 (>1000.0) | 70.1838 (>1000.0) | 40,077.0743 (>1000.0) | 115.3085 (>1000.0) | 2;0 | 0.0249 (0.00) | 5 | 1 |
| test_get_destination_pathline_data[river-forward] | 61,112.0500 (>1000.0) | 61,440.9841 (>1000.0) | 61,247.5391 (>1000.0) | 132.9419 (>1000.0) | 61,200.5654 (>1000.0) | 201.5238 (>1000.0) | 2;0 | 0.0163 (0.00) | 5 | 1 |
| test_get_destination_pathline_data[river-backward] | 568,730.0336 (>1000.0) | 575,857.0278 (>1000.0) | 571,281.7784 (>1000.0) | 2,803.3254 (>1000.0) | 571,121.0215 (>1000.0) | 3,365.4625 (>1000.0) | 1;0 | 0.0018 (0.00) | 5 | 1 |
Performance profile
Below is a profile of the backward/river case. PathlineFile.get_destination_pathline_data() spends nearly all of its time (about 458 of 461 seconds) sorting numpy arrays:

Calls ordered by total time:
| ncalls | tottime | percall | cumtime | percall | filename:lineno(function) |
| --- | --- | --- | --- | --- | --- |
| 2626 | 457.829 | 0.174 | 457.851 | 0.174 | `{method 'sort' of 'numpy.ndarray' objects}` |
| 2625 | 2.875 | 0.001 | 460.892 | 0.176 | `/Users/wes/dev/flopy/flopy/utils/modpathfile.py:129(get_data)` |
| 2627 | 0.158 | 0.000 | 0.182 | 0.000 | `{built-in method numpy.core._multiarray_umath.implement_array_function}` |
| 2 | 0.025 | 0.013 | 0.025 | 0.013 | `{built-in method numpy.array}` |
| 1 | 0.021 | 0.021 | 0.022 | 0.022 | `/Users/wes/dev/flopy/venv/lib/python3.10/site-packages/numpy/lib/arraysetops.py:523(in1d)` |
| 1 | 0.019 | 0.019 | 460.996 | 460.996 | `/Users/wes/dev/flopy/flopy/utils/modpathfile.py:195(get_destination_data)` |
| 1 | 0.018 | 0.018 | 0.018 | 0.018 | `{method 'copy' of 'numpy.ndarray' objects}` |
| 2625 | 0.017 | 0.000 | 0.022 | 0.000 | `/Users/wes/dev/flopy/venv/lib/python3.10/site-packages/numpy/core/_internal.py:395(_newnames)` |
| 2625 | 0.011 | 0.000 | 460.902 | 0.176 | `/Users/wes/dev/flopy/flopy/utils/modpathfile.py:633(get_data)` |
| 2625 | 0.008 | 0.000 | 0.167 | 0.000 | `<array_function internals>:177(where)` |
| 1 | 0.006 | 0.006 | 460.909 | 460.909 | `/Users/wes/dev/flopy/flopy/utils/modpathfile.py:266()` |
| 5273 | 0.003 | 0.000 | 0.003 | 0.000 | `{built-in method builtins.isinstance}` |
| 5250 | 0.002 | 0.000 | 0.002 | 0.000 | `{method 'remove' of 'list' objects}` |
| 1 | 0.002 | 0.002 | 460.997 | 460.997 | `/Users/wes/dev/flopy/flopy/utils/modpathfile.py:704(get_destination_pathline_data)` |
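For anyone who wants to reproduce a listing like this, here is a minimal sketch using the standard-library `cProfile` and `pstats` modules. The workload `repeated_sort` is a hypothetical stand-in (it is not flopy code), and sorting by `"tottime"` is assumed to correspond to the "ordered by total time" listing above.

```python
import cProfile
import io
import pstats

import numpy as np


def repeated_sort(n_records=20_000, n_calls=20):
    """Hypothetical stand-in workload: re-sort a structured array on every call."""
    rng = np.random.default_rng(0)
    dtype = [("particleid", np.int32), ("time", np.float64)]
    ra = np.zeros(n_records, dtype=dtype)
    ra["particleid"] = rng.integers(0, 100, n_records)
    ra["time"] = rng.random(n_records)
    for _ in range(n_calls):
        shuffled = ra.copy()  # start from the unsorted data each time
        shuffled.sort(order=["particleid", "time"])
    return shuffled


# Collect profile data only around the call of interest.
profiler = cProfile.Profile()
profiler.enable()
result = repeated_sort()
profiler.disable()

# Write the report to a string buffer, ordered by time spent inside each function.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("tottime").print_stats(10)
report = buffer.getvalue()
print(report)
```

In a real run, the call to `repeated_sort()` would be replaced by the flopy call under investigation.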
It looks like the sorting occurs in _ModpathSeries.get_data() (flopy/flopy/utils/modpathfile.py, line 150 in 5d0f585):

    ra.sort(order=["particleid", "time"])
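One possible direction, sketched on synthetic data rather than on flopy itself: if every call re-sorts the same full record array (an assumption, though the roughly 0.17 s per sort across about 2,600 calls in the profile points that way), the sort could be done once and each particle's records taken as a contiguous slice. The field names `particleid` and `time` follow the sort call above. The extra `k` field, the array sizes, and both helper functions are hypothetical.

```python
import numpy as np

# Synthetic stand-in for pathline records (not real MODPATH output).
rng = np.random.default_rng(42)
dtype = [("particleid", np.int32), ("time", np.float64), ("k", np.int32)]
records = np.zeros(5_000, dtype=dtype)
records["particleid"] = rng.integers(0, 50, records.size)
records["time"] = rng.random(records.size)
records["k"] = rng.integers(0, 3, records.size)


def select_sort_each_call(ra, particle_ids):
    """Costly pattern: sort the whole array again for every requested particle."""
    out = []
    for pid in particle_ids:
        tmp = ra.copy()
        tmp.sort(order=["particleid", "time"])
        out.append(tmp[tmp["particleid"] == pid])
    return out


def select_sort_once(ra, particle_ids):
    """Sort once, then slice out each particle's contiguous block."""
    # np.lexsort uses the LAST key as the primary key, so particleid goes last.
    order = np.lexsort((ra["time"], ra["particleid"]))
    srt = ra[order]
    starts = np.searchsorted(srt["particleid"], particle_ids, side="left")
    stops = np.searchsorted(srt["particleid"], particle_ids, side="right")
    return [srt[a:b] for a, b in zip(starts, stops)]


wanted = np.array([0, 7, 49])
slow = select_sort_each_call(records, wanted)
fast = select_sort_once(records, wanted)
```

Both helpers return the same records per particle in time order. The second does a single O(n log n) sort instead of one per requested particle. Whether this maps cleanly onto `get_data()` depends on how its callers use the result, which this sketch does not cover.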