     Test grdlandmask with no set outgrid.
     """
-    result = grdlandmask(spacing=1, region=[125, 130, 30, 35])
+    result = grdlandmask(spacing=1, region=[125, 130, 30, 35], cores=2)
The Linux runner for GitHub Actions has 2 CPU cores according to https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources, but not sure if we should set cores to 2 or 1 for consistent benchmarking.
Using two cores is faster and can help check the OpenMP feature in GMT, so I prefer to use two cores if possible.
BTW, the macOS runner for GitHub Actions has 3 cores, so instead of cores=2, cores=True will use all available cores.
Is only 1 core being used if we don't set cores, or use cores=None? I'm reading https://docs.generic-mapping-tools.org/6.4/gmt.html#core-full which says [default is to use all available cores], but is that only with cores=True (i.e. -x in GMT)?
The benchmark.yml workflow is running on Linux, but I guess we could set cores=True since this test would also run on macOS in ci_test.yml. My question was more around whether multi-threading with multiple-cores (2 or more) would provide a consistent benchmark compared to using only 1 core, but I guess we need to merge this PR to find out!
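To keep the core count fixed across runners with different hardware, one option is to cap the requested value by what the machine reports. This is only a sketch: `benchmark_cores` is a hypothetical helper, not part of PyGMT, and `os.cpu_count()` reports logical CPUs.

```python
import os


def benchmark_cores(requested: int = 2) -> int:
    """Return a fixed core count, capped by the logical CPUs on this machine.

    Hypothetical helper for illustration; not a PyGMT function.
    """
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(requested, available))


# The result could be passed as cores=... to grdlandmask in the test.
print(benchmark_cores(2))
```

With this, a 2-core Linux runner and a 3-core macOS runner would both use 2 cores, while a single-core machine would fall back to 1.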
> Is only 1 core being used if we don't set cores, or use cores=None? I'm reading https://docs.generic-mapping-tools.org/6.4/gmt.html#core-full which says [default is to use all available cores], but is that only with cores=True (i.e. -x in GMT)?
Yes.
> > Is only 1 core being used if we don't set cores, or use cores=None? I'm reading https://docs.generic-mapping-tools.org/6.4/gmt.html#core-full which says [default is to use all available cores], but is that only with cores=True (i.e. -x in GMT)?
> Yes.
Actually, it uses all available cores if -x is not used.
$ gmt grdlandmask -R0/10/0/10 -Gabc.nc -I1/1 -Vi
grdlandmask [INFORMATION]: Enable all available threads (up to xxx)
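For reference, here is a rough sketch of how the `cores` values discussed above would correspond to the GMT `-x` option, assuming the usual PyGMT alias convention (an unset alias adds no option, `True` adds the bare flag, and a number is appended to the flag). `cores_to_option` is a hypothetical helper for illustration only, not a PyGMT function.

```python
def cores_to_option(cores=None) -> str:
    """Return the GMT option string that a given cores value is assumed to produce.

    Hypothetical helper; the mapping is an assumption based on how PyGMT
    aliases are usually converted to GMT options.
    """
    if cores is None or cores is False:
        # No -x at all: GMT then enables all available threads (see the log above).
        return ""
    if cores is True:
        # Bare -x, no explicit limit.
        return "-x"
    # Explicit limit, e.g. cores=2 becomes -x2.
    return f"-x{int(cores)}"


for value in (None, True, 1, 2):
    print(repr(value), "->", repr(cores_to_option(value)))
```

Under this assumption, leaving `cores` unset and passing `cores=True` both end up using all available cores, so only an explicit integer actually limits the thread count.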
> Maybe set cores=1 and see what the flame graph diff shows?
The test_grdlandmask_no_outgrid benchmark speedup went from 3x to 9.6x 🤣 It might be that there is some overhead with using threading (2 cores) vs no threading (1 core). Here's the flame graph diff from https://codspeed.io/GenericMappingTools/pygmt/branches/limit-cores again:
The extra gomp_team_barrier_wait_* calls are still there, but also some other stuff (in blue).
This reminds me of the warning message I saw when running the tests in random order (see https://github.com/GenericMappingTools/pygmt/actions/runs/7396374248/job/20121388367?pr=2936):
pygmt/tests/test_session_management.py::test_session_multiprocessing
pygmt/tests/test_session_management.py::test_session_multiprocessing
/home/runner/micromamba/envs/pygmt/lib/python3.12/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=2433) is multi-threaded, use of fork() may lead to deadlocks in the child.
self.pid = os.fork()
There are some related discussions at https://discuss.python.org/t/concerns-regarding-deprecation-of-fork-with-alive-threads/33555 and related PR at python/cpython#100228.
> os.fork() in a multi-threaded application is a likely source of deadlocks on many platforms. We should raise a warning when people call os.fork() from a process that we know has other threads running.
Not sure if it's related to the issue here.
> pygmt/tests/test_session_management.py::test_session_multiprocessing
> pygmt/tests/test_session_management.py::test_session_multiprocessing
> /home/runner/micromamba/envs/pygmt/lib/python3.12/multiprocessing/popen_fork.py:66: DeprecationWarning: This process (pid=2433) is multi-threaded, use of fork() may lead to deadlocks in the child.
> self.pid = os.fork()
That test_session_multiprocessing function is only running basemap and savefig right? It shouldn't have any multi-threading going on, so not sure what's happening.
I tried running some benchmarks with cores ranging from -1 (all cores) through 0 (no multithreading) to 16 (all cores on my laptop), using grdlandmask with a spacing of 0.05 arc degrees. The results vary between runs, but in general, more cores means less time:
Code to reproduce:
import time
import pandas as pd
import pygmt
import tqdm
# Time one grdlandmask call for each cores value from -1 to 16 (inclusive)
timings = {}
for cores in tqdm.tqdm(range(-1, 17)):
    tic = time.perf_counter()
    pygmt.grdlandmask(spacing="0.05d", region=[0, 180, 0, 90], cores=cores)
    toc = time.perf_counter()
    timings[cores] = toc - tic
df = pd.DataFrame.from_dict(data=timings, orient="index", columns=["time"])
# %%
# Plot elapsed time against the cores value
fig = pygmt.Figure()
region = pygmt.info(data=df.reset_index(), per_column=True, spacing=0.05)
fig.plot(
    x=df.index,
    y=df.time,
    region=region,
    pen="thick",
    frame=[
        "WSne+tBenchmark grdlandmask",
        "xaf+lCores",
        "yaf+lTime (s), lower is better",
    ],
)
fig.savefig("benchmark_grdlandmask.png")
fig.show()
If I change the spacing from 0.05 arc degrees to 1 arc degree, more cores actually take more time:
For the test_grdlandmask_no_outgrid unit test, which uses spacing=1 (i.e. 1 arc degree), I think we should just use cores=2, since cores=3 might not be much faster at such a low resolution. As long as we benchmark with a fixed core count, it should be ok?
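Since the timings vary between runs, repeating each measurement and taking the median may give a steadier comparison between core counts. Below is a minimal sketch using only the standard library; `median_runtime` and `workload` are hypothetical names, and the workload is a stand-in for a grdlandmask call with a fixed cores value.

```python
import statistics
import time


def median_runtime(func, repeats: int = 5) -> float:
    """Return the median wall-clock time in seconds over several calls to func()."""
    samples = []
    for _ in range(repeats):
        tic = time.perf_counter()
        func()
        toc = time.perf_counter()
        samples.append(toc - tic)
    return statistics.median(samples)


def workload():
    # Stand-in for the real call, e.g. grdlandmask with a fixed cores value.
    sum(i * i for i in range(100_000))


print(f"median runtime: {median_runtime(workload):.6f} s")
```

The median is less sensitive than the mean to a single slow run, which seems useful on shared CI runners.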
CodSpeed Performance Report
Merging #2944 will improve performances by ×3
Comparing
Summary
Benchmarks breakdown
The benchmark report is quite confusing. It seems that |
Yeah, it seems to confirm the hypothesis in #2942 (comment) though. Does the setting affect |
pygmt/tests/test_grdlandmask.py
Outdated
     Test grdlandmask with no set outgrid.
     """
-    result = grdlandmask(spacing=1, region=[125, 130, 30, 35])
+    result = grdlandmask(spacing=1, region=[125, 130, 30, 35], cores=1, verbose="i")
Hmm, the debug info at https://github.com/GenericMappingTools/pygmt/actions/runs/7403072686/job/20142175894?pr=2944#step:8:111 is showing that there are 4 threads on GitHub Actions??
grdlandmask [INFORMATION]: Enable 1 threads of 4 available
grdlandmask [INFORMATION]: Selected ice front line as Antarctica coastline
grdlandmask [INFORMATION]: GSHHG version 2.3.7
grdlandmask [INFORMATION]: Derived from World Vector Shoreline, CIA WDB-II, and Atlas of the Cryosphere
grdlandmask [INFORMATION]: Processed by Paul Wessel and Walter H. F. Smith, 1994-2017
grdlandmask [INFORMATION]: Nodes in the oceans will be set to
0
grdlandmask [INFORMATION]: Nodes on land will be set to
1
grdlandmask [INFORMATION]: Nodes in lakes will be set to
0
grdlandmask [INFORMATION]: Nodes in islands will be set to
1
grdlandmask [INFORMATION]: Nodes in ponds will be set to
0
grdlandmask [INFORMATION]: Apply linear scale to avoid wrap-around: -Jx100id; this temporarily changes the domain
grdlandmask [INFORMATION]: Writing grid to file /tmp/pygmt-0c0skk20.nc
grdlandmask [INFORMATION]: Level 0 set for 33 nodes
grdlandmask [INFORMATION]: Level 1 set for 3 nodes
grdlandmask [INFORMATION]: Enable 1 threads of 4 available
grdlandmask [INFORMATION]: Selected ice front line as Antarctica coastline
grdlandmask [INFORMATION]: GSHHG version 2.3.7
grdlandmask [INFORMATION]: Derived from World Vector Shoreline, CIA WDB-II, and Atlas of the Cryosphere
grdlandmask [INFORMATION]: Processed by Paul Wessel and Walter H. F. Smith, 1994-2017
grdlandmask [INFORMATION]: Nodes in the oceans will be set to
0
grdlandmask [INFORMATION]: Nodes on land will be set to
1
grdlandmask [INFORMATION]: Nodes in lakes will be set to
0
grdlandmask [INFORMATION]: Nodes in islands will be set to
1
grdlandmask [INFORMATION]: Nodes in ponds will be set to
0
grdlandmask [INFORMATION]: Apply linear scale to avoid wrap-around: -Jx100id; this temporarily changes the domain
grdlandmask [INFORMATION]: Writing grid to file /tmp/pygmt-i3n6zkcg.nc
grdlandmask [INFORMATION]: Level 0 set for 33 nodes
grdlandmask [INFORMATION]: Level 1 set for 3 nodes
So the 2 CPUs mentioned at https://docs.github.com/en/actions/using-github-hosted-runners/about-github-hosted-runners/about-github-hosted-runners#supported-runners-and-hardware-resources might have hyperthreading, and we could potentially set cores=4, though see #2944 (comment) on how more cores might not actually improve runtime.
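To check the thread counts without reading the log by eye, the "Enable N threads of M available" line could be parsed from the captured verbose output. This is a sketch that assumes the message format shown in the log above; `parse_thread_info` is a hypothetical helper.

```python
import re

# Matches the verbose line quoted in the log above, e.g.
# "grdlandmask [INFORMATION]: Enable 1 threads of 4 available"
THREADS_PATTERN = re.compile(r"Enable (\d+) threads of (\d+) available")


def parse_thread_info(log_text: str):
    """Return (enabled, available) thread counts, or None if the line is absent."""
    match = THREADS_PATTERN.search(log_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


log_line = "grdlandmask [INFORMATION]: Enable 1 threads of 4 available"
print(parse_thread_info(log_line))
```

The "Enable all available threads (up to xxx)" message printed when -x is not given has a different wording, so this pattern returns None for it.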



Description of proposed changes
Add alias cores (-x) for grdlandmask, and set limit of cores=2 for test_grdlandmask_no_outgrid to make benchmark execution deviate less.
Preview at https://pygmt-dev--2944.org.readthedocs.build/en/2944/api/generated/pygmt.grdlandmask.html
Extends #1423, may address #2942 partially.
Reminders
- make format and make check to make sure the code follows the style guide.
- doc/api/index.rst.
Slash Commands
You can write slash commands (/command) in the first line of a comment to perform specific operations. Supported slash commands are:
- /format: automatically format and lint the code
- /test-gmt-dev: run full tests on the latest GMT development version