Merged
Commits
56 commits
ca21c11
Preparation work: Refactoring
franzpoeschel Jul 2, 2024
6706b66
Basic compression/filtering in HDF5
franzpoeschel Jul 2, 2024
8280eb5
Configure generic filters via JSON object
franzpoeschel Jul 3, 2024
aecdd88
Full support for the set_filter API
franzpoeschel Jul 3, 2024
2154e8f
Fix: captured structured bindings are a C++20 extension
franzpoeschel Jul 3, 2024
e956128
Refactoring to satisfy the Github bot
franzpoeschel Jul 3, 2024
48d1fea
Fix includes
franzpoeschel Jul 3, 2024
2c28a9b
Switch to JSON config for NVidia compiler's benefit
franzpoeschel Jul 3, 2024
cc76388
Verbose CI debugging lets goo
franzpoeschel Dec 9, 2024
ac30fd7
Revert "Verbose CI debugging lets goo"
franzpoeschel Dec 9, 2024
773bf4a
Use Blosc2 filter
franzpoeschel Jul 23, 2025
853ea34
Add compression example
franzpoeschel Jul 24, 2025
2beab51
Add HDF5-Blosc2 to some Linux workflow
franzpoeschel Jul 29, 2025
0157cf7
Update .github/workflows/dependencies/install_hdf5_blosc2
franzpoeschel Jul 29, 2025
d13708f
Add Python example
franzpoeschel Jul 29, 2025
9ab5eff
Some documentation fixes
franzpoeschel Jul 30, 2025
f0726b4
Fix install_hdf5_blosc2 script
franzpoeschel Jul 30, 2025
d1863e5
Complete examples
franzpoeschel Jul 30, 2025
af60b17
ADIOS2 shorthand: dataset.operators may also be a single element
franzpoeschel Jul 30, 2025
6c23bdd
Fix indentation
franzpoeschel Jul 30, 2025
76adc58
Fix patch URL
franzpoeschel Jul 30, 2025
69468b1
Update documentation and tests for ADIOS2
franzpoeschel Jul 30, 2025
ae32857
Deactivate tests for HDF5-Blosc2
franzpoeschel Jul 30, 2025
ab177c0
Add documentation
franzpoeschel Jul 30, 2025
dfc85e0
Some more consistency in examples
franzpoeschel Jul 30, 2025
2c0d8e4
Install with sudo rights
franzpoeschel Jul 30, 2025
5712034
Erase unnecessary line from example
franzpoeschel Jul 30, 2025
2d3e2d3
Fix datatypes in Python example
franzpoeschel Jul 30, 2025
d10ed1c
Use CMake flag directly...
franzpoeschel Jul 30, 2025
aaf467d
Reset extended write example to dev
franzpoeschel Jul 31, 2025
3a9a1cb
Do we need -L/usr/local/lib ??
franzpoeschel Jul 31, 2025
621598e
Try if HDF5 finds the filter on its own...
franzpoeschel Jul 31, 2025
18dcfcc
Ok that works, so cleanup
franzpoeschel Jul 31, 2025
0f39f62
Explicitly set chunks = "auto"
franzpoeschel Jul 31, 2025
c951c73
CI fixes
franzpoeschel Jul 31, 2025
99e8c3e
Add HDF5-Blosc2 to further CI runs
franzpoeschel Jul 31, 2025
3465eab
Add hdf5plugin to some Python runs
franzpoeschel Jul 31, 2025
f86d511
Skip patch in Clang runs
franzpoeschel Jul 31, 2025
264b255
Fix includes
franzpoeschel Jul 31, 2025
6118a3a
Fixes
franzpoeschel Aug 1, 2025
624e430
Further fixes
franzpoeschel Aug 1, 2025
4300173
Remove blosc filter from some runs again
franzpoeschel Aug 1, 2025
d9c51b7
Add missing dataset definition
franzpoeschel Aug 8, 2025
dfd1924
Pull the Blosc2 stuff down in the example file
franzpoeschel Sep 3, 2025
ab45489
Ditch self-compiled Blosc2 plugin, use hdf5plugin package
franzpoeschel Sep 18, 2025
a93e87e
CI fixes
franzpoeschel Sep 18, 2025
60db3b9
Try installing the deb package for h5pl...
franzpoeschel Sep 23, 2025
d3d9d4f
tmp: check if python example for hdf5+blosc2 runs
franzpoeschel Sep 23, 2025
a0ed3a6
fixes
franzpoeschel Sep 23, 2025
eb98995
Move hdf5plugin Python tests to other runs
franzpoeschel Sep 23, 2025
6c6297c
Revert "tmp: check if python example for hdf5+blosc2 runs"
franzpoeschel Sep 23, 2025
e2dd692
....
franzpoeschel Sep 24, 2025
5f766ce
...
franzpoeschel Sep 25, 2025
4a05428
Install hdf5plugin into venv
franzpoeschel Sep 25, 2025
325ee1f
Remove CI debugging
franzpoeschel Sep 25, 2025
deada72
Cleanup
franzpoeschel Sep 26, 2025
11 changes: 11 additions & 0 deletions .github/workflows/dependencies/install_hdf5_plugins
@@ -0,0 +1,11 @@
#!/usr/bin/env bash

version_major=1.14
version_minor=6
build_var=ubuntu-2404_gcc

cd /opt
wget "https://github.com/HDFGroup/hdf5_plugins/releases/download/hdf5-${version_major}.${version_minor}/hdf5_plugins-${version_major}-${build_var}.deb" >&2
sudo dpkg -i "hdf5_plugins-${version_major}-${build_var}.deb" >&2
rm "hdf5_plugins-${version_major}-${build_var}.deb"
echo "/HDF_Group/HDF5/${version_major}.${version_minor}/lib/plugin/"
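The plugin directory this script echoes can also be reconstructed programmatically. A minimal Python sketch (the helper name is illustrative, not part of openPMD-api) mirroring the script's variables:

```python
import os


def hdf5_plugin_path(version_major: str = "1.14", version_minor: str = "6") -> str:
    # Mirrors the directory echoed by install_hdf5_plugins: the hdf5_plugins
    # deb package installs its filter libraries below /HDF_Group.
    return f"/HDF_Group/HDF5/{version_major}.{version_minor}/lib/plugin/"


# Callers export this as HDF5_PLUGIN_PATH so HDF5 can dynamically load filters.
os.environ.setdefault("HDF5_PLUGIN_PATH", hdf5_plugin_path())
```

The CI workflows below consume exactly this output via `export HDF5_PLUGIN_PATH="$(sudo -E .github/workflows/dependencies/install_hdf5_plugins)"`.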
22 changes: 18 additions & 4 deletions .github/workflows/linux.yml
@@ -97,9 +97,12 @@ jobs:
sudo apt-get update
sudo apt-get install clang-11 gfortran libopenmpi-dev python3
sudo .github/workflows/dependencies/install_spack

- name: Build
env: {CC: clang-11, CXX: clang++-11, CXXFLAGS: -Werror}
run: |
# Use this to make the HDF5 plugins available from the C/C++ API.
export HDF5_PLUGIN_PATH="$(sudo -E .github/workflows/dependencies/install_hdf5_plugins)"
sudo ln -s "$(which cmake)" /usr/bin/cmake
eval $(spack env activate --sh .github/ci/spack-envs/clang11_nopy_ompi_h5_ad2/)
spack install
@@ -172,16 +175,20 @@ jobs:
run: |
sudo apt-get update
sudo apt-get remove openmpi* libopenmpi* *hdf5* || true
-sudo apt-get install g++ gfortran python3
+sudo apt-get install g++ gfortran python3 python3-venv

sudo .github/workflows/dependencies/install_spack


# Need to build this manually due to broken MPICH package in Ubuntu 24.04
# https://bugs.launchpad.net/ubuntu/+source/mpich/+bug/2072338
sudo .github/workflows/dependencies/install_mpich

- name: Build
env: {CC: gcc, CXX: g++, MPICH_CC: gcc, MPICH_CXX: g++, CXXFLAGS: -Werror}
run: |
# Use this to make the HDF5 plugins available from the C/C++ API.
export HDF5_PLUGIN_PATH="$(sudo -E .github/workflows/dependencies/install_hdf5_plugins)"
cmake --version
mpiexec --version
mpicxx --version
@@ -190,9 +197,13 @@
eval $(spack env activate --sh .github/ci/spack-envs/gcc13_py312_mpich_h5_ad2/)
spack install

python -m venv venv
source venv/bin/activate
pip install mpi4py numpy hdf5plugin

share/openPMD/download_samples.sh build
cmake -S . -B build \
--DopenPMD_USE_PYTHON=OFF \
+-DopenPMD_USE_PYTHON=ON \
-DopenPMD_USE_MPI=ON \
-DopenPMD_USE_HDF5=ON \
-DopenPMD_USE_ADIOS2=ON \
@@ -238,6 +249,8 @@ jobs:
- name: Build
env: {CC: gcc-12, CXX: g++-12, CXXFLAGS: -Werror}
run: |
# Use this to make the HDF5 plugins available from the C/C++ API.
export HDF5_PLUGIN_PATH="$(sudo -E .github/workflows/dependencies/install_hdf5_plugins)"
sudo ln -s "$(which cmake)" /usr/bin/cmake
eval $(spack env activate --sh .github/ci/spack-envs/gcc12_py36_ompi_h5_ad2/)
spack install
@@ -248,7 +261,8 @@
-DopenPMD_USE_MPI=ON \
-DopenPMD_USE_HDF5=ON \
-DopenPMD_USE_ADIOS2=ON \
--DopenPMD_USE_INVASIVE_TESTS=ON
+-DopenPMD_USE_INVASIVE_TESTS=ON \
+-DCMAKE_VERBOSE_MAKEFILE=ON
cmake --build build --parallel 4
ctest --test-dir build --output-on-failure

@@ -261,6 +275,7 @@
run: |
sudo apt-get update
sudo apt-get install g++ libopenmpi-dev libhdf5-openmpi-dev python3 python3-numpy python3-mpi4py python3-pandas python3-h5py-mpi python3-pip
+python3 -m pip install jsonschema==4.* referencing
# TODO ADIOS2
- name: Build
env: {CXXFLAGS: -Werror, PKG_CONFIG_PATH: /usr/lib/x86_64-linux-gnu/pkgconfig}
@@ -278,7 +293,6 @@
cmake --build build --parallel 4
ctest --test-dir build --output-on-failure

-python3 -m pip install jsonschema==4.* referencing
cd share/openPMD/json_schema
PATH="../../../build/bin:$PATH" make -j 2
# We need to exclude the thetaMode example since that has a different
10 changes: 10 additions & 0 deletions .github/workflows/tooling.yml
@@ -22,6 +22,11 @@ jobs:
sudo apt-get install clang clang-tidy gfortran libopenmpi-dev python-is-python3
SPACK_VER=1.0.1 sudo -E .github/workflows/dependencies/install_spack
echo "SPACK VERSION: $(spack --version)"

# Use this to make the HDF5 plugins available from the C/C++ API.
export HDF5_PLUGIN_PATH="$(sudo -E .github/workflows/dependencies/install_hdf5_plugins)"
echo "$HDF5_PLUGIN_PATH"
ls "$HDF5_PLUGIN_PATH"
- name: Build
env: {CC: clang, CXX: clang++}
run: |
@@ -52,6 +57,11 @@ jobs:
sudo apt-get install clang-19 libc++-dev libc++abi-dev python3 gfortran libopenmpi-dev python3-numpy
SPACK_VER=1.0.1 sudo -E .github/workflows/dependencies/install_spack
echo "SPACK VERSION: $(spack --version)"

# Use this to make the HDF5 plugins available from the C/C++ API.
export HDF5_PLUGIN_PATH="$(sudo -E .github/workflows/dependencies/install_hdf5_plugins)"
echo "$HDF5_PLUGIN_PATH"
ls "$HDF5_PLUGIN_PATH"
- name: Build
env: {CC: mpicc, CXX: mpic++, OMPI_CC: clang-19, OMPI_CXX: clang++-19, CXXFLAGS: -Werror, OPENPMD_HDF5_CHUNKS: none, OPENPMD_TEST_NFILES_MAX: 100}
run: |
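The `echo`/`ls` debugging steps above can be expressed as a small, self-contained Python helper (hypothetical, for illustration only) that reports whatever plugin files the exported path contains:

```python
import os


def discover_plugins(env: dict) -> list:
    # Returns the plugin files below HDF5_PLUGIN_PATH, or an empty list when
    # the variable is unset or points at a missing directory -- the same
    # information the `ls "$HDF5_PLUGIN_PATH"` step above prints in CI.
    path = env.get("HDF5_PLUGIN_PATH", "")
    return sorted(os.listdir(path)) if os.path.isdir(path) else []
```

In the workflows above, an empty result would indicate that the deb package did not install its filter libraries where the script expects them.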
2 changes: 2 additions & 0 deletions CMakeLists.txt
@@ -718,6 +718,7 @@ set(openPMD_EXAMPLE_NAMES
12_span_write
13_write_dynamic_configuration
14_toml_template
15_compression
)
set(openPMD_PYTHON_EXAMPLE_NAMES
2_read_serial
@@ -734,6 +735,7 @@ set(openPMD_PYTHON_EXAMPLE_NAMES
11_particle_dataframe
12_span_write
13_write_dynamic_configuration
15_compression
)

if(openPMD_USE_INVASIVE_TESTS)
13 changes: 13 additions & 0 deletions docs/source/backends/hdf5.rst
@@ -25,6 +25,19 @@ Virtual file drivers are configured via JSON/TOML.
Refer to the page on :ref:`JSON/TOML configuration <backendconfig-hdf5>` for further details.


Filters (compression)
*********************

HDF5 supports so-called filters for transformations such as compression on datasets.
Filters can be permanent (applied to an entire dataset) or transient (applied to individual I/O operations).
The openPMD-api currently supports permanent filters.
Pipelines of multiple subsequent filters are supported.
Refer also to the `HDF5 documentation on filters <https://web.ics.purdue.edu/~aai/HDF5/html/Filters.html>`_.

Filters are applied via :ref:`JSON/TOML configuration <backendconfig-hdf5>`; refer to that page for detailed instructions.
Extended examples showing how to apply compression options with ADIOS2 and HDF5 are available in `Python <https://github.com/openPMD/openPMD-api/blob/dev/examples/15_compression.py>`_ and `C++ <https://github.com/openPMD/openPMD-api/blob/dev/examples/15_compression.cpp>`_.


Backend-Specific Controls
-------------------------

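The filter pipeline described in the documentation above is specified as part of a dataset's JSON/TOML configuration. A minimal sketch, using the key path as spelled in the backend-configuration page of this PR and an illustrative shuffle-then-zlib pipeline:

```python
import json

# A dataset configuration enabling a two-stage permanent filter pipeline:
# the builtin shuffle filter followed by zlib compression (aggression 5).
dataset_config = {
    "hdf5": {
        "datasets": {
            "permanent_filters": [
                {"type": "by_id", "id": "shuffle"},
                {"type": "zlib", "aggression": 5},
            ]
        }
    }
}

# openPMD-api accepts such configurations as JSON strings,
# e.g. passed alongside resetDataset()/reset_dataset().
config_string = json.dumps(dataset_config)
```

The filters run in list order on write, so reordering the pipeline changes how the data is transformed on disk.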
22 changes: 20 additions & 2 deletions docs/source/details/backendconfig.rst
@@ -185,8 +185,8 @@ Explanation of the single keys:
Additionally, specifying ``"disk_override"``, ``"buffer_override"`` or ``"new_step_override"`` will take precedence over options specified without the ``_override`` suffix, allowing users to invert the normal precedence order.
This way, a data producing code can hardcode the preferred flush target per ``flush()`` call, but users can e.g. still entirely deactivate flushing to disk in the ``Series`` constructor by specifying ``preferred_flush_target = buffer_override``.
This is useful when applying the asynchronous IO capabilities of the BP5 engine.
-* ``adios2.dataset.operators``: This key contains a list of ADIOS2 `operators <https://adios2.readthedocs.io/en/latest/components/components.html#operator>`_, used to enable compression or dataset transformations.
-  Each object in the list has two keys:
+* ``adios2.dataset.operators``: This key contains either a single ADIOS2 `operator <https://adios2.readthedocs.io/en/latest/components/components.html#operator>`_ or a list of operators, used to enable compression or dataset transformations.
+  Each operator is an object with two keys:

* ``type`` supported ADIOS operator type, e.g. zfp, sz
* ``parameters`` is an associative map of string parameters for the operator (e.g. compression levels)
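The two accepted spellings from the documentation change above can be sketched side by side (zlib with ``clevel`` 9, matching the examples elsewhere in this PR):

```python
# adios2.dataset.operators accepts a single operator object...
single = {"adios2": {"dataset": {"operators": {
    "type": "zlib",
    "parameters": {"clevel": 9},
}}}}

# ...or, equivalently for one operator, a list of operators.
as_list = {"adios2": {"dataset": {"operators": [{
    "type": "zlib",
    "parameters": {"clevel": 9},
}]}}}
```

For pipelines of several operators, only the list form applies; the shorthand exists for the common single-operator case.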
@@ -247,6 +247,24 @@ Explanation of the single keys:
An explicit chunk size can be specified as a list of positive integers, e.g. ``hdf5.dataset.chunks = [10, 100]``. Note that this specification should only be used per-dataset, e.g. in ``resetDataset()``/``reset_dataset()``.

Chunking generally improves performance and only needs to be disabled in corner-cases, e.g. when heavily relying on independent, parallel I/O that non-collectively declares data records.
* ``hdf5.datasets.permanent_filters``: Either a single HDF5 permanent filter specification or a list of HDF5 permanent filter specifications.
Each filter specification is a JSON/TOML object, but there are multiple options:

* Zlib: The Zlib filter has a distinct API in HDF5, so its configuration in openPMD differs from that of other filters. It is activated by the mandatory key ``type = "zlib"`` and configured by the optional integer key ``aggression``.
Example: ``{"type": "zlib", "aggression": 5}``.
* Filters identified by their global ID `registered with the HDF Group <https://github.com/HDFGroup/hdf5_plugins/blob/master/docs/RegisteredFilterPlugins.md>`_.
They are activated by the mandatory integer key ``id`` containing this global ID.
All other keys are optional:

* ``type = "by_id"`` may optionally be specified for clarity and consistency.
* The string key ``flags`` can take the values ``"mandatory"`` or ``"optional"``, indicating whether HDF5 should abort execution if the filter cannot be applied.
* The key ``cd_values`` points to a list of nonnegative integers.
These are filter-specific configuration options.
Refer to the specific filter's documentation.

Instead of an integer ID, the key ``id`` may also be a string identifying one of the six builtin filters of HDF5: ``"deflate"``, ``"shuffle"``, ``"fletcher32"``, ``"szip"``, ``"nbit"``, ``"scaleoffset"``.


* ``hdf5.vfd.type`` selects the HDF5 virtual file driver.
Currently available are:

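A by-ID filter specification as described above can be sketched as follows. The numeric ID shown is the registry entry commonly listed for Blosc2; treat it as an assumption and verify it against the registered-plugins list before use:

```python
# A permanent filter referenced through its globally registered ID.
blosc2_filter = {
    "type": "by_id",     # optional, for clarity and consistency
    "id": 32026,         # assumed Blosc2 registry ID -- verify before use
    "flags": "optional", # do not abort if the plugin is unavailable
    "cd_values": [],     # filter-specific options; empty keeps the defaults
}

# Builtin filters may instead be named by string:
shuffle_filter = {"type": "by_id", "id": "shuffle"}
```

Marking a plugin-provided filter as ``"optional"`` lets writes succeed on systems where `HDF5_PLUGIN_PATH` is not set, at the cost of silently skipping compression there.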
13 changes: 6 additions & 7 deletions examples/13_write_dynamic_configuration.cpp
@@ -47,6 +47,7 @@ type = "bp4"

# ADIOS2 allows adding several operators
# Lists are given in TOML by using double brackets
# For specifying a single operator only, the list may be skipped.
[[adios2.dataset.operators]]
type = "zlib"

@@ -192,14 +193,12 @@
"resizable": true,
"adios2": {
"dataset": {
-            "operators": [
-                {
-                    "type": "zlib",
-                    "parameters": {
-                        "clevel": 9
-                    }
-                }
-            ]
+            "operators": {
+                "type": "zlib",
+                "parameters": {
+                    "clevel": 9
+                }
+            }
        }
    }
})END";
5 changes: 3 additions & 2 deletions examples/13_write_dynamic_configuration.py
@@ -31,6 +31,7 @@

# ADIOS2 allows adding several operators
# Lists are given in TOML by using double brackets
# For specifying a single operator only, the list may be skipped.
[[adios2.dataset.operators]]
type = "zlib"

@@ -106,12 +107,12 @@ def main():
}
}
config['adios2']['dataset'] = {
-        'operators': [{
+        'operators': {
             'type': 'zlib',
             'parameters': {
                 'clevel': 9
             }
-        }]
+        }
}

temperature = iteration.meshes["temperature"]
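The shorthand this PR adds (a single operator object in place of a one-element list) amounts to a normalization like the following illustrative helper, which is not openPMD-api API:

```python
def normalize_operators(operators):
    # A single operator object is treated as a one-element operator list,
    # mirroring the adios2.dataset.operators shorthand introduced in this PR.
    return operators if isinstance(operators, list) else [operators]
```

Accepting both spellings keeps simple configurations terse while the list form remains available for multi-operator pipelines.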