From 92ba30bee519bc738a49fce0f337fa9e510f4eb8 Mon Sep 17 00:00:00 2001
From: Geoffroy Lesur <geoffroy.lesur@univ-grenoble-alpes.fr>
Date: Sat, 18 Oct 2025 16:27:02 +0200
Subject: [PATCH 1/3] add documentation for CI add fix proposed to issue with
 mpi IO in FAQs

---
 doc/source/faq.rst              |   9 ++
 doc/source/index.rst            |   1 +
 doc/source/testing.rst          | 117 +++++++++++++++++++++
 doc/source/testing/idfxTest.rst | 173 ++++++++++++++++++++++++++++++++
 4 files changed, 300 insertions(+)
 create mode 100644 doc/source/testing.rst
 create mode 100644 doc/source/testing/idfxTest.rst

diff --git a/doc/source/faq.rst b/doc/source/faq.rst
index 4c7bc056..198d7f54 100644
--- a/doc/source/faq.rst
+++ b/doc/source/faq.rst
@@ -61,6 +61,15 @@ How can I stop the code without loosing the current calculation?
 I'm doing performance measures. How do I disable all outputs in *Idefix*?
   Add ``-nowrite`` when you call *Idefix* executable.
 
+VTK output appears corrupted when running with MPI (OpenMPI)
+    Some OpenMPI configurations (notably OpenMPI 4 with the ompio component) can produce corrupted VTK/VTU output when running with MPI enabled. This appears to be caused by bugs in OpenMPI's ompio I/O component.
+    Disable ompio so OpenMPI falls back to ROMIO (MPICH's MPI-IO), which is typically more stable:
+
+    .. code-block:: console
+
+       mpirun --mca io ^ompio ...
+
+    This has resolved intermittent corruption for several users. See issue #348 for discussion and reports.
 
 Developement
 ------------
diff --git a/doc/source/index.rst b/doc/source/index.rst
index 411d73d6..d36ab39b 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -122,6 +122,7 @@ The Idefix collaboration benefited from funding from the “Programme National d
    reference
    modules
    programmingguide
+   testing
    performances
    kokkos
    contributing
diff --git a/doc/source/testing.rst b/doc/source/testing.rst
new file mode 100644
index 00000000..44ff88d0
--- /dev/null
+++ b/doc/source/testing.rst
@@ -0,0 +1,117 @@
+Continuous Integration (CI) tests
+=================================
+
+This document describes the GitHub Actions continuous-integration setup used to run the Idefix
+test-suite. The CI is implemented by two workflows checked in under ``.github/workflows``:
+
+- .github/workflows/idefix-ci.yml
+- .github/workflows/idefix-ci-jobs.yml
+
+Overview
+--------
+
+The CI is split into two layers:
+
+- A top-level workflow (.github/workflows/idefix-ci.yml) that:
+
+  - runs a Linter job (pre-commit) on push / PR / manual dispatch,
+  - then calls a reusable workflow for different compiler/backends (intel, gcc, cuda)
+    providing two inputs: TESTME_OPTIONS and IDEFIX_COMPILER.
+
+- A reusable workflow (.github/workflows/idefix-ci-jobs.yml) that:
+
+  - defines the actual test jobs grouped by physics domain (ShocksHydro, ParabolicHydro,
+    ShocksMHD, ParabolicMHD, Fargo, ShearingBox, SelfGravity, Planet, Dust, Braginskii,
+    Examples, Utils),
+  - runs test scripts on self-hosted runners,
+  - expects the repository to be checked out with submodules,
+  - invokes the repository-provided CI helper scripts to configure / build / run tests.
+
+Key configuration points
+------------------------
+
+- Inputs passed from the top-level workflow:
+
+  - TESTME_OPTIONS (string): flags forwarded to the per-test runner (examples: -cuda, -Werror,
+    -intel, -all).
+  - IDEFIX_COMPILER (string): which compiler the tests should use (e.g. icc, gcc, nvcc).
+
+- Environment variables set by the reusable workflow:
+
+  - IDEFIX_COMPILER, TESTME_OPTIONS, PYTHONPATH, IDEFIX_DIR
+
+- Linter job:
+
+  - Runs only when the repository is the main project (not arbitrary forks).
+  - Uses actions/setup-python and runs pre-commit (pre-commit/action@v3 and pre-commit-ci/lite).
+  - Prevents regressions in style and common mistakes before running heavy test jobs.
+
+- Test execution:
+
+  - All test jobs call the repository script ``scripts/ci/run-tests`` with a test directory
+    and the TESTME_OPTIONS flags. Example invocation (from the workflows):
+    ``scripts/ci/run-tests $IDEFIX_DIR/test/HD/sod -all $TESTME_OPTIONS``
+
+  - The reusable workflow is written to execute many test directories in separate job steps,
+    so each physics group is kept logically separated in CI logs.
+
+Runners and prerequisites
+-------------------------
+
+- The heavy numerical tests run on self-hosted runners (see runs-on: self-hosted).
+  The CI assumes appropriate hardware and dependencies are available on those runners
+  (compilers, MPI, GPUs when CUDA/HIP flags are used, required system libraries).
+
+- The workflows check out the repository and its submodules. Submodules must be available
+  on the CI machines.
+
+How tests are driven (testme scripts)
+-------------------------------------
+
+Each test directory contains a small Python driver script (``testme.py``) that uses the helper Python
+class documented in the repository:
+
+- See the test helper documentation: :doc:`idfxTest <testing/idfxTest>`
+
+That helper (idfxTest) is responsible for:
+
+- parsing TESTME_OPTIONS-like flags (precision, MPI, CUDA, reconstruction, vector potential, etc.),
+- calling configure / compile / run,
+- performing standard python checks and non-regression (RMSE) comparisons against
+  reference dumps,
+- optionally creating / updating reference dumps (init mode).
+
+Practical examples
+------------------
+
+- Example of a CI invocation (triggered by workflows):
+
+  - The top-level workflow calls the reusable jobs workflow for each compiler/back-end, e.g.
+    ``TESTME_OPTIONS="-cuda -Werror"`` with ``IDEFIX_COMPILER=nvcc``.
+
+- Running tests locally (developer machine):
+  you can mimic what CI does by calling the repository helper script directly, for example
+  ``scripts/ci/run-tests /path/to/idefix/test/HD/sod -all -mpi -dec 2 2 -reconstruction 3 -single``
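+
+When several test directories need to be run locally, the command line can be assembled programmatically. The following is a minimal sketch, not part of the repository: the helper name ``build_run_tests_command`` is hypothetical, and it assumes that ``scripts/ci/run-tests`` and the ``test`` directory both live under the Idefix root directory.
+
+.. code-block:: python
+
+   import shlex
+   from pathlib import PurePosixPath
+
+
+   def build_run_tests_command(idefix_dir, test_subdir, options=""):
+       """Return the argument list for one run-tests invocation (hypothetical helper)."""
+       root = PurePosixPath(idefix_dir)
+       script = root / "scripts" / "ci" / "run-tests"   # assumed location of the helper script
+       test_dir = root / "test" / test_subdir           # e.g. test/HD/sod
+       # "-all" mirrors the CI invocation; extra flags are split like a shell would split them
+       return [str(script), str(test_dir), "-all", *shlex.split(options)]
+
+
+   cmd = build_run_tests_command("/path/to/idefix", "HD/sod", "-mpi -dec 2 2")
+   print(shlex.join(cmd))
+
+The resulting list can be handed to ``subprocess.run(cmd, check=True)`` to actually launch the test.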
+
+Notes for maintainers
+---------------------
+
+- The reusable jobs workflow contains a commented concurrency block for optional cancellation
+  of in-flight runs; consider enabling it if you want to auto-cancel redundant CI runs.
+- Because tests are run on self-hosted runners, ensure the pools have the required compilers,
+  MPI stacks and GPU drivers for the requested TESTME_OPTIONS.
+- Keep TESTME_OPTIONS in sync with the options understood by the test helper documented in
+  :doc:`idfxTest <testing/idfxTest>`.
+
+Relevant files
+--------------
+
+- Workflow entry point: .github/workflows/idefix-ci.yml
+- Reusable jobs: .github/workflows/idefix-ci-jobs.yml
+- Test helper documentation: :doc:`idfxTest <testing/idfxTest>`
+
+.. toctree::
+   :maxdepth: 2
+   :caption: Contents:
+
+   testing/idfxTest.rst
\ No newline at end of file
diff --git a/doc/source/testing/idfxTest.rst b/doc/source/testing/idfxTest.rst
new file mode 100644
index 00000000..93171102
--- /dev/null
+++ b/doc/source/testing/idfxTest.rst
@@ -0,0 +1,173 @@
+=========
+idfxTest
+=========
+
+.. autoclass:: idfxTest
+   :members:
+   :undoc-members:
+   :show-inheritance:
+
+Overview
+--------
+
+The ``idfxTest`` class provides a high-level interface for automating the configuration, compilation, execution, and regression testing of Idefix simulations. It is designed to be used in test scripts (such as ``testme.py``) to streamline the testing workflow, including handling reference files and plotting differences.
+
+Constructor and Command-Line Options
+------------------------------------
+
+The constructor parses command-line arguments using ``argparse``, so the options below are given on the command line of the test script and exposed as attributes of the test object. The following options are available:
+
+.. list-table::
+   :header-rows: 1
+
+   * - Option
+     - Attribute
+     - Description
+   * - ``-noplot``
+     - ``noplot``
+     - Disable plotting in standard tests (default: True).
+   * - ``-ploterr``
+     - ``ploterr``
+     - Enable plotting of differences when regression tests fail.
+   * - ``-cmake OPT [OPT ...]``
+     - ``cmake``
+     - Extra CMake options (list of strings).
+   * - ``-definitions FILE``
+     - ``definitions``
+     - Specify a custom ``definitions.hpp`` file.
+   * - ``-dec NX NY NZ``
+     - ``dec``
+     - MPI domain decomposition (list of integers).
+   * - ``-check``
+     - ``check``
+     - Only perform regression tests without compilation.
+   * - ``-cuda``
+     - ``cuda``
+     - Enable CUDA backend for Nvidia GPUs.
+   * - ``-intel``
+     - ``intel``
+     - Use Intel OneAPI compilers.
+   * - ``-hip``
+     - ``hip``
+     - Enable HIP backend for AMD GPUs.
+   * - ``-single``
+     - ``single``
+     - Enable single precision.
+   * - ``-vectPot``
+     - ``vectPot``
+     - Enable vector potential formulation.
+   * - ``-reconstruction N``
+     - ``reconstruction``
+     - Set reconstruction scheme (2=PLM, 3=LimO3, 4=PPM).
+   * - ``-idefixDir PATH``
+     - ``idefixDir``
+     - Set the directory for Idefix source files (default: ``$IDEFIX_DIR``).
+   * - ``-mpi``
+     - ``mpi``
+     - Enable MPI parallelism.
+   * - ``-all``
+     - ``all``
+     - Run the full test suite with multiple configurations.
+   * - ``-init``
+     - ``init``
+     - Reinitialize reference files for non-regression tests.
+   * - ``-Werror``
+     - ``Werror``
+     - Treat compiler warnings as errors.
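+
+The option table maps naturally onto ``argparse`` declarations. The following is a minimal sketch, not the actual ``idfxTest`` constructor: the function name ``parse_test_options``, the help strings, and the default reconstruction value are assumptions made for illustration, and only a subset of the flags is shown.
+
+.. code-block:: python
+
+   import argparse
+
+
+   def parse_test_options(argv):
+       """Parse a few of the documented flags (illustrative subset only)."""
+       parser = argparse.ArgumentParser(description="hypothetical testme option parser")
+       # Boolean switches: present on the command line means True
+       parser.add_argument("-mpi", action="store_true", help="enable MPI parallelism")
+       parser.add_argument("-single", action="store_true", help="enable single precision")
+       # One integer per decomposed direction, e.g. "-dec 2 2"
+       parser.add_argument("-dec", type=int, nargs="+", help="MPI domain decomposition")
+       # The default of 2 is an assumption, not taken from the Idefix sources
+       parser.add_argument("-reconstruction", type=int, default=2, choices=[2, 3, 4],
+                           help="reconstruction scheme (2=PLM, 3=LimO3, 4=PPM)")
+       return parser.parse_args(argv)
+
+
+   args = parse_test_options(["-mpi", "-dec", "2", "2", "-reconstruction", "3", "-single"])
+   print(args.mpi, args.dec, args.reconstruction, args.single)
+
+Each flag becomes an attribute of the returned namespace, mirroring the Attribute column of the table.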
+
+Main Methods
+------------
+
+.. list-table::
+   :header-rows: 1
+
+   * - Method
+     - Description
+   * - ``configure``
+     - Runs CMake to configure the build system for Idefix, using options set by command-line flags (e.g., precision, MPI, CUDA, etc.).
+   * - ``compile``
+     - Compiles the Idefix code using ``make`` with the specified number of parallel jobs.
+   * - ``run``
+     - Executes the Idefix binary, optionally with MPI, using the provided input file and runtime options.
+   * - ``checkOnly``
+     - Performs regression testing only, without compiling or running the code (useful for checking outputs after a manual run).
+   * - ``standardTest``
+     - Runs any Python-based standard tests (e.g., ``testidefix.py``) present in the test directory for additional validation.
+   * - ``nonRegressionTest``
+     - Compares the output dump file to a reference file using RMSE; fails if the error exceeds the tolerance.
+   * - ``compareDump``
+     - Compares two arbitrary dump files using the same logic as ``nonRegressionTest``.
+   * - ``makeReference``
+     - Copies the specified output file to the reference directory, updating the reference for future regression tests.
+
+Usage Example
+-------------
+
+Below is an example inspired by ``test/HD/sod/testme.py``. It demonstrates a typical workflow for running tests and performing regression checks.
+
+.. code-block:: python
+
+   import pytools.idfx_test as tst
+
+   name = "dump.0001.dmp"
+
+   def testMe(test):
+       test.configure()
+       test.compile()
+       inifiles = ["idefix.ini", "idefix-hll.ini", "idefix-hllc.ini", "idefix-tvdlf.ini"]
+       if test.reconstruction == 4:
+           inifiles = ["idefix-rk3.ini", "idefix-hllc-rk3.ini"]
+
+       # Loop over all ini files for this test
+       for ini in inifiles:
+           test.run(inputFile=ini)
+           if test.init:
+               test.makeReference(filename=name)
+           test.standardTest()
+           test.nonRegressionTest(filename=name)
+
+   test = tst.idfxTest()
+
+   if not test.all:
+       if test.check:
+           test.checkOnly(filename=name)
+       else:
+           testMe(test)
+   else:
+       test.noplot = True
+       for rec in range(2, 5):
+           test.vectPot = False
+           test.single = False
+           test.reconstruction = rec
+           test.mpi = False
+           testMe(test)
+
+       # Test in single precision
+       test.reconstruction = 2
+       test.single = True
+       testMe(test)
+
+How to Run
+----------
+
+You can run the test script from the command line, passing any of the supported options. For example:
+
+.. code-block:: bash
+
+   python testme.py -mpi -dec 2 2 -reconstruction 3 -single -ploterr -idefixDir /path/to/idefix
+
+This will configure, compile, and run the test in MPI mode with a 2x2 domain decomposition, third-order reconstruction, single precision, and plotting enabled for regression errors.
+
+Reference File Management
+-------------------------
+
+- Reference files are stored in ``$IDEFIX_DIR/reference/<testDir>``.
+- The filename is generated based on precision, reconstruction, input file, and vector potential settings.
+- Pass ``-init`` (exposed as ``test.init``) to regenerate reference files (dangerous: this overwrites existing references).
+
+Regression Testing
+------------------
+
+- The ``nonRegressionTest`` method compares output dumps to reference files using RMSE.
+- If the error exceeds the tolerance, the test fails and (optionally) plots the difference.
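+
+The RMSE criterion can be illustrated with a few lines of plain Python. This is a minimal sketch under stated assumptions, not the code used by ``nonRegressionTest``: the function names ``rmse`` and ``passes_regression`` and the tolerance values are hypothetical, and the data are flat lists rather than fields read from dump files.
+
+.. code-block:: python
+
+   import math
+
+
+   def rmse(values, reference):
+       """Root-mean-square error between two equally sized, non-empty sequences."""
+       if len(values) != len(reference) or len(values) == 0:
+           raise ValueError("sequences must be non-empty and of equal length")
+       squared = sum((v - r) ** 2 for v, r in zip(values, reference))
+       return math.sqrt(squared / len(values))
+
+
+   def passes_regression(values, reference, tolerance=0.0):
+       """Return True when the RMSE does not exceed the tolerance."""
+       return rmse(values, reference) <= tolerance
+
+
+   reference = [1.0, 2.0, 3.0, 4.0]
+   identical = [1.0, 2.0, 3.0, 4.0]
+   perturbed = [1.0, 2.0, 3.0, 4.2]
+
+   print(passes_regression(identical, reference))                  # True
+   print(passes_regression(perturbed, reference, tolerance=1e-6))  # False
+
+A single scalar per field keeps the pass/fail decision simple, at the cost of hiding where in the domain the difference is located; that is what the difference plots are for.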
+

From da7b00f2970a687af8dfee146235c67a7c2ff378 Mon Sep 17 00:00:00 2001
From: Geoffroy Lesur <geoffroy.lesur@univ-grenoble-alpes.fr>
Date: Sat, 18 Oct 2025 16:55:03 +0200
Subject: [PATCH 2/3] Update doc/source/faq.rst

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---
 doc/source/faq.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/source/faq.rst b/doc/source/faq.rst
index 198d7f54..4dd19831 100644
--- a/doc/source/faq.rst
+++ b/doc/source/faq.rst
@@ -71,8 +71,8 @@ VTK output appears corrupted when running with MPI (OpenMPI)
 
     This has resolved intermittent corruption for several users. See issue #348 for discussion and reports.
 
-Developement
-------------
+Development
+-----------
 
 I have a serious bug (e.g. segmentation fault), in my setup, how do I proceed?
   Add ``-DIdefix_DEBUG=ON`` to ``cmake`` and recompile to find out exactly where the code crashes (see :ref:`debugging`).

From f96950fbefab90e177834a69b09a9ee589ae07df Mon Sep 17 00:00:00 2001
From: Geoffroy Lesur <geoffroy.lesur@univ-grenoble-alpes.fr>
Date: Sat, 18 Oct 2025 17:32:35 +0200
Subject: [PATCH 3/3] fix linting errors

---
 doc/source/testing.rst          | 2 +-
 doc/source/testing/idfxTest.rst | 1 -
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/doc/source/testing.rst b/doc/source/testing.rst
index 44ff88d0..888ac205 100644
--- a/doc/source/testing.rst
+++ b/doc/source/testing.rst
@@ -114,4 +114,4 @@ Relevant files
    :maxdepth: 2
    :caption: Contents:
 
-   testing/idfxTest.rst
\ No newline at end of file
+   testing/idfxTest.rst
diff --git a/doc/source/testing/idfxTest.rst b/doc/source/testing/idfxTest.rst
index 93171102..b425d3fe 100644
--- a/doc/source/testing/idfxTest.rst
+++ b/doc/source/testing/idfxTest.rst
@@ -170,4 +170,3 @@ Regression Testing
 
 - The ``nonRegressionTest`` method compares output dumps to reference files using RMSE.
 - If the error exceeds the tolerance, the test fails and (optionally) plots the difference.
-