From b4da18b31353f672518d56caae08955faecbec1e Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Tue, 11 Oct 2016 12:36:39 +0100 Subject: [PATCH 1/4] Beginnings --- .../src/developers_guide/graphics_tests.rst | 55 +++++++++++++++++++ docs/iris/src/developers_guide/tests.rst | 16 ++++++ 2 files changed, 71 insertions(+) create mode 100644 docs/iris/src/developers_guide/graphics_tests.rst diff --git a/docs/iris/src/developers_guide/graphics_tests.rst b/docs/iris/src/developers_guide/graphics_tests.rst new file mode 100644 index 0000000000..419b1ca528 --- /dev/null +++ b/docs/iris/src/developers_guide/graphics_tests.rst @@ -0,0 +1,55 @@ +.. _graphics_tests:

Graphics tests
=================

Iris provides specific support for plotting via matplotlib, in the packages
:mod:`iris.plot` and :mod:`iris.quickplot`.
The effective testing of this plotting functionality is a slightly awkward
problem, as the only method which is both simple and effective is to produce
actual plots and compare them with stored reference images.
This basic 'graphics test' operation is provided by the method
:meth:`iris.tests.IrisTest.check_graphic`.

Existing 'graphics tests', which use this method, were previously (up to Iris
1.10) divided between the legacy test modules `iris.tests.test_plot` and
`iris.tests.test_quickplot`, and more recent functions are tested in
For reasons explained below, all these tests are technically "integration"
tests and not true unit tests.



Testing against actual images introduces two very significant problems:
  * Firstly, it is not practical to automatically distinguish plots which are
    'nearly the same' as a reference image from ones which are
    'significantly different' : i.e. any allowed tolerance in output matching
    will admit some changes which a human would judge as a significant error.
  * Secondly, these tests are all 'integration' tests using matplotlib, and
    hence tend to only work with a specific version of matplotlib.
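The first of these problems can be illustrated with a minimal RMS comparison, in the style of the tolerant check described below. This is only an illustrative sketch, assuming synthetic data: the function name, array shapes and tolerance value are hypothetical, not Iris's actual implementation.

```python
import numpy as np

def rms_difference(image_a, image_b):
    # Root-mean-square difference over all RGB pixel values.
    diff = image_a.astype(float) - image_b.astype(float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Two synthetic 4x4 RGB "plots", identical except for one pixel channel.
reference = np.zeros((4, 4, 3), dtype=np.uint8)
result = reference.copy()
result[0, 0, 0] = 10  # a small change : but is it significant to a human?

# Any fixed tolerance must draw an arbitrary line between the two cases :
# here the change slips under the threshold, whether or not it matters.
TOLERANCE = 2.0
print(rms_difference(reference, result) <= TOLERANCE)  # → True
```

Whatever value is chosen for the tolerance, some humanly-significant changes will pass and some insignificant ones will fail, which is exactly the difficulty described above.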
+

Prior to Iris 1.10, all graphics tests compared against a stored reference
image with a small allowed RMS tolerance, based on RGB values in each pixel.

From Iris v1.11 onward, we have adopted an entirely new means of testing
plotted outputs.
This consists of :

  * using a hash of image output pixels (the SHA of a PNG file) as the basis
    for comparing a test result
  * storing the hashes of 'known accepted results' for each test in a
    database in the repo (which is actually stored in
    ``lib/iris/tests/results/imagerepo.json``).
  * storing associated reference images for each hash value in a separate web
    project, currently in https://github.com/SciTools/test-images-scitools .


NOTES:
  * this is not just about matplotlib versions : as these tend to be
    "integration" style tests, potentially any dependency can affect the
    result.  In practice : especially numpy.

There should be sufficient work-flow detail here to allow an Iris developer to:
  * understand the new check graphic test process
  * understand the steps to take and tools to use to add a new graphic test
  * understand the steps to take and tools to use to diagnose and fix a
    graphic test failure

diff --git a/docs/iris/src/developers_guide/tests.rst b/docs/iris/src/developers_guide/tests.rst index 4ad9e0de0b..11399daa17 --- a/docs/iris/src/developers_guide/tests.rst +++ b/docs/iris/src/developers_guide/tests.rst @@ -125,3 +125,19 @@ developers locate relevant tests. It is recommended they are named according to the capabilities under test, e.g. `metadata/test_pp_preservation.py`, and not named according to the module(s) under test.


Graphics tests
=================
Certain Iris tests rely on testing plotted results.
This is required for testing the modules :mod:`iris.plot` and
:mod:`iris.quickplot`, but is also used for some other legacy and integration
tests.
+ +Prior to Iris version 1.10, a single reference image for each test was stored +in the main Iris repository, and a 'tolerant' test was performed against this. + +From version 1.11 onwards, all these tests perform exact comparison against a +set of known-good image *hashes*, while the actual corresponding reference +images, required for human-eyes evaluation, are stored elsewhere. +See :ref:`graphics_tests`. From 72403d0bcf6482a3356652a5a725e7d7dd0d2e5b Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Mon, 10 Oct 2016 14:11:30 +0100 Subject: [PATCH 2/4] Updated 'PR checklist' in developer guide. --- .../documenting/whats_new_contributions.rst | 2 + docs/iris/src/developers_guide/pulls.rst | 86 +++++++++++-------- 2 files changed, 51 insertions(+), 37 deletions(-) diff --git a/docs/iris/src/developers_guide/documenting/whats_new_contributions.rst b/docs/iris/src/developers_guide/documenting/whats_new_contributions.rst index 6b039d0ae9..203a422457 100644 --- a/docs/iris/src/developers_guide/documenting/whats_new_contributions.rst +++ b/docs/iris/src/developers_guide/documenting/whats_new_contributions.rst @@ -1,3 +1,5 @@ +.. _whats_new_contributions: + ================================= Contributing a "What's New" entry ================================= diff --git a/docs/iris/src/developers_guide/pulls.rst b/docs/iris/src/developers_guide/pulls.rst index a0a7b25202..6797c48bf5 100644 --- a/docs/iris/src/developers_guide/pulls.rst +++ b/docs/iris/src/developers_guide/pulls.rst @@ -16,66 +16,78 @@ is merged. Before submitting a pull request please consider this list. The Iris Check List ==================== -* Have you provided a helpful description of the Pull Request? What has - changed and why. This should include: - - * the aim of the change - the problem addressed, a link to the issue; +* Have you provided a helpful description of the Pull Request? + I.E. what has changed and why. 
This should include:
  * the aim of the change ; the problem addressed ; a link to the issue.
  * how the change has been delivered.

-* Do new files pass PEP8?
-
+* Do all new sourcefiles pass PEP8?
  * PEP8_ is the Python source code style guide.
  * There is a python module for checking pep8 compliance: python-pep8_
  * a standard Iris test checks that all sourcefiles meet PEP8 compliance.

* Do all new or modified sourcefiles have a correct, up-to-date copyright
  header?
  * a standard Iris test checks that all sourcefiles include a copyright message
    (including the correct year of the latest change).

* Do all the tests pass locally?
  * The Iris tests may be run with ``python setup.py test`` which has a command
    line utility included.
- * Coding standards, including PEP8_ compliance and copyright message (including
-   the correct year of the latest change), are tested.
-
-* Has a new test been provided?
-
-* Has iris-test-data been updated?
-
- * iris-test-data_ is a github project containing all the data to support the
-   tests.
- * If this has been updated a reference to the relevant pull request should be
-   provided.
-* Has the the documentation been updated to explain the new feature or bug fix?

* Have new tests been provided for all additional functionality?

* Has the documentation been updated to explain the new feature or bug fix?
  * refer to the developer guide on docstrings_
  * Have code examples been provided inside the relevant docstrings?
-
-* Has iris-sample-data been updated?
-
- * iris-sample-data_ is a github project containing all the data to support
-   the gallery and examples.
  * these are strongly recommended as concrete (working) examples always
    considerably enhance the documentation.
  * live test code can be included in docstrings.
+
  * See for example :data:`iris.cube.Cube.data`
  * Details at http://www.sphinx-doc.org/en/stable/ext/doctest.html
  * The documentation tests may be run with ``make doctest``, from within the
    ``./docs/iris`` subdirectory.

* Have you provided a 'what's new' contribution?
  * this should be done for all changes that affect API or behaviour.
    See :ref:`whats_new_contributions`

* Does the documentation build without errors?
  * The documentation is built using ``make html`` in ``./docs/iris``.
-* Do the documentation tests pass?
-
- * ``make doctest``, ``make extest`` in ``./docs/iris``.
-
-* Is there an associated iris-code-generators pull request?
-
- * iris-code-generators_ is a github project which provides processes for
-   generating a small subset of the Iris source code files from other
-   information sources.

* Do the documentation and code-example tests pass?
  * Run with ``make doctest`` and ``make extest``, from within the subdirectory
    ``./docs/iris``.
  * note that code examples must *not* raise deprecations. This is now checked
    and will result in an error.
    When an existing code example encounters a deprecation, it must be fixed.

* Has the travis file been updated to reflect any dependency updates?
- ``./.travis.yml`` is used to manage the continuous integration testing.
  * the files ``./conda-requirements.yml`` and
    ``./minimal-conda-requirements.yml`` are used to define the software
    environments, using the conda_ package manager.

* Have you provided updates to supporting projects for test or example data?
  * the following separate repos are used to manage larger files used by tests
    and code examples :
  * iris-test-data_ is a github project containing all the data to support the
    tests.
  * iris-sample-data_ is a github project containing all the data to support
    the gallery and examples.
* test-images-scitools_ is a github project containing reference plot images
    to support Iris graphics tests : see :ref:`test graphics images`.
  * If new files are required by tests or code examples, they must be added to
    the appropriate supporting project via a suitable pull-request.
    This new 'supporting pull request' should be referenced in the main Iris
    pull request, and must be accepted and merged before the Iris one can be.

.. _PEP8: http://www.python.org/dev/peps/pep-0008/
.. _python-pep8: https://pypi.python.org/pypi/pep8
.. _conda: http://conda.readthedocs.io/en/latest/
.. _iris-test-data: https://github.com/SciTools/iris-test-data
.. _iris-sample-data: https://github.com/SciTools/iris-sample-data
-.. _iris-code-generators: https://github.com/SciTools/iris-code-generators
.. _test-images-scitools: https://github.com/SciTools/test-images-scitools
.. _docstrings: http://scitools.org.uk/iris/docs/latest/developers_guide/documenting/docstrings.html

From 035097955dd956178de913497331cd5b5b4a49cc Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Tue, 11 Oct 2016 18:02:52 +0100 Subject: [PATCH 3/4] Tweaks. --- .../src/developers_guide/graphics_tests.rst | 103 +++++++++++++----- docs/iris/src/developers_guide/tests.rst | 4 +- 2 files changed, 76 insertions(+), 31 deletions(-) diff --git a/docs/iris/src/developers_guide/graphics_tests.rst b/docs/iris/src/developers_guide/graphics_tests.rst index 419b1ca528..1ea9b4620c --- a/docs/iris/src/developers_guide/graphics_tests.rst +++ b/docs/iris/src/developers_guide/graphics_tests.rst @@ -1,51 +1,61 @@ -.. _graphics_tests: +.. _developer_graphics_tests:

Graphics tests
-=================
+**************

-Iris provides specific support for plotting via matplotlib, in the packages
-:mod:`iris.plot` and :mod:`iris.quickplot`.
-The effective testing of this plotting functionality is a slightly awkward
-problem, as the only method which is both simple and effective is to produce
-actual plots and compare them with stored reference images.
-This basic 'graphics test' operation is provided by the method
-:meth:`iris.tests.IrisTest.check_graphic`.
+The only practical way of testing plotting functionality is to check actual
+output plots.
+For this, a basic 'graphics test' assertion operation is provided in the method
+:meth:`iris.tests.IrisTest.check_graphic` : This tests plotted output for a
+match against a stored reference.
+A "graphics test" is any test which employs this.

-Existing 'graphics tests', which use this method, were previously (up to Iris
-1.10) divided between the legacy test modules `iris.tests.test_plot` and
-`iris.tests.test_quickplot`, and more recent functions are tested in
-For reasons explained below, all these tests are technically "integration"
-tests and not true unit tests.

At present (Iris version 1.10), such tests include the testing for modules
`iris.tests.test_plot` and `iris.tests.test_quickplot`, and also some other
'legacy' style tests (as described in :ref:`developer_tests`).
It is conceivable that new 'graphics tests' of this sort can still be added.
However, as graphics tests are inherently "integration" style rather than true
unit tests, results are dependent on the installed versions of dependences (see
below), so this is not recommended except where no alternative is practical.

Testing actual plot results introduces some significant difficulties :
  * Graphics tests are inherently 'integration' style tests, so results will
    often vary with the versions of key dependencies : Obviously, results will
    depend on the matplotlib version, but they can also depend on numpy and
    other installed packages.
+
  * Although seems possible in principle to accommodate 'small' result changes
    by distinguishing plots which are 'nearly the same' from those which are
    'significantly different', in practice no *automatic* scheme for this can be
    perfect : That is, any calculated tolerance in output matching will allow
    some changes which a human would judge as a significant error.
  * Storing a variety of alternative 'acceptable' results as reference images
    can easily lead to uncontrolled increases in the size of the repository,
    given multiple indpendent sources of variation.


Graphics Testing Strategy
=========================

Prior to Iris 1.10, all graphics tests compared against a stored reference
image with a small allowed RMS tolerance, based on RGB values in each pixel.

From Iris v1.11 onward, we want to support testing Iris against multiple
versions of matplotlib (and some other dependencies).
To make this manageable, we have now rewritten "check_graphic" to allow
multiple alternative 'correct' results without including many more images in
the Iris repository.
This consists of :

  * using a hash of image output pixels (the SHA of a PNG file) as the basis
    for checking test results
  * storing the hashes of 'known accepted results' for each test in a
    database in the repo (which is actually stored in
    ``lib/iris/tests/results/imagerepo.json``).
* storing associated reference images for each hash value in a separate web
    project, currently in https://github.com/SciTools/test-images-scitools ,
    allowing human-eye judgement of 'valid equivalent' results.
  * a new version of the 'iris/tests/idiff.py' tool assists in comparing
    proposed new 'correct' result images with the existing accepted ones.

There should be sufficient work-flow detail here to allow an Iris developer to:
  * understand the new check graphic test process
  * understand the steps to take and tools to use to add a new graphic test
  * understand the steps to take and tools to use to diagnose and fix a
    graphic test failure

+
+Basic workflow
+==============
+#. a graphics test result has changed, following changes in Iris or
+   dependencies, so a test is failing
+#. the developer judges that the resulting, changed plot image is actually
+   "correct" (usually, by visually comparing it with previous 'good' results).
+#. a copy of the output PNG file is added to the reference image repository in
+   https://github.com/SciTools/test-images-scitools. The file is named
+   according to the image hash value, as ``<hash>.png``.
+#. the hash value of the new result is added into the relevant set of 'valid
+   result hashes' in the image result database file,
+   ``tests/results/imagerepo.json``.
+#. the tests can now be re-run, and the 'new' result should now be accepted.
+#. a pull request is created to add this file into the test-images-scitools
+   repository
+#.
a pull request is created in the Iris repository, including the change to
+   the image results database (``tests/results/imagerepo.json``) :
+   This pull request must contain a reference to the matching one in
+   test-images-scitools.

+Note: the Iris pull-request will not test out successfully in Travis until the
+test-images-scitools pull request has been merged : This is because there is
+an Iris test which ensures the existence of the reference images (URIs) for all
+the targets in the image results database.


+Fixing a failing graphics test
+==============================


+Adding a new graphics test
+==========================
diff --git a/docs/iris/src/developers_guide/tests.rst b/docs/iris/src/developers_guide/tests.rst index 11399daa17..79e838e94f --- a/docs/iris/src/developers_guide/tests.rst +++ b/docs/iris/src/developers_guide/tests.rst @@ -1,3 +1,5 @@ +.. _developer_tests:

Testing
*******

@@ -140,4 +142,4 @@ in the main Iris repository, and a 'tolerant' test was performed against this.

From version 1.11 onwards, all these tests perform exact comparison against a
set of known-good image *hashes*, while the actual corresponding reference
images, required for human-eyes evaluation, are stored elsewhere.
-See :ref:`graphics_tests`.
+See :ref:`developer_graphics_tests`.
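The hash-based check summarised above can be sketched as follows. This is an illustrative sketch only : the test id and database contents are invented, SHA-256 is assumed as the hash, and the real logic lives in ``check_graphic`` against the actual ``imagerepo.json``.

```python
import hashlib
import json

# Stand-in for lib/iris/tests/results/imagerepo.json : each test id maps to
# a list of accepted result hashes (the SHA of the test's PNG output).
# The test id below is hypothetical; the stored hash is SHA-256 of b"".
IMAGEREPO = json.loads("""
{
  "iris.tests.test_plot.TestSimple.test_contourf": [
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
  ]
}
""")

def check_graphic(test_id, png_bytes, repo):
    # Pass only if this PNG's hash is already a 'known accepted result'.
    result_hash = hashlib.sha256(png_bytes).hexdigest()
    return result_hash in repo.get(test_id, [])

# b"" hashes to the stored value above, so this check passes; any change to
# the plotted bytes yields a different hash and an exact-match failure.
print(check_graphic("iris.tests.test_plot.TestSimple.test_contourf",
                    b"", IMAGEREPO))  # → True
```

When a changed result is judged correct, its new hash is simply appended to the accepted list, which is how several dependency versions can all pass the same test without storing extra images in the Iris repository.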
From 7a37ef127405b22ae36322c570a7b37a5620acfe Mon Sep 17 00:00:00 2001 From: Corinne Bosley Date: Thu, 13 Oct 2016 15:57:31 +0100 Subject: [PATCH 4/4] some extra words for the docs in Patrick's absence --- .../src/developers_guide/graphics_tests.rst | 57 ++++++++++++------- 1 file changed, 36 insertions(+), 21 deletions(-) diff --git a/docs/iris/src/developers_guide/graphics_tests.rst b/docs/iris/src/developers_guide/graphics_tests.rst index 1ea9b4620c..a07287c2a1 100644 --- a/docs/iris/src/developers_guide/graphics_tests.rst +++ b/docs/iris/src/developers_guide/graphics_tests.rst @@ -15,22 +15,23 @@ At present (Iris version 1.10), such tests include the testing for modules 'legacy' style tests (as described in :ref:`developer_tests`). It is conceivable that new 'graphics tests' of this sort can still be added. However, as graphics tests are inherently "integration" style rather than true -unit tests, results are dependent on the installed versions of dependences (see -below), so this is not recommended except where no alternative is practical. +unit tests, results can differ with the installed versions of dependent +libraries (see below), so this is not recommended except where no alternative +is practical. Testing actual plot results introduces some significant difficulties : * Graphics tests are inherently 'integration' style tests, so results will often vary with the versions of key dependencies : Obviously, results will depend on the matplotlib version, but they can also depend on numpy and other installed packages. - * Although seems possible in principle to accommodate 'small' result changes + * Although it seems possible in principle to accommodate 'small' result changes by distinguishing plots which are 'nearly the same' from those which are 'significantly different', in practice no *automatic* scheme for this can be perfect : That is, any calculated tolerance in output matching will allow some changes which a human would judge as a significant error. 
* Storing a variety of alternative 'acceptable' results as reference images
   can easily lead to uncontrolled increases in the size of the repository,
-  given multiple indpendent sources of variation.
+  given multiple independent sources of variation.


 Graphics Testing Strategy
@@ -66,23 +67,37 @@ There should be sufficient work-flow detail here to allow an Iris developer to:

 Basic workflow
 ==============

-#. a graphics test result has changed, following changes in Iris or
-   dependencies, so a test is failing
-#. the developer judges that the resulting, changed plot image is actually
-   "correct" (usually, by visually comparing it with previous 'good' results).
-#. a copy of the output PNG file is added to the reference image repository in
-   https://github.com/SciTools/test-images-scitools. The file is named
-   according to the image hash value, as ``<hash>.png``.
-#. the hash value of the new result is added into the relevant set of 'valid
-   result hashes' in the image result database file,
-   ``tests/results/imagerepo.json``.
-#. the tests can now be re-run, and the 'new' result should now be accepted.
-#. a pull request is created to add this file into the test-images-scitools
-   repository
-#. a pull request is created in the Iris repository, including the change to
-   the image results database (``tests/results/imagerepo.json``) :
-   This pull request must contain a reference to the matching one in
-   test-images-scitools.
+If you notice that a graphics test in the Iris testing suite has failed
+following changes in Iris or any of its dependencies, this is the process
+you now need to follow:
+
+#. Create a directory in iris/lib/iris/tests called 'result_image_comparison'.
+#. From your Iris root directory, run the tests by using the command:
+   ``python setup.py test``.
+#. Navigate to iris/lib/iris/tests and run the command: ``python idiff.py``.
This will open a window for you to visually inspect the changes to the
+   graphic and then either accept or reject the new result.
+#. Upon acceptance of a change or a new image, a copy of the output PNG file
+   is added to the reference image repository in
+   https://github.com/SciTools/test-images-scitools. The file is named
+   according to the image hash value, as ``<hash>.png``.
+#. The hash value of the new result is added into the relevant set of 'valid
+   result hashes' in the image result database file,
+   ``tests/results/imagerepo.json``.
+#. The tests must now be re-run, and the 'new' result should be accepted.
+   Occasionally there are several graphics checks in a single test, only the
+   first of which will be run if it fails. If this is the case, then you
+   may well encounter further graphical test failures in your next runs, and
+   you must repeat the process until all the graphical tests pass.
+#. To add your changes to Iris, you need to make two pull requests. The first
+   should be made to the test-images-scitools repository, and this should
+   contain all the newly-generated png files copied into the folder named
+   'image_files'.
+#. The second pull request should be created in the Iris repository, and should
+   only include the change to the image results database
+   (``tests/results/imagerepo.json``) :
+   This pull request must contain a reference to the matching one in
+   test-images-scitools.

Note: the Iris pull-request will not test out successfully in Travis until the
test-images-scitools pull request has been merged : This is because there is