From b4da18b31353f672518d56caae08955faecbec1e Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Tue, 11 Oct 2016 12:36:39 +0100 Subject: [PATCH 1/4] Beginnings --- .../src/developers_guide/graphics_tests.rst | 55 +++++++++++++++++++ docs/iris/src/developers_guide/tests.rst | 16 ++++++ 2 files changed, 71 insertions(+) create mode 100644 docs/iris/src/developers_guide/graphics_tests.rst diff --git a/docs/iris/src/developers_guide/graphics_tests.rst b/docs/iris/src/developers_guide/graphics_tests.rst new file mode 100644 index 0000000000..419b1ca528 --- /dev/null +++ b/docs/iris/src/developers_guide/graphics_tests.rst @@ -0,0 +1,55 @@ +.. _graphics_tests:

Graphics tests
=================

Iris provides specific support for plotting via matplotlib, in the packages
:mod:`iris.plot` and :mod:`iris.quickplot`.
The effective testing of this plotting functionality is a slightly awkward
problem, as the only method which is both simple and effective is to produce
actual plots and compare them with stored reference images.
This basic 'graphics test' operation is provided by the method
:meth:`iris.tests.IrisTest.check_graphic`.

Existing 'graphics tests', which use this method, were previously (up to Iris
1.10) divided between the legacy test modules `iris.tests.test_plot` and
`iris.tests.test_quickplot`, and more recent functions are tested in
For reasons explained below, all these tests are technically "integration"
tests and not true unit tests.



Testing against actual images introduces two very significant problems:
  * Firstly, it is not practical to automatically distinguish plots which are
    'nearly the same' as a reference image from ones which are
    'significantly different' : i.e. any allowed tolerance in output matching
    will admit some changes which a human would judge as a significant error.
  * Secondly, these tests are all 'integration' tests using matplotlib, and
    hence tend to only work with a specific version of matplotlib.
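The first of these problems can be illustrated with a minimal RMS comparison, in the style of the tolerant check described below. This is only an illustrative sketch, assuming synthetic data: the function name, array shapes and tolerance value are hypothetical, not Iris's actual implementation.

```python
import numpy as np

def rms_difference(image_a, image_b):
    # Root-mean-square difference over all RGB pixel values.
    diff = image_a.astype(float) - image_b.astype(float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Two synthetic 4x4 RGB "plots", identical except for one pixel channel.
reference = np.zeros((4, 4, 3), dtype=np.uint8)
result = reference.copy()
result[0, 0, 0] = 10  # a small change : but is it significant to a human?

# Any fixed tolerance must draw an arbitrary line between the two cases :
# here the change slips under the threshold, whether or not it matters.
TOLERANCE = 2.0
print(rms_difference(reference, result) <= TOLERANCE)  # → True
```

Whatever value is chosen for the tolerance, some humanly-significant changes will pass and some insignificant ones will fail, which is exactly the difficulty described above.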
+

Prior to Iris 1.10, all graphics tests compared against a stored reference
image with a small allowed RMS tolerance, based on RGB values in each pixel.

From Iris v1.11 onward, we have adopted an entirely new means of testing
plotted outputs.
This consists of :

  * using a hash of image output pixels (the SHA of a PNG file) as the basis
    for comparing a test result
  * storing the hashes of 'known accepted results' for each test in a
    database in the repo (which is actually stored in
    ``lib/iris/tests/results/imagerepo.json``).
  * storing associated reference images for each hash value in a separate web
    project, currently in https://github.com/SciTools/test-images-scitools .


NOTES:
  * this is not just about matplotlib versions : as these tend to be
    "integration" style tests, potentially any dependency can affect the
    result.  In practice : especially numpy.

There should be sufficient work-flow detail here to allow an Iris developer to:
  * understand the new check graphic test process
  * understand the steps to take and tools to use to add a new graphic test
  * understand the steps to take and tools to use to diagnose and fix a
    graphic test failure

diff --git a/docs/iris/src/developers_guide/tests.rst b/docs/iris/src/developers_guide/tests.rst index 4ad9e0de0b..11399daa17 --- a/docs/iris/src/developers_guide/tests.rst +++ b/docs/iris/src/developers_guide/tests.rst @@ -125,3 +125,19 @@ developers locate relevant tests. It is recommended they are named according to the capabilities under test, e.g. `metadata/test_pp_preservation.py`, and not named according to the module(s) under test.


Graphics tests
=================
Certain Iris tests rely on testing plotted results.
This is required for testing the modules :mod:`iris.plot` and
:mod:`iris.quickplot`, but is also used for some other legacy and integration
tests.
+ +Prior to Iris version 1.10, a single reference image for each test was stored +in the main Iris repository, and a 'tolerant' test was performed against this. + +From version 1.11 onwards, all these tests perform exact comparison against a +set of known-good image *hashes*, while the actual corresponding reference +images, required for human-eyes evaluation, are stored elsewhere. +See :ref:`graphics_tests`. From 72403d0bcf6482a3356652a5a725e7d7dd0d2e5b Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Mon, 10 Oct 2016 14:11:30 +0100 Subject: [PATCH 2/4] Updated 'PR checklist' in developer guide. --- .../documenting/whats_new_contributions.rst | 2 + docs/iris/src/developers_guide/pulls.rst | 86 +++++++++++-------- 2 files changed, 51 insertions(+), 37 deletions(-) diff --git a/docs/iris/src/developers_guide/documenting/whats_new_contributions.rst b/docs/iris/src/developers_guide/documenting/whats_new_contributions.rst index 6b039d0ae9..203a422457 100644 --- a/docs/iris/src/developers_guide/documenting/whats_new_contributions.rst +++ b/docs/iris/src/developers_guide/documenting/whats_new_contributions.rst @@ -1,3 +1,5 @@ +.. _whats_new_contributions: + ================================= Contributing a "What's New" entry ================================= diff --git a/docs/iris/src/developers_guide/pulls.rst b/docs/iris/src/developers_guide/pulls.rst index a0a7b25202..6797c48bf5 100644 --- a/docs/iris/src/developers_guide/pulls.rst +++ b/docs/iris/src/developers_guide/pulls.rst @@ -16,66 +16,78 @@ is merged. Before submitting a pull request please consider this list. The Iris Check List ==================== -* Have you provided a helpful description of the Pull Request? What has - changed and why. This should include: - - * the aim of the change - the problem addressed, a link to the issue; +* Have you provided a helpful description of the Pull Request? + I.E. what has changed and why. 
This should include:
  * the aim of the change ; the problem addressed ; a link to the issue.
  * how the change has been delivered.

-* Do new files pass PEP8?
-
+* Do all new sourcefiles pass PEP8?
  * PEP8_ is the Python source code style guide.
  * There is a python module for checking pep8 compliance: python-pep8_
  * a standard Iris test checks that all sourcefiles meet PEP8 compliance.

* Do all new or modified sourcefiles have a correct, up-to-date copyright
  header?
  * a standard Iris test checks that all sourcefiles include a copyright message
    (including the correct year of the latest change).

* Do all the tests pass locally?
  * The Iris tests may be run with ``python setup.py test`` which has a command
    line utility included.
- * Coding standards, including PEP8_ compliance and copyright message (including
-   the correct year of the latest change), are tested.
-
-* Has a new test been provided?
-
-* Has iris-test-data been updated?
-
- * iris-test-data_ is a github project containing all the data to support the
-   tests.
- * If this has been updated a reference to the relevant pull request should be
-   provided.
-* Has the the documentation been updated to explain the new feature or bug fix?

* Have new tests been provided for all additional functionality?

* Has the documentation been updated to explain the new feature or bug fix?
  * refer to the developer guide on docstrings_
  * Have code examples been provided inside the relevant docstrings?
-
-* Has iris-sample-data been updated?
-
- * iris-sample-data_ is a github project containing all the data to support
-   the gallery and examples.
  * these are strongly recommended as concrete (working) examples always
    considerably enhance the documentation.
  * live test code can be included in docstrings.
+
  * See for example :data:`iris.cube.Cube.data`
  * Details at http://www.sphinx-doc.org/en/stable/ext/doctest.html
  * The documentation tests may be run with ``make doctest``, from within the
    ``./docs/iris`` subdirectory.

* Have you provided a 'what's new' contribution?
  * this should be done for all changes that affect API or behaviour.
    See :ref:`whats_new_contributions`

* Does the documentation build without errors?
  * The documentation is built using ``make html`` in ``./docs/iris``.
-* Do the documentation tests pass?
-
- * ``make doctest``, ``make extest`` in ``./docs/iris``.
-
-* Is there an associated iris-code-generators pull request?
-
- * iris-code-generators_ is a github project which provides processes for
-   generating a small subset of the Iris source code files from other
-   information sources.

* Do the documentation and code-example tests pass?
  * Run with ``make doctest`` and ``make extest``, from within the subdirectory
    ``./docs/iris``.
  * note that code examples must *not* raise deprecations. This is now checked
    and will result in an error.
    When an existing code example encounters a deprecation, it must be fixed.

* Has the travis file been updated to reflect any dependency updates?
- ``./.travis.yml`` is used to manage the continuous integration testing.
  * the files ``./conda-requirements.yml`` and
    ``./minimal-conda-requirements.yml`` are used to define the software
    environments, using the conda_ package manager.

* Have you provided updates to supporting projects for test or example data?
  * the following separate repos are used to manage larger files used by tests
    and code examples :
  * iris-test-data_ is a github project containing all the data to support the
    tests.
  * iris-sample-data_ is a github project containing all the data to support
    the gallery and examples.
* test-images-scitools_ is a github project containing reference plot images
    to support Iris graphics tests : see :ref:`test graphics images`.
  * If new files are required by tests or code examples, they must be added to
    the appropriate supporting project via a suitable pull-request.
    This new 'supporting pull request' should be referenced in the main Iris
    pull request, and must be accepted and merged before the Iris one can be.

.. _PEP8: http://www.python.org/dev/peps/pep-0008/
.. _python-pep8: https://pypi.python.org/pypi/pep8
.. _conda: http://conda.readthedocs.io/en/latest/
.. _iris-test-data: https://github.com/SciTools/iris-test-data
.. _iris-sample-data: https://github.com/SciTools/iris-sample-data
-.. _iris-code-generators: https://github.com/SciTools/iris-code-generators
.. _test-images-scitools: https://github.com/SciTools/test-images-scitools
.. _docstrings: http://scitools.org.uk/iris/docs/latest/developers_guide/documenting/docstrings.html

From 035097955dd956178de913497331cd5b5b4a49cc Mon Sep 17 00:00:00 2001 From: Patrick Peglar Date: Tue, 11 Oct 2016 18:02:52 +0100 Subject: [PATCH 3/4] Tweaks. --- .../src/developers_guide/graphics_tests.rst | 103 +++++++++++++----- docs/iris/src/developers_guide/tests.rst | 4 +- 2 files changed, 76 insertions(+), 31 deletions(-) diff --git a/docs/iris/src/developers_guide/graphics_tests.rst b/docs/iris/src/developers_guide/graphics_tests.rst index 419b1ca528..1ea9b4620c --- a/docs/iris/src/developers_guide/graphics_tests.rst +++ b/docs/iris/src/developers_guide/graphics_tests.rst @@ -1,51 +1,61 @@ -.. _graphics_tests: +.. _developer_graphics_tests:

Graphics tests
-=================
+**************

-Iris provides specific support for plotting via matplotlib, in the packages
-:mod:`iris.plot` and :mod:`iris.quickplot`.
-The effective testing of this plotting functionality is a slightly awkward
-problem, as the only method which is both simple and effective is to produce
-actual plots and compare them with stored reference images.
-This basic 'graphics test' operation is provided by the method
-:meth:`iris.tests.IrisTest.check_graphic`.
+The only practical way of testing plotting functionality is to check actual
+output plots.
+For this, a basic 'graphics test' assertion operation is provided in the method
+:meth:`iris.tests.IrisTest.check_graphic` : This tests plotted output for a
+match against a stored reference.
+A "graphics test" is any test which employs this.

-Existing 'graphics tests', which use this method, were previously (up to Iris
-1.10) divided between the legacy test modules `iris.tests.test_plot` and
-`iris.tests.test_quickplot`, and more recent functions are tested in
-For reasons explained below, all these tests are technically "integration"
-tests and not true unit tests.

At present (Iris version 1.10), such tests include the testing for modules
`iris.tests.test_plot` and `iris.tests.test_quickplot`, and also some other
'legacy' style tests (as described in :ref:`developer_tests`).
It is conceivable that new 'graphics tests' of this sort can still be added.
However, as graphics tests are inherently "integration" style rather than true
unit tests, results are dependent on the installed versions of dependences (see
below), so this is not recommended except where no alternative is practical.

Testing actual plot results introduces some significant difficulties :
  * Graphics tests are inherently 'integration' style tests, so results will
    often vary with the versions of key dependencies : Obviously, results will
    depend on the matplotlib version, but they can also depend on numpy and
    other installed packages.
+
  * Although seems possible in principle to accommodate 'small' result changes
    by distinguishing plots which are 'nearly the same' from those which are
    'significantly different', in practice no *automatic* scheme for this can be
    perfect : That is, any calculated tolerance in output matching will allow
    some changes which a human would judge as a significant error.
  * Storing a variety of alternative 'acceptable' results as reference images
    can easily lead to uncontrolled increases in the size of the repository,
    given multiple indpendent sources of variation.


Graphics Testing Strategy
=========================

Prior to Iris 1.10, all graphics tests compared against a stored reference
image with a small allowed RMS tolerance, based on RGB values in each pixel.

From Iris v1.11 onward, we want to support testing Iris against multiple
versions of matplotlib (and some other dependencies).
To make this manageable, we have now rewritten "check_graphic" to allow
multiple alternative 'correct' results without including many more images in
the Iris repository.
This consists of :

  * using a hash of image output pixels (the SHA of a PNG file) as the basis
    for checking test results
  * storing the hashes of 'known accepted results' for each test in a
    database in the repo (which is actually stored in
    ``lib/iris/tests/results/imagerepo.json``).
* storing associated reference images for each hash value in a separate web
    project, currently in https://github.com/SciTools/test-images-scitools ,
    allowing human-eye judgement of 'valid equivalent' results.
  * a new version of the 'iris/tests/idiff.py' tool assists in comparing
    proposed new 'correct' result images with the existing accepted ones.

There should be sufficient work-flow detail here to allow an Iris developer to:
  * understand the new check graphic test process
  * understand the steps to take and tools to use to add a new graphic test
  * understand the steps to take and tools to use to diagnose and fix a
    graphic test failure

+
+Basic workflow
+==============
+#. a graphics test result has changed, following changes in Iris or
+   dependencies, so a test is failing
+#. the developer judges that the resulting, changed plot image is actually
+   "correct" (usually, by visually comparing it with previous 'good' results).
+#. a copy of the output PNG file is added to the reference image repository in
+   https://github.com/SciTools/test-images-scitools. The file is named
+   according to the image hash value, as ``<hash>.png``.
+#. the hash value of the new result is added into the relevant set of 'valid
+   result hashes' in the image result database file,
+   ``tests/results/imagerepo.json``.
+#. the tests can now be re-run, and the 'new' result should now be accepted.
+#. a pull request is created to add this file into the test-images-scitools
+   repository
+#.
a pull request is created in the Iris repository, including the change to
+   the image results database (``tests/results/imagerepo.json``) :
+   This pull request must contain a reference to the matching one in
+   test-images-scitools.

+Note: the Iris pull-request will not test out successfully in Travis until the
+test-images-scitools pull request has been merged : This is because there is
+an Iris test which ensures the existence of the reference images (URIs) for all
+the targets in the image results database.


+Fixing a failing graphics test
+==============================


+Adding a new graphics test
+==========================
diff --git a/docs/iris/src/developers_guide/tests.rst b/docs/iris/src/developers_guide/tests.rst index 11399daa17..79e838e94f --- a/docs/iris/src/developers_guide/tests.rst +++ b/docs/iris/src/developers_guide/tests.rst @@ -1,3 +1,5 @@ +.. _developer_tests:

Testing
*******

@@ -140,4 +142,4 @@ in the main Iris repository, and a 'tolerant' test was performed against this.

From version 1.11 onwards, all these tests perform exact comparison against a
set of known-good image *hashes*, while the actual corresponding reference
images, required for human-eyes evaluation, are stored elsewhere.
-See :ref:`graphics_tests`.
+See :ref:`developer_graphics_tests`.
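The hash-based check summarised above can be sketched as follows. This is an illustrative sketch only : the test id and database contents are invented, SHA-256 is assumed as the hash, and the real logic lives in ``check_graphic`` against the actual ``imagerepo.json``.

```python
import hashlib
import json

# Stand-in for lib/iris/tests/results/imagerepo.json : each test id maps to
# a list of accepted result hashes (the SHA of the test's PNG output).
# The test id below is hypothetical; the stored hash is SHA-256 of b"".
IMAGEREPO = json.loads("""
{
  "iris.tests.test_plot.TestSimple.test_contourf": [
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
  ]
}
""")

def check_graphic(test_id, png_bytes, repo):
    # Pass only if this PNG's hash is already a 'known accepted result'.
    result_hash = hashlib.sha256(png_bytes).hexdigest()
    return result_hash in repo.get(test_id, [])

# b"" hashes to the stored value above, so this check passes; any change to
# the plotted bytes yields a different hash and an exact-match failure.
print(check_graphic("iris.tests.test_plot.TestSimple.test_contourf",
                    b"", IMAGEREPO))  # → True
```

When a changed result is judged correct, its new hash is simply appended to the accepted list, which is how several dependency versions can all pass the same test without storing extra images in the Iris repository.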
From 7a37ef127405b22ae36322c570a7b37a5620acfe Mon Sep 17 00:00:00 2001 From: Corinne Bosley Date: Thu, 13 Oct 2016 15:57:31 +0100 Subject: [PATCH 4/4] some extra words for the docs in Patrick's absence --- .../src/developers_guide/graphics_tests.rst | 57 ++++++++++++------- 1 file changed, 36 insertions(+), 21 deletions(-) diff --git a/docs/iris/src/developers_guide/graphics_tests.rst b/docs/iris/src/developers_guide/graphics_tests.rst index 1ea9b4620c..a07287c2a1 100644 --- a/docs/iris/src/developers_guide/graphics_tests.rst +++ b/docs/iris/src/developers_guide/graphics_tests.rst @@ -15,22 +15,23 @@ At present (Iris version 1.10), such tests include the testing for modules 'legacy' style tests (as described in :ref:`developer_tests`). It is conceivable that new 'graphics tests' of this sort can still be added. However, as graphics tests are inherently "integration" style rather than true -unit tests, results are dependent on the installed versions of dependences (see -below), so this is not recommended except where no alternative is practical. +unit tests, results can differ with the installed versions of dependent +libraries (see below), so this is not recommended except where no alternative +is practical. Testing actual plot results introduces some significant difficulties : * Graphics tests are inherently 'integration' style tests, so results will often vary with the versions of key dependencies : Obviously, results will depend on the matplotlib version, but they can also depend on numpy and other installed packages. - * Although seems possible in principle to accommodate 'small' result changes + * Although it seems possible in principle to accommodate 'small' result changes by distinguishing plots which are 'nearly the same' from those which are 'significantly different', in practice no *automatic* scheme for this can be perfect : That is, any calculated tolerance in output matching will allow some changes which a human would judge as a significant error. 
* Storing a variety of alternative 'acceptable' results as reference images
   can easily lead to uncontrolled increases in the size of the repository,
-  given multiple indpendent sources of variation.
+  given multiple independent sources of variation.


 Graphics Testing Strategy
@@ -66,23 +67,37 @@ There should be sufficient work-flow detail here to allow an Iris developer to:

 Basic workflow
 ==============

-#. a graphics test result has changed, following changes in Iris or
-   dependencies, so a test is failing
-#. the developer judges that the resulting, changed plot image is actually
-   "correct" (usually, by visually comparing it with previous 'good' results).
-#. a copy of the output PNG file is added to the reference image repository in
-   https://github.com/SciTools/test-images-scitools. The file is named
-   according to the image hash value, as ``<hash>.png``.
-#. the hash value of the new result is added into the relevant set of 'valid
-   result hashes' in the image result database file,
-   ``tests/results/imagerepo.json``.
-#. the tests can now be re-run, and the 'new' result should now be accepted.
-#. a pull request is created to add this file into the test-images-scitools
-   repository
-#. a pull request is created in the Iris repository, including the change to
-   the image results database (``tests/results/imagerepo.json``) :
-   This pull request must contain a reference to the matching one in
-   test-images-scitools.
+If you notice that a graphics test in the Iris testing suite has failed
+following changes in Iris or any of its dependencies, this is the process
+you now need to follow:
+
+#. Create a directory in iris/lib/iris/tests called 'result_image_comparison'.
+#. From your Iris root directory, run the tests by using the command:
+   ``python setup.py test``.
+#. Navigate to iris/lib/iris/tests and run the command: ``python idiff.py``.
This will open a window for you to visually inspect the changes to the
+   graphic and then either accept or reject the new result.
+#. Upon acceptance of a change or a new image, a copy of the output PNG file
+   is added to the reference image repository in
+   https://github.com/SciTools/test-images-scitools. The file is named
+   according to the image hash value, as ``<hash>.png``.
+#. The hash value of the new result is added into the relevant set of 'valid
+   result hashes' in the image result database file,
+   ``tests/results/imagerepo.json``.
+#. The tests must now be re-run, and the 'new' result should be accepted.
+   Occasionally there are several graphics checks in a single test, only the
+   first of which will be run if it fails. If this is the case, then you
+   may well encounter further graphical test failures in your next runs, and
+   you must repeat the process until all the graphical tests pass.
+#. To add your changes to Iris, you need to make two pull requests. The first
+   should be made to the test-images-scitools repository, and this should
+   contain all the newly-generated png files copied into the folder named
+   'image_files'.
+#. The second pull request should be created in the Iris repository, and should
+   only include the change to the image results database
+   (``tests/results/imagerepo.json``) :
+   This pull request must contain a reference to the matching one in
+   test-images-scitools.

Note: the Iris pull-request will not test out successfully in Travis until the
test-images-scitools pull request has been merged : This is because there is