diff --git a/doc/recipe/preprocessor.rst b/doc/recipe/preprocessor.rst index 6a5b72e673..88ba4c9ff7 100644 --- a/doc/recipe/preprocessor.rst +++ b/doc/recipe/preprocessor.rst @@ -456,17 +456,17 @@ Missing values masks -------------------- Missing (masked) values can be a nuisance especially when dealing with -multimodel ensembles and having to compute multimodel statistics; different +multi-model ensembles and having to compute multi-model statistics; different numbers of missing data from dataset to dataset may introduce biases and -artificially assign more weight to the datasets that have less missing -data. This is handled in ESMValTool via the missing values masks: two types of -such masks are available, one for the multimodel case and another for the -single model case. +artificially assign more weight to the datasets that have less missing data. +This is handled in ESMValTool via the missing values masks: two types of such +masks are available, one for the multi-model case and another for the single +model case. -The multimodel missing values mask (``mask_fillvalues``) is a preprocessor step +The multi-model missing values mask (``mask_fillvalues``) is a preprocessor step that usually comes after all the single-model steps (regridding, area selection etc) have been performed; in a nutshell, it combines missing values masks from -individual models into a multimodel missing values mask; the individual model +individual models into a multi-model missing values mask; the individual model masks are built according to common criteria: the user chooses a time window in which missing data points are counted, and if the number of missing data points relative to the number of total data points in a window is less than a chosen @@ -492,11 +492,11 @@ See also :func:`esmvalcore.preprocessor.mask_fillvalues`. 
Common mask for multiple models ------------------------------- -It is possible to use ``mask_fillvalues`` to create a combined multimodel -mask (all the masks from all the analyzed models combined into a single -mask); for that purpose setting the ``threshold_fraction`` to 0 will not -discard any time windows, essentially keeping the original model masks and -combining them into a single mask; here is an example: +It is possible to use ``mask_fillvalues`` to create a combined multi-model mask +(all the masks from all the analyzed models combined into a single mask); for +that purpose setting the ``threshold_fraction`` to 0 will not discard any time +windows, essentially keeping the original model masks and combining them into a +single mask; here is an example: .. code-block:: yaml @@ -530,13 +530,12 @@ Horizontal regridding Regridding is necessary when various datasets are available on a variety of `lat-lon` grids and they need to be brought together on a common grid (for -various statistical operations e.g. multimodel statistics or for e.g. direct +various statistical operations e.g. multi-model statistics or for e.g. direct inter-comparison or comparison with observational datasets). Regridding is conceptually a very similar process to interpolation (in fact, the regridder engine uses interpolation and extrapolation, with various schemes). The primary difference is that interpolation is based on sample data points, while -regridding is based on the horizontal grid of another cube (the reference -grid). +regridding is based on the horizontal grid of another cube (the reference grid). 
The underlying regridding mechanism in ESMValTool uses the `cube.regrid() `_ @@ -651,28 +650,28 @@ Multi-model statistics ====================== Computing multi-model statistics is an integral part of model analysis and evaluation: individual models display a variety of biases depending on model -set-up, initial conditions, forcings and implementation; comparing model data -to observational data, these biases have a significantly lower statistical -impact when using a multi-model ensemble. ESMValTool has the capability of -computing a number of multi-model statistical measures: using the preprocessor -module ``multi_model_statistics`` will enable the user to ask for either a -multi-model ``mean``, ``median``, ``max``, ``min``, ``std``, and / or -``pXX.YY`` with a set of argument parameters passed to -``multi_model_statistics``. Percentiles can be specified like ``p1.5`` or -``p95``. The decimal point will be replaced by a dash in the output file. - -Note that current multimodel statistics in ESMValTool are local (not global), -and are computed along the time axis. As such, can be computed across a common -overlap in time (by specifying ``span: overlap`` argument) or across the full -length in time of each model (by specifying ``span: full`` argument). +set-up, initial conditions, forcings and implementation; comparing model data to +observational data, these biases have a significantly lower statistical impact +when using a multi-model ensemble. ESMValTool has the capability of computing a +number of multi-model statistical measures: using the preprocessor module +``multi_model_statistics`` will enable the user to ask for either a multi-model +``mean``, ``median``, ``max``, ``min``, ``std``, and / or ``pXX.YY`` with a set +of argument parameters passed to ``multi_model_statistics``. Percentiles can be +specified like ``p1.5`` or ``p95``. The decimal point will be replaced by a dash +in the output file. 
Restrictive computation is also available by excluding any set of models that the user will not want to include in the statistics (by setting ``exclude: [excluded models list]`` argument). The implementation has a few restrictions -that apply to the input data: model datasets must have consistent shapes, and -from a statistical point of view, this is needed since weights are not yet -implemented; also higher dimensional data is not supported (i.e. anything with -dimensionality higher than four: time, vertical axis, two horizontal axes). +that apply to the input data: model datasets must have consistent shapes, apart +from the time dimension; and cubes with more than four dimensions (time, +vertical axis, two horizontal axes) are not supported. + +Input datasets may have different time coordinates. Statistics can be computed +across overlapping times only (``span: overlap``) or across the full time span +of the combined models (``span: full``). The preprocessor sets a common time +coordinate on all datasets. As the number of days in a year may vary between +calendars, (sub-)daily data with different calendars are not supported. Input datasets may have different time coordinates. The multi-model statistics preprocessor sets a common time coordinate on all datasets. As the number of @@ -681,7 +680,7 @@ days in a year may vary between calendars, (sub-)daily data are not supported. .. code-block:: yaml preprocessors: - multimodel_preprocessor: + multi_model_preprocessor: multi_model_statistics: span: overlap statistics: [mean, median] @@ -702,14 +701,12 @@ entry contains the resulting cube with the requested statistic operations. .. note:: - Note that the multimodel array operations, albeit performed in - per-time/per-horizontal level loops to save memory, could, however, be - rather memory-intensive (since they are not performed lazily as - yet). 
The Section on :ref:`Memory use` details the memory intake
-   for different run scenarios, but as a thumb rule, for the multimodel
-   preprocessor, the expected maximum memory intake could be approximated as
-   the number of datasets multiplied by the average size in memory for one
-   dataset.
+   The multi-model array operations can be rather memory-intensive (since they
+   are not performed lazily as yet). The Section on :ref:`Memory use` details
+   the memory intake for different run scenarios but, as a rule of thumb, for
+   the multi-model preprocessor, the expected maximum memory intake could be
+   approximated as the number of datasets multiplied by the average size in
+   memory for one dataset.
 
 .. _time operations:
 
@@ -1512,14 +1509,14 @@ In the most general case, we can set upper limits on the maximum memory the
 analysis will require:
 
-``Ms = (R + N) x F_eff - F_eff`` - when no multimodel analysis is performed;
+``Ms = (R + N) x F_eff - F_eff`` - when no multi-model analysis is performed;
 
-``Mm = (2R + N) x F_eff - 2F_eff`` - when multimodel analysis is performed;
+``Mm = (2R + N) x F_eff - 2F_eff`` - when multi-model analysis is performed;
 
 where
 
 * ``Ms``: maximum memory for non-multimodel module
-* ``Mm``: maximum memory for multimodel module
+* ``Mm``: maximum memory for multi-model module
 * ``R``: computational efficiency of module; `R` is typically 2-3
 * ``N``: number of datasets
 * ``F_eff``: average size of data per dataset where ``F_eff = e x f x F``
@@ -1538,7 +1535,7 @@ where
 ``Mm = 1.5 x (N - 2)`` GB
 
 As a rule of thumb, the maximum required memory at a certain time for
-multimodel analysis could be estimated by multiplying the number of datasets by
+multi-model analysis could be estimated by multiplying the number of datasets by
 the average file size of all the datasets; this memory intake is high but also
 assumes that all data is fully realized in memory; this aspect will gradually
 change and the amount of realized data will decrease with the increase of
diff
--git a/esmvalcore/preprocessor/_multimodel.py b/esmvalcore/preprocessor/_multimodel.py
index 750768195a..680496aa08 100644
--- a/esmvalcore/preprocessor/_multimodel.py
+++ b/esmvalcore/preprocessor/_multimodel.py
@@ -1,16 +1,14 @@
-"""multimodel statistics.
+"""Statistics across cubes.
 
-Functions for multi-model operations
-supports a multitude of multimodel statistics
-computations; the only requisite is the ingested
-cubes have (TIME-LAT-LON) or (TIME-PLEV-LAT-LON)
-dimensions; and obviously consistent units.
+This module contains functions to compute statistics across multiple cubes or
+products.
 
-It operates on different (time) spans:
-- full: computes stats on full dataset time;
-- overlap: computes common time overlap between datasets;
+Wrapper functions separate esmvalcore internals, operating on products, from
+generalized functions that operate on iris cubes.
 """
+import copy
 import logging
 import re
 from datetime import datetime
+from functools import partial
@@ -20,6 +17,7 @@
 import iris
 import numpy as np
 import scipy
+from iris.experimental.equalise_cubes import equalise_attributes
 
 logger = logging.getLogger(__name__)
@@ -295,66 +293,47 @@ def _assemble_data(cubes, statistic, span='overlap'):
     return stats_cube
 
-def multi_model_statistics(products, span, statistics, output_products=None):
-    """Compute multi-model statistics.
+def multicube_statistics(cubes, statistics, span):
+    """Compute statistics across input cubes.
 
-    Multimodel statistics computed along the time axis. Can be
-    computed across a common overlap in time (set span: overlap)
-    or across the full length in time of each model (set span: full).
-    Restrictive computation is also available by excluding any set of
-    models that the user will not want to include in the statistics
-    (set exclude: [excluded models list]).
+    This function deals with non-homogeneous cubes by taking the time union
+    (span='full') or intersection (span='overlap'), and extending or subsetting
+    the cubes as necessary.
+    Apart from the time coordinate, cubes must have
+    consistent shapes.
 
-    Restrictions needed by the input data:
-    - model datasets must have consistent shapes,
-    - higher dimensional data is not supported (ie dims higher than four:
-      time, vertical axis, two horizontal axes).
+    This function operates directly on numpy (masked) arrays and rebuilds the
+    resulting cubes from scratch. Therefore, it is not suitable for lazy
+    evaluation. This function is restricted to maximum four-dimensional data:
+    time, vertical axis, two horizontal axes.
 
     Parameters
     ----------
-    products: list
-        list of data products or cubes to be used in multimodel stat
-        computation;
-        cube attribute of product is the data cube for computing the stats.
+    cubes: list
+        list of cubes across which the statistics will be computed;
+    statistics: list
+        statistical metrics to be computed. Available options: mean, median,
+        max, min, std, or pXX.YY (for percentile XX.YY; decimal part optional).
     span: str
         overlap or full; if overlap, statistics are computed on the common
         time span; if full, statistics are computed on full time spans,
         ignoring missing data.
-    output_products: dict
-        dictionary of output products. MUST be specified if products are NOT
-        cubes
-    statistics: list of str
-        list of statistical measure(s) to be computed. Available options:
-        mean, median, max, min, std, or pXX.YY (for percentile XX.YY; decimal
-        part optional).
 
     Returns
     -------
-    set or dict or list
-        `set` of data products if `output_products` is given
-        `dict` of cubes if `output_products` is not given
-        `list` of input cubes if there is no overlap between cubes when
-        using `span='overlap'`
+    dict
+        dictionary of statistics cubes with statistics' names as keys.
 
     Raises
     ------
     ValueError
         If span is neither overlap nor full.
""" - logger.debug('Multimodel statistics: computing: %s', statistics) - if len(products) < 2: - logger.info("Single dataset in list: will not compute statistics.") - return products - if output_products: - cubes = [cube for product in products for cube in product.cubes] - statistic_products = set() - else: - cubes = products - statistic_products = {} + logger.debug('Multicube statistics: computing: %s', statistics) # Reset time coordinates and make cubes share the same calendar _unify_time_coordinates(cubes) + # Check whether input is valid if span == 'overlap': # check if we have any time overlap times = [cube.coord('time').points for cube in cubes] @@ -362,7 +341,7 @@ def multi_model_statistics(products, span, statistics, output_products=None): if len(overlap) <= 1: logger.info("Time overlap between cubes is none or a single point." "check datasets: will not compute statistics.") - return products + return cubes logger.debug("Using common time overlap between " "datasets to compute statistics.") elif span == 'full': @@ -372,22 +351,156 @@ def multi_model_statistics(products, span, statistics, output_products=None): "Unexpected value for span {}, choose from 'overlap', 'full'". format(span)) + # Compute statistics + statistics_cubes = {} for statistic in statistics: - # Compute statistic statistic_cube = _assemble_data(cubes, statistic, span) + statistics_cubes[statistic] = statistic_cube + + return statistics_cubes + + +def multicube_statistics_iris(cubes, statistics): + """Compute statistics across input cubes. + + Like multicube_statistics, but operates directly on Iris cubes. Cubes are + merged and subsequently collapsed along a new auxiliary coordinate. + Inconsistent attributes will be removed. + + This method uses iris' built in functions, exposing the operators in + iris.analysis and supporting lazy evaluation. Input cubes must have + consistent shapes. 
+
+    Note: some of the operators in iris.analysis require additional
+    arguments, such as percentiles or weights. These operators are
+    currently not supported.
+
+    Parameters
+    ----------
+    cubes: list
+        list of cubes to be used in multimodel stat computation;
+    statistics: list
+        statistical measures to be computed. Choose from the operators
+        listed in the iris.analysis package.
+
+    Returns
+    -------
+    dict
+        dictionary of statistics cubes with statistics' names as keys.
+    """
+    operators = vars(iris.analysis)
+
+    cubes = copy.deepcopy(cubes)
+
+    for i, cube in enumerate(cubes):
+        concat_dim = iris.coords.AuxCoord(i, var_name='ens')
+        cube.add_aux_coord(concat_dim)
+
+    equalise_attributes(cubes)
+
+    cubes = iris.cube.CubeList(cubes)
+    cube = cubes.merge_cube()
+
+    statistics_cubes = {}
+    for statistic in statistics:
+        try:
+            operator = operators[statistic.upper()]
+        except KeyError as err:
+            raise ValueError(
+                f'Statistic `{statistic}` not supported by '
+                '`multicube_statistics_iris`. Choose a supported operator '
+                'from the `iris.analysis` package.') from err
+
+        # this will always return a masked array
+        statistic_cube = cube.collapsed('ens', operator)
+        statistics_cubes[statistic] = statistic_cube
+
+    return statistics_cubes
-        if output_products:
-            # Add to output product and log provenance
-            statistic_product = output_products[statistic]
-            statistic_product.cubes = [statistic_cube]
-            for product in products:
-                statistic_product.wasderivedfrom(product)
-            logger.info("Generated %s", statistic_product)
-            statistic_products.add(statistic_product)
-        else:
-            statistic_products[statistic] = statistic_cube
-    if output_products:
-        products |= statistic_products
-        return products
-    return statistic_products
+def _multiproduct_statistics(products,
+                             statistics,
+                             output_products,
+                             span='overlap',
+                             engine='esmvalcore'):
+    """Compute statistics across ESMValCore products.
+ + Extract cubes from products, calculate the statistics across cubes and + assign the resulting output cubes to the output_products. + + This function separates the ESMValCore internals (products) from the actual + statistics function that operates on Iris cubes. + + Parameters + ---------- + products: list + list of PreprocessorFile's + statistics: list + list of strings describing the statistics that will be computed + output_products: dict + dict of PreprocessorFile's with statistic names as keys. + span: str + 'full' or 'overlap', whether to calculate the statistics on the time + union ('full') or time intersection ('overlap') of the cubes. + engine: str + 'iris' or 'esmvalcore', which function to use to compute the + statistics. Iris is more efficient and exposes more operators, + but requires highly homogeneous/compatible cubes. + + Returns + ------- + set + set of PreprocessorFiles containing the computed statistics cubes. + """ + if engine == 'iris': + aggregator = multicube_statistics_iris + else: + aggregator = partial(multicube_statistics, span=span) + + # Extract cubes from products + cubes = [cube for product in products for cube in product.cubes] + + # Compute statistics + if len(cubes) < 2: + logger.info('Found only 1 cube; no statistics computed for %r', + list(products)[0]) + statistics_cubes = {statistic: cubes[0] for statistic in statistics} + else: + statistics_cubes = aggregator(cubes=cubes, statistics=statistics) + + # Add cubes to output products and log provenance + statistics_products = set() + for statistic, cube in statistics_cubes.items(): + statistics_product = output_products[statistic] + statistics_product.cubes = [cube] + + for product in products: + statistics_product.wasderivedfrom(product) + + logger.info("Generated %s", statistics_product) + statistics_products.add(statistics_product) + + return statistics_products + + +def multi_model_statistics(products: set, + statistics: list, + output_products: set, + span: str = 
'overlap', + engine: str = 'esmvalcore'): + """Entry point for ESMValCore's multi-model statistics preprocessor. + + Compute statistics across cubes from different models. + + See Also + -------- + multicube_statistics : core statistics function. + """ + statistics_products = _multiproduct_statistics( + products=products, + statistics=statistics, + output_products=output_products, + span=span, + engine=engine, + ) + + return statistics_products diff --git a/tests/sample_data/multimodel_statistics/test_multimodel.py b/tests/sample_data/multimodel_statistics/test_multimodel.py index a527a9b913..9f6551675a 100644 --- a/tests/sample_data/multimodel_statistics/test_multimodel.py +++ b/tests/sample_data/multimodel_statistics/test_multimodel.py @@ -1,6 +1,7 @@ """Test using sample data for :func:`esmvalcore.preprocessor._multimodel`.""" import pickle +from functools import partial from itertools import groupby from pathlib import Path @@ -8,7 +9,11 @@ import numpy as np import pytest -from esmvalcore.preprocessor import extract_time, multi_model_statistics +from esmvalcore.preprocessor import extract_time +from esmvalcore.preprocessor._multimodel import ( + multicube_statistics, + multicube_statistics_iris, +) esmvaltool_sample_data = pytest.importorskip("esmvaltool_sample_data") @@ -28,6 +33,11 @@ SPAN_PARAMS = ('overlap', 'full') +ENGINE_PARAMS = ( + 'esmvalcore', + pytest.param('iris', marks=pytest.mark.xfail), +) + def assert_array_almost_equal(this, other): """Assert that array `this` almost equals array `other`.""" @@ -118,18 +128,23 @@ def calendar(cube): return cube_dict -def multimodel_test(cubes, span, statistic): +def multimodel_test(cubes, statistic, span, engine=None): """Run multimodel test with some simple checks.""" statistics = [statistic] - result = multi_model_statistics(cubes, span=span, statistics=statistics) + if engine == 'iris': + aggregator = multicube_statistics_iris + else: + aggregator = partial(multicube_statistics, span=span) + + result = 
aggregator(cubes, statistics=statistics) assert isinstance(result, dict) assert statistic in result return result -def multimodel_regression_test(cubes, span, name): +def multimodel_regression_test(cubes, span, name, engine): """Run multimodel regression test. This test will fail if the input data or multimodel code changed. To @@ -139,7 +154,10 @@ def multimodel_regression_test(cubes, span, name): are being written. """ statistic = 'mean' - result = multimodel_test(cubes, span=span, statistic=statistic) + result = multimodel_test(cubes, + statistic=statistic, + span=span, + engine=engine) result_cube = result[statistic] filename = Path(__file__).with_name(f'{name}-{span}-{statistic}.nc') @@ -164,8 +182,9 @@ def multimodel_regression_test(cubes, span, name): @pytest.mark.use_sample_data +@pytest.mark.parametrize('engine', ENGINE_PARAMS) @pytest.mark.parametrize('span', SPAN_PARAMS) -def test_multimodel_regression_month(timeseries_cubes_month, span): +def test_multimodel_regression_month(timeseries_cubes_month, span, engine): """Test statistic.""" cubes = timeseries_cubes_month name = 'timeseries_monthly' @@ -173,13 +192,16 @@ def test_multimodel_regression_month(timeseries_cubes_month, span): name=name, span=span, cubes=cubes, + engine=engine, ) @pytest.mark.use_sample_data +@pytest.mark.parametrize('engine', ENGINE_PARAMS) @pytest.mark.parametrize('calendar', CALENDAR_PARAMS) @pytest.mark.parametrize('span', SPAN_PARAMS) -def test_multimodel_regression_day(timeseries_cubes_day, span, calendar): +def test_multimodel_regression_day(timeseries_cubes_day, span, calendar, + engine): """Test statistic.""" cubes = timeseries_cubes_day[calendar] name = f'timeseries_daily_{calendar}' @@ -187,6 +209,7 @@ def test_multimodel_regression_day(timeseries_cubes_day, span, calendar): name=name, span=span, cubes=cubes, + engine=engine, ) diff --git a/tests/unit/preprocessor/_multimodel/test_multimodel.py b/tests/unit/preprocessor/_multimodel/test_multimodel.py index 
42b28a9525..fa17ddf4d1 100644 --- a/tests/unit/preprocessor/_multimodel/test_multimodel.py +++ b/tests/unit/preprocessor/_multimodel/test_multimodel.py @@ -4,20 +4,24 @@ import iris import numpy as np +import pytest from cf_units import Unit import tests -from esmvalcore.preprocessor import multi_model_statistics -from esmvalcore.preprocessor._multimodel import (_assemble_data, - _compute_statistic, - _get_time_slice, _plev_fix, - _put_in_cube, - _unify_time_coordinates) +from esmvalcore.preprocessor._multimodel import ( + _assemble_data, + _compute_statistic, + _get_time_slice, + _plev_fix, + _put_in_cube, + _unify_time_coordinates, + multicube_statistics, + multicube_statistics_iris, +) class Test(tests.Test): """Test class for preprocessor/_multimodel.py.""" - def setUp(self): """Prepare tests.""" # Make various time arrays @@ -92,7 +96,9 @@ def test_compute_statistic(self): def test_compute_full_statistic_mon_cube(self): data = [self.cube1, self.cube2] - stats = multi_model_statistics(data, 'full', ['mean']) + stats = multicube_statistics(cubes=data, + statistics=['mean'], + span='full') expected_full_mean = np.ma.ones((5, 3, 2, 2)) expected_full_mean.mask = np.ones((5, 3, 2, 2)) expected_full_mean.mask[1] = False @@ -100,7 +106,9 @@ def test_compute_full_statistic_mon_cube(self): def test_compute_full_statistic_yr_cube(self): data = [self.cube4, self.cube5] - stats = multi_model_statistics(data, 'full', ['mean']) + stats = multicube_statistics(cubes=data, + statistics=['mean'], + span='full') expected_full_mean = np.ma.ones((4, 3, 2, 2)) expected_full_mean.mask = np.zeros((4, 3, 2, 2)) expected_full_mean.mask[2:4] = True @@ -108,16 +116,43 @@ def test_compute_full_statistic_yr_cube(self): def test_compute_overlap_statistic_mon_cube(self): data = [self.cube1, self.cube1] - stats = multi_model_statistics(data, 'overlap', ['mean']) + stats = multicube_statistics(cubes=data, + statistics=['mean'], + span='overlap') expected_ovlap_mean = np.ma.ones((2, 3, 2, 2)) 
         self.assert_array_equal(stats['mean'].data, expected_ovlap_mean)
 
     def test_compute_overlap_statistic_yr_cube(self):
         data = [self.cube4, self.cube4]
-        stats = multi_model_statistics(data, 'overlap', ['mean'])
+        stats = multicube_statistics(cubes=data,
+                                     statistics=['mean'],
+                                     span='overlap')
         expected_ovlap_mean = np.ma.ones((2, 3, 2, 2))
         self.assert_array_equal(stats['mean'].data, expected_ovlap_mean)
 
+    def test_multicube_statistics_fail(self):
+        data = [self.cube1, self.cube1 * 2.0]
+        with pytest.raises(ValueError):
+            multicube_statistics(data,
+                                 span='overlap',
+                                 statistics=['non-existent'])
+
+    def test_multicube_statistics_iris(self):
+        data = [self.cube1, self.cube1 * 2.0]
+        statistics = ['mean', 'min', 'max']
+        stats = multicube_statistics_iris(data, statistics=statistics)
+        expected_mean = np.ma.ones((2, 3, 2, 2)) * 1.5
+        expected_min = np.ma.ones((2, 3, 2, 2)) * 1.0
+        expected_max = np.ma.ones((2, 3, 2, 2)) * 2.0
+        self.assert_array_equal(stats['mean'].data, expected_mean)
+        self.assert_array_equal(stats['min'].data, expected_min)
+        self.assert_array_equal(stats['max'].data, expected_max)
+
+    def test_multicube_statistics_iris_fail(self):
+        data = [self.cube1, self.cube1 * 2.0]
+        with pytest.raises(ValueError):
+            multicube_statistics_iris(data, statistics=['non-existent'])
+
     def test_compute_std(self):
         """Test statistic."""
         data = [self.cube1.data[0], self.cube2.data[0] * 2]
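
Reviewer note: the `span` semantics that this patch documents (intersection of time points for `span='overlap'`, union for `span='full'` with missing points ignored) can be illustrated without iris at all. The sketch below is not ESMValCore code; the dict-of-series representation and the function names `span_times` and `multiseries_mean` are invented purely for illustration.

```python
from statistics import mean


def span_times(series, span):
    """Return the time points to evaluate: the intersection of all models'
    time axes for 'overlap', their union for 'full'."""
    time_sets = [set(s) for s in series]
    if span == 'overlap':
        times = set.intersection(*time_sets)
    elif span == 'full':
        times = set.union(*time_sets)
    else:
        raise ValueError(f"span must be 'overlap' or 'full', got {span!r}")
    return sorted(times)


def multiseries_mean(series, span='overlap'):
    """Mean across models at each time point; with span='full', models
    lacking a time point are simply left out (mimicking masked values)."""
    return {
        t: mean(s[t] for s in series if t in s)
        for t in span_times(series, span)
    }


# Two toy "models" with partly overlapping time axes
model_a = {2000: 1.0, 2001: 2.0, 2002: 3.0}
model_b = {2001: 4.0, 2002: 5.0, 2003: 6.0}

# Only the shared years 2001 and 2002 contribute with span='overlap';
# with span='full', 2000 and 2003 fall back to the single available model.
print(multiseries_mean([model_a, model_b], span='overlap'))
print(multiseries_mean([model_a, model_b], span='full'))
```

The real preprocessor additionally has to unify calendars and rebuild cube metadata, which is exactly the non-lazy work `multicube_statistics` performs.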