Refactor multi-model statistics code to facilitate ensemble stats and lazy evaluation#949
bouweandela merged 9 commits into master
Conversation
Separate the ESMValCore internals, dealing with products, from the core function operating on cubes. This makes it easier to add a new preprocessor for ensemble statistics, and also to make a new core function (lazily) compute statistics across cubes.
Nice work @Peter9192 , all tests are passing, so it looks good to me! Separating how products are processed from cubes will make the code much easier to work with.
Cheers @Peter9192 for this! I have yet to read the changes to the documentation. The functionality seems to be nicely restructured, but I am not quite happy with the docstring changes: they have become too technical, and I don't think that is a good idea given who we cater for 👍 EDIT: I have now gone through the documentation changes and I like 'em, job well done there, but I am still confused as to why multimodel -> multi-model 🍺
Thanks @valeriupredoi I'll look into your suggestions! I appreciate the point about the audience for the docstrings.
return statistics_products

def multi_model_statistics(products: set,
Note that this function is part of the public API. Changing it so that it no longer accepts cubes is a breaking change that will probably break diagnostic code. Would it be possible to make it work again so that if you put cubes in, you get cubes out? And of course the docstring needs to be good, as @valeriupredoi already pointed out.
shucks! I totally missed that - very good point @bouweandela - yes, @Peter9192 please see this and revise the approach 👍
Yes I realized this as well but I'm struggling to find an appropriate solution.
@Peter9192 Didn't we talk about making multi_model_statistics cube-only (so as not to mess with the public API), simply passing its arguments to multi_cube_statistics, and then using _multi_model_statistics with products for internal use? Or am I confusing two PRs now?
After a lot of contemplating I have accepted that the public API function will have to accept both products and cubes as possible inputs. I would rather have hidden those products from the public altogether, but this becomes very ugly very quickly. So I have now restored the public multi_model_statistics as the single entry point for both multi-cube and multi-product statistics; it dispatches to a different internal function depending on its input type. This is to maintain the alignment between the recipe API and the public preprocessor documentation.
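The single-entry-point approach described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: plain tuples or lists of floats stand in for iris cubes, `Product` is a hypothetical stand-in for an ESMValCore product, and the helper names `_multicube_statistics` and `_multiproduct_statistics` are assumptions chosen for the example.

```python
from dataclasses import dataclass
from statistics import mean, median


# Stand-ins for illustration only: a "cube" is a tuple/list of floats,
# and Product is a hypothetical wrapper, not ESMValCore's product class.
@dataclass(frozen=True)
class Product:
    name: str
    cube: tuple


_OPERATORS = {"mean": mean, "median": median}


def _multicube_statistics(cubes, statistics):
    """Core function: element-wise statistics across equally shaped cubes."""
    return {
        stat: [_OPERATORS[stat](values) for values in zip(*cubes)]
        for stat in statistics
    }


def _multiproduct_statistics(products, statistics):
    """Product layer: unwrap cubes, call the core, wrap the results again."""
    ordered = sorted(products, key=lambda product: product.name)
    results = _multicube_statistics([p.cube for p in ordered], statistics)
    return {
        stat: Product(name=f"MultiModel{stat.title()}", cube=tuple(values))
        for stat, values in results.items()
    }


def multi_model_statistics(inputs, statistics):
    """Public entry point: cubes in, cubes out; products in, products out."""
    inputs = list(inputs)
    if all(isinstance(item, Product) for item in inputs):
        return _multiproduct_statistics(inputs, statistics)
    if all(isinstance(item, (list, tuple)) for item in inputs):
        return _multicube_statistics(inputs, statistics)
    raise ValueError("Inputs must be all cubes or all products, not a mix.")
```

With this shape, diagnostic code that passes cubes keeps getting cubes back, while the preprocessor can keep passing products through the same public name.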
All green! @valeriupredoi I think I got everything covered, could you please have another look? Thanks 🍻
valeriupredoi left a comment
looks good, development tests pass too (not cached ones on Circle, actual branch and ESMValTool dev ones), cheers @Peter9192 🍺 @bouweandela did you want to have another look yerself?
bouweandela left a comment
Looks good to me, much cleaner. Just a minor comment on the unit tests.
Now the test fails because of an iris docs change or something like that 😞
That should be fixed by #964
Description
This PR refactors the multimodel statistics code. Specifically, it separates the core functionality (computing statistics across cubes) from the intermediate layer that deals with the handling of ESMValCore products.
This facilitates re-use of the core function in, for example, an ensemble statistics function (see #673), and it also makes it easier to make the core functionality lazy (see #781 and #950).
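As a rough illustration of the re-use mentioned above, an ensemble statistic can be built by grouping members per dataset and calling the same core function on each group. This is only a sketch under stated assumptions: lists of floats stand in for cubes, and `multicube_mean` and `ensemble_mean` are hypothetical names, not functions from ESMValCore.

```python
from collections import defaultdict
from statistics import mean


def multicube_mean(cubes):
    """Hypothetical core function: element-wise mean across equally shaped cubes."""
    return [mean(values) for values in zip(*cubes)]


def ensemble_mean(members):
    """Group (dataset, cube) pairs by dataset and reuse the core per group."""
    groups = defaultdict(list)
    for dataset, cube in members:
        groups[dataset].append(cube)
    return {dataset: multicube_mean(cubes) for dataset, cubes in groups.items()}


# Two ensemble members for ModelA, one for ModelB (made-up values).
members = [
    ("ModelA", [1.0, 2.0]),
    ("ModelA", [3.0, 4.0]),
    ("ModelB", [10.0, 20.0]),
]
result = ensemble_mean(members)
```

Because the core function only sees cubes, the grouping logic needs no knowledge of how products are handled.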