
Revised grib load+save #482

Merged
bblay merged 1 commit into SciTools:master from pp-mo:grib_translation on May 24, 2013

Conversation

@pp-mo
Member

@pp-mo pp-mo commented May 1, 2013

Extra grib level-types translation, and grib <-> cf phenomenon translation.

These changes were originally driven by a requirement to form monthly averages from the ecmwf ERA climate series, input from Grib1 and output as Grib2.

Summary of changes:

  • Grib2<-->CF parameter translation now uses data from the "metadata translation project" (file: fileformats/grib/grib_cf_map.py), though the current version included has manual edits
  • Some Grib2 output now equates a "surface" type level with a "height=0.0" scalar coordinate (generated on input).
  • Various new tests written and older ones enhanced

Note: also includes an updated etc/cf-standard-name-table.xml

Reviewer: cpelley

@pp-mo
Member Author

pp-mo commented May 1, 2013

Details of existing behaviour + changes.

GRIB1 Loading:

  • Phenomenon recognition:
  • NOCHANGE GRIB1 standard table params (retained, but there are only 3)
  • NOCHANGE unrecognised standard-table entries are encoded in long_name
  • NOCHANGE unrecognised local-table entries encoded in long_name
  • ADDED specific ecmwf local-table codes are handled
  • NOCHANGE no additional GRIB1 standard table codes are supported
    - -- as existing grib_cf_map contains no such entries
    - -- (to be added in future)
  • Level handling:
    • ADDED now adds "height=0.0" for levelType=1, except for specific recognised cases
    • ADDED now recognises certain ecmwf local-table cases
      • -- some of these define an extra height coord
        (where the phenom "contains" a height, e.g. '2-metre temperature')
      • -- others do not, and these result in no associated height=0 coord (unlike unrecognised cases)
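The load-side decision described above can be sketched roughly as follows. This is a hedged illustration only, not the actual Iris rule code; the function and argument names are invented, and the '2-metre temperature' height value is just the example from the text.

```python
def implied_height(level_type, is_recognised_local, phenom_height=None):
    """Return an implied scalar 'height' value (metres), or None."""
    if level_type != 1:
        return None  # only levelType=1 ('surface') is special-cased here
    if not is_recognised_local:
        return 0.0  # unrecognised surface fields now get height=0.0
    # Recognised ecmwf local-table cases: some phenomena "contain" a height
    # (e.g. '2-metre temperature' implies 2.0m); others get no height coord.
    return phenom_height
```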

GRIB2 Loading:

  • Phenomenon recognition:
  • ADDED some grib2->cf phenomenon translations : to standard_name+unit OR long_name+unit
    - -- this replaces all the previous specific-rules ones
    - -- not all of these translate with identical names (or units) as from the existing code
  • Level handling:
  • NOCHANGE

GRIB2 Saving:

  • Phenomenon recognition:
  • ADDED recognised CF->Grib2 translations are supported (not many, yet)
    - -- sets discipline/category/number
    - -- converts data to appropriate reference unit
  • Template handling:
  • ADDED recognises a slightly broader set of cell-methods as time-based processes
    (to accommodate monthly averaging)
  • Level handling:
  • ADDED recognises cases with no vertical coord whatsoever, and records as levelType=1 ('surface')
    - (previously was only possible to save by explicitly adding a "height=0" coord)

@pp-mo
Member Author

pp-mo commented May 1, 2013

Possible immediate practical problems:

  • Some existing Grib2 phenomena may now read with a different standard name + unit.
  • The interpretation of levelType=1 as "no vertical coordinate" for Grib1 input will probably not generalise to
    other cases: some should presumably get "height=0.0", and some no vertical coord at all (as here).
  • Similarly, the grib2 output conversion of "no vertical coordinate --> surface_level" might not always be appropriate.

@pp-mo
Member Author

pp-mo commented May 1, 2013

Unresolved:
We need to be more certain that encoding 'levelType="surface"' to effectively mean 'no associated surface' is a sensible approach, especially in Grib2
(i.e. for data which are not measured "at" any particular level, such as "cloud_fraction").

Note: this levelType is distinct from height above surface (land or sea).

This does seem to be the case for the data that I have examined (both Grib1 and Grib2).
This issue is to be discussed with experts at ECMWF; if necessary, we can change the interpretation later.

@esc24
Member

esc24 commented May 1, 2013

Please fix the test failures. Looks like missing or out-of-date license headers.

@pp-mo
Member Author

pp-mo commented May 1, 2013

PLEASE NOTE Trying stuff. This presumably won't work, as I need to fix headers in new code.
Don't really grasp why it failed on older ones (maybe because I'd moved them??)

Contributor

We must not make something up that could misrepresent the data.
If it's missing from the cube it should be set to missing in the grib message.

Member Author

We must not make something up that could misrepresent the data.
If it's missing from the cube it should be set to missing in the grib message.

Having looked into the issue, and scanned available files, I don't believe it is this clear-cut.
I have not seen 'missing' leveltype in any data we have, where you might expect to.
We are consulting with ECMWF for additional guidance on this.

Contributor

(or fail)

Contributor

Why is this code needed anyway? Why can't it just fail as it used to?

Member

I don't see how this code can 'just fail'. A cube with no 'Z'-like coordinates is a reasonable cube, I think. I don't see where we or anyone else has put forward that it is not.

I think the question is how to encode a Cube which has no 'z'-like coordinate in GRIB2.

I support Patrick's suggestion to consult other users on this; the ECMWF seem as good a candidate as any.

The use of the 'missing' value (255) seems valid to me, although I haven't seen it in use much. I am interested as to whether the use of 1 as a default is common practice.

What does the 2nd fixed surface get encoded as if it is not in use?

Member

What does the 2nd fixed surface get encoded as if it is not in use?

the code (line 307) seems to set this to -1.

I cannot find where this is documented in the GRIB2 specification, either in 4.x templates or in code table 4.5

how is -1 interpreted?

Contributor

@marqh, this is a saving, not a loading issue. Failing here does not denote an unreasonable cube, but an unreasonable request to save to grib2 (should we not choose the 'missing' option also discussed here).

Regarding the alternative, 'missing' option, I suggest it doesn't matter if it's common practice to set a default level type. It's never ok to make up a level type unless it's a universal practice (even then I'd want to see that written into the spec). I want us to aim for correct practice. I believe the only correct options are failure to save, or setting the level type to 'missing'.

If someone saves an upper air_temperature cube to grib2, and forgets to set the vertical coord, we must not write out a surface temperature cube, I'm sure you agree.

Failing is the current behaviour. I asked about why we can't fail because I want to know what has changed in our requirements that means we have to change the current behaviour.

2nd level type = 255 for non-bounded layers.

Contributor

how is -1 interpreted?

That's just 255; they're the same thing in unsigned 8-bit.
The code should probably use 255 instead, as it's clearer.
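The equivalence can be checked directly in Python (an illustration only, not Iris code): masking -1 to eight bits yields 255.

```python
# -1 and 255 share the same bit pattern in an unsigned 8-bit field,
# which is why the two spellings encode identically.
as_unsigned_byte = -1 & 0xFF
print(as_unsigned_byte)  # 255
```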

Member

Failing here does not denote an unreasonable cube, but an unreasonable request to save to grib2 (should we not choose the 'missing' option also discussed here).

The cube is valid, and there is a valid way to save it as a GRIB2 message. I'm not sure exactly what the appropriate encoding is, but we will get to that; I find it nearly impossible to believe that failing at this point is the correct response to the 'save' request.

Failing is the current behaviour. I asked about why we can't fail because I want to know what has changed in our requirements that means we have to change the current behaviour.

I think this behaviour is wrong and needs to change. We have a clear use case for data where the phenomenon is defined but there is no 'z' type coordinate on the cube; this data is required to be exported as GRIB2 messages.

I feel we should find the correct encoding for this, then implement it appropriately. Patrick's approach looks valid, but a further opinion on the encoding would be of value here. I have requested feedback from our colleagues at the ECMWF and I will report back on any insights they can provide.

@ghost ghost assigned bblay May 1, 2013

I'm not sure you can safely interpret the phenomena without taking into account the local tables:

  • octet 11: localTablesVersion
  • octets 8-9: subCentre
  • octets 6-7: centre

In addition, even if there is no local table, it would be nice to interpret the origin of the data (assuming that the centre and subcentre determine this if no local parameter table number is present).

For the local table (octet 11), 0 indicates not used and 255 indicates missing (I'm assuming 255 would mean we could not interpret the phenomena, but this will need checking).

For example:

  octets 6-7: centre = 74 [U.K. Met Office - Exeter (grib1/0.table)]
  octet 10: tablesVersion = 4 [Version implemented on 7 November 2007 (grib2/tables/1.0.table)]


scratch the above; discussion with @marqh has clarified the situation:
parameter numbers 0-20 are common no matter what table

Contributor

I'm not sure you can safely interpret the phenomena without taking into account the local tables

That is correct. If the master tables version is 255 we've got a local parameter code as defined by the originating centre. Almost certainly, all the parameters will be different in this case.

Local parameter tables can also define those entries in the master table from 192 to 254 which are "Reserved for local use", i.e. when the master tables version is not 255 and 192 <= param <= 254.

If we wish to handle local param codes, we need to take into account:

  • master tables version
  • local tables version
  • originating centre (but not subcentre)
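The classification just described can be sketched as a small helper (illustrative only; the function name is invented, and this ignores the missing value 255 and the local/centre versions listed above):

```python
def param_scope(master_tables_version, param_number):
    """Classify a GRIB2 parameter number as 'standard' or 'local'."""
    if master_tables_version == 255:
        return 'local'  # only local tables in use: entire range is centre-defined
    if 192 <= param_number <= 254:
        return 'local'  # "Reserved for local use" range of the master table
    return 'standard'
```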

Note: The grib spec "strongly discourages" exchange of such data between centres. One could reasonably argue that by telling Iris about params from different centres we are actively encouraging such exchange, and would be in conflict with the grib spec.

However, in a modern, collaborative environment we can also reasonably expect centres to exchange experimental data that relies on local tables. In this case, and in the case of internal exchange, we certainly need to be able to understand local param codes from at least one centre.

Perhaps we can separate the capability (which must be in the Iris core) from the details (which need not). One such solution might see the user activating local param handling for one or more specific centres:

iris.fileformats.grib.allow_local_codes("ecmwf", "ukmo")
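A hypothetical sketch of that opt-in mechanism; the function name allow_local_codes and the centre identifiers come from the suggestion above and do not exist in Iris:

```python
# Module-level registry of centres whose local parameter tables we accept.
_LOCAL_CODE_CENTRES = set()

def allow_local_codes(*centres):
    """Activate local parameter-table handling for the named centres."""
    _LOCAL_CODE_CENTRES.update(centres)

def local_codes_active(centre):
    """True if local param codes from this centre should be interpreted."""
    return centre in _LOCAL_CODE_CENTRES
```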

Member Author

Thanks all @cpelley @marqh @bblay : Well spotted + great work, guys!

I really thought this could be an important gotcha, but I have discussed with @marqh and he believes the existing route is adequate, based on the stated evolution mechanism for Grib parameter codes:

  • parameters are not allowed to change with master version: you may get new ones; old ones can be deprecated but will not be removed or re-assigned
  • local-table parameters, while still supported, are given a different range of valid codes -- exactly so that they can never collide with standard ones

So basically, if a particular parameter within the standard range has any reasonable meaning, it can only be one thing, and that will never change.

It is true that the current strategy doesn't support local tables, but these are (intentionally) very little used now. We could add this if required in future, but it doesn't have to fit into the existing mechanism, as long as they can't be confused with standard ones.

The only case that may escape this logic seems to be where @bblay says :

If the master tables version is 255 we've got a local parameter code as defined by the originating centre

The key statement in the spec (Section 1 documentation) seems to be --

(2) If octet 10 contains 255 then only Local tables are in use, the Local table version number (octet 11) must not be zero nor
missing, and Local tables may include entries from the entire range of the tables.

If that is the case, we can include a test to rule this out within grib spec lookup.
@bblay, can you expand on the detail here; are we sure this is needed?

Contributor

are we sure this is needed

Certainly not. That discussion has not taken place. I simply chipped in to say Carwyn was right; we will misinterpret messages using local parameter codes if we don't take the tables versions into account.
(Whilst also pointing out that collaborations will likely require this.)

@bblay
Contributor

bblay commented May 14, 2013

An offline conversation revealed the data file can/has been edited. Please remove the "do not edit" notification at the top. Also, please put this stuff into experimental.

Contributor

I don't think we want "Class" in any of these type names.

Contributor

From what I can tell, this makes the code harder to read than if we just used tuples:

grib1_key = _Grib1ToCfKeyClass(table2_version, centre_number, param_number)
... <code omitted> ...
cf_data = _GribToCfDataClass(standard_name=cfdata.standard_name,
                             long_name=cfdata.long_name,
                             units=iris_units,
                             set_height=height)
... <code omitted> ...
_GRIB1_CF_TABLE[grib1_key] = cf_data

vs

... <code omitted> ...
_GRIB1_CF_TABLE[(table2_version, centre_number, param_number)] = (
    cfdata.standard_name, cfdata.long_name, iris_units, height)

Member Author

I believe the benefit is to abstract away the construction details of the table keys and values by making their components named properties. Using named tuples ensures that the code writing the tables and the code using them for lookup share a consistent definition of the attributes and their ordering (especially for the keys), and allows us to specify data components by name instead of index order.
For example, to look up a value in the dictionary + use the result, you can write...

data = dict[_keytype(discipline=d, category=c, number=n)]
name = data.standard_name

...instead of...

data = dict[(d, c, n)]
name = data[2]

...which is certainly shorter, but more cryptic and fragile.
When we need to change or extend the table contents (very likely), this should also catch any code that hasn't been updated.
Otherwise, we can easily get format mismatches between table entries, or between the table keys and lookup keys.
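As a runnable sketch of the pattern being argued for (the names here mirror those discussed in the diff but are assumptions, not the actual table code):

```python
from collections import namedtuple

# Named key and value types keep table construction and lookup consistent.
_Grib2Key = namedtuple('_Grib2Key', 'discipline category number')
_CfData = namedtuple('_CfData', 'standard_name long_name units set_height')

table = {
    _Grib2Key(discipline=0, category=0, number=0):
        _CfData('air_temperature', None, 'K', None),
}

# Lookup by named components, and access the result fields by name:
data = table[_Grib2Key(discipline=0, category=0, number=0)]
print(data.standard_name)  # air_temperature
```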

Contributor

@pp-mo, you are technically correct but it's overkill.
It won't be cryptic and we're clever enough to handle changes.
However, I won't push for this any further as you believe this is better.

@bblay
Contributor

bblay commented May 14, 2013

An offline conversation revealed that some of the code is merely present to make the sample export look the way we need it. Please add a comment indicating which parts of the code are rearranging the data like this. (Apologies if it's already there and I didn't spot it yet).

Contributor

Shouldn't this be using the params table2_version, centre_number, param_number?

Member Author

Whoops, yes. Clearly a legacy from separating this into a separate function (which we may now be changing back..)
Ta @bblay !

@pp-mo
Member Author

pp-mo commented May 14, 2013

@bblay An offline conversation revealed the data file can/has been edited. Please remove the "do not edit" notification at the top...

I have edited the file header to clarify.

@bblay ...Also, please put this stuff in to experimental.

I don't believe this belongs in the experimental category, because:
(a) it changes the existing behaviour and adds new behaviour which you can't 'turn off';
(b) the core code now depends on it.
If you can't evaluate something as a proposal or alternative, I don't think you can reasonably call it "experimental".
So to make sense of this, I think I'd have to make it switchable, so you could disable it if required, reverting to the older methods. Which is more work than I think it warrants.

@bblay
Contributor

bblay commented May 14, 2013

I don't believe this belongs in the experimental category

Ok, but it is experimental. It's a pretend export from an unreleased product using a format that is likely to change. I won't push this; your call.

Contributor

This comment suggests that grib._cf_data holds "specific ecmwf-local-table params", but that's not true: it can cover params from any centre, or even the international tables, can't it?

@pp-mo
Member Author

pp-mo commented May 14, 2013

Thanks @bblay.
Please review again, I believe I've now addressed everything to date.

Contributor

To keep this function simple and consistent we should just set v_coord = None here and deal with the encoding at the bottom, where the rest of the encoding stuff lives.

@cpelley

cpelley commented May 24, 2013

thank you very much for your clarification @bblay!
from our discussion earlier, might I suggest a switch between the proposed behaviour and the more technically safe behaviour, i.e. between a level type of 1 (switch on) and 255 (switch off)? (much like what we have with the grib hindcast situation)

@bblay
Contributor

bblay commented May 24, 2013

might I suggest then a switch

I strongly support the proposal for a switch, something like: grib_save_rules.invent_surface_level.
I suggest we should never make stuff up as default behaviour, so this should be False by default.

@cpelley

cpelley commented May 24, 2013

On speaking with @marqh and @bblay I'm happy to have the above suggestion with a switch implemented as a near-future PR.

@cpelley

cpelley commented May 24, 2013

The changes I have suggested above are minor and do not affect the validity of the data generated; I therefore also suggest that these minor points be addressed in a future PR.

@bblay
Contributor

bblay commented May 24, 2013

@cpelley, please make the pr with the switch and 255 issues in it, init.

@bblay
Contributor

bblay commented May 24, 2013

ok @pp-mo, i think we're ready to squash now?

Contributor

There is a mismatch between the name _cf_data and the comment "ECMWF GRIB1 local params"

@bblay
Contributor

bblay commented May 24, 2013

ok, re-reviewed and ready to merge once squashed.
all actions since "i think we're ready to squash now?" can be moved to a future pr.

would just like quick answers to the two "why" questions above

@bblay
Contributor

bblay commented May 24, 2013

@pp-mo, you have been very patient in this contentious PR, trying to please all sides. Well done and apologies for being a "side" (or a pain in one!).

@bblay bblay mentioned this pull request May 24, 2013
@bblay
Contributor

bblay commented May 24, 2013

I'm merging, but am unhappy with the following:

  • We're using an (edited) export from metarelate.
    We should consider having a caching metarelate importer instead,
    so we're not looking at the information from metarelate's perspective.
    It's not desirable to jump through these lookup hoops as we do.
  • We're inventing metadata on export when there's no vertical coord.
    We're doing this because some other software system falls over otherwise.
    We should be setting the vertical information to "missing".

Also, follow-up work is required in #525.

bblay added a commit that referenced this pull request May 24, 2013
@bblay bblay merged commit 3c709da into SciTools:master May 24, 2013
Member

I don't believe we do this, plus itpp doesn't correspond to a github id.

@bblay
Contributor

bblay commented May 31, 2013

I'd have expected the review to have at least commented on this.

Quite right. I should have spotted this.

Because there was a rush to get this in, I had to merge it as it was, so I opened a new ticket for continued review actions: #525. (Perhaps there's a lesson about rushing a merge here.)

However, something else to consider:
The author explained to me this is a placeholder mapping solution, which only exists because the metarelate export is not in a form that is directly usable by Iris, and is expected to change.

(I tried to relay his observation in my merge message but I think it came across as a complaint, sorry)

So, the format of the metarelate export (or import, if that happens instead) is intended to change into something we can use directly, and therefore this whole "lookup hoops" code is intended to be trashed anyway (but was a necessary step in getting the basic functionality merged in a hurry).

@pp-mo, I trust this is accurate? Please correct if not.

@pp-mo
Member Author

pp-mo commented Jun 5, 2013

@pp-mo, I trust this is accurate? Please correct if not.

Not quite, I think.

Though the metarelate export is "expected to change", that doesn't mean it will someday (ever) assume an 'ideal' form for Iris' purposes.
In my view, if the Metarelate view of this information is different to that of Iris (as at present), then we're better off having our translation logic inside Iris (as at present), rather than injecting it into Metarelate.
So, the need for a 'lookup' code layer will not necessarily ever disappear.

In the meantime, the mentioned problems do need fixing: see #525 for that.

@pp-mo pp-mo mentioned this pull request Jun 14, 2013
@pp-mo pp-mo deleted the grib_translation branch January 24, 2014 11:40