Conversation
|
Details of existing behaviour + changes:
- GRIB1 Loading:
- GRIB2 Loading:
- GRIB2 Saving:
|
|
Possible immediate practical problems:
|
|
Unresolved: Note: this levelType is distinct from height above surface (land or sea). This does seem to be the case for the data that I have examined (both Grib1 and Grib2). |
|
Please fix the test failures. Looks like missing/out-of-date license headers. |
|
PLEASE NOTE: Trying stuff. This presumably won't work, as I need to fix headers in new code. |
There was a problem hiding this comment.
We must not make something up that could misrepresent the data.
If it's missing from the cube it should be set to missing in the grib message.
Having looked into the issue, and scanned available files, I don't believe it is this clear-cut.
I have not seen 'missing' leveltype in any data we have, where you might expect to.
We are consulting with ECMWF for additional guidance on this.
Why is this code needed anyway? Why can't it just fail as it used to?
I don't see how this code can 'just fail'. A cube with no 'Z'-like coordinates is a reasonable cube, I think. I don't see where we or anyone else has put forward that it is not.
I think the question is how to encode a Cube which has no 'z' like coordinate in GRIB2.
I support Patrick's suggestion to consult other users on this, the ECMWF seem as good a candidate as any.
The use of the 'missing' value, 255, seems valid to me, although I haven't seen it in use much. I am interested as to whether the use of 1 as a default is common practice.
What does the 2nd fixed surface get encoded as if it is not in use?
What does the 2nd fixed surface get encoded as if it is not in use?
The code (line 307) seems to set this to -1.
I cannot find where this is documented in the GRIB2 specification, either in the 4.x templates or in code table 4.5.
how is -1 interpreted?
@marqh, this is a saving, not a loading issue. Failing here does not denote an unreasonable cube, but an unreasonable request to save to grib2 (should we not choose the 'missing' option also discussed here).
Regarding the alternative, 'missing' option, I suggest it doesn't matter if it's common practice to set a default level type. It's never ok to make up a level type unless it's a universal practice (even then I'd want to see that written into the spec). I want us to aim for correct practice. I believe the only correct options are failure to save, or setting the level type to 'missing'.
If someone saves an upper air_temperature cube to grib2, and forgets to set the vertical coord, we must not write out a surface temperature cube, I'm sure you agree.
Failing is the current behaviour. I asked about why we can't fail because I want to know what has changed in our requirements that means we have to change the current behaviour.
2nd level type = 255 for non-bounded layers.
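To make the options under discussion concrete, here is a minimal sketch of the 'missing' encoding: a cube with no 'z'-like coordinate gets 255 for both fixed surfaces, and a non-bounded layer gets 255 for the second surface only. The helper name and dict layout are hypothetical; this is not the actual Iris save code.

```python
# Hypothetical sketch only: 255 is the GRIB2 'missing' value in code
# table 4.5, and the encoding rules below are the ones proposed in this
# thread, not the actual Iris implementation.
GRIB_MISSING = 255

def fixed_surface_codes(v_coord):
    """Return (first, second) fixed-surface type codes for a cube."""
    if v_coord is None:
        # No 'z'-like coordinate: encode both surfaces as 'missing'
        # rather than inventing a default such as 'surface' (type 1).
        return GRIB_MISSING, GRIB_MISSING
    first = v_coord["grib_type"]
    # Non-bounded layers: the second fixed surface is not in use.
    second = first if v_coord.get("bounded") else GRIB_MISSING
    return first, second

print(fixed_surface_codes(None))                # (255, 255)
print(fixed_surface_codes({"grib_type": 103}))  # (103, 255)
```

The point of the sketch is only that 'missing' never misrepresents the data, whereas a made-up default level type could.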
how is -1 interpreted?
That's just 255, they're the same thing in unsigned 8-bit.
The code should probably use 255 instead, as it's clearer.
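The equivalence is easy to check in Python: masking to eight bits maps -1 onto the 'missing' sentinel.

```python
# -1 stored in an unsigned 8-bit octet has the bit pattern 0xFF,
# which reads back as 255, the GRIB 'missing' sentinel.
print(-1 & 0xFF)   # 255
print(-1 % 256)    # 255
```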
Failing here does not denote an unreasonable cube, but an unreasonable request to save to grib2 (should we not choose the 'missing' option also discussed here).
The cube is valid and there is a valid way to save it as a GRIB2 message; I'm not sure exactly what the appropriate encoding is, but we will get to that. I find it nearly impossible to believe that failing at this point is the correct response to the 'save' request.
Failing is the current behaviour. I asked about why we can't fail because I want to know what has changed in our requirements that means we have to change the current behaviour.
I think this behaviour is wrong and needs to change. We have a clear use case for data where the phenomenon is defined but there is no 'z' type coordinate on the cube; this data is required to be exported as GRIB2 messages.
I feel we should find the correct encoding for this, then implement it appropriately. Patrick's approach looks valid, but a further opinion on the encoding would be of value here. I have requested feedback from our colleagues at the ECMWF and I will report back on any insights they can provide.
I'm not sure you can safely interpret the phenomena without taking into account the local tables:
- octet 11: localTablesVersion
- octets 8-9: subCentre
- octets 6-7: centre
In addition, even if there is no local table, it would be nice to interpret the origin of the data (assuming that the centre and subcentre determine this if no local parameter table number is present).
0 indicates not used and 255 indicates missing (I'm assuming 255 would mean we could not interpret the phenomena, but this will need checking)
the Local table (octet 11),
- octets 6-7: centre = 74 [U.K. Met Office - Exeter (grib1/0.table)]
- octet 10: tablesVersion = 4 [Version implemented on 7 November 2007 (grib2/tables/1.0.table)]
scratch the above, discussion with @marqh has clarified the situation
parameter numbers 0-20 are common no matter what table
I'm not sure you can safely interpret the phenomena without taking into account the local tables
That is correct. If the master tables version is 255 we've got a local parameter code as defined by the originating centre. Almost certainly, all the parameters will be different in this case.
Local parameter tables can also define those entries in the master table from 192 to 254 which are "Reserved for local use", i.e. when master tables version is not 255 and 192 <= param <= 254.
If we wish to handle local param codes, we need to take into account:
- master tables version
- local tables version
- originating centre (but not subcentre)
Note: The grib spec _"strongly discourages"_ exchange of such data between centres. One could reasonably argue that by telling Iris about params from different centres we are actively encouraging such exchange, and would be in conflict with the grib spec.
However, in a modern, collaborative environment we can also reasonably expect centres to exchange experimental data that relies on local tables. In this case, and in the case of internal exchange, we certainly need to be able to understand local param codes from at least one centre.
Perhaps we can separate the capability (which must be in the Iris core) from the details (which need not). One such solution might see the user activating local param handling for one or more specific centres:
iris.fileformats.grib.allow_local_codes("ecmwf", "ukmo")
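A rough sketch of how such a switch could work follows. All names here, including allow_local_codes and can_interpret, are hypothetical; this is a design sketch of the opt-in idea, not the Iris API.

```python
# Design sketch only: allow_local_codes and can_interpret are
# hypothetical names, not the Iris API.
_ALLOWED_LOCAL_CENTRES = set()

def allow_local_codes(*centres):
    """Opt in to interpreting local parameter codes for these centres."""
    _ALLOWED_LOCAL_CENTRES.update(c.lower() for c in centres)

def can_interpret(centre, master_tables_version, param_number):
    """Decide whether a parameter code can safely be interpreted."""
    # Standard params (master tables in use, code outside the
    # 192-254 'reserved for local use' range) are always fine.
    if master_tables_version != 255 and not 192 <= param_number <= 254:
        return True
    # Anything else relies on a local table, so require an opt-in.
    return centre.lower() in _ALLOWED_LOCAL_CENTRES

allow_local_codes("ecmwf", "ukmo")
print(can_interpret("ecmwf", 255, 3))   # True: local codes enabled
print(can_interpret("dwd", 255, 3))     # False: no opt-in for this centre
```

This keeps the capability in core while leaving the per-centre details to explicit user activation, as suggested above.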
Thanks all @cpelley @marqh @bblay : Well spotted + great work, guys!
I really thought this could be an important gotcha, but I have discussed with @marqh and he believes the existing route is adequate based on the stated evolution mechanism for Grib parameter codes ...
- parameters are not allowed to change with master version : you may get new ones; old ones can be deprecated but will not be removed or re-assigned
- local table parameters, while still supported, are given a different range of valid codes -- exactly so that they can never collide with standard ones
So basically, if a particular parameter within the standard range has any reasonable meaning, it can only be one thing + that will never change.
It is true that the current strategy doesn't support local tables but these are (intentionally) very little used now. We could add this if required in future, but it doesn't have to fit into the existing mechanism, as long as they can't be confused with standard ones.
The only case that may escape this logic seems to be where @bblay says :
If the master tables version is 255 we've got a local parameter code as defined by the originating centre
The key statement in the spec (Section 1 documentation) seems to be --
(2) If octet 10 contains 255 then only Local tables are in use, the Local table version number (octet 11) must not be zero nor missing, and Local tables may include entries from the entire range of the tables.
If that is the case, we can include a test to rule this out within grib spec lookup.
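Such a test could be a guard along these lines. This is a hypothetical sketch of the check described above, not actual Iris code.

```python
def check_tables_usage(master_tables_version, local_tables_version):
    """Rule out the 'local tables only' case from GRIB2 Section 1,
    note (2): octet 10 == 255 means only Local tables are in use."""
    if master_tables_version == 255:
        if local_tables_version in (0, 255):
            # The spec says the local version must not be 0 nor missing
            # in this case, so the message itself is malformed.
            raise ValueError("invalid message: local tables version must "
                             "not be 0 or missing when master version is 255")
        # A well-formed local-tables-only message, which the standard
        # parameter lookup cannot interpret.
        raise NotImplementedError("local-tables-only messages are not "
                                  "supported by the standard lookup")
```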
@bblay can you expand on the detail here, are we sure this is needed ?
are we sure this is needed
Certainly not. That discussion has not taken place. I simply chipped in to say Carwyn was right; we will misinterpret messages using local parameter codes if we don't take the tables versions into account.
(Whilst also pointing out that collaborations will likely require this.)
|
An offline conversation revealed the data file can/has been edited. Please remove the "do not edit" notification at the top. Also, please put this stuff into experimental. |
I don't think we want "Class" in any of these type names.
From what I can tell, this makes the code harder to read than if we just used tuples:
grib1_key = _Grib1ToCfKeyClass(table2_version, centre_number, param_number)
... <code omitted> ...
cf_data = _GribToCfDataClass(standard_name=cfdata.standard_name,
long_name=cfdata.long_name,
units=iris_units,
set_height=height)
... <code omitted> ...
_GRIB1_CF_TABLE[grib1_key] = cf_data
vs
... <code omitted> ...
_GRIB1_CF_TABLE[(table2_version, centre_number, param_number)] = \
    (cfdata.standard_name, cfdata.long_name, iris_units, height)
I believe the benefit is to abstract away the construction details of the table keys and values by making their components named properties. Using named tuples ensures that the code writing the tables and the code using them for lookup share a consistent definition of the attributes and their ordering (especially for the keys), and lets us refer to data components by name instead of by index order.
For example, to look up a value in the dictionary + use the result, you can write...
data = dict[_keytype(discipline=d, category=c, number=n)]
name = data.standard_name
..instead of..
data = dict[(d, c, n)]
name = data[2]
..which is certainly shorter, but more cryptic and fragile.
When we need to change or extend the table contents (very likely), this should also catch any code that hasn't been updated.
Otherwise, we can easily get format mismatches between table entries, or between the table keys and lookup keys.
@pp-mo, you are technically correct but it's overkill.
It won't be cryptic and we're clever enough to handle changes.
However, I won't push for this any further as you believe this is better.
|
An offline conversation revealed that some of the code is merely present to make the sample export look the way we need it. Please add a comment indicating which parts of the code are rearranging the data like this. (Apologies if it's already there and I didn't spot it yet). |
Shouldn't this be using the params table2_version, centre_number, param_number?
Whoops, yes. Clearly a legacy from separating this into a separate function (which we may now be changing back..)
Ta @bblay !
I have edited the file header to clarify.
I don't believe this belongs in the experimental category, because |
Ok, but it is experimental. It's a pretend export from an unreleased product using a format that is likely to change. I won't push this, your call. |
lib/iris/etc/grib_rules.txt
This comment suggests that grib._cf_data holds "specific ecmwf-local-table params", but that's not true: it can cover params from any centre, or even the international tables, can't it?
|
Thanks @bblay. |
To keep this function simple and consistent we should just set v_coord = None here and deal with the encoding at the bottom, where the rest of the encoding stuff lives.
|
thank you very much for your clarification @bblay! |
I strongly support the proposal for a switch, something like: |
|
The changes I have suggested above are minor and do not affect the validity of the data generated; I therefore also suggest that these minor points be addressed in a future PR. |
|
@cpelley, please make the pr with the switch and 255 issues in it, init. |
|
ok @pp-mo, i think we're ready to squash now? |
There is a mismatch between the name _cf_data and the comment "ECMWF GRIB1 local params"
|
ok, re-reviewed and ready to merge once squashed. would just like quick answers to the two "why" questions above |
|
@pp-mo, you have been very patient in this contentious PR, trying to please all sides. Well done and apologies for being a "side" (or a pain in one!). |
…nomenon translation from metadata project.
|
I'm merging, but am unhappy with the following:
Also, follow-up work is required in #525. |
I don't believe we do this, plus itpp doesn't correspond to a github id.
Quite right. I should have spotted this. Because there was a rush to get this in, I had to merge it as it was; I opened a new ticket, #525, for continued review actions. (Perhaps there's a lesson about rushing a merge here.) However, something else to consider: (I tried to relay his observation in my merge message, but I think it came across as a complaint, sorry.) So, the format of the metarelate export (or import, if that happens instead) is intended to change into something we can use directly, and therefore this whole "lookup hoops" code is intended to be trashed anyway (but was a necessary step in getting the basic functionality merged in a hurry). @pp-mo, I trust this is accurate? Please correct if not. |
Not quite, I think. Though the metarelate export is "expected to change", that doesn't mean it will someday (ever) assume an 'ideal' form for Iris' purpose. In the meantime, the mentioned problems do need fixing; see #525 for that. |
Extra grib level-types translation, and grib <-> cf phenomenon translation.
These changes were originally driven by a requirement to form monthly averages from the ecmwf ERA climate series, input from Grib1 and output as Grib2.
Summary of changes:
Note: also includes an updated etc/cf-standard-name-table.xml
Reviewer: cpelley