Don't save more decimals than necessary in the uncertainty (or data) files #2095

@scarlehoff

At the moment, because of the way the uncertainties are computed in the filter files, running the same file on two different computers might result in differences around the 10th digit.
These differences are completely spurious, and I believe it makes more sense to round the values to the point where they are more or less stable.
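
To illustrate the point, here is a minimal sketch using only the standard library. The helper name `round_sig` and the two sample values are made up for this example; they stand in for the same uncertainty computed on two machines. Rounding will not hide every difference (two values can still fall on either side of a rounding boundary), but it removes most of this kind of noise.

```python
def round_sig(value: float, decimals: int = 8) -> float:
    """Keep `decimals` digits after the point in scientific notation."""
    return float(f"{value:.{decimals}e}")


# Hypothetical results of the same computation on two different machines,
# differing only at the 10th significant digit.
a = 0.12345678901
b = 0.12345678912

print(a == b)                        # the raw values differ
print(round_sig(a) == round_sig(b))  # the rounded values agree
```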

In order to make sure everybody is using the same function, it would be a good idea to have a custom_yaml_dumper (or something similar) as part of the filter file, with a custom representer for floats.

Something like this would achieve the desired result (where I set the number of digits to 8).

import yaml


def prettify_float(dumper, value):
    """Override the default yaml representer: https://github.com/yaml/pyyaml/blob/48838a3c768e3d1bcab44197d800145cfd0719d6/lib/yaml/representer.py#L189"""
    # Start from PyYAML's default float representation.
    ret = dumper.represent_float(value)
    # Only reformat when the default string is longer than 8 characters.
    if len(ret.value) > 8:
        ret_str = f"{value:.8e}"
        ret = dumper.represent_scalar('tag:yaml.org,2002:float', ret_str)
    return ret


# Register the representer so that yaml.dump applies it to every float.
yaml.add_representer(float, prettify_float)

Here I'm passing every float that yaml.dump will see to prettify_float, which then delegates to the default represent_float function. If the result has a length greater than 8 (this looks at the length of the string, so it takes into account things like -, . or e4), it rewrites the value in scientific notation with 8 digits after the decimal point. Otherwise it uses the default one.
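
As far as I can tell, `yaml.add_representer` registers the function on the default `yaml.Dumper`, so it affects every plain `yaml.dump` call in the process but not `yaml.safe_dump`. One possible way to keep the behaviour local to the filter code is to attach the representer to a dumper subclass. This is only a sketch: `RoundedFloatDumper` and `dump_rounded` are hypothetical names, not existing utilities.

```python
import yaml


class RoundedFloatDumper(yaml.SafeDumper):
    """Hypothetical dumper that shortens long float representations."""


def prettify_float(dumper, value):
    # Same logic as above: keep short floats as they are, reformat long ones.
    ret = dumper.represent_float(value)
    if len(ret.value) > 8:
        ret = dumper.represent_scalar("tag:yaml.org,2002:float", f"{value:.8e}")
    return ret


# Register on the subclass only, leaving the global dumpers untouched.
RoundedFloatDumper.add_representer(float, prettify_float)


def dump_rounded(data):
    """Dump `data` to a YAML string using the rounding dumper."""
    return yaml.dump(data, Dumper=RoundedFloatDumper)


text = dump_rounded({"stat": 0.12345678901, "sys": 0.5})
print(text)
# The rounded value is still read back as a float.
print(yaml.safe_load(text))
```

Filter scripts would then call `dump_rounded` (or pass `Dumper=RoundedFloatDumper`) instead of relying on a global side effect.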

(assigning this to @comane since he has been organizing the filter utils)
