feat: Forecast wrapper for custom xarray datasets #302
@aaTman this is a working version of the "in-memory" forecast dataset I used for the test evals on our new icechunk-based MLWP archive. Given that icechunk has a few more complexities, it may be easier to open an icechunk archive separately and then use this to pass it into EWB.
Pull request overview
This PR introduces a new XarrayForecast class that allows users to wrap pre-opened or manually constructed xarray datasets for use in forecast evaluations. This is particularly useful when working with datasets assembled from multiple sources or when integrating with alternative storage backends like Icechunk.
Changes:
- Added `XarrayForecast` class to `src/extremeweatherbench/inputs.py` that extends `ForecastBase` and stores an in-memory xarray dataset
- Comprehensive test suite with 39 test methods covering instantiation, preprocessing, variable mapping, integration with evaluation objects, and edge cases
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/extremeweatherbench/inputs.py` | Implements the new `XarrayForecast` class with validation in `__post_init__` and a simple `_open_data_from_source` method that returns the stored dataset |
| `tests/test_inputs.py` | Adds comprehensive test class `TestXarrayForecast` with 39 test methods covering all functionality and edge cases |
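The pattern summarized in the table (validate in `__post_init__`, then return the stored dataset from `_open_data_from_source`) can be sketched with the standard library alone. This is a hypothetical illustration, not the code in `inputs.py`: `ForecastLike` and `InMemoryForecast` are made-up stand-ins for `ForecastBase` and `XarrayForecast`, a plain dict stands in for an `xr.Dataset`, and the specific validation checks are assumptions.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any


class ForecastLike(ABC):
    """Hypothetical stand-in for a forecast base class (not EWB's ForecastBase)."""

    @abstractmethod
    def _open_data_from_source(self) -> Any:
        ...


@dataclass
class InMemoryForecast(ForecastLike):
    """Hypothetical wrapper that holds an already-opened dataset."""

    ds: Any
    name: str
    variables: list[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Validate eagerly so bad inputs fail at construction time
        # (these particular checks are assumptions for illustration).
        if self.ds is None:
            raise ValueError("ds must be an already-opened dataset, got None")
        missing = [v for v in self.variables if v not in self.ds]
        if missing:
            raise ValueError(f"variables not found in dataset: {missing}")

    def _open_data_from_source(self) -> Any:
        # Nothing to open: hand back the stored dataset unchanged.
        return self.ds


# A dict stands in for an xr.Dataset in this sketch.
toy_ds = {"surface_air_temperature": [280.1, 281.4]}
forecast = InMemoryForecast(toy_ds, name="toy", variables=["surface_air_temperature"])
```

Because the wrapper never opens anything itself, any storage-specific setup (credentials, sessions, concatenation) stays in user code before the object is constructed.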
aaTman left a comment
Love this, simple and elegant. Just some thoughts on making this a bit nicer with upstream changes before merging.
Do you happen to have an example script on hand that we can also include to show how it's used?
@aaTman here's a hacky version of what my script looked like:

    import logging

    import icechunk
    import xarray as xr

    from extremeweatherbench import cases, evaluate
    from extremeweatherbench.inputs import XarrayForecast

    logger = logging.getLogger(__name__)

    # NOTE: the DEFAULT_* constants and the icechunk_bucket, icechunk_prefix, and
    # source_prefix variables used below were defined elsewhere in the original
    # script and are not shown here.

    mlwp_model_variable_mapping = {
        "2m_temperature": "surface_air_temperature",
        "2m_dewpoint_temperature": "surface_dewpoint_temperature",
        "2m_relative_humidity": "surface_relative_humidity",
        "2m_wind_speed": "surface_wind_speed",
        "2m_wind_from_direction": "surface_wind_from_direction",
        "2m_wind_gust": "surface_wind_gust",
        "2m_wind_gust_direction": "surface_wind_gust_direction",
        "2m_wind_gust_speed": "surface_wind_gust_speed",
    }


    def open_icechunk_dataset(
        bucket: str = DEFAULT_ICECHUNK_BUCKET,
        prefix: str = DEFAULT_ICECHUNK_PREFIX,
        variable_mapping: dict[str, str] | None = None,
        chunks: str | dict | None = "auto",
        source_credentials_prefix: str = DEFAULT_SOURCE_CREDENTIALS_PREFIX,
    ) -> xr.Dataset:
        """Open a dataset from an Icechunk repository with preprocessing.

        The repository config already knows where the virtual chunks are located.
        We just need to provide credentials broad enough to cover that location.

        Args:
            bucket: GCS bucket containing the Icechunk repository.
            prefix: Prefix within the bucket for the repository.
            variable_mapping: Dictionary mapping source variable names to target names.
            chunks: Chunk specification for xarray (default: "auto").
            source_credentials_prefix: GCS prefix for virtual chunk credentials.
                Should be broad enough to cover wherever the source data lives.

        Returns:
            Preprocessed xarray Dataset ready for evaluation.
        """
        logger.info(f"Opening Icechunk repository at gs://{bucket}/{prefix}")

        # Set up storage
        storage = icechunk.gcs_storage(bucket=bucket, prefix=prefix)

        # Set up credentials for virtual chunks.
        # The repo config knows the exact location; we just provide credentials
        # broad enough to cover it.
        gcs_credentials = icechunk.gcs_from_env_credentials()
        virtual_credentials = icechunk.containers_credentials({source_credentials_prefix: gcs_credentials})

        # Open repository
        repo = icechunk.Repository.open(storage, authorize_virtual_chunk_access=virtual_credentials)
        session = repo.readonly_session("main")

        # Open dataset
        ds = xr.open_dataset(session.store, engine="zarr", chunks=chunks)

        # Apply variable renaming if specified; skip names missing from the dataset
        if variable_mapping:
            rename_dict = {k: v for k, v in variable_mapping.items() if k in ds.data_vars}
            if rename_dict:
                ds = ds.rename(rename_dict)

        return ds


    if __name__ == "__main__":
        ds = open_icechunk_dataset(
            bucket=icechunk_bucket,
            prefix=icechunk_prefix,
            variable_mapping=mlwp_model_variable_mapping,
            chunks="auto",
            source_credentials_prefix=source_prefix,
        )

        forecast = XarrayForecast(
            ds,
            name=f"{icechunk_bucket}/{icechunk_prefix}",
            # NOTE: we have to pass in the variables that will actually be used for
            # the metrics calculations. We can at least bypass the variable mapping
            # by manually processing the datasets for ourselves.
            variables=["surface_air_temperature"],
            variable_mapping=mlwp_model_variable_mapping,
        )

        target = ...
        evals = ...

        ewb = evaluate.ExtremeWeatherBench(
            case_metadata=cases.load_ewb_events_yaml_into_case_collection(),
            evaluation_objects=evals,
        )
Awesome. I'd love to have some generic version of this in the documentation.
* Add pressure_dimension_str arg to geopotential_thickness (#297)
* `DurationMeanError` memory fix and add time resolution option (#296)
  * update duration with handling spatial dims, remove compute, fix sparse lead time dim generation
  * update name on metric in tests
  * add docstring for time res arg
* Move parallel config check outside of function (#301)
  * move function out of run, move cache mkdir to init
  * add tests for new func
  * ruff
  * update parallel_config passthrough and tests
* feat: Forecast wrapper for custom xarray datasets (#302)
  * implements a new Forecast object that can wrap existing xarray datasets
  * Revise per copilot review
* Simplify IBTrACS polars subset (#303)
* Update `geopotential_thickness` var names and docstring (#306)
  * update docstrings and var namings
  * rename vars, add test
  * ruff
* Clarify default preprocess function names; geopotential division fix (#305)
  * update naming
  * default preprocess for applied_tc
  * ruff
  * ruff
* Remove "cases" key requirement in yamls and dicts (#308)
  * remove cases top level of yaml and fix code to handle this
  * remove old load events yaml function
  * update validation precommit and formatting
  * remove out-of-date notebook from docs
* CIRA Icechunk store (#310)
  * dependencies and generate store file started
  * in-flight, added and cleaned filter funcs
  * add icechunk + obstore and cira icechunk generation script
  * remove cira gen script no longer used
  * code cleanup
  * add icechunk datatree forecast class object
  * uv lock
  * add documentation, group helper func, and add repository kwargs passthrough
  * remove icechunk forecast object
  * typo
  * ruff
  * update pyproject and uv lock
  * add TODO
  * update PR template
* Remove `IndividualCaseCollection` (#317)
  * update all references to IndividualCaseCollection and convert dicts/ "cases": keys to lists
  * update template
  * make questions bold
  * add whitespace
  * remove indent error and typo from evaluate_cli
  * make load_individual_cases include passthrough for existing dataclasses
  * ruff
  * add comment for clarification on list comp
  * ruff (again)
  * remove all references to collection, replace with list
  * ruff
  * rename collection -> list
  * ruff
* Cleanup docstrings in repo (#318)
  * update these docstrings
  * remove docstring changes markdown
  * update docstrings
  * update other docstrings
  * remove individualcasecollection reference, update based on develop changes
* add explanation for dim reqs (#320)
* Update `defaults` and `inputs` to include new CIRA icechunk store (#319)
  * more explicit naming, add func and model names var
  * add test coverage, ruff, linting
  * update readme for new cira approach
  * move cira func and model ref to inputs
  * update docs
  * module wasnt called for moved func
  * update tests for moving func and var
  * ruff
  * fix mock typos

Co-authored-by: Daniel Rothenberg <daniel@danielrothenberg.com>
* Bump version from 0.2.0 to 0.3.0 (#324)
* Updated API (#321)
  * move cache dir creation to init, rename funcs, add parallel/serial check function, update test names
  * update naming
  * add run method for backwards compatibility
  * update tests
  * add tests and cover if serial and parallel_config is not None
  * feat: redesign public API with hierarchical namespace submodules
    - Add ewb.evaluation() as main entry point (alias for ExtremeWeatherBench)
    - Create namespace submodules: ewb.targets, ewb.forecasts, ewb.metrics, ewb.derived, ewb.regions, ewb.cases, ewb.defaults
    - Expose all classes at top level for convenience (ewb.ERA5, etc.)
    - Add ewb.load_cases() convenience alias
    - Update all example files to use new import pattern
    - Update usage.md documentation
    - Maintain backward compatibility with existing imports
  * ruff/linting. add utils to init
  * add test coverage for module loading patterns
  * ruff
  * update defaults var refs
* Golden tests (#323)
  * first pass for gt test infra + yaml
  * use shapefile for severe convection and catch latitude swap
  * add ignore for golden test when running pytest by default
  * ruff
  * move pytest addopts and markers to pyproject.toml
  * remove to_csv
* PyPI Preparation (#315)
  * update build-system and project
  * update workflows, publish, and pyproject
  * add justfile and twine
  * update publish yaml
  * change to python 3.10 as minimum requirement
  * kerchunk needs 3.11, swapping pyproject and tests to remove 3.10
  * change workflows to use version matrix
  * align workflows
  * swap pyproject tools to hatch; add if and packages-dir to publish
  * update pyproject version for release
* Remove duplicate function and fixtures (#326)
  * chore: remove duplicate function and fixtures
    - Remove duplicate _parallel_serial_config_check function from evaluate.py (was defined twice at lines 189 and 982 with identical implementation)
    - Remove duplicate runner fixture from test_evaluate_cli.py (already defined in conftest.py)
    - Remove duplicate temp_config_dir fixture from test_evaluate_cli.py (already defined in conftest.py)
    - Remove unused tempfile import from test_evaluate_cli.py
  * ruff

Co-authored-by: Daniel Rothenberg <daniel@danielrothenberg.com>
EWB Pull Request
Description
This PR implements a new type of `ForecastBase`, an `XarrayForecast`, which wraps a previously opened `xr.Dataset`. It can be used to work with datasets that were manually constructed (e.g. by reading many different datasets and concatenating them together), or as a way to use an Icechunk dataset that was opened beforehand.
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
Comprehensive unit tests. This code was adapted from real-world code used for testing icechunk integrations with EWB.
Checklist: