
feat: Forecast wrapper for custom xarray datasets#302

Merged
aaTman merged 2 commits into develop from darothen/revised-xarrayforecast
Jan 13, 2026

Conversation

@darothen
Collaborator

EWB Pull Request

Description

This PR implements a new type of ForecastBase, an XarrayForecast, which wraps a previously-opened xr.Dataset. It can be used to interact with datasets that were constructed manually (e.g., by reading many different datasets and concatenating them together), or to pass along an Icechunk dataset that was opened separately.
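For instance, a dataset assembled by concatenating several forecast pieces can be wrapped directly. A minimal sketch with synthetic data (the XarrayForecast call at the end is commented out and its arguments are illustrative, so the snippet runs standalone):

```python
import numpy as np
import xarray as xr

# Build two small synthetic forecast pieces and concatenate them along
# init_time, mimicking a manually assembled dataset.
def make_piece(init: str) -> xr.Dataset:
    return xr.Dataset(
        {
            "surface_air_temperature": (
                ("init_time", "lead_time"),
                np.full((1, 3), 280.0),
            )
        },
        coords={"init_time": [np.datetime64(init)], "lead_time": [0, 6, 12]},
    )

ds = xr.concat([make_piece("2026-01-01"), make_piece("2026-01-02")], dim="init_time")
print(ds.sizes["init_time"], ds.sizes["lead_time"])  # 2 3

# With this PR, the assembled dataset can then be handed to EWB, e.g.:
# from extremeweatherbench.inputs import XarrayForecast
# forecast = XarrayForecast(ds, name="assembled", variables=["surface_air_temperature"])
```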

Type of change


  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

Comprehensive unit tests. This code was adapted from real-world code used for testing icechunk integrations with EWB.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@darothen darothen requested review from aaTman and Copilot January 13, 2026 15:38
@darothen darothen changed the base branch from main to develop January 13, 2026 15:38
@darothen
Collaborator Author

@aaTman this is a working version of the "in-memory" forecast dataset I used for the test evals on our new icechunk-based MLWP archive. Given that icechunk has a few more complexities, it may be easier to open an icechunk archive separately and then use this to pass it into EWB.

Contributor

Copilot AI left a comment


Pull request overview

This PR introduces a new XarrayForecast class that allows users to wrap pre-opened or manually constructed xarray datasets for use in forecast evaluations. This is particularly useful when working with datasets assembled from multiple sources or when integrating with alternative storage backends like Icechunk.

Changes:

  • Added XarrayForecast class to src/extremeweatherbench/inputs.py that extends ForecastBase and stores an in-memory xarray dataset
  • Comprehensive test suite with 39 test methods covering instantiation, preprocessing, variable mapping, integration with evaluation objects, and edge cases

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

  • src/extremeweatherbench/inputs.py: Implements the new XarrayForecast class with validation in __post_init__ and a simple _open_data_from_source method that returns the stored dataset
  • tests/test_inputs.py: Adds the comprehensive test class TestXarrayForecast with 39 test methods covering all functionality and edge cases
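The shape the review describes can be sketched roughly as follows. This is a paraphrase of the summary above, not the PR's actual code; the class name and attribute are illustrative:

```python
from dataclasses import dataclass

import xarray as xr


@dataclass
class XarrayForecastSketch:
    """Illustrative stand-in for the XarrayForecast described above."""

    ds: xr.Dataset

    def __post_init__(self) -> None:
        # Validate the wrapped object up front, as the review notes.
        if not isinstance(self.ds, xr.Dataset):
            raise TypeError("expected an xr.Dataset")

    def _open_data_from_source(self) -> xr.Dataset:
        # No I/O: simply return the dataset supplied at construction.
        return self.ds
```

Because `_open_data_from_source` does no I/O, the wrapper can slot in anywhere a file- or store-backed ForecastBase would.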


Collaborator

@aaTman aaTman left a comment


Love this, simple and elegant. Just some thoughts on making this a bit nicer with upstream changes before merging.

Do you happen to have an example script on hand we can also include to show how it's used?

@darothen
Collaborator Author

darothen commented Jan 13, 2026

@aaTman here's a hacky version of what my script looked like:

import logging

import icechunk
import xarray as xr

from extremeweatherbench import cases, evaluate
from extremeweatherbench.inputs import XarrayForecast

logger = logging.getLogger(__name__)

mlwp_model_variable_mapping = {
    "2m_temperature": "surface_air_temperature",
    "2m_dewpoint_temperature": "surface_dewpoint_temperature",
    "2m_relative_humidity": "surface_relative_humidity",
    "2m_wind_speed": "surface_wind_speed",
    "2m_wind_from_direction": "surface_wind_from_direction",
    "2m_wind_gust": "surface_wind_gust",
    "2m_wind_gust_direction": "surface_wind_gust_direction",
    "2m_wind_gust_speed": "surface_wind_gust_speed",
}

def open_icechunk_dataset(
    bucket: str = DEFAULT_ICECHUNK_BUCKET,
    prefix: str = DEFAULT_ICECHUNK_PREFIX,
    variable_mapping: dict[str, str] | None = None,
    chunks: str | dict | None = "auto",
    source_credentials_prefix: str = DEFAULT_SOURCE_CREDENTIALS_PREFIX,
) -> xr.Dataset:
    """Open a dataset from an Icechunk repository with preprocessing.

    The repository config already knows where the virtual chunks are located.
    We just need to provide credentials broad enough to cover that location.

    Args:
        bucket: GCS bucket containing the Icechunk repository.
        prefix: Prefix within the bucket for the repository.
        variable_mapping: Dictionary mapping source variable names to target names.
        chunks: Chunk specification for xarray (default: "auto").
        source_credentials_prefix: GCS prefix for virtual chunk credentials.
            Should be broad enough to cover wherever the source data lives.

    Returns:
        Preprocessed xarray Dataset ready for evaluation.
    """
    logger.info(f"Opening Icechunk repository at gs://{bucket}/{prefix}")

    # Set up storage
    storage = icechunk.gcs_storage(bucket=bucket, prefix=prefix)

    # Set up credentials for virtual chunks.
    # The repo config knows the exact location; we just provide credentials
    # broad enough to cover it.
    gcs_credentials = icechunk.gcs_from_env_credentials()
    virtual_credentials = icechunk.containers_credentials({source_credentials_prefix: gcs_credentials})

    # Open repository
    repo = icechunk.Repository.open(storage, authorize_virtual_chunk_access=virtual_credentials)
    session = repo.readonly_session("main")

    # Open dataset
    ds = xr.open_dataset(session.store, engine="zarr", chunks=chunks)

    # Apply variable renaming if specified
    if variable_mapping:
        rename_dict = {k: v for k, v in variable_mapping.items() if k in ds.data_vars}
        if rename_dict:
            ds = ds.rename(rename_dict)

    return ds


if __name__ == "__main__":

    ds = open_icechunk_dataset(
        bucket=icechunk_bucket,
        prefix=icechunk_prefix,
        variable_mapping=mlwp_model_variable_mapping,
        chunks="auto",
        source_credentials_prefix=source_prefix,
    )


    forecast = XarrayForecast(
        ds,
        name=f"{icechunk_bucket}/{icechunk_prefix}",
        # NOTE: we have to pass in the variables that will actually be used for the
        # metrics calculations. We can at least bypass the variable mapping by
        # preprocessing the datasets ourselves.
        variables=["surface_air_temperature"],
        variable_mapping=mlwp_model_variable_mapping,
    )
    target = ...
    evals = ...
    ewb = evaluate.ExtremeWeatherBench(
        case_metadata=cases.load_ewb_events_yaml_into_case_collection(),
        evaluation_objects=evals,
    )

@aaTman
Collaborator

aaTman commented Jan 13, 2026


Awesome. I'd love to have some generic version of this in the documentation.

@aaTman aaTman merged commit d09f456 into develop Jan 13, 2026
3 checks passed
@aaTman aaTman deleted the darothen/revised-xarrayforecast branch January 13, 2026 17:19
aaTman added a commit that referenced this pull request Jan 26, 2026
* Add pressure_dimension_str arg to geopotential_thickness (#297)

* `DurationMeanError` memory fix and add time resolution option (#296)

* update duration with handling spatial dims, remove compute, fix sparse lead time dim generation

* update name on metric in tests

* add docstring for time res arg

* Move parallel config check outside of function (#301)

* move function out of run, move cache mkdir to init

* add tests for new func

* ruff

* update parallel_config passthrough and tests

* feat: Forecast wrapper for custom xarray datasets (#302)

* implements a new Forecast object that can wrap existing xarray datasets

* Revise per copilot review

* Simplify IBTrACS polars subset (#303)

* Update `geopotential_thickness` var names and docstring (#306)

* update docstrings and var namings

* rename vars, add test

* ruff

* Clarify default preprocess function names; geopotential division fix (#305)

* update naming

* default preprocess for applied_tc

* ruff

* ruff

* Remove "cases" key requirement in yamls and dicts (#308)

* remove cases top level of yaml and fix code to handle this

* remove old load events yaml function

* update validation precommit and formatting

* remove out-of-date notebook from docs

* CIRA Icechunk store (#310)

* dependencies and generate store file started

* in-flight, added and cleaned filter funcs

* add icechunk + obstore and cira icechunk generation script

* remove cira gen script no longer used

* code cleanup

* add icechunk datatree forecast class object

* uv lock

* add documentation, group helper func, and add repository kwargs passthrough

* remove icechunk forecast object

* typo

* ruff

* update pyproject and uv lock

* add TODO

* update PR template

* Remove `IndividualCaseCollection` (#317)

* update all references to IndividualCaseCollection and convert dicts/ "cases": keys to lists

* update template

* make questions bold

* add whitespace

* remove indent error and typo from evaluate_cli

* make load_individual_cases include passthrough for existing dataclasses

* ruff

* add comment for clarification on list comp

* ruff (again)

* remove all references to collection, replace with list

* ruff

* rename collection -> list

* ruff

* Cleanup docstrings in repo (#318)

* update these docstrings

* remove docstring changes markdown

* update docstrings

* update other docstrings

* remove individualcasecollection reference, update based on develop changes

* add explanation for dim reqs (#320)

* Update `defaults` and `inputs` to include new CIRA icechunk store (#319)

* more explicit naming, add func and model names var

* add test coverage, ruff, linting

* update readme for new cira approach

* move cira func and model ref to inputs

* update docs

* module wasnt called for moved func

* update tests for moving func and var

* ruff

* fix mock typos

---------

Co-authored-by: Daniel Rothenberg <daniel@danielrothenberg.com>
aaTman added a commit that referenced this pull request Jan 27, 2026

* Bump version from 0.2.0 to 0.3.0 (#324)

* Updated API (#321)

* move cache dir creation to init, rename funcs, add parallel/serial check function, update test names

* add run method for backwards compatibility

* feat: redesign public API with hierarchical namespace submodules

- Add ewb.evaluation() as main entry point (alias for ExtremeWeatherBench)
- Create namespace submodules: ewb.targets, ewb.forecasts, ewb.metrics,
  ewb.derived, ewb.regions, ewb.cases, ewb.defaults
- Expose all classes at top level for convenience (ewb.ERA5, etc.)
- Add ewb.load_cases() convenience alias
- Update all example files to use new import pattern
- Update usage.md documentation
- Maintain backward compatibility with existing imports

* Golden tests (#323)

* first pass for gt test infra + yaml

* use shapefile for severe convection and catch latitude swap

* add ignore for golden test when running pytest by default

* move pytest addopts and markers to pyproject.toml

* PyPI Preparation (#315)

* update build-system and project

* update workflows, publish, and pyproject

* add justfile and twine

* change to python 3.10 as minimum requirement

* kerchunk needs 3.11, swapping pyproject and tests to remove 3.10

* change workflows to use version matrix

* swap pyproject tools to hatch; add if and packages-dir to publish

* update pyproject version for release

* remove to_csv

* Remove duplicate function and fixtures (#326)

- Remove duplicate _parallel_serial_config_check function from evaluate.py
  (was defined twice at lines 189 and 982 with identical implementation)
- Remove duplicate runner fixture from test_evaluate_cli.py
  (already defined in conftest.py)
- Remove duplicate temp_config_dir fixture from test_evaluate_cli.py
  (already defined in conftest.py)
- Remove unused tempfile import from test_evaluate_cli.py

---------

Co-authored-by: Daniel Rothenberg <daniel@danielrothenberg.com>