
Perf: cache and serialize partition metadata across sessions #133

@alxmrs

Description


Problem

partition_metadata() in df.py recomputes min/max coordinate bounds for all partitions every time read_xarray_table() is called. For ARCO-ERA5 (732,072 partitions), this adds startup latency on every new session even though the coordinate layout of the dataset never changes.

For remote datasets (GCS/S3), each coordinate access incurs network round-trip latency, which makes the recomputation especially costly.

Proposed API

table = read_xarray_table(
    ds,
    chunks={'time': 1},
    metadata_cache='./era5_meta.parquet'
)
# First call: computes and saves bounds to cache file
# Subsequent calls: loads bounds from cache, skipping 732,072 coordinate reads

The partition bounds are a pure function of the dataset path and the chunk specification, so caching is safe as long as the dataset structure doesn't change.
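Because the bounds depend only on those two inputs, a stable cache key can be derived from them to detect when a cached file applies. A minimal sketch (the helper name and key scheme are hypothetical, not part of the existing df.py API):

```python
import hashlib
import json

def metadata_cache_key(dataset_path: str, chunks: dict) -> str:
    """Derive a stable cache key from the dataset path and chunk spec.

    Serializing with sorted keys makes the key insensitive to dict
    ordering, so {'time': 1, 'lat': 10} and {'lat': 10, 'time': 1}
    produce the same hash.
    """
    payload = json.dumps({"path": dataset_path, "chunks": chunks}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

key1 = metadata_cache_key("gs://bucket/era5.zarr", {"time": 1, "lat": 10})
key2 = metadata_cache_key("gs://bucket/era5.zarr", {"lat": 10, "time": 1})
assert key1 == key2  # chunk-spec ordering doesn't change the key
```

Storing the key alongside the cached bounds would let the loader fall back to recomputation whenever the dataset path or chunk spec changes.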

Storage formats to consider

  1. Parquet sidecar file (efficient, columnar)
  2. JSON sidecar file (human-readable, debuggable)
  3. Zarr consolidated metadata attributes (colocated with dataset)
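As a stdlib-only illustration of option 2, a JSON sidecar could hold one bounds record per partition and be reloaded in later sessions. The record layout below is an assumption for illustration, not the actual df.py schema:

```python
import json
from pathlib import Path

def save_bounds(path: str, bounds: list) -> None:
    """Write per-partition min/max bounds to a JSON sidecar file."""
    Path(path).write_text(json.dumps(bounds))

def load_bounds(path: str):
    """Return cached bounds, or None if the sidecar doesn't exist yet."""
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else None

# First session: compute bounds (the expensive step), then cache them.
# Hypothetical record shape -- one dict of coordinate bounds per partition.
bounds = [{"partition": 0, "time_min": "1979-01-01", "time_max": "1979-01-01"}]
save_bounds("era5_meta.json", bounds)

# Later session: load from the sidecar and skip the coordinate reads.
assert load_bounds("era5_meta.json") == bounds
```

The same save/load shape carries over to the Parquet option; Parquet mainly wins on file size and columnar reads at ARCO-ERA5 scale (732,072 records), while JSON stays trivially inspectable for debugging.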

Parent: #126


Labels

enhancement (New feature or request)
