Make read_vrt chunks mode lazy #1807
Merged
Pull request overview
This PR addresses #1798 by making read_vrt(chunks=...) on CPU genuinely lazy: instead of assembling the full VRT mosaic eagerly and then chunking, it builds a dask array composed of per-window read tasks.
Changes:
- Added a new CPU dask path for `read_vrt(chunks=...)` that constructs windowed delayed tasks (`_read_vrt_dask`).
- Added `_vrt_effective_dtype` to infer a stable output dtype for the lazy dask graph.
- Added regression tests covering value parity vs eager reads and ensuring no source reads/warnings during lazy construction.
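The windowed-task approach described above can be sketched roughly as follows. This is a minimal illustration, not the PR's `_read_vrt_dask`: the in-memory `MOSAIC`, the `read_window` helper, the `READ_LOG` counter, and the `(row0, col0, row1, col1)` window layout are all assumptions made for the example. It only uses `dask.delayed`, `dask.array.from_delayed`, and `dask.array.block`.

```python
import dask
import dask.array as da
import numpy as np

# Stand-in for the VRT mosaic; a real reader would decode source files instead.
MOSAIC = np.arange(6 * 8, dtype=np.float32).reshape(6, 8)
READ_LOG = []  # records every window that is actually read


def read_window(window):
    """Hypothetical eager windowed read: (row0, col0, row1, col1) -> ndarray."""
    r0, c0, r1, c1 = window
    READ_LOG.append(window)
    return MOSAIC[r0:r1, c0:c1].copy()


def lazy_mosaic(height, width, ch_h, ch_w, dtype):
    """Build a dask array whose blocks are per-window delayed reads."""
    delayed_read = dask.delayed(read_window)
    block_rows = []
    for r0 in range(0, height, ch_h):
        r1 = min(r0 + ch_h, height)
        row = []
        for c0 in range(0, width, ch_w):
            c1 = min(c0 + ch_w, width)
            task = delayed_read((r0, c0, r1, c1))
            # Shape and dtype are declared up front, so nothing is read here.
            row.append(
                da.from_delayed(task, shape=(r1 - r0, c1 - c0), dtype=dtype)
            )
        block_rows.append(row)
    return da.block(block_rows)


lazy = lazy_mosaic(6, 8, ch_h=4, ch_w=3, dtype=MOSAIC.dtype)
reads_before_compute = len(READ_LOG)  # graph construction touches no data
result = lazy.compute()               # only now are the windows read
```

Declaring the dtype before any task runs is why a helper along the lines of `_vrt_effective_dtype` is needed: `from_delayed` has to know the output dtype without reading a source.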
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `xrspatial/geotiff/__init__.py` | Adds a new lazy dask backend for CPU `read_vrt(chunks=...)` and routes chunked CPU reads through it. |
| `xrspatial/geotiff/tests/test_read_vrt_lazy_chunks_1798.py` | Adds tests for the new lazy chunked VRT read behavior. |
Comment on lines +3839 to +3842

    rows = list(range(0, height, ch_h))
    cols = list(range(0, width, ch_w))
    out_has_band_axis = band is None and n_bands > 1

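For readers following the tiling logic in the excerpt above, here is a small sketch of how chunk origins from `range(0, height, ch_h)` and `range(0, width, ch_w)` can be turned into windows, with the last row and column clipped to the raster edge. The `chunk_windows` name and the `(row0, col0, row1, col1)` tuple layout are assumptions for illustration; the PR may order or represent windows differently.

```python
def chunk_windows(height, width, ch_h, ch_w):
    """Return a grid of (row0, col0, row1, col1) windows covering the raster."""
    rows = list(range(0, height, ch_h))
    cols = list(range(0, width, ch_w))
    return [
        # Clip the far edge so partial chunks never extend past the raster.
        [(r0, c0, min(r0 + ch_h, height), min(c0 + ch_w, width)) for c0 in cols]
        for r0 in rows
    ]


windows = chunk_windows(height=5, width=7, ch_h=2, ch_w=3)
# Total covered area should equal the raster area when windows do not overlap.
covered = sum((r1 - r0) * (c1 - c0) for row in windows for r0, c0, r1, c1 in row)
```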
Comment on lines +3843 to +3851

    @dask.delayed
    def _read_chunk(chunk_window):
        chunk_da = read_vrt(
            source, dtype=dtype, window=chunk_window, band=band,
            chunks=None, gpu=False, max_pixels=max_pixels,
        )
        arr = np.asarray(chunk_da.values)
        if arr.dtype != out_dtype:
            arr = arr.astype(out_dtype)

@copilot resolve the merge conflicts in this pull request
Agent-Logs-Url: https://github.com/xarray-contrib/xarray-spatial/sessions/27f4131a-2907-4ca0-bf00-f303ab61f2e9
Co-authored-by: brendancol <433221+brendancol@users.noreply.github.com>
Resolves conflict in xrspatial/geotiff/__init__.py: keeps the `_read_vrt_dask` dispatch hook from the PR branch. All other geotiff changes from main (#1791, #1793, #1801, #1802, #1803, #1804, #1805, #1806) were already integrated into the working tree by the prior 7329dd9 commit; this merge just records the parent so git recognises the reconciliation.
PR #1803 forwarded the caller's `max_pixels` to `read_to_array` inside `read_vrt`'s source loop so a tiny VRT output cannot force a huge source decode (#1796). The output-window check at the source read enforces that correctly.

A separate per-tile dimension check at the same call sites also consumed the caller's `max_pixels`, so a caller setting `max_pixels` as an output budget (e.g. 10_000) failed the per-tile sanity check on any normal source whose default tile size is 256x256 (= 65_536 pixels).

Use `MAX_PIXELS_DEFAULT` for the per-tile dim check at the two call sites in `_read_tiles` (local) and `_read_tiles_cog_http` (HTTP). The output-window check in the same functions continues to enforce the user-supplied `max_pixels`, preserving the #1796 protection.
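The split between the two checks can be sketched as below. This is an illustration only, not the code in `_read_tiles` or `_read_tiles_cog_http`: the function names are hypothetical, and the numeric value given to `MAX_PIXELS_DEFAULT` is a placeholder, since the real constant's value is not stated here.

```python
MAX_PIXELS_DEFAULT = 1_000_000_000  # placeholder value, assumed for the sketch


def check_tile_dims(tile_w, tile_h):
    """Sanity check on a single tile; independent of the caller's budget."""
    if tile_w * tile_h > MAX_PIXELS_DEFAULT:
        raise ValueError("tile dimensions exceed the default pixel limit")


def check_output_window(out_w, out_h, max_pixels):
    """Budget check on the requested output window; uses the caller's limit."""
    if out_w * out_h > max_pixels:
        raise ValueError("output window exceeds max_pixels")


def validate_read(tile_w, tile_h, out_w, out_h, max_pixels):
    check_tile_dims(tile_w, tile_h)
    check_output_window(out_w, out_h, max_pixels)


# A 256x256 tile (65_536 pixels) no longer trips a 10_000-pixel output budget.
validate_read(256, 256, out_w=100, out_h=100, max_pixels=10_000)

# An oversized output window is still rejected by the caller's budget.
try:
    validate_read(256, 256, out_w=200, out_h=200, max_pixels=10_000)
    oversized_rejected = False
except ValueError:
    oversized_rejected = True
```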
Copilot AI added a commit that referenced this pull request May 13, 2026
Keep `_read_vrt_chunked` dispatch (handles `gpu=True` + `chunks=`) over the non-GPU-capable `_read_vrt_dask` added in #1807. Remove the now-dead `_read_vrt_dask` and `_vrt_effective_dtype` functions that were only reachable via the superseded dispatch branch.

Auto-merged from main: `_vrt.py` (VRT resample window inverse #1704 + XML size cap #1815 + minimal source window #1821), test files `test_read_vrt_lazy_chunks_1798.py`, `test_vrt_dstrect_resample_cap_1737.py`, `test_vrt_resample_window_inverse_1704.py`, `test_vrt_xml_size_cap_1815.py`.

Co-authored-by: brendancol <433221+brendancol@users.noreply.github.com>
Closes #1798.

Adds a lazy CPU dask VRT path that builds windowed tasks instead of assembling the full mosaic before chunking.

Tested: `pytest xrspatial/geotiff/tests/test_read_vrt_lazy_chunks_1798.py`