diff --git a/.claude/sweep-accuracy-state.csv b/.claude/sweep-accuracy-state.csv index a8872d7fa..fbf2f0804 100644 --- a/.claude/sweep-accuracy-state.csv +++ b/.claude/sweep-accuracy-state.csv @@ -12,7 +12,7 @@ emerging_hotspots,2026-04-30,,MEDIUM,2;3,MEDIUM: threshold_90 uses int() (trunca fire,2026-04-30,,,,All ops per-pixel (no accumulation/stencil/projected distance). NaN handled via x!=x; CUDA bounds use strict <; rdnbr and ros divisions guarded; CPU/GPU/dask paths algorithmically identical. No accuracy issues found. flood,2026-04-30,,MEDIUM,2;5,"MEDIUM (not fixed): dask backend preserves float32 input dtype while numpy promotes to float64 in flood_depth and curve_number_runoff; DataArray inputs for curve_number, mannings_n bypass scalar > 0 (and CN <= 100) range validation, silently producing NaN/garbage." focal,2026-03-30T13:00:00Z,1092,,, -geotiff,2026-05-10,,,,"Pass 10 (2026-05-10): CLEAN. Audited 5 recent commits: #1558 drop-defensive-copies (frombuffer path still .copy()s before in-place predictor decode at _reader.py:778), #1556 fp-predictor ngjit (writer pre-ravels so 1-D slice arg is correct, float32/64 LE+BE bit-exact), #1552 batched D2H (OOM guard fires before cupy.concatenate, host_buf offsets correct), #1551 parallel-decode gate (>= vs > sends 256x256 default to parallel path, no value diff confirmed via partial-tile parity), #1549 nvjpeg constants (gray + RGB GPU JPEG decode pixel-identical to Pillow CPU, max diff = 0). Cross-backend parity re-verified clean: numpy/dask+numpy/cupy/dask+cupy equal .data/.dtype/.coords/nodata/NaN-mask on deflate+predictor3+nodata; orientations 1-8 numpy==GPU; partial edge tiles 100x150, 257x383, 512x257 numpy==GPU==dask; predictor2 LE/BE round-trip uint8/int16/uint16/int32/uint32 pass; predictor3 LE/BE float32/64 pass. Deferred LOW (pre-existing, not opened): float16 (bps=16, SampleFormat=3) absent from tiff_dtype_to_numpy map - writer never emits, asymmetric but unreachable. | Pass 9 (2026-05-09): TWO HIGH fixed -- (a) PR #1539 closes #1537: TIFF Orientation tag 2/3/4 (mirror flips) on georeferenced files left y/x coords computed from the un-flipped transform, so xarray label lookups returned the wrong pixel even though _apply_orientation flipped the buffer. PR #1521 only updated the transform for the 5-8 axis-swap branch. Fix updates origin and pixel-scale signs along whichever axes were flipped, for both PixelIsArea (origin shifts by N*step) and PixelIsPoint (shifts by (N-1)*step). 10 new tests in test_orientation.py. (b) PR #1546 closes #1540: read_geotiff_gpu ignored Orientation tag completely; CPU correctly applied 2-8 (PR #1521) but GPU returned the raw stored buffer. Cross-backend disagreement on every non-default orientation. Fix adds _apply_orientation_gpu (cupy slicing mirror of the CPU helper) and _apply_orientation_geo_info, threads them into the tiled GPU pipeline, reuses CPU-fallback geo_info for the stripped path to avoid double-applying. 28 new tests in test_orientation_gpu.py (every orientation, single-band tiled, single-band stripped, 3-band tiled, mirror-flip sel-fidelity, default no-tag passthrough). Re-confirmed clean: HTTP coalesce_ranges with overlapping ranges and zero-length ranges, parallel streaming write thread-safety (each tile gets independent buffer via copy or padded zeros), planar=2 + chunky GPU LERC mask propagation matches CPU, IFD chain cap MAX_IFDS=256, max_z_error round-trip on tiled write, _resolve_masked_fill float vs integer dtype semantics. Deferred LOW: per-sample LERC mask (3D mask (h,w,samples)) collapsed to per-pixel ""any sample invalid"" on GPU while CPU honours per-sample; LERC implementations rarely emit 3D masks (verified: lerc.encode with 2D mask on 3-band returns 2D mask). Documented planar=2 + LERC + GPU silently drops mask (rare in practice, source comment acknowledges). | Pass 8 (2026-05-07): HIGH fixed in fix-jpeg-tiff-disable -- to_geotiff(compression='jpeg') wrote files that no external reader can decode. The writer tags compression=7 (new-style JPEG) but emits a self-contained JFIF stream per tile/strip and never writes the JPEGTables tag (347) that the TIFF spec requires for that codec. libtiff/GDAL/rasterio all reject the file with TIFFReadEncodedStrip() failed; our reader round-trips because Pillow decodes the standalone JFIF, hiding the break. Pass-4 notes flagged the read side of the same JPEGTables gap and deferred it; pass-8 covers the write side. Fix: reject compression='jpeg' at the to_geotiff entry with a clear ValueError pointing at deflate/zstd/lzw. The internal _writer.write is untouched so the existing self-decoding tests still cover the codec; re-enabling the public path needs a JPEGTables-aware encoder. PR diffs reviewed but not merged: #1512 (BytesIO source) and #1513 (LERC max_z_error) -- both look correct; #1512 file-like read path goes through read_all() once so the per-call BytesIOSource lock is theoretical, and #1513 forwards max_z_error through every overview/tile/strip/streaming path including _write_vrt_tiled and _compress_block. No regressions found in either open PR. Other surfaces audited clean: predictor=3 with float16 (writer auto-promotes to float32 on both eager and streaming paths, value-exact round-trip); planar=2 multi-tile read uses band_idx*tiles_per_band offset so no cross-contamination between planes; _header.py multi-byte tag parsing uses bo (byte_order) consistently; Pillow YCbCr-vs-tagged-RGB photometric mismatch becomes moot once JPEG is disabled. Deferred (LOW/MEDIUM, not filed): JPEG2000 writer accepts arbitrary dtype with no validation (rare codec, narrow risk); float16 dtype not in tiff_dtype_to_numpy decode map (writer never emits it - asymmetric but unreachable); Orientation tag (274) still ignored on read (pass-4 deferral). | Pass 7 (2026-05-07): HIGH fixed in fix-mmap-cache-refcount-after-replace -- _MmapCache.release() looked up the cache entry by realpath, so a holder that acquired the OLD mmap before an os.replace and released it AFTER another caller had acquired the post-replace entry would decrement the new holder's refcount. Subsequent eviction (cache full, or another acquire) closed the still-in-use mmap, breaking reads with 'mmap closed or invalid'. Real exposure: any concurrent reader/writer pattern where to_geotiff replaces a file that another reader had just opened via open_geotiff with chunks= or via _FileSource. PR #1506 added stale-replacement detection but did not fix the refcount confusion across the pop. Fix: acquire returns an opaque entry token; release takes the token and decrements that exact entry, regardless of cache state. Orphaned (popped) entries close their fh+mmap when their own refcount hits zero. _FileSource updated to pass the token. Regression test test_release_after_path_replacement_does_not_clobber_new_holder added. All 665 geotiff tests pass; GPU path verified. | Pass 6 (2026-05-07) PR #1507: BE pred2 numba TypingError. | Pass 5 (2026-05-06) PR #1506: mmap cache stale after file replace. | Pass 4 (2026-05-06) PR #1501: sparse COG tiles. | Pass 3 (2026-05-06) PR #1500: predictor=3 byte order. | Pass 2 (2026-05-05) PR #1498: predictor=2 sample-wise. | Pass 1 (2026-04-23) PR #1247. Re-confirmed clean over passes 2-7: items 2 (writer always emits LE TIFFs - hardcoded b'II'), 3 (RowsPerStrip default = height when missing), 4 (StripByteCounts missing raises clear ValueError), 5 (TileWidth without TileLength caught by 'tw <= 0 or th <= 0' check at _reader.py:688), 9 (read determinism on compressed+tiled+multiband), 11 (predictor=2 with awkward sample stride round-trips), 18 (compression_level=99 raises ValueError 'out of range for deflate (valid: 1-9)'), 21 (concurrent writes serialize correctly via mkstemp+os.replace), 24 (uint16 dtype preserved on numpy backend, dask honors chunks param), 26 (chunks rounds correctly with remainder chunk for non-tile-aligned). Deferred: item 8 (BytesIO/file-like sources are not supported, source.lower() error) - documented as 'str' parameter, not a bug; item 19 (LERC max_z_error not user-exposed by to_geotiff) - missing feature, not a bug." +geotiff,2026-05-10,,,,"Pass 11 (2026-05-10): CLEAN. Audited the one additional commit since pass 10 -- #1559 (PR 1548, Centralise GeoTIFF attrs population across all read backends). Refactor extracts _populate_attrs_from_geo_info helper and routes eager numpy, dask, GPU stripped, GPU tiled read paths through it; before the fix dask only emitted crs/transform/raster_type/nodata while numpy emitted the full attrs set including x/y_resolution, resolution_unit, image_description, extra_samples, GDAL metadata, and the CRS-description fields. No data-path arithmetic touched; only attrs dict population. Windowed origin math (origin_x + c0*pixel_width, origin_y + r0*pixel_height) verified to produce -98.0 / 48.75 origin for window=(10,20,50,70) on a (0.1,-0.125) pixel-size raster, with PixelIsArea half-pixel offset preserved on coord lookups (-97.95, 48.6875). Cross-backend attrs parity re-verified: numpy/dask/cupy all emit identical key set on deflate+predictor3+nodata round-trip (crs, crs_wkt, nodata, transform, x_resolution, y_resolution). Data bit-parity re-verified across numpy/dask/cupy on same payload (np.array_equal with equal_nan=True). test_attrs_parity_1548.py (5 tests), test_reader.py/test_writer.py/test_dask_cupy_combined.py (25 tests), GPU orientation/predictor2-BE/LERC-mask/nodata/byteswap suites (65 tests) all green. No accuracy or backend-divergence findings. | Pass 10 (2026-05-10): CLEAN. Audited 5 recent commits: #1558 drop-defensive-copies (frombuffer path still .copy()s before in-place predictor decode at _reader.py:778), #1556 fp-predictor ngjit (writer pre-ravels so 1-D slice arg is correct, float32/64 LE+BE bit-exact), #1552 batched D2H (OOM guard fires before cupy.concatenate, host_buf offsets correct), #1551 parallel-decode gate (>= vs > sends 256x256 default to parallel path, no value diff confirmed via partial-tile parity), #1549 nvjpeg constants (gray + RGB GPU JPEG decode pixel-identical to Pillow CPU, max diff = 0). Cross-backend parity re-verified clean: numpy/dask+numpy/cupy/dask+cupy equal .data/.dtype/.coords/nodata/NaN-mask on deflate+predictor3+nodata; orientations 1-8 numpy==GPU; partial edge tiles 100x150, 257x383, 512x257 numpy==GPU==dask; predictor2 LE/BE round-trip uint8/int16/uint16/int32/uint32 pass; predictor3 LE/BE float32/64 pass. Deferred LOW (pre-existing, not opened): float16 (bps=16, SampleFormat=3) absent from tiff_dtype_to_numpy map - writer never emits, asymmetric but unreachable. | Pass 9 (2026-05-09): TWO HIGH fixed -- (a) PR #1539 closes #1537: TIFF Orientation tag 2/3/4 (mirror flips) on georeferenced files left y/x coords computed from the un-flipped transform, so xarray label lookups returned the wrong pixel even though _apply_orientation flipped the buffer. PR #1521 only updated the transform for the 5-8 axis-swap branch. Fix updates origin and pixel-scale signs along whichever axes were flipped, for both PixelIsArea (origin shifts by N*step) and PixelIsPoint (shifts by (N-1)*step). 10 new tests in test_orientation.py. (b) PR #1546 closes #1540: read_geotiff_gpu ignored Orientation tag completely; CPU correctly applied 2-8 (PR #1521) but GPU returned the raw stored buffer. Cross-backend disagreement on every non-default orientation. Fix adds _apply_orientation_gpu (cupy slicing mirror of the CPU helper) and _apply_orientation_geo_info, threads them into the tiled GPU pipeline, reuses CPU-fallback geo_info for the stripped path to avoid double-applying. 28 new tests in test_orientation_gpu.py (every orientation, single-band tiled, single-band stripped, 3-band tiled, mirror-flip sel-fidelity, default no-tag passthrough). Re-confirmed clean: HTTP coalesce_ranges with overlapping ranges and zero-length ranges, parallel streaming write thread-safety (each tile gets independent buffer via copy or padded zeros), planar=2 + chunky GPU LERC mask propagation matches CPU, IFD chain cap MAX_IFDS=256, max_z_error round-trip on tiled write, _resolve_masked_fill float vs integer dtype semantics. Deferred LOW: per-sample LERC mask (3D mask (h,w,samples)) collapsed to per-pixel ""any sample invalid"" on GPU while CPU honours per-sample; LERC implementations rarely emit 3D masks (verified: lerc.encode with 2D mask on 3-band returns 2D mask). Documented planar=2 + LERC + GPU silently drops mask (rare in practice, source comment acknowledges). | Pass 8 (2026-05-07): HIGH fixed in fix-jpeg-tiff-disable -- to_geotiff(compression='jpeg') wrote files that no external reader can decode. The writer tags compression=7 (new-style JPEG) but emits a self-contained JFIF stream per tile/strip and never writes the JPEGTables tag (347) that the TIFF spec requires for that codec. libtiff/GDAL/rasterio all reject the file with TIFFReadEncodedStrip() failed; our reader round-trips because Pillow decodes the standalone JFIF, hiding the break. Pass-4 notes flagged the read side of the same JPEGTables gap and deferred it; pass-8 covers the write side. Fix: reject compression='jpeg' at the to_geotiff entry with a clear ValueError pointing at deflate/zstd/lzw. The internal _writer.write is untouched so the existing self-decoding tests still cover the codec; re-enabling the public path needs a JPEGTables-aware encoder. PR diffs reviewed but not merged: #1512 (BytesIO source) and #1513 (LERC max_z_error) -- both look correct; #1512 file-like read path goes through read_all() once so the per-call BytesIOSource lock is theoretical, and #1513 forwards max_z_error through every overview/tile/strip/streaming path including _write_vrt_tiled and _compress_block. No regressions found in either open PR. Other surfaces audited clean: predictor=3 with float16 (writer auto-promotes to float32 on both eager and streaming paths, value-exact round-trip); planar=2 multi-tile read uses band_idx*tiles_per_band offset so no cross-contamination between planes; _header.py multi-byte tag parsing uses bo (byte_order) consistently; Pillow YCbCr-vs-tagged-RGB photometric mismatch becomes moot once JPEG is disabled. Deferred (LOW/MEDIUM, not filed): JPEG2000 writer accepts arbitrary dtype with no validation (rare codec, narrow risk); float16 dtype not in tiff_dtype_to_numpy decode map (writer never emits it - asymmetric but unreachable); Orientation tag (274) still ignored on read (pass-4 deferral). | Pass 7 (2026-05-07): HIGH fixed in fix-mmap-cache-refcount-after-replace -- _MmapCache.release() looked up the cache entry by realpath, so a holder that acquired the OLD mmap before an os.replace and released it AFTER another caller had acquired the post-replace entry would decrement the new holder's refcount. Subsequent eviction (cache full, or another acquire) closed the still-in-use mmap, breaking reads with 'mmap closed or invalid'. Real exposure: any concurrent reader/writer pattern where to_geotiff replaces a file that another reader had just opened via open_geotiff with chunks= or via _FileSource. PR #1506 added stale-replacement detection but did not fix the refcount confusion across the pop. Fix: acquire returns an opaque entry token; release takes the token and decrements that exact entry, regardless of cache state. Orphaned (popped) entries close their fh+mmap when their own refcount hits zero. _FileSource updated to pass the token. Regression test test_release_after_path_replacement_does_not_clobber_new_holder added. All 665 geotiff tests pass; GPU path verified. | Pass 6 (2026-05-07) PR #1507: BE pred2 numba TypingError. | Pass 5 (2026-05-06) PR #1506: mmap cache stale after file replace. | Pass 4 (2026-05-06) PR #1501: sparse COG tiles. | Pass 3 (2026-05-06) PR #1500: predictor=3 byte order. | Pass 2 (2026-05-05) PR #1498: predictor=2 sample-wise. | Pass 1 (2026-04-23) PR #1247. Re-confirmed clean over passes 2-7: items 2 (writer always emits LE TIFFs - hardcoded b'II'), 3 (RowsPerStrip default = height when missing), 4 (StripByteCounts missing raises clear ValueError), 5 (TileWidth without TileLength caught by 'tw <= 0 or th <= 0' check at _reader.py:688), 9 (read determinism on compressed+tiled+multiband), 11 (predictor=2 with awkward sample stride round-trips), 18 (compression_level=99 raises ValueError 'out of range for deflate (valid: 1-9)'), 21 (concurrent writes serialize correctly via mkstemp+os.replace), 24 (uint16 dtype preserved on numpy backend, dask honors chunks param), 26 (chunks rounds correctly with remainder chunk for non-tile-aligned). Deferred: item 8 (BytesIO/file-like sources are not supported, source.lower() error) - documented as 'str' parameter, not a bug; item 19 (LERC max_z_error not user-exposed by to_geotiff) - missing feature, not a bug." glcm,2026-05-01,1408,HIGH,2,"angle=None averaged NaN as 0, masking no-valid-pairs as zero texture; fixed via nanmean-style averaging" hillshade,2026-04-10T12:00:00Z,,,,"Horn's method correct. All backends consistent. NaN propagation correct. float32 adequate for [0,1] output." hydro,2026-04-30,,LOW,1,Only LOW: twi log(0)=-inf if fa=0 (out-of-contract); MFD weighted sum no Kahan (negligible). No CRIT/HIGH issues. diff --git a/.claude/sweep-performance-state.csv b/.claude/sweep-performance-state.csv index 4b3ff160f..e332fcd79 100644 --- a/.claude/sweep-performance-state.csv +++ b/.claude/sweep-performance-state.csv @@ -18,7 +18,7 @@ fire,2026-03-31T18:00:00Z,SAFE,compute-bound,0,, flood,2026-03-31T18:00:00Z,SAFE,compute-bound,0,, focal,2026-03-31T18:00:00Z,SAFE,compute-bound,0,, geodesic,2026-03-31T18:00:00Z,N/A,compute-bound,0,, -geotiff,2026-05-10,SAFE,IO-bound,0,,"Pass 3 (2026-05-10): SAFE/IO-bound. Audited 4 perf commits: #1558 (in-place NaN writes on uniquely-owned buffers correct), #1556 (fp-predictor ngjit ~297us/tile for 256x256 float32), #1552 (single cupy.concatenate + one .get() for batched D2H at _gpu_decode.py:870-913), #1551 (parallel decode threshold >=65536px engages 256x256 default at _reader.py:1121). Bench: 8192x8192 f32 deflate+pred2 256-tile write 782ms; 4096x4096 f32 deflate read 83ms with parallel decode. Deferred LOW (none filed, all <10% MEDIUM threshold): _writer.py:459/1109 redundant .copy() before predictor encode (~1% per tile), _compression.py:280 lzw_decompress dst[:n].copy() (~2% per LZW tile decode), _writer.py:1419 seg_np.copy() before in-place NaN substitution (negligible, conditional path), _CloudSource.read_range opens fresh fsspec handle per range (pre-existing, predates audit scope). nvCOMP per-tile D2H batching break-even confirmed (variable sizes need staging buffer, no win). | Pass 3 (2026-05-10): audited f157746,39322c3,f23ec8f,1aac3b7. All 5 commits correct. Redundant .copy() in _writer.py:459,1109 and _compression.py:280 (1-2% overhead, LOW). _CloudSource.read_range() per-call open is pre-existing arch issue. No HIGH/MEDIUM regressions. SAFE. | re-audit 2026-05-02: 6 commits since 2026-04-16 (predictor=3 CPU encode/decode, GPU predictor stride fix, validate_tile_layout, BigTIFF LONG8 offsets, AREA_OR_POINT VRT, per-tile alloc guard). 1M dask chunk cap intact at __init__.py:948; adler32 batch transfer intact at _gpu_decode.py:1825. New code is metadata validation and dispatcher logic with no extra materialization or per-tile sync points. No HIGH/MEDIUM regressions." +geotiff,2026-05-10,SAFE,IO-bound,0,,"Pass 4 (2026-05-10): re-audit after #1559 (centralise attrs across all read backends). New _populate_attrs_from_geo_info helper at __init__.py:301 runs once per read, not per-chunk -- no perf impact. Probe: 2560x2560 deflate-tiled file opened via read_geotiff_dask yields 400 tasks (4 tasks/chunk for 100 chunks), well under 1M cap. read_geotiff_gpu(1024x1024) returns cupy.ndarray end-to-end with no host round-trip (226ms incl. write+decode). No new HIGH/MEDIUM findings. SAFE/IO-bound holds. | Pass 3 (2026-05-10): SAFE/IO-bound. Audited 4 perf commits: #1558 (in-place NaN writes on uniquely-owned buffers correct), #1556 (fp-predictor ngjit ~297us/tile for 256x256 float32), #1552 (single cupy.concatenate + one .get() for batched D2H at _gpu_decode.py:870-913), #1551 (parallel decode threshold >=65536px engages 256x256 default at _reader.py:1121). Bench: 8192x8192 f32 deflate+pred2 256-tile write 782ms; 4096x4096 f32 deflate read 83ms with parallel decode. Deferred LOW (none filed, all <10% MEDIUM threshold): _writer.py:459/1109 redundant .copy() before predictor encode (~1% per tile), _compression.py:280 lzw_decompress dst[:n].copy() (~2% per LZW tile decode), _writer.py:1419 seg_np.copy() before in-place NaN substitution (negligible, conditional path), _CloudSource.read_range opens fresh fsspec handle per range (pre-existing, predates audit scope). nvCOMP per-tile D2H batching break-even confirmed (variable sizes need staging buffer, no win). | Pass 3 (2026-05-10): audited f157746,39322c3,f23ec8f,1aac3b7. All 5 commits correct. Redundant .copy() in _writer.py:459,1109 and _compression.py:280 (1-2% overhead, LOW). _CloudSource.read_range() per-call open is pre-existing arch issue. No HIGH/MEDIUM regressions. SAFE. | re-audit 2026-05-02: 6 commits since 2026-04-16 (predictor=3 CPU encode/decode, GPU predictor stride fix, validate_tile_layout, BigTIFF LONG8 offsets, AREA_OR_POINT VRT, per-tile alloc guard). 1M dask chunk cap intact at __init__.py:948; adler32 batch transfer intact at _gpu_decode.py:1825. New code is metadata validation and dispatcher logic with no extra materialization or per-tile sync points. No HIGH/MEDIUM regressions." glcm,2026-03-31T18:00:00Z,SAFE,compute-bound,0,,"Downgraded to MEDIUM. da.stack without rechunk is scheduling overhead, not OOM risk." hillshade,2026-04-16T12:00:00Z,SAFE,compute-bound,0,,"Re-audit after Horn's method rewrite (PR 1175): clean stencil, map_overlap depth=(1,1), no materialization. Zero findings." hydro,2026-05-01,RISKY,memory-bound,0,1416,"Fixed-in-tree 2026-05-01: hand_mfd._hand_mfd_dask now assembles via da.map_blocks instead of eager da.block of pre-computed tiles (matches hand_dinf pattern). Remaining MEDIUM: sink_d8 CCL fully materializes labels (inherently global), flow_accumulation_mfd frac_bdry held in driver dict instead of memmap-backed BoundaryStore. D8 iterative paths (flow_accum/fill/watershed/basin/stream_*) use serial-tile sweep with memmap-backed boundary store -- per-tile RAM bounded but driver iterates O(diameter) times. flow_direction_*, flow_path/snap_pour_point/twi/hand_d8/hand_dinf are SAFE." diff --git a/.claude/sweep-security-state.csv b/.claude/sweep-security-state.csv index 257f25a59..c91c1c745 100644 --- a/.claude/sweep-security-state.csv +++ b/.claude/sweep-security-state.csv @@ -18,7 +18,7 @@ fire,2026-04-25,,,,,"Clean. Despite the module's size hint, fire.py is purely pe flood,2026-05-03,1437,MEDIUM,3,,Re-audit 2026-05-03. MEDIUM Cat 3 fixed in PR #1438 (travel_time and flood_depth_vegetation now validate mannings_n DataArray values are finite and strictly positive via _validate_mannings_n_dataarray helper). No remaining unfixed findings. Other categories clean: every allocation is same-shape as input; no flat index math; NaN propagation explicit in every backend; tan_slope clamped by _TAN_MIN; no CUDA kernels; no file I/O; every public API calls _validate_raster on DataArray inputs. focal,2026-04-27,1284,HIGH,1,,"HIGH (fixed PR #1286): apply(), focal_stats(), and hotspots() accepted unbounded user-supplied kernels via custom_kernel(), which only checks shape parity. The kernel-size guard from #1241 (_check_kernel_memory) only ran inside circle_kernel/annulus_kernel, so a (50001, 50001) custom kernel on a 10x10 raster allocated ~10 GB on the kernel itself plus a much larger padded raster before any work -- same shape as the bilateral DoS in #1236. Fixed by adding _check_kernel_vs_raster_memory in focal.py and wiring it into apply(), focal_stats(), and hotspots() after custom_kernel() validation. All 134 focal tests + 19 bilateral tests pass. No other findings: 10 CUDA kernels all have proper bounds + stencil guards; _validate_raster called on every public entry point; hotspots already raises ZeroDivisionError on constant-value rasters; _focal_variety_cuda uses a fixed-size local buffer (silent truncation but bounded); _focal_std_cuda/_focal_var_cuda clamp the catastrophic-cancellation case via if var < 0.0: var = 0.0; no file I/O." geodesic,2026-04-27,1283,HIGH,1,,"HIGH (fixed PR #1285): slope(method='geodesic') and aspect(method='geodesic') stack a (3, H, W) float64 array (data, lat, lon) before dispatch with no memory check. A large lat/lon-tagged raster passed to either function would OOM. Fixed by adding _check_geodesic_memory(rows, cols) in xrspatial/geodesic.py (mirrors morphology._check_kernel_memory): budgets 56 bytes/cell (24 stacked float64 + 4 float32 output + 24 padded copy + slack) and raises MemoryError when > 50% of available RAM; called from slope.py and aspect.py inside the geodesic branch before dispatch. No other findings: 6 CUDA kernels all have bounds guards (e.g. _run_gpu_geodesic_aspect at geodesic.py:395), custom 16x16 thread blocks avoid register spill, no shared memory, _validate_raster runs upstream in slope/aspect, all backends cast to float32, slope_mag < 1e-7 flat threshold prevents arctan2 NaN propagation, curvature correction uses hardcoded WGS84 R." -geotiff,2026-05-10,,,,,"Pass 10 (2026-05-10): CLEAN. Audited 8 commits since pass 9: #1558 (drop nodata copies - buffers uniquely owned), #1556 (fp-predictor ngjit row loop - bit-exact), #1552 (batched GDS-fallback D2H - _check_gpu_memory guards new staging), #1551 (parallel decode at tile_size=256 - pool bounded by min(n_tiles, cpu_count), mmap thread-safe read-only), #1549 (nvjpeg constants - actually closed CRITICAL: prior off-by-two passed PLANAR where INTERLEAVED expected, dereferenced NULL channel ptrs), #1542 (GPU nodata mask - called once per path, stripped fallback returns early), #1543 (combined backend tests-only), #1544 (__all__/deprecation no security surface). Re-verified guards intact: MAX_PIXELS_DEFAULT, MAX_IFD_ENTRY_COUNT, MAX_IFDS, MAX_TILE_BYTES_DEFAULT (256 MiB), _MAX_DASK_CHUNKS, _check_gpu_memory, _MmapCache TOCTOU, VRT realpath, LZW (sp<4096) and Deflate (overflow array) bounds. No issues opened. | Pass 2 (2026-05-09): Found unbounded compressed-byte-count fetch in HTTP COG path (_fetch_decode_cog_http_tiles passes attacker-controlled TileByteCounts to read_ranges_coalesced). Verified exploit: 4 tiles x 100 MB byte_count -> single 100 MB Range GET. Fixed via per-tile cap (MAX_TILE_BYTES_DEFAULT=256 MiB, XRSPATIAL_COG_MAX_TILE_BYTES env override). Other recent PRs (#1530 IFD chain cap, #1527 IFD count, #1535 LERC mask, #1532 PlanarConfig=2, #1534 HTTP coalesce, #1531 parallel write, #1528 nvCOMP batch) audited clean. | Prior: Re-audit 2026-04-28: only #7984f7a since prior pass (predictor=3 multi-band CPU fix, LERC dedup, VRT nodata dtype, AREA_OR_POINT, BigTIFF LONG8 offsets). LZW/inflate GPU kernels gate every write on out_pos single 100 MB Range GET. Fixed via per-tile cap (MAX_TILE_BYTES_DEFAULT=256 MiB, XRSPATIAL_COG_MAX_TILE_BYTES env override). Other recent PRs (#1530 IFD chain cap, #1527 IFD count, #1535 LERC mask, #1532 PlanarConfig=2, #1534 HTTP coalesce, #1531 parallel write, #1528 nvCOMP batch) audited clean. | Prior: Re-audit 2026-04-28: only #7984f7a since prior pass (predictor=3 multi-band CPU fix, LERC dedup, VRT nodata dtype, AREA_OR_POINT, BigTIFF LONG8 offsets). LZW/inflate GPU kernels gate every write on out_pos= 3 and distance only as >= 1, with no upper bound on either. _glcm_numba_kernel iterates range(r-half, r+half+1) for every pixel, so window_size=1_000_001 on a 10x10 raster ran ~10^14 loop iterations with all neighbors failing the interior bounds check (CPU DoS). On the dask backends depth = window_size // 2 + distance drove map_overlap padding, so a huge window also caused oversize per-chunk allocations (memory DoS). Fixed by adding max_val caps in the public entrypoint: window_size <= max(3, min(rows, cols)) and distance <= max(1, window_size // 2). One cap covers every backend because cupy and dask+cupy call through to the CPU kernel after cupy.asnumpy. No other HIGH findings: levels is already capped at 256 so the per-pixel np.zeros((levels, levels)) matrix in the kernel is bounded to 512 KB. No CUDA kernels. No file I/O. Quantization clips to [0, levels-1] before the kernel and NaN maps to -1 which the kernel filters with i_val >= 0. Entropy log(p) and correlation p / (std_i * std_j) are both guarded. All four backends use _validate_raster and cast to float64 before quantizing. MEDIUM (unfixed, Cat 1): the per-pixel np.zeros((levels, levels)) allocation inside the hot loop is a perf issue (levels=256 -> 512 KB alloc+free per pixel) but not a security issue because levels is bounded. Could be hoisted out of the loop or replaced with an in-place clear, but that is an efficiency concern, not security." gpu_rtx,2026-04-29,1308,HIGH,1,,"HIGH (fixed #1308 / PR #1310): hillshade_rtx (gpu_rtx/hillshade.py:184) and viewshed_gpu (gpu_rtx/viewshed.py:269) allocated cupy device buffers sized by raster shape with no memory check. create_triangulation (mesh_utils.py:23-24) adds verts (12 B/px) + triangles (24 B/px) = 36 B/px; hillshade_rtx adds d_rays(32) + d_hits(16) + d_aux(12) + d_output(4) = 64 B/px (100 B/px total); viewshed_gpu adds d_rays(32) + d_hits(16) + d_visgrid(4) + d_vsrays(32) = 84 B/px (120 B/px total). A 30000x30000 raster asked for 90-108 GB of VRAM before cupy surfaced an opaque allocator error. Fixed by adding gpu_rtx/_memory.py with _available_gpu_memory_bytes() and _check_gpu_memory(func_name, h, w) helpers (cost_distance #1262 / sky_view_factor #1299 pattern, 120 B/px budget covers worst case, raises MemoryError when required > 50% of free VRAM, skips silently when memGetInfo() unavailable). Wired into both entry points after the cupy.ndarray type check and before create_triangulation. 9 new tests in test_gpu_rtx_memory.py (5 helper-unit + 4 end-to-end gated on has_rtx). All 81 existing hillshade/viewshed tests still pass. Cat 4 clean: all CUDA kernels (hillshade.py:25/62/106, viewshed.py:32/74/116, mesh_utils.py:50) have bounds guards; no shared memory, no syncthreads needed. MEDIUM not fixed (Cat 6): hillshade_rtx and viewshed_gpu do not call _validate_raster directly but parent hillshade() (hillshade.py:252) and viewshed() (viewshed.py:1707) already validate, so input validation runs before the gpu_rtx entry point - defense-in-depth, not exploitable. MEDIUM not fixed (Cat 2): mesh_utils.py:64-68 cast mesh_map_index to int32 in the triangle index buffer; overflows at H*W > 2.1B vertices (~46341x46341+) but the new memory guard rejects rasters that large first - documentation/clarity item rather than exploitable. MEDIUM not fixed (Cat 3): mesh_utils.py:19 scale = maxDim / maxH divides by zero on an all-zero raster, propagating inf/NaN into mesh vertex z-coords; separate follow-up. LOW not fixed (Cat 5): mesh_utils.write() opens user-supplied path without canonicalization but its only call site (mesh_utils.py:38-39) sits behind if False: in create_triangulation, not reachable in production." hillshade,2026-04-27,,,,,"Clean. Cat 1: only allocation is the output np.empty(data.shape) at line 32 (cupy at line 165) and a _pad_array with hardcoded depth=1 (line 62) -- bounded by caller, no user-controlled amplifier. Azimuth/altitude are scalars and don't drive size. Cat 2: numba kernel uses range(1, rows-1) with simple (y, x) indexing; numba range loops promote to int64. Cat 3: math.sqrt(1.0 + xx_plus_yy) is always >= 1.0 (no neg sqrt, no div-by-zero); NaN elevation propagates correctly through dz_dx/dz_dy -> shaded -> output (the shaded < 0.0 / shaded > 1.0 clamps don't fire on NaN). Azimuth validated to [0, 360], altitude to [0, 90]. Cat 4: _gpu_calc_numba (line 107) guards both grid bounds and 3x3 stencil reads via i > 0 and i < shape[0]-1 and j > 0 and j < shape[1]-1; no shared memory. Cat 5: no file I/O. Cat 6: hillshade() calls _validate_raster (line 252) and _validate_scalar for both azimuth (253) and angle_altitude (254); all four backend paths cast to float32; tests parametrize int32/int64/float32/float64." diff --git a/xrspatial/geotiff/__init__.py b/xrspatial/geotiff/__init__.py index 37c69fb46..9597e95c0 100644 --- a/xrspatial/geotiff/__init__.py +++ b/xrspatial/geotiff/__init__.py @@ -2211,12 +2211,20 @@ def write_geotiff_gpu(data, path: str, *, # Extract array and metadata geo_transform = None epsg = None + wkt_fallback = None # WKT string when EPSG is not available raster_type = 1 + gdal_meta_xml = None + extra_tags_list = None + x_res = None + y_res = None + res_unit = None if isinstance(crs, int): epsg = crs elif isinstance(crs, str): epsg = _wkt_to_epsg(crs) + if epsg is None: + wkt_fallback = crs if isinstance(data, xr.DataArray): arr = data.data @@ -2229,13 +2237,56 @@ def write_geotiff_gpu(data, path: str, *, else: arr = cupy.asarray(np.asarray(arr)) # numpy -> GPU - geo_transform = _coords_to_transform(data) - if epsg is None: - epsg = data.attrs.get('crs') + # Prefer attrs['transform'] over the coord-derived transform: it + # is bit-stable across round-trips, while _coords_to_transform + # can drift on fractional pixel sizes (the same reasoning the + # CPU to_geotiff path applies for issue #1484). + geo_transform = _transform_from_attr(data.attrs.get('transform')) + if geo_transform is None: + geo_transform = _coords_to_transform(data) + # Resolve CRS the same way the CPU writer does. attrs['crs'] may + # be an int EPSG or a WKT string; attrs['crs_wkt'] only carries + # WKT. Without the WKT branch the GPU writer silently drops CRS + # on files whose original CRS only resolves to WKT (no recognized + # EPSG). + if epsg is None and crs is None: + crs_attr = data.attrs.get('crs') + if isinstance(crs_attr, str): + epsg = _wkt_to_epsg(crs_attr) + if epsg is None and wkt_fallback is None: + wkt_fallback = crs_attr + elif crs_attr is not None: + epsg = int(crs_attr) + if epsg is None: + wkt = data.attrs.get('crs_wkt') + if isinstance(wkt, str): + epsg = _wkt_to_epsg(wkt) + if epsg is None and wkt_fallback is None: + wkt_fallback = wkt if nodata is None: nodata = data.attrs.get('nodata') if data.attrs.get('raster_type') == 'point': raster_type = RASTER_PIXEL_IS_POINT + # Mirror the CPU writer's pass-through of GDAL metadata, the + # extra_tags list, the friendly image_description / extra_samples + # / colormap synthesis, and the resolution tags. Without these, + # a GPU write -> CPU read round-trip silently drops every rich + # tag (#1563). + gdal_meta_xml = data.attrs.get('gdal_metadata_xml') + if gdal_meta_xml is None: + gdal_meta_dict = data.attrs.get('gdal_metadata') + if isinstance(gdal_meta_dict, dict): + from ._geotags import _build_gdal_metadata_xml + gdal_meta_xml = _build_gdal_metadata_xml(gdal_meta_dict) + extra_tags_list = data.attrs.get('extra_tags') + extra_tags_list = _merge_friendly_extra_tags( + extra_tags_list, data.attrs) + x_res = data.attrs.get('x_resolution') + y_res = data.attrs.get('y_resolution') + unit_str = data.attrs.get('resolution_unit') + if unit_str is not None: + _unit_ids = {'none': 1, 'inch': 2, 'centimeter': 3} + res_unit = _unit_ids.get(str(unit_str), None) else: if hasattr(data, 'compute'): data = data.compute() # Dask -> CuPy or numpy @@ -2297,7 +2348,14 @@ def _gpu_compress_to_part(gpu_arr, w, h, spp): width, height, np_dtype, comp_tag, pred_val, True, tile_size, parts, geo_transform, epsg, nodata, is_cog=(cog and len(parts) > 1), - raster_type=raster_type) + raster_type=raster_type, + crs_wkt=wkt_fallback if epsg is None else None, + gdal_metadata_xml=gdal_meta_xml, + extra_tags=extra_tags_list, + x_resolution=x_res, + y_resolution=y_res, + resolution_unit=res_unit, + ) _write_bytes(file_bytes, path) diff --git a/xrspatial/geotiff/tests/test_gpu_writer_attrs_1563.py b/xrspatial/geotiff/tests/test_gpu_writer_attrs_1563.py new file mode 100644 index 000000000..94bfaac6d --- /dev/null +++ b/xrspatial/geotiff/tests/test_gpu_writer_attrs_1563.py @@ -0,0 +1,308 @@ +"""Cross-backend write-path attrs parity tests for issue #1563. + +Before the fix, ``write_geotiff_gpu`` silently dropped most metadata +attrs that the CPU ``to_geotiff`` preserves: + +* ``crs_wkt`` (the WKT-only CRS fallback) +* ``gdal_metadata`` / ``gdal_metadata_xml`` +* ``extra_tags`` +* ``image_description`` / ``extra_samples`` / ``colormap`` +* ``x_resolution`` / ``y_resolution`` / ``resolution_unit`` + +It also ignored ``attrs['transform']`` and re-derived the GeoTransform +from coords, which drifts on fractional pixel sizes. + +These tests pin the contract that the GPU writer mirrors the CPU +writer's attr-resolution logic. +""" +from __future__ import annotations + +import importlib.util + +import numpy as np +import pytest +import xarray as xr + +from xrspatial.geotiff import open_geotiff, write_geotiff_gpu +from xrspatial.geotiff._geotags import GeoTransform, _epsg_to_wkt +from xrspatial.geotiff._writer import write + + +def _gpu_available() -> bool: + if importlib.util.find_spec("cupy") is None: + return False + try: + import cupy + return bool(cupy.cuda.is_available()) + except Exception: + return False + + +_HAS_GPU = _gpu_available() +_gpu_only = pytest.mark.skipif(not _HAS_GPU, reason="cupy + CUDA required") + + +@_gpu_only +def test_crs_wkt_only_attr_round_trips(tmp_path): + """When the source CRS only resolves to WKT (no EPSG attr), the GPU + writer must still emit GeoKeys so the file has a CRS at all.""" + import cupy + wkt = _epsg_to_wkt(4326) + if wkt is None: + pytest.skip("pyproj not available") + arr = np.arange(64, dtype=np.float32).reshape(8, 8) + da_gpu = xr.DataArray( + cupy.asarray(arr), + dims=['y', 'x'], + coords={'y': np.arange(8.0), 'x': np.arange(8.0)}, + attrs={'crs_wkt': wkt}, + ) + out = str(tmp_path / 'crs_wkt_1563.tif') + write_geotiff_gpu(da_gpu, out, compression='none') + + rd = open_geotiff(out) + # The WKT should resolve back to EPSG 4326 on read. + assert rd.attrs.get('crs') == 4326, ( + f"crs_wkt was dropped on the GPU write path; " + f"got attrs={sorted(rd.attrs.keys())}" + ) + + +@_gpu_only +def test_image_description_round_trips_via_gpu_writer(tmp_path): + import cupy + arr = np.zeros((8, 8), dtype=np.float32) + da_gpu = xr.DataArray( + cupy.asarray(arr), + dims=['y', 'x'], + coords={'y': np.arange(8.0), 'x': np.arange(8.0)}, + attrs={'crs': 4326, 'image_description': 'gpu-attr-test-1563'}, + ) + out = str(tmp_path / 'desc_1563.tif') + write_geotiff_gpu(da_gpu, out, compression='none') + + rd = open_geotiff(out) + assert rd.attrs.get('image_description') == 'gpu-attr-test-1563' + + +@_gpu_only +def test_extra_samples_single_band_writer_compat(tmp_path): + """Single-band write with ``extra_samples`` set must not crash, even + though TIFF 6.0 only surfaces ExtraSamples on multi-sample images + (the reader drops it for 1-sample files). The multi-band case is + covered by ``test_extra_samples_round_trips_multiband_via_gpu_writer`` + below. + """ + import cupy + arr = np.zeros((8, 8), dtype=np.float32) + da_gpu = xr.DataArray( + cupy.asarray(arr), + dims=['y', 'x'], + coords={'y': np.arange(8.0), 'x': np.arange(8.0)}, + attrs={'crs': 4326, 'extra_samples': [1]}, + ) + out = str(tmp_path / 'es_1563.tif') + write_geotiff_gpu(da_gpu, out, compression='none') + + rd = open_geotiff(out) + assert rd.attrs.get('crs') == 4326 + + +@_gpu_only +def test_extra_samples_round_trips_multiband_via_gpu_writer(tmp_path): + """Multi-band write: ExtraSamples surfaces on the reader because + SamplesPerPixel > 1. The assembler auto-synthesizes the tag for + multi-band minisblack (same behaviour as the CPU writer), so we + just assert the attr appears -- pinning that the GPU path reaches + the multi-band writer code without dropping ExtraSamples entirely. + """ + import cupy + arr = np.zeros((4, 5, 2), dtype=np.uint8) + arr[..., 1] = 255 + da_gpu = xr.DataArray( + cupy.asarray(arr), + dims=['y', 'x', 'band'], + coords={'y': np.arange(4.0), 'x': np.arange(5.0)}, + attrs={'crs': 4326}, + ) + out = str(tmp_path / 'es_multi_1563.tif') + write_geotiff_gpu(da_gpu, out, compression='none') + + rd = open_geotiff(out) + es = rd.attrs.get('extra_samples') + assert es is not None, ( + f"extra_samples dropped on multi-band GPU write; " + f"attrs={sorted(rd.attrs.keys())}" + ) + + +@_gpu_only +def test_colormap_round_trips_via_gpu_writer(tmp_path): + """A uint8 raster with a 768-entry colormap (3*256 uint16 values for + R/G/B) must surface ``attrs['colormap']`` after the GPU round-trip -- + the synthesized tag 320 entry rides in ``extra_tags`` and the read + path projects it back onto the friendly attr.""" + import cupy + palette = [] + for ch_offset in (0, 1, 2): + for i in range(256): + palette.append((i * 257 + ch_offset) & 0xFFFF) + assert len(palette) == 768 + pixels = np.array([[0, 1, 2, 254, 255], + [10, 20, 30, 40, 50]], dtype=np.uint8) + da_gpu = xr.DataArray( + cupy.asarray(pixels), + dims=['y', 'x'], + coords={'y': np.arange(2.0), 'x': np.arange(5.0)}, + attrs={'crs': 4326, 'colormap': palette}, + ) + out = str(tmp_path / 'cmap_1563.tif') + write_geotiff_gpu(da_gpu, out, compression='none') + + rd = open_geotiff(out) + rd_cmap = rd.attrs.get('colormap') + assert rd_cmap is not None, ( + f"colormap dropped on GPU write; attrs={sorted(rd.attrs.keys())}" + ) + assert tuple(rd_cmap) == tuple(palette) + + +@_gpu_only +def test_extra_tags_custom_tag_round_trips_via_gpu_writer(tmp_path): + """A user-supplied ``extra_tags`` entry (here: Software, tag 305, + ASCII) must be forwarded to ``_assemble_tiff`` and reappear in + ``attrs['extra_tags']`` on read.""" + import cupy + arr = np.zeros((8, 8), dtype=np.float32) + software = "xrspatial-1563-test" + da_gpu = xr.DataArray( + cupy.asarray(arr), + dims=['y', 'x'], + coords={'y': np.arange(8.0), 'x': np.arange(8.0)}, + attrs={ + 'crs': 4326, + 'extra_tags': [(305, 2, len(software) + 1, software)], + }, + ) + out = str(tmp_path / 'extra_tags_1563.tif') + write_geotiff_gpu(da_gpu, out, compression='none') + + rd = open_geotiff(out) + et = rd.attrs.get('extra_tags') or [] + by_id = {t[0]: t for t in et} + assert 305 in by_id, ( + f"extra_tags Software (305) dropped on GPU write; got ids " + f"{sorted(by_id.keys())}" + ) + value = by_id[305][3] + # ASCII values may be returned as bytes or str; strip NUL terminator. + if isinstance(value, bytes): + value = value.decode('ascii') + assert value.rstrip('\x00') == software + + +@_gpu_only +def test_resolution_tags_round_trip_via_gpu_writer(tmp_path): + """``x_resolution`` / ``y_resolution`` / ``resolution_unit`` must + survive a GPU write -> CPU read cycle. The writer maps the unit + string back to its TIFF id (1=none, 2=inch, 3=centimeter).""" + import cupy + arr = np.zeros((8, 8), dtype=np.float32) + da_gpu = xr.DataArray( + cupy.asarray(arr), + dims=['y', 'x'], + coords={'y': np.arange(8.0), 'x': np.arange(8.0)}, + attrs={ + 'crs': 4326, + 'x_resolution': 300.0, + 'y_resolution': 300.0, + 'resolution_unit': 'inch', + }, + ) + out = str(tmp_path / 'res_1563.tif') + write_geotiff_gpu(da_gpu, out, compression='none') + + rd = open_geotiff(out) + assert rd.attrs.get('x_resolution') == pytest.approx(300.0, rel=0.01), ( + f"x_resolution drift: got {rd.attrs.get('x_resolution')}" + ) + assert rd.attrs.get('y_resolution') == pytest.approx(300.0, rel=0.01), ( + f"y_resolution drift: got {rd.attrs.get('y_resolution')}" + ) + assert rd.attrs.get('resolution_unit') == 'inch', ( + f"resolution_unit drift: got {rd.attrs.get('resolution_unit')}" + ) + + +@_gpu_only +def test_gdal_metadata_round_trips_via_gpu_writer(tmp_path): + import cupy + arr = np.zeros((8, 8), dtype=np.float32) + da_gpu = xr.DataArray( + cupy.asarray(arr), + dims=['y', 'x'], + coords={'y': np.arange(8.0), 'x': np.arange(8.0)}, + attrs={'crs': 4326, + 'gdal_metadata': {'AREA_OR_POINT': 'Area', + 'CUSTOM_KEY': 'val_1563'}}, + ) + out = str(tmp_path / 'gdal_meta_1563.tif') + write_geotiff_gpu(da_gpu, out, compression='none') + + rd = open_geotiff(out) + meta = rd.attrs.get('gdal_metadata') or {} + assert meta.get('CUSTOM_KEY') == 'val_1563', ( + f"gdal_metadata was dropped on the GPU write path; " + f"got {meta}" + ) + + +@_gpu_only +def test_transform_attr_round_trip_bit_stable(tmp_path): + """Reading a file with a fractional pixel size, then writing it back + through ``write_geotiff_gpu`` must preserve ``attrs['transform']`` + bit-for-bit (the CPU writer guarantees this; the GPU writer used to + drop the attr and recompute from coords, which drifts). + """ + arr = np.arange(64, dtype=np.float32).reshape(8, 8) + gt = GeoTransform( + origin_x=-122.123456789, + origin_y=37.987654321, + pixel_width=1.0 / 3600.0 + 1e-12, + pixel_height=-(1.0 / 3600.0 + 1e-12), + ) + src = str(tmp_path / 'frac_in_1563.tif') + write(arr, src, geo_transform=gt, crs_epsg=4326, + compression='none', tiled=False) + eager_in = open_geotiff(src) + + da_gpu = open_geotiff(src, gpu=True) + out = str(tmp_path / 'frac_out_1563.tif') + write_geotiff_gpu(da_gpu, out, compression='none') + + after = open_geotiff(out) + assert after.attrs['transform'] == eager_in.attrs['transform'], ( + f"transform drifted on GPU round-trip:\n" + f" in : {eager_in.attrs['transform']}\n" + f" out: {after.attrs['transform']}" + ) + + +@_gpu_only +def test_no_data_attr_still_round_trips_after_fix(tmp_path): + """Regression guard: the existing nodata + raster_type pass-through + still works after wiring in the new attrs.""" + import cupy + arr = np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32) + da_gpu = xr.DataArray( + cupy.asarray(arr), + dims=['y', 'x'], + coords={'y': np.arange(2.0), 'x': np.arange(2.0)}, + attrs={'crs': 4326, 'nodata': -9999.0, 'raster_type': 'point'}, + ) + out = str(tmp_path / 'nodata_1563.tif') + write_geotiff_gpu(da_gpu, out, compression='none') + + rd = open_geotiff(out) + assert rd.attrs.get('nodata') == -9999.0 + assert rd.attrs.get('raster_type') == 'point'